Intelligent CIO North America Issue 60 | Page 40

FEATURE: INVESTMENTS
AI resilience challenges
A variety of problems may cause AI systems to fail. Common AI resilience challenges include:
• Failure of the infrastructure or data centers that host AI services
• Disruptions to the data pipelines that move data into and out of AI systems
• Data quality problems, which disrupt the ability to feed accurate, complete data to AI systems
• A lack of effective prompts for generating accurate, consistent results from AI systems
• Unexpected increases in the cost of operating AI systems, which could undercut the organization’s ability to use AI solutions effectively
As I noted above, the first item on this list – disruption to the infrastructure that hosts AI – tends not to be a major challenge for enterprises today because few organizations host their own AI models.
Most are instead using AI services provided by hyperscale platforms, which almost never go down and which provide multiple cloud availability zones and regions to mitigate the impact of outages when they do occur. But consuming AI services from a hyperscale provider doesn’t mitigate the other resilience challenges listed above.
Your data pipelines could fail or you could experience a major degradation in data quality if the infrastructure that collects and stores data within your business has a problem.
Likewise, if you rely on a prompt library (meaning a collection of approved prompts) to interact with AI models, the library could crash, be hacked or become unavailable. And your AI quality or costs could spiral out of control due to factors like cost-inefficient prompts or changes to the pricing of the AI models your business uses.
What’s worse, you may not know about these problems until it’s too late.
Unlike traditional types of workloads, whose performance and availability are easy to track in real time using conventional monitoring and observability software, it’s much rarer for organizations to have tools in place that automatically generate alerts when their data pipelines can’t feed information quickly enough to their AI services, for example, or when the cost of prompts suddenly surges, or response quality plummets.
To plug this gap, businesses must implement novel types of solutions that can guarantee AI resilience. Doing so starts with establishing holistic quality checks for AI.
This means systematically tracking interactions with AI models and services across the entire company and tracking the performance and cost of each transaction.
When you can do this, you can quickly identify problems like slow data pipelines or data quality issues.
You can also comprehensively track cost and performance – and you can do so in a granular way that allows you to compare how different models respond to the same prompt, which in turn makes it possible to determine which model delivers the best balance between cost and performance for a given type of prompt.
At present, commercial solutions that deliver functionality like this are challenging to find. But implementing an AI quality monitoring solution in-house is more feasible than it might sound.
At my company, we’ve done it by using multiple AI models to evaluate the performance of other models, allowing us to identify resilience issues quickly while also optimizing our costs.
Given that AI solutions and enterprise AI strategies are still rapidly evolving, it might seem reasonable to wait until the adoption process is complete and the technology has fully matured to worry about AI resilience. But that would be a major mistake.
The very fact that enterprise AI technology remains so fluid is part of the reason why businesses need to be thinking about, and acting on, ways to mitigate resilience risks starting today.
AI’s fast-evolving nature breeds substantial risk, and the only way to manage that risk is to develop a resilience strategy.
The sooner organizations do this by comprehensively monitoring the performance and quality of the AI systems they use, the better positioned they will be to leverage AI as a driver of innovation while steeling themselves against the threat of disruption to the AI-powered services businesses increasingly depend on.
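For readers who want a concrete picture of the per-transaction tracking discussed in this article, here is a minimal Python sketch. It is illustrative only and does not describe any particular product or the author's own system. The `Transaction` record, the model and prompt names, the sample figures and the alert thresholds are all hypothetical. The `quality` score is assumed to be produced elsewhere, for example by a second model acting as an evaluator.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass
class Transaction:
    """One logged AI call. All field names here are hypothetical."""
    model: str
    prompt_id: str
    latency_s: float   # time taken to get a response
    cost_usd: float    # what the call cost
    quality: float     # 0.0-1.0 score, assumed to come from an evaluator


def summarize(transactions):
    """Average latency, cost and quality per (prompt, model) pair."""
    groups = defaultdict(list)
    for t in transactions:
        groups[(t.prompt_id, t.model)].append(t)
    return {
        key: {
            "calls": len(ts),
            "avg_latency_s": mean(t.latency_s for t in ts),
            "avg_cost_usd": mean(t.cost_usd for t in ts),
            "avg_quality": mean(t.quality for t in ts),
        }
        for key, ts in groups.items()
    }


def best_model(summary, prompt_id, min_quality=0.8):
    """Cheapest model for a prompt whose average quality meets the floor."""
    candidates = [
        (stats["avg_cost_usd"], model)
        for (pid, model), stats in summary.items()
        if pid == prompt_id and stats["avg_quality"] >= min_quality
    ]
    return min(candidates)[1] if candidates else None


def find_alerts(summary, max_latency_s=2.0, max_cost_usd=0.05, min_quality=0.8):
    """Flag (prompt, model) pairs that breach any illustrative threshold."""
    alerts = []
    for key, stats in summary.items():
        if stats["avg_latency_s"] > max_latency_s:
            alerts.append((key, "latency"))
        if stats["avg_cost_usd"] > max_cost_usd:
            alerts.append((key, "cost"))
        if stats["avg_quality"] < min_quality:
            alerts.append((key, "quality"))
    return alerts


# Made-up sample log: three models answering the same prompt.
log = [
    Transaction("model-a", "summarize-ticket", 1.2, 0.030, 0.92),
    Transaction("model-a", "summarize-ticket", 1.4, 0.034, 0.90),
    Transaction("model-b", "summarize-ticket", 0.6, 0.004, 0.85),
    Transaction("model-b", "summarize-ticket", 0.8, 0.006, 0.87),
    Transaction("model-c", "summarize-ticket", 3.5, 0.002, 0.55),
]

summary = summarize(log)
print(best_model(summary, "summarize-ticket"))
print(find_alerts(summary))
```

With this sample data, the comparison picks the cheapest model that still clears the quality floor for the prompt, and the alert check flags the model whose responses are both slow and poor. In practice the thresholds would be tuned per prompt type, and the records would come from wherever your organization logs its AI calls.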
40 INTELLIGENTCIO NORTH AMERICA www.intelligentcio.com