EDITOR’S QUESTION
Brian Sathianathan, Chief Technology Officer and Co-Founder, Iterate.ai
By now, every business has heard the pitch for LLMs. What’s less obvious is when a smaller, faster, cheaper model might do the job better. CIOs are under intense pressure to figure out whether they need large language models (LLMs), small language models (SLMs) or both.
But the choice isn’t binary and, increasingly, the most effective enterprise AI strategies rely on a combination of the two.
LLMs grab the headlines. Their ability to generate natural-sounding responses across a wide range of prompts makes them attractive for customer-facing use cases and complex reasoning tasks. But SLMs are becoming just as important to business outcomes. These smaller, more targeted models are ideal for faster development, domain-specific use and efficient scaling. Both models have a role to play, and using them together can lead to smarter, more sustainable outcomes.
In its recent coverage of top challenges for CIOs in 2025, Intelligent CIO emphasized the need for proactive, pragmatic decision-making about AI adoption.
As the pressure grows to deliver both innovation and ROI, the ability to match the right tool to the right task will separate the leaders from the laggards. SLMs are particularly compelling for organizations looking to move quickly. Because of their smaller size (often under 10 billion parameters), they train faster, require fewer resources and can run on common CPUs or NPUs rather than high-end GPUs. Some of the most streamlined models, especially those used in retrieval-augmented generation (RAG) systems, weigh in at just 70,000 parameters. That compact profile translates to lower infrastructure costs, less energy consumption and faster time to value.
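Matching the right tool to the right task often starts with a simple routing layer in front of both model tiers. The sketch below is purely illustrative and not from the article: the keyword list, word-count threshold and function names are assumptions, standing in for whatever domain signals and complexity checks an enterprise would actually use to decide when a small model is enough.

```python
# Hypothetical sketch of an SLM/LLM routing heuristic: short,
# domain-specific queries go to a small model; broad or open-ended
# queries escalate to a large one. Thresholds and keywords are
# illustrative assumptions, not production values.

DOMAIN_KEYWORDS = {"invoice", "warranty", "return", "order", "shipping"}

def route_query(query: str) -> str:
    """Return 'slm' or 'llm' based on a simple complexity check."""
    words = [w.strip(".,?!").lower() for w in query.split()]
    # Short queries touching known domain terms suit a small model.
    if len(words) <= 20 and DOMAIN_KEYWORDS.intersection(words):
        return "slm"
    # Everything else (broad, vague, multi-step) goes to the LLM.
    return "llm"

print(route_query("What is the warranty on my order?"))               # slm
print(route_query("Draft a strategy memo comparing our AI options."))  # llm
```

In practice the routing signal might come from a classifier or from the RAG retrieval score rather than keywords, but the principle is the same: reserve expensive LLM calls for queries the small model cannot handle.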
Many enterprises are already leveraging SLMs through internal development efforts or platforms that make small model deployment easier. At Iterate.ai, for example, our team works with organizations using Generate, a private AI platform that supports both large and small model architectures for rapid prototyping and production AI workflows. What we’re seeing firsthand is that many problems CIOs are trying to solve don’t require massive scale. They require speed, context and precision.
That’s not to say LLMs don’t have a place. Their strength lies in general-purpose capabilities and the depth of their training. LLMs can handle broad or poorly defined queries, support richer interactions and adapt more easily to evolving use cases. But that power comes at a cost – and it’s often a steep one.
Running an LLM at enterprise scale requires expensive GPUs, high energy use and ongoing investment to stay current. As major providers compete in the LLM arms race, many enterprises are discovering that mimicking those tactics isn’t sustainable.
I see today’s situation mirroring early cloud adoption patterns. Enterprises jumped in quickly, then realized they needed to rein in costs and rethink long-term strategy. The same course correction is now playing out with GenAI. CIOs who rushed into LLM deployment are now pausing to assess whether their infrastructure can handle the load, whether their applications are differentiated enough to justify the cost and whether better options exist.
www.intelligentcio.com INTELLIGENTCIO NORTH AMERICA 33