
INTELLIGENT BRANDS // Software for Business

DataStax to deliver high-performance RAG solution using NVIDIA Microservices

Cutting-edge collaboration enables enterprises to use DataStax Astra DB with NVIDIA Inference Microservices to create instantaneous vector embeddings to fuel real-time genAI use cases.

DataStax is supporting enterprise retrieval-augmented generation (RAG) use cases by integrating the new NVIDIA NIM inference microservices and NeMo Retriever microservices with Astra DB to deliver high-performance RAG data solutions for superior customer experiences.

With this integration, users can create instantaneous vector embeddings 20x faster than other popular cloud embedding services and benefit from an 80% reduction in cost for the service.
With embedded inferencing built on NVIDIA NeMo and NVIDIA Triton Inference Server software, DataStax Astra DB vector workloads for RAG use cases running on NVIDIA H100 Tensor Core GPUs achieved 9.48 ms latency when embedding and indexing documents, a 20x improvement.
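To make the client-side flow behind these numbers concrete, the minimal sketch below calls a self-hosted NIM endpoint through its OpenAI-compatible embeddings API and indexes the resulting vectors in an Astra DB collection with the astrapy client. The endpoint URL, model name, vector dimension and collection name are illustrative assumptions, not details taken from DataStax or NVIDIA.

```python
# Minimal sketch (assumptions noted): embed text with a locally deployed
# NVIDIA NIM endpoint and index the vectors in DataStax Astra DB via astrapy.
import os

import requests
from astrapy import DataAPIClient

NIM_EMBED_URL = "http://localhost:8000/v1/embeddings"  # assumed local NIM deployment
EMBED_MODEL = "nvidia/nv-embedqa-e5-v5"                 # placeholder model name


def embed(texts: list[str], input_type: str = "passage") -> list[list[float]]:
    """Call the NIM OpenAI-compatible embeddings API; input_type is an NVIDIA extension."""
    resp = requests.post(
        NIM_EMBED_URL,
        json={"model": EMBED_MODEL, "input": texts, "input_type": input_type},
        timeout=30,
    )
    resp.raise_for_status()
    return [item["embedding"] for item in resp.json()["data"]]


# Astra DB credentials come from the Astra console.
client = DataAPIClient(os.environ["ASTRA_DB_APPLICATION_TOKEN"])
db = client.get_database(os.environ["ASTRA_DB_API_ENDPOINT"])
collection = db.create_collection("rag_docs", dimension=1024, metric="cosine")

docs = [
    "Support ticket: login fails after password reset.",
    "Release note: vector search latency improvements.",
]
collection.insert_many(
    [{"text": t, "$vector": v} for t, v in zip(docs, embed(docs))]
)

# Vector search over the indexed documents, as a RAG retriever would do.
query_vector = embed(["Why does login fail after resetting a password?"], "query")[0]
for hit in collection.find(sort={"$vector": query_vector}, limit=3):
    print(hit["text"])
```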
When combined with NVIDIA NeMo Retriever, Astra DB and DataStax Enterprise (DataStax's on-premises offering) provide a fast vector database RAG solution built on a scalable NoSQL database that can run on any storage medium.
Out-of-the-box integration with RAGStack (powered by LangChain and LlamaIndex) makes it easy for developers to replace their existing embedding model with NIM. In addition, using the RAGStack compatibility matrix tester, enterprises can validate the availability and performance of various combinations of embedding and LLM models for common RAG pipelines.
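In a LangChain-based RAGStack pipeline, that swap could look roughly like the sketch below, which assumes the langchain-nvidia-ai-endpoints and langchain-astradb integration packages and a locally hosted NIM embedding service; the model name and base URL are placeholders, not part of the announcement.

```python
# Sketch: point a LangChain/RAGStack pipeline at a NIM-served embedding model
# instead of a cloud embedding service, with Astra DB as the vector store.
# Package, model and endpoint choices are assumptions; check the versions
# pinned by your RAGStack release.
import os

from langchain_astradb import AstraDBVectorStore
from langchain_nvidia_ai_endpoints import NVIDIAEmbeddings

# Embeddings served by a locally deployed NIM container (OpenAI-compatible API).
embeddings = NVIDIAEmbeddings(
    model="nvidia/nv-embedqa-e5-v5",       # placeholder model name
    base_url="http://localhost:8000/v1",   # assumed local NIM endpoint
)

# Astra DB acts as the vector store; credentials come from the Astra console.
vector_store = AstraDBVectorStore(
    embedding=embeddings,
    collection_name="ragstack_docs",
    api_endpoint=os.environ["ASTRA_DB_API_ENDPOINT"],
    token=os.environ["ASTRA_DB_APPLICATION_TOKEN"],
)

vector_store.add_texts(["Order 1234 shipped on May 2.", "Refund policy: 30 days."])
retriever = vector_store.as_retriever(search_kwargs={"k": 3})
print(retriever.invoke("What is the refund window?"))
```

Because only the embeddings object changes, the rest of the pipeline, such as document splitters, retriever and LLM, can stay as it is.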
DataStax is also launching, in developer preview, a new feature called Vectorize, which performs embedding generation at the database tier. This lets customers use Astra DB to generate embeddings with its own NeMo microservices instance rather than their own, passing the cost savings directly to the customer.
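In practical terms, Vectorize moves the embedding call out of application code: documents and queries are sent as plain text and the vectors are computed in the database. Below is a hedged sketch of that usage pattern with astrapy, assuming a collection already configured in Astra DB for server-side embedding; the collection name and the $vectorize field usage are assumptions based on the preview description.

```python
# Sketch of the Vectorize usage pattern: send raw text and let Astra DB
# generate the embeddings at the database tier. Assumes a collection created
# with a server-side embedding provider attached; field names follow the
# Data API's $vectorize convention and should be checked against current docs.
import os

from astrapy import DataAPIClient

client = DataAPIClient(os.environ["ASTRA_DB_APPLICATION_TOKEN"])
db = client.get_database(os.environ["ASTRA_DB_API_ENDPOINT"])
collection = db.get_collection("vectorize_docs")  # pre-configured collection (assumed)

# No client-side embedding call: the text under $vectorize is embedded server-side.
collection.insert_one({
    "summary": "Patient discharge checklist updated for Q2.",
    "$vectorize": "Patient discharge checklist updated for Q2.",
})

# Similarity search also takes raw text; the query vector is computed in-database.
for doc in collection.find(sort={"$vectorize": "discharge procedures"}, limit=3):
    print(doc["summary"])
```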
“In today's dynamic landscape of AI innovation, RAG has emerged as the pivotal differentiator for enterprises building genAI applications with popular large language frameworks,” said Chet Kapoor, chairman and CEO, DataStax.
“With a wealth of unstructured data at their disposal, ranging from software logs to customer chat history, enterprises hold a cache of valuable domain knowledge and real-time insights essential for generative AI applications, but still face challenges.
Integrating NVIDIA NIM into RAGStack cuts down the barriers enterprises are facing to bring them the high-performing RAG solutions they need to make significant strides in their genAI application development.”
“At Skypoint, we have a strict SLA of five seconds to generate responses for our frontline healthcare providers,” said Tisson Mathew, CEO and founder, Skypoint. “Hitting this SLA is especially difficult in the scenario that there are multiple LLM and vector search queries. Being able to shave off time from generating embeddings is of vast importance to improving the user experience.”
“Enterprises are looking to leverage their vast amounts of unstructured data to build more advanced generative AI applications,” said Kari Briski, Vice President of AI software, NVIDIA. “Using the integration of NVIDIA NIM and NeMo Retriever microservices with DataStax Astra DB, businesses can significantly reduce latency and harness the full power of AI-driven data solutions.”