Amazon ElastiCache now supports real-time hybrid search with vector and full-text
Amazon Web Services has announced that ElastiCache now supports real-time hybrid search, combining vector similarity search with full-text search in a single query. The new feature lets applications merge semantic meaning with exact keyword matching without a separate search service, delivering more relevant results than either search method used on its own.

ElastiCache can now handle hybrid searches across billions of embeddings from major AI providers, including Amazon Bedrock, Amazon SageMaker, Anthropic, and OpenAI, with microsecond-level latency and up to 99% recall. The service makes data searchable immediately upon write completion, ensuring applications always search against the most current vectors and text.

Developers can use the hybrid search functionality to build AI agent memory systems and Retrieval-Augmented Generation (RAG) implementations that retrieve relevant context through both exact terms and semantic meaning, potentially improving generative AI responses while reducing token costs. E-commerce and streaming platforms can use the technology to surface relevant product matches whether users search by exact product names or by descriptions.

The hybrid search feature is now available in all commercial AWS Regions, AWS GovCloud (US) Regions, and China Regions for node-based clusters running ElastiCache version 9.0 for Valkey. AWS claims that ElastiCache for Valkey delivers the lowest-latency vector search, with the highest throughput and best price-performance at 95%+ recall, among popular vector databases on the AWS platform.
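As a rough sketch of what a hybrid query can look like: the open-source valkey-search module uses a RediSearch-compatible FT.SEARCH syntax, where a full-text filter prefilters documents before a KNN vector search ranks the survivors. Assuming ElastiCache's hybrid search exposes a similar interface (the index name, field names, and exact syntax here are illustrative assumptions, not confirmed by the announcement), the query string and vector parameter might be composed like this:

```python
import struct

# Hypothetical example: composing a hybrid query in RediSearch-compatible
# syntax. Index/field names ("description", "embedding") are assumptions.

def to_float32_blob(vector):
    """Pack floats into the little-endian float32 blob that vector
    search commands typically expect for query parameters."""
    return struct.pack(f"<{len(vector)}f", *vector)

def hybrid_query(text_filter, knn_field, k):
    """Build a hybrid query: a full-text prefilter combined with a
    KNN vector search over the filtered subset."""
    return f"({text_filter})=>[KNN {k} @{knn_field} $vec AS score]"

query = hybrid_query("@description:wireless headphones", "embedding", 10)
vec_blob = to_float32_blob([0.12, -0.04, 0.33])  # toy 3-dim embedding

# The pieces would then be sent to the server along the lines of:
#   FT.SEARCH products "<query>" PARAMS 2 vec <vec_blob> DIALECT 2
```

The design point the announcement highlights is that both stages run in one query against one service, so the text filter and the vector ranking stay consistent with the latest writes instead of being reconciled across two systems.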
Why It Matters
This announcement represents a significant advancement in real-time search capabilities for enterprise applications, particularly those implementing AI and machine learning workflows. By combining vector and full-text search in a single service, AWS eliminates the complexity of managing separate search infrastructures while enabling more sophisticated semantic search applications. The microsecond-level latency and high recall rates make this particularly valuable for real-time applications like recommendation engines, AI agents, and RAG systems that require both speed and accuracy. This positions AWS to compete more effectively in the growing vector database market while providing existing ElastiCache customers with advanced AI-ready search capabilities without requiring additional services.
This summary is generated using AI analysis of the original press release. Always refer to the original source for complete details.