AWS looks to cut storage costs for LLM embeddings with Amazon S3 Vectors
Wednesday, July 16, 2025, 16:46, by InfoWorld
AWS is previewing a specialized storage offering, Amazon S3 Vectors, that it claims can cut the cost of uploading, storing, and querying vectors by up to 90% compared to using a vector database, a move likely to be of interest to those running generative AI or agentic AI applications in the cloud.
Machine learning models typically represent data as vectors (groups of parameters describing an object), and AI systems use these vector embeddings to efficiently search and reason across different pieces of data. Vectors are typically stored in specialty vector databases, or in databases with vector capabilities, for similarity search and retrieval at scale. AWS is instead proposing that enterprises use a new type of S3 bucket purpose-built for storing and querying vector data via a dedicated set of APIs, which it says eliminates the need to provision infrastructure for a vector database.

Raya Mukherjee, senior analyst at Everest Group, said that Amazon S3, like other cloud-based object storage, is cheaper to run and maintain than a vector database because of differences in structure and hardware requirements, and so will help enterprises simplify architecture, reduce operational overhead, and cut costs. While object storage is designed to handle vast volumes of unstructured data using a flat architecture that minimizes overhead and supports efficient retrieval of individual files, vector databases are engineered for high-performance similarity search across complex, high-dimensional data, and often rely on specialized indexing methods and hardware acceleration that can drive up infrastructure and operational expenses.

Each Amazon S3 Vectors bucket can support up to 10,000 vector indexes, and each index can store tens of millions of vectors, with storage automatically optimized for performance and cost as vectors are written, updated, or deleted, according to AWS. Additionally, AWS has integrated S3 Vectors with Amazon Bedrock Knowledge Bases, Amazon SageMaker Unified Studio, and Amazon OpenSearch Service. This should ensure efficient use of resources even as datasets grow and evolve, Mukherjee said.
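The similarity search that vector stores such as S3 Vectors perform at scale can be illustrated with a minimal in-memory sketch. The document keys and three-dimensional embeddings below are invented for illustration (real embeddings have hundreds or thousands of dimensions), and a production system would issue queries against the vector store rather than scoring vectors in application code:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot(a, b) / (|a| * |b|), higher means more similar.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k(query, index, k=2):
    # Rank every stored vector by similarity to the query, best first.
    scored = [(key, cosine_similarity(query, vec)) for key, vec in index.items()]
    return sorted(scored, key=lambda kv: kv[1], reverse=True)[:k]

# Toy "index" mapping document keys to embeddings (hypothetical data).
index = {
    "doc-a": [0.9, 0.1, 0.0],
    "doc-b": [0.1, 0.9, 0.0],
    "doc-c": [0.8, 0.2, 0.1],
}

results = top_k([1.0, 0.0, 0.0], index, k=2)
print([key for key, _ in results])  # doc-a and doc-c score highest
```

Dedicated vector databases avoid this brute-force scan with approximate nearest-neighbor indexes; the trade-off AWS is pitching is cheaper object storage for workloads that do not need that low-latency path.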
While the integrations with Bedrock Knowledge Bases and SageMaker Unified Studio will help developers build RAG applications, a cost-efficient alternative to fine-tuning LLMs that also reduces hallucinations, the OpenSearch integration gives enterprises the flexibility to store rarely accessed vectors cheaply. When those vectors are needed, developers can dynamically shift them to OpenSearch for real-time, low-latency search, the company said. Enterprises and developers can try out Amazon S3 Vectors, and its integrations with Amazon Bedrock, Amazon OpenSearch Service, and Amazon SageMaker, in the US East (N. Virginia), US East (Ohio), US West (Oregon), Europe (Frankfurt), and Asia Pacific (Sydney) regions.
https://www.infoworld.com/article/4023401/aws-looks-to-cut-storage-costs-for-llm-embeddings-with-ama...