
Amazon Bedrock expands support for Service Quotas

Amazon Web Services has expanded Service Quotas support to include the bedrock-mantle endpoint of Amazon Bedrock, its fully managed generative AI service. Customers can now monitor inference quotas for the bedrock-mantle endpoint in the AWS Service Quotas console, with the same visibility and management capabilities they already have for the bedrock-runtime endpoint and other AWS services. The bedrock-mantle endpoint supports the OpenAI Responses API, the OpenAI Chat Completions API, and the Anthropic Messages API, so customers can run existing OpenAI or Anthropic applications on Amazon Bedrock with minimal code changes. For supported models on this endpoint, Service Quotas now exposes per-model input-tokens-per-minute and output-tokens-per-minute quotas, showing organizations their usage limits and informing capacity planning. The feature is available in all AWS Regions where the bedrock-mantle endpoint operates, including Regions in US East, US West, Asia Pacific, Europe, and South America. Customers can view their quotas in the Service Quotas console and follow the standard Amazon Bedrock process to request limit increases when production workloads need to scale.
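As a rough illustration of reading these quotas programmatically instead of through the console, the Python sketch below lists quotas with the boto3 Service Quotas client and keeps those whose names mention tokens per minute. It rests on assumptions that the announcement does not confirm: that the bedrock-mantle quotas appear under the `bedrock` service code, and that their names contain the phrase "tokens per minute". The helper names `fetch_quotas` and `filter_token_quotas` are illustrative only.

```python
from typing import Dict, Iterable, Iterator, List, Optional


def filter_token_quotas(
    quotas: Iterable[Dict], keyword: str = "tokens per minute"
) -> List[Dict]:
    """Keep quotas whose QuotaName contains the keyword (case-insensitive).

    The default keyword is an assumption about how the quotas are named.
    """
    needle = keyword.lower()
    return [q for q in quotas if needle in q.get("QuotaName", "").lower()]


def fetch_quotas(
    service_code: str = "bedrock", region_name: Optional[str] = None
) -> Iterator[Dict]:
    """Yield every quota that Service Quotas reports for one service code.

    Assumes the bedrock-mantle quotas live under the "bedrock" service code.
    Requires boto3 and valid AWS credentials.
    """
    import boto3  # imported lazily so filter_token_quotas works without AWS access

    client = boto3.client("service-quotas", region_name=region_name)
    paginator = client.get_paginator("list_service_quotas")
    for page in paginator.paginate(ServiceCode=service_code):
        yield from page["Quotas"]


def print_token_quotas(region_name: Optional[str] = None) -> None:
    """Print the name and current value of each matching quota."""
    for quota in filter_token_quotas(fetch_quotas(region_name=region_name)):
        print(f'{quota["QuotaName"]}: {quota["Value"]}')
```

Calling `print_token_quotas(region_name="us-east-1")` with credentials configured would print whatever quotas match in that Region. If the actual quota names differ from the assumed phrase, pass a different `keyword` to `filter_token_quotas`.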

Why It Matters

This enhancement addresses an operational need for enterprises deploying generative AI applications at scale. With per-model tokens-per-minute quotas visible in a centralized console, organizations can plan capacity, anticipate throttling before it affects users, and request increases ahead of demand. Because the quotas appear in the existing AWS Service Quotas workflow, teams already familiar with AWS resource management need no new tooling, which makes enterprise AI deployments more predictable and manageable.
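To make the capacity-planning point concrete, the sketch below estimates how many requests per minute a workload could sustain under a pair of input and output token quotas. It is a simplified model, assuming each quota is consumed by a plain sum of tokens and that requests are uniform in size; the way Amazon Bedrock actually counts tokens against quotas may differ. The function name and all figures are hypothetical.

```python
def max_requests_per_minute(
    input_tpm: float,
    output_tpm: float,
    avg_input_tokens: float,
    avg_output_tokens: float,
) -> float:
    """Estimate sustainable requests per minute under two token quotas.

    The tighter of the two quotas determines the ceiling.
    """
    if avg_input_tokens <= 0 or avg_output_tokens <= 0:
        raise ValueError("average token counts must be positive")
    input_bound = input_tpm / avg_input_tokens      # limit from the input quota
    output_bound = output_tpm / avg_output_tokens   # limit from the output quota
    return min(input_bound, output_bound)


# Hypothetical figures: 200,000 input TPM, 50,000 output TPM,
# 1,500 input tokens and 500 output tokens per request on average.
estimate = max_requests_per_minute(200_000, 50_000, 1_500, 500)
print(estimate)  # 100.0 (the output quota is the binding limit here)
```

In this example the input quota alone would allow roughly 133 requests per minute, so a limit increase on output tokens would be the one to request first.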

Note

This summary is generated using AI analysis of the original press release. Always refer to the original source for complete details.