Key takeaways

  • Policy in Amazon Bedrock AgentCore actively blocks unauthorized agent actions through real-time, deterministic controls that operate outside of the agent code.
  • AgentCore Evaluations helps developers continuously inspect the quality of an agent based on its behavior.
  • AgentCore Memory introduces episodic functionality that helps agents learn from experiences, improving decision-making.
  • Organizations of all sizes and with varying regulatory requirements, including Amazon Devices Operations & Supply Chain, Archera.ai, Cohere Health, Cox Automotive, Druva, Heroku, Natera, NTT Data, MongoDB, PGA TOUR, Pulumi, Thomson Reuters, Workday, Snorkel.ai, Swisscom, and S&P Global Market Intelligence, trust AgentCore to move their AI agents into production faster.

Today, we announced new innovations in Amazon Bedrock AgentCore, the most advanced platform for building and deploying agents securely at scale. Policy in AgentCore allows teams to set boundaries on what agents can do with tools, and AgentCore Evaluations helps teams understand how their agents will perform in the real world. Additionally, we launched an enhanced memory capability that enables agents to learn from experience and improve over time, providing more tailored insights to customers.

Develop enterprise AI agents that know their power and their limits

While the ability for agents to reason and act autonomously makes them powerful, organizations must establish robust controls to prevent unauthorized data access, inappropriate interactions, and system-level mistakes that could impact business operations. Even with careful prompting, agents make real-world mistakes that can have serious consequences.
Today, we are launching Policy in Amazon Bedrock AgentCore, which helps organizations set clear boundaries for agent actions. Using natural language, teams can now define which tools and data agents can access, what actions they can perform, and under what conditions. These tools could be APIs, Lambda functions, MCP servers, or popular third-party services like Salesforce and Slack. To keep agents fast and responsive, Policy is integrated into AgentCore Gateway, which checks each agent action against policies in milliseconds. This ensures agents stay within defined boundaries while operating autonomously. Natural language-based policy authoring makes fine-grained policies more accessible, because customers describe rules in plain language instead of writing formal policy code. For example, a simple policy like “Block all refunds from customers when the reimbursement amount is greater than $1,000” can be implemented and enforced consistently, following Amazon's “trust, but verify” principle. Agents keep their autonomy, and the organization keeps appropriate oversight.
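To make the refund example concrete, here is a minimal Python sketch of a deterministic check that runs outside the agent's own code. It is not the AgentCore Policy API or its policy language: `ToolCall`, `Policy`, `is_allowed`, the tool name `issue_refund`, and the `amount` argument are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class ToolCall:
    """A tool invocation an agent wants to make (hypothetical shape)."""
    tool: str
    args: Dict[str, Any]


@dataclass(frozen=True)
class Policy:
    """A named rule whose `blocks` function returns True for forbidden calls."""
    name: str
    blocks: Callable[[ToolCall], bool]


# Illustrative encoding of the rule quoted above: block refunds over $1,000.
# The tool name and argument key are assumptions, not real AgentCore names.
refund_limit = Policy(
    name="refund-over-1000",
    blocks=lambda call: call.tool == "issue_refund"
    and call.args.get("amount", 0) > 1000,
)


def is_allowed(call: ToolCall, policies: List[Policy]) -> bool:
    """Deterministic gate: a call passes only if no policy blocks it."""
    return not any(policy.blocks(call) for policy in policies)
```

Under these assumptions, `is_allowed(ToolCall("issue_refund", {"amount": 1500}), [refund_limit])` evaluates to `False`, while the same call with an amount of 250 evaluates to `True`. Because the check sits between the agent and the tool, the result does not depend on how the agent was prompted.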
Druva is a leading provider of data security solutions. “Typically, customers can spend hours manually checking logs across dozens of systems when data backups fail,” said David Gildea, vice president of product AI at Druva. “However, with our AI agents, they can get instant analysis and step-by-step remediation for data recovery. We are excited to get started with Policy in AgentCore as it will help our customers set clear boundaries for agent access to internal tools and data like backup systems, security logs, and monitoring dashboards. With appropriate policies in place, our developers can innovate confidently, knowing agents will stay within defined compliance boundaries. This enables us to expand our agent platform while maintaining the strict security standards our enterprise customers expect.”

Gain complete visibility into AI agent behavior and results

Unlike measuring traditional software metrics, evaluating AI agent quality requires complex data science pipelines, subjective assessments, and continuous real-time monitoring. The challenge compounds with each agent update or model change.
AgentCore Evaluations removes the need to build and manage that evaluation infrastructure by providing 13 pre-built evaluators for common quality dimensions such as correctness, helpfulness, tool selection accuracy, safety, goal success rate, and context relevance. Developers can also write their own custom evaluators using their preferred LLMs and prompts. Previously, building the evaluation systems alone required months of data science work. The new service continuously samples live agent interactions and analyzes agent behavior against predefined criteria like correctness, helpfulness, and safety. Development teams can set up alerts for proactive quality monitoring, using evaluations both during testing and in production. For example, if a customer service agent's satisfaction scores drop by 10% over eight hours, the system triggers immediate alerts, enabling a swift response before customer experience is impacted.
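The alerting example can be sketched in a few lines of Python. This is an illustration of the idea only, not how AgentCore Evaluations computes or triggers alerts: `ScoreSample`, `relative_drop`, and `should_alert` are hypothetical names, and comparing the latest eight-hour window with the eight hours before it is one assumed reading of "drop by 10% over eight hours".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional, Sequence


@dataclass(frozen=True)
class ScoreSample:
    """One sampled evaluation score for a live interaction (hypothetical)."""
    timestamp: datetime
    score: float


def relative_drop(
    samples: Sequence[ScoreSample],
    now: datetime,
    window: timedelta = timedelta(hours=8),
) -> Optional[float]:
    """Fractional drop of the mean score in the latest window
    compared with the window immediately before it."""
    recent = [s.score for s in samples if now - window <= s.timestamp <= now]
    previous = [
        s.score
        for s in samples
        if now - 2 * window <= s.timestamp < now - window
    ]
    if not recent or not previous:
        return None  # not enough data to compare
    baseline = sum(previous) / len(previous)
    if baseline <= 0:
        return None  # a relative drop is undefined for a non-positive baseline
    return (baseline - sum(recent) / len(recent)) / baseline


def should_alert(
    samples: Sequence[ScoreSample], now: datetime, threshold: float = 0.10
) -> bool:
    """True when the mean score fell by at least `threshold` (10% by default)."""
    drop = relative_drop(samples, now)
    return drop is not None and drop >= threshold
```

A production system would also need to decide how many samples make a window trustworthy; this sketch alerts on any non-empty pair of windows.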
Natera is a leader in genetic testing and diagnostics. “At Natera, we're transforming oncology patient care through AI agents,” said Mirko Buholzer, software engineering lead, Natera. “Our teams are currently undertaking a substantial effort to uphold consistent quality and performance across our AI agents while meeting strict health care compliance standards. AgentCore Evaluations will play a key role in this work by continuously monitoring our agents' performance by using essential metrics such as accuracy, helpfulness, and patient satisfaction. We expect this real-time quality intelligence to help us quickly identify and address issues preemptively. With AgentCore Evaluations, we aim to confidently deploy reliable agents that maintain our high standards and support the delivery of transformative patient care at scale.”

Build agents that get smarter with every interaction

Most AI agents today lack critical memory capabilities because “memory” is often limited to a short-term context window that is reset with each new interaction, preventing them from learning from past successes or failures in production environments.
AgentCore Memory provides this critical feature, allowing an agent to build a coherent understanding of users over time. Today, a new episodic functionality in AgentCore Memory becomes generally available, allowing agents to learn from past experiences and apply those insights to future interactions. Experiences are stored as structured episodes that capture context, reasoning, actions, and outcomes, and another agent automatically analyzes patterns across them to improve decision-making. When agents encounter similar tasks, they can quickly access relevant historical data, reducing processing time and eliminating the need for extensive custom instructions. For example, an agent books airport transportation 45 minutes before the flight when you are traveling alone. Three months later, when you are traveling to the same destination, this time with kids, it automatically schedules pickup two hours early, remembering previous family trip challenges. This targeted learning approach helps agents make more consistent decisions based on actual performance data rather than relying on predetermined guidelines.
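As a rough illustration of what a structured episode and its retrieval might look like, consider the Python sketch below. It is not the AgentCore Memory API: the `Episode` fields simply mirror the four elements named above, while the context tags, the `recall` function, and the use of Jaccard overlap to find similar episodes are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Sequence


@dataclass(frozen=True)
class Episode:
    """One past experience: context, reasoning, action, outcome (hypothetical)."""
    context: FrozenSet[str]  # tags describing the situation
    reasoning: str
    action: str
    outcome: str


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Overlap between two tag sets, from 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recall(
    episodes: Sequence[Episode], context: FrozenSet[str], top_k: int = 1
) -> List[Episode]:
    """Return up to `top_k` past episodes most similar to the current context,
    skipping episodes that share no tags with it."""
    ranked = sorted(
        episodes, key=lambda e: jaccard(e.context, context), reverse=True
    )
    return [e for e in ranked[:top_k] if jaccard(e.context, context) > 0]
```

In the travel example, a query tagged with an airport transfer and traveling with kids would rank a stored family-trip episode above a solo-trip one, so the agent can take that episode's outcome into account before choosing a pickup time.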
S&P Global Market Intelligence provides insights and leading data and technology solutions to institutional investors, banks, and corporations. “We recently developed Astra, an internal general-purpose agentic workflow platform, but faced challenges orchestrating complex multi-agent workflows across our distributed organization,” said Astier Helen, head of Technology, MI Enterprise Technology and Sustainability, at S&P Global Market Intelligence. “As hundreds of specialized agents emerged, managing state and maintaining consistent context became increasingly difficult, highlighting the need for a unified memory layer. Amazon Bedrock AgentCore Memory provided the solution through seamless, centralized state checkpointing across our multi-agent orchestration stack. With the new episodic memory functionality, our agents will learn from prior analyses to generate more intelligent insights. Previously, deploying agents onto the Astra platform took weeks. Now, with AgentCore we can create and deploy an agent or MCP server within minutes.”
Today’s innovations give you purpose-built agent infrastructure that lets you focus on innovation rather than building AI foundations.