AWS takes aim at the PoC-to-production gap holding back enterprise AI

Monday, December 8, 2025, 19:58, by InfoWorld
Enterprises are testing AI in all sorts of applications, but too few of their proofs of concept (PoCs) are making it into production: just 12%, according to an IDC study.

Amazon Web Services is concerned about this too, with VP of agentic AI Swami Sivasubramanian devoting much of his keynote speech to it at AWS re:Invent last week.

The failures are not down to a lack of talent or investment, but to how organizations plan and build their PoCs, he said: “Most experiments and PoCs are not designed to be production ready.”

Production workloads, for one, require development teams to deploy not just a handful of agent instances, but often hundreds or thousands of them simultaneously — each performing coordinated tasks, passing context between one another, and interacting with a sprawling web of enterprise systems.

This is a far cry from most PoCs, which might be built around a single agent executing a narrow workflow.

Another hurdle, according to Sivasubramanian, is the complexity that agents in production workloads must contend with, including “a massive amount of data and edge cases”.  

This is unlike PoCs, which operate in artificially clean environments and run on sanitized datasets with handcrafted prompts and predictable inputs, all of which hide the realities of live data, such as inconsistent formats, missing fields, conflicting records, and unexpected behaviours.

Then there’s identity and access management. A prototype might get by with a single over-permissioned test account. Production can’t.

“In production, you need rock-solid identity and access management to authenticate users, authorize which tools agents can access on their behalf, and manage these credentials across AWS and third-party services,” Sivasubramanian said.

Even if those hurdles are cleared, integrating agents into production workloads remains a key challenge.

“And then of course as you move to production, your agent is not going to live in isolation. It will be part of a wider system, one that can’t fall apart if an integration breaks,” Sivasubramanian said.

Typically, in a PoC, engineers can manually wire data flows, push inputs, and dump outputs to a file or a test interface. If something breaks, they reboot it and move on. That workflow collapses under production conditions: Agents become part of a larger, interdependent system that cannot fall apart every time an integration hiccups.
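One common way to keep an agent step from collapsing when an integration hiccups is to retry the call a bounded number of times and then degrade to a fallback. The sketch below is illustrative only and is not an AWS API: `call_with_fallback`, `primary`, and `fallback` are hypothetical names, and the retry policy is an assumption.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_fallback(
    primary: Callable[[], T],
    fallback: Callable[[], T],
    retries: int = 2,
    delay_s: float = 0.0,
) -> T:
    """Try `primary` up to retries + 1 times, then use `fallback`.

    Hypothetical helper: a production system would also want logging,
    narrower exception handling, and probably a circuit breaker.
    """
    for attempt in range(retries + 1):
        try:
            return primary()
        except Exception:
            if attempt < retries:
                # Exponential backoff between attempts (no wait when delay_s is 0).
                time.sleep(delay_s * (2 ** attempt))
    # Every attempt failed: degrade gracefully instead of crashing the agent.
    return fallback()
```

An agent step might pass a live lookup as `primary` and a cached or default answer as `fallback`, so a flaky downstream system lowers answer quality rather than halting the whole workflow.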

Moving from PoC to production

Yet Sivasubramanian argued that the gulf between PoC and production can be narrowed.

In his view, enterprises can close the gap by equipping teams with tooling that bakes production readiness into the development process itself, focusing on agility while still being accurate and reliable.

To help teams build agentic systems quickly without sacrificing accuracy, AWS added an episodic memory feature to Bedrock AgentCore, which lifts the burden of building custom memory scaffolding off developers.

Instead of expecting teams to stitch together their own vector stores, summarization logic, and retrieval layers, the managed module automatically captures interaction traces, compresses them into reusable “episodes,” and brings forward the right context as agents work through new tasks.
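To make the capture, compress, and recall loop concrete, here is a toy sketch of the idea. It is not the Bedrock AgentCore API: the `Episode` and `EpisodicMemory` classes are hypothetical, the "compression" simply keeps the first and last steps of a trace, and retrieval uses plain word overlap where a managed service would presumably use model-based summarization and semantic search.

```python
from dataclasses import dataclass, field


@dataclass
class Episode:
    task: str      # what the agent was asked to do
    summary: str   # compressed form of the interaction trace


@dataclass
class EpisodicMemory:
    episodes: list[Episode] = field(default_factory=list)

    def record(self, task: str, trace: list[str]) -> None:
        """Capture a trace and compress it into an episode."""
        # Stand-in compression: keep only the first and last steps.
        if len(trace) > 1:
            summary = f"{trace[0]} ... {trace[-1]}"
        else:
            summary = "".join(trace)
        self.episodes.append(Episode(task, summary))

    def recall(self, new_task: str, k: int = 1) -> list[Episode]:
        """Return up to k past episodes whose task shares words with new_task."""
        words = set(new_task.lower().split())
        scored = [
            (len(words & set(e.task.lower().split())), e) for e in self.episodes
        ]
        relevant = [pair for pair in scored if pair[0] > 0]
        relevant.sort(key=lambda pair: pair[0], reverse=True)
        return [episode for _, episode in relevant[:k]]
```

Even in this toy form, the quality of what comes back depends entirely on what was recorded and how it was described.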

In a similar vein, Sivasubramanian also announced the serverless model customization capability in SageMaker AI to help developers automate data prep, training, evaluation, and deployment.

This automation, according to Scott Wheeler, cloud practice leader at AI and data consultancy firm Asperitas, will remove the heavy infrastructure and MLOps overhead that often stalls fine-tuning efforts, accelerating agentic systems deployment.

The push toward reducing MLOps didn’t stop there. Sivasubramanian said that AWS is adding Reinforcement Fine-Tuning (RFT) in Bedrock, enabling developers to shape model behaviour using an automated reinforcement learning (RL) stack.

Wheeler welcomed this, saying it will remove most of the complexity of building an RL stack, including infrastructure, math, and training pipelines.

SageMaker HyperPod also gained checkpointless training, which enables developers to accelerate the model training process.

To address reliability, Sivasubramanian said that AWS is adding Policy and Evaluations capabilities to Bedrock AgentCore’s Gateway. While Policy will help developers enforce guardrails by intercepting tool calls, Evaluations will help developers simulate real-world agent behavior to catch issues before deployment.
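The general pattern of enforcing guardrails by intercepting tool calls can be sketched in a few lines. This is an assumption-laden illustration, not AgentCore's Policy interface: `guarded`, `PolicyViolation`, and the refund rule are hypothetical names invented for the example.

```python
from typing import Any, Callable

# A rule sees the tool name and its arguments and returns True to allow the call.
Rule = Callable[[str, dict], bool]


class PolicyViolation(Exception):
    """Raised when a tool call is blocked by a rule."""


def guarded(tool_name: str, tool: Callable[..., Any], rules: list[Rule]) -> Callable[..., Any]:
    """Wrap `tool` so every call is checked against `rules` before it runs."""
    def wrapper(**kwargs: Any) -> Any:
        for rule in rules:
            if not rule(tool_name, kwargs):
                raise PolicyViolation(f"{tool_name} blocked with arguments {kwargs}")
        return tool(**kwargs)
    return wrapper


# Hypothetical tool and rule, for illustration only.
def issue_refund(amount: float) -> str:
    return f"refunded {amount:.2f}"


def refund_cap(tool_name: str, args: dict) -> bool:
    # Block refunds above 100; allow every other tool call.
    return tool_name != "issue_refund" or args.get("amount", 0) <= 100


safe_refund = guarded("issue_refund", issue_refund, [refund_cap])
```

Because the check sits between the agent and the tool, the rule holds regardless of what the model decides to request.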

Challenges remain

However, analysts warn that operationalizing autonomous agents remains far from frictionless.

Episodic memory, though a conceptually important feature, is not magic, said David Linthicum, independent consultant and retired chief cloud strategy officer at Deloitte. “Its impact is proportional to how well enterprises capture, label, and govern behavioural data. That’s the real bottleneck.”

“Without serious data engineering and telemetry work, it risks becoming sophisticated shelfware,” Linthicum said.

He also found fault with RFT in Bedrock, saying that though the feature tries to abstract complexity from RL workflows, it doesn’t remove the most complex parts of the process, such as defining rewards that reflect business value, building robust evaluation, and managing drift.

“That’s where PoCs usually die,” he said.

It is a similar story with the model customization capability in SageMaker AI.

Although it collapses MLOps complexity, it amplified Linthicum’s and Wheeler’s concerns in other areas.

“Now that you have automated not just inference, but design choices, data synthesis, and evaluation, governance teams will demand line-of-sight into what was tuned, which data was generated, and why a given model was selected,” Linthicum said.

Wheeler said that industry sectors with strict regulatory expectations will probably treat the capability as an assistive tool that still requires human review, not a set-and-forget automation: “In short, the value is real, but trust and auditability, not automation, will determine adoption speed.”
https://www.infoworld.com/article/4102632/aws-takes-aim-at-the-poc-to-production-gap-holding-back-en...
