Agentic Commerce in Production: What Teams Miss

By

Komy A.

June 25, 2026

9 min read

Agentic Commerce in Production: What Enterprise Teams Don't See Coming

The Moment Is Real

McKinsey published its agentic commerce report in October 2025. Stripe launched its Agentic Commerce Suite shortly after. Mastercard, Visa, PayPal, Google Cloud, and IBM have all planted their flags. The category is real, the money is moving, and the early enterprise pilots are turning into serious build programs.

The pitch is straightforward: AI agents browse, compare, select, and buy on behalf of customers or business users. An employee types a request like "reorder our standard office supplies when we drop below 30 days of inventory" and an agent handles it end to end. A consumer says "find me the cheapest flight to Singapore that departs before 8am" and an agent books it. No more twelve-tab product comparisons. No more manual purchase approvals for commodity spend.

Most enterprise teams we talk to have seen a demo that works beautifully. And they want to know why their own build is harder than it looked.

What Agentic Commerce Actually Requires

Agentic commerce is not a chatbot with a buy button. The distinction matters because most of the production failures we see come from teams who architected it like one.

A shopping chatbot takes input, retrieves information, and presents options. A human clicks buy. The agent is advisory. An agentic commerce system is goal-directed. It receives an intent, breaks it into sub-tasks, executes those sub-tasks across external systems, handles failures mid-stream, and completes the transaction without waiting for a human at each step.

That shift from advisory to autonomous changes every technical requirement. Authentication, session management, error handling, audit logging, fallback behavior, spend controls. None of these are optional. All of them are significantly harder than they appear in a prototype.

The Trust Architecture Problem

In a standard checkout, a human authenticates once, confirms their intent explicitly, and the platform executes. The liability model is clear. In agentic commerce, an agent executes autonomously. Which raises an immediate question: how does the merchant's system know the agent is acting with real authorization, and how does the agent prove it is not being spoofed mid-session?

This is the hardest problem most POCs skip over. Stripe's Agentic Commerce spec begins to address this with agent identity tokens. Mastercard's Agent Pay framework approaches it from the payment rails side. But neither framework covers the full stack of what needs to happen inside your own system.

You need a delegation model. The human user grants the agent a scoped permission set. Those permissions need to be stored, validated at execution time, and revocable. You need a way to pass that authorization context across system boundaries without leaking credentials. And you need audit logs that reconstruct exactly what the agent did, in what order, and under whose authorization, so you can answer the inevitable question when something goes wrong.
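As a rough illustration of the delegation model described above, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `Grant` and `DelegationStore` names, the scope strings such as `cart:write`, and the in-memory storage. A real deployment would persist grants and audit entries in durable storage and would probably sign the authorization context it passes between systems.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Grant:
    """A scoped, revocable permission set a user delegates to one agent (hypothetical shape)."""
    grant_id: str
    user_id: str
    agent_id: str
    scopes: frozenset
    expires_at: datetime
    revoked: bool = False


@dataclass
class DelegationStore:
    """In-memory stand-in for a durable grant store plus audit log."""
    grants: dict = field(default_factory=dict)
    audit_log: list = field(default_factory=list)

    def issue(self, grant: Grant) -> None:
        self.grants[grant.grant_id] = grant

    def revoke(self, grant_id: str) -> None:
        self.grants[grant_id].revoked = True

    def authorize(self, grant_id: str, agent_id: str, action: str,
                  now: Optional[datetime] = None) -> bool:
        """Validate the grant at execution time and record the decision."""
        now = now or datetime.now(timezone.utc)
        grant = self.grants.get(grant_id)
        allowed = (
            grant is not None
            and not grant.revoked
            and grant.agent_id == agent_id
            and action in grant.scopes
            and now < grant.expires_at
        )
        # Every decision is logged, allowed or not, so the sequence of
        # agent actions and the authorizing user can be reconstructed later.
        self.audit_log.append({
            "at": now.isoformat(),
            "grant_id": grant_id,
            "user_id": grant.user_id if grant else None,
            "agent_id": agent_id,
            "action": action,
            "allowed": allowed,
        })
        return allowed
```

The point of the sketch is that the agent never holds the user's credentials. It holds a grant identifier, and each action is checked against scope, expiry, and revocation at the moment it executes.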

Skipping this and using shared API keys or broad OAuth scopes is how demos ship and production incidents happen. A POC works beautifully in a sandboxed environment, gets promoted to a real payment environment, and the first production failure triggers an investigation that reveals no meaningful audit trail and credentials that were far too permissive. We have seen this exact pattern more than once.

Session State and the Multi-Step Execution Problem

A typical agentic commerce flow is not a single API call. It looks more like this: interpret intent, search catalog, filter by availability and price, check loyalty account balance, apply eligible promotions, construct cart, validate shipping address, confirm payment method is authorized for this merchant category, submit order, confirm receipt, update internal tracking system.

That is eight to twelve discrete operations, some of which depend on the output of earlier ones, and some of which call external systems that have their own latency profiles, rate limits, and failure modes. In a demo, all of these systems are stubbed or use sample data. In production, each one is a real dependency.

The session state problem is that you need to handle partial completion. What happens when the agent completes steps one through six and then the payment system returns a timeout? Does it retry? Does it roll back? Does it notify the human? Does it wait and try again later? Each of these paths needs to be explicitly designed and explicitly tested. Most POC architectures have no answer for this because the happy path always completes.

This is where production-grade agentic systems diverge sharply from prototypes. The orchestration layer needs to be stateful, resumable, and idempotent. Workflows that cannot be safely retried at any step become reliability liabilities the moment real transaction volumes hit them.
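One way to picture "stateful, resumable, and idempotent" is the sketch below. It is a simplified illustration, not a production orchestrator: `WorkflowState`, `TransientError`, and `run_workflow` are hypothetical names, state lives in memory rather than in a durable store, and it assumes each downstream system honors an idempotency key so a repeated call does not create a second order or charge.

```python
from dataclasses import dataclass, field


@dataclass
class WorkflowState:
    """Progress of one purchase flow; would be persisted in a real system."""
    workflow_id: str
    results: dict = field(default_factory=dict)  # completed step name -> result


class TransientError(Exception):
    """A failure worth retrying, such as a timeout from a payment system."""


def run_workflow(state, steps, max_attempts=3):
    """Run steps in order, skipping completed ones. Returns True when all finish.

    `steps` is a list of (name, callable) pairs. Each callable receives the
    state and an idempotency key that stays the same across retries and resumes.
    """
    for name, step in steps:
        if name in state.results:
            continue  # finished in an earlier run; do not execute it again
        idempotency_key = f"{state.workflow_id}:{name}"
        for attempt in range(1, max_attempts + 1):
            try:
                state.results[name] = step(state, idempotency_key)
                break
            except TransientError:
                if attempt == max_attempts:
                    # Pause instead of failing: completed steps stay recorded,
                    # so a later call resumes from this step.
                    return False
    return True
```

If the payment step times out three times, `run_workflow` returns `False` with the cart step already recorded. Calling it again with the same state skips the cart and retries payment under the same idempotency key. Whether to retry, roll back, or notify a human at that point is still a design decision the sketch leaves open.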

Catalog and Inventory Synchronization

This one surprises teams more than it should. An agentic commerce system is only as good as its understanding of what is actually available, at what price, right now. In demo environments, that data is static and curated. In production, it changes constantly.

Consider a retailer with 500,000 SKUs across multiple warehouse locations, third-party sellers, and a mix of in-stock and made-to-order items. An agent that orders a product that is actually out of stock, or quotes a price that expired three minutes ago, or misses a regional availability constraint creates downstream problems that are harder to fix than they would have been to prevent: returns, customer service escalations, manual reconciliation.

The architecture decision here is whether your agent queries inventory at decision time or relies on a synchronization layer that keeps an up-to-date local representation. Neither approach is universally correct. Query-at-decision-time is accurate but slow and expensive under load. Synchronized local state is fast but introduces staleness risk. The right answer depends on your catalog volatility, your transaction volume, and your tolerance for stale data. That decision needs to be made explicitly rather than falling out of whichever was easiest to build first.
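The trade-off can also be made explicit in code rather than left implicit. The sketch below is one possible middle ground, under assumptions of our own: a hypothetical `InventoryView` serves a local snapshot while it is younger than a configured maximum age and falls back to a live query otherwise, with a `require_fresh` flag for the moment just before an order is submitted. The `fetch_live` callable stands in for whatever inventory API you actually have.

```python
import time
from dataclasses import dataclass


@dataclass
class Snapshot:
    sku: str
    quantity: int
    price_cents: int
    fetched_at: float  # reading of the injected clock when the data was fetched


class InventoryView:
    """Serve cached inventory within a staleness budget, else query live."""

    def __init__(self, fetch_live, max_age_seconds, clock=time.monotonic):
        self._fetch_live = fetch_live      # hypothetical: sku -> (quantity, price_cents)
        self._max_age = max_age_seconds    # explicit tolerance for stale data
        self._clock = clock
        self._cache = {}

    def get(self, sku, require_fresh=False):
        cached = self._cache.get(sku)
        fresh_enough = (
            cached is not None
            and self._clock() - cached.fetched_at <= self._max_age
        )
        if fresh_enough and not require_fresh:
            return cached
        quantity, price_cents = self._fetch_live(sku)
        snapshot = Snapshot(sku, quantity, price_cents, self._clock())
        self._cache[sku] = snapshot
        return snapshot
```

Browsing and comparison can tolerate the cached view, while the final check before submitting an order can pass `require_fresh=True`. The value of `max_age_seconds` is where catalog volatility and tolerance for stale data become a number someone has to choose.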

Spend Controls and Authorization Limits

Enterprise procurement teams learned this lesson decades ago with corporate cards: autonomous spend without controls is a governance problem. Agentic commerce relearns this lesson at the AI layer.

Production agentic commerce deployments for enterprise need hard spend limits at the agent level. Not soft guardrails that the agent is prompted to respect. Hard limits enforced at the execution layer that cannot be overridden by prompt injection or by an unusually creative interpretation of a user's request.

They also need approval workflows for transactions above a threshold, merchant category controls so that an agent authorized for office supply purchases cannot be manipulated into making purchases outside that category, and real-time alerting when the agent approaches or hits those limits, not batch reporting that surfaces the information hours later.
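To make the difference between a prompted guardrail and an enforced limit concrete, here is a minimal sketch of a policy check that sits in the execution layer, outside the model. The `SpendPolicy` fields, the category strings, and the `evaluate` function are illustrative assumptions, not part of any framework mentioned in this article. Amounts are in integer cents to avoid floating-point rounding.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    NEEDS_APPROVAL = "needs_approval"
    DENY = "deny"


@dataclass(frozen=True)
class SpendPolicy:
    per_txn_limit_cents: int
    daily_limit_cents: int
    approval_threshold_cents: int
    allowed_categories: frozenset


def evaluate(policy, amount_cents, category, spent_today_cents):
    """Decide on a proposed purchase using only structured inputs.

    Nothing here reads the prompt or the agent's reasoning, so a manipulated
    prompt cannot change the outcome.
    """
    if amount_cents <= 0:
        return Decision.DENY
    if category not in policy.allowed_categories:
        return Decision.DENY
    if amount_cents > policy.per_txn_limit_cents:
        return Decision.DENY
    if spent_today_cents + amount_cents > policy.daily_limit_cents:
        return Decision.DENY
    if amount_cents > policy.approval_threshold_cents:
        return Decision.NEEDS_APPROVAL
    return Decision.ALLOW
```

The payment call would only be made when `evaluate` returns `Decision.ALLOW`, with `NEEDS_APPROVAL` routed to a human. Alerting as the agent approaches a limit could hang off the same check, since it already knows the running total.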

Most of this is engineering work that has nothing to do with AI. It is the same controls infrastructure that governs any automated spend system. But teams building agentic commerce for the first time often treat it as an afterthought because the demo worked without it.

What the Agentic Commerce Protocol Actually Gives You

The Agentic Commerce Protocol, developed by Stripe and OpenAI, is genuinely useful. It standardizes how agents authenticate to merchant systems, how they express purchase intent, and how merchants signal what agent interactions they support. If you are building a merchant-side integration, implementing this protocol means you can accept purchases from any standards-compliant shopping agent without bespoke integrations for each one.

What it does not give you is an orchestration framework, a session state management layer, a delegation model for your internal authorization system, a spend controls implementation, or a fallback strategy for partial execution failures. It is a communication protocol, not an end-to-end system architecture.

Teams that conflate "we support the protocol" with "we have shipped agentic commerce" are in for a difficult few months.

The Integration Reality

Agentic commerce does not exist in isolation. In any real enterprise deployment, the agent needs to connect to some combination of: an ERP, a warehouse management system, a product information manager, a loyalty platform, a payment processor, an identity and authorization service, and a CRM. Some of these integrations will have well-documented APIs. Others will be legacy systems with SOAP interfaces, undocumented rate limits, and authentication methods that predate OAuth by a decade.

The agent architecture has to absorb all of this. Each integration point is a potential failure mode, a data consistency problem, and a latency contributor. The system's overall reliability is bounded by the least reliable integration in the chain. Building for this reality is one of the core engineering challenges that separates a stable enterprise deployment from a pilot that worked in a controlled environment.
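A quick back-of-the-envelope calculation shows why the weakest integration sets the ceiling. The sketch assumes every integration sits on the critical path of a transaction and that failures are independent, which real systems only approximate. The availability figures are made up for illustration.

```python
import math


def chain_availability(availabilities):
    """Availability of a flow that needs every listed integration to succeed.

    Assumes independent failures, so the result is the product of the
    individual availabilities and can never exceed the smallest one.
    """
    return math.prod(availabilities)


# Seven integrations at 99.9% each (illustrative numbers only).
all_good = chain_availability([0.999] * 7)

# The same chain with one legacy system at 99%.
one_weak = chain_availability([0.999] * 6 + [0.99])
```

Under these assumptions, seven integrations at 99.9% each give the end-to-end flow roughly 99.3% availability, and a single 99% dependency pulls it to roughly 98.4%. The chain is never better than its least reliable link and is usually somewhat worse.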

McKinsey's agentic commerce research frames the opportunity as reaching 10 to 15 percent of e-commerce transactions within five years. The companies that capture that share are the ones that solve the integration and reliability problems first, not the ones that built the most convincing demo.

Compliance and Data Governance in Singapore and the US

In Singapore, enterprise commerce deployments operate under constraints that most generic framework documentation ignores. PDPA compliance requires that any agent handling consumer personal data has clear data minimization policies and documented consent flows. If your agentic commerce system logs conversations that include shipping addresses, purchase histories, or behavioral data to train or improve the agent, that logging needs to be compliant with PDPA and, for any cross-border data flows, the relevant US state privacy laws.

For US teams operating in regulated categories, including healthcare procurement, financial services purchasing, or anything touching HIPAA-covered entities, the agent's access to PII and PHI needs to be scoped, audited, and governed the same way any other automated system would be. The AI label does not change the compliance obligations.

Singapore's Personal Data Protection Commission has published increasingly specific guidance on automated decision-making systems, and agentic commerce systems that trigger purchases on behalf of consumers fall squarely into that category. Teams in Singapore building consumer-facing deployments should have legal review in the architecture loop early, not after the system is in production.

What Actually Gets Cut Going From POC to Production

In almost every agentic commerce build that starts as an internal POC, the same things get cut when pressure mounts to ship something. Audit logging gets simplified. The delegation model gets replaced with a broad permission scope because it was faster. Spend controls get implemented as prompt instructions rather than enforced limits. Fallback behavior for partial execution gets documented as a future sprint item and stays there.

None of these cuts feel significant at first. The system still works in the primary case. But each one creates a specific risk that surfaces at production scale. Simplified audit logging makes incident investigation painful. Broad permission scopes create attack surfaces. Prompt-based spend controls fail when the prompt is manipulated. Missing fallback logic turns a vendor timeout into a stuck transaction with no clear resolution path.

The teams that get agentic commerce into stable production tend to be the ones that treat these as first-class engineering problems from day one, not as polish to add later. That often requires resisting the pressure to demo early and often, because a polished demo can mask significant architectural gaps that only surface under real load.

At Genta AI Solutions, we build production agentic systems for enterprise clients in Singapore and the US. The pattern we see consistently is that the gap between a working pilot and a shippable production system is mostly infrastructure, not intelligence. The agent reasoning usually works. The hard part is everything around it.

If you are working through the architecture decisions on an agentic commerce deployment and want to compare notes with a team that has shipped this, we are happy to talk.


By

Komy A.

June 25, 2026

9 min read

Agentic Commerce in Production: What Enterprise Teams Don't See Coming

The Moment Is Real

McKinsey published its agentic commerce report in October 2025. Stripe launched its Agentic Commerce Suite shortly after. Mastercard, Visa, PayPal, Google Cloud, IBM all planted their flags. The category is real, the money is moving, and the early enterprise pilots are turning into serious build programs.

The pitch is straightforward: AI agents browse, compare, select, and buy on behalf of customers or business users. An employee types a request like "reorder our standard office supplies when we drop below 30 days of inventory" and an agent handles it end to end. A consumer says "find me the cheapest flight to Singapore that departs before 8am" and an agent books it. No more twelve-tab product comparisons. No more manual purchase approvals for commodity spend.

Most enterprise teams we talk to have seen a demo that works beautifully. And they want to know why their own build is harder than it looked.

What Agentic Commerce Actually Requires

Agentic commerce is not a chatbot with a buy button. The distinction matters because most of the production failures we see come from teams who architected it like one.

A shopping chatbot takes input, retrieves information, and presents options. A human clicks buy. The agent is advisory. An agentic commerce system is goal-directed. It receives an intent, breaks it into sub-tasks, executes those sub-tasks across external systems, handles failures mid-stream, and completes the transaction without waiting for a human at each step.

That shift from advisory to autonomous changes every technical requirement. Authentication, session management, error handling, audit logging, fallback behavior, spend controls. None of these are optional. All of them are significantly harder than they appear in a prototype.

The Trust Architecture Problem

In a standard checkout, a human authenticates once, confirms their intent explicitly, and the platform executes. The liability model is clear. In agentic commerce, an agent executes autonomously. Which raises an immediate question: how does the merchant's system know the agent is acting with real authorization, and how does the agent prove it is not being spoofed mid-session?

This is the hardest problem most POCs skip over. Stripe's Agentic Commerce spec begins to address this with agent identity tokens. Mastercard's Agent Pay framework approaches it from the payment rails side. But neither framework covers the full stack of what needs to happen inside your own system.

You need a delegation model. The human user grants the agent a scoped permission set. Those permissions need to be stored, validated at execution time, and revocable. You need a way to pass that authorization context across system boundaries without leaking credentials. And you need audit logs that reconstruct exactly what the agent did, in what order, and under whose authorization, so you can answer the inevitable question when something goes wrong.

Skipping this and using shared API keys or broad OAuth scopes is how demos ship and production incidents happen. A POC works beautifully in a sandboxed environment, gets promoted to a real payment environment, and the first production failure triggers an investigation that reveals no meaningful audit trail and credentials that were far too permissive. We have seen this exact pattern more than once.

Session State and the Multi-Step Execution Problem

A typical agentic commerce flow is not a single API call. It looks more like this: interpret intent, search catalog, filter by availability and price, check loyalty account balance, apply eligible promotions, construct cart, validate shipping address, confirm payment method is authorized for this merchant category, submit order, confirm receipt, update internal tracking system.

That is eight to twelve discrete operations, some of which depend on the output of earlier ones, and some of which call external systems that have their own latency profiles, rate limits, and failure modes. In a demo, all of these systems are stubbed or use sample data. In production, each one is a real dependency.

The session state problem is that you need to handle partial completion. What happens when the agent completes steps one through six and then the payment system returns a timeout? Does it retry? Does it roll back? Does it notify the human? Does it wait and try again later? Each of these paths needs to be explicitly designed and explicitly tested. Most POC architectures have no answer for this because the happy path always completes.

This is where production-grade agentic systems diverge sharply from prototypes. The orchestration layer needs to be stateful, resumable, and idempotent. Workflows that cannot be safely retried at any step become reliability liabilities the moment real transaction volumes hit them.

Catalog and Inventory Synchronization

This one surprises teams more than it should. An agentic commerce system is only as good as its understanding of what is actually available, at what price, right now. In demo environments, that data is static and curated. In production, it changes constantly.

Consider a retailer with 500,000 SKUs across multiple warehouse locations, third-party sellers, and a mix of in-stock and made-to-order items. An agent that books a product that is actually out of stock, or quotes a price that expired three minutes ago, or misses a regional availability constraint creates downstream problems that are harder to fix than they would have been to prevent. Returns, customer service escalations, manual reconciliation.

The architecture decision here is whether your agent queries inventory at decision time or relies on a synchronization layer that keeps an up-to-date local representation. Neither approach is universally correct. Query-at-decision-time is accurate but slow and expensive under load. Synchronized local state is fast but introduces staleness risk. The right answer depends on your catalog volatility, your transaction volume, and your tolerance for stale data. That decision needs to be made explicitly rather than falling out of whichever was easiest to build first.

Spend Controls and Authorization Limits

Enterprise procurement teams learned this lesson decades ago with corporate cards: autonomous spend without controls is a governance problem. Agentic commerce relearns this lesson at the AI layer.

Production agentic commerce deployments for enterprise need hard spend limits at the agent level. Not soft guardrails that the agent is prompted to respect. Hard limits enforced at the execution layer that cannot be overridden by prompt injection or by an unusually creative interpretation of a user's request.

They also need approval workflows for transactions above threshold. Merchant category controls so an agent authorized for office supply purchases cannot be manipulated into making purchases outside that category. Real-time alerting when the agent approaches or hits those limits, not batch reporting that surfaces the information hours later.

Most of this is engineering work that has nothing to do with AI. It is the same controls infrastructure that governs any automated spend system. But teams building agentic commerce for the first time often treat it as an afterthought because the demo worked without it.

What the Agentic Commerce Protocol Actually Gives You

The Agentic Commerce Protocol, developed by Stripe and OpenAI, is genuinely useful. It standardizes how agents authenticate to merchant systems, how they express purchase intent, and how merchants signal what agent interactions they support. If you are building a merchant-side integration, implementing this protocol means you can accept purchases from any standards-compliant shopping agent without bespoke integrations for each one.

What it does not give you is an orchestration framework, a session state management layer, a delegation model for your internal authorization system, a spend controls implementation, or a fallback strategy for partial execution failures. It is a communication protocol, not an end-to-end system architecture.

Teams that conflate "we support the protocol" with "we have shipped agentic commerce" are in for a difficult few months.

The Integration Reality

Agentic e-commerce does not exist in isolation. In any real enterprise deployment, the agent needs to connect to some combination of: an ERP, a warehouse management system, a product information manager, a loyalty platform, a payment processor, an identity and authorization service, and a CRM. Some of these integrations will have well-documented APIs. Others will be legacy systems with SOAP interfaces, undocumented rate limits, and authentication methods that predate OAuth by a decade.

The agent architecture has to absorb all of this. Each integration point is a potential failure mode, a data consistency problem, and a latency contributor. The system's overall reliability is bounded by the least reliable integration in the chain. Building for this reality is one of the core engineering challenges that separates a stable enterprise deployment from a pilot that worked in a controlled environment.

McKinsey's agentic commerce research frames the opportunity as reaching 10 to 15 percent of e-commerce transactions within five years. The companies that capture that share are the ones that solve the integration and reliability problems first, not the ones that built the most convincing demo.

Compliance and Data Governance in Singapore and the US

In Singapore, enterprise commerce deployments operate under constraints that most generic framework documentation ignores. PDPA compliance requires that any agent handling consumer personal data has clear data minimization policies and documented consent flows. If your agentic commerce system logs conversations that include shipping addresses, purchase histories, or behavioral data to train or improve the agent, that logging needs to be compliant with PDPA and, for any cross-border data flows, the relevant US state privacy laws.

For US teams operating in regulated categories, including healthcare procurement, financial services purchasing, or anything touching HIPAA-covered entities, the agent's access to PII and PHI needs to be scoped, audited, and governed the same way any other automated system would be. The AI label does not change the compliance obligations.

Singapore's Personal Data Protection Commission has published increasingly specific guidance on automated decision-making systems, and agentic commerce systems that trigger purchases on behalf of consumers fall squarely into that category. Teams in Singapore building consumer-facing deployments should have legal review in the architecture loop early, not after the system is in production.

What Actually Gets Cut Going From POC to Production

In almost every agentic commerce build that starts as an internal POC, the same things get cut when pressure mounts to ship something. Audit logging gets simplified. The delegation model gets replaced with a broad permission scope because it was faster. Spend controls get implemented as prompt instructions rather than enforced limits. Fallback behavior for partial execution gets documented as a future sprint item and stays there.

None of these cuts feel significant at first. The system still works in the primary case. But each one creates a specific risk that surfaces at production scale. Simplified audit logging makes incident investigation painful. Broad permission scopes create attack surfaces. Prompt-based spend controls fail when the prompt is manipulated. Missing fallback logic turns a vendor timeout into a stuck transaction with no clear resolution path.

The teams that get agentic commerce into stable production tend to be the ones that treat these as first-class engineering problems from day one, not as polish to add later. That often requires resisting the pressure to demo early and often, because a polished demo can mask significant architectural gaps that only surface under real load.

At Genta AI Solutions, we build production agentic systems for enterprise clients in Singapore and the US. The pattern we see consistently is that the gap between a working pilot and a shippable production system is mostly infrastructure, not intelligence. The agent reasoning usually works. The hard part is everything around it.

If you are working through the architecture decisions on an agentic commerce deployment and want to compare notes with a team that has shipped this, we are happy to talk.

View all

We’re Here to Help

Ready to transform your operations? We're here to help. Contact us today to learn more about our innovative solutions and expert services.

Let's Connect

We’re Here to Help

Ready to transform your operations? We're here to help. Contact us today to learn more about our innovative solutions and expert services.

Let's Connect

We’re Here to Help

Ready to transform your operations? We're here to help. Contact us today to learn more about our innovative solutions and expert services.

Let's Connect

By

Komy A.

June 25, 2026

9 min read

Agentic Commerce in Production: What Enterprise Teams Don't See Coming

The Moment Is Real

McKinsey published its agentic commerce report in October 2025. Stripe launched its Agentic Commerce Suite shortly after. Mastercard, Visa, PayPal, Google Cloud, IBM all planted their flags. The category is real, the money is moving, and the early enterprise pilots are turning into serious build programs.

The pitch is straightforward: AI agents browse, compare, select, and buy on behalf of customers or business users. An employee types a request like "reorder our standard office supplies when we drop below 30 days of inventory" and an agent handles it end to end. A consumer says "find me the cheapest flight to Singapore that departs before 8am" and an agent books it. No more twelve-tab product comparisons. No more manual purchase approvals for commodity spend.

Most enterprise teams we talk to have seen a demo that works beautifully. And they want to know why their own build is harder than it looked.

What Agentic Commerce Actually Requires

Agentic commerce is not a chatbot with a buy button. The distinction matters because most of the production failures we see come from teams who architected it like one.

A shopping chatbot takes input, retrieves information, and presents options. A human clicks buy. The agent is advisory. An agentic commerce system is goal-directed. It receives an intent, breaks it into sub-tasks, executes those sub-tasks across external systems, handles failures mid-stream, and completes the transaction without waiting for a human at each step.

That shift from advisory to autonomous changes every technical requirement. Authentication, session management, error handling, audit logging, fallback behavior, spend controls. None of these are optional. All of them are significantly harder than they appear in a prototype.

The Trust Architecture Problem

In a standard checkout, a human authenticates once, confirms their intent explicitly, and the platform executes. The liability model is clear. In agentic commerce, an agent executes autonomously. Which raises an immediate question: how does the merchant's system know the agent is acting with real authorization, and how does the agent prove it is not being spoofed mid-session?

This is the hardest problem most POCs skip over. Stripe's Agentic Commerce spec begins to address this with agent identity tokens. Mastercard's Agent Pay framework approaches it from the payment rails side. But neither framework covers the full stack of what needs to happen inside your own system.

You need a delegation model. The human user grants the agent a scoped permission set. Those permissions need to be stored, validated at execution time, and revocable. You need a way to pass that authorization context across system boundaries without leaking credentials. And you need audit logs that reconstruct exactly what the agent did, in what order, and under whose authorization, so you can answer the inevitable question when something goes wrong.

Skipping this and using shared API keys or broad OAuth scopes is how demos ship and production incidents happen. A POC works beautifully in a sandboxed environment, gets promoted to a real payment environment, and the first production failure triggers an investigation that reveals no meaningful audit trail and credentials that were far too permissive. We have seen this exact pattern more than once.

Session State and the Multi-Step Execution Problem

A typical agentic commerce flow is not a single API call. It looks more like this: interpret intent, search catalog, filter by availability and price, check loyalty account balance, apply eligible promotions, construct cart, validate shipping address, confirm payment method is authorized for this merchant category, submit order, confirm receipt, update internal tracking system.

That is eight to twelve discrete operations, some of which depend on the output of earlier ones, and some of which call external systems that have their own latency profiles, rate limits, and failure modes. In a demo, all of these systems are stubbed or use sample data. In production, each one is a real dependency.

The session state problem is that you need to handle partial completion. What happens when the agent completes steps one through six and then the payment system returns a timeout? Does it retry? Does it roll back? Does it notify the human? Does it wait and try again later? Each of these paths needs to be explicitly designed and explicitly tested. Most POC architectures have no answer for this because the happy path always completes.

This is where production-grade agentic systems diverge sharply from prototypes. The orchestration layer needs to be stateful, resumable, and idempotent. Workflows that cannot be safely retried at any step become reliability liabilities the moment real transaction volumes hit them.

Catalog and Inventory Synchronization

This one surprises teams more than it should. An agentic commerce system is only as good as its understanding of what is actually available, at what price, right now. In demo environments, that data is static and curated. In production, it changes constantly.

Consider a retailer with 500,000 SKUs across multiple warehouse locations, third-party sellers, and a mix of in-stock and made-to-order items. An agent that books a product that is actually out of stock, or quotes a price that expired three minutes ago, or misses a regional availability constraint creates downstream problems that are harder to fix than they would have been to prevent. Returns, customer service escalations, manual reconciliation.

The architecture decision here is whether your agent queries inventory at decision time or relies on a synchronization layer that keeps an up-to-date local representation. Neither approach is universally correct. Query-at-decision-time is accurate but slow and expensive under load. Synchronized local state is fast but introduces staleness risk. The right answer depends on your catalog volatility, your transaction volume, and your tolerance for stale data. That decision needs to be made explicitly rather than falling out of whichever was easiest to build first.
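The two approaches can also be combined. The sketch below, which is one possible design rather than a recommendation, serves reads from local state while the record is within a staleness budget and falls back to a live query otherwise. The `InventoryView` class, the `live_lookup` callable, and the `max_age_s` parameter are hypothetical names introduced for the example.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class StockRecord:
    quantity: int
    fetched_at: float  # seconds, as reported by the injected clock


class InventoryView:
    """Serve from local state while it is fresh enough, otherwise
    fall back to a live query against the system of record."""

    def __init__(self, live_lookup: Callable[[str], int], max_age_s: float,
                 clock: Callable[[], float] = time.monotonic):
        self._live_lookup = live_lookup  # hypothetical call to the source of truth
        self._max_age_s = max_age_s      # staleness budget; tune per catalog
        self._clock = clock              # injectable so tests can control time
        self._cache: Dict[str, StockRecord] = {}

    def quantity(self, sku: str) -> int:
        record = self._cache.get(sku)
        now = self._clock()
        if record is None or now - record.fetched_at > self._max_age_s:
            # Missing or too old: pay for the accurate, slower lookup.
            record = StockRecord(self._live_lookup(sku), now)
            self._cache[sku] = record
        return record.quantity
```

The staleness budget is where the explicit decision lives: a volatile, high-demand catalog might justify a budget of a few seconds, while slow-moving items could tolerate minutes. A design like this might also force a live check immediately before order submission regardless of cache age.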

Spend Controls and Authorization Limits

Enterprise procurement teams learned this lesson decades ago with corporate cards: autonomous spend without controls is a governance problem. Agentic commerce teams are now relearning it at the AI layer.

Production agentic commerce deployments for enterprise need hard spend limits at the agent level. Not soft guardrails that the agent is prompted to respect. Hard limits enforced at the execution layer that cannot be overridden by prompt injection or by an unusually creative interpretation of a user's request.

They also need approval workflows for transactions above a set threshold, and merchant category controls so that an agent authorized for office supply purchases cannot be manipulated into buying outside that category. Alerting should fire in real time when the agent approaches or hits those limits, rather than through batch reporting that surfaces the information hours later.
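To make the difference between a prompted guardrail and an enforced limit concrete, here is a minimal sketch of a check that sits between the agent and the payment call. The `SpendPolicy` and `SpendGuard` names, the category labels, and the amounts are all assumptions for illustration. The point is only that the decision is made in ordinary code that the model's output cannot rewrite.

```python
from dataclasses import dataclass
from decimal import Decimal
from enum import Enum
from typing import FrozenSet


class Decision(Enum):
    ALLOW = "allow"
    NEEDS_APPROVAL = "needs_approval"
    DENY = "deny"


@dataclass(frozen=True)
class SpendPolicy:
    period_limit: Decimal               # hard cap on total spend per period
    approval_threshold: Decimal         # single purchases above this need a human
    allowed_categories: FrozenSet[str]  # merchant categories the agent may use


class SpendGuard:
    """Runs at the execution layer, outside the agent's prompt."""

    def __init__(self, policy: SpendPolicy):
        self._policy = policy
        self._spent = Decimal("0")

    def authorize(self, amount: Decimal, category: str) -> Decision:
        if amount <= 0 or category not in self._policy.allowed_categories:
            return Decision.DENY
        if self._spent + amount > self._policy.period_limit:
            return Decision.DENY
        if amount > self._policy.approval_threshold:
            # Not counted as spent until a human approves it elsewhere.
            return Decision.NEEDS_APPROVAL
        self._spent += amount
        return Decision.ALLOW
```

An agent whose policy allows only `"office_supplies"` gets `Decision.DENY` for an `"electronics"` purchase no matter how the request was phrased. A production version would also need durable storage for the running total, period resets, and the alerting hooks described above. This sketch omits all three.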

Most of this is engineering work that has nothing to do with AI. It is the same controls infrastructure that governs any automated spend system. But teams building agentic commerce for the first time often treat it as an afterthought because the demo worked without it.

What the Agentic Commerce Protocol Actually Gives You

The Agentic Commerce Protocol, developed by Stripe and OpenAI, is genuinely useful. It standardizes how agents authenticate to merchant systems, how they express purchase intent, and how merchants signal what agent interactions they support. If you are building a merchant-side integration, implementing this protocol means you can accept purchases from any standards-compliant shopping agent without bespoke integrations for each one.

What it does not give you is an orchestration framework, a session state management layer, a delegation model for your internal authorization system, a spend controls implementation, or a fallback strategy for partial execution failures. It is a communication protocol, not an end-to-end system architecture.

Teams that conflate "we support the protocol" with "we have shipped agentic commerce" are in for a difficult few months.

The Integration Reality

Agentic commerce does not exist in isolation. In any real enterprise deployment, the agent needs to connect to some combination of an ERP, a warehouse management system, a product information manager, a loyalty platform, a payment processor, an identity and authorization service, and a CRM. Some of these integrations will have well-documented APIs. Others will be legacy systems with SOAP interfaces, undocumented rate limits, and authentication methods that predate OAuth by a decade.

The agent architecture has to absorb all of this. Each integration point is a potential failure mode, a data consistency problem, and a latency contributor. The system's overall reliability is bounded by the least reliable integration in the chain. Building for this reality is one of the core engineering challenges that separates a stable enterprise deployment from a pilot that worked in a controlled environment.
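The bound is usually tighter than the weakest single link suggests. If a transaction needs every dependency to succeed and failures are independent (a simplifying assumption), the availabilities multiply. The figures below are invented for illustration and are not measurements from any real system.

```python
from math import prod
from typing import Iterable

# Assumed, illustrative availabilities for each integration.
availability = {
    "erp": 0.999,
    "warehouse_management": 0.995,
    "product_information": 0.999,
    "loyalty": 0.99,
    "payments": 0.9995,
    "identity": 0.9999,
    "crm": 0.995,
}


def chain_availability(parts: Iterable[float]) -> float:
    """Probability that every dependency succeeds, assuming independence."""
    return prod(parts)


end_to_end = chain_availability(availability.values())
weakest = min(availability.values())
print(f"weakest link: {weakest:.4f}, end to end: {end_to_end:.4f}")
```

With these assumed numbers the end-to-end figure lands noticeably below the 0.99 of the weakest integration, which is why retries, fallbacks, and resumable workflows matter more as the number of dependencies grows.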

McKinsey's agentic commerce research frames the opportunity as reaching 10 to 15 percent of e-commerce transactions within five years. The companies that capture that share are the ones that solve the integration and reliability problems first, not the ones that built the most convincing demo.

Compliance and Data Governance in Singapore and the US

In Singapore, enterprise commerce deployments operate under constraints that most generic framework documentation ignores. Compliance with the Personal Data Protection Act (PDPA) requires that any agent handling consumer personal data has clear data minimization policies and documented consent flows. If your agentic commerce system logs conversations that include shipping addresses, purchase histories, or behavioral data to train or improve the agent, that logging needs to be compliant with PDPA and, for any cross-border data flows, the relevant US state privacy laws.
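On the engineering side, one common way to apply data minimization to logs is an allow-list: only fields the team has explicitly decided to keep are written, and everything else is dropped before the event leaves the process. The sketch below is a hypothetical example of that pattern. The field names and the `minimize_for_logging` helper are assumptions, and the pattern by itself says nothing about whether a given deployment meets its legal obligations.

```python
from typing import Any, Dict

# Hypothetical allow-list: fields the team has decided it needs in logs.
LOGGABLE_FIELDS = {"order_id", "sku", "quantity", "status"}


def minimize_for_logging(event: Dict[str, Any]) -> Dict[str, Any]:
    """Keep allow-listed fields, drop the rest, and record which
    field names were dropped (names only, never their values)."""
    kept = {k: v for k, v in event.items() if k in LOGGABLE_FIELDS}
    dropped = sorted(k for k in event if k not in LOGGABLE_FIELDS)
    if dropped:
        kept["redacted_fields"] = dropped
    return kept
```

An allow-list fails safe: a new field added upstream, such as a delivery note containing a phone number, is excluded from logs until someone deliberately adds it.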

For US teams operating in regulated categories, including healthcare procurement, financial services purchasing, or anything touching HIPAA-covered entities, the agent's access to PII and PHI needs to be scoped, audited, and governed the same way any other automated system would be. The AI label does not change the compliance obligations.

Singapore's Personal Data Protection Commission has published increasingly specific guidance on automated decision-making systems, and agentic commerce systems that trigger purchases on behalf of consumers fall squarely into that category. Teams in Singapore building consumer-facing deployments should have legal review in the architecture loop early, not after the system is in production.

What Actually Gets Cut Going From POC to Production

In almost every agentic commerce build that starts as an internal POC, the same things get cut when pressure mounts to ship something. Audit logging gets simplified. The delegation model gets replaced with a broad permission scope because it was faster. Spend controls get implemented as prompt instructions rather than enforced limits. Fallback behavior for partial execution gets documented as a future sprint item and stays there.

None of these cuts feel significant at first. The system still works in the primary case. But each one creates a specific risk that surfaces at production scale. Simplified audit logging makes incident investigation painful. Broad permission scopes create attack surfaces. Prompt-based spend controls fail when the prompt is manipulated. Missing fallback logic turns a vendor timeout into a stuck transaction with no clear resolution path.

The teams that get agentic commerce into stable production tend to be the ones that treat these as first-class engineering problems from day one, not as polish to add later. That often requires resisting the pressure to demo early and often, because a polished demo can mask significant architectural gaps that only surface under real load.

At Genta AI Solutions, we build production agentic systems for enterprise clients in Singapore and the US. The pattern we see consistently is that the gap between a working pilot and a shippable production system is mostly infrastructure, not intelligence. The agent reasoning usually works. The hard part is everything around it.

If you are working through the architecture decisions on an agentic commerce deployment and want to compare notes with a team that has shipped this, we are happy to talk.


We’re Here to Help

Ready to transform your operations? We're here to help. Contact us today to learn more about our innovative solutions and expert services.

Let's Connect
