Agentic AI: autonomy, failure modes, and liability
In late 2024 and throughout 2025, senior leaders at major AI companies and chip makers publicly framed “agentic AI” as the next mainstream phase of generative AI. (Gartner)
Actual adoption has been uneven. McKinsey’s State of AI 2025 survey reported that 62% of respondents were experimenting with AI agents, but no more than 10% were scaling agents across any single business function. (McKinsey & Company) Gartner has also predicted that over 40% of agentic AI projects will be cancelled by the end of 2027 due to escalating costs, unclear business value, or inadequate risk controls. (Gartner)
Despite mixed outcomes, the direction of travel is clear. Vendors are expanding agent capabilities beyond software engineering into broader office and enterprise workflows (for example, Anthropic’s “Cowork” research preview). (claude.com) As organisations move from pilots to production, the key question becomes less “what can the agent do?” and more “how can it fail, what damage can it cause, and who carries the liability?”
This article focuses on risk and decision support for business and commercial leaders.
What “agentic AI” means in practice
An agentic system is an AI system that can use tools to pursue a goal with some degree of independence. Those tools might include web search, access to internal databases, calling business systems via APIs, executing code, sending emails, or performing actions in applications.
A useful operational distinction is:
Workflow systems: predefined steps orchestrate model calls and tools. These are typically more predictable and easier to test.
Autonomous agents: the model chooses which tools to call, interprets results, and decides next steps (often in an iterative loop). This increases flexibility, but reduces predictability and expands the space of possible errors.
As autonomy increases, so do the potential consequences: the system is not only generating text but also changing systems, moving data, communicating externally, and potentially committing the organisation to outcomes.
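To make the distinction concrete, here is a deliberately minimal sketch in Python. The tool registry, the stub decision function standing in for the model, and the approval rule are all hypothetical and do not correspond to any vendor's agent API.

from dataclasses import dataclass
from typing import Callable

@dataclass
class Action:
    tool: str          # which tool the agent wants to call
    argument: str      # what it wants to pass to that tool
    high_risk: bool    # whether the call changes state outside the agent

# In a workflow system this sequence would be hard-coded; here the "model" picks.
TOOLS: dict[str, Callable[[str], str]] = {
    "search_kb": lambda query: f"top internal document for {query!r}",
    "send_email": lambda body: f"email sent: {body!r}",
}

def choose_next_action(goal: str, history: list[str]) -> Action | None:
    # Stand-in for the model's decision step; a real agent would reason over
    # the goal and the tool results gathered so far.
    if not history:
        return Action("search_kb", goal, high_risk=False)
    if len(history) == 1:
        return Action("send_email", f"Summary for: {goal}", high_risk=True)
    return None  # the agent decides it is finished

def run_agent(goal: str, max_steps: int = 5) -> list[str]:
    history: list[str] = []
    for _ in range(max_steps):             # step cap bounds runaway loops
        action = choose_next_action(goal, history)
        if action is None:
            break
        if action.high_risk:
            # The governance question: what may run without a human here?
            print(f"[escalated to human] {action.tool}({action.argument!r})")
            break
        result = TOOLS[action.tool](action.argument)
        history.append(result)
        print(f"[executed] {action.tool} -> {result}")
    return history

run_agent("renewal terms for customer X")

In a workflow system the sequence of tool calls would be fixed in code and testable in advance; in the autonomous version the decision function chooses, which is why the step cap and the approval gate carry most of the governance weight.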
Where agents are being deployed
Common categories of agentic tools now marketed to businesses include:
Research agents: multi-step research and synthesis across sources (e.g., “deep research” style tools). (McKinsey & Company)
Coding agents: writing, debugging, and refactoring code, sometimes with access to repositories and terminals. (Reuters)
Computer-use agents: operating a desktop/browser to complete tasks end-to-end. (Reuters)
Enterprise workflow agents: automating business processes inside platforms such as CRM/ERP environments. (Gartner)
A practical caution: the term “agent” is often used loosely. Gartner has warned that “agent washing” (rebranding assistants or simple automation as agents) is widespread. (Reuters) For risk assessment, focus on capabilities: which systems the agent can access, what actions it can take, and how much it can do without human approval.
A concrete case study: an agent “running a business”
Anthropic’s Project Vend is a useful illustration of what can go wrong when an agent is given real operational responsibility, even in a constrained environment. In the first phase, an AI system (“Claudius”) operated a small office vending setup and was tasked with stocking, pricing, and responding to customers. Anthropic reported significant underperformance versus a competent human manager, including susceptibility to manipulation, incorrect handling of payment details, and poor commercial judgment (for example, fixating on “metal cubes” and selling them at a loss). (anthropic.com)
In the second phase, with improved tools and models, performance improved, and profits became more consistent, but vulnerabilities remained, including overly generous refunds/credits and continued susceptibility to manipulation and hallucinations. (anthropic.com)
The decision-support takeaway is not that agents “fail”, but that they can fail in ways that are commercially and legally meaningful, particularly when they are authorised to transact, communicate, or change systems.
Key risk categories that increase with agent autonomy
Below are common risk categories that become more significant as agents gain access to tools and independence.
1) Data alteration or destruction
What it looks like: the agent deletes records, changes configurations, alters permissions, or applies infrastructure changes that cause outages or security exposure.
Why it matters: agents built to “resolve obstacles” may take action that is locally rational but organisationally damaging, especially if given administrative tools.
2) Data exfiltration and cyber-enabled misuse
What it looks like: the agent is induced, through malicious content, compromised tools, or indirect prompt injection, to disclose confidential data externally.
Why it matters: giving an agent access to sensitive data plus outbound channels creates a clear exfiltration pathway. A widely cited example in the Microsoft 365 ecosystem is “EchoLeak”, which researchers described as a zero-click chain enabling data exfiltration from Microsoft 365 Copilot via an email. (Cato Networks)
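A common mitigation pattern, sketched below with invented names and domains, is to allow agent-initiated outbound calls only to an explicit allow-list of destinations. This does not reproduce the EchoLeak chain itself; it simply illustrates the kind of egress control that narrows the exfiltration pathway.

from urllib.parse import urlparse

# Hosts the agent's outbound tools are permitted to reach (invented examples).
ALLOWED_OUTBOUND_HOSTS = {"mail.example.com", "crm.example.internal"}

def outbound_call_permitted(url: str) -> bool:
    # Allow agent-initiated network calls only to pre-approved hosts.
    host = urlparse(url).hostname or ""
    return host in ALLOWED_OUTBOUND_HOSTS

# A link smuggled in via retrieved content should fail the check.
print(outbound_call_permitted("https://mail.example.com/send"))         # True
print(outbound_call_permitted("https://attacker.example.net/collect"))  # False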
3) External communications on the organisation’s behalf
What it looks like: emails, CRM notes, tickets, public posts, or messages to customers/regulators are sent without appropriate review.
Why it matters: communications can create compliance exposure, contractual commitments, admissions, or reputational harm. This is particularly sensitive when the agent interacts with customers or regulators.
4) Brittleness and drift (models, prompts, tools, and environments)
What it looks like: the agent behaves differently after a model upgrade, tool change, policy update, or a shift in the operational environment.
Why it matters: agent behaviour depends on a chain of components (model + prompts + tools + permissions + data sources). Small upstream changes can produce materially different outputs and actions.
5) Unfair or non-transparent decision-making
What it looks like: an agent makes or influences decisions affecting individuals (customers or staff) in ways that are inconsistent, biased, outside scope, or difficult to explain.
Why it matters: tool-retrieved information can shape outcomes in unpredictable ways, and if decisions materially affect rights or interests, organisations need defensible governance, logging, and review.
6) Unauthorised transactions or financial commitments
What it looks like: purchases, orders, refunds, credits, subscriptions, or contractual acceptances are made outside authority or limits.
Why it matters: even “small” unauthorised commitments can aggregate quickly, particularly at scale. Project Vend illustrates how easily an agent can be led into poor commercial decisions. (anthropic.com)
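As a simple illustration of why aggregation matters, the sketch below applies a hypothetical per-transaction limit and a daily aggregate cap to agent-initiated spend; the figures and function names are invented for illustration.

PER_TRANSACTION_LIMIT = 50.00    # largest single commitment without approval
DAILY_AGGREGATE_LIMIT = 500.00   # total unapproved agent spend per day

_spent_today = 0.0

def authorise_spend(amount: float) -> bool:
    # Approve an agent-initiated payment only if both limits hold;
    # anything refused here is escalated to a human approver.
    global _spent_today
    if amount > PER_TRANSACTION_LIMIT:
        return False
    if _spent_today + amount > DAILY_AGGREGATE_LIMIT:
        return False
    _spent_today += amount
    return True

# Twenty "small" $30 refunds: only 16 fit under the $500 aggregate cap.
print(sum(authorise_spend(30.0) for _ in range(20)))  # 16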
7) Physical damage or injury (where agents control physical systems)
What it looks like: incorrect commands in smart facilities, robotics, industrial maintenance, or other operational technology.
Why it matters: once agents enter physical control systems, the consequences of failure can include property damage and personal injury, raising the stakes for testing, oversight, and duty of care.
“Rogue agents” and organisational liability
A common question is whether an organisation can distance itself from an agent’s actions by treating the agent as a separate “actor”. From a legal risk perspective, that is not a safe assumption.
Courts and tribunals are more likely to treat an AI agent as part of the organisation’s systems and customer interface, not as an independent legal person. A frequently cited example is the Air Canada chatbot decision in British Columbia, where the tribunal rejected Air Canada’s attempt to avoid responsibility for misleading information provided by its chatbot. (dentons.co.nz)
Separately, vendors’ own safety testing shows that frontier models can select high-risk behaviours in contrived but instructive scenarios (including “blackmail” behaviour in agentic misalignment testing). (anthropic.com) The practical implication is that organisations should plan on the basis that if the agent acts through your systems, brand, and permissions, the resulting liability risk will be assessed against your controls, oversight, and governance.
Legal risk areas to map before scaling
Consumer law (customer-facing agents)
If an agent interacts with customers, inaccurate statements or omissions can trigger Australian Consumer Law risk (including misleading or deceptive conduct). The risk increases when agents retrieve and combine large amounts of information across tools and systems, where quality and relevance may degrade.
Contract law (formation and authority)
Agents that negotiate, accept terms, place orders, or approve refunds raise questions of authorisation and contract formation. Even where the law is still developing in this area, the commercial risk is immediate: counterparties may rely on the communications and actions taken through your systems.
Privacy (access, use, disclosure, and security)
If an agent accesses personal information (email, HR systems, customer records), privacy compliance becomes a core design constraint, not an afterthought.
In Australia, reforms introduce new transparency requirements where an APP entity arranges for a computer program to use personal information to make decisions that could reasonably be expected to significantly affect an individual’s rights or interests. The OAIC guidance notes that this is due to commence from 10 December 2026. (OAIC)
Negligence (especially where physical-world impacts exist)
Where an agent’s actions can foreseeably cause damage to property or injury, negligence risk turns on whether reasonable care was taken. Courts are likely to scrutinise foreseeable failure modes, testing, safeguards, permissions, monitoring, and the level of human oversight.
What this means for decision-makers
The move from generative AI that advises to agentic AI that acts is a governance shift. Before scaling, organisations should be able to answer, in operational terms (a minimal sketch of how the answers can be codified follows the list):
What systems can the agent access, and with what permissions?
What actions can it take without approval (and what is explicitly blocked)?
How are tool calls, outputs, and decisions logged in a way that supports audit and incident response?
What changes (model updates, tool updates, policy changes) can alter behaviour, and how are they controlled?
What is the escalation path when the agent produces uncertain outputs or encounters conflicting instructions?
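One way to make those answers concrete is to express them as data the system can enforce and log. The sketch below is illustrative only: the tool names, permission tiers, and log format are hypothetical, but the pattern (policy as data, every tool call checked and audit-logged) maps directly onto the questions above.

import json
import logging
from datetime import datetime, timezone

logging.basicConfig(level=logging.INFO)
audit_log = logging.getLogger("agent.audit")

# Policy as data: which tools exist, and what may run without a human.
AGENT_POLICY = {
    "read_crm_record":     {"allowed": True,  "requires_approval": False},
    "send_customer_email": {"allowed": True,  "requires_approval": True},
    "delete_record":       {"allowed": False, "requires_approval": True},
}

def check_and_log(tool: str, arguments: dict) -> str:
    # Returns "execute", "escalate", or "block", and writes an audit entry
    # that supports after-the-fact review and incident response.
    rule = AGENT_POLICY.get(tool, {"allowed": False, "requires_approval": True})
    if not rule["allowed"]:
        decision = "block"
    elif rule["requires_approval"]:
        decision = "escalate"
    else:
        decision = "execute"
    audit_log.info(json.dumps({
        "ts": datetime.now(timezone.utc).isoformat(),
        "tool": tool,
        "arguments": arguments,
        "decision": decision,
    }))
    return decision

print(check_and_log("read_crm_record", {"id": "C-1042"}))           # execute
print(check_and_log("send_customer_email", {"to": "a@b.example"}))  # escalate
print(check_and_log("delete_record", {"id": "C-1042"}))             # block

Recording the policy version, model version, and prompt version alongside each audit entry also gives the change-control question something concrete to anchor to.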

