When we grant LLMs such as ChatGPT the agency to use data and tools for business or personal tasks, we get 'agentic' AI. These agents can work in collaboration with us as a co-pilot, or fully autonomously. They can work alone or in teams with other agents, either in rigidly defined processes or in open conversations.
Such agentic AI now attracts a meaningful slice of the world’s AI research effort and huge investment from business. Meta has spent billions to “.. introduce AI agents to billions of people in ways that will be useful and meaningful”.
Yet no vision has been shared for the world we are being driven towards. Let's distil this vision from key players' public statements. See the end of this document for the full list of those players.
What Business Problem Does Agentic AI Address?
Sierra AI, led by Bret Taylor, former co-CEO of Salesforce and now an OpenAI board member, makes the pithiest bid for agents' usefulness in business. Paraphrased below:
What if your most knowledgeable technician was available to advise sales staff without forgoing billable time?
What if staff could spend less time on documenting past work and more on developing client relationships?
What if customers could rely on your service specialists on site at all times?
The potential benefits go on. Agentic AI promises to augment people with new skills and automate laborious paperwork. Let's cut to the results of the review:
A Vision for Agentic AI
For Customers & Staff
Embody & Engage
Augment & Automate
Agents add value by augmenting human talent with digital skills [Microsoft]
Workflow automation with agents and agencies (multi-agent teams) [LangGraph]
Staff liberated to invest in client relationships, innovation and strategy [Zendesk]
For Operations & Regulators
Progress & Protect
Agents continuously learn from user interactions and other agents [CrewAI]
Agent operations: monitoring safety and effectiveness [AgentOps]
Balance of helpful and empowered AI vs harmless but constrained [Imbue]
Trust & Transparency
Preserving brand trust with layered AI supervision and deterministic rules [Voiceflow]
Transparent decision-making processes with detailed logging and auditing [LangSmith]
User data privileges and privacy reflected in permissions of the AI agent [Microsoft Copilot]
For Developers & Analysts
Compose & Dispose
Agents compose and dispose of their own teams for each task [AgentZero]
Spin out specialist ‘agents as a service’ publicly via marketplaces [MetaGPT]
Integration of agent teams via ‘Internet of Agents’ [Tsinghua / Tencent]
Flexible & Fast
Agents follow rules for rigid workflows, but devise solutions for open tasks [Sema4.ai]
LLMs map language to rigid workflows, code executes them repeatably [DeepMind]
Dynamic selection of LLM, tool or custom code as required (aka Compound AI) [BAIR]
See end of this document for the list of participants in the market which were reviewed for this vision.
Are the Foundations of Agentic AI Reliable?
Agentic AI has pedigree: it goes back at least as far as ‘The Society of Mind’ by Marvin Minsky, published in 1986. Minsky was a co-founder of the AI lab at MIT.
Minsky reminds us that the human mind is not a single, unified entity, but rather numerous specialisations, from a sense of balance to reason and emotional insight. Complex problems, such as the questions above, are solved through the cooperation and competition of these specialisms, which operate as ‘agents’.
Minsky emphasizes the importance of analogies and metaphors in human cognition and problem-solving. ChatGPT is our analogue for an agent, and in fact, asking ChatGPT to ‘reason by analogy’ is a useful prompting technique.
LLMs such as ChatGPT are granted ‘agency’ when we augment them with access to business data, tools for processing that data into something useful, and the ability to reflect on their work.
The tasks we give such an agent may be simple, such as booking a meeting, or multi-step, such as drafting a report using management information, or even complex, such as discussing product requirements with many users then synthesising user stories and tests.
Specialist agents can partially automate tasks in collaboration with people, as Microsoft 365 Copilot does. Or they can fully automate workflows in collaboration with other specialist agents, as with Microsoft AutoGen.
Don’t Pave the Cow Paths
The above vision requires us to employ the advantages of AI, whilst being wary of its limits. This is different to replacing people with AI, which is rarely feasible, let alone desirable. Direct replacement tempts us into ‘paving the cow paths’, whereby we employ new technology in habitual processes rather than establishing processes around the new technology’s advantages.
For example, when electric motors first replaced massive steam engines in factories, direct replacement of one centralised motor with another led to few gains. It wasn’t until factories were re-organised for distributed power tools that companies raised productivity.
Agents distribute helpful intelligence to users’ fingertips, just as electricity delivers muscle power to their hands. Be careful: there is an unavoidable trade-off between AI which is helpful and AI which is harmless. As with power tools, helpful tools can do harm; harmless tools are helpless. AI safety researchers have been struggling with this dilemma for years.
What else can AI do that people can’t? Consider these advantages in the context of the business problems:
Mass Personalized Interaction
An AI agent can engage in thousands of individualized conversations simultaneously
Scalable Empathy
AI agents can provide emotionally intelligent interactions with unlimited patience. They simply have more time for patients than doctors do.
Generalist Expertise
AI agents combine multiple specialist skills in one entity, streamlining teamwork
Iterative Intuition+Rules Loop
AI agents enable cycling between intuitive conversation and the following of rules, which is computing’s traditional expertise
Cross-Modal Understanding
Multimodal AI blends different types of data (text, images, etc.) in ways humans cannot. For example, it can visualise the output of code.
Vast Simultaneous Context
AI’s working memory holds many books’ worth of information, so it can summarise and extract themes from customer comments at scale.
More on these ‘super powers’ here.
Who Do the Agents Work For?
Agents can be employed by individuals and by companies. So, there are four interactions which can be mediated and automated by agents on our behalf:
Individual – Company … individual as a customer
Company – Individual … individual as staff, consultant or subject specialist
Company – Company … companies in partnership, e.g. a supplier and client
Individual – Individual … individuals as colleagues or partners
Be aware of the corporate impact of personal technologies. Consider how quickly Facebook’s momentum led to adoption in professional settings. In fact, ‘Bring Your Own AI’ is already a feature at many companies that do not approve of ChatGPT or Claude. Staff bypass this constraint by using their phones, such is the advantage of the AI.
A Dose of Skepticism
A vision is by definition not yet a reality. There are reasonable doubts to be addressed:
Can I Trust It?
Business needs safe, reliable AI just as airlines need safe and reliable aircraft. I assume you’ve already used ChatGPT or Claude, so you know what they can do for you, and what they can’t.
A common refrain is that the user must be a subject matter expert to know whether the AI output is accurate or a hallucination. It can be amazing, but when we inquire about subjects in which we are not experts we must cross-reference outputs, which is time-consuming.
The vision of Agentic AI is that it blends probabilistic tools, like ChatGPT, for handling common-sense queries, and rule-based tools, such as software code or policy documents, for handling regulated situations.
We already employ agents to write code, testing and correcting their own output. Even before ChatGPT, chatbots used rule books to ensure they adhere to company policies in conversations with clients. Since ChatGPT, many chatbots automatically switch between a fluid conversational AI and a set of approved responses, according to the context. Voiceflow discusses such a hybrid agent.
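To make the hybrid idea concrete, here is a minimal sketch of a chatbot that switches between approved responses and a fluid reply. It is not Voiceflow's implementation. The keyword table, the placeholder policy wording and both function names are assumptions invented for illustration, and `fluid_reply` is only a stand-in for a real LLM call.

```python
from typing import Callable, Tuple

# Hypothetical approved responses for regulated topics, keyed by trigger keyword.
# The wording is placeholder text, not any real company's policy.
APPROVED_RESPONSES = {
    "refund": "Our refund policy is set out in your terms of sale. I can connect you with a colleague.",
    "complaint": "I am sorry to hear that. I have logged your complaint for review by our team.",
}


def fluid_reply(message: str) -> str:
    """Stand-in for a conversational LLM call (no real API is used here)."""
    return f"[LLM draft] Happy to help with: {message}"


def hybrid_reply(message: str, llm: Callable[[str], str] = fluid_reply) -> Tuple[str, str]:
    """Return (source, reply): deterministic rules are checked first, the LLM is the fallback."""
    lowered = message.lower()
    for keyword, response in APPROVED_RESPONSES.items():
        if keyword in lowered:
            # Regulated situation: answer only with the approved wording.
            return "approved", response
    # Everything else goes to the fluid conversational model.
    return "llm", llm(message)
```

Checking the rules before the model means a regulated query never reaches the probabilistic path. A production agent would likely use a classifier rather than keyword matching, but the ordering of the two checks is the point of the sketch.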
What Can AI Not Currently Do?
There are many limitations, but here are three that I commonly face as an agent developer:
Intuition vs Logic
Now we turn to ‘hallucination’. In the words of a recent scientific paper, “LLM’s are in context semantic reasoners, not symbolic reasoners”. In other words, LLMs are not the reasoning engines of science fiction; instead, their style is fast and intuitive. As such, their reasoning is valid only within the patterns of words they have seen during training. When the words or patterns are unfamiliar, LLMs will often confabulate false yet convincing outputs.
Compare this to our own intuition, which quickly applies rules of thumb, versus our slower, deliberate thoughts. This distinction is otherwise known as ‘thinking fast’ versus ‘thinking slow’, or common sense versus formal reasoning. LLMs do not currently ‘think slow’.
There are ways to bridge the gap, but this is where the skill of the AI engineer comes in.
Interpolation not Extrapolation
All neural networks, including LLMs, learn to interpolate between the data they are trained on. LLMs and image generators are impressive because of the sheer breadth of their training.
This allows them to convincingly blend unrelated but pre-existing styles, such as holiday snaps with Picasso. But they could not invent Picasso’s style from scratch: they cannot extrapolate into entirely new insights and styles.
Context Window and Cost
LLMs, and the agents built on them, are ‘stateless’. Every time we make a request, the entire conversation history must be sent to the LLM to provide context for that request. As the conversation gets longer, the amount of data sent with each request grows. Each ‘token’, or part of a word, costs money and time. Complex agentic workflows can consume enormous numbers of tokens.
This is also a skill of the AI engineer, to build workflows which are economical for businesses and performant for users.
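To illustrate how resending the history drives up token use, here is a rough sketch. It counts one token per whitespace-separated word, which is a crude assumption (real tokenisers split text differently), and the fixed `reply_tokens` value and both function names are hypothetical.

```python
from typing import List


def approx_tokens(text: str) -> int:
    """Crude estimate: one token per whitespace-separated word (an assumption)."""
    return len(text.split())


def tokens_sent_per_turn(user_messages: List[str], reply_tokens: int = 50) -> List[int]:
    """Tokens sent with each request when the full history is resent every turn."""
    history_tokens = 0
    sent = []
    for message in user_messages:
        history_tokens += approx_tokens(message)
        sent.append(history_tokens)      # the whole history travels with the request
        history_tokens += reply_tokens   # the model's reply then joins the history
    return sent


# Four user messages of ten words each, with an assumed 50-token reply per turn.
messages = ["word " * 10] * 4
per_turn = tokens_sent_per_turn(messages)
print(per_turn)       # [10, 70, 130, 190]
print(sum(per_turn))  # 400
```

The new text adds only 10 tokens per turn, yet the fourth request already carries 190. The total grows roughly with the square of the conversation length, which is why trimming or summarising history matters in long agentic workflows.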
OpenAI’s recent GPT-4o mini is cheap and very fast; it is clearly targeting agentic workflows.
A Work In Progress
This document is a first attempt at the vision enabled by Agentic AI. I’d be delighted to incorporate your comments, so do comment below!
List of Participants in the Agentic AI Space
The table below lists the companies whose public statements were reviewed in order to derive the vision statement.
| Single Agents for Enterprise | Workflow Automation for Agents | Multi Agent Frameworks for Developers |
| - AutoGen - LangGraph - CrewAI - MetaGPT - AgentZero (GitHub) | | |
| Agent Ops & Security | Single Agents for Consumers | Chatbot Providers Developing into Agents |
| - Meta AI Studio - Open Interpreter - OpenAI GPT Store | | |