
Build a Foundation for Creating AI Agents and Extending Microsoft 365 Copilot

Components of AI Agents


AI agents are built from a set of fundamental components that work together to enable intelligent behavior.

Agent Architecture


While all agents generally share the following components, their implementation and importance vary depending on the agent’s purpose and complexity.

  • Foundation Model (LLM): The large language model (LLM) provides generation and reasoning capabilities. It enables natural language understanding, generation, and contextual awareness.
  • Orchestrator: The orchestrator coordinates the agent’s behavior, deciding when to retrieve knowledge, invoke skills, or escalate to a human. It manages workflows, memory, and decision logic.
  • Knowledge: This refers to the information an agent uses to understand its environment and make decisions. It includes predefined instructions for the agent and reference data it can access, such as structured data, unstructured content, documents, databases, and real-time inputs. Agents use this knowledge to provide relevant responses and actions based on context.
  • Skills and Tools: These are the actions, capabilities, and workflows the agent can use to act, such as sending emails, querying databases, updating records, or initiating automated workflows. Skills are often linked to APIs, services, or automation tools the agent uses to complete tasks.
  • Autonomy: This is the logic that guides how an agent interprets information and chooses actions. It includes decision-making frameworks, rule-based logic, triggers for autonomous capabilities, and increasingly, machine learning models that allow agents to adapt and improve over time.
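The components above can be tied together in a few lines of code. The sketch below is purely illustrative (the class and method names are hypothetical, not from any specific SDK): knowledge is reference data, skills are callable actions, and the `handle` method plays the orchestrator's role.

```python
# Minimal sketch of the agent components described above.
# All names are hypothetical; a real agent would route through an LLM.

class Agent:
    """Ties together knowledge, skills, memory, and simple decision logic."""

    def __init__(self, knowledge, skills):
        self.knowledge = knowledge   # reference data the agent can consult
        self.skills = skills         # name -> callable action
        self.memory = []             # context retained across interactions

    def handle(self, request):
        # Orchestrator logic: decide whether to invoke a skill or
        # answer from knowledge, then remember the interaction.
        if request in self.skills:
            result = self.skills[request]()
        else:
            result = self.knowledge.get(request, "I don't know yet.")
        self.memory.append((request, result))
        return result


agent = Agent(
    knowledge={"travel policy": "Economy class for flights under 6 hours."},
    skills={"book flight": lambda: "Flight booking initiated."},
)
print(agent.handle("travel policy"))   # answered from knowledge
print(agent.handle("book flight"))     # handled by a skill
```

In practice the orchestrator's routing decision is made by the foundation model rather than by exact string matching, but the division of responsibilities is the same.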

Reflection:
Think of a process or task you’d like to automate. Which custom components (knowledge, skills, reasoning) would be most important for enabling an agent to successfully manage that process?

LLM vs. AI Agents: What’s the Difference?


Large language models (LLMs) are the core engine of generative AI. They enable agents to understand and generate human language, summarize content, translate text, and more.

However, LLMs alone are not agents.

AI agents extend the power of LLMs by integrating additional components:

  • Memory to retain context across interactions
  • Skills to perform actions in the real world
  • Reasoning and orchestration to manage complex workflows
  • Interfaces to interact with users and systems

In summary: LLMs generate intelligence. Agents apply that intelligence to achieve goals.
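The distinction can be made concrete with a small sketch (the `llm` stub and tool names are hypothetical placeholders, not a real model API): the LLM alone maps text to text, while the agent layer adds memory and the ability to act.

```python
# Contrast sketch: an LLM alone vs. an agent wrapping it.

def llm(prompt):
    # Stand-in for a foundation model: pure text in, text out.
    return f"[generated reply to: {prompt}]"

def agent(prompt, tools, memory):
    # The agent layer applies the model's intelligence toward a goal.
    memory.append(prompt)                      # retain context
    if prompt.startswith("book:"):             # route to a real-world action
        return tools["book"](prompt.split(":", 1)[1])
    return llm(prompt)                         # otherwise, just generate

memory = []
tools = {"book": lambda city: f"Booking a flight to {city.strip()}."}
print(agent("What is the travel policy?", tools, memory))
print(agent("book: Seattle", tools, memory))
```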

How AI Agents Work

Here’s how a typical AI agent operates:

  1. Input: A user asks a question or initiates a task.
  2. Understanding: The LLM interprets the input, determines intent, and extracts relevant information.
  3. Planning: The orchestrator, often with help from the LLM, decides on the steps to take—such as retrieving knowledge, calling a skill, or requesting clarification.
  4. Action: The agent performs the required actions using its skills or tools, guided by the plan.
  5. Response Generation: The LLM generates a natural language response based on the results of the actions and the current context.
  6. Communication: The agent delivers the response to the user via the chosen interface.
  7. Learning: The agent stores relevant context or feedback to improve future interactions.
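The numbered steps above can be sketched as a single request/response cycle. All helper functions here are hypothetical stubs (real systems would delegate understanding and planning to the LLM and orchestrator):

```python
# The typical agent cycle, one function per step (illustrative stubs only).

def understand(user_input):
    # Step 2: interpret the input and extract intent.
    return "policy_lookup" if "policy" in user_input else "small_talk"

def plan(intent):
    # Step 3: the orchestrator turns an intent into a list of actions.
    return ["retrieve_policy", "summarize"] if intent == "policy_lookup" else ["chat"]

def act(steps):
    # Step 4: execute each planned action via a skill.
    skills = {
        "retrieve_policy": lambda: "Policy: economy class under 6 hours.",
        "summarize": lambda: "(summarized)",
        "chat": lambda: "Hello!",
    }
    return [skills[s]() for s in steps]

def respond(results):
    # Steps 5-6: generate and deliver a natural-language reply.
    return " ".join(results)

memory = []  # Step 7: context stored for future interactions

def handle(user_input):
    reply = respond(act(plan(understand(user_input))))
    memory.append((user_input, reply))
    return reply

print(handle("What is our travel policy?"))
```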

Example:
An employee asks an agent:

“What is our company’s travel policy, and can you book a flight to Seattle next week?”

The agent retrieves the latest travel policy from internal documentation or a knowledge base, considering organizational guidelines and the employee’s role.
It then calls an external flight booking API to search for available flights to Seattle that comply with the company’s travel policy (e.g., preferred airlines, budget limits, approval requirements).
The agent responds to the employee with a summary of the relevant travel policy, proposed flight options, and confirmation that the booking request has been initiated or completed—all in natural language.

Autonomous Agents

Autonomous agents operate with greater independence, often pursuing goals across multiple steps or sessions with minimal human intervention. A key feature of autonomous agents is their ability to respond to triggers—events or data changes—that prompt the agent to act without direct user input. Triggers may include scheduled times, data updates, external system events, or changes in user context.

Their workflow typically looks like this:

  1. Goal Definition: The agent receives a high-level goal (from a user or system).
  2. Trigger Monitoring: The agent continuously monitors relevant triggers such as deadlines, data changes, or external events requiring action.
  3. Self-Planning: When a trigger is detected or a goal is received, the agent autonomously breaks down the goal into sub-tasks and creates a plan, often refining it iteratively.
  4. Iterative Action: The agent executes actions, monitors results, and adjusts its plan as needed—potentially looping multiple times between planning and action. These actions may include triggering workflows, combining autonomous behavior with deterministic automated flows.
  5. Self-Evaluation: The agent evaluates progress toward the goal, deciding whether to continue, adjust its approach, or declare the goal achieved.
  6. Reporting/Communication: The agent summarizes results or requests intervention only when necessary.
  7. Continuous Learning: The agent updates its memory and strategies based on outcomes to improve future autonomy.

Autonomous agents emphasize self-directed planning, trigger-based execution, and minimal reliance on step-by-step user input, enabling them to handle complex, multi-step tasks.
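The trigger-based workflow above can be sketched as a monitoring loop. This is a simplified illustration (the event format and task names are hypothetical; a real agent would watch live data streams rather than a list):

```python
# Sketch of the autonomous agent cycle: trigger monitoring, self-planning,
# iterative action, self-evaluation, and reporting.

def plan_for(event):
    # Step 3: break the triggered goal into sub-tasks.
    return ["collect_documents", "run_audit", "summarize_findings"]

def execute(task):
    # Step 4: iterative action (stubbed as always succeeding).
    return f"{task}: done"

def goal_achieved(results, plan):
    # Step 5: self-evaluation against the plan.
    return len(results) == len(plan)

def autonomous_cycle(events):
    reports = []
    for event in events:                   # Step 2: trigger monitoring
        if event["type"] != "anomaly":
            continue                       # ignore non-trigger events
        plan = plan_for(event)
        results = [execute(t) for t in plan]
        if goal_achieved(results, plan):
            reports.append(f"Audit for {event['id']} complete")  # Step 6
    return reports

print(autonomous_cycle([{"type": "heartbeat", "id": 1},
                        {"type": "anomaly", "id": 42}]))
```

Note that only the anomaly event triggers the plan-act-evaluate loop; no step-by-step user input is required.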

Example:
A financial organization uses a tax correction agent built with Copilot Studio agent flows.

  • The agent continuously monitors financial data for anomalies that may indicate the need for an audit.
  • When an anomaly is detected, it autonomously triggers a structured audit flow, collects necessary documents, and summarizes key findings.
  • The agent then delivers the audit results to the appropriate human reviewers for approval, ensuring compliance and transparency.
  • Throughout the process, the agent adapts its actions based on new data or feedback, combining autonomous decision-making with deterministic workflows to maintain flexibility and regulatory compliance.

This trigger-based cycle allows agents to operate in dynamic environments, adapt to user needs, and deliver increasingly personalized and effective outcomes.

Creating AI Agents

Creating AI agents may require a combination of foundational technologies, infrastructure, and development tools:

  • Foundation Models (LLMs): For natural language understanding, reasoning, and generation.
  • Orchestration Layer: To manage planning, decision-making, and coordination of actions.
  • Skills and Tools: A library of APIs, plugins, and services the agent can invoke to perform tasks.
  • Memory and Context Storage: For maintaining short- and long-term memory, enabling personalization and continuity.
  • Data Infrastructure: Secure and scalable access to structured and unstructured data sources.
  • Security and Governance: Identity management, access control, and compliance monitoring.
  • Deployment Environment: Cloud-native infrastructure (e.g., Azure Kubernetes Service, Azure Functions) to host and scale the agent.

However, the level of development required across these layers of the AI stack can vary significantly depending on the agent’s purpose and complexity. For retrieval or task-based agent scenarios, it may be sufficient to add knowledge, skills, and instructions while using existing infrastructure (e.g., creating an agent that extends Microsoft 365 Copilot). For more advanced and complex scenarios, you can fully customize your solution—including models, orchestration, logic, actions, security, and governance.
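One way to picture the layers above is as a declarative manifest. The sketch below is hypothetical (not a real Copilot Studio or Azure schema; the provider and service names are illustrative values), showing a full stack and the smaller subset a retrieval or task-based agent might customize:

```python
# Hypothetical manifest mapping the stack layers above to configuration
# choices. None of these keys or values reflect a real product schema.

agent_stack = {
    "foundation_model": {"provider": "azure-openai", "model": "gpt-4o"},
    "orchestration": {"framework": "semantic-kernel"},
    "skills": ["send_email", "query_database", "trigger_workflow"],
    "memory": {"short_term": "in-process", "long_term": "vector-store"},
    "data": {"sources": ["sharepoint", "sql"], "access": "least-privilege"},
    "security": {"identity": "entra-id", "audit_logging": True},
    "deployment": {"host": "azure-functions", "scaling": "consumption"},
}

# A retrieval/task agent extending Microsoft 365 Copilot reuses the
# built-in model, orchestrator, and interface, so only a subset of
# layers needs customization:
minimal = {k: agent_stack[k] for k in ("skills", "data")}
print(sorted(minimal))
```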

Microsoft AI Agent Solutions

Microsoft offers a range of tools and solutions to support your AI transformation journey, whether you want to build a solution with a fully customized AI stack or leverage existing components with your enterprise data, APIs, and business logic.

Adopt:
Microsoft 365 Copilot, Copilot Chat, and a range of proprietary agents offer powerful capabilities to support AI-powered productivity from the start, with built-in security and governance controls.

Extend:
Microsoft 365 Copilot can be extended with agents that leverage Copilot’s model, orchestrator, and user interface, but are tailored to custom business logic, data, and systems for business process automation.

Create:
A suite of Microsoft tools and services—including Copilot Studio, Microsoft 365 Agents Toolkit, Azure AI Foundry, and more—can be used to build custom agents and generative AI business applications for more advanced or complex scenarios.

Microsoft provides solutions for AI agents across this entire spectrum, including:

  • Microsoft 365 Copilot and Agent Builder: Business users can create AI agents using natural language in a no-code interface.
  • Copilot Studio: Makers can use a low-code interface to build custom AI agents and extend Microsoft 365 Copilot.
  • Visual Studio / GitHub / Azure AI Foundry: Developers can use these pro-code tools with SDKs and services like Semantic Kernel, Azure AI Agent Service, and Microsoft 365 Agents Toolkit to design, build, customize, publish, and manage enterprise-grade AI agent solutions.

Next unit: Module Assessment
