The Journey So Far
Let us take a look at the path we have covered. In Episode 4 we got to know open-source models. In Episode 5 we gave them memory with RAG. In Episode 6 we customized their behavior with Fine-tuning. But we still have a big problem: these models only talk.
Ask them what the weather is like — they give a generic answer but cannot actually check the weather. Tell them to send an email — they write the email text but do not actually send it. Tell them to read a file — they cannot.
Today we are going to break this limitation. Today the LLM comes alive.
What Is an Agent? — Three Key Differences from a Chatbot
An AI Agent is a system with an LLM as its brain, paired with tools that let it take real action.
Let us look at the three main differences between an Agent and a simple Chatbot:
Difference 1: Action-Oriented
Chatbot: Only generates text. Answers your question and that is it.
Agent: Can take action — call APIs, read files, search databases, send emails.
Difference 2: Planning
Chatbot: Responds to each message independently. Has no plan.
Agent: Breaks a large task into smaller steps and executes them step by step.
Difference 3: Autonomy
Chatbot: Waits for your command. Does nothing without your question.
Agent: Can make decisions, detect and correct its mistakes, and even act without direct instruction.
Analogy: A Chatbot is like an encyclopedia — it answers but does nothing. An Agent is like a real assistant — it both answers and takes action. Say "cancel tomorrow's meeting," and it actually opens the calendar and cancels it.
Tool Use / Function Calling — The Agent’s Hands
The most important capability that turns an LLM into an Agent is Tool Use, also known as Function Calling.
How Does It Work?
The idea is simple: you give the model a list of tools (functions) and tell it to use them whenever needed. The model itself decides when to use which tool.
# Defining tools for the model
tools = [
    {
        "name": "get_weather",
        "description": "Returns the weather for a city",
        "parameters": {
            "city": "City name (e.g., Tehran)"
        }
    },
    {
        "name": "send_email",
        "description": "Sends an email",
        "parameters": {
            "to": "Email address",
            "subject": "Subject",
            "body": "Email body"
        }
    },
    {
        "name": "search_database",
        "description": "Searches the product database",
        "parameters": {
            "query": "Search term"
        }
    }
]

# Now when the user says "What is the weather in Tehran?"
# the model determines it should call get_weather
# and sets the city parameter to "Tehran"
Function Calling Workflow
- User asks a question
- Model decides whether a tool is needed
- If needed, model outputs the function name and parameters (not the result itself!)
- The system actually executes the function and returns the result to the model
- Model uses the result to give the final answer to the user
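The workflow above can be sketched end to end in Python. The model call is stubbed here: `fake_model_decision` is a hypothetical stand-in for a real LLM response, since in practice the provider's SDK returns the tool call for you.

```python
# Minimal sketch of the function-calling round trip.
# `fake_model_decision` is a stub standing in for a real LLM call.

def get_weather(city):
    # Stubbed tool; a real version would call a weather API.
    return f"25C and sunny in {city}"

TOOLS = {"get_weather": get_weather}

def fake_model_decision(user_message):
    # A real model would return this structure via function calling.
    if "weather" in user_message.lower():
        return {"tool": "get_weather", "arguments": {"city": "Tehran"}}
    return None

def handle(user_message):
    call = fake_model_decision(user_message)               # step 2: model decides
    if call:                                               # step 3: name + parameters
        result = TOOLS[call["tool"]](**call["arguments"])  # step 4: system executes
        return f"Model answer using result: {result}"      # step 5: final answer
    return "Plain answer, no tool needed."

print(handle("What is the weather in Tehran?"))
# Model answer using result: 25C and sunny in Tehran
```

Note that the model never executes anything itself; the `handle` function (your system) is the only place code actually runs.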
Architecture of a Smart Agent
A good Agent has four main components. Let us examine each one:
1. Perception — The Agent’s Eyes and Ears
The Agent needs to receive information. This information can come from various sources:
- User messages
- Tool results
- Conversation history
- Environmental information (e.g., time, location)
2. Reasoning — The Agent’s Brain
This is where the LLM plays the main role. The model must:
- Understand the question
- Decide what action is needed
- Create a plan
- Choose between different tools
A well-known pattern for Agent reasoning is the ReAct (Reasoning + Acting) pattern:
# ReAct Pattern
# Thought: I need to check the weather first
# Action: get_weather("Tehran")
# Observation: Temperature 25C, sunny
# Thought: Now I should answer the user
# Answer: The weather in Tehran is currently 25 degrees and sunny!
3. Action — The Agent’s Hands
After deciding, the Agent acts. Actions can be:
- Calling a tool (Function Calling)
- Answering the user
- Requesting more information from the user
- Breaking the task into smaller subtasks
4. Memory — The Agent’s Learning
Without memory, the Agent starts from scratch every time. Agent memory comes in two types:
- Short-term Memory: The current conversation history. It remembers everything said in this conversation.
- Long-term Memory: Information preserved across different conversations. Such as user preferences and history of previous operations.
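One minimal way to model these two kinds of memory in code (the class and method names here are illustrative, not any framework's API):

```python
# Sketch of agent memory: short-term per conversation, long-term across them.

class AgentMemory:
    def __init__(self):
        self.short_term = []   # messages of the current conversation
        self.long_term = {}    # persisted facts, e.g. user preferences

    def remember_message(self, role, content):
        self.short_term.append({"role": role, "content": content})

    def remember_fact(self, key, value):
        # In production this would be written to a database or vector store.
        self.long_term[key] = value

    def new_conversation(self):
        # Short-term memory resets; long-term memory survives.
        self.short_term = []

memory = AgentMemory()
memory.remember_message("user", "My name is Sara")
memory.remember_fact("user_name", "Sara")
memory.new_conversation()
print(memory.short_term)              # [] - conversation history is gone
print(memory.long_term["user_name"])  # Sara - the preference survives
```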
Agent Loop — The Core Cycle
An Agent works in a loop. This loop continues until the task is complete:
# Agent loop pseudocode
def agent_loop(user_request):
    messages = [{"role": "user", "content": user_request}]

    while True:
        # Model thinks
        response = llm.generate(messages, tools=available_tools)

        # Does it want to use a tool?
        if response.has_tool_call:
            # Execute the tool
            tool_name = response.tool_call.name
            tool_args = response.tool_call.arguments
            result = execute_tool(tool_name, tool_args)

            # Record the tool call, then return its result to the model
            messages.append({"role": "assistant", "content": response.tool_call})
            messages.append({"role": "tool", "content": result})
            continue  # Think again

        # If no tool needed, the final answer is ready
        return response.text
Important note: The Agent might loop several times. For example, first search the database, then analyze the result, then call another API, and finally give the answer.
Practical Agent Examples
Let us look at some real-world examples to better understand how powerful Agents are:
Example 1: Programming Assistant
Tools: read file, write file, execute code, search documentation
Scenario: You say "there is a bug in app.py, find and fix it"
Agent: reads the file, finds the bug, edits the file, runs tests, reports the result
Example 2: Research Assistant
Tools: web search, read web pages, summarization
Scenario: You say "research the latest AI developments in 2025"
Agent: performs multiple searches, reads articles, extracts key information, delivers a summary report
Example 3: Sales Assistant
Tools: product search, inventory check, order placement, send email
Scenario: Customer says "I want a laptop under 30 million"
Agent: searches the database, shows options, after customer selection checks inventory, places order, sends confirmation email
Multi-Agent — A Team of Agents
A more advanced concept is Multi-Agent Systems — where multiple Agents work together.
Think of it like a company: one Agent is the project manager who distributes work. One Agent is the developer who writes code. One Agent is the tester who tests code. And one Agent is the documentation writer.
Each Agent has its own expertise and coordinates with the others. The final result is better than what a single Agent could accomplish alone.
Agent Challenges
Agents are powerful but have challenges:
- Security: If the Agent has access to sensitive tools (like file deletion), there must be restrictions and confirmation
- Cost: Each loop iteration means one LLM call. If it loops 10 times, the cost is 10x
- Reliability: Sometimes the Agent makes wrong decisions. You need fallback mechanisms
- Infinite loops: The Agent might get stuck repeating the same action. You must set a maximum number of iterations
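Two of these challenges — infinite loops and sensitive tools — can be mitigated directly in the loop itself. A sketch, with a stub (`fake_step`) standing in for the LLM call:

```python
# Guarded agent loop: iteration cap plus confirmation for sensitive tools.

MAX_ITERATIONS = 10
SENSITIVE_TOOLS = {"delete_file", "send_email"}

def run_agent(step_fn, execute_tool, confirm):
    """step_fn stands in for one LLM call; it returns either
    ("tool", name, args) or ("answer", text)."""
    history = []
    for _ in range(MAX_ITERATIONS):            # hard cap against infinite loops
        kind, *payload = step_fn(history)
        if kind == "answer":
            return payload[0]
        name, args = payload
        if name in SENSITIVE_TOOLS and not confirm(name):
            history.append((name, "blocked: user did not confirm"))
            continue
        history.append((name, execute_tool(name, args)))
    return "Stopped: iteration limit reached"  # fallback instead of looping forever

# Stubbed model: search once, then answer using the tool result.
def fake_step(history):
    if not history:
        return ("tool", "search_database", {"query": "laptop"})
    return ("answer", f"Found: {history[-1][1]}")

print(run_agent(fake_step,
                lambda name, args: f"{name} result",
                confirm=lambda name: False))
# Found: search_database result
```

The same structure handles cost control too: lowering `MAX_ITERATIONS` puts a hard ceiling on how many LLM calls one request can trigger.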
Agent Frameworks
You do not need to build an Agent from scratch. There are good frameworks available:
- LangGraph: From the LangChain team. Graph-based and flexible. Suitable for complex Agents.
- CrewAI: Designed for Multi-Agent. Define roles and tasks for each Agent.
- AutoGen (Microsoft): Building conversations between Agents. Good for research.
- Claude Agent SDK (Anthropic): Anthropic's official toolkit for building Agents with Claude.
Summary
In this episode you learned:
- The difference between Agent and Chatbot: action, planning, autonomy
- How Tool Use / Function Calling works
- The 4-part Agent architecture: perception, reasoning, action, memory
- The Agent loop and the ReAct pattern
- Practical examples and Agent challenges
- Popular frameworks
Next Episode: Complete Architecture and Roadmap — Putting It All Together
The next episode is the final episode of this series. We will put everything we learned together: model + RAG + Fine-tuning + Agent. We will map out a real AI project architecture and give you a 6-month roadmap for entering this field. Do not miss it!