AI Agents — When the LLM Comes Alive

Episode 7 · 25 minutes

The Journey So Far

Let us take a look at the path we have covered. In Episode 4 we got to know open-source models. In Episode 5 we gave them memory with RAG. In Episode 6 we customized their behavior with Fine-tuning. But we still have a big problem: these models only talk.

Ask them what the weather is like — they give a generic answer but cannot actually check the weather. Tell them to send an email — they write the email text but do not actually send it. Tell them to read a file — they cannot.

Today we are going to break this limitation. Today the LLM comes alive.

What Is an Agent? — Three Key Differences from a Chatbot

An AI Agent is a system that has an LLM as its brain, but alongside it has tools that can take real action.

Let us look at the three main differences between an Agent and a simple Chatbot:

Difference 1: Action-Oriented

Chatbot: Only generates text. Answers your question and that is it.

Agent: Can take action — call APIs, read files, search databases, send emails.

Difference 2: Planning

Chatbot: Responds to each message independently. Has no plan.

Agent: Breaks a large task into smaller steps and executes them step by step.

Difference 3: Autonomy

Chatbot: Waits for your command. Does nothing without your question.

Agent: Can make decisions, detect and correct its mistakes, and even act without direct instruction.

Analogy: A Chatbot is like an encyclopedia — it answers but does nothing. An Agent is like a real assistant — it both answers and takes action. Say "cancel tomorrow's meeting," and it actually opens the calendar and cancels it.

Tool Use / Function Calling — The Agent’s Hands

The most important capability that turns an LLM into an Agent is Tool Use, also known as Function Calling.

How Does It Work?

The idea is simple: you give the model a list of tools (functions) and say "use these whenever needed." The model itself decides when to use which tool.

# Defining tools for the model
tools = [
    {
        "name": "get_weather",
        "description": "Returns the weather for a city",
        "parameters": {
            "city": "City name (e.g., Tehran)"
        }
    },
    {
        "name": "send_email", 
        "description": "Sends an email",
        "parameters": {
            "to": "Email address",
            "subject": "Subject",
            "body": "Email body"
        }
    },
    {
        "name": "search_database",
        "description": "Searches the product database",
        "parameters": {
            "query": "Search term"
        }
    }
]

# Now when the user says "What is the weather in Tehran?"
# the model determines it should call get_weather
# and sets the city parameter to "Tehran"

Function Calling Workflow

  1. User asks a question
  2. Model decides whether a tool is needed
  3. If needed, model outputs the function name and parameters (not the result itself!)
  4. The system actually executes the function and returns the result to the model
  5. Model uses the result to give the final answer to the user

Important note: The model does not execute the function itself! It only says "I think you should call function X with parameter Y." Your code performs the actual execution. This is very important for security: you control what gets executed.
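To make step 4 concrete, here is a minimal sketch of what the dispatching code on your side might look like. The tool names follow the earlier example; the shape of the `tool_call` dictionary is an assumption, since each provider's API structures it slightly differently.

```python
# A minimal tool dispatcher: the model only *names* the function;
# this code is what actually runs it. The `tool_call` structure is a
# simplified assumption (each provider's API differs slightly).

def get_weather(city):
    # Stand-in for a real weather API call
    return f"Weather in {city}: 25 degrees, sunny"

def send_email(to, subject, body):
    # Stand-in for a real email service
    return f"Email sent to {to}"

TOOLS = {
    "get_weather": get_weather,
    "send_email": send_email,
}

def execute_tool(tool_call):
    # Look up the requested function and run it with the model's arguments.
    # Unknown names are rejected: the model proposes, your code decides.
    func = TOOLS.get(tool_call["name"])
    if func is None:
        return f"Error: unknown tool {tool_call['name']}"
    return func(**tool_call["arguments"])

print(execute_tool({"name": "get_weather", "arguments": {"city": "Tehran"}}))
```

Keeping the mapping in an explicit dictionary is itself a security measure: the model can only trigger functions you have deliberately registered.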

Architecture of a Smart Agent

A good Agent has four main components. Let us examine each one:

1. Perception — The Agent’s Eyes and Ears

The Agent needs to receive information. This information can come from various sources:

  • User messages
  • Tool results
  • Conversation history
  • Environmental information (e.g., time, location)

2. Reasoning — The Agent’s Brain

This is where the LLM plays the main role. The model must:

  • Understand the question
  • Decide what action is needed
  • Create a plan
  • Choose between different tools

A well-known pattern for Agent reasoning is the ReAct (Reasoning + Acting) pattern:

# ReAct Pattern
# Thought: I need to check the weather first
# Action: get_weather("Tehran")
# Observation: Temperature 25°C, sunny
# Thought: Now I should answer the user
# Answer: The weather in Tehran is currently 25 degrees and sunny!
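One common way to implement ReAct is to have the model emit text in exactly this Thought/Action/Observation format and parse each Action line with a small amount of code. Here is a hedged sketch of such a parser; the regex and the quoted-argument convention are assumptions about the format, not part of any standard.

```python
import re

# Parse an "Action:" line in the ReAct format shown above,
# e.g. 'Action: get_weather("Tehran")' -> ("get_weather", ["Tehran"]).
# The format convention (quoted string arguments) is an assumption.
ACTION_RE = re.compile(r'Action:\s*(\w+)\((.*)\)')

def parse_action(line):
    match = ACTION_RE.search(line)
    if match is None:
        return None  # not an Action line (e.g. a Thought or Answer line)
    name = match.group(1)
    # Arguments are quoted strings separated by commas
    args = re.findall(r'"([^"]*)"', match.group(2))
    return name, args

print(parse_action('Action: get_weather("Tehran")'))
```

In a full ReAct loop, the system would run the parsed action, append the result as an `Observation:` line, and ask the model to continue.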

3. Action — The Agent’s Hands

After deciding, the Agent acts. Actions can be:

  • Calling a tool (Function Calling)
  • Answering the user
  • Requesting more information from the user
  • Breaking the task into smaller subtasks

4. Memory — The Agent’s Learning

Without memory, the Agent starts from scratch every time. Agent memory comes in two types:

  • Short-term Memory: The current conversation history. It remembers everything said in this conversation.
  • Long-term Memory: Information preserved across different conversations. Such as user preferences and history of previous operations.

Connection to RAG: Agent long-term memory is usually implemented with RAG! Remember in Episode 5 we discussed how to store information in a Vector Database? The same technique is used here for Agent memory.
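As a toy illustration of long-term memory, here is a sketch that stores facts and retrieves the most relevant one by simple word overlap. A real Agent would use embeddings and a Vector Database as in Episode 5; the word-overlap scoring here is a deliberate simplification for readability.

```python
# Toy long-term memory: store facts, retrieve by word overlap.
# A real Agent would use embeddings + a Vector Database (Episode 5);
# word overlap is a deliberate simplification for illustration.

class LongTermMemory:
    def __init__(self):
        self.facts = []

    def remember(self, fact):
        self.facts.append(fact)

    def recall(self, query):
        # Score each stored fact by how many words it shares with the query
        query_words = set(query.lower().split())
        def overlap(fact):
            return len(query_words & set(fact.lower().split()))
        return max(self.facts, key=overlap, default=None)

memory = LongTermMemory()
memory.remember("user prefers short answers")
memory.remember("user lives in Tehran")
print(memory.recall("what city does the user live in"))
```

Swapping `overlap` for an embedding-similarity function turns this directly into the RAG-backed memory described above.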

Agent Loop — The Core Cycle

An Agent works in a loop. This loop continues until the task is complete:

# Agent loop pseudocode
MAX_STEPS = 10  # guard against infinite loops

def agent_loop(user_request):
    messages = [{"role": "user", "content": user_request}]

    for _ in range(MAX_STEPS):
        # Model thinks
        response = llm.generate(messages, tools=available_tools)

        # Does it want to use a tool?
        if response.has_tool_call:
            # Record the model's tool request, then execute the tool
            messages.append({"role": "assistant", "content": response.tool_call})
            result = execute_tool(response.tool_call.name,
                                  response.tool_call.arguments)

            # Return the result to the model and think again
            messages.append({"role": "tool", "content": result})
            continue

        # No tool needed: the final answer is ready
        return response.text

    raise RuntimeError("Agent exceeded the maximum number of steps")

Important note: The Agent might loop several times. For example, first search the database, then analyze the result, then call another API, and finally give the answer.
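The loop above can be made runnable with a stubbed model. In this sketch the "LLM" is a hard-coded stand-in that first requests a calculator tool and then answers once it sees the result; everything except the loop structure itself is a toy assumption.

```python
# A runnable toy version of the agent loop. The "model" is a hard-coded
# stub that first requests a tool, then gives a final answer once it has
# seen the tool result. Only the loop structure mirrors a real Agent.

def fake_llm(messages):
    # If a tool result is already in the conversation, give the final answer
    tool_results = [m for m in messages if m["role"] == "tool"]
    if tool_results:
        return {"tool_call": None,
                "text": f"The answer is {tool_results[-1]['content']}"}
    # Otherwise, request the calculator tool
    return {"tool_call": {"name": "calculator", "arguments": {"expr": "2 + 3"}},
            "text": None}

def execute_tool(name, arguments):
    if name == "calculator":
        # Toy evaluator that handles only "a + b" expressions
        a, b = arguments["expr"].split("+")
        return str(int(a) + int(b))
    return "Error: unknown tool"

def agent_loop(user_request, max_steps=5):
    messages = [{"role": "user", "content": user_request}]
    for _ in range(max_steps):
        response = fake_llm(messages)
        if response["tool_call"]:
            call = response["tool_call"]
            result = execute_tool(call["name"], call["arguments"])
            messages.append({"role": "tool", "content": result})
            continue
        return response["text"]
    return "Gave up: too many steps"

print(agent_loop("What is 2 + 3?"))  # -> "The answer is 5"
```

Replacing `fake_llm` with a real LLM call (and `execute_tool` with real tools) is essentially all that separates this toy from a working Agent.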

Practical Agent Examples

Let us look at some real-world examples to better understand how powerful Agents are:

Example 1: Programming Assistant

Tools: read file, write file, execute code, search documentation

Scenario: You say "there is a bug in app.py, find and fix it"

Agent: reads the file, finds the bug, edits the file, runs tests, reports the result

Example 2: Research Assistant

Tools: web search, read web pages, summarization

Scenario: You say "research the latest AI developments in 2025"

Agent: performs multiple searches, reads articles, extracts key information, delivers a summary report

Example 3: Sales Assistant

Tools: product search, inventory check, order placement, send email

Scenario: Customer says "I want a laptop under 30 million"

Agent: searches the database, shows options, after customer selection checks inventory, places order, sends confirmation email

Multi-Agent — A Team of Agents

A more advanced concept is Multi-Agent Systems — where multiple Agents work together.

Think of it like a company: one Agent is the project manager who distributes the work, one is the developer who writes code, one is the tester who tests it, and one is the documentation writer.

Each Agent has its own expertise and coordinates with the others. The final result is better than what a single Agent could accomplish alone.

Agent Challenges

Agents are powerful but have challenges:

  • Security: If the Agent has access to sensitive tools (like file deletion), there must be restrictions and confirmation
  • Cost: Each loop iteration means one LLM call. If it loops 10 times, the cost is 10x
  • Reliability: Sometimes the Agent makes wrong decisions. You need fallback mechanisms
  • Infinite loops: The Agent might get stuck repeating the same action. You must set a maximum number of iterations
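The security and infinite-loop challenges are usually handled with simple guard rails in your own code. Here is a sketch; the `DANGEROUS_TOOLS` set and the `confirm` callback are your own definitions, not part of any framework.

```python
# Guard rails around tool execution: dangerous tools require explicit
# confirmation before running. DANGEROUS_TOOLS and the confirm callback
# are assumptions you define yourself, not part of any framework.

DANGEROUS_TOOLS = {"delete_file", "send_email"}

def guarded_execute(name, arguments, run_tool, confirm):
    # confirm() should ask a human; here it is any callable returning bool
    if name in DANGEROUS_TOOLS and not confirm(name, arguments):
        return "Blocked: user did not confirm this action"
    return run_tool(name, arguments)

# Example: a stand-in tool runner and an auto-deny confirmation policy
def run_tool(name, arguments):
    return f"{name} executed with {arguments}"

deny_all = lambda name, arguments: False

print(guarded_execute("delete_file", {"path": "report.txt"}, run_tool, deny_all))
print(guarded_execute("get_weather", {"city": "Tehran"}, run_tool, deny_all))
```

Combined with a maximum iteration count in the agent loop, this covers the two failure modes you control directly: unwanted actions and runaway loops.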

Agent Frameworks

You do not need to build an Agent from scratch. There are good frameworks available:

  • LangGraph: From the LangChain team. Graph-based and flexible. Suitable for complex Agents.
  • CrewAI: Designed for Multi-Agent. Define roles and tasks for each Agent.
  • AutoGen (Microsoft): Building conversations between Agents. Good for research.
  • Claude Agent SDK (Anthropic): The official Anthropic tool for building Agents with Claude.

Practical exercise: Build a simple Agent with Python. Give it three tools: a calculator, a Wikipedia search function, and a file save function. Then ask it to find the area of Iran, convert it to both square kilometers and square miles, and save the result to a file. Watch how it plans the steps on its own.

Summary

In this episode you learned:

  1. The difference between Agent and Chatbot: action, planning, autonomy
  2. How Tool Use / Function Calling works
  3. The 4-part Agent architecture: perception, reasoning, action, memory
  4. The Agent loop and the ReAct pattern
  5. Practical examples and Agent challenges
  6. Popular frameworks

Next Episode: Complete Architecture and Roadmap — Putting It All Together

The next episode is the final episode of this series. We will put everything we learned together: model + RAG + Fine-tuning + Agent. We will map out a real AI project architecture and give you a 6-month roadmap for entering this field. Do not miss it!