From Prompt to Agent: Build an AI Assistant with Memory and Tools in Python (2026)

NeuralPulse | May 27, 2026 | 12 min read

If you've ever called the OpenAI or Google API to make a chatbot answer questions, you know how easy it is to get there. The problem comes later: the chatbot doesn't remember the previous conversation, can't search the web, doesn't execute code, and does nothing beyond text.

That's not an agent. That's a demo.

A real AI agent combines four components: a language model that reasons, tools it can call, memory that persists between turns, and an orchestration loop that decides the next step. By 2026, frameworks like LangGraph (126k stars on GitHub), Vercel AI SDK, and Claude Agent SDK have made this accessible to any developer.

In this tutorial, you'll build a functional agent in Python using two approaches: first the pure ReAct loop to understand the mechanism, then with LangGraph for production. By the end, your agent will search the web, calculate expressions, remember previous conversations, and even persist state between sessions.

What Separates a Chatbot from an Agent

A chatbot receives a message and returns another. End.

An agent receives a goal, plans the steps, executes tools, observes the results, and repeats until the task is complete. The difference lies in the loop.

Its basic cycle — called ReAct (Reasoning + Acting) — works like this:

  1. The LLM receives the history + the user's question
  2. It decides: respond directly or call a tool
  3. If it requests a tool, your code executes that tool and returns the result to the LLM
  4. The LLM analyzes the result and decides the next step
  5. Repeats until it has a final answer

It's deceptively simple. Most errors in production agents come from hasty implementations of this loop.

Think of a real scenario: a support assistant that needs to consult the knowledge base, check the order status in the system, and then respond to the customer. Without the ReAct loop, you'd have to hardcode each step. With the loop, the LLM decides the sequence on its own based on the available tools.
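The five steps above can be sketched without an API key by swapping the LLM for a scripted stand-in. Everything named here (lookup_kb, order_status, scripted_model, run_agent) is hypothetical and exists only to make the loop visible; a real agent would put a model call where scripted_model sits.

```python
# Offline sketch of the ReAct loop for the support scenario.
# A scripted function plays the role of the LLM, so no API key is needed.

def lookup_kb(topic):
    # Stand-in for a knowledge-base query.
    return f"KB article on {topic}: refunds take 5 business days"

def order_status(order_id):
    # Stand-in for an order-system lookup.
    return f"Order {order_id}: shipped"

SUPPORT_TOOLS = {"lookup_kb": lookup_kb, "order_status": order_status}

def scripted_model(history):
    """Pretend LLM: chooses the next action from the tool results seen so far."""
    observations = [m for m in history if m["role"] == "tool"]
    if len(observations) == 0:
        return {"tool": "lookup_kb", "args": {"topic": "refunds"}}
    if len(observations) == 1:
        return {"tool": "order_status", "args": {"order_id": "A-42"}}
    # Enough information gathered: produce the final answer.
    return {"answer": " | ".join(m["content"] for m in observations)}

def run_agent(question, model, max_iterations=5):
    history = [{"role": "user", "content": question}]               # step 1
    for _ in range(max_iterations):
        decision = model(history)                                    # step 2
        if "answer" in decision:
            return decision["answer"]                                # step 5
        tool = SUPPORT_TOOLS[decision["tool"]]
        result = tool(**decision["args"])                            # step 3
        history.append({"role": "tool", "content": result})          # feeds step 4
    return "Maximum number of iterations reached"

print(run_agent("Where is my refund for order A-42?", scripted_model))
```

Nothing in run_agent fixes the order of the two lookups. The sequence comes entirely from what the model returns, which is the point of the loop.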

The Detail That Makes All the Difference: tool_choice

In the pure ReAct loop in Approach 1 below, I use tool_choice="auto". This lets the LLM decide between responding and calling a tool. But there are two other important options:

  • tool_choice="none": forces the LLM to respond without tools — useful when you want to guarantee a direct answer
  • tool_choice="required": forces the LLM to call at least one tool — good for pipelines where every query needs to pass through validation first

Choosing the wrong one here is one of the most common sources of infinite loops in agents. With tool_choice="required" on every turn, for example, the model can never produce a final text answer. If the LLM keeps calling tools without ever reaching a final answer, you need an iteration limit (like max_iterations=10 in the Approach 1 example below) and a well-defined fallback.
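One possible fallback, sketched here as an assumption rather than a recommendation from any SDK: let the model use tools freely until the last permitted iteration, then switch to "none" so it has to answer with what it already knows. The helper name pick_tool_choice is hypothetical.

```python
def pick_tool_choice(iteration, max_iterations):
    """Return the tool_choice value for a given zero-based loop iteration.

    "auto" while iterations remain, "none" on the final one so the model
    must produce a text answer instead of requesting yet another tool.
    """
    if max_iterations < 1:
        raise ValueError("max_iterations must be at least 1")
    return "none" if iteration >= max_iterations - 1 else "auto"

# Example schedule for a three-iteration budget.
print([pick_tool_choice(i, 3) for i in range(3)])
```

Inside a loop you would pass tool_choice=pick_tool_choice(i, max_iterations) to each model call, using the loop index i.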

Approach 1: Pure ReAct Loop (No Framework)

Before using LangGraph, it's worth writing the loop manually. Doing so shows you what the framework does under the hood.

import json
from openai import OpenAI

client = OpenAI()

# JSON Schema descriptions of the tools the model is allowed to call.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "search_web",
            "description": "Searches the web and returns a summary of results",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "calculate",
            "description": "Executes a mathematical expression",
            "parameters": {
                "type": "object",
                "properties": {"expression": {"type": "string"}},
                "required": ["expression"],
            },
        },
    },
]

def execute_tool(name, args):
    if name == "calculate":
        # WARNING: eval() runs arbitrary Python. Fine for a local demo,
        # unsafe for any input you do not fully control.
        return str(eval(args["expression"]))
    if name == "search_web":
        return f"Simulated results for: {args['query']}"
    return "Tool not found"

def react_agent(user_message, max_iterations=10):
    messages = [{"role": "user", "content": user_message}]

    for _ in range(max_iterations):
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=messages,
            tools=TOOLS,
            tool_choice="auto",
        )
        msg = response.choices[0].message
        messages.append(msg)  # keep the assistant turn, including any tool calls

        if not msg.tool_calls:
            return msg.content  # no tool requested: this is the final answer

        for call in msg.tool_calls:
            result = execute_tool(
                call.function.name,
                json.loads(call.function.arguments),
            )
            # Each result must reference the id of the call it answers.
            messages.append({
                "role": "tool",
                "tool_call_id": call.id,
                "content": result,
            })

    return "Maximum number of iterations reached"

This code works. But it has problems: it doesn't persist state between sessions, has no checkpoint to resume after a failure, and the linear loop becomes fragile in tasks with multiple branches. This is where LangGraph comes in.
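One more caveat before moving on: the calculate tool passes a model-written string to eval, which will run any Python it is given. A minimal sketch of a safer alternative, assuming only the four basic arithmetic operators are needed, walks the parsed expression and rejects everything else. The name safe_calculate is hypothetical; it is not part of the original example.

```python
import ast
import operator

# Whitelisted operators; anything else in the expression is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}

def safe_calculate(expression):
    """Evaluate +, -, *, / on numeric literals without calling eval()."""
    def evaluate(node):
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](evaluate(node.operand))
        # Names, calls, attribute access, etc. all end up here.
        raise ValueError("unsupported expression")

    return evaluate(ast.parse(expression, mode="eval").body)

print(safe_calculate("2 + 3 * 4"))
```

Malformed input still raises SyntaxError from ast.parse, so a tool wrapper would want to catch both exceptions and return an error string the model can read.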

Approach 2: Agent with LangGraph

LangGraph replaces the linear loop with a directed graph with typed states, native checkpointing, and support for conditionals. Each node in the graph is a processing step; each edge defines how execution flows.

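import json
import operator
from typing import Annotated, List, TypedDict

from langgraph.graph import StateGraph, END

# Reuses client, TOOLS and execute_tool from Approach 1.

class AgentState(TypedDict):
    # operator.add acts as a reducer: messages returned by a node are appended
    # to the list instead of replacing it, so history accumulates across steps.
    messages: Annotated[List[dict], operator.add]
    next_step: str

def node_llm(state: AgentState):
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=state["messages"],
        tools=TOOLS,
    )
    msg = response.choices[0].message
    # Store a plain dict so every message in the state has the same shape.
    entry = {"role": "assistant", "content": msg.content}
    if msg.tool_calls:
        entry["tool_calls"] = [
            {
                "id": c.id,
                "type": "function",
                "function": {"name": c.function.name, "arguments": c.function.arguments},
            }
            for c in msg.tool_calls
        ]
    # Return only the new message; the reducer appends it.
    return {"messages": [entry], "next_step": "tools" if msg.tool_calls else "end"}

def node_tools(state: AgentState):
    results = []
    for call in state["messages"][-1]["tool_calls"]:
        output = execute_tool(
            call["function"]["name"],
            json.loads(call["function"]["arguments"]),
        )
        results.append({"role": "tool", "tool_call_id": call["id"], "content": output})
    return {"messages": results, "next_step": "llm"}

def router(state: AgentState):
    if state["next_step"] == "tools":
        return "tools"
    elif state["next_step"] == "end":
        return END
    return "llm"

graph = StateGraph(AgentState)
graph.add_node("llm", node_llm)
graph.add_node("tools", node_tools)
graph.set_entry_point("llm")
graph.add_conditional_edges("llm", router)
graph.add_edge("tools", "llm")
app = graph.compile()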

The fundamental difference: with LangGraph, the llm → tools → llm cycle is explicit in the graph. Once the graph is compiled with a checkpointer (see Persistent Memory with Checkpointing below), state is saved after each step. If the process crashes in the middle of a run, it can be resumed from the last saved step instead of starting over.

Why Graphs Are Better Than Linear Loops

The pure ReAct loop is sequential: it runs one tool call at a time through a single fixed path (call the model, run the tools, repeat). A graph allows:

  • Conditional branches: depending on the LLM's output, the flow goes to different nodes
  • Parallel nodes: multiple tools executing at the same time
  • Human-in-the-loop: an approval node that pauses execution until a human authorizes
  • Subgraphs: specialized agents within the main agent, each with its own graph

In production, this flexibility isn't a luxury — it's a necessity. A support agent, for example, might need to query the customer database and the inventory at the same time (parallel), and only proceed if a manager approves a discount (human-in-the-loop).
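To make the support example concrete, here is a sketch of the two ideas in plain asyncio. It does not use LangGraph's API. The lookup functions, the approval threshold, and all names are hypothetical stand-ins.

```python
import asyncio

async def query_customer(customer_id):
    await asyncio.sleep(0.1)  # simulated database latency
    return {"customer_id": customer_id, "tier": "gold"}

async def query_inventory(sku):
    await asyncio.sleep(0.1)  # simulated inventory-service latency
    return {"sku": sku, "in_stock": 7}

async def gather_context(customer_id, sku):
    # Both lookups run concurrently; total wait is roughly the slower of the two.
    customer, inventory = await asyncio.gather(
        query_customer(customer_id),
        query_inventory(sku),
    )
    return {"customer": customer, "inventory": inventory}

def needs_approval(discount_percent, threshold=10):
    # Human-in-the-loop gate: discounts above the threshold wait for a manager.
    return discount_percent > threshold

context = asyncio.run(gather_context("c-1", "sku-9"))
print(context, needs_approval(15))
```

In a graph framework, the two lookups would be parallel nodes and the approval check would be a node that pauses the run. The control flow is the same idea.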

Persistent Memory with Checkpointing

LangGraph's native checkpointing is one of its differentiators. With a few lines (and the langgraph-checkpoint-sqlite package installed), you connect a SQLite database and your agent starts remembering conversations between sessions:

from langgraph.checkpoint.sqlite import SqliteSaver

with SqliteSaver.from_conn_string("memory.db") as saver:
    app = graph.compile(checkpointer=saver)
    config = {"configurable": {"thread_id": "user-123"}}

    # Must run inside the with block, while the database connection is open.
    result = app.invoke(
        {"messages": [{"role": "user", "content": "What was that article we discussed yesterday?"}]},
        config,
    )

Each thread_id represents an independent session. The agent automatically retrieves the complete history for that user. This solves the most common problem with chatbots in production: amnesia between conversations.
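If the thread_id idea feels abstract, the toy store below shows the principle with nothing but the standard library: one table, with every message tagged by the thread it belongs to. This is an illustration only. The ThreadStore class is hypothetical and says nothing about how LangGraph actually lays out its checkpoint tables.

```python
import json
import sqlite3

class ThreadStore:
    """Toy per-thread message log backed by SQLite."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS messages ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "thread_id TEXT NOT NULL, "
            "payload TEXT NOT NULL)"
        )

    def append(self, thread_id, message):
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                "INSERT INTO messages (thread_id, payload) VALUES (?, ?)",
                (thread_id, json.dumps(message)),
            )

    def history(self, thread_id):
        rows = self.conn.execute(
            "SELECT payload FROM messages WHERE thread_id = ? ORDER BY id",
            (thread_id,),
        )
        return [json.loads(payload) for (payload,) in rows]

store = ThreadStore()
store.append("user-123", {"role": "user", "content": "Recommend an article on agents"})
store.append("user-456", {"role": "user", "content": "Hello"})
store.append("user-123", {"role": "assistant", "content": "Try the ReAct paper"})
print(store.history("user-123"))
```

Passing a file path instead of ":memory:" is what makes the history survive a restart.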

Real Tools: Connecting the Agent to the World

An agent without tools is a fancy chatbot. With tools, it becomes useful. Here's how to add a real web search using duckduckgo_search, an unofficial Python client for DuckDuckGo:

from duckduckgo_search import DDGS

def search_web(query: str) -> str:
    with DDGS() as ddgs:
        results = list(ddgs.text(query, max_results=3))
    # One "title: snippet" line per result, snippets cut to 200 characters.
    return "\n".join(
        f"{r['title']}: {r['body'][:200]}" for r in results
    )

Then register the tool in the TOOLS array with an accurate description, and have execute_tool call search_web instead of returning the simulated string. The LLM decides when to use a tool based on that description, which is why clear, specific descriptions are essential.
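As the tool list grows, keeping the schema and the dispatch logic in sync by hand gets error-prone. One way to handle it, sketched here under my own assumptions (the register decorator, REGISTRY, and tool_schemas are hypothetical names, and search_web is stubbed so the example runs offline), is to declare both in one place:

```python
import json

REGISTRY = {}

def register(name, description, parameters):
    """Decorator that stores a tool's callable next to its schema entry."""
    def decorator(func):
        REGISTRY[name] = {
            "func": func,
            "schema": {
                "type": "function",
                "function": {
                    "name": name,
                    "description": description,
                    "parameters": parameters,
                },
            },
        }
        return func
    return decorator

@register(
    "search_web",
    "Searches the web and returns the top results as 'title: snippet' lines",
    {
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
)
def search_web(query):
    # Stub for offline use; swap in the DuckDuckGo-backed version.
    return f"Simulated results for: {query}"

def tool_schemas():
    # The list to pass as the tools argument of the model call.
    return [entry["schema"] for entry in REGISTRY.values()]

def execute_tool(name, arguments_json):
    entry = REGISTRY.get(name)
    if entry is None:
        return "Tool not found"
    return entry["func"](**json.loads(arguments_json))

print(execute_tool("search_web", '{"query": "langgraph"}'))
```

With this layout, adding a tool means writing one decorated function. The schema list and the dispatcher pick it up automatically.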

Next Steps

Your agent already works with tools and persistent memory. From here, you can:

  • Add more tools: calculator, database access, email sending, external API queries
  • Multi-agent: use the supervisor + sub-agents pattern from LangGraph to delegate complex tasks
  • Human-in-the-loop: add approval nodes before critical actions (sending email, executing code)
  • Streaming: implement LangGraph's astream_events to show the agent's reasoning in real-time
  • Observability: connect LangSmith to track every tool call and token cost

Deploying the Agent

With LangGraph's checkpointing, your agent already persists state. Now you need to expose it as an API. The simplest way is with FastAPI:

from fastapi import FastAPI
from pydantic import BaseModel

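# Named api so it does not shadow app, the compiled LangGraph agent.
api = FastAPI()

class Question(BaseModel):
    message: str
    session: str

@api.post("/chat")
def chat(req: Question):
    # A plain def: FastAPI runs it in a worker thread, so the blocking
    # app.invoke call does not stall the event loop.
    config = {"configurable": {"thread_id": req.session}}
    result = app.invoke(
        {"messages": [{"role": "user", "content": req.message}]},
        config,
    )
    last = result["messages"][-1]
    # Handles both dict messages and SDK message objects.
    content = last["content"] if isinstance(last, dict) else last.content
    return {"response": content}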

With this, any frontend — React, Streamlit, even a Telegram bot — can converse with your agent. Each session becomes an independent conversation with its own history.

Summary: The Roadmap for Building Agents in 2026

Approach               | Complexity | Production-ready | Ideal for
Pure ReAct Loop        | Low        | No               | Learning the mechanism
Basic LangGraph        | Medium     | Yes              | Agents with tools
LangGraph + checkpoint | Medium     | Yes              | Agents with persistent memory
LangGraph + subgraphs  | High       | Yes              | Multi-agent systems

Start with the pure loop, understand each piece, and only then migrate to the framework. In 2026, building agents is no longer magic. It's software engineering with a new type of API. Your future self — when the agent breaks at 3 AM — will thank you for starting with the right approach.
