
From Prompt to Agent: Build an AI Assistant with Memory and Tools in Python (2026)

NeuralPulse|May 27, 2026|12 min read

If you've ever called the OpenAI or Google API to make a chatbot answer questions, you know how easy it is to get there. The problem comes later: the chatbot doesn't remember the previous conversation, can't search the web, doesn't execute code, and does nothing beyond text.

That's not an agent. That's a demo.

A real AI agent combines four components: a language model that reasons, tools it can call, memory that persists between turns, and an orchestration loop that decides the next step. By 2026, frameworks like LangGraph (126k stars on GitHub), Vercel AI SDK, and Claude Agent SDK have made this accessible to any developer.

In this tutorial, you'll build a functional agent in Python using two approaches: first the pure ReAct loop to understand the mechanism, then with LangGraph for production. By the end, your agent will search the web, calculate expressions, remember previous conversations, and even persist state between sessions.

What Separates a Chatbot from an Agent

A chatbot receives a message and returns another. End.

An agent receives a goal, plans the steps, executes tools, observes the results, and repeats until the task is complete. The difference lies in the loop.

Its basic cycle — called ReAct (Reasoning + Acting) — works like this:

The LLM receives the history plus the user's question
It decides whether to respond directly or call a tool
If it calls a tool, your code executes the tool and returns the result to the LLM
The LLM analyzes the result and decides the next step
The cycle repeats until the LLM has a final answer
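
The cycle above can be sketched without any API calls by scripting the model's decisions. This is only an illustration of the control flow: fake_llm, run_tool, and react are hypothetical names, and the "model" here is a hard-coded function, not an LLM.

```python
# Minimal ReAct cycle with a scripted stand-in for the LLM (no network access).
# fake_llm, run_tool, and react are hypothetical names used for illustration.

def fake_llm(messages):
    """Pretend model: request the add tool once, then answer from its result."""
    last = messages[-1]
    if last["role"] == "user":
        return {"tool": "add", "args": {"a": 2, "b": 3}}    # decision: call a tool
    return {"answer": f"The result is {last['content']}"}    # decision: respond directly


def run_tool(name, args):
    if name == "add":
        return str(args["a"] + args["b"])
    return "Tool not found"


def react(question, max_iterations=5):
    messages = [{"role": "user", "content": question}]       # history + question
    for _ in range(max_iterations):
        decision = fake_llm(messages)
        if "answer" in decision:                              # final answer: stop
            return decision["answer"]
        result = run_tool(decision["tool"], decision["args"])  # execute the tool
        messages.append({"role": "tool", "content": result})   # feed the result back
    return "Maximum number of iterations reached"


print(react("What is 2 + 3?"))  # prints: The result is 5
```

Swapping fake_llm for a real model call is the only change needed to turn this into a working loop; everything else is the same bookkeeping.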

It's deceptively simple. Most errors in production agents come from hasty implementations of this loop.

Think of a real scenario: a support assistant that needs to consult the knowledge base, check the order status in the system, and then respond to the customer. Without the ReAct loop, you'd have to hardcode each step. With the loop, the LLM decides the sequence on its own based on the available tools.

The Detail That Makes All the Difference: tool_choice

The pure ReAct loop in Approach 1 passes tool_choice="auto", which lets the LLM decide between responding and calling a tool. There are two other important options:

tool_choice="none": forces the LLM to respond without tools — useful when you want to guarantee a direct answer
tool_choice="required": forces the LLM to call at least one tool — good for pipelines where every query needs to pass through validation first

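Choosing the wrong one is one of the most common sources of infinite loops in agents. With tool_choice="required" on every turn, for example, the LLM can never return a plain-text answer, so the loop never ends on its own. If the LLM keeps calling tools without ever reaching a final answer, you need an iteration limit (like max_iterations=10 in the Approach 1 example) and a well-defined fallback.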
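
One possible fallback policy, sketched under assumptions: vary tool_choice by iteration so the last permitted call is forced to produce text. The helper pick_tool_choice and its force_first_tool flag are hypothetical; only the three string values come from the options described above.

```python
def pick_tool_choice(iteration, max_iterations, force_first_tool=False):
    """Return the tool_choice value to send on a given loop iteration.

    Hypothetical policy:
    - last allowed iteration -> "none", so the model must answer in plain text
    - first iteration        -> "required" if the pipeline demands a tool call
    - otherwise              -> "auto"
    """
    if iteration >= max_iterations - 1:
        return "none"
    if iteration == 0 and force_first_tool:
        return "required"
    return "auto"


# Example schedule for a 4-iteration budget with a mandatory first tool call.
schedule = [pick_tool_choice(i, 4, force_first_tool=True) for i in range(4)]
print(schedule)  # ['required', 'auto', 'auto', 'none']
```

Inside the loop, the returned value would be passed as the tool_choice argument of each model call, so the agent always ends with an answer instead of a generic "limit reached" message.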

Approach 1: Pure ReAct Loop (No Framework)

Before using LangGraph, it's worth writing the loop manually, so you understand what the framework does under the hood.

import json
from openai import OpenAI

client = OpenAI()

TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "search_web",
            "description": "Searches the web and returns a summary of results",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "calculate",
            "description": "Executes a mathematical expression",
            "parameters": {
                "type": "object",
                "properties": {"expression": {"type": "string"}},
                "required": ["expression"]
            }
        }
    }
]

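def execute_tool(name, args):
    if name == "calculate":
        # WARNING: eval runs arbitrary Python. Acceptable for a local demo only;
        # never pass model-generated input to eval in production.
        return str(eval(args["expression"]))
    if name == "search_web":
        return f"Simulated results for: {args['query']}"
    return "Tool not found"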

def react_agent(user_message, max_iterations=10):
    messages = [{"role": "user", "content": user_message}]

    for _ in range(max_iterations):
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=messages,
            tools=TOOLS,
            tool_choice="auto"
        )
        msg = response.choices[0].message
        messages.append(msg)

        # No tool calls means the model produced its final answer.
        if not msg.tool_calls:
            return msg.content

        # Run every requested tool and feed each result back to the model.
        for call in msg.tool_calls:
            result = execute_tool(
                call.function.name,
                json.loads(call.function.arguments)
            )
            messages.append({
                "role": "tool",
                "tool_call_id": call.id,
                "content": result
            })

    return "Maximum number of iterations reached"

This code works. But it has problems: it doesn't persist state between sessions, has no checkpoint to resume after a failure, and the linear loop becomes fragile in tasks with multiple branches. This is where LangGraph comes in.
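
One more weakness is worth fixing before moving on: the calculate tool hands model-generated text to eval. A safer calculator can walk the expression's syntax tree and allow only arithmetic. The sketch below is one way to do that with the standard library; safe_calculate is a hypothetical replacement, not part of the original code, and exponentiation is left out on purpose because expressions like 9**9**9 can exhaust memory.

```python
import ast
import operator

# Whitelisted arithmetic operators; anything else is rejected.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_calculate(expression: str) -> str:
    """Evaluate +, -, *, /, % on numbers; return an error string otherwise."""

    def evaluate(node):
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](evaluate(node.left), evaluate(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](evaluate(node.operand))
        raise ValueError("unsupported expression")

    try:
        tree = ast.parse(expression, mode="eval")
        return str(evaluate(tree.body))
    except (SyntaxError, ValueError, ZeroDivisionError) as exc:
        # Return the error as text so the model can read it and retry.
        return f"Error: {exc}"


print(safe_calculate("2 + 3 * 4"))  # prints: 14
```

Returning errors as strings rather than raising keeps the loop alive: the tool result goes back to the LLM, which can correct the expression on the next turn.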

Approach 2: Agent with LangGraph

LangGraph replaces the linear loop with a directed graph with typed states, native checkpointing, and support for conditionals. Each node in the graph is a processing step; each edge defines how execution flows.

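import json
import operator
from typing import Annotated, List, TypedDict

from langgraph.graph import StateGraph, END

# Reuses client, TOOLS, and execute_tool from Approach 1.

class AgentState(TypedDict):
    # operator.add is a reducer: each node returns only its new messages and
    # LangGraph appends them to the stored list instead of overwriting it.
    messages: Annotated[List[dict], operator.add]
    next_step: str

def node_llm(state: AgentState):
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=state["messages"],
        tools=TOOLS
    )
    msg = response.choices[0].message
    # Store a plain dict so the state matches List[dict] and serializes cleanly.
    assistant = {"role": "assistant", "content": msg.content}
    if msg.tool_calls:
        assistant["tool_calls"] = [
            {
                "id": call.id,
                "type": "function",
                "function": {
                    "name": call.function.name,
                    "arguments": call.function.arguments
                }
            }
            for call in msg.tool_calls
        ]
    return {
        "messages": [assistant],
        "next_step": "tools" if msg.tool_calls else "end"
    }

def node_tools(state: AgentState):
    last_msg = state["messages"][-1]
    results = []
    for call in last_msg["tool_calls"]:
        result = execute_tool(
            call["function"]["name"],
            json.loads(call["function"]["arguments"])
        )
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": result
        })
    return {"messages": results, "next_step": "llm"}

def router(state: AgentState):
    if state["next_step"] == "tools":
        return "tools"
    if state["next_step"] == "end":
        return END
    return "llm"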

graph = StateGraph(AgentState)
graph.add_node("llm", node_llm)
graph.add_node("tools", node_tools)
graph.set_entry_point("llm")
graph.add_conditional_edges("llm", router)
graph.add_edge("tools", "llm")
app = graph.compile()

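The fundamental difference: with LangGraph, the llm → tools → llm cycle is explicit in the graph. Once the graph is compiled with a checkpointer (covered in Persistent Memory with Checkpointing below), LangGraph saves the state after each step, so if the process crashes in the middle of a tool call, the run can resume from the last saved checkpoint instead of starting over.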
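
To make the crash-and-resume idea concrete, here is a toy graph runner in plain Python. It is not LangGraph's implementation, only a sketch of the concept: run_graph, the node functions, and the in-memory checkpoints list are all hypothetical, and a real checkpointer writes to durable storage.

```python
# Toy graph runner, NOT LangGraph's implementation: nodes are functions, the
# router picks the next node, and a checkpoint is saved after every step.
END = "__end__"


def run_graph(nodes, router, state, checkpoints, start="llm", max_steps=20):
    current = start
    if checkpoints:  # resume from the last saved step instead of the start
        current = checkpoints[-1]["next"]
        state = dict(checkpoints[-1]["state"])
    for _ in range(max_steps):
        if current == END:
            return state
        state = nodes[current](dict(state))
        current = router(current, state)
        checkpoints.append({"state": dict(state), "next": current})
    raise RuntimeError("step limit reached")


calls = {"tools": 0}


def llm(state):
    state["llm_runs"] = state.get("llm_runs", 0) + 1
    state["next_step"] = "end" if "tool_result" in state else "tools"
    return state


def tools(state):
    calls["tools"] += 1
    if calls["tools"] == 1:
        raise ConnectionError("simulated crash during a tool call")
    state["tool_result"] = 42
    return state


def router(current, state):
    if current == "tools":
        return "llm"  # fixed edge: tools always returns to llm
    return "tools" if state["next_step"] == "tools" else END


nodes = {"llm": llm, "tools": tools}
checkpoints = []
try:
    run_graph(nodes, router, {}, checkpoints)
except ConnectionError:
    pass  # the "process" died inside the tools node

# The second run picks up at the tools node; the first llm step is not repeated.
final = run_graph(nodes, router, {}, checkpoints)
print(final["tool_result"], len(checkpoints))  # prints: 42 3
```

The first llm step ran exactly once across both runs, which is the property that matters when a step is slow or costs money.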

Why Graphs Are Better Than Linear Loops

The pure ReAct loop is sequential: one tool call at a time, always in the same order. A graph allows:

Conditional branches: depending on the LLM's output, the flow goes to different nodes
Parallel nodes: multiple tools executing at the same time
Human-in-the-loop: an approval node that pauses execution until a human authorizes
Subgraphs: specialized agents within the main agent, each with its own graph

In production, this flexibility isn't a luxury — it's a necessity. A support agent, for example, might need to query the customer database and the inventory at the same time (parallel), and only proceed if a manager approves a discount (human-in-the-loop).
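
The support scenario can be sketched with the standard library alone, to show what "parallel" and "human-in-the-loop" mean in terms of control flow. Everything here is hypothetical (query_customer, query_inventory, the returned data, the approved_by parameter); in LangGraph these would be nodes and an interrupt point rather than plain functions.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def query_customer(customer_id):
    time.sleep(0.1)  # stands in for a database round trip
    return {"customer_id": customer_id, "tier": "gold"}


def query_inventory(sku):
    time.sleep(0.1)  # stands in for an inventory API call
    return {"sku": sku, "in_stock": 7}


def gather_context(customer_id, sku):
    """Parallel branch: both lookups run at the same time."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        customer = pool.submit(query_customer, customer_id)
        inventory = pool.submit(query_inventory, sku)
        return {"customer": customer.result(), "inventory": inventory.result()}


def apply_discount(context, percent, approved_by=None):
    """Approval gate: nothing is applied until a human signs off."""
    if approved_by is None:
        return {"status": "pending_approval", "percent": percent}
    return {"status": "applied", "percent": percent, "approved_by": approved_by}


context = gather_context("c-1", "sku-9")
print(apply_discount(context, 10)["status"])               # prints: pending_approval
print(apply_discount(context, 10, "manager-7")["status"])  # prints: applied
```

In a linear loop, the two lookups would run one after the other and the approval check would have to be hardcoded into the tool itself; a graph lets both be declared as structure.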

Persistent Memory with Checkpointing

LangGraph's native checkpointing is one of its differentiators. With one line, you connect an SQLite database and your agent starts remembering everything between sessions:

from langgraph.checkpoint.sqlite import SqliteSaver

# Requires the langgraph-checkpoint-sqlite package.
with SqliteSaver.from_conn_string("memory.db") as saver:
    app = graph.compile(checkpointer=saver)
    config = {"configurable": {"thread_id": "user-123"}}

    # Invoke inside the with block: the database connection closes on exit.
    result = app.invoke(
        {"messages": [{"role": "user", "content": "What was that article we discussed yesterday?"}]},
        config
    )

Each thread_id represents an independent session. When you invoke the agent with a thread_id it has seen before, the checkpointer restores the state saved for that thread, so the conversation continues with its history. This solves the most common problem with chatbots in production: amnesia between conversations.
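
Conceptually, thread-keyed memory is a table of messages indexed by thread_id. The sketch below shows that idea with the standard sqlite3 module. It is not how SqliteSaver stores data internally; ThreadStore and its schema are assumptions made for illustration.

```python
import json
import sqlite3


class ThreadStore:
    """Hypothetical per-thread message log backed by SQLite."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS messages ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, thread_id TEXT, payload TEXT)"
        )

    def append(self, thread_id, message):
        self.conn.execute(
            "INSERT INTO messages (thread_id, payload) VALUES (?, ?)",
            (thread_id, json.dumps(message)),
        )
        self.conn.commit()

    def history(self, thread_id):
        rows = self.conn.execute(
            "SELECT payload FROM messages WHERE thread_id = ? ORDER BY id",
            (thread_id,),
        )
        return [json.loads(payload) for (payload,) in rows]


store = ThreadStore()
store.append("user-123", {"role": "user", "content": "Summarize this article"})
store.append("user-456", {"role": "user", "content": "Hello"})
store.append("user-123", {"role": "assistant", "content": "Here is the summary"})

print(len(store.history("user-123")))  # prints: 2
print(len(store.history("user-456")))  # prints: 1
```

Passing a file path instead of ":memory:" makes the log survive process restarts, which is the same reason the tutorial points the checkpointer at memory.db.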

Real Tools: Connecting the Agent to the World

An agent without tools is a fancy chatbot. With tools, it becomes useful. Here's how to add a real web search using the duckduckgo_search package, an unofficial DuckDuckGo client:

from duckduckgo_search import DDGS

def search_web(query: str) -> str:
    with DDGS() as ddgs:
        results = list(ddgs.text(query, max_results=3))
    # One line per result: the title plus the first 200 characters of the snippet.
    return "\n".join(
        f"{r['title']}: {r['body'][:200]}"
        for r in results
    )

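The search_web entry already exists in TOOLS, so the only change is in execute_tool: call search_web(args["query"]) instead of returning the simulated string. Any new tool additionally needs its own entry in TOOLS with an accurate description. The LLM decides when to use a tool based on that description, which is why clear descriptions are essential.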
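
As the number of tools grows, keeping the schema list and the dispatch logic in sync by hand gets error-prone. One possible pattern is a small registry that builds both from a single call. The register helper and REGISTRY dict are hypothetical, search_web_stub stands in for the real search function so the sketch runs offline, and marking every parameter as required is a simplifying assumption.

```python
def search_web_stub(query: str) -> str:
    # Stand-in for the real search function so the example runs offline.
    return f"Simulated results for: {query}"


REGISTRY = {}  # tool name -> Python callable


def register(name, description, properties, func):
    """Store the callable and return the matching tool schema entry."""
    REGISTRY[name] = func
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": list(properties),  # assumption: all parameters required
            },
        },
    }


TOOLS = [
    register(
        "search_web",
        "Searches the web and returns a summary of results",
        {"query": {"type": "string"}},
        search_web_stub,
    ),
]


def execute_tool(name, args):
    func = REGISTRY.get(name)
    if func is None:
        return "Tool not found"
    return func(**args)


print(execute_tool("search_web", {"query": "LangGraph"}))
# prints: Simulated results for: LangGraph
```

With this shape, adding a tool is one register call, and execute_tool never needs another if branch.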

Next Steps

Your agent already works with tools and persistent memory. From here, you can:

Add more tools: calculator, database access, email sending, external API queries
Multi-agent: use the supervisor + sub-agents pattern from LangGraph to delegate complex tasks
Human-in-the-loop: add approval nodes before critical actions (sending email, executing code)
Streaming: implement LangGraph's astream_events to show the agent's reasoning in real-time
Observability: connect LangSmith to track every tool call and token cost

Deploying the Agent

With LangGraph's checkpointing, your agent already persists state. Now you need to expose it as an API. The simplest way is with FastAPI:

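from fastapi import FastAPI
from pydantic import BaseModel

# Named api so it does not shadow app, the compiled LangGraph agent.
api = FastAPI()

class Question(BaseModel):
    message: str
    session: str

# A plain def (not async def) lets FastAPI run the blocking invoke call in a worker thread.
@api.post("/chat")
def chat(req: Question):
    config = {"configurable": {"thread_id": req.session}}
    result = app.invoke(
        {"messages": [{"role": "user", "content": req.message}]},
        config
    )
    last = result["messages"][-1]
    # Messages may be dicts or SDK message objects, depending on how the nodes store them.
    content = last["content"] if isinstance(last, dict) else last.content
    return {"response": content}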

With this, any frontend — React, Streamlit, even a Telegram bot — can converse with your agent. Each session becomes an independent conversation with its own history.

Summary: The Roadmap for Building Agents in 2026

Approach	Complexity	Production-ready	Ideal for
Pure ReAct Loop	Low	No	Learning the mechanism
Basic LangGraph	Medium	Yes	Agents with tools
LangGraph + checkpoint	Medium	Yes	Agents with persistent memory
LangGraph + subgraphs	High	Yes	Multi-agent systems

Start with the pure loop, understand each piece, and only then migrate to the framework. In 2026, building agents is no longer magic. It's software engineering with a new type of API. Your future self — when the agent breaks at 3 AM — will thank you for starting with the right approach.
