Function Calling in Practice: Python Tutorial for Chatbots with LLMs that Execute Actions in 2026
Your chatbot can answer questions. But can it act? In 2026, that's the line separating a prototype from a real product.
OpenAI data shows that 70% of developers already use function calling in production chatbots (OpenAI DevDay 2025). The reason is simple: without it, the LLM is just a digital parrot. With function calling, it becomes a system orchestrator.
In this tutorial, you'll learn how to implement function calling in practice with Python. We'll use the APIs of the three main platforms — OpenAI, Anthropic Claude, and Google Gemini — to build a chatbot that checks the weather, accesses a database, and sends notifications.
What is Function Calling (and Why You Need It)
Function calling, also called tool use, is the LLM's ability to identify when it should call an external function. The model doesn't execute the code — it returns a structured JSON object with the function's parameters. Your system executes the function and returns the result to the model.
"Function calling transforms LLMs from text generators into software agents. It's the bridge between natural language and APIs." — Anthropic's official documentation on tool use (2025)
The basic flow is:
- You define functions with descriptions and parameters (like a JSON schema).
- The model decides whether to call a function based on the user's prompt.
- Your code executes the actual function and returns the result.
- The model incorporates this result into the final response.
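The four steps above can be sketched without any provider SDK. In this minimal sketch, `fake_model` is a hypothetical, hard-coded stand-in for the LLM, and `get_time` is an illustrative tool that does not appear elsewhere in this tutorial. Only the shape of the loop matters: define, decide, execute, incorporate.

```python
import json
from typing import Optional

# Step 1: describe the function with a JSON-schema-like definition.
TOOLS = [{
    "name": "get_time",
    "description": "Returns the current time for a city",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]

def get_time(city: str) -> str:
    # Illustrative tool: returns a fixed value instead of a real clock lookup.
    return json.dumps({"city": city, "time": "14:00"})

def fake_model(prompt: str, tool_result: Optional[str] = None) -> dict:
    """Hypothetical stand-in for an LLM; a real model makes this decision itself."""
    if tool_result is None:
        # Step 2: the "model" requests a tool call as structured JSON.
        return {"tool_call": {"name": "get_time", "arguments": '{"city": "Lisbon"}'}}
    # Step 4: the "model" folds the tool result into its final answer.
    data = json.loads(tool_result)
    return {"text": f"It is {data['time']} in {data['city']}."}

def run(prompt: str) -> str:
    decision = fake_model(prompt)
    if "tool_call" not in decision:
        return decision["text"]
    call = decision["tool_call"]
    # Step 3: your code, not the model, executes the function.
    handlers = {"get_time": get_time}
    result = handlers[call["name"]](**json.loads(call["arguments"]))
    return fake_model(prompt, tool_result=result)["text"]

print(run("What time is it in Lisbon?"))
```

With a real provider, the two `fake_model` calls become two API requests, and everything between them stays in your own code.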
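A typical function call costs about US$0.003 (Anthropic pricing, 2025). Providers bill per token rather than per call, so the actual figure depends on the model and on the size of your tool definitions and results, which count as input tokens. For high-volume chatbots, this is a low operational cost compared to the gain in utility.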
Hands-on: Implementation with OpenAI
OpenAI pioneered function calling. The API is mature and well-documented. Let's create an assistant that checks the weather and manages tasks.
First, install the library:
pip install openai
Now, define the functions the model can call:
import json
from openai import OpenAI
client = OpenAI(api_key="your-key-here")
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Gets the current temperature for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {
                        "type": "string",
                        "description": "City name, e.g., São Paulo"
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"]
                    }
                },
                "required": ["city"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "add_task",
            "description": "Adds a task to the user's list",
            "parameters": {
                "type": "object",
                "properties": {
                    "task": {
                        "type": "string",
                        "description": "Task description"
                    },
                    "priority": {
                        "type": "string",
                        "enum": ["high", "medium", "low"]
                    }
                },
                "required": ["task", "priority"]
            }
        }
    }
]
Create the actual functions that will be executed:
def get_weather(city: str, unit: str = "celsius"):
    # Simulates an external API call
    temperatures = {"São Paulo": 22, "Rio de Janeiro": 30, "Brasília": 25}
    temp = temperatures.get(city, 20)
    if unit == "fahrenheit":
        temp = temp * 9/5 + 32
    return json.dumps({"city": city, "temperature": temp, "unit": unit})

def add_task(task: str, priority: str):
    # Simulates a database insertion
    return json.dumps({"status": "success", "task": task, "priority": priority})
Now, the main conversation loop:
messages = [{"role": "user", "content": "What's the temperature in Rio de Janeiro? Also add 'buy milk' as a high-priority task."}]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    tools=tools,
    tool_choice="auto"
)

assistant_message = response.choices[0].message
tool_calls = assistant_message.tool_calls

if tool_calls:
    # The assistant message that carries tool_calls must come before the tool results
    messages.append(assistant_message)

    for tool_call in tool_calls:
        function_name = tool_call.function.name
        arguments = json.loads(tool_call.function.arguments)

        if function_name == "get_weather":
            result = get_weather(**arguments)
        elif function_name == "add_task":
            result = add_task(**arguments)
        else:
            result = json.dumps({"error": f"Unknown function: {function_name}"})

        # One tool message per call, matched to the request by tool_call_id
        messages.append({
            "tool_call_id": tool_call.id,
            "role": "tool",
            "name": function_name,
            "content": result
        })

    final_response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages
    )
    print(final_response.choices[0].message.content)
else:
    # No tool was needed; the model answered directly
    print(assistant_message.content)
Notice that the model can call multiple functions in parallel. It understands it needs the weather and also to add a task. This is native to the OpenAI API.
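As more functions are added, the if/elif chain in the loop grows with them. One possible alternative is a registry that maps tool names to callables. The sketch below is an illustration, not part of the OpenAI SDK: `TOOL_REGISTRY` and `dispatch` are hypothetical names, and the two functions are trimmed stubs of the ones defined earlier so the block runs on its own.

```python
import json

def get_weather(city: str, unit: str = "celsius") -> str:
    # Stub standing in for the tutorial's simulated weather lookup.
    return json.dumps({"city": city, "temperature": 22, "unit": unit})

def add_task(task: str, priority: str) -> str:
    # Stub standing in for the tutorial's simulated database insertion.
    return json.dumps({"status": "success", "task": task, "priority": priority})

# Maps the tool name returned by the model to the Python callable behind it.
TOOL_REGISTRY = {"get_weather": get_weather, "add_task": add_task}

def dispatch(function_name: str, raw_arguments: str) -> str:
    """Runs one tool call and always returns a JSON string for the model."""
    handler = TOOL_REGISTRY.get(function_name)
    if handler is None:
        return json.dumps({"error": f"Unknown function: {function_name}"})
    try:
        arguments = json.loads(raw_arguments)
        return handler(**arguments)
    except (json.JSONDecodeError, TypeError) as exc:
        # Malformed JSON, or arguments that do not match the function signature.
        return json.dumps({"error": f"Invalid arguments: {exc}"})

print(dispatch("add_task", '{"task": "buy milk", "priority": "high"}'))
print(dispatch("send_email", "{}"))
```

Inside the loop, the branch on `function_name` would then reduce to `result = dispatch(tool_call.function.name, tool_call.function.arguments)`. Returning errors as JSON rather than raising lets the model see what went wrong and retry or apologize.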
Implementation with Anthropic Claude
Anthropic calls it tool use. The API is similar, but has important differences. Let's get to the code.
pip install anthropic
from anthropic import Anthropic
client = Anthropic(api_key="your-key")
tools = [
    {
        "name": "get_weather",
        "description": "Gets the current temperature for a city",
        "input_schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
            },
            "required": ["city"]
        }
    },
    {
        "name": "add_task",
        "description": "Adds a task to the list",
        "input_schema": {
            "type": "object",
            "properties": {
                "task": {"type": "string"},
                "priority": {"type": "string", "enum": ["high", "medium", "low"]}
            },
            "required": ["task", "priority"]
        }
    }
]
user_message = {"role": "user", "content": "What's the temperature in Brasília?"}

message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[user_message],
    tools=tools
)

# Claude returns stop_reason="tool_use" when it wants to call a function
if message.stop_reason == "tool_use":
    tool_results = []
    for content_block in message.content:
        if content_block.type == "tool_use":
            tool_name = content_block.name
            tool_input = content_block.input

            if tool_name == "get_weather":
                result = get_weather(**tool_input)
            elif tool_name == "add_task":
                result = add_task(**tool_input)
            else:
                result = json.dumps({"error": f"Unknown tool: {tool_name}"})

            tool_results.append({
                "type": "tool_result",
                "tool_use_id": content_block.id,
                "content": result
            })

    # Sends all results back in a single user message
    response = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1024,  # required on every request
        messages=[
            user_message,
            {"role": "assistant", "content": message.content},
            {"role": "user", "content": tool_results}
        ],
        tools=tools
    )
    print(response.content[0].text)
The main difference: Claude uses stop_reason and structured content blocks. Claude can also request several tools in one turn. Each request arrives as its own tool_use block, and every tool_result must go back in the same user message, matched by tool_use_id.
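If a response contains more than one tool_use block, one way to answer them all is to build the full list of tool_result blocks before sending the follow-up request. The helper below is a sketch under assumptions: `build_tool_results` is a hypothetical name, and the `SimpleNamespace` objects only imitate the attributes (`type`, `id`, `name`, `input`) that the SDK's content blocks expose, so the example runs without an API key.

```python
import json
from types import SimpleNamespace

def build_tool_results(content_blocks, handlers):
    """Returns one tool_result dict per tool_use block, in the original order."""
    results = []
    for block in content_blocks:
        if block.type != "tool_use":
            continue  # text blocks carry no tool request
        handler = handlers.get(block.name)
        if handler is None:
            # is_error tells the model that this call failed.
            results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": json.dumps({"error": f"Unknown tool: {block.name}"}),
                "is_error": True,
            })
            continue
        results.append({
            "type": "tool_result",
            "tool_use_id": block.id,
            "content": handler(**block.input),
        })
    return results

# Stand-in blocks imitating a response that asks for two tools at once.
blocks = [
    SimpleNamespace(type="text", text="Let me check both."),
    SimpleNamespace(type="tool_use", id="toolu_1", name="get_weather",
                    input={"city": "Brasília"}),
    SimpleNamespace(type="tool_use", id="toolu_2", name="add_task",
                    input={"task": "buy milk", "priority": "high"}),
]
handlers = {
    "get_weather": lambda city, unit="celsius": json.dumps({"city": city, "temperature": 25}),
    "add_task": lambda task, priority: json.dumps({"status": "success", "task": task}),
}
tool_results = build_tool_results(blocks, handlers)
print([r["tool_use_id"] for r in tool_results])
```

The returned list is what goes into the `content` field of the follow-up user message.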
Implementation with Google Gemini
Gemini also offers native function calling. The syntax is a bit different, but the concept is the same.
pip install google-generativeai
import google.generativeai as genai
genai.configure(api_key="your-key")
model = genai.GenerativeModel("gemini-2.0-flash")
# Defines functions as a Python dictionary
get_weather_tool = {
    "function_declarations": [
        {
            "name": "get_weather",
            "description": "Gets the current temperature for a city",
            "parameters": {
                "type_": "OBJECT",
                "properties": {
                    "city": {"type_": "STRING"},
                    "unit": {"type_": "STRING", "enum": ["celsius", "fahrenheit"]}
                },
                "required": ["city"]
            }
        },
        {
            "name": "add_task",
            "description": "Adds a task to the list",
            "parameters": {
                "type_": "OBJECT",
                "properties": {
                    "task": {"type_": "STRING"},
                    "priority": {"type_": "STRING", "enum": ["high", "medium", "low"]}
                },
                "required": ["task", "priority"]
            }
        }
    ]
}

chat = model.start_chat()

response = chat.send_message(
    "What's the temperature in São Paulo?",
    tools=[get_weather_tool]
)

# Gemini returns function_call in the response
first_part = response.candidates[0].content.parts[0]
if first_part.function_call:
    function_call = first_part.function_call
    tool_name = function_call.name
    args = {key: value for key, value in function_call.args.items()}

    if tool_name == "get_weather":
        result = get_weather(**args)
    elif tool_name == "add_task":
        result = add_task(**args)
    else:
        result = json.dumps({"error": f"Unknown tool: {tool_name}"})

    # Sends the result
    response = chat.send_message(
        genai.protos.Content(
            parts=[
                genai.protos.Part(
                    function_response=genai.protos.FunctionResponse(
                        name=tool_name,
                        response={"result": result}
                    )
                )
            ]
        )
    )

print(response.text)
Gemini uses function_declarations and FunctionResponse. It also supports parallel calls, but the implementation is more verbose than OpenAI's.
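The example above inspects only the first part of the response. When the model requests several functions at once, each request sits in its own part, so one way to handle that is to scan all parts. The sketch below assumes each part exposes a `function_call` attribute with `name` and `args`, as in the code above; `extract_function_calls` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins so the block runs without the SDK.

```python
from types import SimpleNamespace

def extract_function_calls(parts):
    """Returns (name, args) pairs for every part that carries a function call."""
    calls = []
    for part in parts:
        function_call = getattr(part, "function_call", None)
        # Parts that hold plain text have no function call, or an empty one.
        if function_call is None or not function_call.name:
            continue
        calls.append((function_call.name, dict(function_call.args)))
    return calls

# Stand-in parts imitating a response with two calls and one text part.
parts = [
    SimpleNamespace(function_call=SimpleNamespace(
        name="get_weather", args={"city": "São Paulo"})),
    SimpleNamespace(function_call=None, text="Checking both for you."),
    SimpleNamespace(function_call=SimpleNamespace(
        name="add_task", args={"task": "buy milk", "priority": "high"})),
]

for name, args in extract_function_calls(parts):
    print(name, args)
```

With a real response, `parts` would be `response.candidates[0].content.parts`, and each pair would produce one FunctionResponse part in the follow-up message.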
Comparison between the three platforms
| Feature | OpenAI | Anthropic Claude | Google Gemini |
|---|---|---|---|
| Parallel calls | Yes (native) | Yes (multiple tool_use blocks) | Yes (but verbose) |
| Schema format | JSON Schema | JSON Schema | OpenAPI-style schema (Python dictionary) |
| Call identification | tool_calls | stop_reason: tool_use | function_call |
| Result return | role: tool | tool_result | FunctionResponse |
| Recommended model | gpt-4o | claude-3-5-sonnet | gemini-2.0-flash |
| Cost per call | ~US$0.005 | ~US$0.003 | ~US$0.002 |
OpenAI has an advantage in maturity and in concise parallel-call handling. By the table's estimates, Gemini has the lowest cost per call, with Anthropic next. Gemini also stands out for its integration with the Google ecosystem.
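Because the three schema formats differ mostly in their wrapping, a single neutral definition can be converted to each one. The following is a sketch, not an official utility: `to_openai`, `to_anthropic`, and `to_gemini` are hypothetical helpers, and the Gemini conversion simply mirrors the dictionary shape used in this tutorial (`type_` keys with uppercase values), which may need adjusting for other SDK versions.

```python
NEUTRAL_TOOL = {
    "name": "get_weather",
    "description": "Gets the current temperature for a city",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string"},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["city"],
    },
}

def to_openai(tool: dict) -> dict:
    # OpenAI wraps the definition in {"type": "function", "function": {...}}.
    return {"type": "function", "function": dict(tool)}

def to_anthropic(tool: dict) -> dict:
    # Anthropic uses the same JSON Schema under the key "input_schema".
    return {
        "name": tool["name"],
        "description": tool["description"],
        "input_schema": tool["parameters"],
    }

def _to_gemini_schema(schema: dict) -> dict:
    # Mirrors this tutorial's Gemini dictionaries: "type" -> "type_", uppercased.
    converted = {}
    for key, value in schema.items():
        if key == "type":
            converted["type_"] = value.upper()
        elif key == "properties":
            converted["properties"] = {
                name: _to_gemini_schema(sub) for name, sub in value.items()
            }
        else:
            converted[key] = value
    return converted

def to_gemini(tools: list) -> dict:
    # Gemini groups all declarations under "function_declarations".
    return {
        "function_declarations": [
            {
                "name": tool["name"],
                "description": tool["description"],
                "parameters": _to_gemini_schema(tool["parameters"]),
            }
            for tool in tools
        ]
    }

print(to_anthropic(NEUTRAL_TOOL)["input_schema"]["required"])
```

Keeping one source of truth for tool definitions avoids the three copies drifting apart as functions are added.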
Best practices for production
Function calling in production requires extra care. Here are three essential rules.
Validate parameters before executing. The model can hallucinate values. Always check that parameters are within the expected range.
def safe_get_weather(city, unit="celsius"):
    if unit not in ["celsius", "fahrenheit"]:
        unit = "celsius"  # safe fallback
    # continues...
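The same idea can be generalized by checking the model's arguments against the schema you already declared. This is a minimal sketch: `validate_arguments` is a hypothetical helper that covers only required fields, unexpected fields, basic types, and enums. A full JSON Schema validator would cover far more.

```python
WEATHER_SCHEMA = {
    "type": "object",
    "properties": {
        "city": {"type": "string"},
        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
    },
    "required": ["city"],
}

# Maps the JSON Schema type names handled by this sketch to Python types.
_JSON_TYPES = {"string": str, "number": (int, float), "integer": int, "boolean": bool}

def validate_arguments(arguments: dict, schema: dict) -> list:
    """Returns a list of human-readable problems; an empty list means valid."""
    errors = []
    properties = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in arguments:
            errors.append(f"missing required parameter: {name}")
    for name, value in arguments.items():
        spec = properties.get(name)
        if spec is None:
            errors.append(f"unexpected parameter: {name}")
            continue
        expected = _JSON_TYPES.get(spec.get("type"))
        if expected and not isinstance(value, expected):
            errors.append(f"{name} should be of type {spec['type']}")
        elif "enum" in spec and value not in spec["enum"]:
            errors.append(f"{name} must be one of {spec['enum']}")
    return errors

print(validate_arguments({"city": "Recife", "unit": "celsius"}, WEATHER_SCHEMA))
print(validate_arguments({"unit": "kelvin"}, WEATHER_SCHEMA))
```

When the list is not empty, returning it to the model as the tool result usually gives it enough information to correct the call on the next turn.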
Define clear and detailed descriptions. The better the function and parameter descriptions, the higher the chance the model will call the correct function. Use examples.
"description": "Gets the temperature for a city. Ex: 'São Paulo' returns 22°C."
Implement timeouts and fallbacks. External APIs can fail. If the function call takes longer than 5 seconds, return a friendly error to the model.
import asyncio

async def call_with_timeout(awaitable, timeout=5):
    # awaitable must be a coroutine object or task, not a plain function
    try:
        return await asyncio.wait_for(awaitable, timeout=timeout)
    except asyncio.TimeoutError:
        return json.dumps({"error": "Service currently unavailable"})
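The tool functions in this tutorial are synchronous, so they cannot be awaited directly. One possible approach, sketched here under that assumption, is to run them in a worker thread with `asyncio.to_thread` (Python 3.9+) and put the timeout around the thread. `run_tool_with_timeout` and `slow_lookup` are hypothetical names used only for illustration.

```python
import asyncio
import json
import time

def slow_lookup(city: str) -> str:
    # Stand-in for a blocking external API call.
    time.sleep(0.3)
    return json.dumps({"city": city, "temperature": 22})

async def run_tool_with_timeout(func, *args, timeout: float = 5.0) -> str:
    """Runs a blocking tool in a thread and gives up after `timeout` seconds."""
    try:
        return await asyncio.wait_for(asyncio.to_thread(func, *args), timeout=timeout)
    except asyncio.TimeoutError:
        # A friendly, structured error that the model can relay to the user.
        return json.dumps({"error": "Service currently unavailable"})

fast = asyncio.run(run_tool_with_timeout(slow_lookup, "Recife", timeout=2.0))
slow = asyncio.run(run_tool_with_timeout(slow_lookup, "Recife", timeout=0.05))
print(fast)
print(slow)
```

Note that the timeout only stops waiting. The worker thread keeps running until the blocking call returns, so the underlying HTTP client should still have its own timeout.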
The future is agentic
Function calling is the foundation for autonomous agents. In 2026, chatbots that don't execute actions are doomed to be replaced. The difference between a useful assistant and a tech toy lies in the ability to integrate systems.
Start small: a weather function, a database function. Then add email sending, CRM queries, script execution. The pattern is the same. The complexity comes from orchestration.
The code from this tutorial is available in a public repository. Use it as a base for your next chatbot. And remember: the LLM is the brain, but the functions are the muscles.