In our previous tutorial, we built an AI agent capable of answering queries by searching the web. However, when building agents for longer-running tasks, two important concepts come into play: persistence and streaming. Persistence lets you save the state of an agent at any given point, so you can resume from that state in future interactions. This is essential for long-running applications. Streaming, on the other hand, lets you emit real-time signals about what the agent is doing at any moment, providing transparency and control over its actions. In this tutorial, we'll enhance our agent by adding both features.
Setting Up the Agent
Let's start by recreating our agent. We'll set the required environment variables, install and import the required libraries, set up the Tavily search tool, define the agent state, and finally, build the agent.
import os

os.environ['TAVILY_API_KEY'] = "<TAVILY_API_KEY>"
os.environ['GROQ_API_KEY'] = "<GROQ_API_KEY>"

from langgraph.graph import StateGraph, END
from typing import TypedDict, Annotated
import operator
from langchain_core.messages import AnyMessage, SystemMessage, HumanMessage, ToolMessage
from langchain_groq import ChatGroq
from langchain_community.tools.tavily_search import TavilySearchResults

tool = TavilySearchResults(max_results=2)

class AgentState(TypedDict):
    # operator.add makes each node's returned messages append to the list
    messages: Annotated[list[AnyMessage], operator.add]

class Agent:
    def __init__(self, model, tools, system=""):
        self.system = system
        graph = StateGraph(AgentState)
        graph.add_node("llm", self.call_openai)
        graph.add_node("action", self.take_action)
        # Route to the tool node if the model requested a tool, otherwise finish
        graph.add_conditional_edges("llm", self.exists_action, {True: "action", False: END})
        graph.add_edge("action", "llm")
        graph.set_entry_point("llm")
        self.graph = graph.compile()
        self.tools = {t.name: t for t in tools}
        self.model = model.bind_tools(tools)

    def call_openai(self, state: AgentState):
        messages = state['messages']
        if self.system:
            messages = [SystemMessage(content=self.system)] + messages
        message = self.model.invoke(messages)
        return {'messages': [message]}

    def exists_action(self, state: AgentState):
        result = state['messages'][-1]
        return len(result.tool_calls) > 0

    def take_action(self, state: AgentState):
        tool_calls = state['messages'][-1].tool_calls
        results = []
        for t in tool_calls:
            print(f"Calling: {t}")
            result = self.tools[t['name']].invoke(t['args'])
            results.append(ToolMessage(tool_call_id=t['id'], name=t['name'], content=str(result)))
        print("Back to the model!")
        return {'messages': results}
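The `Annotated[list[AnyMessage], operator.add]` annotation is what makes each node's return value extend the message list instead of replacing it. The following is a minimal sketch of that merging idea using only the standard library. The `apply_update` helper is hypothetical and written for illustration; it is not LangGraph's implementation, and plain strings stand in for message objects.

```python
import operator
from typing import Annotated, TypedDict, get_type_hints


class AgentState(TypedDict):
    # Stand-in for the tutorial's state; strings replace message objects.
    messages: Annotated[list[str], operator.add]


def apply_update(state: dict, update: dict) -> dict:
    """Hypothetical helper: merge a node's return value into the state."""
    hints = get_type_hints(AgentState, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        # The first piece of Annotated metadata is treated as the reducer.
        reducer = getattr(hints[key], "__metadata__", (None,))[0]
        if reducer is not None and key in merged:
            merged[key] = reducer(merged[key], value)
        else:
            merged[key] = value
    return merged


state = {"messages": ["user: What is the weather in Texas?"]}
state = apply_update(state, {"messages": ["ai: calling the search tool"]})
state = apply_update(state, {"messages": ["tool: search results"]})
print(state["messages"])
```

Because the reducer is `operator.add`, each update is concatenated onto the existing list, which is why the nodes above return only the new messages rather than the full history.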
Adding Persistence
To add persistence, we'll use LangGraph's checkpointer feature. A checkpointer saves the agent's state after and between every node. For this tutorial, we'll use SqliteSaver, a simple checkpointer built on SQLite, a lightweight database that Python supports out of the box through the sqlite3 module. We'll store checkpoints in a local SQLite file for simplicity, but you can just as easily use an in-memory database, connect to an external database, or use other checkpointers such as Redis or Postgres for more robust persistence.
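import sqlite3
from langgraph.checkpoint.sqlite import SqliteSaver

# check_same_thread=False lets the connection be used from more than one thread
sqlite_conn = sqlite3.connect("checkpoints.sqlite", check_same_thread=False)
memory = SqliteSaver(sqlite_conn)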
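To make the idea of a checkpointer more concrete, here is a minimal standard-library sketch of saving state to SQLite and resuming from it with a fresh connection. The table schema and the `open_store`, `save_checkpoint`, and `load_checkpoint` helpers are assumptions made for illustration; SqliteSaver's actual storage format is different.

```python
import json
import os
import sqlite3
import tempfile


def open_store(path):
    """Open the database and make sure the (hypothetical) table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS checkpoints "
        "(thread_id TEXT PRIMARY KEY, state TEXT)"
    )
    return conn


def save_checkpoint(conn, thread_id, state):
    """Store the latest state for a thread as JSON."""
    conn.execute(
        "INSERT OR REPLACE INTO checkpoints (thread_id, state) VALUES (?, ?)",
        (thread_id, json.dumps(state)),
    )
    conn.commit()


def load_checkpoint(conn, thread_id):
    """Return the saved state for a thread, or an empty state if none exists."""
    row = conn.execute(
        "SELECT state FROM checkpoints WHERE thread_id = ?", (thread_id,)
    ).fetchone()
    return json.loads(row[0]) if row else {"messages": []}


db_path = os.path.join(tempfile.mkdtemp(), "demo_checkpoints.sqlite")

# First session: save some state, then close the connection.
conn = open_store(db_path)
save_checkpoint(conn, "1", {"messages": ["What is the weather in Texas?"]})
conn.close()

# Second session: a fresh connection resumes from the saved state.
conn = open_store(db_path)
resumed = load_checkpoint(conn, "1")
conn.close()
print(resumed)
```

The point of the sketch is only that state written in one session can be read back in another; a real checkpointer also records intermediate steps and metadata.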
Next, we'll modify our agent to accept a checkpointer:
class Agent:
    def __init__(self, model, tools, checkpointer, system=""):
        # Everything else stays the same as before
        self.graph = graph.compile(checkpointer=checkpointer)
        # Everything else after this stays the same
Now, we can create our agent with persistence enabled:
prompt = """You are allowed to make multiple calls (either together or in sequence).
Only look up information when you are sure of what you want.
If you need to look up some information before asking a follow-up question, you are allowed to do that!
"""
model = ChatGroq(model="Llama-3.3-70b-Specdec")
bot = Agent(model, [tool], system=prompt, checkpointer=memory)
Adding Streaming
Streaming is essential for real-time updates. There are two types of streaming we'll focus on:
1. Streaming Messages: Emitting intermediate messages such as AI decisions and tool results.
2. Streaming Tokens: Streaming individual tokens from the LLM's response.
Let's start by streaming messages. We'll create a human message and use the stream method to observe the agent's actions in real time.
messages = [HumanMessage(content="What is the weather in Texas?")]
thread = {"configurable": {"thread_id": "1"}}
for event in bot.graph.stream({"messages": messages}, thread):
    for v in event.values():
        print(v['messages'])
Final output: The current weather in Texas is sunny with a temperature of 19.4°C (66.9°F) and a wind speed of 4.3 mph (6.8 kph)...
When you run this, you'll see a stream of results: first, an AI message asking to call Tavily, followed by a tool message with the search results, and finally, an AI message answering the question.
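The nested loop over `event.values()` makes more sense once you picture each streamed event as a small dictionary keyed by the node that just ran. Here is a minimal sketch of that pattern with a plain Python generator. The `fake_llm`, `fake_action`, and `stream` functions are stand-ins invented for illustration; they only imitate the shape of the updates consumed above and do not call LangGraph or a model.

```python
from typing import Iterator


def fake_llm(messages: list[str]) -> dict:
    """Stand-in for the "llm" node: request a tool first, then answer."""
    if not any(m.startswith("tool:") for m in messages):
        return {"messages": ["ai: call search tool"]}
    return {"messages": ["ai: final answer"]}


def fake_action(messages: list[str]) -> dict:
    """Stand-in for the "action" node: pretend to run the search tool."""
    return {"messages": ["tool: search results"]}


def stream(messages: list[str]) -> Iterator[dict]:
    """Yield one {node_name: update} dict as soon as each node finishes."""
    history = list(messages)
    while True:
        update = fake_llm(history)
        history += update["messages"]
        yield {"llm": update}
        if history[-1] == "ai: final answer":
            break
        update = fake_action(history)
        history += update["messages"]
        yield {"action": update}


# Collected into a list here only so the result can be inspected afterwards;
# normally you would iterate over stream(...) directly.
events = list(stream(["user: What is the weather in Texas?"]))
for event in events:
    for v in event.values():
        print(v["messages"])
```

Because the generator yields after every node, the caller sees the tool request, the tool result, and the final answer as separate steps rather than waiting for the whole run to finish.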
Understanding Thread IDs
The thread_id is a crucial part of the thread configuration. It allows the agent to maintain separate conversations with different users or contexts. By assigning a unique thread_id to each conversation, the agent can keep track of multiple interactions at the same time without mixing them up.
For example, let's continue the conversation by asking, "What about in LA?" using the same thread_id:
messages = [HumanMessage(content="What about in LA?")]
thread = {"configurable": {"thread_id": "1"}}
for event in bot.graph.stream({"messages": messages}, thread):
    for v in event.values():
        print(v)
Final output: The current weather in Los Angeles is sunny with a temperature of 17.2°C (63.0°F) and a wind speed of 2.2 mph (3.6 kph)...
The agent infers that we're asking about the weather, thanks to persistence. To verify, let's ask, "Which one is hotter?":
messages = [HumanMessage(content="Which one is hotter?")]
thread = {"configurable": {"thread_id": "1"}}
for event in bot.graph.stream({"messages": messages}, thread):
    for v in event.values():
        print(v)
Final output: Texas is hotter than Los Angeles. The current temperature in Texas is 19.4°C (66.9°F), while the current temperature in Los Angeles is 17.2°C (63.0°F).
The agent correctly compares the weather in Texas and LA. To test whether persistence keeps conversations separate, let's ask the same question with a different thread_id:
messages = [HumanMessage(content="Which one is hotter?")]
thread = {"configurable": {"thread_id": "2"}}
for event in bot.graph.stream({"messages": messages}, thread):
    for v in event.values():
        print(v)
Output: I need more information to answer that question. Can you please provide more context or specify which two things you're comparing?
This time, the agent gets confused because it doesn't have access to the previous conversation's history.
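The behaviour above comes down to each thread_id having its own history. As a rough illustration, here is a small in-memory sketch keyed by the same config dictionary shape used in this tutorial. The `ThreadStore` class is hypothetical and exists only to show the isolation; it is not how LangGraph stores checkpoints.

```python
from collections import defaultdict


class ThreadStore:
    """Hypothetical in-memory store: one message history per thread_id."""

    def __init__(self):
        self._histories = defaultdict(list)

    def append(self, config: dict, message: str) -> list[str]:
        """Add a message to the thread named in config and return its history."""
        thread_id = config["configurable"]["thread_id"]
        self._histories[thread_id].append(message)
        # Return a copy so callers cannot mutate the stored history.
        return list(self._histories[thread_id])


store = ThreadStore()
thread_1 = {"configurable": {"thread_id": "1"}}
thread_2 = {"configurable": {"thread_id": "2"}}

store.append(thread_1, "What is the weather in Texas?")
store.append(thread_1, "What about in LA?")
seen_in_1 = store.append(thread_1, "Which one is hotter?")
seen_in_2 = store.append(thread_2, "Which one is hotter?")

print(len(seen_in_1))  # thread 1 carries the earlier questions as context
print(len(seen_in_2))  # thread 2 only has the question it was just asked
```

With the same question, thread 1 has the two earlier weather questions available as context while thread 2 has nothing to compare, which mirrors why the agent asks for clarification.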
Streaming Tokens
To stream tokens, we'll use the astream_events method, which is asynchronous. We'll also switch to an async checkpointer.
from langgraph.checkpoint.sqlite.aio import AsyncSqliteSaver

# Top-level `async with` works in a notebook; in a script, wrap this in an async function.
async with AsyncSqliteSaver.from_conn_string(":memory:") as checkpointer:
    abot = Agent(model, [tool], system=prompt, checkpointer=checkpointer)
    messages = [HumanMessage(content="What is the weather in SF?")]
    thread = {"configurable": {"thread_id": "4"}}
    async for event in abot.graph.astream_events({"messages": messages}, thread, version="v1"):
        kind = event["event"]
        if kind == "on_chat_model_stream":
            content = event["data"]["chunk"].content
            if content:
                # Empty content in the context of OpenAI means
                # that the model is asking for a tool to be invoked.
                # So we only print non-empty content
                print(content, end="|")
This will stream tokens in real time, giving you a live view of the agent's response as it is generated.
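If you want to experiment with the filtering logic without calling a model, the pattern can be sketched with a plain async generator. In the sketch below, `fake_astream_events` is a hypothetical stand-in whose event shape is modelled on the loop above; only the `on_chat_model_stream` filtering and the empty-content check are taken from the tutorial code. It also shows `asyncio.run`, which is one way to drive async code from a regular script.

```python
import asyncio
from types import SimpleNamespace


async def fake_astream_events(tokens):
    """Stand-in event source that imitates the event dictionaries used above."""
    yield {"event": "on_chain_start", "data": {}}
    for token in tokens:
        await asyncio.sleep(0)  # hand control back to the event loop
        yield {
            "event": "on_chat_model_stream",
            "data": {"chunk": SimpleNamespace(content=token)},
        }
    yield {"event": "on_chain_end", "data": {}}


async def collect_tokens(tokens):
    """Print and keep only non-empty token chunks, skipping other events."""
    received = []
    async for event in fake_astream_events(tokens):
        if event["event"] != "on_chat_model_stream":
            continue
        content = event["data"]["chunk"].content
        if content:
            received.append(content)
            print(content, end="|")
    print()
    return received


# In a script, asyncio.run drives the coroutine; a notebook can await it directly.
received = asyncio.run(collect_tokens(["", "The", " weather", " in", " SF", ""]))
```

The empty strings at the start and end of the fake token list play the role of the empty chunks mentioned in the code comment above, and they are dropped by the `if content:` check.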
Conclusion
By adding persistence and streaming, we've significantly enhanced our AI agent's capabilities. Persistence allows the agent to maintain context across interactions, while streaming provides real-time insight into its actions. These features are essential for building production-ready applications, especially those involving multiple users or human-in-the-loop interactions.
In the next tutorial, we'll dive into human-in-the-loop interactions, where persistence plays a crucial role in enabling seamless collaboration between humans and AI agents. Stay tuned!
References:
(DeepLearning.ai)
Vineet Kumar is a consulting intern at MarktechPost. He is currently pursuing his BS from the Indian Institute of Technology (IIT), Kanpur. He is a Machine Learning enthusiast. He is passionate about research and the latest developments in Deep Learning, Computer Vision, and related fields.