Streaming agent output

In some applications, especially interactive ones with a human on the other end, streaming output can greatly improve responsiveness. LLMs can take a while to generate a response, and it is a much better experience to see the output appear in real time than to wait 30 seconds for it to be fully generated and displayed.

We’ll demonstrate how this can be achieved in motleycrew using Langchain callbacks. In the first example, we’ll stream agent output to the console while it solves a task inside a crew. The second example will show a standalone agent that asynchronously streams its output to a web client using WebSockets.

Please note that streaming in this fashion is currently only possible with Langchain-based agents. Please file a GitHub issue if you’d like to see other frameworks supported.
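The mechanism behind both examples is the same: the model invokes a callback for every token it emits, and the handler decides where to send it. Here is a minimal framework-agnostic sketch of that pattern; the `FakeStreamingLLM` class and its canned tokens are invented for illustration and are not part of Langchain or motleycrew.

```python
# Minimal sketch of token-callback streaming, independent of any framework.
# FakeStreamingLLM and its canned tokens are invented for illustration.

class TokenPrinter:
    """Handler that receives tokens one at a time, as they are generated."""
    def __init__(self):
        self.received = []

    def on_new_token(self, token: str) -> None:
        self.received.append(token)
        print(token, end="", flush=True)


class FakeStreamingLLM:
    """Stands in for an LLM: notifies its handlers of each token immediately."""
    def __init__(self, handlers):
        self.handlers = handlers

    def generate(self, prompt: str) -> str:
        tokens = ["Stream", "ing ", "works", "!"]
        for token in tokens:
            for handler in self.handlers:
                # Fires before the full response is assembled,
                # which is what makes real-time display possible.
                handler.on_new_token(token)
        return "".join(tokens)


printer = TokenPrinter()
llm = FakeStreamingLLM(handlers=[printer])
result = llm.generate("any prompt")
```

Langchain's callback system follows the same idea: `on_llm_new_token` is called with each token as the LLM produces it.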

[1]:
from langchain.tools import DuckDuckGoSearchRun
from langchain.callbacks.base import BaseCallbackHandler

from motleycrew import MotleyCrew
from motleycrew.tasks import SimpleTask
from motleycrew.agents.langchain import ReActToolCallingMotleyAgent

For observing the output in real time, we’ll use a callback that prints the output tokens as they come in from the LLM.

[2]:
class CommandLineCallbackHandler(BaseCallbackHandler):
    def on_llm_new_token(self, token: str, **kwargs) -> None:
        print(token, end="", flush=True)

    def on_llm_end(self, response, **kwargs) -> None:
        print()
[3]:
agent = ReActToolCallingMotleyAgent(
    tools=[DuckDuckGoSearchRun()],
    runnable_config={"callbacks": [CommandLineCallbackHandler()]},
)
[4]:
crew = MotleyCrew()
task = SimpleTask(
    crew=crew,
    agent=agent,
    description="Conduct a brief report about the latest AI advancements in 2024.",
)
crew.run()
Thought: To provide a brief report on the latest AI advancements in 2024, I need to gather information on recent developments, breakthroughs, and trends in AI for this year. I will use DuckDuckGo Search to find relevant articles and reports.

I will perform a search query to gather the latest information on AI advancements in 2024.


Thought: The search results provide some insights into the latest AI advancements in 2024, including trends and technologies. I will summarize the key points from the information gathered to create a brief report.

Final Answer:

### Brief Report on the Latest AI Advancements in 2024

1. **Generative AI for Everyday Use**:
   - Generative AI is becoming more accessible and useful for non-technical users. This year, there is a significant increase in people experimenting with various small AI models.

2. **Top AI Trends**:
   - **Multimodal AI**: This technology goes beyond traditional single-mode data processing, integrating multiple types of data (e.g., text, images, audio) to create more sophisticated AI models.
   - **Ethics and Safety**: There is a growing emphasis on ethical AI development and deployment, with a focus on safety and compliance with evolving regulatory standards.

3. **Breakthrough Technologies**:
   - According to MIT Technology Review, AI continues to be a major area of innovation, with several technologies identified as having the potential to significantly impact our lives.

4. **Industry Leaders**:
   - Companies like Google, Meta, Microsoft, and OpenAI are at the forefront of AI advancements, continuously pushing the boundaries of what AI can achieve.

5. **Integration into Daily Life**:
   - Since the release of OpenAI's ChatGPT, AI has been increasingly integrated into daily activities, making it a seamless part of everyday life.

These advancements reflect a deepening sophistication in AI technologies and a cautious approach to their development and deployment, ensuring they are both innovative and responsible.
[4]:
[TaskUnit(status=done)]

Streaming agent output to a web client using WebSockets

In real applications, you'll more likely want to connect an agent to a client via some protocol. WebSockets are a common choice for tasks like this because they provide robust, low-latency, bidirectional communication between the client and the server.

We’ll create a simple web app with FastAPI that allows you to send messages to the agent and see the output in real time.

[5]:
from fastapi import FastAPI, WebSocket
from fastapi.responses import HTMLResponse
import uvicorn
[6]:
class WebSocketCallbackHandler(BaseCallbackHandler):
    def __init__(self, websocket):
        self.websocket = websocket

    async def on_llm_new_token(self, token: str, **kwargs) -> None:
        await self.websocket.send_json({"type": "token", "content": token})
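Because the client in this demo clears the response box whenever a new message is sent, it can also be useful to tell the client when a response has finished. A possible extension is a handler that additionally sends an end-of-response message from `on_llm_end`; note that the `"end"` message type and the `FakeWebSocket` test double below are assumptions for this sketch, not part of the demo above (in the real app the handler would subclass Langchain's `BaseCallbackHandler`).

```python
import asyncio
import json

# Sketch: a callback handler that signals completion to the client.
# The "end" message type and FakeWebSocket are illustrative assumptions;
# in the real app this class would subclass BaseCallbackHandler.

class FakeWebSocket:
    """Test double that records what would be sent over the wire."""
    def __init__(self):
        self.sent = []

    async def send_json(self, data):
        self.sent.append(json.dumps(data))


class SignallingWebSocketCallbackHandler:
    def __init__(self, websocket):
        self.websocket = websocket

    async def on_llm_new_token(self, token: str, **kwargs) -> None:
        await self.websocket.send_json({"type": "token", "content": token})

    async def on_llm_end(self, response, **kwargs) -> None:
        # Tell the client the current response is complete.
        await self.websocket.send_json({"type": "end"})


async def demo():
    ws = FakeWebSocket()
    handler = SignallingWebSocketCallbackHandler(ws)
    for token in ["Hello", ", ", "world"]:
        await handler.on_llm_new_token(token)
    await handler.on_llm_end(response=None)
    return ws.sent

messages = asyncio.run(demo())
```

On the client side, `socket.onmessage` could then check for `data.type === "end"`, for instance to re-enable the input field.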
[9]:
app = FastAPI()

@app.get("/")
async def get():
    html_content = """
    <!DOCTYPE html>
    <html>
        <head>
            <title>motleycrew streaming demo</title>
        </head>
        <body>
            <h1>motleycrew streaming demo</h1>
            <input type="text" id="message" placeholder="Enter your message">
            <button onclick="sendMessage()">Send message</button>
            <div id="response" style="white-space: pre-wrap; word-break: break-word; width: 100%;"></div>
            <script>
                var socket = new WebSocket("ws://localhost:8000/ws");
                socket.onmessage = function(event) {
                    var data = JSON.parse(event.data);
                    if (data.type === "token") {
                        document.getElementById("response").textContent += data.content;
                    }
                };
                function sendMessage() {
                    var message = document.getElementById("message").value;
                    socket.send(message);
                    document.getElementById("response").textContent = "";
                }
            </script>
        </body>
    </html>
    """
    return HTMLResponse(content=html_content)


@app.websocket("/ws")
async def websocket_endpoint(websocket: WebSocket):
    await websocket.accept()
    while True:
        message = await websocket.receive_text()

        # Create a fresh agent for each message, wiring its callbacks
        # to this connection's WebSocket.
        agent = ReActToolCallingMotleyAgent(
            tools=[DuckDuckGoSearchRun()],
            runnable_config={"callbacks": [WebSocketCallbackHandler(websocket)]},
        )

        await websocket.send_text("Agent response:\n")
        await agent.ainvoke({"prompt": message})
[ ]:
config = uvicorn.Config(app)
server = uvicorn.Server(config)
await server.serve()

Now you can open http://localhost:8000/ in your browser and send messages to the agent. You should see the output generated in real time.