Using AutoGen with motleycrew

Microsoft AutoGen is one of the most popular multi-agent frameworks. Motleycrew supports interaction with AutoGen in both directions: first, you can wrap an AutoGen chat as a motleycrew tool and give it to any supported agent; second, you can give any motleycrew tool (including agents used as tools) to an AutoGen agent.

The two options are described in detail below.

Let’s install external dependencies for the example.

[1]:
%pip install duckduckgo-search==5.3.0b4

# install lunary if you need logging
%pip install lunary==1.1.5
[2]:
from dotenv import load_dotenv
load_dotenv()
[2]:
True
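
The cells below read `OPENAI_API_KEY` from the environment. A `True` result from `load_dotenv()` indicates that variables were loaded, not that this particular key is among them, so it can help to check for it explicitly. Here is a minimal sketch of such a check; the helper name `require_env` and the placeholder key are assumptions made for illustration and are not part of motleycrew or AutoGen.

```python
import os
from typing import Mapping


def require_env(name: str, env: Mapping[str, str] = os.environ) -> str:
    """Return the value of an environment variable or raise a readable error."""
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; add it to your .env file or export it in the shell."
        )
    return value


# Stand-in mapping so the sketch runs without a real key (hypothetical value).
fake_env = {"OPENAI_API_KEY": "sk-placeholder"}
api_key = require_env("OPENAI_API_KEY", env=fake_env)
```

In the notebook itself you would call `require_env("OPENAI_API_KEY")` without the `env` argument, so that a missing key fails early with a clear message instead of a `KeyError` when `llm_config` is built.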

Integrating an AutoGen chat into motleycrew

If you want to use an AutoGen chat (for example, one of the many examples in the AutoGen documentation, or one you already have) as part of a motleycrew setup, you can wrap it as a tool and give it to any motleycrew-supported agent.

Let’s create an AutoGen chat for automatic problem solving. The code is taken from the example here: https://microsoft.github.io/autogen/docs/notebooks/agentchat_groupchat

[3]:
%pip install pyautogen
[4]:
import autogen
import os

llm_config = {
    "config_list": [{"model": "gpt-4-turbo", "api_key": os.environ["OPENAI_API_KEY"]}],
    "cache_seed": None,
}
[5]:
user_proxy = autogen.UserProxyAgent(
    name="User_proxy",
    system_message="A human admin.",
    code_execution_config={
        "last_n_messages": 2,
        "work_dir": "examples/data/groupchat",
        "use_docker": False,
    },  # Please set use_docker=True if docker is available to run the generated code. Using docker is safer than running the generated code directly.
    human_input_mode="TERMINATE",
)
coder = autogen.AssistantAgent(
    name="Coder",
    llm_config=llm_config,
)
pm = autogen.AssistantAgent(
    name="Product_manager",
    system_message="Creative in software product ideas.",
    llm_config=llm_config,
)
groupchat = autogen.GroupChat(agents=[user_proxy, coder, pm], messages=[], max_round=12)
manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=llm_config)
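
The comment in the cell above recommends `use_docker=True` whenever Docker is available. One possible way to make that choice automatically is sketched below. The helper `build_code_execution_config` is a hypothetical name, and the check only looks for a `docker` executable on `PATH`; it does not verify that the Docker daemon is actually running, so treat it as a rough heuristic.

```python
import shutil


def build_code_execution_config(work_dir: str, docker_binary: str = "docker") -> dict:
    """Build a code_execution_config dict, enabling Docker only if its CLI is on PATH."""
    # shutil.which returns None when no executable with that name is found.
    docker_found = shutil.which(docker_binary) is not None
    return {
        "last_n_messages": 2,
        "work_dir": work_dir,
        "use_docker": docker_found,
    }


config = build_code_execution_config("examples/data/groupchat")
print(config["use_docker"])
```

The resulting dictionary has the same keys as the one passed to `UserProxyAgent` above, so it could be supplied as `code_execution_config=config`.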

If we were using plain AutoGen, we’d start the chat with something like this:

user_proxy.initiate_chat(
    manager, message="Find a latest paper about gpt-4 on arxiv and find its potential applications in software."
)
As you can see, the chat accepts an input and returns an output, just like a tool.
That lets us use the chat in motleycrew through the built-in AutoGenChatTool. Its prompt can be either a template or a plain string. Here we create a tool that searches arXiv for the latest paper on a given topic and looks for its potential applications in software.
[6]:
from motleycrew.tools.autogen_chat_tool import AutoGenChatTool

knowledge_retrieval_tool = AutoGenChatTool(
    name="retrieve_knowledge_by_topic",
    description="Search arxiv for the latest paper on a given topic "
                "and find its potential applications in software.",  # will be used in the prompt of the future agent that will use the tool
    prompt="Find a latest paper about {topic} on arxiv "
            "and find its potential applications in software.",  # this is the initial prompt for the AutoGen chat itself
    initiator=user_proxy,
    recipient=manager,
)
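
To make the difference between a template prompt and a plain-string prompt concrete, here is a small illustration that uses only Python's standard `str.format` conventions. It is a sketch of the idea, not motleycrew's actual implementation; `placeholder_names` and `render_prompt` are hypothetical helpers.

```python
from string import Formatter


def placeholder_names(prompt: str) -> list[str]:
    """Return the names of the {placeholders} found in a prompt string."""
    return [field for _, field, _, _ in Formatter().parse(prompt) if field]


def render_prompt(prompt: str, **inputs: str) -> str:
    """Fill a template prompt; return a plain-string prompt unchanged."""
    names = placeholder_names(prompt)
    missing = [name for name in names if name not in inputs]
    if missing:
        raise ValueError(f"Missing prompt inputs: {missing}")
    return prompt.format(**inputs) if names else prompt


template = (
    "Find a latest paper about {topic} on arxiv "
    "and find its potential applications in software."
)
print(placeholder_names(template))
print(render_prompt(template, topic="GPT-4 advancements"))
```

With a template, the calling agent has to supply a value for each placeholder (here, `topic`), which becomes part of the first message sent to the AutoGen chat. With a plain string, the chat always starts from the same message.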

We can now give the tool to any agent and solve tasks with it.

[7]:
from motleycrew import MotleyCrew
from motleycrew.agents.langchain import ReActToolCallingMotleyAgent

crew = MotleyCrew()
writer = ReActToolCallingMotleyAgent(tools=[knowledge_retrieval_tool])
[8]:
from motleycrew.tasks import SimpleTask

blog_post_task = SimpleTask(
    crew=crew,
    name="Produce blog post on the applications of latest advancements related to GPT-4",
    description="Using the insights provided by searching research papers, develop an engaging blog "
                "post that highlights the most significant advancements on GPT-4 and their applications.\n"
                "Your post should be informative yet accessible, catering to a tech-savvy audience.\n"
                "Make it sound cool, avoid complex words so it doesn't sound like AI. "
                "Create a blog post of at least 4 paragraphs.",
    agent=writer,
)
[9]:
crew.run()
User_proxy (to chat_manager):

Find a latest paper about GPT-4 advancements on arxiv and find its potential applications in software.

--------------------------------------------------------------------------------

Next speaker: User_proxy


>>>>>>>> USING AUTO REPLY...
User_proxy (to chat_manager):



--------------------------------------------------------------------------------

Next speaker: Coder

Coder (to chat_manager):

To fetch the latest paper about GPT-4 advancements from arXiv and explore its potential applications in software, the following python script uses the `arxiv` library that makes it easy to query the arXiv API, find the relevant papers about GPT-4, and display the title, summary, and categories of the latest paper. This will help us understand what kind of advancements have been made and the potential domains of application.

Here’s the python script that you will need to run:

```python
# filename: fetch_latest_gpt4_paper.py
import arxiv

# Search for papers on arXiv about GPT-4
search = arxiv.Search(
  query="GPT-4",
  max_results=1,
  sort_by=arxiv.SortCriterion.SubmittedDate,  # Fetches the most recent submission
  sort_order=arxiv.SortOrder.Descending
)

# Fetch and display the information for the latest paper
for result in search.results():
    print("Title:", result.title)
    print("Authors:", ", ".join(author.name for author in result.authors))
    print("Summary:", result.summary.replace('\n', ' '))
    print("Categories:", ", ".join(category for category in result.categories))
    print("Published Date:", result.published)
    print("PDF Link:", result.pdf_url)
```

To proceed:
1. Ensure you have the `arxiv` Python library installed. If not, you can install it using pip:

   ```
   pip install arxiv
   ```

2. Save the above code into a file named `fetch_latest_gpt4_paper.py`.
3. Run the script using Python:

   ```
   python fetch_latest_gpt4_paper.py
   ```

The script will display the latest GPT-4 paper it finds on arXiv. After you run this and provide me with the summary from the output, I can then help identify the potential software applications described in the paper.

--------------------------------------------------------------------------------

Next speaker: User_proxy


>>>>>>>> USING AUTO REPLY...

>>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is python)...
User_proxy (to chat_manager):

exitcode: 1 (execution failed)
Code output:
Traceback (most recent call last):
  File "fetch_latest_gpt4_paper.py", line 2, in <module>
    import arxiv
ModuleNotFoundError: No module named 'arxiv'


--------------------------------------------------------------------------------

Next speaker: User_proxy


>>>>>>>> USING AUTO REPLY...

>>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is python)...
User_proxy (to chat_manager):

exitcode: 1 (execution failed)
Code output:
Traceback (most recent call last):
  File "fetch_latest_gpt4_paper.py", line 2, in <module>
    import arxiv
ModuleNotFoundError: No module named 'arxiv'


--------------------------------------------------------------------------------

Next speaker: User_proxy


>>>>>>>> USING AUTO REPLY...
User_proxy (to chat_manager):



--------------------------------------------------------------------------------

Next speaker: User_proxy


>>>>>>>> USING AUTO REPLY...
User_proxy (to chat_manager):



--------------------------------------------------------------------------------

Next speaker: Coder

Coder (to chat_manager):

It appears that the ‘arxiv’ library required for the script has not been installed. Please install the library using the following command and then execute the script again:

```sh
pip install arxiv
```

After installing the library, rerun the script `fetch_latest_gpt4_paper.py` by issuing the following command:

```sh
python fetch_latest_gpt4_paper.py
```

This should fetch the latest GPT-4 paper from arXiv. Please provide the output afterwards so I can help identify the potential applications in software from the paper's content.

--------------------------------------------------------------------------------

Next speaker: User_proxy


>>>>>>>> USING AUTO REPLY...

>>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is sh)...

>>>>>>>> EXECUTING CODE BLOCK 1 (inferred language is sh)...
User_proxy (to chat_manager):

exitcode: 0 (execution succeeded)
Code output:
Collecting arxiv
  Downloading arxiv-2.1.3-py3-none-any.whl.metadata (6.1 kB)
Collecting feedparser~=6.0.10 (from arxiv)
  Using cached feedparser-6.0.11-py3-none-any.whl.metadata (2.4 kB)
Requirement already satisfied: requests~=2.32.0 in /home/stefan/.cache/pypoetry/virtualenvs/motleycrew-RnW5x4dY-py3.12/lib/python3.12/site-packages (from arxiv) (2.32.3)
Collecting sgmllib3k (from feedparser~=6.0.10->arxiv)
  Using cached sgmllib3k-1.0.0-py3-none-any.whl
Requirement already satisfied: charset-normalizer<4,>=2 in /home/stefan/.cache/pypoetry/virtualenvs/motleycrew-RnW5x4dY-py3.12/lib/python3.12/site-packages (from requests~=2.32.0->arxiv) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in /home/stefan/.cache/pypoetry/virtualenvs/motleycrew-RnW5x4dY-py3.12/lib/python3.12/site-packages (from requests~=2.32.0->arxiv) (3.10)
Requirement already satisfied: urllib3<3,>=1.21.1 in /home/stefan/.cache/pypoetry/virtualenvs/motleycrew-RnW5x4dY-py3.12/lib/python3.12/site-packages (from requests~=2.32.0->arxiv) (2.2.3)
Requirement already satisfied: certifi>=2017.4.17 in /home/stefan/.cache/pypoetry/virtualenvs/motleycrew-RnW5x4dY-py3.12/lib/python3.12/site-packages (from requests~=2.32.0->arxiv) (2024.8.30)
Downloading arxiv-2.1.3-py3-none-any.whl (11 kB)
Using cached feedparser-6.0.11-py3-none-any.whl (81 kB)
Installing collected packages: sgmllib3k, feedparser, arxiv
Successfully installed arxiv-2.1.3 feedparser-6.0.11 sgmllib3k-1.0.0

Title: SWE-Bench+: Enhanced Coding Benchmark for LLMs
Authors: Reem Aleithan, Haoran Xue, Mohammad Mahdi Mohajer, Elijah Nnorom, Gias Uddin, Song Wang
Summary: Large Language Models (LLMs) in Software Engineering (SE) can offer assistance for coding. To facilitate a rigorous evaluation of LLMs in practical coding contexts, Carlos et al. introduced the SWE-bench dataset, which comprises 2,294 real-world GitHub issues and their corresponding pull requests, collected from 12 widely used Python repositories. Several impressive LLM-based toolkits recently are developed and evaluated on this dataset. However, a systematic evaluation of the quality of SWE-bench remains missing. In this paper, we addressed this gap by presenting an empirical analysis of the SWE-bench dataset. We conducted a manual screening of instances where SWEAgent + GPT-4 successfully resolved issues by comparing the model-generated patches with the actual pull requests. SWE-Agent+GPT-4 was at the top of SWE-bench leaderboard during the time of our study. Our analysis reveals some critical issues with the SWE-bench dataset: 1) 32.67% of the successful patches involve cheating as the solutions were directly provided in the issue report or the comments. We refer to as solution leakage problem. 2) 31.08% of the passed patches are suspicious patches due to weak test cases, i.e., the tests were not adequate to verify the correctness of a patch. When we filtered out these problematic issues, the resolution rate of SWE-Agent+GPT-4 dropped from 12.47% to 3.97%. We also observed that the same data quality issues also exist in the two variants of SWE-bench, i.e., SWE-bench Lite and SWE-Bench Verified. In addition, over 94% of the issues were created before LLM's knowledge cutoff dates, posing potential data leakage issues.
Categories: cs.SE
Published Date: 2024-10-09 15:38:53+00:00
PDF Link: http://arxiv.org/pdf/2410.06992v1


--------------------------------------------------------------------------------

Next speaker: Coder

Coder (to chat_manager):

The latest paper retrieved from arXiv titled **"SWE-Bench+: Enhanced Coding Benchmark for LLMs"** discusses advancements related to Large Language Models (LLMs) like GPT-4 specifically in the context of software engineering. Here are some important insights regarding its potential applications in software:

1. **Automated Code Generation**: The SWE-Agent+GPT-4 demonstrated the ability to resolve issues by generating code patches based on real-world GitHub issues. This application shows promise for automated pull requests and bug fixes.

2. **Evaluation of LLMs in Real-world Scenarios**: The paper emphasizes rigorous evaluation of LLMs in practical coding contexts, implying that GPT-4 can significantly contribute to ensuring the reliability and quality of automations in software development.

3. **Quality Control in Machine-Generated Code**: The paper addresses issues like 'solution leakage' and weak test cases impacting the performance of GPT-4. The analysis in the paper could help improve the mechanisms for validating and verifying machine-generated code, making it safe and reliable for practical use.

The analysis also highlighted the importance of data quality in training these models, where data leakage and inadequate tests can mislead the effectiveness of the model. This understanding can drive the development of better training datasets and evaluation protocols for AI-driven development tools.

Overall, the potential applications of GPT-4 in software outlined by this academic work include enhancing automated coding tools, improving software development testing, and refining dataset quality for training AI models. These advancements can lead to more robust, AI-assisted programming environments, potentially reducing human error and increasing development efficiency.

This reflects a substantial forward in how AI can assist in software engineering, hinting at broader implementation and deeper integration of LLMs into the software development lifecycle.

If you find these findings intriguing and wish to delve deeper into the details of the research, you can further explore the study via the provided PDF link: [SWE-Bench+: Enhanced Coding Benchmark for LLMs PDF](http://arxiv.org/pdf/2410.06992v1)

Would there be anything specific you'd like to explore or inquire further about this topic?

--------------------------------------------------------------------------------

Next speaker: Product_manager

Product_manager (to chat_manager):

It appears there was a misunderstanding. If you have specific software product ideas in relation to GPT-4 advancements or any other requests, please let me know so I can assist you further.

--------------------------------------------------------------------------------

Next speaker: Product_manager

Product_manager (to chat_manager):

It looks like there was a bit of confusion in our last exchange. If you have any specific queries or directions regarding using GPT-4 advancements in software product ideas or any other questions, please let me know! I'm here to help with more tailored responses or to explore other areas of interest you might have.

--------------------------------------------------------------------------------
User_proxy (to chat_manager):

Find a latest paper about GPT-4 advancements on arxiv and find its potential applications in software.

--------------------------------------------------------------------------------

Next speaker: User_proxy


>>>>>>>> USING AUTO REPLY...
User_proxy (to chat_manager):



--------------------------------------------------------------------------------

Next speaker: User_proxy


>>>>>>>> USING AUTO REPLY...
User_proxy (to chat_manager):



--------------------------------------------------------------------------------

Next speaker: User_proxy


>>>>>>>> USING AUTO REPLY...
User_proxy (to chat_manager):



--------------------------------------------------------------------------------

Next speaker: User_proxy


>>>>>>>> USING AUTO REPLY...
User_proxy (to chat_manager):



--------------------------------------------------------------------------------

Next speaker: User_proxy


>>>>>>>> USING AUTO REPLY...
User_proxy (to chat_manager):



--------------------------------------------------------------------------------

Next speaker: User_proxy


>>>>>>>> USING AUTO REPLY...
User_proxy (to chat_manager):



--------------------------------------------------------------------------------

Next speaker: Coder

Coder (to chat_manager):

First, let's search for the latest papers related to GPT-4 advancements on arXiv.org using their API. To do this, we will query the arXiv API for papers mentioning "GPT-4" and sort them by submission date to get the most recent publications.

```python
import urllib.request
import urllib.parse
import feedparser

# Base URL for the arXiv API
base_url = 'http://export.arxiv.org/api/query?'

# Search parameters
query = 'ti:"GPT-4"'
params = {
    'search_query': query,
    'sortBy': 'submittedDate',
    'sortOrder': 'descending',
    'max_results': 1
}

# Form the request URL
url = base_url + urllib.parse.urlencode(params)

# Perform the request
response = urllib.request.urlopen(url)
feed = feedparser.parse(response)

# Check if any results were found
if len(feed.entries) > 0:
    title = feed.entries[0].title
    summary = feed.entries[0].summary
    published = feed.entries[0].published
    link = feed.entries[0].links[0].href

    print(f"Title: {title}\nPublished: {published}\nLink: {link}\nSummary: {summary}")
else:
    print("No results found for GPT-4 on arXiv.")
```

Please run the above Python script. It will find and print details about the latest paper related to GPT-4 on arXiv, including the title, publication date, link, and a brief summary.

--------------------------------------------------------------------------------

Next speaker: User_proxy


>>>>>>>> USING AUTO REPLY...

>>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is python)...
User_proxy (to chat_manager):

exitcode: 0 (execution succeeded)
Code output:
Title: Creative and Context-Aware Translation of East Asian Idioms with GPT-4
Published: 2024-10-01T18:24:43Z
Link: http://arxiv.org/abs/2410.00988v1
Summary: As a type of figurative language, an East Asian idiom condenses rich cultural
background into only a few characters. Translating such idioms is challenging
for human translators, who often resort to choosing a context-aware translation
from an existing list of candidates. However, compiling a dictionary of
candidate translations demands much time and creativity even for expert
translators. To alleviate such burden, we evaluate if GPT-4 can help generate
high-quality translations. Based on automatic evaluations of faithfulness and
creativity, we first identify Pareto-optimal prompting strategies that can
outperform translation engines from Google and DeepL. Then, at a low cost, our
context-aware translations can achieve far more high-quality translations per
idiom than the human baseline. We open-source all code and data to facilitate
further research.


--------------------------------------------------------------------------------

Next speaker: Product_manager

Product_manager (to chat_manager):

The latest paper on GPT-4, titled "Creative and Context-Aware Translation of East Asian Idioms with GPT-4," describes an innovative application of GPT-4 in language translation, particularly focusing on translating complex East Asian idioms. The paper is available [here](http://arxiv.org/abs/2410.00988v1).

**Potential Applications in Software:**

1. **Translation Software Enhancement:**
   - Implementing GPT-4 in translation software could significantly improve the translation quality of idiomatic phrases, which are often a challenge due to their contextual and cultural depth. Software could provide multiple translations with contextual nuances, improving user understanding and satisfaction.

2. **Language Learning Tools:**
   - Language learning platforms can integrate GPT-4 to offer more nuanced translations of phrases and idioms, helping learners understand the cultural context better. This could enhance comprehension and retention for students studying East Asian languages.

3. **Content Localization Services:**
   - Companies that require localization of content into multiple languages, such as media services or multinational corporations, could use GPT-4-enhanced tools to ensure that the translated content retains the intended meaning and cultural significance.

4. **Dictionary and Reference Applications:**
   - Digital dictionaries and reference applications can leverage GPT-4 to provide users with rich, contextually appropriate interpretations of idioms or phrases, rather than static, literal translations.

5. **Creative Writing Software:**
   - Tools designed to aid in creative writing or content generation could incorporate GPT-4 to suggest creative ways to use idiomatic expressions in various languages, enhancing the narrative quality and engagement of written content.

6. **Automated Subtitling and Dubbing in Media Production:**
   - Media production companies can apply GPT-4-based translation frameworks to automate and improve the quality of subtitling and dubbing in different languages, particularly for content rich in cultural idioms.

This advancement offers an exciting glimpse into how AI can bridge the gap between languages and cultures, enhancing understanding and communication globally.

--------------------------------------------------------------------------------

Next speaker: Product_manager

Product_manager (to chat_manager):

This research demonstrates GPT-4's potential in language translation applications, specifically for translating complex East Asian idioms. Here are possible applications of this technology in software:

1. **Advanced Translation Services**: GPT-4 can be integrated into translation software to provide more accurate and culturally relevant translations of idioms. This would be particularly useful in professional settings where precise translation can avoid miscommunication.

2. **Educational Tools**: Educational platforms could use GPT-4 to help students understand idioms and expressions from different cultures in their original context, improving language learning by providing insights into cultural nuances.

3. **Content Localization**: Companies looking to expand globally often need to adapt their content to local cultures. GPT-4's ability to provide context-aware translations could improve content localization, making products more accessible and appealing to international markets.

4. **Customer Support**: AI-powered chatbots and customer support tools could use GPT-4 to better understand and respond to queries that include idiomatic expressions, providing a more effective and satisfying user experience for customers speaking different languages.

5. **Creative Writing**: Software tools designed to assist with writing could incorporate GPT-4 to suggest or decode idiomatic expressions, aiding writers in creating more authentic and engaging content for diverse audiences.

These applications could significantly enhance the capabilities of software products by providing deeper linguistic and cultural understanding, essential for global interaction in the digital age.

--------------------------------------------------------------------------------

Next speaker: Product_manager

Product_manager (to chat_manager):

Given the significant potential of GPT-4 in enhancing translation accuracy and contextual understanding, especially for complex idiomatic expressions in East Asian languages, here are various software product ideas that could leverage such advancements:

### Multilingual Customer Support Chatbots
1. **Improved Understanding**: GPT-4 could enable chatbots to understand and employ idiomatic language effectively, providing more natural and relatable interactions for users who communicate using these expressions.
2. **Cultural Sensitivity**: Chatbots could be trained to recognize cultural nuances, improving engagement with users from diverse backgrounds.

### Translation Tools for Authors and Creators
1. **Cultural Translation Features**: Tools that help authors translate their works while keeping cultural meanings intact, preserving the original sentiment and context.
2. **Assisted Writing Modes**: For writers looking to authentically incorporate foreign idioms into their work, offering suggestions and translations that retain the flavor of the source language.

### Enhanced Subtitling Applications for Media
1. **Automated Contextual Subtitling**: Software that uses GPT-4 to generate subtitles not just for direct translation but with contextual adjustments ensuring cultural relevance and comprehension for diverse audiences.
2. **Creative Subtitling**: Tools for subtitlers that suggest multiple translation options per idiom, based on varying levels of formality or directness, enhancing viewer engagement and satisfaction.

### Educational Language Learning Platforms
1. **Contextual Learning Modules**: Use GPT-4 to design language learning curricula that focus on the use of idioms in everyday conversation, providing learners with the cultural context behind the expressions.
2. **Interactive Scenarios**: Develop interactive scenarios where learners can see idioms used in different contexts, helping to solidify understanding and usage.

### Localized Content Generation for Marketers
1. **Automated Content Localization**: Tools that automatically adapt content (including idioms and colloquialisms) to fit local markets more effectively, enhancing user engagement and brand relatability.
2. **Marketing Aide for Cultural Nuances**: Software that suggests modifications or enhancements to marketing materials to make them more appealing to specific regional audiences.

These product ideas not only focus on translating language but also emphasize understanding and conveying underlying cultural contexts, which could dramatically improve communication and user experience across various digital platforms.

--------------------------------------------------------------------------------
[9]:
[TaskUnit(status=done)]
[10]:
from IPython.display import display, Markdown
display(Markdown(blog_post_task.output))

Final Answer:

Unveiling the Magic of GPT-4: Transforming Language and Culture

In the ever-evolving world of artificial intelligence, GPT-4 stands out as a beacon of innovation, pushing the boundaries of what’s possible in language processing. Imagine a world where language barriers crumble, and cultural nuances are effortlessly understood. That’s the promise of GPT-4, especially when it comes to translating complex idiomatic expressions in East Asian languages. This isn’t just about words; it’s about capturing the essence of culture and context, making communication more authentic and relatable.

One of the coolest applications of GPT-4 is in the realm of multilingual customer support chatbots. These aren’t your average bots; they’re culturally savvy, understanding and employing idiomatic language to create natural interactions. Imagine chatting with a bot that gets your cultural references and responds in a way that feels genuinely human. It’s like having a conversation with a local, no matter where you are in the world. This level of cultural sensitivity can revolutionize customer engagement, making interactions more meaningful and effective.

For authors and creators, GPT-4 opens up a world of possibilities in translation tools. Picture a tool that not only translates your work but also preserves the cultural meanings and sentiments behind it. Whether you’re an author looking to share your story with a global audience or a writer wanting to incorporate foreign idioms authentically, GPT-4 has got you covered. It’s like having a cultural ambassador by your side, ensuring your message resonates with readers from different backgrounds.

The media industry, too, can benefit from GPT-4’s advancements. Automated contextual subtitling is a game-changer, generating subtitles that go beyond direct translation to ensure cultural relevance and comprehension. Imagine watching a foreign film where the subtitles capture not just the dialogue but the cultural nuances, enhancing your viewing experience. It’s like having a personal translator who understands the subtleties of language and culture, making global content more accessible and enjoyable.

In the world of education, GPT-4 is a powerful ally for language learners. With contextual learning modules, students can dive into the world of idioms, understanding their use in everyday conversation. Interactive scenarios bring these expressions to life, helping learners grasp their meaning and usage in different contexts. It’s like having a language coach who not only teaches you the words but also the stories behind them, enriching your learning journey.

GPT-4 is not just about language; it’s about bridging cultures and enhancing communication. Whether it’s through chatbots, translation tools, media applications, or educational platforms, the potential is limitless. As we continue to explore and harness the power of GPT-4, one thing is clear: the future of language processing is here, and it’s more exciting than ever.

The writer agent used the AutoGen chat tool to retrieve data for the post!

Using any motleycrew-supported tools with AutoGen agents

Now let’s do it the other way around: give motleycrew tools (or agents, for that matter) to AutoGen agents. The code is based on the AutoGen tool-use tutorial: https://microsoft.github.io/autogen/docs/tutorial/tool-use

[11]:
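import autogen  # repeated here so the cell reads on its own; `llm_config` comes from an earlier cell
from motleycrew.tools import MotleyTool, RetryConfig
from langchain_community.tools import DuckDuckGoSearchRun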

search_tool = MotleyTool.from_supported_tool(
    DuckDuckGoSearchRun(),
    retry_config=RetryConfig(max_retries=5)  # for retrying rate limit errors
)  # Any tools or even motleycrew's agents can be converted to MotleyTool like this!

# Let's first define the assistant agent that suggests tool calls.
assistant = autogen.ConversableAgent(
    name="Assistant",
    system_message="You are a helpful AI assistant. "
    "You can provide useful and up-to-date data using web search. "
    "Return 'TERMINATE' when the task is done.",
    llm_config=llm_config,
)

# The user proxy agent interacts with the assistant agent
# and executes the tool calls it suggests.
user_proxy = autogen.ConversableAgent(
    name="User",
    llm_config=False,
    is_termination_msg=lambda msg: msg.get("content") is not None and "TERMINATE" in msg["content"],
    human_input_mode="NEVER",
)

# Make an autogen-compatible tool from the MotleyTool.
autogen_tool = search_tool.to_autogen_tool()

# Register the tool signature with the assistant agent.
assistant.register_for_llm(name="search_tool", description="Web search tool")(autogen_tool)

# Register the tool function with the user proxy agent.
user_proxy.register_for_execution(name="search_tool")(autogen_tool)
[11]:
<function motleycrew.tools.tool.MotleyTool.to_autogen_tool.<locals>.autogen_tool_fn(input: str) -> str>
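The signature printed above, `autogen_tool_fn(input: str) -> str`, suggests that the adapter yields a plain typed function, which is the kind of callable AutoGen can inspect to build a tool schema. The sketch below illustrates that idea only; it is not motleycrew's actual implementation, and `EchoSearchTool` and `to_plain_function` are hypothetical names introduced for the example.

```python
from typing import Callable


class EchoSearchTool:
    """Hypothetical stand-in for a search tool exposing an `invoke` method."""

    name = "search_tool"
    description = "Web search tool"

    def invoke(self, query: str) -> str:
        # A real tool would call a search backend here.
        return f"results for: {query}"


def to_plain_function(tool: EchoSearchTool) -> Callable[[str], str]:
    """Wrap a tool object in a typed function (illustrative helper, an assumption)."""

    def tool_fn(input: str) -> str:
        # The annotations on this function are what a framework can inspect
        # to derive a schema with a single string argument named `input`.
        return tool.invoke(input)

    tool_fn.__name__ = tool.name
    tool_fn.__doc__ = tool.description
    return tool_fn


plain_fn = to_plain_function(EchoSearchTool())
print(plain_fn("first computer"))
print(plain_fn.__annotations__)
```

Because the wrapper is an ordinary function, the same object can be registered twice, once for the LLM-facing schema and once for execution, as the cell above does with `register_for_llm` and `register_for_execution`.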
[12]:
chat_result = user_proxy.initiate_chat(assistant, message="What was the first computer?")
User (to Assistant):

What was the first computer?

--------------------------------------------------------------------------------

>>>>>>>> USING AUTO REPLY...
Assistant (to User):

***** Suggested tool call (call_lp7DaZO1WUlu7YsMZazwsZX2): search_tool *****
Arguments:
{"input":"first computer"}
****************************************************************************

--------------------------------------------------------------------------------

>>>>>>>> EXECUTING FUNCTION search_tool...
User (to Assistant):

User (to Assistant):

***** Response from calling tool (call_lp7DaZO1WUlu7YsMZazwsZX2) *****
Learn how the English mathematician and inventor Charles Babbage designed the Difference Engine and the Analytical Engine, the first mechanical digital computers. Explore their features, components, and challenges in this article from Britannica. ENIAC was built by the United States during World War II and completed in 1946. It was the first general-purpose digital computer that could execute different instructions based on data values. Learn about the history of computing, from Charles Babbage's mechanical difference and analytical engines to the first electronic computer, the Atanasoff-Berry Computer. Discover how World War II and personal computers shaped the evolution of computers. Computer - ENIAC, Electronic, Computing: In the United States, government funding went to a project led by John Mauchly, J. Presper Eckert, Jr., and their colleagues at the Moore School of Electrical Engineering at the University of Pennsylvania; their objective was an all-electronic computer. Under contract to the army and under the direction of Herman Goldstine, work began in early 1943 on ... Learn about 10 of the earliest devices that shaped the concepts of computation and automated calculation, from ancient Greek gears to modern electronic circuits. Discover how they were used for astronomy, cryptography, physics, and more.
**********************************************************************

--------------------------------------------------------------------------------

>>>>>>>> USING AUTO REPLY...
Assistant (to User):

The title of "first computer" can be attributed to different machines depending on the criteria used (mechanical vs. digital, programmable vs. non-programmable).

1. **Difference Engine and Analytical Engine by Charles Babbage**:
   - Charles Babbage, an English mathematician and inventor, designed the Difference Engine and the Analytical Engine in the 19th century. These were mechanical digital computers meant to automate the process of computing tables and had components analogous to a CPU, memory, and input/output paths.
   - Although the full versions of these engines were never completed during Babbage’s lifetime, they are often considered the first designs for a programmable computer.

2. **Atanasoff-Berry Computer (ABC)**:
   - Conceived in the 1930s by John Atanasoff and Clifford Berry, the ABC was the first electronic digital computer. It was designed to solve systems of linear equations and was not Turing-complete nor programmable in the modern sense.

3. **ENIAC (Electronic Numerical Integrator and Computer)**:
   - Developed by John Mauchly and J. Presper Eckert during World War II, and completed in 1946. ENIAC was considered the first high-speed, Turing-complete, digital computer capable of being reprogrammed to solve a full range of computing problems.

The ENIAC often gets credited as the first "general-purpose digital computer," given its electronic and programmable nature, distinguishing it from earlier mechanical or less flexible designs.

--------------------------------------------------------------------------------
User (to Assistant):



--------------------------------------------------------------------------------

>>>>>>>> USING AUTO REPLY...
Assistant (to User):

If you have any more questions or need further information about historical computers or any other topic, feel free to ask.

--------------------------------------------------------------------------------
User (to Assistant):



--------------------------------------------------------------------------------

>>>>>>>> USING AUTO REPLY...
Assistant (to User):

TERMINATE

--------------------------------------------------------------------------------
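The chat above stops once the assistant replies `TERMINATE`, which matches the `is_termination_msg` predicate given to the user proxy. The sketch below replays that predicate on hand-written message dictionaries (illustrative data, not real AutoGen output) to show why the `is not None` guard matters: a message that carries only a tool call may have its `content` set to `None`.

```python
from typing import Any, Dict, Optional


def is_termination_msg(msg: Dict[str, Any]) -> bool:
    """Same logic as the lambda passed to the user proxy above."""
    content: Optional[str] = msg.get("content")
    return content is not None and "TERMINATE" in content


# Hand-written sample messages, for illustration only.
tool_call_msg = {"content": None, "tool_calls": [{"function": {"name": "search_tool"}}]}
answer_msg = {"content": "ENIAC is often credited as the first general-purpose computer."}
final_msg = {"content": "TERMINATE"}

for msg in (tool_call_msg, answer_msg, final_msg):
    print(is_termination_msg(msg))
```

Without the guard, evaluating `"TERMINATE" in None` would raise a `TypeError`. The empty `User (to Assistant)` messages in the log appear to be the user proxy's automatic replies: with `llm_config=False` and `human_input_mode="NEVER"` it has nothing to say when there is no tool call to execute.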
[17]:
final_result = 'The agent used the search tool!'
print(final_result)
The agent used the search tool!