Controlling Autonomous Agents: The Doer Qualities of ChatGPT & Co.

Language Models: A New Tool for Complex Projects

Large language models (LLMs) can take on complex projects. So far, however, the user has had to prompt the model again for each step, for example first for the structure of a presentation and then for the content of the individual slides. Several new projects aim to let language models carry out such complex tasks independently.

One such platform is AgentGPT, which is based on GPT-3.5. Users can give the agent a job, such as “Create me a 30-page presentation about LLMs in journalism: How can journalists use them and what are the dangers?” The agent then works through the tasks on its own: it gathers research and information, creates an outline for the presentation, and incorporates the gathered information into the written text. Another such platform takes a more cautious approach: unlike AgentGPT, its agent only executes sub-steps after approval from the user. It also queries search engines and evaluates the returned results.
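The approval-based mode described above can be illustrated with a minimal sketch. This is not the code of any of the platforms mentioned. The function `run_with_approval` and its parameters `execute` and `approve` are hypothetical names chosen for illustration, and a real agent would ask the user interactively instead of using a fixed rule.

```python
from typing import Callable, Iterable, List


def run_with_approval(
    steps: Iterable[str],
    execute: Callable[[str], str],
    approve: Callable[[str], bool],
) -> List[str]:
    """Run each sub-step only if the user (here: a callback) approves it."""
    executed = []
    for step in steps:
        if approve(step):
            executed.append(execute(step))
        # Steps that are not approved are skipped, never executed.
    return executed


# Illustrative usage with stand-ins for the user and for the agent's actions.
steps = ["search the web", "delete old files", "write summary"]
outcome = run_with_approval(
    steps,
    execute=str.upper,                      # stand-in for really doing the step
    approve=lambda s: "delete" not in s,    # stand-in for a human decision
)
```

In this sketch a rejected step is simply skipped. An actual tool might instead stop the run or ask the model for an alternative step.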

Such projects were inspired by Yohei Nakajima’s BabyAGI, a Python script less than 9 KB in length. Essentially, these agents use the OpenAI API to complete tasks or to break complex tasks down into smaller steps.
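The basic idea of breaking an objective into sub-tasks and working through them in order can be sketched in a few lines. This is a simplified illustration and not BabyAGI's actual code. The function `fake_llm` is a hypothetical stand-in that returns canned answers, so the example runs without a real API call, and the `PLAN:` and `DO:` prompt prefixes are assumptions made for this sketch.

```python
from collections import deque
from typing import Callable, List


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a language-model call (canned answers only)."""
    if prompt.startswith("PLAN:"):
        return "research the topic\ndraft an outline\nwrite the slide text"
    return "done: " + prompt[len("DO: "):]


def run_agent(objective: str, llm: Callable[[str], str], max_steps: int = 10) -> List[str]:
    """Ask the model for sub-tasks, then execute them one after another."""
    # Step 1: let the model split the objective into sub-tasks, one per line.
    plan = llm(f"PLAN: {objective}")
    tasks = deque(line.strip() for line in plan.splitlines() if line.strip())

    # Step 2: work through the queue, with a hard limit on the number of steps.
    results = []
    while tasks and len(results) < max_steps:
        task = tasks.popleft()
        results.append(llm(f"DO: {task}"))
    return results


results = run_agent("presentation about LLMs in journalism", fake_llm)
```

The `max_steps` limit is a design choice of this sketch: an agent that generates its own tasks needs some stopping condition so that it cannot loop indefinitely.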

Small Autonomous World

Smallville is a 2D world created by researchers from Stanford University and Google to demonstrate how human behavior can be simulated in interactive applications. The world is populated by 25 agents who “live and work” in the virtual environment. The system uses a language model, GPT-3.5, which generates the agents’ actions and conversations as text.

The language model plays through the actions of the 25 agents in parallel. The researchers have provided an interactive demo in which observers can follow the agents over 48 hours and read, in text form, what each agent is doing and whom it is talking to. Smallville has received extensive media coverage and ignited a boom in agent systems.

Language Models as Robot Controllers?

Since language models can be integrated into virtual environments, researchers are also exploring whether they can be used to control robots in the physical world. Microsoft’s AI research department has conducted a study on this. A significant challenge is that language models tend to write creative texts rather than precise instructions.

Additionally, language models lack knowledge of how to control robot components correctly. To overcome this challenge, the researchers introduced a correction layer in which humans check the AI-generated commands.
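Such a correction layer could look roughly like the following sketch. It is not taken from the Microsoft study. The command names in `ALLOWED`, the function `review_commands`, and the `human_ok` callback are all hypothetical, and the automatic check against a list of known commands is an added assumption that goes beyond the purely human check described above.

```python
from typing import Callable, List, Tuple

# Hypothetical set of commands the robot is assumed to understand.
ALLOWED = {"move_forward", "turn_left", "turn_right", "stop"}


def review_commands(
    generated: str,
    human_ok: Callable[[str], bool],
) -> Tuple[List[str], List[str]]:
    """Split model output into accepted and rejected robot commands."""
    accepted, rejected = [], []
    for line in generated.splitlines():
        command = line.strip()
        if not command:
            continue
        name = command.split("(")[0]
        # A command passes only if it is known AND a human confirms it.
        if name in ALLOWED and human_ok(command):
            accepted.append(command)
        else:
            rejected.append(command)
    return accepted, rejected


# Illustrative model output: one valid, one invented, one vetoed command.
model_output = "move_forward(1.0)\nfly_to_moon()\nturn_left(90)\n"
accepted, rejected = review_commands(
    model_output,
    human_ok=lambda c: c != "turn_left(90)",  # stand-in for a human reviewer
)
```

Only the accepted commands would be passed on to the robot. The rejected ones could be fed back to the model as a request for correction.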


Language models offer a new tool for undertaking complex projects. AgentGPT and Smallville are examples of the many projects being developed to enable language models to undertake complex tasks independently. However, as with any new tool, the challenges of using language models to control robots in the physical world must be addressed.

