UCB CS294/194-196 Large Language Model Agents
Course Structure and Content:
- Foundations of LLMs: The course begins by discussing the fundamental concepts underlying LLMs, including their architectures and training methodologies.
- Essential Abilities for Task Automation: Students will delve into the reasoning and planning capabilities of LLMs, focusing on how these models can utilize tools and perform complex tasks autonomously.
- Agent Development Infrastructure: The curriculum covers the infrastructure required for developing LLM agents, such as retrieval-augmented generation techniques and agentic AI frameworks.
- Application Domains:
- Code Generation and Data Science: Exploration of how LLMs can assist in software development and data analysis tasks.
- Multimodal Agents and Robotics: Investigation into LLMs’ capabilities in processing and integrating multiple data modalities, including their applications in robotics.
- Web Automation and Scientific Discovery: Study of LLM agents’ roles in automating web interactions and contributing to scientific research. 
- Evaluation and Benchmarking: The course emphasizes the importance of assessing LLM agents’ performance through rigorous evaluation and benchmarking methodologies.
- Privacy, Safety, and Ethics: Discussions on the limitations, potential risks, and ethical considerations associated with current LLM agents, along with insights into directions for further improvement. 
- Human-Agent Interaction and Alignment: Examination of how LLM agents interact with humans, focusing on personalization and alignment with user intentions.
- Multi-Agent Collaboration: Exploration of scenarios where multiple LLM agents collaborate to achieve complex objectives.
Lecture 01: LLM Reasoning
AI should be able to learn from just a few examples, as humans usually do. Humans can learn from just a few examples because humans can reason.
A basic LLM is just a parrot that mimics human language. We can add a reasoning process before producing the answer. For example:

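To make this concrete, here is the classic worked example from the chain-of-thought literature (my stand-in for the slide's figure): instead of answering directly, the model first writes out intermediate reasoning.

Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. Each can has 3 tennis balls. How many tennis balls does he have now?
A: Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 tennis balls. 5 + 6 = 11. The answer is 11.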
The reasoning process can be seen as intermediate steps taken before producing the answer. So we can see that:
Key Idea:
Derive the Final Answer through Intermediate Steps
How can we add this property?
We need one of:
- Training with intermediate steps
- Fine-tuning with intermediate steps
- Prompting with intermediate steps
each using a curated dataset, for example GSM8K (Cobbe et al., n.d.). One example record is:
{"question": "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?",
"answer": "Natalia sold 48/2 = <<48/2=24>>24 clips in May.\nNatalia sold 48+24 = <<48+24=72>>72 clips altogether in April and May.\n#### 72"}
But why are intermediate steps helpful? One theory, proposed by (Li et al., n.d.), is that:
- Constant-depth transformers can solve any inherently serial problem, as long as they generate sufficiently long intermediate reasoning steps
- Transformers that directly generate final answers either require huge depth to solve such problems or cannot solve them at all
This means:
- Generating more intermediate steps lets the model think longer
- Too long to generate in one pass? Call external tools or search procedures, e.g. MCTS
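As one concrete instance of offloading work to an external tool (a simple calculator rather than MCTS; this sketch is my illustration, not the lecture's code), note that GSM8K already annotates calculator calls as <<expression=result>>, so an external evaluator can check the model's arithmetic:

```python
import re

# GSM8K marks calculator calls as <<expression=result>>. Instead of trusting
# the model's own arithmetic, recompute each annotated expression externally.
CALC_PATTERN = re.compile(r"<<([0-9+\-*/(). ]+)=([0-9.]+)>>")

def check_steps(solution: str) -> bool:
    """Verify every annotated calculation with an external evaluator."""
    for expr, claimed in CALC_PATTERN.findall(solution):
        computed = eval(expr, {"__builtins__": {}})  # arithmetic only
        if abs(computed - float(claimed)) > 1e-9:
            return False
    return True

step = "Natalia sold 48/2 = <<48/2=24>>24 clips in May."
print(check_steps(step))  # True: the model's arithmetic matches the tool
```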
Furthermore, we can trigger step-by-step reasoning without any demonstration examples, simply by appending the prompt "Let's think step by step" (zero-shot chain of thought; Kojima et al., n.d.).
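A minimal sketch of the zero-shot trigger, assuming some text-completion LLM API (llm_complete below is a hypothetical placeholder):

```python
def llm_complete(prompt: str) -> str:
    """Hypothetical placeholder for a real LLM API call."""
    raise NotImplementedError

def zero_shot_cot(question: str) -> str:
    # No demonstration examples; the trigger phrase alone elicits reasoning.
    prompt = f"Q: {question}\nA: Let's think step by step."
    return llm_complete(prompt)
```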
However, this zero-shot setting (providing no examples) is worse than few-shot learning (providing several examples). How can we improve the zero-shot ability? One nice discovery is that LLMs are analogical reasoners (Yasunaga et al., n.d.), which means they can:
adaptively generate relevant examples and knowledge, rather than just using a fixed set of examples.
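A sketch of what such a self-generated-exemplar prompt can look like (the wording is my paraphrase, not Yasunaga et al.'s exact template):

```python
def analogical_prompt(question: str) -> str:
    # Ask the model to recall its own relevant exemplars before solving,
    # instead of supplying a fixed, hand-written set of examples.
    return (
        f"Problem: {question}\n\n"
        "First, recall three relevant and distinct example problems and "
        "explain how each is solved. Then solve the problem above step by step."
    )

print(analogical_prompt("How many clips did Natalia sell altogether in April and May?"))
```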
What's more? Well, we are lazy: is there a method where we don't even have to pass the "let's think step by step" prompt? (Wang & Zhou, n.d.) propose chain-of-thought decoding, which enables the LLM to carry out a reasoning process without any explicit prompt: instead of greedy decoding, it branches on the top-k candidates for the first token and prefers the continuation whose final answer the model is most confident in.
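A rough control-flow sketch of that idea (my approximation, not the paper's implementation; `model` is a hypothetical object exposing next_token_distribution(prefix), which returns a dict mapping candidate tokens to probabilities):

```python
import math

def cot_decode(model, prompt, k=5, max_len=100):
    # 1) Branch on the top-k candidates for the first token instead of
    #    committing to the single greedy token.
    first = model.next_token_distribution(prompt)
    branches = sorted(first, key=first.get, reverse=True)[:k]
    best_path, best_conf = None, -math.inf
    for tok in branches:
        # 2) Continue each branch greedily.
        path, margins = [tok], []
        for _ in range(max_len):
            dist = model.next_token_distribution(prompt + "".join(path))
            probs = sorted(dist.values(), reverse=True)
            margins.append(probs[0] - (probs[1] if len(probs) > 1 else 0.0))
            nxt = max(dist, key=dist.get)
            path.append(nxt)
            if nxt == "<eos>":
                break
        # 3) Prefer the branch decoded with the largest average probability
        #    margin (the paper scores only the answer tokens; this simplifies).
        conf = sum(margins) / len(margins)
        if conf > best_conf:
            best_path, best_conf = path, conf
    return "".join(best_path)
```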
One thing to note is that we don't actually want the intermediate steps; what we want is just the final answer. How can we utilise the reasoning process to get a better answer? One basic idea is to sample multiple reasoning paths and take a majority vote over their final answers, marginalizing out the intermediate steps. This is called Self-Consistency (Wang et al., n.d.): pick the most frequent (highest-probability) final answer.
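A minimal sketch of self-consistency (my illustration; sample_reasoning_path stands in for sampling a chain of thought from an LLM at nonzero temperature, and answers are assumed to follow GSM8K's "#### <answer>" convention):

```python
import random
from collections import Counter

def self_consistency(question, sample_reasoning_path, n_samples=10):
    """Sample several reasoning paths and majority-vote on the final answers."""
    answers = []
    for _ in range(n_samples):
        path = sample_reasoning_path(question)          # e.g. "...\n#### 72"
        answers.append(path.split("####")[-1].strip())  # keep only the answer
    answer, _count = Counter(answers).most_common(1)[0]
    return answer

# Toy usage with a fake sampler that reaches the right answer 70% of the time:
fake_sampler = lambda q: f"(steps...)\n#### {random.choices(['72', '68'], weights=[0.7, 0.3])[0]}"
print(self_consistency("Natalia...", fake_sampler))  # usually prints "72"
```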
So far we have talked about the good side of LLM reasoning. What is the flip side? Well, LLMs are easily distracted by irrelevant context, and they cannot reliably self-correct their reasoning.