
AI Agents for Science Curriculum
The following curriculum outlines the topics covered, lists the readings, and provides the slides presented in class (minus purely administrative material).
Several guest lectures are not included.
Lecture 1: What is an agent?
Introduces AI agents and the sense-plan-act-learn loop. Motivates Scientific Discovery Platforms (SDPs): AI-native systems that connect reasoning models with scientific resources.
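As a rough illustration of the sense-plan-act-learn loop, here is a minimal sketch of a toy agent that drives a single number toward a target. It is not taken from the lecture slides; the class name `SetpointAgent`, the `env` dictionary, and the gain-halving rule are all hypothetical choices made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class SetpointAgent:
    """Toy agent (hypothetical): nudges env["value"] toward a target."""
    target: float
    gain: float = 0.5
    errors: list = field(default_factory=list)

    def sense(self, env: dict) -> float:
        # Sense: read the current state of the environment.
        return env["value"]

    def plan(self, reading: float) -> float:
        # Plan: choose a correction proportional to the remaining error.
        return self.gain * (self.target - reading)

    def act(self, env: dict, action: float) -> None:
        # Act: apply the correction to the environment.
        env["value"] += action

    def learn(self, error: float) -> None:
        # Learn: if the error grew since the previous step, the last
        # action overshot, so halve the gain for future plans.
        if self.errors and abs(error) > abs(self.errors[-1]):
            self.gain *= 0.5
        self.errors.append(error)


def run(agent: SetpointAgent, env: dict, steps: int = 10) -> float:
    """Run the loop for a fixed number of steps and return the final value."""
    for _ in range(steps):
        reading = agent.sense(env)
        action = agent.plan(reading)
        agent.act(env, action)
        agent.learn(agent.target - agent.sense(env))
    return env["value"]
```

Calling `run(SetpointAgent(target=20.0), {"value": 10.0})` moves the value close to 20. An LLM-based agent replaces the hand-written `plan` with a model call, but the loop structure stays the same.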
Slides: Lecture 1 slides.
Readings:
- Exploring Large Language Model based Intelligent Agents: Definitions, Methods, and Prospects, Cheng et al. (arXiv, 2024).
- Artificial intelligence and illusions of understanding in scientific research, Messeri & Crockett (Nature, 2024).
- The Shift from Models to Compound AI Systems, The Berkeley Artificial Intelligence Research Blog.
Lecture 2: Frontiers of Language Models
Surveys frontier reasoning models: general-purpose LLMs (GPT, Claude), domain-specific foundation models (materials, bio, weather), and hybrids. Covers techniques for eliciting better reasoning: prompting, chain-of-thought, retrieval-augmented generation (RAG), fine-tuning, and tool-augmented reasoning.
Slides: Lecture 2 slides.
Readings:
- Chain-of-Thought Prompting Elicits Reasoning in Large Language Models.
- DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning.
- ReAct: Synergizing Reasoning and Acting in Language Models.
Lecture 3: Systems for Agents
Discusses architectures and frameworks for building multi-agent systems, with emphasis on inter-agent communication, orchestration, and lifecycle management.
Slides: Lecture 3 slides.
Readings:
- AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation.
- LangGraph.
- AIOS: LLM Agent Operating System.
Lecture 4: Retrieval-Augmented Generation (RAG) and Vector Databases
Covers how to augment reasoning models with external knowledge bases, vector search, and hybrid retrieval methods.
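To make the retrieval step concrete, the following is a minimal sketch of vector search feeding a prompt. It assumes a toy bag-of-words "embedding" in place of a learned embedding model and a plain Python list in place of a vector database; the function names `embed`, `cosine`, `retrieve`, and `build_prompt` are hypothetical and do not come from any particular library.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a real system would use a learned embedding."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, documents: list, k: int = 2) -> list:
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, documents: list, k: int = 2) -> str:
    """Augment the question with retrieved context before calling a model."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}"


# Made-up example corpus, for illustration only.
DOCS = [
    "perovskite solar cells degrade under humidity",
    "protein folding is predicted by deep learning models",
    "weather forecasts use ensemble simulations",
]
```

For example, `retrieve("how do perovskite solar cells degrade", DOCS, k=1)` ranks the first document highest because it shares the most terms with the query. A vector database performs the same ranking, typically with approximate nearest-neighbor indexes so it scales to large collections.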
Slides: Lecture 4 slides.
Readings:
Lecture 5: Tool Calling
Introduces methods for invoking external tools from reasoning models. Focuses on the Model Context Protocol (MCP), schema design, and execution management.
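The interplay of schema design and execution management can be sketched in a few lines. The example below does not use an MCP SDK and is not the protocol's wire format; it is a simplified stand-in in which each tool declares a JSON-Schema-like input description and a dispatcher validates a model's request before running the handler. The registry `TOOLS`, the `celsius_to_kelvin` tool, and the request shape are assumptions made for illustration.

```python
import json

# Hypothetical tool registry: each entry pairs a schema with a handler.
TOOLS = {
    "celsius_to_kelvin": {
        "description": "Convert a temperature from Celsius to Kelvin.",
        "input_schema": {
            "type": "object",
            "properties": {"celsius": {"type": "number"}},
            "required": ["celsius"],
        },
        "handler": lambda celsius: celsius + 273.15,
    }
}


def validate(args: dict, schema: dict) -> None:
    """Check required keys, unknown keys, and numeric types (minimal subset)."""
    missing = [key for key in schema["required"] if key not in args]
    if missing:
        raise ValueError(f"missing arguments: {missing}")
    for key, value in args.items():
        if key not in schema["properties"]:
            raise ValueError(f"unexpected argument: {key}")
        expected = schema["properties"][key]["type"]
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if expected == "number" and not is_number:
            raise TypeError(f"argument {key!r} must be a number")


def call_tool(request_json: str) -> dict:
    """Parse a tool-call request, validate it, and return a result or error."""
    request = json.loads(request_json)
    tool = TOOLS.get(request.get("name"))
    if tool is None:
        return {"ok": False, "error": f"unknown tool: {request.get('name')}"}
    args = request.get("arguments", {})
    try:
        validate(args, tool["input_schema"])
        return {"ok": True, "result": tool["handler"](**args)}
    except (ValueError, TypeError) as exc:
        # Errors go back to the model as data so it can retry with a fix.
        return {"ok": False, "error": str(exc)}
```

A request such as `call_tool('{"name": "celsius_to_kelvin", "arguments": {"celsius": 25}}')` returns a dictionary with `ok` set to `True` and the converted value, while a malformed request returns `ok` set to `False` with an error message instead of raising.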
Slides: Lecture 5 slides.
Readings:
Lecture 6: HPC Systems and Self-Driving Labs
Examines how SDPs connect to high-performance computing (HPC) workflows and experimental labs. Covers distributed coordination, robotics, and federated agents.
Slides: Lecture 6 slides.
Readings:
- Self-Driving Laboratories for Chemistry and Materials Science, Chemical Reviews.
- Empowering Scientific Workflows with Federated Agents.
Lecture 7: Human–AI Workflows
Explores how scientists and agents collaborate: trust boundaries, interaction design, and debugging.
Slides: Lecture 7 slides.
Readings:
- Guidelines for Human-AI Interaction, Amershi et al. (CHI, 2019).
- Interactive Debugging and Steering of Multi-Agent AI Systems (CHI, 2025).
Lecture 8: AI co-scientists for accelerating scientific discovery
Guest lecture by Dr. Arvind Ramanathan.
Slides: Lecture 8 slides.
Readings:
- The Virtual Lab of AI agents designs new SARS-CoV-2 nanobodies, Nature.
- The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via Agentic Tree Search.
Lecture 9: Human–AI Workflows, continued
Further discussion of how scientists and agents collaborate.
Slides: Lecture 9 slides.
Lecture 10: Benchmarking and Evaluation
Frameworks for assessing agents and SDPs: robustness, validity, and relevance.
Slides: Lecture 10 slides.
Readings:
- MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering.
- AI Agents That Matter.
- EAIRA: Establishing a Methodology for Evaluating AI Models as Scientific Research Assistants.
Lecture 11: Failures and Safety
Examines why multi-agent systems fail and methods for safety and guardrails.
Slides: Lecture 11 slides.
Readings:
- Why Do Multi-Agent LLM Systems Fail?.
- AGrail: A Lifelong Agent Guardrail with Effective and Adaptive Safety Detection.
- Improve accuracy by adding Automated Reasoning checks in Amazon Bedrock Guardrails.
Lecture 12: Novelty and Plagiarism
Explores originality, credit, and the risks of plagiarism in AI-generated science.
Slides: Lecture 12 slides.
Readings:
- Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers.
- All That Glitters is Not Novel: Plagiarism in AI Generated Research.
Assignment A5: Capstone project planning (novel contributions).
Lecture 13: Building Agents and Workflows
Pipelines, workflow composition, and self-improving systems.
Slides: Lecture 13 slides.
Readings:
- AFlow: Automating Agentic Workflow Generation.
- DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines.
Lecture 14: Fine-tuning
Covers approaches to adapt agents with reinforcement learning and real-world training.
Slides: Lecture 14 slides.
Readings: