The following draft curriculum outlines topics to be covered and potential readings.
Introduces the concept of Scientific Discovery Platforms (SDPs): AI-native systems that connect reasoning models with scientific resources. We’ll explore motivating case studies (wildfire hazard, antimicrobials, climate modeling) and outline the challenges of integrating AI into rigorous science.
Readings:
Surveys frontier reasoning models: general-purpose LLMs (GPT, Claude), domain-specific foundation models (materials, bio, weather), and hybrids. Covers techniques for eliciting better reasoning: prompting, chain-of-thought, retrieval-augmented generation (RAG), fine-tuning, and tool-augmented reasoning.
Readings:
Assignment A1: Implement a ReAct-style agent.
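As a starting point for A1, the reason-act-observe loop might be sketched as follows. This is a minimal illustration under stated assumptions, not a reference solution: `scripted_model` is a hypothetical stand-in for a real LLM call, the `Action: tool[input]` text format is one common convention rather than a requirement, and `calculator` is a toy tool invented for the example.

```python
import re
from typing import Callable, Dict

def calculator(expression: str) -> str:
    """Toy tool (hypothetical): evaluates 'a op b' for integers."""
    a, op, b = expression.split()
    x, y = int(a), int(b)
    return str({"+": x + y, "-": x - y, "*": x * y}[op])

# Registry mapping tool names to callables.
TOOLS: Dict[str, Callable[[str], str]] = {"calculator": calculator}

ACTION_RE = re.compile(r"Action:\s*(\w+)\[(.*?)\]")
FINAL_RE = re.compile(r"Final Answer:\s*(.*)")

def scripted_model(transcript: str) -> str:
    """Hypothetical stand-in for an LLM: acts once, then answers."""
    if "Observation:" not in transcript:
        return "Thought: I need to multiply.\nAction: calculator[6 * 7]"
    observation = transcript.rsplit("Observation:", 1)[1].strip()
    return f"Thought: I have the result.\nFinal Answer: {observation}"

def react(question: str, model: Callable[[str], str], max_steps: int = 5) -> str:
    """Alternate model replies and tool observations until a final answer."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        reply = model(transcript)
        transcript += reply + "\n"
        final = FINAL_RE.search(reply)
        if final:
            return final.group(1).strip()
        action = ACTION_RE.search(reply)
        if action is None:
            # Feed the parse failure back so the model can correct itself.
            transcript += "Observation: no valid action found\n"
            continue
        name, arg = action.groups()
        tool = TOOLS.get(name)
        result = tool(arg) if tool else f"unknown tool {name}"
        transcript += f"Observation: {result}\n"
    return "no answer within step budget"

if __name__ == "__main__":
    print(react("What is 6 times 7?", scripted_model))
```

Swapping `scripted_model` for a function that calls a real model leaves the loop unchanged. The step budget guards against a model that never emits a final answer.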
Discusses architectures and frameworks for building multi-agent systems, with emphasis on inter-agent communication, orchestration, and lifecycle management.
Readings:
Covers how to augment reasoning models with external knowledge bases, vector search, and hybrid retrieval methods.
Readings:
Assignment A2: Hybrid retrieval.
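One way to approach A2 is to rank documents separately by keyword overlap and by vector similarity, then merge the two rankings with reciprocal rank fusion (RRF). The sketch below makes several assumptions for illustration only: the three documents are made up, character-trigram counts stand in for learned embeddings, and `k=60` is the constant commonly used with RRF rather than a tuned value.

```python
import math
from collections import Counter
from typing import Dict, List

# Made-up corpus for illustration.
DOCS: Dict[str, str] = {
    "d1": "wildfire hazard maps from satellite imagery",
    "d2": "antimicrobial peptide discovery with language models",
    "d3": "climate model emulation for wildfire risk",
}

def keyword_rank(query: str, docs: Dict[str, str]) -> List[str]:
    """Rank documents by the number of query terms they contain."""
    terms = set(query.lower().split())
    scores = {d: len(terms & set(t.lower().split())) for d, t in docs.items()}
    return sorted(scores, key=lambda d: (-scores[d], d))

def trigram_vector(text: str) -> Counter:
    """Stand-in for an embedding: counts of character trigrams."""
    t = text.lower()
    return Counter(t[i:i + 3] for i in range(len(t) - 2))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def vector_rank(query: str, docs: Dict[str, str]) -> List[str]:
    """Rank documents by cosine similarity to the query vector."""
    qv = trigram_vector(query)
    scores = {d: cosine(qv, trigram_vector(t)) for d, t in docs.items()}
    return sorted(scores, key=lambda d: (-scores[d], d))

def rrf(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Reciprocal rank fusion: sum 1 / (k + rank) over all rankings."""
    fused: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            fused[doc] = fused.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(fused, key=lambda d: (-fused[d], d))

def hybrid_search(query: str, docs: Dict[str, str], top_k: int = 2) -> List[str]:
    return rrf([keyword_rank(query, docs), vector_rank(query, docs)])[:top_k]

if __name__ == "__main__":
    print(hybrid_search("wildfire risk", DOCS))
```

RRF uses only rank positions, so the keyword and vector scores never need to be put on a common scale. That is the main reason to prefer it over a weighted sum of raw scores in a first implementation.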
Introduces methods for invoking external tools from reasoning models. Focuses on the Model Context Protocol (MCP), schema design, and execution management.
Readings:
How SDPs connect to HPC workflows and experimental labs. Covers distributed coordination, robotics, and federated agents.
Readings:
Assignment A3: Implement Distributed Battleship (and/or an MCP toolbox).
Explores how scientists and agents collaborate: trust boundaries, interaction design, and debugging.
Readings:
Frameworks for assessing agents and SDPs: robustness, validity, and relevance.
Readings:
Examines why multi-agent systems fail and methods for safety and guardrails.
Readings:
Assignment A4: Implement an evaluation harness.
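For A4, a harness can start as a loop that runs an agent over labeled cases and records passes, wrong answers, and crashes separately. The sketch below is a minimal illustration under stated assumptions: `toy_agent`, the test cases, and the case-insensitive exact-match scoring rule are all hypothetical choices made for the example. A real harness would likely add robustness and validity checks beyond exact match.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Case:
    prompt: str
    expected: str

@dataclass
class Report:
    passed: int
    failed: int
    errors: int

    @property
    def accuracy(self) -> float:
        total = self.passed + self.failed + self.errors
        return self.passed / total if total else 0.0

def evaluate(agent: Callable[[str], str], cases: List[Case]) -> Report:
    """Run the agent on every case; count crashes separately from wrong answers."""
    passed = failed = errors = 0
    for case in cases:
        try:
            answer = agent(case.prompt)
        except Exception:
            # A crash is a distinct failure mode from an incorrect answer.
            errors += 1
            continue
        if answer.strip().lower() == case.expected.strip().lower():
            passed += 1
        else:
            failed += 1
    return Report(passed, failed, errors)

def toy_agent(prompt: str) -> str:
    """Hypothetical agent: a lookup table that fails on one input."""
    if prompt == "crash":
        raise RuntimeError("simulated tool failure")
    return {"2+2": "4", "capital of France": "Paris"}.get(prompt, "unknown")

CASES = [
    Case("2+2", "4"),
    Case("capital of France", "paris"),
    Case("largest planet", "Jupiter"),
    Case("crash", "n/a"),
]

if __name__ == "__main__":
    report = evaluate(toy_agent, CASES)
    print(report, report.accuracy)
```

Keeping `errors` apart from `failed` makes it easier to tell an agent that reasons badly from one whose tools or infrastructure break.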
Case studies of SDPs in biology and materials.
Readings:
Explores originality, credit, and the risks of plagiarism in AI-generated science.
Readings:
Assignment A5: Capstone project planning (novel contributions).
Pipelines, workflow composition, and self-improving systems.
Readings:
Assignment A6: Generating HPC workflows.
Covers approaches to adapt agents with reinforcement learning and real-world training.
Readings:
Discusses ethical and policy dimensions: dual-use concerns, bias, carbon footprint, open science vs IP.
Suggested Readings:
Strategies for scaling: distributed compute, HPC, cloud-native orchestration. Covers resilience, scheduling, and cost/energy considerations.
Suggested Readings:
Demonstration of automation pipelines with monitoring, logging, and adaptive workflows. Emphasis on debugging and error recovery.
Suggested Readings:
Explores frontiers: multi-agent collaboration, embodied co-scientists, integration with digital twins. Students speculate on SDPs in 2030.
Readings:
Students present draft capstone plans, receive structured peer critique, and refine their plans. Instructor provides guidance on scope, deliverables, and evaluation.
Suggested Readings: