Getting Started: Building Scientific Agents

Production scientific agents need two capabilities:

  1. Intelligence — LLM-powered reasoning, tool calling, and decision making
  2. Distribution — Running across machines, institutions, and federated infrastructure

No single framework does both well. That’s why we use two complementary frameworks:

Framework   What It Does                           Strengths
---------   ------------------------------------   ----------------------------------------------------
LangGraph   LLM reasoning and tool orchestration   ReAct patterns, state management, tool calling
Academy     Distributed agent execution            Cross-machine messaging, federation, HPC integration

[Diagram: LangGraph + Academy stack]

Learning Path

[Diagram: Learning path, LLM Agents → Distributed → Production]

This guide walks you through both frameworks, building toward production agents:

Stage   Guide                What You Learn
-----   ------------------   ---------------------------------------------------
1       LLM Agents           Build agents that reason and call tools (LangGraph)
2       Distributed Agents   Run agents across machines (Academy)
3       Production Agents    Combine LangGraph + Academy for real deployments

Time investment: Each stage builds on the previous. Plan to work through them in order.


Stage 1: LLM Agents (LangGraph)

Learn to build agents that can reason and use tools.

from langchain_core.messages import HumanMessage
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

@tool
def calculate(expression: str) -> str:
    """Evaluate a math expression."""
    # Note: eval() runs arbitrary code; fine for a demo, unsafe in production.
    return str(eval(expression))

llm = ChatOpenAI(model="gpt-4o-mini")
agent = create_react_agent(llm, [calculate])
agent.invoke({"messages": [HumanMessage(content="What is 347 * 892?")]})

Key concepts: Tools, ReAct loop, state graphs, memory
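The demo tool above uses `eval`, which executes arbitrary code. For anything beyond a throwaway demo, a safer sketch parses the expression with Python's `ast` module and whitelists arithmetic operators (the `safe_calculate` helper name is ours, not part of LangGraph):

```python
import ast
import operator

# Whitelisted binary operators for arithmetic expressions.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}

def safe_calculate(expression: str) -> str:
    """Evaluate a math expression without eval()'s code-execution risk."""
    def _eval(node: ast.AST):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -_eval(node.operand)
        raise ValueError(f"Unsupported expression: {expression!r}")

    return str(_eval(ast.parse(expression, mode="eval").body))

print(safe_calculate("347 * 892"))  # 309524
```

Any unsupported node (names, calls, attribute access) raises instead of executing, so the tool can still be registered with `@tool` unchanged.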

Start Stage 1 →


Stage 2: Distributed Agents (Academy)

Learn to run agents across machines and pass messages between them.

import asyncio

from academy.agent import Agent, action
from academy.manager import Manager

class ComputeAgent(Agent):
    @action
    async def run_simulation(self, params: dict) -> dict:
        # Could run on HPC, lab instrument, cloud...
        return {"energy": -127.5, "status": "completed"}

async def main() -> None:
    # `factory` is an exchange factory configured for your deployment;
    # see the Academy docs for the available exchange backends.
    async with await Manager.from_exchange_factory(factory) as manager:
        compute = await manager.launch(ComputeAgent)
        result = await compute.run_simulation({"temp": 300})

asyncio.run(main())

Key concepts: Agent classes, @action methods, Handles, Manager
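To build intuition for how `@action` methods become remotely invokable, here is a framework-free sketch in plain asyncio. This is our illustration of the dispatch idea, not Academy's actual implementation: a decorator marks coroutine methods as actions, and a tiny dispatcher routes incoming requests by name, roughly what a Handle does for you.

```python
import asyncio

def action(func):
    """Mark a coroutine method as remotely invokable (illustration only)."""
    func._is_action = True
    return func

class SimulatedAgent:
    @action
    async def run_simulation(self, params: dict) -> dict:
        await asyncio.sleep(0)  # stand-in for real work on HPC, etc.
        return {"energy": -127.5, "temp": params["temp"], "status": "completed"}

async def dispatch(agent, action_name: str, payload: dict) -> dict:
    """Route a request to the named @action method, as a Handle might."""
    method = getattr(agent, action_name)
    if not getattr(method, "_is_action", False):
        raise ValueError(f"{action_name} is not an exposed action")
    return await method(payload)

result = asyncio.run(dispatch(SimulatedAgent(), "run_simulation", {"temp": 300}))
print(result)  # {'energy': -127.5, 'temp': 300, 'status': 'completed'}
```

In Academy, the same flow runs across processes or machines: the Manager launches the agent, and the returned handle serializes the call, sends it through the exchange, and awaits the reply.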

Start Stage 2 →


Stage 3: Production Agents (LangGraph + Academy)

Combine both: LLM reasoning inside distributed Academy agents.

from academy.agent import Agent, action
from langchain_core.messages import HumanMessage

class ResearchAgent(Agent):
    """Academy agent with LangGraph intelligence."""

    # self._langgraph_agent is a compiled LangGraph agent (e.g., from
    # create_react_agent), set up during agent initialization.

    @action
    async def research(self, task: str) -> dict:
        # LangGraph handles reasoning
        result = self._langgraph_agent.invoke({
            "messages": [HumanMessage(content=task)]
        })
        return {"findings": result["messages"][-1].content}

Key patterns: LangGraph reasoning wrapped in Academy @action methods, Handles for remote invocation, Academy handling the distribution
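The hybrid pattern separates cleanly into a distribution wrapper and an injected reasoner. The sketch below makes that separation explicit without requiring Academy or an LLM: `reasoner` stands in for a compiled LangGraph agent, and here is any object with an `invoke(state) -> state` method (our assumption, so the sketch runs standalone with a stub).

```python
import asyncio

class ResearchWrapper:
    """Sketch of the hybrid pattern: distribution shell + injected reasoner."""

    def __init__(self, reasoner):
        self._reasoner = reasoner  # stands in for a compiled LangGraph agent

    async def research(self, task: str) -> dict:
        # LLM reasoning is blocking work; offload it so the agent's
        # event loop stays responsive to other messages.
        result = await asyncio.to_thread(
            self._reasoner.invoke, {"messages": [("user", task)]}
        )
        return {"findings": result["messages"][-1]}

class StubReasoner:
    """Canned reasoner used in place of a real LangGraph graph."""

    def invoke(self, state: dict) -> dict:
        question = state["messages"][-1][1]
        return {"messages": state["messages"] + [f"stub answer to: {question}"]}

out = asyncio.run(ResearchWrapper(StubReasoner()).research("survey catalysts"))
print(out)  # {'findings': 'stub answer to: survey catalysts'}
```

Swapping `StubReasoner` for a real LangGraph agent and subclassing Academy's `Agent` (with `research` marked `@action`) turns this sketch into the production shape shown above.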

Start Stage 3 →


Quick Decision Guide

“I just want to experiment with LLM agents locally” → Start with Stage 1 (LangGraph only)

“I need agents on HPC/federated infrastructure but no LLM yet” → Start with Stage 2 (Academy only)

“I need LLM-powered agents on DOE infrastructure” → Work through all three stages, ending with Stage 3


Examples by Stage

Stage   Examples                           Pattern
-----   --------------------------------   ---------------------
1       Calculator, RAG, Database, API     LangGraph tools
1       Conversation, LangGraph Pipeline   LangGraph state
2       AcademyBasic, RemoteTools          Academy messaging
2       Persistent, Federated              Academy patterns
3       Hybrid                             LangGraph + Academy
3       HPC Job, CharacterizeChemicals     Production deployment

Prerequisites

LLM access: Examples support OpenAI, Ollama, and FIRST backends. See LLM Configuration for setup. All examples include mock mode for testing without API keys.
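One way mock mode can work, sketched here with class and function names of our own (check the examples for the actual mechanism): a fake chat model returns canned replies, and a small factory falls back to it when no API key is configured.

```python
import os

class MockChatModel:
    """Drop-in stand-in for a real chat model; returns canned replies."""

    def __init__(self, responses: dict, default: str = "(mock reply)"):
        self._responses = responses
        self._default = default

    def invoke(self, prompt: str) -> str:
        return self._responses.get(prompt, self._default)

def get_llm():
    """Use a real backend when credentials exist, otherwise mock mode."""
    if os.environ.get("OPENAI_API_KEY"):
        from langchain_openai import ChatOpenAI  # imported only when needed
        return ChatOpenAI(model="gpt-4o-mini")
    return MockChatModel({"What is 347 * 892?": "309524"})

mock = MockChatModel({"What is 347 * 892?": "309524"})
print(mock.invoke("What is 347 * 892?"))  # 309524
```

Because the mock exposes the same `invoke` interface, the rest of an example's code does not change between mock and real backends.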