Where to go next
The agent you built in this course works. For a single-task agent with two tools and a step limit, plain Python is the right tool. The problems appear when you try to make the agent more capable.
Plain Python is enough to understand how agents work. It is not enough to build agents that are reliable in production.
What plain Python cannot handle well
Recovering from crashes. The run_agent function has no memory between calls. If it crashes on step 3 of 5, the next call starts from the beginning. For a task that takes two seconds, this is fine. For a task that takes twenty minutes and calls ten external APIs, restarting from scratch after a crash is not acceptable.
Parallel work. Some tasks benefit from running multiple tool calls simultaneously: fetch the weather for three cities at once rather than one at a time. The for loop in lesson 5 is sequential. Making it parallel requires threading or async code, and the state management that comes with it.
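As a rough illustration of what the parallel version involves, here is a minimal sketch using asyncio from the standard library. The get_weather function is a hypothetical stand-in that sleeps instead of calling a real API, and get_weather_for_all is a name invented for this example.

```python
import asyncio


async def get_weather(city: str) -> str:
    """Hypothetical tool: sleeps briefly to simulate a network call."""
    await asyncio.sleep(0.1)
    return f"Weather for {city}: sunny"


async def get_weather_for_all(cities: list[str]) -> list[str]:
    # gather() runs the calls concurrently and returns the results
    # in the same order as the inputs.
    return await asyncio.gather(*(get_weather(city) for city in cities))


if __name__ == "__main__":
    for line in asyncio.run(get_weather_for_all(["Paris", "Tokyo", "Lima"])):
        print(line)
```

The three calls overlap, so the total wait is close to one call rather than three. The cost is that every function on the path becomes async, and any state shared between the calls has to be handled with care.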
Complex routing. The agent in this course decides between two tools. Real agents may need to decide between twenty, or route to different sub-agents depending on the task type. A chain of if/else statements handles three cases. It does not handle thirty.
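One common plain-Python step beyond if/else is a lookup table that maps tool names to functions. The sketch below assumes every tool takes one string and returns one string; the tool decorator, the TOOLS registry, and both example tools are hypothetical names used only for illustration.

```python
from typing import Callable

# Registry mapping a tool name to the function that implements it.
TOOLS: dict[str, Callable[[str], str]] = {}


def tool(name: str):
    """Decorator that registers a function under the given tool name."""
    def register(fn: Callable[[str], str]) -> Callable[[str], str]:
        TOOLS[name] = fn
        return fn
    return register


@tool("get_weather")
def get_weather(city: str) -> str:
    # Hypothetical tool: returns a canned answer instead of calling an API.
    return f"Weather for {city}: sunny"


@tool("word_count")
def word_count(text: str) -> str:
    return str(len(text.split()))


def route(tool_name: str, argument: str) -> str:
    """Look up the requested tool and call it; report unknown names."""
    handler = TOOLS.get(tool_name)
    if handler is None:
        return f"Unknown tool: {tool_name}"
    return handler(argument)
```

Adding a tool now means adding one decorated function, not another branch. A table like this still only covers picking a tool by name. Routing that depends on the task type or on intermediate results needs more structure than a dictionary lookup.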
Testability. The run_agent function is one block. Testing one part of it, such as the routing logic, means running the entire function, including its real API calls, unless you mock them out. A flat function is hard to test in pieces.
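To make the testing problem concrete, here is a minimal sketch of the alternative: the routing decision is pulled out into its own function and the model call is passed in as a parameter. The reply format (a dict with an optional tool_name key) and all function names are assumptions for this example, not the course's actual code.

```python
from typing import Callable


def choose_next_step(model_reply: dict) -> str:
    """Pure routing logic: no network access and no side effects."""
    if model_reply.get("tool_name"):
        return "call_tool"
    return "finish"


def run_step(messages: list[dict],
             call_model: Callable[[list[dict]], dict]) -> str:
    # The model call is injected, so a test can pass in a fake.
    reply = call_model(messages)
    return choose_next_step(reply)


def fake_model(messages: list[dict]) -> dict:
    """Test double that always asks for the weather tool."""
    return {"tool_name": "get_weather", "argument": "Paris"}


def test_routing() -> None:
    assert choose_next_step({"tool_name": "get_weather"}) == "call_tool"
    assert choose_next_step({"content": "All done."}) == "finish"
    assert run_step([], fake_model) == "call_tool"
```

The test runs without touching a real API. The refactoring that makes this possible is the same separation into named steps that graph-based frameworks build in from the start.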
Checkpointing and resume. If you want to pause an agent mid-run and resume it later, or inspect its state at any point, you need that state to be serializable and stored somewhere. A local Python variable cannot do that: it disappears when the function returns or the process exits.
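A hand-rolled version of checkpointing might look like the sketch below, which writes the agent's state to a JSON file after every step and reloads it on the next call. AgentState, run_agent_with_checkpoints, and the crash_at parameter are hypothetical names for this example, and the "work" in each step is a placeholder string.

```python
import json
from dataclasses import asdict, dataclass, field
from pathlib import Path
from typing import Optional


@dataclass
class AgentState:
    task: str
    step: int = 0
    results: list[str] = field(default_factory=list)


def save_checkpoint(state: AgentState, path: Path) -> None:
    # Write to a temporary file first, then rename it, so a crash
    # mid-write does not leave a half-written checkpoint behind.
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(asdict(state)))
    tmp.replace(path)


def load_checkpoint(path: Path, task: str) -> AgentState:
    if path.exists():
        return AgentState(**json.loads(path.read_text()))
    return AgentState(task=task)


def run_agent_with_checkpoints(task: str, path: Path, total_steps: int = 5,
                               crash_at: Optional[int] = None) -> AgentState:
    state = load_checkpoint(path, task)  # resume if a checkpoint exists
    while state.step < total_steps:
        if crash_at is not None and state.step == crash_at:
            raise RuntimeError("simulated crash")
        state.results.append(f"result of step {state.step}")  # placeholder work
        state.step += 1
        save_checkpoint(state, path)
    return state
```

If the first run crashes at step 3, a second call with the same path picks up at step 3 rather than step 0. This handles the simplest case only. Concurrent runs, schema changes, and cleaning up old checkpoints are all left to you.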
What graph-based frameworks add
A graph-based framework like LangGraph solves each of these problems by making the agent’s structure explicit:
- Each step becomes a named node that can be tested in isolation.
- The routing between steps is declared as edges with explicit conditions, not buried in if/else logic.
- State is a typed dictionary that, once persistence is configured, is saved to a database after each step, enabling crash recovery and resume.
- Parallel branches are a first-class concept, not something you have to implement with threads.
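The ideas in this list can be illustrated without any framework. The toy below is plain Python and is not LangGraph's API: NODES, EDGES, run_graph, and the on_checkpoint hook are all invented for this sketch, and parallel branches are left out to keep it short.

```python
from typing import Callable, Optional, TypedDict


class State(TypedDict, total=False):
    question: str
    needs_tool: bool
    tool_result: str
    answer: str


# Each node is a small function from state to new state.
def plan(state: State) -> State:
    return {**state, "needs_tool": "weather" in state["question"].lower()}


def call_tool(state: State) -> State:
    return {**state, "tool_result": "sunny"}  # placeholder tool output


def respond(state: State) -> State:
    detail = state.get("tool_result", "no tool needed")
    return {**state, "answer": f"Answer ({detail})"}


NODES: dict[str, Callable[[State], State]] = {
    "plan": plan, "call_tool": call_tool, "respond": respond,
}

# Each edge inspects the state and names the next node, or None to stop.
EDGES: dict[str, Callable[[State], Optional[str]]] = {
    "plan": lambda s: "call_tool" if s["needs_tool"] else "respond",
    "call_tool": lambda s: "respond",
    "respond": lambda s: None,
}


def run_graph(state: State, start: str = "plan",
              on_checkpoint: Optional[Callable[[str, State], None]] = None) -> State:
    current: Optional[str] = start
    while current is not None:
        state = NODES[current](state)
        if on_checkpoint is not None:
            on_checkpoint(current, state)  # hook where state could be persisted
        current = EDGES[current](state)
    return state
```

Because plan, call_tool, and respond are ordinary functions, each can be tested by itself. The routing lives in one table, and the checkpoint hook sees the full state after every node. A real framework supplies this runner, plus persistence and parallelism, so you do not have to maintain it yourself.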
The trade-off is that graph-based agents require more upfront structure. For a two-tool agent, that structure is overhead. For an agent with ten tools, multiple sub-agents, and a self-improvement loop, that structure is what keeps it manageable.
Where to go from here
The next course, Building Production Agents with LangGraph, covers exactly that structure. It assumes you understand what an agent is, what a tool does, and what the loop looks like, which you now do. It starts where this course ends: with the decision to use LangGraph and the architectural patterns that make agents reliable at scale.
The concepts you will encounter there, including state separation, graph modeling, checkpointing, and observability, all address the limitations described above. You now have enough context to understand why each pattern exists, not just what it does.