Tree of thoughts · NorthGradient

Chain-of-thought works well when there is a clear path from problem to answer. But some problems have no clear path: the right approach only emerges after exploring a few wrong ones, or the correct first step is not obvious until you see where it leads. For these, committing to a single chain is a liability.

Tree of thoughts trades the linear path for a branching one. Instead of following one chain to the end, the model generates several candidate next steps at each point, evaluates which are most promising, and continues from the best while discarding the rest.

Tree of thoughts lets the model explore multiple reasoning paths and evaluate them before committing, which is useful when the right approach only becomes clear after some exploration.

How it works

Treat reasoning as a search problem rather than a generation problem. At each step, the model produces several next steps instead of one. Each is evaluated: does this path move toward a solution or away from one? The most promising are kept and expanded; the rest are pruned.

This creates a tree where each node is a reasoning step and each branch is an alternative continuation. The model searches it, guided by its own evaluations, until it reaches a leaf node representing a complete solution.

Problem: Plan a three-day itinerary for a city trip with limited budget,
         including one cultural site, one food experience, and one free activity per day.

Step 1 candidates:
  A. Start with the free activities to anchor each day's schedule.
  B. Start with the cultural sites since they often have fixed opening hours.
  C. Start with food experiences to cluster them by neighbourhood.

Evaluate: Option B is most constrained. Starting with fixed constraints
          first leaves more flexibility for the rest. Continue with B.

Step 2 candidates from B:
  ...

In this trace, the model generates candidates, evaluates them, and selects one before proceeding. The evaluation step is what distinguishes this from chain-of-thought: the model is not just reasoning, it is reasoning about its own reasoning.
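The generate, evaluate, and prune loop can be sketched as a small beam search. This is a minimal sketch, not a reference implementation: the names `tree_search`, `propose`, `score`, and `is_complete` are hypothetical, and in a real system `propose` and `score` would be model calls. Here they are plain functions over a toy problem (reach a target sum using steps of 1, 3, or 4) so the search itself can be run and inspected.

```python
from typing import Callable, List, Tuple

Path = Tuple[int, ...]  # a partial solution: the steps taken so far


def tree_search(
    propose: Callable[[Path], List[int]],
    score: Callable[[Path], float],
    is_complete: Callable[[Path], bool],
    beam_width: int = 2,
    max_depth: int = 5,
) -> Path:
    """Expand the kept paths, score every candidate, keep the best few."""
    frontier: List[Path] = [()]  # start from the empty path (the root)
    for _ in range(max_depth):
        # Generate: every kept path proposes several next steps.
        candidates = [path + (step,) for path in frontier for step in propose(path)]
        if not candidates:
            break
        # Evaluate and prune: keep only the highest-scoring candidates.
        candidates.sort(key=score, reverse=True)
        frontier = candidates[:beam_width]
        # Stop as soon as a kept path is a complete solution.
        finished = [path for path in frontier if is_complete(path)]
        if finished:
            return finished[0]
    return frontier[0]  # best partial path if no solution was found


# Toy stand-ins for the model (assumptions for illustration only).
TARGET = 10


def propose(path: Path) -> List[int]:
    return [1, 3, 4]


def score(path: Path) -> float:
    return -abs(TARGET - sum(path))  # closer to the target is better


def is_complete(path: Path) -> bool:
    return sum(path) == TARGET


result = tree_search(propose, score, is_complete)
print(result, sum(result))
```

The beam width controls how many alternatives survive each round. A width of 1 collapses the search back into a single chain, which is one way to see chain-of-thought as a special case of the same loop.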

When tree of thoughts helps

The technique earns its complexity on a specific class of problems: planning tasks where early decisions close off later options, constraint satisfaction problems balancing multiple requirements, and creative tasks with tight constraints where many approaches look plausible but most fail at some step.

A useful test: if you can imagine a skilled human saying “let me think about a few approaches before committing,” tree of thoughts is probably right. If the path is relatively direct, chain-of-thought is enough.

How to prompt for it

Tree of thoughts is a pattern rather than a single instruction, and it can be implemented in different ways depending on how much control you need.

The simplest is one prompt that asks the model to generate alternatives, evaluate them, and select:

You are solving the following problem: [problem statement]

Work through it using the following process:
1. Generate three possible first steps.
2. Evaluate each one: which is most likely to lead to a good solution, and why?
3. Choose the best one and explain your choice.
4. Repeat this process for each subsequent step until you reach a solution.
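If you reuse this template across problems, it can help to build it in code. The helper below is a hypothetical convenience function (`build_tot_prompt` is not part of any library) that fills in the problem statement and the number of candidates; the wording is taken from the template above.

```python
def build_tot_prompt(problem: str, n_candidates: int = 3) -> str:
    """Fill the single-prompt template with a problem and a candidate count."""
    return (
        f"You are solving the following problem: {problem}\n\n"
        "Work through it using the following process:\n"
        f"1. Generate {n_candidates} possible first steps.\n"
        "2. Evaluate each one: which is most likely to lead to a good solution, and why?\n"
        "3. Choose the best one and explain your choice.\n"
        "4. Repeat this process for each subsequent step until you reach a solution.\n"
    )


prompt = build_tot_prompt("Plan a three-day itinerary on a limited budget.")
print(prompt)
```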

A more rigorous implementation uses separate prompts: one generates candidate next steps, a second evaluates them, a third selects and continues. This gives more control over each stage but requires a chain of prompts rather than a single one.
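A rough sketch of the multi-prompt version follows. Everything here is an assumption for illustration: `call_model` stands for whatever function sends a prompt to your model and returns its text reply, the prompt wording is invented, and the selection stage is done in plain code rather than by a third prompt, which is a simplification. A stub model is included so the flow can be run without an API.

```python
from typing import Callable, List

ModelFn = Callable[[str], str]  # hypothetical: prompt in, reply text out


def generate_candidates(call_model: ModelFn, problem: str,
                        steps_so_far: List[str], n: int = 3) -> List[str]:
    """Stage 1: ask for n candidate next steps, one per line."""
    prompt = (
        f"Propose {n} possible next steps, one per line.\n"
        f"Problem: {problem}\nSteps so far: {steps_so_far}"
    )
    lines = [line.strip() for line in call_model(prompt).splitlines()]
    return [line for line in lines if line][:n]


def evaluate_candidate(call_model: ModelFn, problem: str,
                       steps_so_far: List[str], candidate: str) -> float:
    """Stage 2: ask for a 1-10 rating of one candidate."""
    prompt = (
        "Rate from 1 to 10 how likely this step is to lead to a good solution. "
        "Reply with the number only.\n"
        f"Problem: {problem}\nSteps so far: {steps_so_far}\n"
        f"Candidate step: {candidate}"
    )
    try:
        return float(call_model(prompt).strip())
    except ValueError:
        return 0.0  # an unparseable rating counts as the lowest score


def next_step(call_model: ModelFn, problem: str,
              steps_so_far: List[str], n: int = 3) -> str:
    """Stage 3 (done in code here): keep the highest-rated candidate."""
    candidates = generate_candidates(call_model, problem, steps_so_far, n)
    if not candidates:
        raise ValueError("the model returned no candidate steps")
    return max(candidates,
               key=lambda c: evaluate_candidate(call_model, problem, steps_so_far, c))


def fake_model(prompt: str) -> str:
    """Stub standing in for a real model call, for demonstration only."""
    if prompt.startswith("Propose"):
        return "Book the cultural sites\nPick the free activities\nChoose restaurants"
    candidate = prompt.split("Candidate step:")[-1]
    return "9" if "cultural" in candidate else "4"


chosen = next_step(fake_model, "Plan a three-day city trip", [])
print(chosen)
```

Keeping each stage in its own function makes it possible to log, test, or swap the generation and evaluation prompts independently, which is the control the multi-prompt approach is meant to buy.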

Figure: a linear chain-of-thought path contrasted with a branching tree-of-thoughts structure, showing candidate generation, evaluation, and pruning at each node.

The cost

Tree of thoughts is far more expensive than chain-of-thought in tokens and computation. Generating several candidates at each step multiplies output length roughly by the number of candidates, and a multi-prompt implementation also multiplies the number of API calls.
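To make the multiplication concrete, here is a back-of-the-envelope estimate of API calls for a multi-prompt implementation. The accounting is an assumption, not a fixed property of the technique: it supposes one generation call per kept path and one evaluation call per candidate, with `branching` candidates per path and at most `beam_width` paths kept after each step.

```python
def estimate_calls(branching: int, beam_width: int, depth: int) -> int:
    """Count model calls under the assumed one-call-per-stage accounting."""
    calls = 0
    frontier = 1  # the search starts from a single root
    for _ in range(depth):
        candidates = frontier * branching
        calls += frontier    # one generation call per kept path
        calls += candidates  # one evaluation call per candidate
        frontier = min(beam_width, candidates)
    return calls


# Three candidates per step, two paths kept, four steps deep.
tot_calls = estimate_calls(branching=3, beam_width=2, depth=4)
print(tot_calls)  # 28, versus a single call for one chain-of-thought prompt
```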

That cost is worth paying when the problem genuinely requires exploration and the stakes are high; it is not a sensible default. Use chain-of-thought when there is a clear path, and escalate to tree of thoughts only when the path itself needs to be discovered.

What neither technique guarantees

Both make the model’s reasoning visible and more structured. Neither guarantees correct reasoning: a well-structured tree can still arrive at the wrong answer. What they offer is not correctness but transparency: the reasoning is there to inspect, and errors are easier to find and fix than in an opaque direct answer.

In the next lesson, we’ll move to Chapter 7 and look at how to control the structure and format of the model’s output directly, independent of the reasoning that produced it.