NorthGradient

Prompt Design · Chapter 6 · Lesson 1 · 6 min read

Chain-of-thought prompting

A straightforward factual question deserves a direct answer. But some questions require working through several steps, each depending on the one before. Asked for a direct answer, the model often gets these wrong, not for lack of knowledge, but because it compresses multi-step reasoning into a single output.

The fix is to ask the model to show its work.

Asking the model to reason step by step before answering improves accuracy on problems that require chained logic, because it forces each intermediate step to be made explicit and therefore checkable.

Why direct answers fail on complex problems

When a model answers directly, the reasoning happens implicitly and never appears in the output. If it goes wrong in the middle, there is no way to see where, and nothing to catch the error before the model commits to an answer.

This is most visible with arithmetic. Asked directly, a multi-step word problem often gets a confident but wrong answer. Worked through one step at a time, the same problem is more likely to come out right, because each step is written out, checked against the previous one, and used as input to the next.

The same holds for logical reasoning, causal analysis, and any task where intermediate conclusions feed into a final one.

The two ways to trigger chain-of-thought

There are two ways to ask a model to reason step by step, suited to different situations.

The first is few-shot chain-of-thought. You provide one or more worked examples, each showing the full reasoning trace followed by the answer. The model sees the pattern and applies it.

Question: A store sells apples for $0.50 each and oranges for $0.75 each.
          If someone buys 4 apples and 3 oranges, how much do they spend?

Reasoning: 4 apples at $0.50 each is 4 × 0.50 = $2.00.
           3 oranges at $0.75 each is 3 × 0.75 = $2.25.
           Total spent is $2.00 + $2.25 = $4.25.

Answer: $4.25

---

Question: A store sells notebooks for $1.20 each and pens for $0.40 each.
          If someone buys 5 notebooks and 6 pens, how much do they spend?

Reasoning:

Having seen a complete reasoning trace, the model produces one for the new problem in the same format.
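The prompt above can also be assembled programmatically. The following is a minimal sketch, not a prescribed implementation: the `WorkedExample` class and `build_few_shot_prompt` function are hypothetical names introduced for illustration, and the layout simply mirrors the Question / Reasoning / Answer format shown above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WorkedExample:
    """One solved problem: the question, the reasoning trace, and the answer."""
    question: str
    reasoning: str
    answer: str


def build_few_shot_prompt(examples: List[WorkedExample], new_question: str) -> str:
    """Join the worked examples, then end with an open 'Reasoning:' slot."""
    blocks = [
        f"Question: {ex.question}\n\nReasoning: {ex.reasoning}\n\nAnswer: {ex.answer}"
        for ex in examples
    ]
    # Leave the last block unfinished so the model continues from "Reasoning:".
    blocks.append(f"Question: {new_question}\n\nReasoning:")
    return "\n\n---\n\n".join(blocks)


apples = WorkedExample(
    question=(
        "A store sells apples for $0.50 each and oranges for $0.75 each. "
        "If someone buys 4 apples and 3 oranges, how much do they spend?"
    ),
    reasoning=(
        "4 apples at $0.50 each is 4 × 0.50 = $2.00. "
        "3 oranges at $0.75 each is 3 × 0.75 = $2.25. "
        "Total spent is $2.00 + $2.25 = $4.25."
    ),
    answer="$4.25",
)

prompt = build_few_shot_prompt(
    [apples],
    "A store sells notebooks for $1.20 each and pens for $0.40 each. "
    "If someone buys 5 notebooks and 6 pens, how much do they spend?",
)
print(prompt)
```

Ending the prompt on an open "Reasoning:" line is the design choice that matters here: the model's most natural continuation is a trace in the same format as the example.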

The second is zero-shot chain-of-thought. Instead of examples, you add a phrase that cues the model to reason before answering.

A store sells notebooks for $1.20 each and pens for $0.40 each.
If someone buys 5 notebooks and 6 pens, how much do they spend?

Think step by step.

“Think step by step” is the most common cue and works across a wide range of tasks. Phrasings like “Let’s work through this carefully” have similar effects. Put the cue at the end of the prompt, so it is the last instruction the model reads before it begins generating.
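In code, adding the cue amounts to appending one line to the question. This is a small sketch under that assumption; `add_reasoning_cue` and `DEFAULT_CUE` are hypothetical names, not part of any library.

```python
DEFAULT_CUE = "Think step by step."


def add_reasoning_cue(question: str, cue: str = DEFAULT_CUE) -> str:
    """Append the cue as the final line of the prompt."""
    # Strip trailing whitespace so the cue always sits one blank line below the question.
    return f"{question.rstrip()}\n\n{cue}"


question = (
    "A store sells notebooks for $1.20 each and pens for $0.40 each.\n"
    "If someone buys 5 notebooks and 6 pens, how much do they spend?"
)
prompt = add_reasoning_cue(question)
print(prompt)
```

Passing a different `cue` argument makes it easy to compare alternative phrasings on the same set of questions.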

Zero-shot is easier because it needs no examples. Few-shot is more reliable on complex or domain-specific tasks, where the examples show the exact format and depth of reasoning you expect.

What chain-of-thought cannot fix

Chain-of-thought improves reasoning over problems the model already has the knowledge to solve. It does not supply missing knowledge. If the model does not know a fact, reasoning step by step just produces a confident, well-structured chain that arrives at the wrong answer.

It also does not guarantee correct reasoning. A trace can look coherent and still be wrong. The trace makes errors visible and catchable, but does not eliminate them.
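Because the steps are written out, some of them can be checked mechanically. The sketch below assumes the trace states its arithmetic in the form "a × b = c" or "a + b = c", as in the store examples earlier. The function name, the regular expression, and the faulty trace are all made up for illustration. It verifies the arithmetic only, not whether the model chose the right steps.

```python
import re
from decimal import Decimal
from typing import List

# Matches steps such as "4 × 0.50 = $2.00" or "$2.00 + $2.25 = $4.25".
STEP = re.compile(
    r"\$?(\d+(?:\.\d+)?)\s*([×x*+\-])\s*\$?(\d+(?:\.\d+)?)\s*=\s*\$?(\d+(?:\.\d+)?)"
)


def find_arithmetic_errors(trace: str) -> List[str]:
    """Return every 'a op b = c' step whose stated result does not match."""
    errors = []
    for match in STEP.finditer(trace):
        left, op, right, stated = match.groups()
        # Decimal avoids binary floating-point surprises with money amounts.
        a, b, c = Decimal(left), Decimal(right), Decimal(stated)
        if op == "+":
            actual = a + b
        elif op == "-":
            actual = a - b
        else:  # "×", "x", or "*"
            actual = a * b
        if actual != c:
            errors.append(f"{match.group(0)} (expected {actual})")
    return errors


# Hand-written trace with a deliberate mistake in the last step (not real model output).
faulty_trace = (
    "5 notebooks at $1.20 each is 5 × 1.20 = $6.00. "
    "6 pens at $0.40 each is 6 × 0.40 = $2.40. "
    "Total spent is $6.00 + $2.40 = $8.30."
)
print(find_arithmetic_errors(faulty_trace))
```

The first two steps pass and the final sum is flagged, which is the kind of error that stays hidden when the model returns only a final number.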

Finally, it adds length to the output. When only the answer is needed, the extra tokens are overhead. Use it when the reasoning genuinely helps, not by default.
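When the reasoning is worth keeping but downstream code needs only the result, the answer can be separated from the trace after generation. This sketch assumes the model followed the "Answer:" layout from the few-shot example earlier in this lesson; `extract_answer` is a hypothetical helper, and the completion is a hand-written stand-in rather than real model output.

```python
from typing import Optional


def extract_answer(completion: str, marker: str = "Answer:") -> Optional[str]:
    """Return the text after the last line starting with the marker, or None."""
    # Search from the end so a stray "Answer:" inside the reasoning is not picked up first.
    for line in reversed(completion.splitlines()):
        stripped = line.strip()
        if stripped.startswith(marker):
            return stripped[len(marker):].strip()
    return None


# Stand-in for a model completion in the Question / Reasoning / Answer format.
completion = (
    "5 notebooks at $1.20 each is 5 × 1.20 = $6.00.\n"
    "6 pens at $0.40 each is 6 × 0.40 = $2.40.\n"
    "Total spent is $6.00 + $2.40 = $8.40.\n"
    "\n"
    "Answer: $8.40"
)
print(extract_answer(completion))
```

Returning `None` when the marker is missing lets the caller decide whether to retry or fall back, instead of silently treating part of the trace as the answer.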

A diagram contrasting a direct answer prompt with a chain-of-thought prompt, showing how the reasoning trace makes intermediate steps explicit and catchable.

When to use it

Chain-of-thought is most valuable when:

  • The answer depends on several intermediate conclusions
  • Errors are hard to spot in a final answer but visible in intermediate steps
  • You want to inspect or verify the reasoning itself

For simple factual retrieval, classification, or format conversion, it adds overhead without benefit.

In the next lesson, we’ll look at tree of thoughts, which extends chain-of-thought by letting the model explore multiple reasoning paths rather than committing to one.