
Evolution of LLM Prompt Engineering
LLMs are great at many things, especially when we ask nicely. Over the last few years, prompt engineering grew from “just ask it” to a set of techniques that make models more reliable, better reasoners, and less… imaginative with facts. Here’s an overview of the different techniques that have emerged so far.
I. Zero‑Shot and Few‑Shot Prompting
Zero‑shot prompting is the “no instructions manual” mode: ask for translation, a summary, a quick classification, and the model usually delivers. Magic? Not quite, just lots of pre-training.
Translate the following text to French:
🧑🏻💻 Hi buddy! Welcome to our blogpost!
🤖 Salut copain! Bienvenue sur notre article de blog!
Note that in the prompt above we didn’t provide the model with any examples; it understood the task without any additional guidance. That’s zero-shot capability at work! Quite impressive, huh? It’s particularly useful when fine-tuning a model for every specific task would be impractical or costly: a single pre-trained model can cover a wide range of applications without task-specific training.
When the task is niche or you want a specific style, add a couple of examples. That’s few‑shot prompting. Think of it as showing the model the vibe you want, then saying “your turn.” It’s great for format, tone, and domain hints, but it won’t suddenly make the model a mathematician.
🧑🏻💻 A "whatpu" is a small, furry animal native to Tanzania. An example of a sentence that uses the word whatpu is: We were traveling in Africa and we saw these very cute whatpus. Now, give me an example of a sentence that uses the word whatpu.
🤖 While hiking through the Tanzanian wilderness, our guide pointed out a rare whatpu, a small and furry creature that quickly scurried across our path.
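The whole trick boils down to prompt assembly. Here’s a minimal sketch of a few-shot prompt builder; the `Input:`/`Output:` labels are just one common convention, not a requirement:

```python
def build_few_shot_prompt(examples, query):
    """Format worked (input, output) pairs followed by the new query,
    so the model can infer the pattern and complete the last block."""
    blocks = [f"Input: {inp}\nOutput: {out}" for inp, out in examples]
    blocks.append(f"Input: {query}\nOutput:")  # model continues from here
    return "\n\n".join(blocks)

prompt = build_few_shot_prompt(
    [("Hi buddy!", "Salut copain!")],
    "Welcome to our blogpost!",
)
```

With zero examples this degenerates into a plain zero-shot prompt, which is exactly the relationship between the two techniques.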
II. Chain‑of‑Thought (CoT)
It may come as a surprise that LLMs sometimes struggle with simple computations. Tasks like these require intermediate steps, and that’s exactly where models trip up. The previous tricks might not be sufficient.
CoT prompting gets the model to show its work before answering. For arithmetic, logic, or multi‑step analysis, this often boosts accuracy. You can include worked examples, or add a simple nudge like “Let’s think step by step.”
🧑🏻💻 Let's think step by step.
Question: Alice is taller than Bob. Bob is taller than Charlie. Who is the shortest?
🤖 If Alice > Bob and Bob > Charlie, then Alice > Bob > Charlie. So Charlie is the shortest. Answer: Charlie.
Caveat: the model can still wander. If the steps go off‑track, the final answer will too. Friendly, but fallible.
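In code, the nudge is just a string appended to the question, plus a bit of parsing to pull the final answer out of the reasoning. A minimal sketch (the `Answer:` marker is an assumption about the output format, matching the transcript above):

```python
import re

def cot_prompt(question):
    """Append a step-by-step nudge to elicit intermediate reasoning."""
    return f"{question}\nLet's think step by step."

def extract_final_answer(model_output):
    """Pull the text after the last 'Answer:' marker out of the reasoning."""
    matches = re.findall(r"Answer:\s*(.+)", model_output)
    return matches[-1].strip() if matches else None

reasoning = "If Alice > Bob and Bob > Charlie, then Alice > Bob > Charlie. Answer: Charlie."
final = extract_final_answer(reasoning)
```

Separating the reasoning from the extracted answer also makes the caveat above easier to catch: you can inspect the steps, not just the conclusion.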
III. Tree‑of‑Thought (ToT)
To go one step further than CoT, ToT prompting says, “Let’s explore a few options before we commit.” The model considers multiple reasoning paths and picks the best branch. It’s more thorough than CoT, but expect extra latency and complexity. More brain power, a bit more coffee.
🧑🏻💻 Try to explore different options before deciding on your final answer. Question: Should I bring an umbrella today?
🤖 Thought 1: If it's cloudy, there might be rain, so an umbrella is a safe choice.
Thought 2: If I check the forecast and there's only a 10% chance of rain, maybe not.
Thought 3: If I’m walking far and don’t want to risk it, better safe than soaked.
Final Answer: Bring the umbrella. Worst case, you look overly cautious. Best case, you stay dry.
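Stripped to its skeleton, ToT is “propose several branches, score them, commit to the best one.” Here’s a toy sketch; `propose` and `score` are hypothetical hooks that you would back with LLM calls in a real system:

```python
def tree_of_thought(question, propose, score, k=3):
    """Propose k candidate reasoning branches, score each,
    and commit only to the highest-scoring one."""
    branches = propose(question, k)
    return max(branches, key=score)

# Toy stand-ins: canned thoughts and a trivial scorer, for illustration only.
propose = lambda q, k: [
    "If it's cloudy, rain is possible.",
    "The forecast says only a 10% chance of rain.",
    "Better safe than soaked: bring the umbrella.",
]
score = lambda thought: thought.count("umbrella")
best = tree_of_thought("Should I bring an umbrella?", propose, score)
```

Real implementations expand branches recursively and prune as they go, which is where the extra latency comes from: every branch is another model call.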
IV. The Limits of CoT and ToT
Both techniques only use what the model already knows. If you need company‑specific facts or up‑to‑date details, they might sound confident while being confidently wrong (which is called hallucinating by the way!). Fine‑tuning or stuffing dozens of examples in the prompt isn’t always practical.
V. Retrieval‑Augmented Generation (RAG)
RAG fetches relevant documents (FAQs, policies, emails, docs) and hands them to the model with your question. The result: grounded, auditable answers without retraining. It’s like giving the model open‑book privileges. At this point, we’re moving beyond pure prompting into something a bit more complex.

For instance, if we need to respond to an email, we simply need to query an external dataset to identify the most similar emails and gather additional information about the context of the email. Efficient and reliable.
One gotcha: retrieval usually happens before reasoning. If the initial query misses something, the retrieved context will be incomplete, and the model has no way to go back for more.
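The pipeline is: embed, retrieve the most similar documents, and prepend them as context. Here’s a minimal sketch where naive word overlap stands in for real embedding similarity (a vector store and an embedding model would replace `retrieve` in practice):

```python
def retrieve(query, documents, top_k=2):
    """Rank documents by word overlap with the query (a crude stand-in
    for embedding similarity) and return the top_k matches."""
    query_words = set(query.lower().split())
    def overlap(doc):
        return len(query_words & set(doc.lower().split()))
    return sorted(documents, key=overlap, reverse=True)[:top_k]

def build_rag_prompt(query, documents):
    """Prepend the retrieved documents as grounding context."""
    context = "\n".join(retrieve(query, documents))
    return (f"Answer the question using only the context below.\n\n"
            f"Context:\n{context}\n\nQuestion: {query}")

docs = ["Refunds are accepted within 30 days.", "Office hours are 9 to 5."]
prompt = build_rag_prompt("When are refunds accepted?", docs)
```

Note that `retrieve` runs exactly once, before the model sees anything: this is the single-shot limitation mentioned above.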
VI. ReAct: Reasoning + Acting
Even with RAG, there’s no guarantee the model can answer: retrieval happens once, up front, so the model can’t consult external knowledge while working out the steps toward the correct response.
ReAct mixes thinking with doing. The loop is simple:
- Thought: decide the next step
- Action: call a tool (search, calculator, API, database)
- Observation: read results, then repeat or answer
🧑🏻💻 What would be my interest rate based on a 30y 450'000 mortgage?
🤖 Thought: I need to search the mortgage calculator with specific parameters for year amount and loan.
Action: Search[Mortgage Calculator, 30, 450'000]
Observation: The interest rate for the given scenario would be 1.25%.
Thought: The observation shows that for a 30-year mortgage of 450'000 dollars, the interest rate would be 1.25%.
Action: FinalAnswer[1.25%]
Combine ReAct with RAG and you get iterative reasoning that can fetch missing facts, verify assumptions, and cut down on hallucinations. In other words: less guessing, more knowing.
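The loop above can be sketched in a few lines. Here’s a minimal, illustrative version: `llm_step` stands in for a model call that emits one Thought/Action pair, and the tool registry, the `MortgageRate` tool, and the scripted replies are all hypothetical:

```python
import re

def react_loop(question, llm_step, tools, max_steps=5):
    """Alternate Thought/Action steps: parse each Action, run the named
    tool, feed the result back as an Observation, and stop at FinalAnswer."""
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        step = llm_step(transcript)          # one Thought + Action from the model
        transcript += "\n" + step
        match = re.search(r"Action:\s*(\w+)\[(.*)\]", step)
        if match is None:
            continue                         # no action emitted; ask again
        name, arg = match.group(1), match.group(2)
        if name == "FinalAnswer":
            return arg
        observation = tools[name](arg)       # call the named tool
        transcript += f"\nObservation: {observation}"
    return None                              # gave up within the step budget

# Scripted stand-in for the model, replaying the mortgage example above.
script = iter([
    "Thought: I need the mortgage calculator.\nAction: MortgageRate[30, 450000]",
    "Thought: The rate is 1.25%.\nAction: FinalAnswer[1.25%]",
])
tools = {"MortgageRate": lambda arg: "The interest rate would be 1.25%."}
answer = react_loop("What would be my interest rate?", lambda t: next(script), tools)
```

The key design point is the feedback loop: unlike RAG’s single up-front retrieval, each Observation lands back in the transcript, so the next Thought can react to what the tools actually returned.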
VII. Choosing a Technique
- Zero‑shot: fast baseline for common tasks
- Few‑shot: steer style and format with examples
- CoT: add intermediate steps for structured reasoning
- ToT: explore alternatives for hard problems
- RAG: ground answers in your knowledge base
- ReAct + RAG: reason, retrieve, and validate when accuracy really matters
VIII. Takeaways
- Start simple, layer techniques as tasks get harder
- Ground answers when correctness counts
- Use ReAct with retrieval to couple thinking with fresh facts
Prompt engineering keeps evolving, but the direction is clear: better reasoning + better access to knowledge. That’s how LLMs graduate from clever text machines to dependable teammates.