Prompt engineering vs. RAG vs. fine-tuning
Definitions:
- Prompt Engineering is a technique that involves designing prompts for natural language processing models. Well-designed prompts improve the accuracy and relevancy of responses, optimizing the performance of the model (a minimal few-shot sketch follows this list).
- Retrieval Augmented Generation (RAG) improves LLM performance by retrieving data from external sources and incorporating it into the prompt. RAG allows businesses to achieve customized solutions while maintaining data relevance and optimizing costs (see the retrieval sketch below).
- Fine-tuning retrains an existing LLM using example data, resulting in a new “custom” LLM that has been optimized for the provided examples.
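As a concrete illustration, the sketch below shows few-shot prompt engineering using the `openai` Python package (v1 client style). The model name, task, and examples are placeholders rather than a recommended configuration; an Azure OpenAI deployment would use the `AzureOpenAI` client and a deployment name instead.

```python
# Minimal few-shot prompt engineering sketch (openai>=1.0 client style).
# The model name and classification task are illustrative placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# System instructions plus a few in-context examples steer the model
# toward the desired output format without any retraining.
messages = [
    {"role": "system", "content": "Classify the sentiment of support tickets as Positive, Negative, or Neutral."},
    {"role": "user", "content": "The new dashboard is fantastic!"},
    {"role": "assistant", "content": "Positive"},
    {"role": "user", "content": "I've been waiting three days for a reply."},
    {"role": "assistant", "content": "Negative"},
    # New input to classify, following the pattern set by the examples above.
    {"role": "user", "content": "My invoice arrived, but the totals don't match my order."},
]

response = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
print(response.choices[0].message.content)
```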
Customers should start with Prompt Engineering and/or RAG before considering fine-tuning. These approaches can save significant time and cost, and in many cases will address the use case on their own. Fine-tuning should be reserved for the rare cases where Prompt Engineering and RAG have already been tested and fall short.
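The retrieval sketch below illustrates the RAG pattern under simplified assumptions: a tiny in-memory document list and a naive keyword-overlap scorer stand in for a real retriever such as Azure AI Search or a vector database, and the model name is again a placeholder.

```python
# Minimal RAG sketch: retrieve the most relevant snippets from a local
# document store and ground the prompt with them. The keyword-overlap
# scorer is a stand-in for a production retriever.
from openai import OpenAI

DOCUMENTS = [
    "Contoso support hours are 8am-6pm CET, Monday through Friday.",
    "Refunds are processed within 5 business days of approval.",
    "Enterprise customers can open severity-1 tickets 24/7.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Rank documents by naive keyword overlap with the query."""
    query_terms = set(query.lower().split())
    scored = sorted(DOCUMENTS, key=lambda d: -len(query_terms & set(d.lower().split())))
    return scored[:k]

question = "When can I reach support?"
context = "\n".join(retrieve(question))

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "Answer using only the provided context. Say 'I don't know' if the context is insufficient."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ],
)
print(response.choices[0].message.content)
```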
| Requirement | Start with | Why? |
|---|---|---|
| Steer model with a few examples | Prompt engineering | Few-shot examples are easy to craft and quick to experiment with; very low barrier to entry |
| Simple & quick implementation | Prompt engineering, RAG | Easy tooling with Azure OpenAI on Your Data, PromptFlow, LangChain |
| Improve model relevancy | RAG | Retrieve relevant information from your own datasets to insert into prompts |
| Up-to-date information | RAG | Query up-to-date information from your own databases, search engines, etc. to insert into prompts |
| Factual grounding | RAG | Ability to reference & inspect retrieved data |
| Optimize for specific tasks | Fine-tuning | Fine-tuning is great at steering your model for specific tasks, such as summarizing data in a specific format (see the training-data sketch after this table) |
| Instructions won’t fit in a prompt | Fine-tuning | Fine-tuning moves few-shot examples into the training step, but requires a larger quantity of examples for training |
| Lower costs | It depends | ⚠️ Prompt engineering & RAG have lower upfront costs, but long prompts are more expensive per call; fine-tuning training is expensive but may cut prompt length. The choice will always depend on the use case & data. |
| Complex, novel data or domains | Prompt Engineering + RAG + Fine-tuning | ⚠️ This is a high-risk area. Fine-tuning can retrain the model to recognize new domains, but RAG is needed to avoid plausible confabulations. Make sure customers don’t try to retrain for unapproved uses! |
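To make the fine-tuning rows concrete, the sketch below prepares chat-style JSONL training records, moving would-be few-shot examples into training data instead of into every prompt. The incident-summary task, file name, and examples are hypothetical, and the exact schema should be verified against the current Azure OpenAI / OpenAI fine-tuning documentation.

```python
# Sketch of preparing fine-tuning data: examples that would otherwise be
# repeated in every prompt become training records. The chat-style JSONL
# layout follows common OpenAI/Azure OpenAI fine-tuning conventions;
# confirm the exact schema against the service docs before training.
import json

SYSTEM = "Summarize the incident report as a single JSON object with 'impact' and 'root_cause' keys."

examples = [
    ("Payments API returned 500s for 40 minutes after a bad config push.",
     '{"impact": "Payments API outage, 40 minutes", "root_cause": "bad config push"}'),
    ("Search latency doubled when a cache node ran out of memory.",
     '{"impact": "Search latency doubled", "root_cause": "cache node out of memory"}'),
]

# Write one JSON object per line (JSONL), each containing a full chat exchange.
with open("train.jsonl", "w", encoding="utf-8") as f:
    for report, summary in examples:
        record = {"messages": [
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": report},
            {"role": "assistant", "content": summary},
        ]}
        f.write(json.dumps(record) + "\n")
```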