Prompt engineering vs. RAG vs. fine-tuning
Definitions:
- Prompt Engineering is a technique that involves designing prompts for natural language processing models. Well-designed prompts improve the accuracy and relevancy of responses, optimizing the performance of the model (a minimal few-shot sketch follows this list).
- Retrieval Augmented Generation (RAG) improves LLM performance by retrieving data from external sources and incorporating it into the prompt. RAG allows businesses to achieve customized solutions while maintaining data relevance and optimizing costs (see the retrieval sketch below).
- Fine-tuning retrains an existing LLM using example data, resulting in a new “custom” LLM that has been optimized for the provided examples.
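As a concrete illustration, the sketch below shows few-shot prompt engineering using the `openai` Python package (v1 client style). The model name, task, and examples are placeholders rather than a recommended configuration; an Azure OpenAI deployment would use the `AzureOpenAI` client and a deployment name instead.

```python
# Minimal few-shot prompt engineering sketch (openai>=1.0 client style).
# The model name and classification task are illustrative placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# System instructions plus a few in-context examples steer the model
# toward the desired output format without any retraining.
messages = [
    {"role": "system", "content": "Classify the sentiment of support tickets as Positive, Negative, or Neutral."},
    {"role": "user", "content": "The new dashboard is fantastic!"},
    {"role": "assistant", "content": "Positive"},
    {"role": "user", "content": "I've been waiting three days for a reply."},
    {"role": "assistant", "content": "Negative"},
    # New input to classify, following the pattern set by the examples above.
    {"role": "user", "content": "My invoice arrived, but the totals don't match my order."},
]

response = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
print(response.choices[0].message.content)
```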
Customers should start with Prompt Engineering and/or RAG before considering fine-tuning. These approaches can save significant time and cost, and in many cases will address the use case on their own. Fine-tuning should be reserved for the rare cases where Prompt Engineering and RAG have already been tested and fall short.
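The retrieval sketch below illustrates the RAG pattern under simplified assumptions: a tiny in-memory document list and a naive keyword-overlap scorer stand in for a real retriever such as Azure AI Search or a vector database, and the model name is again a placeholder.

```python
# Minimal RAG sketch: retrieve the most relevant snippets from a local
# document store and ground the prompt with them. The keyword-overlap
# scorer is a stand-in for a production retriever.
from openai import OpenAI

DOCUMENTS = [
    "Contoso support hours are 8am-6pm CET, Monday through Friday.",
    "Refunds are processed within 5 business days of approval.",
    "Enterprise customers can open severity-1 tickets 24/7.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Rank documents by naive keyword overlap with the query."""
    query_terms = set(query.lower().split())
    scored = sorted(DOCUMENTS, key=lambda d: -len(query_terms & set(d.lower().split())))
    return scored[:k]

question = "When can I reach support?"
context = "\n".join(retrieve(question))

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "Answer using only the provided context. Say 'I don't know' if the context is insufficient."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ],
)
print(response.choices[0].message.content)
```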
| Requirement | Start with | Why? |
|---|---|---|
| Steer model with a few examples | Prompt engineering | Few-shot examples are easy to craft and quick to experiment with; very low barrier to entry |
| Simple & quick implementation | Prompt engineering, RAG | Easy tooling with Azure OpenAI on Your Data, PromptFlow, LangChain |
| Improve model relevancy | RAG | Retrieve relevant information from your own datasets to insert into prompts |
| Up-to-date information | RAG | Query up-to-date information from your own databases, search engines, etc. to insert into prompts |
| Factual grounding | RAG | Ability to reference & inspect retrieved data |
| Optimize for specific tasks | Fine-tuning | Fine-tuning is great at steering your model for specific tasks, such as summarizing data in a specific format (see the training-data sketch after this table) |
| Instructions won’t fit in a prompt | Fine-tuning | Fine-tuning moves few-shot examples into the training step, but requires a larger quantity of examples for training |
| Lower costs | It depends | ⚠️ Prompt engineering & RAG have lower upfront costs, but long prompts are more expensive per call; fine-tuning training is expensive but may cut prompt length. The choice will always depend on the use case & data. |
| Complex, novel data or domains | Prompt Engineering + RAG + Fine-tuning | ⚠️ This is a high-risk area. Fine-tuning can retrain the model to recognize new domains, but RAG is needed to avoid plausible confabulations. Make sure customers don’t try to retrain for unapproved uses! |
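To make the fine-tuning rows concrete, the sketch below prepares chat-style JSONL training records, moving would-be few-shot examples into training data instead of into every prompt. The incident-summary task, file name, and examples are hypothetical, and the exact schema should be verified against the current Azure OpenAI / OpenAI fine-tuning documentation.

```python
# Sketch of preparing fine-tuning data: examples that would otherwise be
# repeated in every prompt become training records. The chat-style JSONL
# layout follows common OpenAI/Azure OpenAI fine-tuning conventions;
# confirm the exact schema against the service docs before training.
import json

SYSTEM = "Summarize the incident report as a single JSON object with 'impact' and 'root_cause' keys."

examples = [
    ("Payments API returned 500s for 40 minutes after a bad config push.",
     '{"impact": "Payments API outage, 40 minutes", "root_cause": "bad config push"}'),
    ("Search latency doubled when a cache node ran out of memory.",
     '{"impact": "Search latency doubled", "root_cause": "cache node out of memory"}'),
]

# Write one JSON object per line (JSONL), each containing a full chat exchange.
with open("train.jsonl", "w", encoding="utf-8") as f:
    for report, summary in examples:
        record = {"messages": [
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": report},
            {"role": "assistant", "content": summary},
        ]}
        f.write(json.dumps(record) + "\n")
```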