This guide focuses on success criteria that are controllable through prompt engineering.
Not every success criteria or failing eval is best solved by prompt engineering. For example, latency and cost can be sometimes more easily improved by selecting a different model.
Prompt engineering is far faster than other methods of model behavior control, such as finetuning, and can often yield leaps in performance in far less time. Here are some reasons to consider prompt engineering over finetuning:
Resource efficiency: Fine-tuning requires high-end GPUs and large memory, while prompt engineering only needs text input, making it much more resource-friendly.
Cost-effectiveness: For cloud-based AI services, fine-tuning incurs significant costs. Prompt engineering uses the base model, which is typically cheaper.
Maintaining model updates: When providers update models, fine-tuned versions might need retraining. Prompts usually work across versions without changes.
Time-saving: Fine-tuning can take hours or even days. In contrast, prompt engineering provides nearly instantaneous results, allowing for quick problem-solving.
Minimal data needs: Fine-tuning needs substantial task-specific, labeled data, which can be scarce or expensive. Prompt engineering works with few-shot or even zero-shot learning.
Flexibility & rapid iteration: Quickly try various approaches, tweak prompts, and see immediate results. This rapid experimentation is difficult with fine-tuning.
Domain adaptation: Easily adapt models to new domains by providing domain-specific context in prompts, without retraining.
Comprehension improvements: Prompt engineering is far more effective than finetuning at helping models better understand and utilize external content such as retrieved documents
Preserves general knowledge: Fine-tuning risks catastrophic forgetting, where the model loses general knowledge. Prompt engineering maintains the model’s broad capabilities.
Transparency: Prompts are human-readable, showing exactly what information the model receives. This transparency aids in understanding and debugging.
The prompt engineering pages in this section have been organized from most broadly effective techniques to more specialized techniques. When troubleshooting performance, we suggest you try these techniques in order, although the actual impact of each technique will depend on our use case.