What Does It Mean to Enrich a Dataset?
Enriching a dataset means adding synthetic datapoints to increase its size and diversity. This helps make your dataset more representative of real-world scenarios and improves model training. For advanced use cases and practical examples, see our Enrichment Guide.

Quick Start: How to Enrich a Dataset
Step 1: Open Enrich
Click the ✨ Enrich button to open the enrichment window.

If your dataset hasn’t been split yet, you’ll see a reminder about creating a validation set to avoid data leakage. This is not a blocker. For details, check our Best Practices.
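The reminder concerns the order of operations: if a validation set is held out before enrichment, synthetic datapoints derived from training examples cannot end up on both sides of the split. The sketch below illustrates that idea outside the product. The function name `split_before_enrichment` and its parameters are hypothetical and do not reflect how the tool performs the split.

```python
import random


def split_before_enrichment(rows, validation_fraction=0.2, seed=0):
    """Shuffle a copy of `rows` and hold out a validation slice.

    Hypothetical helper for illustration only: it shows the idea of
    splitting first, so that enrichment is applied to training rows only.
    """
    if not 0.0 < validation_fraction < 1.0:
        raise ValueError("validation_fraction must be between 0 and 1")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_val = max(1, int(len(shuffled) * validation_fraction))
    # First n_val rows become validation, the rest stay for training.
    return shuffled[n_val:], shuffled[:n_val]


rows = [f"example {i}" for i in range(10)]
train_rows, validation_rows = split_before_enrichment(rows)
# Only train_rows would be enriched; validation_rows stay untouched.
```

With the split done first, the validation rows remain original data, so evaluation is not influenced by synthetic variants of the same examples.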
Step 2: Choose Number of Datapoints
Use the slider to set how many additional datapoints you want to generate (from 10 to 10,000).
Step 3: Advanced Settings (Optional)
You can guide the enrichment process with additional parameters:
- Creativity (Temperature) – Controls the trade-off between diversity and consistency.
  - Higher values → more diverse but less predictable results.
  - Lower values → more consistent and relevant results.
- User Instructions – Add custom instructions to control style, tone, or constraints.
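To build intuition for the Creativity setting, here is a minimal sketch of how temperature is commonly applied when sampling from a generative model: scores are divided by the temperature before being turned into probabilities. This is the general technique, not a description of this product's internals, and the function name and example scores are made up for illustration.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities, scaled by a temperature.

    Illustrative only: the enrichment tool's actual sampling is not
    documented here.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate outputs
low = softmax_with_temperature(logits, 0.5)   # sharper distribution
high = softmax_with_temperature(logits, 2.0)  # flatter distribution
```

At the lower temperature most of the probability sits on the top-scoring candidate, which matches the "more consistent" behavior. At the higher temperature the probabilities even out, so less likely candidates are picked more often.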

Writing Effective Instructions
For best results, consider the following:
- Be specific about output format and structure.
- Provide examples of desired outputs.
- Define acceptable boundaries and tone.
- Include domain-specific terminology.
- State the purpose of augmentation clearly.
- Indicate diversity needs (e.g., vary sentence structure).
- Set limits on length, complexity, or style.
- Explain how to handle edge cases.
Instead of: “Generate more customer service responses”
Try: “Generate professional customer service responses about shipping delays, using a sympathetic tone, offering specific solutions, and keeping responses 50–75 words long.”

For deeper use cases, refer to the Enrichment Guide.
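If you write instructions for several datasets, it can help to assemble them from the same ingredients each time. The sketch below is one possible way to do that. `build_instruction` and its parameters are hypothetical and not part of the product; the resulting string is simply text you could paste into the User Instructions field.

```python
def build_instruction(task, topic, tone, requirement, min_words, max_words):
    """Compose a specific instruction from a few checklist items.

    Hypothetical helper: covers purpose (task, topic), tone,
    a content requirement, and length limits.
    """
    if min_words > max_words:
        raise ValueError("min_words must not exceed max_words")
    return (
        f"Generate {task} about {topic}, using a {tone} tone, "
        f"{requirement}, and keeping responses "
        f"{min_words} to {max_words} words long."
    )


instruction = build_instruction(
    task="professional customer service responses",
    topic="shipping delays",
    tone="sympathetic",
    requirement="offering specific solutions",
    min_words=50,
    max_words=75,
)
print(instruction)
```

Keeping each ingredient as a separate argument makes it harder to forget one, such as the length limit, when adapting the instruction to a new topic.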
Step 4: Review Results
Once enrichment is complete, review the new datapoints that were generated.
Make sure the outputs meet your expectations before moving on.
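If your instructions set measurable limits, such as the 50–75 word range in the example above, part of the review can be checked mechanically after exporting the results. The sketch below assumes the generated datapoints are available as a plain list of strings. The function names are hypothetical, and the check complements reading the outputs rather than replacing it.

```python
def within_word_limits(text, min_words=50, max_words=75):
    """Return True if the whitespace-separated word count is in range."""
    return min_words <= len(text.split()) <= max_words


def flag_for_review(datapoints, min_words=50, max_words=75):
    """Return the indices of datapoints that fall outside the limits."""
    return [
        index
        for index, text in enumerate(datapoints)
        if not within_word_limits(text, min_words, max_words)
    ]


# Made-up stand-ins for generated datapoints.
generated = [
    " ".join(["sorry"] * 60),  # 60 words: inside the range
    "Too short.",              # 2 words: outside the range
]
flagged = flag_for_review(generated)
```

Here `flagged` contains the index of the second datapoint, which you would then inspect manually or regenerate.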
