Enrich a Dataset
Learn how to enrich a dataset with synthetic data.
What Does It Mean To Enrich a Dataset?
Enriching a dataset means adding synthetically generated datapoints to it. This is useful for increasing the size of your dataset and making it more representative of the real world.
Quick Start: How To Enrich a Dataset
Click ✨ Enrich to get started
The Enrich window will open.
Select The Number of Datapoints You Want To Generate
You can specify the number of datapoints you want to generate.
Use the slider to add anywhere from 10 to 1000 additional datapoints to your dataset.
Change The Creativity
Adjust the creativity setting to control how diverse the generated datapoints are.
- Higher values produce more diverse but potentially less relevant results.
- Lower values produce more relevant but potentially less diverse results.
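The trade-off above resembles how a sampling temperature works in many text generators. This guide does not state how the creativity setting is implemented, so the following Python sketch is only an illustration under that assumption. The function name, the scores, and the temperature values are all hypothetical.

```python
import math


def softmax_with_temperature(scores, temperature):
    """Turn raw scores into probabilities after rescaling them by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate outputs (assumption, not product data).
scores = [2.0, 1.0, 0.1]

low = softmax_with_temperature(scores, 0.5)   # sharper: the top candidate dominates
high = softmax_with_temperature(scores, 2.0)  # flatter: less likely candidates get more weight

print([round(p, 3) for p in low])
print([round(p, 3) for p in high])
```

With the lower value, most of the probability goes to the highest-scoring candidate, which mirrors "more relevant but less diverse". With the higher value, the probabilities even out, which mirrors "more diverse but potentially less relevant".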
Add Instructions For The Augmentation (Optional) and Start the Data Augmentation
You can add instructions for the augmentation to guide the AI on how to generate the data.
To Write Effective AI Augmentation Instructions, Consider These Suggestions:
- Be specific about the desired output format and structure
- Include examples of desired transformations
- Define boundaries (what’s acceptable vs. not acceptable)
- Specify tone, style, and voice requirements
- Include domain-specific terminology or conventions
- State the purpose of the augmentation to align with goals
- Indicate data diversity requirements (e.g., “vary sentence structure”)
- Provide clear constraints on length, complexity, or technical level
- Mention how to handle edge cases or unusual inputs
- Request consistent formatting for easier processing
For instance, rather than “Generate more customer service responses,” try: “Generate 5 additional professional customer service responses that address the customer’s concern about shipping delays, use a sympathetic tone, offer specific solutions, and keep responses between 50-75 words.”
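If you write instructions for many datasets, it can help to assemble them from the same few parts each time. The Python sketch below is one possible way to do that. `build_instructions` and its parameters are hypothetical helpers, not part of the product, and the resulting text is simply pasted into the instructions field.

```python
def build_instructions(task, tone=None, constraints=(), min_words=None, max_words=None):
    """Compose one instruction string from a task, a tone, extra constraints, and a length range."""
    parts = [task.strip().rstrip(".") + "."]
    if tone:
        parts.append(f"Use a {tone} tone.")
    for constraint in constraints:
        parts.append(constraint.strip().rstrip(".") + ".")
    if min_words is not None and max_words is not None:
        parts.append(f"Keep each response between {min_words} and {max_words} words.")
    return " ".join(parts)


instructions = build_instructions(
    task="Generate professional customer service responses that address "
         "the customer's concern about shipping delays",
    tone="sympathetic",
    constraints=["Offer specific solutions"],
    min_words=50,
    max_words=75,
)
print(instructions)
```

Keeping the task, tone, constraints, and length as separate arguments makes it easier to notice when one of the suggestions above has been left out.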
Check Your Results
You will see the number of datapoints that were generated.
Review the generated datapoints before moving on to the next step.
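For larger batches, part of the review can be scripted. The Python sketch below assumes you have the generated datapoints available as plain text strings; this guide does not describe an export format, so that is an assumption. `review_datapoints` is a hypothetical helper that flags exact duplicates and texts outside a word-count range.

```python
def review_datapoints(generated, existing=(), min_words=50, max_words=75):
    """Split generated texts into kept and flagged (duplicate or wrong length)."""
    seen = {text.strip().lower() for text in existing}
    kept, flagged = [], []
    for text in generated:
        key = text.strip().lower()
        word_count = len(text.split())
        if key in seen:
            flagged.append((text, "duplicate"))
        elif not min_words <= word_count <= max_words:
            flagged.append((text, f"length {word_count} words"))
        else:
            kept.append(text)
        seen.add(key)  # later copies of this text count as duplicates
    return kept, flagged


# Tiny illustrative sample with a relaxed length range (made-up data).
sample = ["We are sorry for the delay.", "We are sorry for the delay.", "Thanks!"]
kept, flagged = review_datapoints(sample, min_words=3, max_words=10)
print(kept)
print(flagged)
```

A check like this only catches mechanical problems. Relevance and tone still need a manual read.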
Next Steps
Now that you have a larger dataset, you can use it to train your model.