What Does It Mean To Enrich a Dataset?

Enriching a dataset means adding synthetic, AI-generated datapoints to it. This increases the size of your dataset and can make it more representative of the real world.

Quick Start: How To Enrich a Dataset

1

Click ✨ Enrich to get started

The Enrich window opens.

2

Select The Number of Datapoints You Want To Generate

You can specify the number of datapoints you want to generate.

Use the slider to add anywhere from 10 to 1000 additional datapoints to your dataset.

3

Change The Creativity

Adjust the creativity setting to control how much the generated datapoints may vary from your existing data.

  • Higher values produce more diverse but potentially less relevant results.

  • Lower values produce more relevant but potentially less diverse results.
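The trade-off above can be illustrated with a small sketch. It assumes the creativity setting behaves like a sampling temperature, which this guide does not confirm; the function name, the scores, and the temperature values are all made up for illustration.

```python
import math

def softmax_with_temperature(scores, temperature):
    """Turn raw scores into probabilities; temperature must be > 0."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate outputs (most to least likely).
scores = [2.0, 1.0, 0.1]

low = softmax_with_temperature(scores, 0.5)   # assumed "low creativity"
high = softmax_with_temperature(scores, 2.0)  # assumed "high creativity"

print([round(p, 2) for p in low])
print([round(p, 2) for p in high])
```

With the lower value, almost all of the probability sits on the top candidate, so outputs stay close to the most likely result. With the higher value, the probabilities are spread more evenly, so less likely candidates appear more often.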

4

Add Instructions For The Augmentation (Optional) and Start the Data Augmentation

You can add instructions for the augmentation to guide the AI on how to generate the data.

To Write Effective AI Augmentation Instructions, Consider These Suggestions:

  1. Be specific about the desired output format and structure
  2. Include examples of desired transformations
  3. Define boundaries (what’s acceptable vs. not acceptable)
  4. Specify tone, style, and voice requirements
  5. Include domain-specific terminology or conventions
  6. State the purpose of the augmentation to align with goals
  7. Indicate data diversity requirements (e.g., “vary sentence structure”)
  8. Provide clear constraints on length, complexity, or technical level
  9. Mention how to handle edge cases or unusual inputs
  10. Request consistent formatting for easier processing

For instance, rather than “Generate more customer service responses,” try: “Generate 5 additional professional customer service responses that address the customer’s concern about shipping delays, use a sympathetic tone, offer specific solutions, and keep responses between 50-75 words.”
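If you write instructions often, it may help to assemble them from the same building blocks each time. The following is a minimal sketch, not part of the product: `build_instructions` and its parameters are hypothetical names, and the resulting text is simply pasted into the instructions field.

```python
def build_instructions(purpose, tone, constraints, example=None):
    """Assemble augmentation instructions from reusable parts (hypothetical helper)."""
    parts = [f"Purpose: {purpose}", f"Tone: {tone}"]
    # One line per constraint keeps boundaries easy to scan and edit.
    parts.extend(f"Constraint: {c}" for c in constraints)
    if example:
        parts.append(f"Example: {example}")
    return "\n".join(parts)

instructions = build_instructions(
    purpose="Generate professional customer service responses about shipping delays",
    tone="sympathetic",
    constraints=[
        "offer specific solutions",
        "keep responses between 50 and 75 words",
    ],
)
print(instructions)
```

Keeping purpose, tone, and constraints separate makes it easier to change one requirement without rewriting the whole instruction.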

5

Check Your Results

You will see the number of datapoints that were generated.

Review the generated datapoints for quality before moving on to the next step.
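Part of this review can be automated if you work with the results outside the app. The sketch below assumes the datapoints are available as plain Python strings, which this guide does not describe; `review_generated` and its word limits are hypothetical. It flags exact duplicates (ignoring case and surrounding whitespace) and entries outside a word-count range.

```python
def review_generated(existing, generated, min_words=50, max_words=75):
    """Split generated texts into kept and rejected (hypothetical helper)."""
    seen = {text.strip().lower() for text in existing}
    kept, rejected = [], []
    for text in generated:
        key = text.strip().lower()
        n_words = len(text.split())
        if key in seen:
            rejected.append((text, "duplicate"))
        elif not (min_words <= n_words <= max_words):
            rejected.append((text, "length"))
        else:
            seen.add(key)  # later copies of this text count as duplicates
            kept.append(text)
    return kept, rejected

existing = ["Your order is delayed."]
generated = [
    "your order is delayed.",
    "We are sorry for the delay.",
    "Sorry.",
    "We are sorry for the delay.",
]
# Small word limits here only so the short sample texts can pass.
kept, rejected = review_generated(existing, generated, min_words=3, max_words=8)
print(kept)
print([reason for _, reason in rejected])
```

Automated checks like these only catch mechanical problems. Relevance and factual accuracy still need a manual read.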

Next Steps

Now that you have a larger dataset, you can use it to train your model.