Get Started With Evaluations
Learn how to evaluate a model in Prem.
Start Here
You can only evaluate a model if you have a snapshot of your dataset.
Create a New Evaluation
To create a new evaluation, you need to have a fine-tuned model. If you don't have a fine-tuned model, you can create one by following the steps in the Fine-Tuning section.
You need to fill out the following fields:
- Evaluation Prompt: The prompt to use for the evaluation.
Example of an Evaluation Prompt
Convert medical reports into ICD-10 codes.
You will be given a textual medical report. The model must return a list of ICD-10 codes.
Evaluate:
Are the codes valid and from the ICD-10 standard? Do the codes align with the diagnosis mentioned in the report? Are there unnecessary or incorrect codes predicted?
- Dataset: The dataset to use for the evaluation.
- Snapshot: The snapshot of the dataset to use for the evaluation.
- Models: You can select multiple models to evaluate.
Once you have filled out the fields, click the "Create Evaluation" button.
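The three questions in the example evaluation prompt above (valid codes, codes that match the diagnosis, unnecessary codes) can also be approximated locally when you want a quick sanity check on a model's output. The following is a minimal sketch, not part of Prem: `check_prediction` and the sample codes are hypothetical, and the regular expression only checks the general shape of an ICD-10 code, not whether the code exists in the official list.

```python
import re

# Shape check only (assumption): a letter, a digit, an alphanumeric character,
# then an optional dot followed by 1 to 4 alphanumeric characters.
# This does NOT confirm that a code exists in the ICD-10 standard.
ICD10_SHAPE = re.compile(r"^[A-Z][0-9][0-9A-Z](\.[0-9A-Z]{1,4})?$")


def check_prediction(predicted, expected):
    """Compare predicted codes against the reference codes for one report."""
    predicted_set = {code.strip().upper() for code in predicted}
    expected_set = {code.strip().upper() for code in expected}
    return {
        # Codes that do not even look like ICD-10 codes.
        "malformed": sorted(c for c in predicted_set if not ICD10_SHAPE.match(c)),
        # Codes present in both the prediction and the reference.
        "matched": sorted(predicted_set & expected_set),
        # Predicted codes that the reference does not contain.
        "unexpected": sorted(predicted_set - expected_set),
        # Reference codes the model failed to predict.
        "missing": sorted(expected_set - predicted_set),
    }


# Hypothetical model output and reference answer for a single report.
report = check_prediction(["E11.9", "I10", "XYZ"], ["E11.9", "J45.909"])
print(report)
```

A check like this only covers the mechanical part of the prompt. Whether a code truly aligns with the diagnosis in the report still depends on the evaluation itself.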
Check Your Results
Once you have created an evaluation, open it to check your results, then click any datapoint to see more details.
Here are Some Results to Keep in Mind:
- Average Score: The average score of the evaluation based on the fine-tuned model.
- Datapoint: The user message from the dataset.
- Model Name: The name of the model the evaluation was run on.
- System Prompt: The system prompt used for the evaluation.
- User Message: The user message from the dataset.
- Original Assistant Response: The original assistant message from the dataset.
- Model Score: The score of the model chosen for the evaluation.
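To make the relationship between the per-datapoint Model Score and the Average Score concrete, here is a minimal sketch of how such an average could be computed per model. It assumes the average is a plain arithmetic mean of datapoint scores, which this page does not confirm. The row layout, model names, and the 0 to 1 score scale are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def average_scores(rows):
    """Group datapoint scores by model and return the mean score per model."""
    scores_by_model = defaultdict(list)
    for row in rows:
        scores_by_model[row["model"]].append(row["score"])
    return {model: mean(scores) for model, scores in scores_by_model.items()}


# Hypothetical results: two models scored on the same two datapoints.
rows = [
    {"model": "base-model", "datapoint": "report-1", "score": 0.25},
    {"model": "base-model", "datapoint": "report-2", "score": 0.75},
    {"model": "fine-tuned-model", "datapoint": "report-1", "score": 0.5},
    {"model": "fine-tuned-model", "datapoint": "report-2", "score": 1.0},
]

averages = average_scores(rows)
print(averages)
```

Comparing averages side by side like this is useful when several models were selected for the same evaluation, while the per-datapoint view shows where a given model lost points.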
Next Step: Fine-Tuned Model - Playground
Fine-Tuned Model - Playground
Click here to learn how to use your fine-tuned model in the playground.