Traditional deterministic metrics used for language models (such as token accuracy) are insufficient for evaluating LLMs. Therefore, we define several abstract evaluation concepts:
  • Correctness: Measures the factual accuracy of the response
  • Conciseness: Measures how concise and to-the-point the response is without unnecessary information
  • Hallucination: Measures whether the response contains made-up or fabricated information
Image of the metrics
Each metric has its own defined rules. Deterministic metrics such as token accuracy have code-based rules, while metrics like those above are governed by natural-language rules called Rubrics.
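To make the distinction concrete, here is a minimal sketch contrasting the two rule types. The function and the rubric structure below are illustrative assumptions, not the platform's internal representation:

```python
def token_accuracy(predicted: list[str], reference: list[str]) -> float:
    """Code-based rule: deterministic, computed directly from tokens."""
    if not reference:
        return 0.0
    return sum(p == r for p, r in zip(predicted, reference)) / len(reference)

# Rubric-based metric: natural-language rules, evaluated by an LLM judge
# rather than by code. Field names here are hypothetical.
correctness_rubric = {
    "name": "Correctness",
    "rules": [
        "The response must be factually accurate.",
        "The response must not contradict the provided context.",
    ],
}
```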

Evaluation Rules

Each metric's rubric is made up of rules the LLM should follow and rules it should not follow, and you can edit both. Hover over a rule and click the pencil icon to edit it. When you're done, click the checkmark icon to save your changes.
Gif of changing evaluation rules
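As a rough illustration, the two directions of rules could be represented like this; the structure and wording are hypothetical examples, not values taken from the product:

```python
# Hypothetical edited rubric for the Conciseness metric, split into
# behaviors the LLM should follow and behaviors it should avoid.
conciseness_rubric = {
    "should_follow": [
        "Answer the question directly in the first sentence.",
        "Keep the response focused on the information requested.",
    ],
    "should_not_follow": [
        "Repeating the question back to the user.",
        "Padding the answer with unrelated background detail.",
    ],
}
```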

Create Your Own Metrics

You can create your own metrics by clicking on the Create Metric button.
  • Give your metric a name
  • Write a description for your metric
  • Click Generate Rules to generate the rules for your metric
Gif of creating a new metric
Now, when you create a new evaluation, you can select your custom metric from the list.
Image of selecting a custom metric
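For intuition, here is a sketch of what a Generate Rules step could look like if implemented as an LLM call. The prompt, model name, and parsing are assumptions for illustration, not the platform's actual implementation:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def generate_rules(name: str, description: str) -> list[str]:
    """Ask an LLM to draft evaluation rules from a metric name and description."""
    prompt = (
        f"Write evaluation rules for a metric named '{name}'.\n"
        f"Description: {description}\n"
        "Return one rule per line."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    text = response.choices[0].message.content or ""
    # Strip bullet characters the model may have added to each line.
    return [line.strip("-• ").strip() for line in text.splitlines() if line.strip()]

rules = generate_rules(
    "Tone", "Measures whether the response stays professional and polite."
)
```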

Check out these other resources to learn more about evaluations: