Performance Report
- Updated on 16 Jan 2025
This article applies to these versions of LandingLens:
LandingLens | LandingLens on Snowflake |
---|---|
✓ | ✓ |
Clicking an evaluation set score on the Models page opens the Performance Report tab (you can also click a model on the Models page and then select this tab).
This report shows how the model performed on the selected evaluation set (not on the entire dataset). You can select a different evaluation set from the Evaluation Set drop-down menu.
The bottom part of the report compares the ground truth (your labels) to the model's predictions. You can filter by prediction type (False Positive, False Negative, Mis-Classification, and Correct Predictions) and sort by model performance.
The Performance Report and Build Tab May Have Different Results
The results in the Performance Report might be different from the results in the Build tab. This is because the Performance Report is based on a specific version of a dataset; the images and labels in that version never change.
However, the results on the Build tab are “live” and might change based on any updates to images or labels.
For example, let’s say that you train a model and create an evaluation set based on the dataset currently in the Build tab, and then add more images and labels. The Build tab now reflects the updated dataset, so its performance and results differ from the Performance Report, as shown in the screenshots below.
Adjust Threshold
If you have an Object Detection or Segmentation project and want to see how the model performs on the evaluation set with a different Confidence Threshold, click Adjust Threshold and select a different score.
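The sketch below is a rough illustration of what changing the threshold does, not the LandingLens implementation: predictions whose confidence falls below the selected threshold are dropped before the metrics are recalculated. The prediction list and values are made up for the example.

```python
# Illustrative only: raising the confidence threshold removes low-confidence
# predictions, so the same model can score differently at different thresholds.

predictions = [
    {"label": "Scratch", "confidence": 0.91},
    {"label": "Scratch", "confidence": 0.62},
    {"label": "Dent", "confidence": 0.35},
]

def apply_threshold(preds, threshold):
    """Keep only predictions whose confidence meets or exceeds the threshold."""
    return [p for p in preds if p["confidence"] >= threshold]

print(len(apply_threshold(predictions, 0.5)))  # 2 predictions remain
print(len(apply_threshold(predictions, 0.7)))  # 1 prediction remains
```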
Overall Score for the Evaluation Set
The Performance Report includes a score for the evaluation set (and not for the entire dataset). The type of score depends on the project type:
Object Detection and Classification: F1 Score
The Performance Report includes the F1 score for Object Detection and Classification projects.
For Object Detection, the F1 score combines precision and recall into a single score, creating a unified measure that assesses the model’s effectiveness in minimizing false positives and false negatives. A higher F1 score indicates the model is balancing the two factors well. LandingLens uses micro-averaging to calculate the F1 score.
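As a rough illustration of micro-averaging (not LandingLens code), true positives, false positives, and false negatives are summed across all classes before precision, recall, and F1 are computed. The counts below are made up.

```python
# Illustrative micro-averaged F1: pool TP/FP/FN across classes, then compute
# precision, recall, and F1 from the pooled counts.
per_class_counts = {
    "Scratch": {"tp": 8, "fp": 2, "fn": 1},
    "Dent":    {"tp": 5, "fp": 1, "fn": 3},
}

tp = sum(c["tp"] for c in per_class_counts.values())
fp = sum(c["fp"] for c in per_class_counts.values())
fn = sum(c["fn"] for c in per_class_counts.values())

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
print(round(precision, 3), round(recall, 3), round(f1, 3))
```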
For Classification, the F1, Precision, and Recall scores are identical. This is because Classification models have only two prediction outcomes: "Correct" and "Misclassified". Therefore, the F1, Precision, and Recall scores for Classification models are all calculated using the same algorithm.
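Because there are only two outcomes, the shared score presumably reduces to the number of correct predictions divided by the total number of predictions; for single-label classification, micro-averaged Precision, Recall, and F1 all equal this fraction. A minimal sketch with made-up outcomes:

```python
# Hedged sketch: each misclassified image counts as both a false positive (for
# the predicted class) and a false negative (for the true class), so the
# micro-averaged F1, Precision, and Recall all equal correct / total.
outcomes = ["Correct", "Correct", "Misclassified", "Correct"]

score = outcomes.count("Correct") / len(outcomes)
print(score)  # 0.75 for this made-up list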
Segmentation: Intersection Over Union (IoU)
The Performance Report includes the Intersection over Union (IoU) score for Segmentation projects.
Intersection over Union (IoU) measures the accuracy of the model by quantifying the overlap between the predicted and actual masks in an image. A higher IoU indicates better agreement between the ground truth and the predicted mask. LandingLens does not include the implicit background class or micro-averaging in the IoU calculation.
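A minimal sketch of the IoU calculation on binary masks (the masks below are made up; this is not the LandingLens implementation):

```python
# IoU = overlapping area / combined area of the predicted and ground-truth masks.
import numpy as np

ground_truth = np.array([[1, 1, 0],
                         [1, 1, 0],
                         [0, 0, 0]], dtype=bool)
prediction = np.array([[0, 1, 1],
                       [0, 1, 1],
                       [0, 0, 0]], dtype=bool)

intersection = np.logical_and(ground_truth, prediction).sum()
union = np.logical_or(ground_truth, prediction).sum()
print(intersection / union)  # 2 / 6 ≈ 0.33
```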
Download CSV of Evaluation Set
For Object Detection and Classification projects, click Download CSV to download a CSV of information about the images in the evaluation set. The CSV includes several data points for each image, including the labels ("ground truth") and the model's predictions.
CSV Data for Evaluation Set
The CSV includes the information described in the following table.
Item | Description | Example |
---|---|---|
Image ID | Unique ID assigned to the image. | 30243316 |
Image Name | The file name of the image uploaded to LandingLens. | sample_003.jpg |
Image Path | The URL where the image is stored. | s3://path/123/abc.jpg |
Model ID | Unique ID assigned to the model. | a3c5e461-0786-4b17-b0a8-9a4bfb8c1460 |
Model Name | The name of the model in LandingLens. | Model-06-04-2024_5 |
GT_Class | The classes you assigned to the image (ground truth or “GT”). For Object Detection, this also includes the number of objects you labeled. | {"Scratch":3} |
PRED_Class | The classes the model predicted. For Object Detection, this also includes the number of objects predicted. If the model didn't predict any objects, the value is {"null":1}. | {"Scratch":2} |
Model_Correct | TRUE if the model's prediction matched the original label (ground truth or “GT”); FALSE if it didn't. Only applicable to Classification projects. | TRUE |
PRED_Confidence | The model's confidence score for its prediction. Only applicable to Classification projects. | 0.9987245 |
GT-PRED JSON | The JSON output comparing the original labels (ground truth or "GT") to the model's predictions. For more information, go to JSON Output. | {"gtDefectName":"No Fire","predDefectName":"No Fire","predConfidence":0.9684047} |
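The snippet below is a hedged sketch of working with the downloaded file in pandas. The file name is made up, and the column names are taken from the table above, so they may not match the actual CSV headers exactly; the GT-PRED JSON keys follow the Classification-style example in the table.

```python
# Hedged sketch: load the evaluation-set CSV and expand the GT-PRED JSON column.
import json
import pandas as pd

df = pd.read_csv("evaluation_set.csv")  # hypothetical file name

for _, row in df.iterrows():
    gt_pred = json.loads(row["GT-PRED JSON"])  # column name assumed from the table
    print(row["Image Name"], gt_pred["gtDefectName"], "->", gt_pred["predDefectName"])
```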