Evaluation¶
We describe below the evaluation metrics used for each of the tasks. We will soon add a link here to the GitHub repository with the evaluation code. More information on the evaluation can also be found in the challenge design document.
Important notes¶
- For the segmentation predictions, the expected output is a 3D image with voxel intensities in the range 0 to 1. Any output that does not meet this requirement will not be considered in the evaluation, and the default worst results will be assigned to that case.
- To compute the evaluation measures, the segmentation outputs will be binarised for all metrics except the measures of volume difference, for which a clipped version of the output will be used.
- The uncertainty measures (Task 3) will not be evaluated during the validation phase.
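The output requirements above can be illustrated with a minimal sketch. It is not the official evaluation code: the function names are hypothetical, the binarisation threshold of 0.5 is an assumption (the text does not state the value), and clipping to the range 0 to 1 is likewise assumed because the clipping bounds are not given.

```python
import numpy as np

# Assumed threshold; the challenge text does not state the value used.
THRESHOLD = 0.5


def is_valid_prediction(pred):
    """Return True if pred is a non-empty 3D array with finite values in [0, 1]."""
    pred = np.asarray(pred, dtype=float)
    if pred.ndim != 3 or pred.size == 0:
        return False
    if not np.all(np.isfinite(pred)):
        return False
    return bool(pred.min() >= 0.0 and pred.max() <= 1.0)


def binarise(pred, threshold=THRESHOLD):
    """Boolean mask of voxels at or above the (assumed) threshold."""
    return np.asarray(pred, dtype=float) >= threshold


def clip_prediction(pred):
    """Clip values to [0, 1]; the bounds are an assumption of this sketch."""
    return np.clip(np.asarray(pred, dtype=float), 0.0, 1.0)
```

In this sketch, a prediction that fails `is_valid_prediction` would be the case that receives the default worst results, the binarised mask would feed the overlap and detection metrics, and the clipped output would feed the volume difference.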
Evaluation code online¶
The evaluation code for all tasks can be found in the Where is Valdo GitHub repository (the code will soon be updated to also include the uncertainty metrics).
Description of evaluation measures per task¶
Task 1: Enlarged perivascular spaces detection and segmentation
For the detection and segmentation of enlarged perivascular spaces, the following metrics will be used:
- Dice Similarity Coefficient (DSC), category: Volumetry
- Absolute Volume Difference, category: Volumetry
- Detection F1, category: Detection
- Absolute Count Difference, category: Detection
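As an illustration of the two volumetry metrics listed above, here is a minimal sketch using their standard definitions. It is not the official evaluation code: the function names and the `voxel_volume` parameter are hypothetical, and returning a DSC of 1.0 when both masks are empty is an assumed convention.

```python
import numpy as np


def dice_coefficient(pred_bin, ref_bin):
    """DSC = 2 * |P and R| / (|P| + |R|) on binary masks."""
    pred_bin = np.asarray(pred_bin, dtype=bool)
    ref_bin = np.asarray(ref_bin, dtype=bool)
    denom = pred_bin.sum() + ref_bin.sum()
    if denom == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    overlap = np.logical_and(pred_bin, ref_bin).sum()
    return float(2.0 * overlap / denom)


def absolute_volume_difference(pred_soft, ref_bin, voxel_volume=1.0):
    """|V_pred - V_ref|, with V_pred taken from the clipped (non-binarised) output."""
    pred_vol = np.asarray(pred_soft, dtype=float).sum() * voxel_volume
    ref_vol = np.asarray(ref_bin, dtype=float).sum() * voxel_volume
    return float(abs(pred_vol - ref_vol))
```

The DSC takes binarised masks, whereas the volume difference takes the clipped output, in line with the important notes on this page.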
Task 2: Cerebral microbleeds detection and segmentation
For the detection and segmentation of cerebral microbleeds, the following metrics will be used:
- Dice Similarity Coefficient (DSC), category: Volumetry
- Absolute Volume Difference, category: Volumetry
- Detection F1, category: Detection
- Absolute Count Difference, category: Detection
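The two detection metrics listed above can be sketched on connected components as follows. This is an illustration under stated assumptions, not the official evaluation code: the matching rule (a reference element counts as detected if any predicted voxel overlaps it), the face connectivity used by default in `scipy.ndimage.label`, and the value of 1.0 returned when both masks are empty are all assumptions; the challenge may use a different matching criterion.

```python
import numpy as np
from scipy import ndimage


def detection_scores(pred_bin, ref_bin):
    """Return (detection F1, absolute count difference) for two binary masks."""
    # Label connected components (default structure: face connectivity).
    pred_lab, n_pred = ndimage.label(np.asarray(pred_bin, dtype=bool))
    ref_lab, n_ref = ndimage.label(np.asarray(ref_bin, dtype=bool))

    # Voxels where a predicted and a reference component overlap.
    overlap = (pred_lab > 0) & (ref_lab > 0)
    matched_ref = np.unique(ref_lab[overlap])
    matched_pred = np.unique(pred_lab[overlap])

    tp = len(matched_ref)            # reference elements that were detected
    fn = n_ref - tp                  # reference elements that were missed
    fp = n_pred - len(matched_pred)  # predicted elements touching no reference

    denom = 2 * tp + fp + fn
    # Assumed convention: nothing to find and nothing predicted scores 1.0.
    f1 = 1.0 if denom == 0 else 2.0 * tp / denom
    return float(f1), abs(n_pred - n_ref)
```

For example, a prediction with three components, of which one overlaps one of two reference elements, gives one true positive, two false positives and one false negative under this matching rule.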
Task 3: Lacunes detection, segmentation and uncertainty
For the detection and segmentation of lacunes, the following evaluation metrics will be used:
- Dice Similarity Coefficient (DSC), category: Volumetry
- Absolute Volume Difference, category: Volumetry
- Detection F1, category: Detection
- Uncertainty Dice Similarity Coefficient (DSC), category: Uncertainty
- Uncertainty Detection F1, category: Uncertainty