Using the solaris CLI to score model performance¶
Once you have generated your model predictions and converted predictions to vector format, you’ll be ready to score your predictions! Let’s go through a test case for some “predictions” from the SpaceNet 4 dataset. Just to show you what those look like:
Ground truth and prediction data formats¶
Predictions¶
[2]:
import pandas as pd
import solaris as sol
import os
preds = pd.read_csv(os.path.join(sol.data.data_dir, 'sample_preds_competition.csv'))
preds.head()
[2]:
| | ImageId | BuildingId | PolygonWKT_Pix | Confidence |
|---|---|---|---|---|
| 0 | Atlanta_nadir8_catid_10300100023BC100_743501_3... | 0 | POLYGON ((0.00 712.83, 158.37 710.28, 160.59 6... | 1 |
| 1 | Atlanta_nadir8_catid_10300100023BC100_743501_3... | 1 | POLYGON ((665.82 0.00, 676.56 1.50, 591.36 603... | 1 |
| 2 | Atlanta_nadir8_catid_10300100023BC100_743501_3... | 0 | POLYGON ((182.62 324.15, 194.25 323.52, 197.97... | 1 |
| 3 | Atlanta_nadir8_catid_10300100023BC100_743501_3... | 1 | POLYGON ((92.99 96.94, 117.20 99.64, 114.72 12... | 1 |
| 4 | Atlanta_nadir8_catid_10300100023BC100_743501_3... | 2 | POLYGON ((0.82 29.96, 3.48 40.71, 2.80 51.00, ... | 1 |
The file shows the image ID, the polygon geometry in WKT format, and a BuildingId
counter to distinguish between buildings in a single image. The Confidence
field in this case has no meaning, but can be provided if desired.
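If you are assembling a proposal file yourself, a minimal sketch looks like the following. The image ID and polygon here are made-up placeholders, not real SpaceNet values:

```python
import pandas as pd

# Build one prediction row in the competition CSV format shown above.
# PolygonWKT_Pix holds the footprint geometry as a WKT string in pixel coordinates.
wkt = 'POLYGON ((0 0, 10 0, 10 10, 0 10, 0 0))'
preds = pd.DataFrame({
    'ImageId': ['example_image_chip'],   # placeholder image ID
    'BuildingId': [0],                   # per-image building counter
    'PolygonWKT_Pix': [wkt],
    'Confidence': [1],                   # may be a dummy value, as noted above
})
preds.to_csv('my_preds.csv', index=False)
```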
Ground Truth¶
[3]:
truth = pd.read_csv(os.path.join(sol.data.data_dir, 'sample_truth_competition.csv'))
truth.head()
[3]:
| | ImageId | BuildingId | PolygonWKT_Pix | PolygonWKT_Geo |
|---|---|---|---|---|
| 0 | Atlanta_nadir8_catid_10300100023BC100_743501_3... | 0 | POLYGON ((476.88 884.61, 485.59 877.64, 490.50... | 1 |
| 1 | Atlanta_nadir8_catid_10300100023BC100_743501_3... | 1 | POLYGON ((459.45 858.97, 467.41 853.09, 463.37... | 1 |
| 2 | Atlanta_nadir8_catid_10300100023BC100_743501_3... | 2 | POLYGON ((407.34 754.17, 434.90 780.55, 420.27... | 1 |
| 3 | Atlanta_nadir8_catid_10300100023BC100_743501_3... | 3 | POLYGON ((311.00 760.22, 318.38 746.78, 341.02... | 1 |
| 4 | Atlanta_nadir8_catid_10300100023BC100_743501_3... | 4 | POLYGON ((490.49 742.67, 509.81 731.14, 534.12... | 1 |
The ground truth file has more or less the same format. So, how does scoring work?
Scoring functions in the solaris CLI¶
Once you have installed solaris, you will have access to the spacenet_eval command in your command line prompt. This command has a number of possible arguments to control scoring, described below. If you need a refresher on these within your command line, you can always run spacenet_eval -h for usage instructions.
- --proposal_csv, -p: [str] The full path to a CSV-formatted proposal file containing the same columns shown above.
- --truth_csv, -t: [str] The full path to a CSV-formatted ground truth file containing the same columns shown above.
- --challenge, -c: [str, one of ('off-nadir', 'spacenet-buildings2')] The challenge being scored. Because the SpaceNet Off-Nadir Building Footprint Extraction Challenge was scored slightly differently from previous challenges to accommodate the different look angles, the challenge type must be specified here.
- --output_file, -o: [str] The path to the output files to be saved. Two files will be saved: the summary file with the name provided in this argument, and one with '_full' added before the '.csv' extension, which contains the image-by-image breakdown of scores.
Assuming you have the two files shown above as your examples:
$ spacenet_eval --proposal_csv /path/to/sample_preds_competition.csv --truth_csv /path/to/sample_truth_competition.csv --challenge 'off-nadir' --output_file /path/to/outputs.csv
Let’s look at what the outputs would look like:
Summary¶
[4]:
result_summary = pd.read_csv(os.path.join(sol.data.data_dir, 'competition_test_results.csv'))
result_summary
[4]:
| | F1Score | FalseNeg | FalsePos | Precision | Recall | TruePos |
|---|---|---|---|---|---|---|
| 0 | 1.0 | 0 | 0 | 1.0 | 1.0 | 2319 |
In this case, the score is perfect because the predictions and ground truth were literally identical.
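For reference, the summary metrics relate to the counts in the usual way; a quick sketch using the numbers from the table above:

```python
# True positives, false positives, and false negatives from the summary table.
tp, fp, fn = 2319, 0, 0

precision = tp / (tp + fp)                          # fraction of proposals that match truth
recall = tp / (tp + fn)                             # fraction of truth buildings found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
print(precision, recall, f1)  # 1.0 1.0 1.0
```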
Here’s the image-by-image breakout:
Detailed results¶
[6]:
full_result = pd.read_csv(os.path.join(sol.data.data_dir, 'competition_test_results_full.csv'))
full_result.head()
[6]:
| | F1Score | FalseNeg | FalsePos | Precision | Recall | TruePos | imageID | iou_field | nadir-category |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.0 | 0 | 0 | 1.0 | 1.0 | 80 | Atlanta_nadir8_catid_10300100023BC100_743501_3... | iou_score | Nadir |
| 1 | 1.0 | 0 | 0 | 1.0 | 1.0 | 112 | Atlanta_nadir8_catid_10300100023BC100_743501_3... | iou_score | Nadir |
| 2 | 1.0 | 0 | 0 | 1.0 | 1.0 | 72 | Atlanta_nadir8_catid_10300100023BC100_743501_3... | iou_score | Nadir |
| 3 | 1.0 | 0 | 0 | 1.0 | 1.0 | 1 | Atlanta_nadir8_catid_10300100023BC100_743501_3... | iou_score | Nadir |
| 4 | 1.0 | 0 | 0 | 1.0 | 1.0 | 52 | Atlanta_nadir8_catid_10300100023BC100_743501_3... | iou_score | Nadir |
These are the first five rows of the full result file; each row gives the scores for a single image chip.
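Because the full results are per-image, you can aggregate them however you like with pandas. A sketch, using a tiny made-up DataFrame with the same column names as the detailed results table (the real file would come from pd.read_csv on your '_full' output):

```python
import pandas as pd

# Stand-in for the detailed results file; values here are illustrative only.
full_result = pd.DataFrame({
    'F1Score': [1.0, 1.0, 0.5],
    'TruePos': [80, 112, 10],
    'nadir-category': ['Nadir', 'Nadir', 'Off-nadir'],
})

# Mean F1 score per look-angle category.
by_angle = full_result.groupby('nadir-category')['F1Score'].mean()
```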