Using the solaris CLI to score model performance

Once you have generated your model predictions and converted them to vector format, you're ready to score them! Let's go through a test case for some "predictions" from the SpaceNet 4 dataset, starting with what those files look like:

Ground truth and prediction data formats

Predictions

[2]:
import pandas as pd
import solaris as sol
import os

preds = pd.read_csv(os.path.join(sol.data.data_dir, 'sample_preds_competition.csv'))
preds.head()
[2]:
ImageId BuildingId PolygonWKT_Pix Confidence
0 Atlanta_nadir8_catid_10300100023BC100_743501_3... 0 POLYGON ((0.00 712.83, 158.37 710.28, 160.59 6... 1
1 Atlanta_nadir8_catid_10300100023BC100_743501_3... 1 POLYGON ((665.82 0.00, 676.56 1.50, 591.36 603... 1
2 Atlanta_nadir8_catid_10300100023BC100_743501_3... 0 POLYGON ((182.62 324.15, 194.25 323.52, 197.97... 1
3 Atlanta_nadir8_catid_10300100023BC100_743501_3... 1 POLYGON ((92.99 96.94, 117.20 99.64, 114.72 12... 1
4 Atlanta_nadir8_catid_10300100023BC100_743501_3... 2 POLYGON ((0.82 29.96, 3.48 40.71, 2.80 51.00, ... 1

The file contains the image ID, a BuildingId counter to distinguish between buildings within a single image, the polygon geometry in WKT format, and a Confidence value. The Confidence field has no meaning in this case, but can be provided if desired.
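If you'd like to inspect those footprints in Python, the PolygonWKT_Pix strings can be parsed with shapely (installed alongside solaris as a dependency). A minimal sketch, assuming the preds DataFrame loaded above:

from shapely import wkt

# Parse the first prediction's footprint from its WKT string
footprint = wkt.loads(preds.loc[0, 'PolygonWKT_Pix'])
print(footprint.area)    # area in square pixels, since these are pixel coordinates
print(footprint.bounds)  # (minx, miny, maxx, maxy) bounding box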

Ground Truth

[3]:
truth = pd.read_csv(os.path.join(sol.data.data_dir, 'sample_truth_competition.csv'))
truth.head()
[3]:
ImageId BuildingId PolygonWKT_Pix PolygonWKT_Geo
0 Atlanta_nadir8_catid_10300100023BC100_743501_3... 0 POLYGON ((476.88 884.61, 485.59 877.64, 490.50... 1
1 Atlanta_nadir8_catid_10300100023BC100_743501_3... 1 POLYGON ((459.45 858.97, 467.41 853.09, 463.37... 1
2 Atlanta_nadir8_catid_10300100023BC100_743501_3... 2 POLYGON ((407.34 754.17, 434.90 780.55, 420.27... 1
3 Atlanta_nadir8_catid_10300100023BC100_743501_3... 3 POLYGON ((311.00 760.22, 318.38 746.78, 341.02... 1
4 Atlanta_nadir8_catid_10300100023BC100_743501_3... 4 POLYGON ((490.49 742.67, 509.81 731.14, 534.12... 1

More or less the same thing. So, how does scoring work?

Scoring functions in the solaris CLI

Once you have installed solaris, the spacenet_eval command will be available at your command line prompt. This command takes a handful of arguments to control scoring, described below. If you need a refresher at the command line, you can always run spacenet_eval -h for usage instructions.

  • --proposal_csv, -p: [str] The full path to a CSV-formatted proposal file containing the same columns shown above.

  • --truth_csv, -t: [str] The full path to a CSV-formatted ground truth file containing the same columns shown above.

  • --challenge, -c: [str, one of ('off-nadir', 'spacenet-buildings2')] The challenge being scored. Because the SpaceNet Off-Nadir Building Footprint Extraction Challenge was scored slightly differently from previous challenges to accommodate the different look angles, the challenge type must be specified here.

  • --output_file, -o: [str] The path to the output files to be saved. Two files will be saved: the summary file with the name provided in this argument, and one with '_full' added before the '.csv' extension, which contains the image-by-image breakdown of scores.

Assuming you have the two example files shown above, the command looks like this:

$ spacenet_eval --proposal_csv /path/to/sample_preds_competition.csv --truth_csv /path/to/sample_truth_competition.csv --challenge 'off-nadir' --output_file /path/to/outputs.csv
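The same comparison can also be run from Python via solaris's evaluation module. A rough sketch, assuming the off-nadir challenge scorer accepts the two CSV paths and returns the summary and image-by-image DataFrames (check the solaris API docs for the exact signature):

import solaris as sol

# Score proposals against ground truth using the off-nadir challenge rules.
# Assumed here: off_nadir_buildings() takes the proposal and truth CSV paths
# and returns (summary DataFrame, image-by-image DataFrame).
summary, full = sol.eval.challenges.off_nadir_buildings(
    '/path/to/sample_preds_competition.csv',
    '/path/to/sample_truth_competition.csv')
print(summary)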

Let's see what the outputs look like:

Summary

[4]:
result_summary = pd.read_csv(os.path.join(sol.data.data_dir, 'competition_test_results.csv'))
result_summary
[4]:
F1Score FalseNeg FalsePos Precision Recall TruePos
0 1.0 0 0 1.0 1.0 2319

In this case, the score is perfect because the predictions and ground truth were literally identical.
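Under the hood, the SpaceNet building challenges count a proposal polygon as a true positive when it overlaps a ground truth polygon with an IoU above a 0.5 threshold; unmatched proposals are false positives and unmatched ground truth buildings are false negatives. The summary metrics then follow the standard definitions, which you can verify from the counts above:

# Recompute the summary metrics from the true/false positive and negative counts
tp, fp, fn = 2319, 0, 0

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
print(precision, recall, f1)  # 1.0 1.0 1.0, matching the summary row above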

Here's the image-by-image breakdown:

Detailed results

[6]:
full_result = pd.read_csv(os.path.join(sol.data.data_dir, 'competition_test_results_full.csv'))
full_result.head()
[6]:
F1Score FalseNeg FalsePos Precision Recall TruePos imageID iou_field nadir-category
0 1.0 0 0 1.0 1.0 80 Atlanta_nadir8_catid_10300100023BC100_743501_3... iou_score Nadir
1 1.0 0 0 1.0 1.0 112 Atlanta_nadir8_catid_10300100023BC100_743501_3... iou_score Nadir
2 1.0 0 0 1.0 1.0 72 Atlanta_nadir8_catid_10300100023BC100_743501_3... iou_score Nadir
3 1.0 0 0 1.0 1.0 1 Atlanta_nadir8_catid_10300100023BC100_743501_3... iou_score Nadir
4 1.0 0 0 1.0 1.0 52 Atlanta_nadir8_catid_10300100023BC100_743501_3... iou_score Nadir

These are five rows from the full result file, where each row indicates the scores for a single image chip.
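Because the full file keeps per-chip counts, you can re-aggregate it however suits your analysis. For example, a short sketch (using only the column names shown above) that rolls the counts up by look-angle category and recomputes the metrics:

# Sum the per-image counts within each look-angle category, then recompute the metrics
by_angle = full_result.groupby('nadir-category')[['TruePos', 'FalsePos', 'FalseNeg']].sum()
by_angle['Precision'] = by_angle['TruePos'] / (by_angle['TruePos'] + by_angle['FalsePos'])
by_angle['Recall'] = by_angle['TruePos'] / (by_angle['TruePos'] + by_angle['FalseNeg'])
by_angle['F1Score'] = (2 * by_angle['Precision'] * by_angle['Recall']
                       / (by_angle['Precision'] + by_angle['Recall']))
print(by_angle)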