Using the solaris CLI to score model performance¶
Once you have generated your model predictions and converted predictions to vector format, you’ll be ready to score your predictions! Let’s go through a test case for some “predictions” from the SpaceNet 4 dataset. Just to show you what those look like:
Ground truth and prediction data formats¶
Predictions¶
[2]:
import pandas as pd
import solaris as sol
import os
preds = pd.read_csv(os.path.join(sol.data.data_dir, 'sample_preds_competition.csv'))
preds.head()
[2]:
| | ImageId | BuildingId | PolygonWKT_Pix | Confidence |
|---|---|---|---|---|
| 0 | Atlanta_nadir8_catid_10300100023BC100_743501_3... | 0 | POLYGON ((0.00 712.83, 158.37 710.28, 160.59 6... | 1 |
| 1 | Atlanta_nadir8_catid_10300100023BC100_743501_3... | 1 | POLYGON ((665.82 0.00, 676.56 1.50, 591.36 603... | 1 |
| 2 | Atlanta_nadir8_catid_10300100023BC100_743501_3... | 0 | POLYGON ((182.62 324.15, 194.25 323.52, 197.97... | 1 |
| 3 | Atlanta_nadir8_catid_10300100023BC100_743501_3... | 1 | POLYGON ((92.99 96.94, 117.20 99.64, 114.72 12... | 1 |
| 4 | Atlanta_nadir8_catid_10300100023BC100_743501_3... | 2 | POLYGON ((0.82 29.96, 3.48 40.71, 2.80 51.00, ... | 1 |
The file shows the image ID, the polygon geometry in WKT format, and a BuildingId
counter to distinguish between buildings in a single image. The Confidence
field in this case has no meaning, but can be provided if desired.
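If you are assembling a proposal file yourself, a minimal sketch looks like the following. The image ID and polygon here are made-up placeholders, not real SpaceNet values:

```python
import pandas as pd

# Build one prediction row in the competition CSV format shown above.
# PolygonWKT_Pix holds the footprint geometry as a WKT string in pixel coordinates.
wkt = 'POLYGON ((0 0, 10 0, 10 10, 0 10, 0 0))'
preds = pd.DataFrame({
    'ImageId': ['example_image_chip'],   # placeholder image ID
    'BuildingId': [0],                   # per-image building counter
    'PolygonWKT_Pix': [wkt],
    'Confidence': [1],                   # may be a dummy value, as noted above
})
preds.to_csv('my_preds.csv', index=False)
```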
Ground Truth¶
[3]:
truth = pd.read_csv(os.path.join(sol.data.data_dir, 'sample_truth_competition.csv'))
truth.head()
[3]:
| | ImageId | BuildingId | PolygonWKT_Pix | PolygonWKT_Geo |
|---|---|---|---|---|
| 0 | Atlanta_nadir8_catid_10300100023BC100_743501_3... | 0 | POLYGON ((476.88 884.61, 485.59 877.64, 490.50... | 1 |
| 1 | Atlanta_nadir8_catid_10300100023BC100_743501_3... | 1 | POLYGON ((459.45 858.97, 467.41 853.09, 463.37... | 1 |
| 2 | Atlanta_nadir8_catid_10300100023BC100_743501_3... | 2 | POLYGON ((407.34 754.17, 434.90 780.55, 420.27... | 1 |
| 3 | Atlanta_nadir8_catid_10300100023BC100_743501_3... | 3 | POLYGON ((311.00 760.22, 318.38 746.78, 341.02... | 1 |
| 4 | Atlanta_nadir8_catid_10300100023BC100_743501_3... | 4 | POLYGON ((490.49 742.67, 509.81 731.14, 534.12... | 1 |
The ground truth file has more or less the same format. So, how does scoring work?
Scoring functions in the solaris CLI¶
Once you have installed solaris, you will have access to the spacenet_eval command in your command line prompt. This command has a number of possible arguments to control scoring, described below. If you need a refresher on these within your command line, you can always run spacenet_eval -h for usage instructions.
- --proposal_csv, -p: [str] The full path to a CSV-formatted proposal file containing the same columns shown above.
- --truth_csv, -t: [str] The full path to a CSV-formatted ground truth file containing the same columns shown above.
- --challenge, -c: [str, one of ('off-nadir', 'spacenet-buildings2')] The challenge being scored. Because the SpaceNet Off-Nadir Building Footprint Extraction Challenge was scored slightly differently from previous challenges to accommodate the different look angles, the challenge type must be specified here.
- --output_file, -o: [str] The path to the output files to be saved. Two files will be saved: the summary file with the name provided in this argument, and one with '_full' added before the '.csv' extension, which contains the image-by-image breakdown of scores.
Assuming you have the two files shown above as your examples:
$ spacenet_eval --proposal_csv /path/to/sample_preds_competition.csv --truth_csv /path/to/sample_truth_competition.csv --challenge 'off-nadir' --output_file /path/to/outputs.csv
Let’s look at what the outputs would look like:
Summary¶
[4]:
result_summary = pd.read_csv(os.path.join(sol.data.data_dir, 'competition_test_results.csv'))
result_summary
[4]:
| | F1Score | FalseNeg | FalsePos | Precision | Recall | TruePos |
|---|---|---|---|---|---|---|
| 0 | 1.0 | 0 | 0 | 1.0 | 1.0 | 2319 |
In this case, the score is perfect because the predictions and ground truth were literally identical.
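For reference, the summary metrics relate to the counts in the usual way; a quick sketch using the numbers from the table above:

```python
# True positives, false positives, and false negatives from the summary table.
tp, fp, fn = 2319, 0, 0

precision = tp / (tp + fp)                          # fraction of proposals that match truth
recall = tp / (tp + fn)                             # fraction of truth buildings found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
print(precision, recall, f1)  # 1.0 1.0 1.0
```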
Here’s the image-by-image breakout:
Detailed results¶
[6]:
full_result = pd.read_csv(os.path.join(sol.data.data_dir, 'competition_test_results_full.csv'))
full_result.head()
[6]:
| | F1Score | FalseNeg | FalsePos | Precision | Recall | TruePos | imageID | iou_field | nadir-category |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.0 | 0 | 0 | 1.0 | 1.0 | 80 | Atlanta_nadir8_catid_10300100023BC100_743501_3... | iou_score | Nadir |
| 1 | 1.0 | 0 | 0 | 1.0 | 1.0 | 112 | Atlanta_nadir8_catid_10300100023BC100_743501_3... | iou_score | Nadir |
| 2 | 1.0 | 0 | 0 | 1.0 | 1.0 | 72 | Atlanta_nadir8_catid_10300100023BC100_743501_3... | iou_score | Nadir |
| 3 | 1.0 | 0 | 0 | 1.0 | 1.0 | 1 | Atlanta_nadir8_catid_10300100023BC100_743501_3... | iou_score | Nadir |
| 4 | 1.0 | 0 | 0 | 1.0 | 1.0 | 52 | Atlanta_nadir8_catid_10300100023BC100_743501_3... | iou_score | Nadir |
These are the first five rows of the full result file; each row gives the scores for a single image chip.
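Because the full results are per-image, you can aggregate them however you like with pandas. A sketch, using a tiny made-up DataFrame with the same column names as the detailed results table (the real file would come from pd.read_csv on your '_full' output):

```python
import pandas as pd

# Stand-in for the detailed results file; values here are illustrative only.
full_result = pd.DataFrame({
    'F1Score': [1.0, 1.0, 0.5],
    'TruePos': [80, 112, 10],
    'nadir-category': ['Nadir', 'Nadir', 'Off-nadir'],
})

# Mean F1 score per look-angle category.
by_angle = full_result.groupby('nadir-category')['F1Score'].mean()
```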