Scoring model performance with the solaris Python API

This tutorial describes how to run evaluation of a proposal (CSV or .geojson) for a single chip against a ground truth (CSV or .geojson) for the same chip.

CSV Eval


  1. Imports

  2. Load ground truth CSV

  3. Load proposal CSV

  4. Perform evaluation


For this test case we will use the eval submodule within solaris.

# imports
import os
import solaris as sol
from solaris.data import data_dir
import pandas as pd  # just for visualizing the outputs

Load ground truth CSV

We will first instantiate an Evaluator() object, which is the core class eval uses for comparing predicted labels to ground truth labels. Evaluator() takes one argument: the path to the CSV or .geojson file containing the ground truth labels. Alternatively, it accepts a pre-loaded GeoDataFrame of ground truth label geometries.

ground_truth_path = os.path.join(data_dir, 'sample_truth.csv')

evaluator = sol.eval.base.Evaluator(ground_truth_path)
Evaluator sample_truth.csv

At this point, evaluator has the following attributes:

  • ground_truth_fname: the filename corresponding to the ground truth data. This is simply 'GeoDataFrame' if a GDF was passed during instantiation.

  • ground_truth_GDF: GeoDataFrame-formatted geometries for the ground truth polygon labels.

  • ground_truth_GDF_Edit: A deep copy of evaluator.ground_truth_GDF, which is edited during the process of matching ground truth label polygons to proposals.

  • ground_truth_sindex: The RTree/libspatialindex spatial index for rapid spatial referencing.

  • proposal_GDF: An empty GeoDataFrame instantiated to hold proposals later.

Load proposal CSV

Next we will load the proposal CSV file. Note that the proposalCSV flag must be set to True for CSV data. If the CSV contains one or more columns that indicate confidence in the proposals, the column names should be passed as a list of strings with the conf_field_list argument; because no such column exists in this case, we will simply pass conf_field_list=[]. There are additional arguments available (see the method documentation) which can be used for multi-class problems; those will be covered in another recipe. The defaults suffice for single-class problems.

proposals_path = os.path.join(data_dir, 'sample_preds.csv')
evaluator.load_proposal(proposals_path, proposalCSV=True, conf_field_list=[])

Perform evaluation

Evaluation steps through the proposal polygons in evaluator.proposal_GDF one at a time and checks whether any polygon in evaluator.ground_truth_GDF_Edit has an intersection over union (IoU) > miniou (see the method documentation) with that proposal polygon. If at least one does, the proposal polygon is scored as a true positive, and the matched ground truth polygon with the highest IoU (in case several had IoU > miniou) is removed from evaluator.ground_truth_GDF_Edit so that it cannot be matched against another proposal. If no ground truth polygon matches with IoU > miniou, the proposal polygon is scored as a false positive. After all proposal polygons have been processed, any ground truth polygons remaining in evaluator.ground_truth_GDF_Edit are scored as false negatives.
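The matching procedure described above can be sketched in plain Python. This is only an illustration, not the solaris implementation: it uses axis-aligned boxes instead of arbitrary polygons so that IoU can be computed without a geometry library, and the names box_iou and match_proposals, as well as the 0.5 default for miniou, are assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Simplification: each "polygon" is an axis-aligned box (xmin, ymin, xmax, ymax).
Box = Tuple[float, float, float, float]


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_proposals(proposals: List[Box], ground_truth: List[Box],
                    miniou: float = 0.5) -> Dict[str, int]:
    """Greedy matching: hypothetical helper, miniou default is an assumption."""
    remaining = list(ground_truth)  # editable copy, like ground_truth_GDF_Edit
    true_pos = 0
    false_pos = 0
    for prop in proposals:
        ious = [box_iou(prop, gt) for gt in remaining]
        # Index of the best-overlapping ground truth box, or None if none are left.
        best = max(range(len(ious)), key=ious.__getitem__, default=None)
        if best is not None and ious[best] > miniou:
            true_pos += 1
            del remaining[best]  # a ground truth box can only be matched once
        else:
            false_pos += 1
    # Whatever was never matched counts as a false negative.
    return {'TruePos': true_pos, 'FalsePos': false_pos, 'FalseNeg': len(remaining)}


truth = [(0, 0, 10, 10), (20, 20, 30, 30)]
preds = [(0, 0, 10, 10), (50, 50, 60, 60)]
print(match_proposals(preds, truth))
# {'TruePos': 1, 'FalsePos': 1, 'FalseNeg': 1}
```

The first proposal coincides with a ground truth box and is counted as a true positive; the second overlaps nothing and is a false positive, which leaves one ground truth box unmatched as a false negative.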

There are several additional arguments to this method related to multi-class evaluation which will be covered in a later recipe. See the method documentation for usage.

The evaluation returns a list of dicts, one for each class evaluated (only one dict in this single-class case). Each dict has the following keys:

  • 'class_id': The class being scored in the dict, 'all' for single-class scoring.

  • 'iou_field': The name of the column in evaluator.proposal_GDF that holds the IoU score for this class. See the method documentation for more information.

  • 'TruePos': The number of polygons in evaluator.proposal_GDF that matched a polygon in evaluator.ground_truth_GDF_Edit.

  • 'FalsePos': The number of polygons in evaluator.proposal_GDF that had no match in evaluator.ground_truth_GDF_Edit.

  • 'FalseNeg': The number of polygons in evaluator.ground_truth_GDF_Edit that had no match in evaluator.proposal_GDF.

  • 'Precision': The precision statistic for IoU between the proposals and the ground truth polygons.

  • 'Recall': The recall statistic for IoU between the proposals and the ground truth polygons.

  • 'F1Score': Also known as the SpaceNet Metric, the F1 score for IoU between the proposals and the ground truth polygons.
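The three summary statistics follow from the counts by the standard definitions: precision = TP / (TP + FP), recall = TP / (TP + FN), and F1 is the harmonic mean of the two. A minimal sketch follows; the function name precision_recall_f1 is hypothetical, and returning 0.0 when a denominator is zero is an assumption of this example rather than documented solaris behavior.

```python
def precision_recall_f1(true_pos: int, false_pos: int, false_neg: int):
    """Standard precision, recall and F1 from match counts (illustrative helper)."""
    proposed = true_pos + false_pos  # all proposal polygons
    actual = true_pos + false_neg    # all ground truth polygons
    # Assumption: fall back to 0.0 when a denominator would be zero.
    precision = true_pos / proposed if proposed else 0.0
    recall = true_pos / actual if actual else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# 151 matches with no false positives or false negatives is a perfect score.
print(precision_recall_f1(151, 0, 0))
# (1.0, 1.0, 1.0)
```

When the number of false positives equals the number of false negatives, precision, recall and F1 all coincide, since both denominators are the same.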

evaluator.eval_iou(calculate_class_scores=False)
151it [00:00, 153.44it/s]
[{'class_id': 'all',
  'iou_field': 'iou_score_all',
  'TruePos': 151,
  'FalsePos': 0,
  'FalseNeg': 0,
  'Precision': 1.0,
  'Recall': 1.0,
  'F1Score': 1.0}]

In this case, the score is perfect because the polygons in the ground truth CSV and the proposal CSV are identical. At this point, a new proposal CSV can be loaded (for example, for a new nadir angle at the same chip location) and scoring can be repeated.
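Because the result is a plain list of dicts, it can be tabulated with pandas for easier reading, which is what the pandas import is for. The snippet below is a small sketch: the variable name results is hypothetical, and its contents are copied by hand from the output printed above rather than produced by running the evaluation.

```python
import pandas as pd

# Hand-copied from the evaluation output above; in practice this would be
# the list returned by the evaluation call.
results = [{'class_id': 'all',
            'iou_field': 'iou_score_all',
            'TruePos': 151,
            'FalsePos': 0,
            'FalseNeg': 0,
            'Precision': 1.0,
            'Recall': 1.0,
            'F1Score': 1.0}]

# One row per evaluated class, one column per key.
scores = pd.DataFrame(results)
print(scores[['class_id', 'TruePos', 'FalsePos', 'FalseNeg', 'F1Score']])
```

With multi-class results, the same call would give one row per class, which makes scores easy to compare side by side.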

GeoJSON Eval

The same operation can be completed with .geojson-formatted ground truth and proposal files. See the example below, and see the detailed explanation above for a description of each step’s operations.

ground_truth_geojson = os.path.join(data_dir, 'gt.geojson')
proposal_geojson = os.path.join(data_dir, 'pred.geojson')

evaluator = sol.eval.base.Evaluator(ground_truth_geojson)
evaluator.load_proposal(proposal_geojson, proposalCSV=False, conf_field_list=[])
evaluator.eval_iou(calculate_class_scores=False)
28it [00:00, 76.20it/s]
[{'class_id': 'all',
  'iou_field': 'iou_score_all',
  'TruePos': 8,
  'FalsePos': 20,
  'FalseNeg': 20,
  'Precision': 0.2857142857142857,
  'Recall': 0.2857142857142857,
  'F1Score': 0.2857142857142857}]

(Note that this example uses a different chip location and a different proposal than the CSV example, hence the difference in scores.)