{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Using the `solaris` CLI to score model performance\n", "\n", "Once you have [generated your model predictions](cli_ml_pipeline.ipynb) and [converted predictions to vector format](api_mask_to_vector.ipynb), you'll be ready to score your predictions! Let's walk through a test case using sample \"predictions\" from the SpaceNet 4 dataset. First, here's what those files look like:\n", "\n", "## Ground truth and prediction data formats\n", "\n", "#### Predictions" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " ImageId BuildingId \\\n", "0 Atlanta_nadir8_catid_10300100023BC100_743501_3... 0 \n", "1 Atlanta_nadir8_catid_10300100023BC100_743501_3... 1 \n", "2 Atlanta_nadir8_catid_10300100023BC100_743501_3... 0 \n", "3 Atlanta_nadir8_catid_10300100023BC100_743501_3... 1 \n", "4 Atlanta_nadir8_catid_10300100023BC100_743501_3... 2 \n", "\n", " PolygonWKT_Pix Confidence \n", "0 POLYGON ((0.00 712.83, 158.37 710.28, 160.59 6... 1 \n", "1 POLYGON ((665.82 0.00, 676.56 1.50, 591.36 603... 1 \n", "2 POLYGON ((182.62 324.15, 194.25 323.52, 197.97... 1 \n", "3 POLYGON ((92.99 96.94, 117.20 99.64, 114.72 12... 1 \n", "4 POLYGON ((0.82 29.96, 3.48 40.71, 2.80 51.00, ... 1 " ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pandas as pd\n", "import solaris as sol\n", "import os\n", "\n", "preds = pd.read_csv(os.path.join(sol.data.data_dir, 'sample_preds_competition.csv'))\n", "preds.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The file shows the image ID, the polygon geometry in WKT format, and a `BuildingId` counter to distinguish between buildings in a single image. The `Confidence` field in this case has no meaning, but can be provided if desired.\n", "\n", "#### Ground Truth" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
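Both the prediction and ground truth files store building footprints as WKT polygon strings in pixel coordinates (`PolygonWKT_Pix`). If you want to inspect those geometries directly, they can be parsed with `shapely`, which `solaris` itself depends on. A minimal sketch using a made-up rectangle rather than a real footprint:

```python
from shapely import wkt

# Hypothetical footprint in pixel coordinates, in the same WKT
# format as the PolygonWKT_Pix column above
footprint = wkt.loads("POLYGON ((0 0, 100 0, 100 50, 0 50, 0 0))")

print(footprint.area)    # 5000.0 -- area in square pixels
print(footprint.bounds)  # (0.0, 0.0, 100.0, 50.0)
```

The same parsing applies to the ground truth geometries shown next.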
" ], "text/plain": [ "                                             ImageId  BuildingId  \\\n", "0  Atlanta_nadir8_catid_10300100023BC100_743501_3...           0   \n", "1  Atlanta_nadir8_catid_10300100023BC100_743501_3...           1   \n", "2  Atlanta_nadir8_catid_10300100023BC100_743501_3...           2   \n", "3  Atlanta_nadir8_catid_10300100023BC100_743501_3...           3   \n", "4  Atlanta_nadir8_catid_10300100023BC100_743501_3...           4   \n", "\n", "                                      PolygonWKT_Pix PolygonWKT_Geo  \n", "0  POLYGON ((476.88 884.61, 485.59 877.64, 490.50...              1  \n", "1  POLYGON ((459.45 858.97, 467.41 853.09, 463.37...              1  \n", "2  POLYGON ((407.34 754.17, 434.90 780.55, 420.27...              1  \n", "3  POLYGON ((311.00 760.22, 318.38 746.78, 341.02...              1  \n", "4  POLYGON ((490.49 742.67, 509.81 731.14, 534.12...              1  " ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "truth = pd.read_csv(os.path.join(sol.data.data_dir, 'sample_truth_competition.csv'))\n", "truth.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The format is largely the same, with a `PolygonWKT_Geo` column in place of `Confidence`. So, how does scoring work?\n", "\n", "\n", "## Scoring functions in the `solaris` CLI\n", "\n", "Once you have [installed solaris](../../installation.html), you will have access to the `spacenet_eval` command in your command line prompt. This command has a number of possible arguments to control scoring, described below. If you need a refresher on these within your command line, you can always run `spacenet_eval -h` for usage instructions.\n", "\n", "### `spacenet_eval` arguments\n", "\n", "- __--proposal\\_csv__, __-p__: \\[str\\] The full path to a CSV-formatted proposal file containing the same columns shown above.\n", "- __--truth\\_csv__, __-t__: \\[str\\] The full path to a CSV-formatted ground truth file containing the same columns shown above.\n", "- __--challenge__, __-c__: \\[str, one of `('off-nadir', 'spacenet-buildings2')`\\] The challenge being scored.
Because the SpaceNet Off-Nadir Building Footprint Extraction Challenge was scored slightly differently from previous challenges to accommodate the different look angles, the challenge type must be specified here.\n", "- __--output\\_file__, __-o__: \\[str\\] The path where output files will be saved. Two files are written: a summary file at the path provided here, and a second file with `'_full'` inserted before the `'.csv'` extension, containing the image-by-image breakdown of scores.\n", "\n", "### `spacenet_eval` CLI usage example\n", "\n", "Assuming you have the two files shown above:\n", "\n", "\n", "```console\n", "$ spacenet_eval --proposal_csv /path/to/sample_preds_competition.csv --truth_csv /path/to/sample_truth_competition.csv --challenge 'off-nadir' --output_file /path/to/outputs.csv\n", "```\n", "\n", "The outputs look like this:\n", "\n", "#### Summary" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
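As a reminder of how the summary columns relate: precision, recall, and F1 are derived from the `TruePos`, `FalsePos`, and `FalseNeg` counts. A quick, dependency-free sketch of the arithmetic, using the counts from the sample summary in this notebook:

```python
def prf1(tp, fp, fn):
    """Object-level precision, recall, and F1 from match counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Counts from the sample summary: every proposal matches a truth polygon
print(prf1(tp=2319, fp=0, fn=0))  # (1.0, 1.0, 1.0)
```

Note this is just the standard definition of the metrics, not `solaris`'s internal implementation.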
" ], "text/plain": [ "   F1Score  FalseNeg  FalsePos  Precision  Recall  TruePos\n", "0      1.0         0         0        1.0     1.0     2319" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "result_summary = pd.read_csv(os.path.join(sol.data.data_dir, 'competition_test_results.csv'))\n", "result_summary" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this case, the score is perfect because the predictions and ground truth are identical.\n", "\n", "Here's the image-by-image breakdown:\n", "\n", "#### Detailed results" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
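One caution when working with the per-image file: a mean of per-chip F1 scores is generally not the same as an F1 computed from pooled counts, so sum the `TruePos`/`FalsePos`/`FalseNeg` columns first and recompute. A small `pandas` sketch of that aggregation, using the counts from the five sample rows shown here:

```python
import pandas as pd

# Counts mirroring the five sample rows of the '_full' results file
per_chip = pd.DataFrame({
    "TruePos":  [80, 112, 72, 1, 52],
    "FalsePos": [0, 0, 0, 0, 0],
    "FalseNeg": [0, 0, 0, 0, 0],
})

# Pool the counts across chips, then compute a single F1
tp = per_chip["TruePos"].sum()
fp = per_chip["FalsePos"].sum()
fn = per_chip["FalseNeg"].sum()
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
print(f1)  # 1.0 for these error-free counts
```

This sketch is illustrative; it is not guaranteed to match how `spacenet_eval` aggregates across nadir categories for the official challenge score.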
" ], "text/plain": [ " F1Score FalseNeg FalsePos Precision Recall TruePos \\\n", "0 1.0 0 0 1.0 1.0 80 \n", "1 1.0 0 0 1.0 1.0 112 \n", "2 1.0 0 0 1.0 1.0 72 \n", "3 1.0 0 0 1.0 1.0 1 \n", "4 1.0 0 0 1.0 1.0 52 \n", "\n", " imageID iou_field nadir-category \n", "0 Atlanta_nadir8_catid_10300100023BC100_743501_3... iou_score Nadir \n", "1 Atlanta_nadir8_catid_10300100023BC100_743501_3... iou_score Nadir \n", "2 Atlanta_nadir8_catid_10300100023BC100_743501_3... iou_score Nadir \n", "3 Atlanta_nadir8_catid_10300100023BC100_743501_3... iou_score Nadir \n", "4 Atlanta_nadir8_catid_10300100023BC100_743501_3... iou_score Nadir " ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "full_result = pd.read_csv(os.path.join(sol.data.data_dir, 'competition_test_results_full.csv'))\n", "full_result.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "These are five rows from the full result file, where each row indicates the scores for a single image chip." ] } ], "metadata": { "kernelspec": { "display_name": "solaris", "language": "python", "name": "solaris" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.7" } }, "nbformat": 4, "nbformat_minor": 2 }