Converting GeoJSON labels to COCO-formatted labels using Solaris

With solaris, you can automatically generate COCO-format .json files from GeoJSON vector labels and georegistered image files. Let’s look at a couple of examples of how to do so. Both use the solaris.data.coco.geojson2coco() function. For more information about the COCO specification, see the COCO dataset website.

Syntax

The solaris.data.coco.geojson2coco() function takes the following arguments:

  • image_src: a str or list or dict defining source image(s) to use in the dataset. These are required not only to list as part of the dataset, but also to convert georegistered labels to pixel coordinates. This argument can be:

    1. a string path to an image (e.g. `"path/to/a/geotiff.tif"`)
    2. the path to a directory containing a bunch of images (e.g. `"/path/to/geotiff/dir/"`)
    3. a list of image paths (e.g. `["path/to/geotiff_1.tif", "path/to/geotiff_2.tif"]`)
    4. a dictionary corresponding to COCO-formatted image records (e.g.
    
    [
        {
          "id": 1,
          "file_name": "path/to/geotiff.tif",
          "height": 640,
          "width": 640,
        },
        {etc.}
    ])
    
    5. a string path to a COCO JSON containing image records (e.g. `"path/to/coco_dataset.json"`)
    

    If image_src is a directory, the recursive flag will be used to determine whether or not to descend into sub-directories.

  • label_src: str or list of source labels to use in the dataset. This can be a string path to a geojson, the path to a directory containing multiple geojsons, or a list of geojson file paths. If a directory, the recursive flag will determine whether or not to descend into sub-directories.

  • output_path: an optional str path to save the JSON-formatted COCO records to. If not provided, the records will only be returned as a dict, and not saved to file.

  • image_ext: The string extension to use to identify images when searching directories. Only has an effect if image_src is a directory path. Defaults to ".tif".

  • matching_re: A regular expression pattern to match filenames between image_src and label_src if both are directories of multiple files. This has no effect if those arguments do not both correspond to directories or lists of files. If this isn’t provided, it is assumed that label filenames and image filenames differ only in their extensions, and filenames will be compared for identity to find matches.

  • category_attribute: The str name of an attribute in the geojson that specifies which category a given instance corresponds to. If not provided, it’s assumed that only one class of object is present in the dataset, which will be termed "other" in the output json.

  • preset_categories: An optional pre-set list of dicts of categories to use for labels. These categories should be formatted per the COCO category specification.

  • include_other: A boolean. If True and preset_categories is provided, objects that don’t fall into the specified categories are kept in the dataset and assigned to a category named "other" with its own associated category id. If False, objects whose categories don’t match a category from preset_categories are dropped.

  • info_dict: An optional dict with the following key-value pairs:

    • "year": int year of creation

    • "version": str version of the dataset

    • "description": str string description of the dataset

    • "contributor": str who contributed the dataset

    • "url": str URL where the dataset can be found

    • "date_created": datetime.datetime when the dataset was created

    If info_dict isn’t provided, it will be left out of the .json created by solaris.

  • license_dict: An optional dict containing the licensing information for the dataset, with the following key-value pairs:

    • "name": str the name of the license.

    • "url": str a link to the dataset’s license.

    Note: This implementation assumes that all of the data uses one license. If multiple licenses are provided, the image records will not be assigned a license ID.

  • recursive: If image_src and/or label_src are directories, setting this flag to True will induce solaris to descend into subdirectories to find files. By default, solaris does not traverse the directory tree.

  • verbose: Verbose text output. By default, none is provided; if True or 1, information-level outputs are provided; if 2, extremely verbose text is output.
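
As a rough illustration of the optional metadata arguments described above, the sketch below builds a preset_categories list and an info_dict in plain Python. The category names, ids, and metadata values are made up for the example, and the "supercategory" key comes from the COCO category specification rather than from this tutorial. The commented-out call at the end uses placeholder paths and a hypothetical attribute name.

```python
import datetime

# Hypothetical categories, formatted per the COCO category specification
# ("id", "name", "supercategory"). The names and ids are illustrative only.
preset_categories = [
    {"id": 1, "name": "building", "supercategory": "structure"},
    {"id": 2, "name": "road", "supercategory": "structure"},
]

# Hypothetical dataset metadata using the info_dict keys listed above.
info_dict = {
    "year": 2019,
    "version": "1.0",
    "description": "Example dataset converted from GeoJSON labels",
    "contributor": "Example contributor",
    "url": "https://example.com/dataset",
    "date_created": datetime.datetime(2019, 1, 1),
}

# These would then be passed by keyword, for example (placeholder paths,
# hypothetical attribute name):
# coco_dict = sol.data.coco.geojson2coco(
#     "/path/to/geotiff/dir/", "/path/to/geojson/dir/",
#     category_attribute="type",
#     preset_categories=preset_categories,
#     include_other=True,
#     info_dict=info_dict)
```

With include_other=True, objects whose attribute value matches neither preset category would end up in the extra "other" category described above.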

Examples

See the two examples below for usage of this function.

Example 1: A dataset with one image and one json (for example, untiled geospatial imagery files)

In this example, we’ll load in a single image and a single geojson. Because there’s only one file for each, labels will be converted to their pixel coordinates within the only image included. In addition, we’ll specify a property of the items in the geojson, "truncated", to separate the labels into two classes. The output won’t include any license information or info metadata, since we don’t provide either during dataset creation.

[3]:
import solaris as sol
from solaris.data import data_dir
import os
import json

sample_geojson = os.path.join(data_dir, 'geotiff_labels.geojson')
sample_image = os.path.join(data_dir, 'sample_geotiff.tif')

coco_dict = sol.data.coco.geojson2coco(sample_image, sample_geojson,
                                       category_attribute='truncated')
100%|██████████| 1/1 [00:00<00:00,  4.41it/s]

Let’s see what that looks like!

[5]:
from IPython.display import JSON
JSON(coco_dict)
[5]:
<IPython.core.display.JSON object>

In case the above doesn’t render for you, the raw text is below.

[6]:
print(coco_dict)
{'annotations': [{'id': 1, 'image_id': 1, 'category_id': 1, 'segmentation': [0.0, 2.845103836618364, 7.787239895900711, 7.813573766499758, 6.348949391860515, 21.166115891188383, 5.487595358863473, 29.24418894201517, 19.3797596283257, 37.85056554712355, 18.118415302364156, 57.70217224024236, 0.0, 54.131107677705586, 0.0, 2.845103836618364], 'area': 608.3880075917921, 'bbox': [0.0, 2.845103836618364, 19.3797596283257, 54.857068403624], 'iscrowd': 0}, {'id': 2, 'image_id': 1, 'category_id': 2, 'segmentation': [27.38481539185159, 226.1645903000608, 34.46586190746166, 226.48033855389804, 34.72251786501147, 221.01391235832125, 44.8147500208579, 221.47823364380747, 44.453276831656694, 229.49973394535482, 56.44128756551072, 230.05102432798594, 54.999366192379966, 261.5376432267949, 46.934077847748995, 267.3053462980315, 25.54191842698492, 266.33953956048936, 27.38481539185159, 226.1645903000608], 'area': 1175.2086036457465, 'bbox': [25.54191842698492, 221.01391235832125, 30.8993691385258, 46.29143393971026], 'iscrowd': 0}, {'id': 3, 'image_id': 1, 'category_id': 1, 'segmentation': [60.03418597159907, 884.8732050526887, 73.8337494416628, 900.0, 51.516283753560856, 900.0, 47.80893106292933, 895.9360736850649, 60.03418597159907, 884.8732050526887], 'area': 214.14410906402435, 'bbox': [47.80893106292933, 884.8732050526887, 26.02481837873347, 15.126794947311282], 'iscrowd': 0}, {'id': 4, 'image_id': 1, 'category_id': 2, 'segmentation': [65.83512698789127, 443.34588148258626, 86.05315328529105, 444.11831593420357, 84.12356285331771, 493.6842159954831, 63.905484846793115, 492.9117766721174, 65.83512698789127, 443.34588148258626], 'area': 1003.6164099476883, 'bbox': [63.905484846793115, 443.34588148258626, 22.14766843849793, 50.338334512896836], 'iscrowd': 0}, {'id': 5, 'image_id': 1, 'category_id': 2, 'segmentation': [87.2731370574329, 72.93001714255661, 106.98580074869096, 84.21334314905107, 97.70029512513429, 100.08772260416299, 91.15104462415911, 98.73803176078945, 
53.36824434134178, 78.81699287053198, 59.59959887806326, 67.12329848110676, 79.5907216486521, 77.6452063396573, 87.2731370574329, 72.93001714255661], 'area': 832.0140045611614, 'bbox': [53.36824434134178, 67.12329848110676, 53.617556407349184, 32.96442412305623], 'iscrowd': 0}, {'id': 6, 'image_id': 1, 'category_id': 2, 'segmentation': [87.33356586564332, 502.7506626434624, 90.79576571006328, 511.5002211164683, 93.23485574219376, 550.6385944513604, 70.7970443549566, 553.24962748494, 70.8928169994615, 544.9904495924711, 70.136441974435, 526.1424378501251, 69.54070870671421, 501.6971649955958, 87.33356586564332, 502.7506626434624], 'area': 1055.98129961667, 'bbox': [69.54070870671421, 501.6971649955958, 23.694147035479546, 51.55246248934418], 'iscrowd': 0}, {'id': 7, 'image_id': 1, 'category_id': 2, 'segmentation': [89.06576380180195, 561.6383464429528, 91.58933665603399, 579.8661990063265, 94.85523002082482, 592.7489638356492, 93.41836131247692, 606.9227179316804, 85.23654213640839, 610.96199104283, 73.38412117748521, 610.6515495320782, 71.78515014378354, 605.9850031817332, 72.83866719854996, 588.2692635813728, 72.19266184815206, 561.761005807668, 89.06576380180195, 561.6383464429528], 'area': 965.97420241684, 'bbox': [71.78515014378354, 561.6383464429528, 23.07007987704128, 49.3236445998773], 'iscrowd': 0}, {'id': 8, 'image_id': 1, 'category_id': 2, 'segmentation': [73.42513769492507, 652.7116207033396, 73.87162248673849, 640.5596289457753, 73.96795534505509, 634.6088027460501, 89.67092586541548, 635.2249299148098, 95.65740334033035, 635.5673428568989, 95.31320596928708, 642.0125183537602, 90.37084288243204, 643.375933191739, 86.37565372907557, 643.2291440004483, 83.40025364165194, 643.7899634344503, 78.97129776002839, 645.6513671381399, 77.39582740026526, 652.6148558100685, 73.42513769492507, 652.7116207033396], 'area': 226.76552441704672, 'bbox': [73.42513769492507, 634.6088027460501, 22.232265645405278, 18.102817957289517], 'iscrowd': 0}, {'id': 9, 'image_id': 
1, 'category_id': 2, 'segmentation': [104.26538560027257, 379.3509592106566, 95.71053624199703, 432.8294323347509, 83.6166091109626, 427.17569901794195, 71.57171053020284, 424.2952412031591, 74.16042645578273, 415.48699966818094, 82.98187345149927, 415.6049499800429, 84.05773156136274, 402.6163706937805, 81.36960026691668, 392.8713309513405, 80.55370765388943, 388.34107194375247, 84.53470388124697, 378.76643434073776, 104.26538560027257, 379.3509592106566], 'area': 991.4279262138889, 'bbox': [71.57171053020284, 378.76643434073776, 32.69367507006973, 54.06299799401313], 'iscrowd': 0}, {'id': 10, 'image_id': 1, 'category_id': 2, 'segmentation': [105.87982941744849, 313.81173481605947, 127.49495613621548, 320.8758743349463, 111.21164846420288, 370.3255268894136, 89.93423808692023, 363.40850543417037, 96.1564225088805, 344.47919271234423, 106.8362458597403, 336.29499020427465, 112.88286360329948, 326.15949539188296, 113.55506858509034, 319.46216831356287, 105.87982941744849, 313.81173481605947], 'area': 946.9628953122234, 'bbox': [89.93423808692023, 313.81173481605947, 37.56071804929525, 56.513792073354125], 'iscrowd': 0}, {'id': 11, 'image_id': 1, 'category_id': 2, 'segmentation': [129.5470552511979, 257.30139927752316, 162.33337076054886, 274.0591311287135, 154.6747992991004, 288.89506167545915, 124.72422339068726, 273.46653464064, 123.76762919081375, 268.4957895213738, 129.5470552511979, 257.30139927752316], 'area': 606.3783174309181, 'bbox': [123.76762919081375, 257.30139927752316, 38.56574156973511, 31.593662397935987], 'iscrowd': 0}, {'id': 12, 'image_id': 1, 'category_id': 2, 'segmentation': [133.95478952932172, 97.67253606952727, 150.39837071765214, 108.12547634728253, 153.35539799532853, 113.66894138418138, 158.35767206153832, 117.05394644103944, 162.1479135560803, 127.83751212060452, 171.07018301589414, 132.85823319572955, 166.4350645239465, 144.5352517813444, 156.12809146079235, 140.59148615878075, 151.76094630081207, 135.0823942553252, 140.6975575806573, 
129.0484553426504, 133.83045004075393, 121.51388919819146, 124.8308775019832, 113.32103253901005, 133.95478952932172, 97.67253606952727], 'area': 905.6790360714577, 'bbox': [124.8308775019832, 97.67253606952727, 46.23930551391095, 46.862715711817145], 'iscrowd': 0}, {'id': 13, 'image_id': 1, 'category_id': 2, 'segmentation': [213.84757142001763, 865.6317731337622, 231.50301338662393, 882.3587670447305, 216.42948372568935, 899.1955095911399, 210.5498044961132, 889.5282784951851, 214.30827019084245, 884.5313852354884, 213.079774370417, 877.5474507408217, 203.49479921744205, 874.2963477959856, 213.84757142001763, 865.6317731337622], 'area': 392.57584507364606, 'bbox': [203.49479921744205, 865.6317731337622, 28.008214169181883, 33.56373645737767], 'iscrowd': 0}, {'id': 14, 'image_id': 1, 'category_id': 2, 'segmentation': [231.21248426428065, 330.48871645797044, 248.18980536190793, 356.7320148414001, 241.67447824659757, 364.39305018913, 225.53970709559508, 372.71031646989286, 220.28613743791357, 362.82807022240013, 226.06819452391937, 359.3577359961346, 222.16971050621942, 354.03699261229485, 211.39998442842625, 359.29360383190215, 203.3048722210806, 347.08347506821156, 231.21248426428065, 330.48871645797044], 'area': 942.4941668682975, 'bbox': [203.3048722210806, 330.48871645797044, 44.88493314082734, 42.22160001192242], 'iscrowd': 0}, {'id': 15, 'image_id': 1, 'category_id': 2, 'segmentation': [226.15747217368335, 126.1882931953296, 227.64633786818013, 117.96173016168177, 236.09353622514755, 119.48706156853586, 234.58611978427507, 127.7140777958557, 226.15747217368335, 126.1882931953296], 'area': 71.7025184022421, 'bbox': [226.15747217368335, 117.96173016168177, 9.9360640514642, 9.75234763417393], 'iscrowd': 0}, {'id': 16, 'image_id': 1, 'category_id': 2, 'segmentation': [237.77231920976192, 153.3169838031754, 243.4796773325652, 156.68477841839194, 245.5267063616775, 153.83819461707026, 253.28920675627887, 154.69214213639498, 252.36845179693773, 160.33013889566064, 
273.4036889746785, 171.02616648748517, 271.2190176327713, 181.93320011347532, 268.2431232582312, 190.08504440169781, 263.83925927826203, 197.53924131486565, 260.0654399082996, 201.13819517660886, 252.25840627076104, 199.2199343442917, 229.10111278295517, 187.5324132423848, 221.94777155457996, 183.48959933500737, 237.77231920976192, 153.3169838031754], 'area': 1507.8715455558022, 'bbox': [221.94777155457996, 153.3169838031754, 51.45591742009856, 47.82121137343347], 'iscrowd': 0}, {'id': 17, 'image_id': 1, 'category_id': 2, 'segmentation': [392.3872257217299, 671.1492497138679, 417.30380885861814, 671.2074872627854, 418.62818518141285, 684.4039134653285, 393.0598451150581, 685.027445490472, 392.3872257217299, 671.1492497138679], 'area': 341.9972755053627, 'bbox': [392.3872257217299, 671.1492497138679, 26.240959459682927, 13.878195776604116], 'iscrowd': 0}, {'id': 18, 'image_id': 1, 'category_id': 1, 'segmentation': [415.6500815402251, 870.6108930064365, 423.3889202498831, 878.856587799266, 425.6205423306674, 893.4736175602302, 417.6680606456939, 900.0, 385.6889950442128, 900.0, 415.6500815402251, 870.6108930064365], 'area': 640.7200900905971, 'bbox': [385.6889950442128, 870.6108930064365, 39.93154728645459, 29.389106993563473], 'iscrowd': 0}, {'id': 19, 'image_id': 1, 'category_id': 2, 'segmentation': [407.2164936910849, 293.369396366179, 427.77757996553555, 294.5104229282588, 424.95420863106847, 345.4521803893149, 401.8091458447743, 344.1744022862986, 404.039300782606, 303.9233255367726, 406.6232181608211, 304.0600705072284, 407.2164936910849, 293.369396366179], 'area': 1155.038723289968, 'bbox': [401.8091458447743, 293.369396366179, 25.96843412076123, 52.0827840231359], 'iscrowd': 0}, {'id': 20, 'image_id': 1, 'category_id': 2, 'segmentation': [432.1206162075978, 225.95247913245112, 430.60763758723624, 245.3663629340008, 426.6382694914937, 244.7529080240056, 425.1836831646506, 258.9493404906243, 429.14439756423235, 259.2078733071685, 428.2809057792183, 
275.56508298031986, 414.204479301814, 274.6432346571237, 414.4758333056234, 263.69406074192375, 411.21745124668814, 255.69423871394247, 405.99594805110246, 248.6523243561387, 406.9461998385377, 242.70285607129335, 410.4586205475498, 239.0436596525833, 410.1156435646117, 232.59303102549165, 406.6664985958487, 231.23442135937512, 407.21412571519613, 224.76207193825394, 432.1206162075978, 225.95247913245112], 'area': 926.2819276108769, 'bbox': [405.99594805110246, 224.76207193825394, 26.124668156495318, 50.80301104206592], 'iscrowd': 0}, {'id': 21, 'image_id': 1, 'category_id': 2, 'segmentation': [412.0414752406068, 165.4036012943834, 432.5184705699794, 166.1471375450492, 430.66934231179766, 216.68781219702214, 410.19229355221614, 215.94427104014903, 412.0414752406068, 165.4036012943834], 'area': 1036.2973770508409, 'bbox': [410.19229355221614, 165.4036012943834, 22.326177017763257, 51.28421090263873], 'iscrowd': 0}, {'id': 22, 'image_id': 1, 'category_id': 2, 'segmentation': [436.6716912172269, 114.92877714522183, 435.17816135426983, 145.7952524824068, 428.3110607606359, 155.7733131237328, 420.42894698819146, 155.34407423250377, 418.76928842114285, 153.5201010480523, 419.0462323431857, 149.65126746241003, 420.8805788680911, 143.41388408094645, 419.8326982872095, 136.22578839305788, 416.34996173810214, 131.20568466931581, 416.69879815378226, 115.06078892573714, 436.6716912172269, 114.92877714522183], 'area': 674.2933167606384, 'bbox': [416.34996173810214, 114.92877714522183, 20.32172947912477, 40.844535978510976], 'iscrowd': 0}, {'id': 23, 'image_id': 1, 'category_id': 2, 'segmentation': [459.1644711194094, 47.61499526724219, 455.6476237687748, 70.11859888583422, 450.8766771061346, 69.36933278851211, 446.62103112763725, 96.59647608082741, 426.43874416314065, 93.47081579640508, 434.2112922635861, 43.74008092097938, 459.1644711194094, 47.61499526724219], 'area': 1137.873532469484, 'bbox': [426.43874416314065, 43.74008092097938, 32.72572695626877, 52.856395159848034], 
'iscrowd': 0}, {'id': 24, 'image_id': 1, 'category_id': 1, 'segmentation': [484.2024364131503, 0.0, 479.75414649397135, 9.601744243875146, 477.463500038255, 8.547827863134444, 464.7478086431511, 36.00354308541864, 446.46081846160814, 27.615649731829762, 459.25223012291826, 0.0, 484.2024364131503, 0.0], 'area': 730.4737448745893, 'bbox': [446.46081846160814, 0.0, 37.74161795154214, 36.00354308541864], 'iscrowd': 0}, {'id': 25, 'image_id': 1, 'category_id': 2, 'segmentation': [446.38990870770067, 842.2273999303579, 481.3434248256963, 828.2793587576598, 488.85468638362363, 846.9848243454471, 453.91915302863345, 860.910242264159, 446.38990870770067, 842.2273999303579], 'area': 758.0660568429099, 'bbox': [446.38990870770067, 828.2793587576598, 42.46477767592296, 32.63088350649923], 'iscrowd': 0}, {'id': 26, 'image_id': 1, 'category_id': 2, 'segmentation': [482.2356772432104, 357.92745217029005, 495.83988205646165, 360.03715515416116, 493.9617549048271, 372.0909278737381, 527.8789904229343, 377.3452287474647, 524.7309722411446, 397.46488589048386, 492.0437040710822, 392.40253333747387, 491.48141598375514, 395.98978219833225, 476.6471859868616, 393.6881181783974, 482.2356772432104, 357.92745217029005], 'area': 1201.892037899198, 'bbox': [476.6471859868616, 357.92745217029005, 51.23180443607271, 39.5374337201938], 'iscrowd': 0}, {'id': 27, 'image_id': 1, 'category_id': 2, 'segmentation': [536.0753469388001, 150.17036613915116, 535.8976141829044, 157.3439671061933, 537.3988791736774, 162.56776288338006, 536.2554783222731, 165.92503716237843, 539.3879669941962, 178.65563245117664, 533.924685027916, 182.25146871525794, 530.728569818195, 184.41585112269968, 524.4869354791008, 188.82972381450236, 524.3672584414016, 192.2951983232051, 522.0256408345886, 192.1969511229545, 522.3906255022157, 182.04453698452562, 525.8205245367717, 178.80905124824494, 530.0610870544333, 171.51414068136364, 528.5654665878974, 168.04368018638343, 524.6151287727989, 165.1658039363101, 
525.0040782545693, 149.90775556955487, 536.0753469388001, 150.17036613915116], 'area': 404.6692002570957, 'bbox': [522.0256408345886, 149.90775556955487, 17.36232615960762, 42.38744275365025], 'iscrowd': 0}, {'id': 28, 'image_id': 1, 'category_id': 2, 'segmentation': [523.4165055262856, 198.2224921071902, 544.9352911333553, 199.05147654097527, 542.4260684649926, 249.9190295347944, 522.5057537995744, 248.40736349392682, 520.1315930527635, 239.36497628502548, 518.9120450657792, 233.5128228161484, 526.6003856104799, 229.04146504867822, 528.8170812996104, 220.97468123119324, 522.6861530637834, 212.42346664890647, 523.4165055262856, 198.2224921071902], 'area': 1021.3360981644006, 'bbox': [518.9120450657792, 198.2224921071902, 26.02324606757611, 51.6965374276042], 'iscrowd': 0}, {'id': 29, 'image_id': 1, 'category_id': 2, 'segmentation': [526.6318989666179, 261.5354417562485, 545.283115554601, 262.0126638803631, 544.4942456730641, 292.8397702910006, 552.0921549452469, 293.03174194227904, 551.5256321858615, 314.70872772019356, 517.8084108987823, 313.8443781072274, 518.1340601162519, 301.317967700772, 525.6015503380913, 301.4909356869757, 526.6318989666179, 261.5354417562485], 'area': 1238.4757250715768, 'bbox': [517.8084108987823, 261.5354417562485, 34.28374404646456, 53.17328596394509], 'iscrowd': 0}, {'id': 30, 'image_id': 1, 'category_id': 2, 'segmentation': [567.6196071440354, 79.48379767127335, 573.6413344931789, 77.47242644708604, 573.9580853967927, 84.36761443130672, 576.7988056694157, 88.18258353415877, 583.9389728535898, 90.16137413866818, 588.2799405395053, 90.78792378865182, 553.5269071373623, 129.635154761374, 549.2492460440844, 126.27696036919951, 545.6672062769067, 115.66594129707664, 547.9324978117365, 101.98241242859513, 567.6196071440354, 79.48379767127335], 'area': 1005.3323860159348, 'bbox': [545.6672062769067, 77.47242644708604, 42.61273426259868, 52.16272831428796], 'iscrowd': 0}, {'id': 31, 'image_id': 1, 'category_id': 2, 'segmentation': 
[545.0001820274629, 404.13968645595014, 545.5456960839219, 383.8837510570884, 561.4482773041818, 384.3170414939523, 561.6076893680729, 378.67540270090103, 573.1816837256774, 378.9923253301531, 573.0370411262847, 384.47823309339583, 594.233580631204, 385.0264944685623, 593.6913648874033, 405.41553003899753, 545.0001820274629, 404.13968645595014], 'area': 1054.6293162692457, 'bbox': [545.0001820274629, 378.67540270090103, 49.23339860374108, 26.7401273380965], 'iscrowd': 0}, {'id': 32, 'image_id': 1, 'category_id': 2, 'segmentation': [594.5955353775062, 341.2696067793295, 592.6359737580642, 346.1783115705475, 595.5670321371872, 351.4116119751707, 594.498773291707, 355.56610705144703, 595.03770820098, 359.39284333679825, 596.8756023284514, 362.4332278929651, 590.6566405876074, 362.45178297907114, 590.6322764432989, 356.1265549827367, 588.5902138527017, 356.1319851242006, 588.5435697012581, 341.2840873301029, 594.5955353775062, 341.2696067793295], 'area': 113.72690001497435, 'bbox': [588.5435697012581, 341.2696067793295, 8.332032627193257, 21.18217619974166], 'iscrowd': 0}, {'id': 33, 'image_id': 1, 'category_id': 2, 'segmentation': [642.5548559394665, 605.4954561004415, 650.1750435382128, 639.3135964740068, 630.8239523877855, 643.6478302329779, 625.8465431011282, 621.5068183001131, 613.9918430529069, 624.1488145207986, 611.3485644147731, 612.4494963856414, 642.5548559394665, 605.4954561004415], 'area': 833.1100217257011, 'bbox': [611.3485644147731, 605.4954561004415, 38.82647912343964, 38.15237413253635], 'iscrowd': 0}, {'id': 34, 'image_id': 1, 'category_id': 1, 'segmentation': [664.1072873058729, 0.0, 665.963440204272, 5.102927703410387, 663.4489023594651, 4.764764592982829, 657.1336445596535, 3.8756691990420222, 655.4026131848805, 1.4097766196355224, 653.3671769341454, 3.967581197619438, 653.9054305306636, 8.526797778904438, 655.4832516363822, 13.08284202683717, 651.7876941068098, 16.08068347070366, 648.9221955472603, 18.858505848795176, 649.2320157396607, 
14.056635465472937, 646.6069010677747, 9.94786886498332, 646.0617618362885, 4.3456138940528035, 644.300418858882, 0.6374908359721303, 644.1158141889609, 0.0, 664.1072873058729, 0.0], 'area': 163.2888762358447, 'bbox': [644.1158141889609, 0.0, 21.847626015311107, 18.858505848795176], 'iscrowd': 0}, {'id': 35, 'image_id': 1, 'category_id': 2, 'segmentation': [719.5465582867619, 598.6448629098013, 720.3695895529818, 606.5043138191104, 724.4469213041011, 610.7773993844166, 725.2147377748042, 616.3742183251306, 723.0049092427362, 620.1570535134524, 718.3882041969337, 619.6482330150902, 713.9281129532028, 616.4276928380132, 703.0521553403232, 615.0506034400314, 697.5230599956121, 619.7578771309927, 691.4791235984303, 620.1051252679899, 691.3655090769753, 598.7110763099045, 719.5465582867619, 598.6448629098013], 'area': 591.1230200188784, 'bbox': [691.3655090769753, 598.6448629098013, 33.84922869782895, 21.512190603651106], 'iscrowd': 0}, {'id': 36, 'image_id': 1, 'category_id': 2, 'segmentation': [766.4948697979562, 219.65857510454953, 766.9394415735733, 203.64448958076537, 774.5338656448293, 199.1309275366366, 779.6678650518879, 196.49748022854328, 780.0790439015254, 206.49778971262276, 792.9675920906011, 206.67150652222335, 792.7963749796618, 220.19298242591321, 766.4948697979562, 219.65857510454953], 'area': 436.3985394080503, 'bbox': [766.4948697979562, 196.49748022854328, 26.472722292644903, 23.695502197369933], 'iscrowd': 0}, {'id': 37, 'image_id': 1, 'category_id': 2, 'segmentation': [794.3443439039402, 800.0892576370388, 808.1741715462413, 800.0180635405704, 811.0787292337045, 810.2460591299459, 808.0733455095906, 813.3824441283941, 807.4013454001397, 820.0798055976629, 768.3534275970887, 822.9638589080423, 759.5865474676248, 819.0049897767603, 760.6663332274184, 806.1938135968521, 774.9641209535766, 796.3894291333854, 783.0687794568948, 796.0362568320706, 791.1689595563803, 796.2602832280099, 794.3443439039402, 800.0892576370388], 'area': 1091.069005525542, 
'bbox': [759.5865474676248, 796.0362568320706, 51.49218176607974, 26.927602075971663], 'iscrowd': 0}, {'id': 38, 'image_id': 1, 'category_id': 2, 'segmentation': [818.6449922658503, 145.89386002346873, 802.3041800016072, 146.51470147818327, 802.9269792409614, 162.90226284973323, 777.9049582753796, 163.82376996334642, 776.5785818777513, 128.4980932334438, 817.9413526556455, 126.95574354380369, 818.6449922658503, 145.89386002346873], 'area': 1195.1484685924327, 'bbox': [776.5785818777513, 126.95574354380369, 42.06641038809903, 36.86802641954273], 'iscrowd': 0}, {'id': 39, 'image_id': 1, 'category_id': 2, 'segmentation': [794.6951001100242, 2.116394373588264, 816.6974766298663, 1.4683247059583664, 818.0185485305265, 47.98086807690561, 804.6792487849016, 48.35088091529906, 795.6887790956534, 36.73992804996669, 794.6951001100242, 2.116394373588264], 'area': 972.2084233562225, 'bbox': [794.6951001100242, 1.4683247059583664, 23.323448420502245, 46.88255620934069], 'iscrowd': 0}, {'id': 40, 'image_id': 1, 'category_id': 2, 'segmentation': [816.4727658838965, 58.42847666423768, 817.5045415207278, 105.2588225658983, 796.3883123914711, 105.70768600795418, 795.4426827779971, 62.4043871788308, 803.5142542636022, 62.22955618426204, 806.4344622618519, 58.65132196247578, 816.4727658838965, 58.42847666423768], 'area': 955.3143319089635, 'bbox': [795.4426827779971, 58.42847666423768, 22.061858742730692, 47.2792093437165], 'iscrowd': 0}, {'id': 41, 'image_id': 1, 'category_id': 2, 'segmentation': [820.6129307732917, 800.003008636646, 838.7254208496306, 798.1625695805997, 841.7378241324332, 806.7232382101938, 859.6126089137979, 806.553273351863, 864.2543455683626, 813.4094758052379, 835.6097001582384, 818.8363996865228, 830.4454396180809, 813.3912856318057, 821.766802502796, 813.04821888078, 820.6129307732917, 800.003008636646], 'area': 511.57672611563873, 'bbox': [820.6129307732917, 798.1625695805997, 43.64141479507089, 20.673830105923116], 'iscrowd': 0}, {'id': 42, 'image_id': 1, 
'category_id': 2, 'segmentation': [877.9025507906917, 363.2765975808725, 838.2836305517703, 366.3079837486148, 837.0310469446704, 349.9801886640489, 850.4099756779615, 348.9433259088546, 850.0006144659128, 343.5819130791351, 854.7069255011156, 343.20067395456135, 854.3236986373086, 338.14936562720686, 865.5350079541095, 337.29858210776, 865.9971866649576, 343.30238648783416, 884.2128435778432, 341.9032686809078, 885.3848932385445, 357.21201885771006, 877.9025507906917, 363.2765975808725], 'area': 983.4834268683098, 'bbox': [837.0310469446704, 337.29858210776, 48.353846293874085, 29.009401640854776], 'iscrowd': 0}, {'id': 43, 'image_id': 1, 'category_id': 1, 'segmentation': [886.5125979208387, 820.9232407584786, 886.2984008654021, 802.261728647165, 900.0, 802.1414167098701, 900.0, 818.7495461180806, 894.7975760672707, 816.548164521344, 888.9884018914308, 818.9095647959039, 886.5125979208387, 820.9232407584786], 'area': 216.4340088998764, 'bbox': [886.2984008654021, 802.1414167098701, 13.701599134597927, 18.78182404860854], 'iscrowd': 0}], 'categories': [{'id': 1, 'name': 1.0}, {'id': 2, 'name': 0.0}], 'images': [{'id': 1, 'file_name': 'sample_geotiff.tif', 'width': 900, 'height': 900}]}
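
Because the returned object is a plain dict, it can be inspected with ordinary Python. The sketch below uses a small stand-in dict that copies only the linking fields of the first four annotations, plus the category and image records, from the output above (geometry fields are omitted for brevity). It counts annotations per category name and round-trips the dict through json to confirm that it serializes.

```python
import json
from collections import Counter

# Stand-in for coco_dict: only the fields needed here, copied from the
# first four annotations and the category/image records printed above.
coco_subset = {
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1},
        {"id": 2, "image_id": 1, "category_id": 2},
        {"id": 3, "image_id": 1, "category_id": 1},
        {"id": 4, "image_id": 1, "category_id": 2},
    ],
    "categories": [{"id": 1, "name": 1.0}, {"id": 2, "name": 0.0}],
    "images": [{"id": 1, "file_name": "sample_geotiff.tif",
                "width": 900, "height": 900}],
}

# Map category ids to names. The names are the values of the "truncated"
# attribute that was passed as category_attribute.
id_to_name = {cat["id"]: cat["name"] for cat in coco_subset["categories"]}

# Count annotations per category name.
counts = Counter(id_to_name[anno["category_id"]]
                 for anno in coco_subset["annotations"])

# The dict holds only JSON-friendly types, so it can be serialized directly.
round_tripped = json.loads(json.dumps(coco_subset))
```

The same counting code should work unchanged on the full coco_dict, since it only relies on the "annotations" and "categories" keys.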

And, to show the image with the labels overlaid:

[11]:
from matplotlib import pyplot as plt
from matplotlib import patches
import skimage.io  # import the io submodule explicitly so skimage.io.imread is available
[17]:
im = skimage.io.imread(sample_image)
f, ax = plt.subplots(figsize=(10, 10))
ax.imshow(im, cmap='gray')
# index 0 is unused because category ids start at 1
colors = ['', 'r', 'b']
for anno in coco_dict['annotations']:
    # each bbox is [x_min, y_min, width, height] in pixel coordinates
    x_min, y_min, width, height = anno['bbox']
    patch = patches.Rectangle((x_min, y_min), width, height, linewidth=1,
                              edgecolor=colors[anno['category_id']],
                              facecolor='none')
    ax.add_patch(patch)
[Figure: the sample image with the annotation bounding boxes overlaid]

It’s a little tough to see here, but each building’s bounding box from the COCO dataset is drawn on the image, with truncated buildings (at the edge of the image) in a different category and therefore a different color (red rather than blue).
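
To make the bounding-box convention concrete: each 'bbox' in the output is [x_min, y_min, width, height] in pixel coordinates, and each 'segmentation' here is a flat [x0, y0, x1, y1, ...] list. The sketch below recomputes the box for annotation 15 from its polygon. The helper name bbox_from_flat_polygon is made up for this example and is not part of solaris.

```python
import math


def bbox_from_flat_polygon(coords):
    """Return [x_min, y_min, width, height] for a flat [x0, y0, x1, y1, ...] list."""
    xs = coords[0::2]  # even positions hold x values
    ys = coords[1::2]  # odd positions hold y values
    return [min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)]


# Polygon and bbox of annotation 15, copied from the printed output above.
segmentation = [226.15747217368335, 126.1882931953296,
                227.64633786818013, 117.96173016168177,
                236.09353622514755, 119.48706156853586,
                234.58611978427507, 127.7140777958557,
                226.15747217368335, 126.1882931953296]
reported_bbox = [226.15747217368335, 117.96173016168177,
                 9.9360640514642, 9.75234763417393]

computed_bbox = bbox_from_flat_polygon(segmentation)
matches = all(math.isclose(c, r, abs_tol=1e-9)
              for c, r in zip(computed_bbox, reported_bbox))

# Convert to (x_min, y_min, x_max, y_max) for tools that expect corner pairs.
x_min, y_min, width, height = reported_bbox
corners = (x_min, y_min, x_min + width, y_min + height)
```

This is also why the plotting code above can pass the first two bbox values as the rectangle's anchor point and the last two as its width and height.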

Example 2: A dataset with multiple images and geojsons (for example, tiled SpaceNet datasets)

To use multiple images and geojsons, solaris needs a way to match them to one another. This can be done in one of two ways:

  1. If the images and their corresponding geojsons have the exact same filenames once extension and directory information are removed, then solaris can match them without any help.
  2. You can provide a regex to extract substrings from image and geojson filenames that should be identical between matching files.

Since the second option is more complicated, we’ll show an example of doing that here. We’ll also include license information to show what that looks like.
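
The idea behind the regex option can be sketched with Python's re module. This is only an illustration of matching files by a shared substring, not solaris's internal implementation, and the helper match_key is hypothetical. It uses the tile filenames and the pattern from the example that follows, with the labels deliberately listed in a different order than the images.

```python
import os
import re


def match_key(path, pattern):
    """Return the first regex group found in the file's base name, or None."""
    found = re.search(pattern, os.path.basename(path))
    return found.group(1) if found else None


pattern = r'(\d+_\d+)'
images = ['rastertile_test_expected/sample_geotiff_733601_3724734.tif',
          'rastertile_test_expected/sample_geotiff_733601_3724869.tif']
labels = ['vectortile_test_expected/geoms_733601_3724869.geojson',
          'vectortile_test_expected/geoms_733601_3724734.geojson']

# Index the label files by their extracted key, then look up each image's key.
labels_by_key = {match_key(path, pattern): path for path in labels}
pairs = {img: labels_by_key.get(match_key(img, pattern)) for img in images}
```

Here the pattern captures the two underscore-separated tile numbers, so each image is paired with the geojson that shares them regardless of list order.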

[20]:
sample_geojsons = [os.path.join(data_dir, 'vectortile_test_expected/geoms_733601_3724734.geojson'),
                   os.path.join(data_dir, 'vectortile_test_expected/geoms_733601_3724869.geojson')]
sample_images = [os.path.join(data_dir, 'rastertile_test_expected/sample_geotiff_733601_3724734.tif'),
                 os.path.join(data_dir, 'rastertile_test_expected/sample_geotiff_733601_3724869.tif')]

coco_dict = sol.data.coco.geojson2coco(sample_images,
                                       sample_geojsons,
                                       matching_re=r'(\d+_\d+)',
                                       license_dict={'CC-BY 4.0': 'https://creativecommons.org/licenses/by/4.0/'},
                                       verbose=0)
100%|██████████| 2/2 [00:00<00:00, 17.27it/s]

Once again, we’ll display the json to show what the output looks like.

[21]:
JSON(coco_dict)
[21]:
<IPython.core.display.JSON object>
[22]:
print(coco_dict)
{'annotations': [{'id': 1, 'image_id': 1, 'category_id': 1, 'segmentation': [60.03418597159907, 74.87320505268872, 73.8337494416628, 90.0, 51.516283753560856, 90.0, 47.80893106292933, 85.93607368506491, 60.03418597159907, 74.87320505268872], 'area': 214.14410906402435, 'bbox': [47.80893106292933, 74.87320505268872, 26.02481837873347, 15.126794947311282], 'iscrowd': 0}, {'id': 2, 'image_id': 2, 'category_id': 1, 'segmentation': [90.0, 11.015026673674583, 70.7970443549566, 13.249627484939992, 70.8928169994615, 4.990449592471123, 70.69254911504686, 0.0, 90.0, 0.0, 90.0, 11.015026673674583], 'area': 232.6028019573394, 'bbox': [70.69254911504686, 0.0, 19.30745088495314, 13.249627484939992], 'iscrowd': 0}, {'id': 3, 'image_id': 2, 'category_id': 1, 'segmentation': [89.06576380180195, 21.638346442952752, 90.0, 28.386366279795766, 90.0, 68.61032488476485, 85.23654213640839, 70.96199104283005, 73.38412117748521, 70.6515495320782, 71.78515014378354, 65.98500318173319, 72.83866719854996, 48.2692635813728, 72.19266184815206, 21.76100580766797, 89.06576380180195, 21.638346442952752], 'area': 853.8212747899074, 'bbox': [71.78515014378354, 21.638346442952752, 18.21484985621646, 49.3236445998773], 'iscrowd': 0}], 'categories': [{'id': 1, 'name': 'other'}], 'licenses': [{'name': 'CC-BY 4.0', 'url': 'https://creativecommons.org/licenses/by/4.0/', 'id': 1}], 'images': [{'id': 1, 'file_name': 'sample_geotiff_733601_3724734.tif', 'width': 90, 'height': 90, 'license': 1}, {'id': 2, 'file_name': 'sample_geotiff_733601_3724869.tif', 'width': 90, 'height': 90, 'license': 1}]}
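
To see how the records in this output relate to one another, the sketch below joins annotations to their image and license records. It uses a stand-in dict holding only the linking fields copied from the output above; the geometry fields are left out for brevity.

```python
from collections import defaultdict

# Stand-in for coco_dict with only the fields that link records together.
coco_subset = {
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1},
        {"id": 2, "image_id": 2, "category_id": 1},
        {"id": 3, "image_id": 2, "category_id": 1},
    ],
    "licenses": [{"name": "CC-BY 4.0",
                  "url": "https://creativecommons.org/licenses/by/4.0/",
                  "id": 1}],
    "images": [
        {"id": 1, "file_name": "sample_geotiff_733601_3724734.tif",
         "width": 90, "height": 90, "license": 1},
        {"id": 2, "file_name": "sample_geotiff_733601_3724869.tif",
         "width": 90, "height": 90, "license": 1},
    ],
}

# Look-up tables keyed by record id.
license_names = {lic["id"]: lic["name"] for lic in coco_subset["licenses"]}

# Group annotation ids by the image they belong to.
annotation_ids = defaultdict(list)
for anno in coco_subset["annotations"]:
    annotation_ids[anno["image_id"]].append(anno["id"])

# One summary entry per image: its license name and its annotation ids.
summary = {
    img["file_name"]: {
        "license": license_names[img["license"]],
        "annotation_ids": annotation_ids[img["id"]],
    }
    for img in coco_subset["images"]
}
```

Each image record carries a "license" id because a single license was supplied, which matches the note in the license_dict description.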

Still have questions?

Check the API documentation for sol.data.coco.geojson2coco or open an issue in the Solaris GitHub repo.