solaris.vector API reference

solaris.vector.polygon vector polygon management

solaris.vector.polygon.affine_transform_gdf(gdf, affine_obj, inverse=False, geom_col='geometry', precision=None)[source]

Perform an affine transformation on a GeoDataFrame.

Parameters
  • gdf (geopandas.GeoDataFrame, pandas.DataFrame, or str) – A GeoDataFrame, pandas DataFrame with a "geometry" column (or a different column containing geometries, identified by geom_col - note that this column will be renamed "geometry" for ease of use with geopandas), or the path to a saved file in .geojson or .csv format.

  • affine_obj (list or affine.Affine) – An affine transformation to apply to geom in the form of an [a, b, d, e, xoff, yoff] list or an affine.Affine object.

  • inverse (bool, optional) – Use this argument to perform the inverse transformation.

  • geom_col (str, optional) – The column in gdf corresponding to the geometry. Defaults to 'geometry'.

  • precision (int, optional) – Decimal precision to round the geometries to. If not provided, no rounding is performed.

solaris.vector.polygon.convert_poly_coords(geom, raster_src=None, affine_obj=None, inverse=False, precision=None)[source]

Georegister geometry objects currently in pixel coords or vice versa.

Parameters
  • geom (shapely.geometry.shape or str) – A shapely.geometry.shape, or WKT string-formatted geometry object currently in pixel coordinates.

  • raster_src (str, optional) – Path to a raster image with georeferencing data to apply to geom. Alternatively, an opened rasterio.Band object or osgeo.gdal.Dataset object can be provided. Required if not using affine_obj.

  • affine_obj (list or affine.Affine) – An affine transformation to apply to geom in the form of an [a, b, d, e, xoff, yoff] list or an affine.Affine object. Required if not using raster_src.

  • inverse (bool, optional) – If true, will perform the inverse affine transformation, going from geospatial coordinates to pixel coordinates.

  • precision (int, optional) – Decimal precision for the polygon output. If not provided, rounding is skipped.

Returns

A geometry in the same format as the input with its coordinate system transformed to match the destination object.

Return type

out_geom
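
As a rough illustration of the [a, b, d, e, xoff, yoff] list form, the sketch below applies such a transform to a single (x, y) point and then inverts it, which is the idea behind inverse=True. It uses only the standard library. The helper names apply_affine and invert_affine are hypothetical and not part of solaris, and the coefficient order is assumed to follow the convention x' = a*x + b*y + xoff, y' = d*x + e*y + yoff.

```python
def apply_affine(point, coeffs, precision=None):
    """Apply an [a, b, d, e, xoff, yoff] transform to an (x, y) point."""
    a, b, d, e, xoff, yoff = coeffs
    x, y = point
    x_out = a * x + b * y + xoff
    y_out = d * x + e * y + yoff
    if precision is not None:
        # Optional rounding, mirroring the role of the precision argument.
        x_out, y_out = round(x_out, precision), round(y_out, precision)
    return (x_out, y_out)


def invert_affine(coeffs):
    """Return the coefficients of the inverse transform."""
    a, b, d, e, xoff, yoff = coeffs
    det = a * e - b * d
    if det == 0:
        raise ValueError("transform is not invertible")
    ia, ib = e / det, -b / det
    id_, ie = -d / det, a / det
    return [ia, ib, id_, ie, -(ia * xoff + ib * yoff), -(id_ * xoff + ie * yoff)]


# Hypothetical north-up geotransform: 0.5-unit pixels, origin at (1000, 2000).
px_to_geo = [0.5, 0.0, 0.0, -0.5, 1000.0, 2000.0]
geo_pt = apply_affine((10, 20), px_to_geo)
px_pt = apply_affine(geo_pt, invert_affine(px_to_geo), precision=6)
```

Here geo_pt is the georegistered location of pixel (10, 20), and applying the inverted coefficients recovers the original pixel coordinates.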

solaris.vector.polygon.gdf_to_yolo(geodataframe, image, output_dir, column='single_id', im_size=(0, 0), min_overlap=0.66, remove_no_labels=1)[source]

Convert a geodataframe containing polygons to yolo/yolt format.

Parameters
  • geodataframe (str) – Path to a geopandas.GeoDataFrame with a column named 'geometry'. Can be created from a geojson with labels for unique objects. Can be converted to this format with geodataframe=gpd.read_file("./xView_30.geojson").

  • image (str) – Path to a georeferenced image (i.e. a GeoTIFF or PNG created with GDAL) that geolocates to the same geography as the geojson(s). If a directory, the bounds of each GeoTIFF will be loaded in and all overlapping geometries will be transformed. This function will also accept an osgeo.gdal.Dataset or rasterio.DatasetReader with georeferencing information in this argument.

  • output_dir (str) – Path to an output directory where all of the yolo readable text files will be placed.

  • column (str, optional) – The column name that contains a unique integer id for each object class.

  • im_size (tuple, optional) – A tuple specifying the x and y size of an image. If specified as (0, 0) (the default), then the size is determined automatically.

  • min_overlap (float, optional) – A float value ranging from 0 to 1, expressed as a fraction (e.g. 0.66 = 66%). If a polygon does not overlap the image by at least min_overlap, the polygon is discarded. Defaults to 0.66.

  • remove_no_labels (int, optional) – An int value of 0 or 1. If 1, any image not containing any objects will be moved to a directory in the same root path as your input image. If 0, no images will be moved. Default value of 1.

Returns

gdf – The txt file is written to output_dir, and the output gdf itself is returned.

Return type

geopandas.GeoDataFrame.
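
For orientation, a conventional YOLO label line has the form "class x_center y_center width height", with the four box values normalized by the image size. The sketch below builds such a line from a pixel bounding box and applies a min_overlap-style filter. The helper bbox_to_yolo_line is hypothetical, and measuring overlap as the fraction of the box area inside the image is an assumption rather than a description of what gdf_to_yolo does internally.

```python
def bbox_to_yolo_line(class_id, bbox, im_size, min_overlap=0.66):
    """Return a YOLO label line for a pixel bbox, or None if it is discarded.

    bbox is (xmin, ymin, xmax, ymax) in pixels; im_size is (width, height).
    """
    xmin, ymin, xmax, ymax = bbox
    width, height = im_size
    # Clip the box to the image and measure how much of it survives.
    cxmin, cymin = max(xmin, 0), max(ymin, 0)
    cxmax, cymax = min(xmax, width), min(ymax, height)
    full_area = (xmax - xmin) * (ymax - ymin)
    clipped_area = max(cxmax - cxmin, 0) * max(cymax - cymin, 0)
    if full_area <= 0 or clipped_area / full_area < min_overlap:
        return None
    # Normalize the clipped box center and size by the image dimensions.
    x_center = (cxmin + cxmax) / 2 / width
    y_center = (cymin + cymax) / 2 / height
    box_w = (cxmax - cxmin) / width
    box_h = (cymax - cymin) / height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {box_w:.6f} {box_h:.6f}"


# A box fully inside a 400 x 200 image is kept.
kept = bbox_to_yolo_line(3, (100, 50, 300, 150), im_size=(400, 200))
# A box with only 10% of its area inside the image is discarded.
dropped = bbox_to_yolo_line(3, (-90, 0, 10, 100), im_size=(400, 200))
```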

solaris.vector.polygon.geojson_to_px_gdf(geojson, im_path, geom_col='geometry', precision=None, output_path=None, override_crs=False)[source]

Convert a geojson or set of geojsons from geo coords to px coords.

Parameters
  • geojson (str) – Path to a geojson. This function will also accept a pandas.DataFrame or geopandas.GeoDataFrame with a column named 'geometry' in this argument.

  • im_path (str) – Path to a georeferenced image (i.e. a GeoTIFF) that geolocates to the same geography as the geojson(s). This function will also accept an osgeo.gdal.Dataset or rasterio.DatasetReader with georeferencing information in this argument.

  • geom_col (str, optional) – The column containing geometry in geojson. If not provided, defaults to "geometry".

  • precision (int, optional) – The decimal precision for output geometries. If not provided, the vertex locations won’t be rounded.

  • output_path (str, optional) – Path to save the resulting output to. If not provided, the object won’t be saved to disk.

  • override_crs (bool, optional) – Useful if the geojsons generated by the vector tiler or otherwise were saved out with a non-EPSG projection. If True, the gdf crs is set to that of the image; the inputs must share the same underlying projection for this to work. If False and the gdf does not have an EPSG code, this function will fail.

Returns

output_df – A pandas.DataFrame with all geometries in geojson that overlapped with the image at im_path converted to pixel coordinates. Additional columns are included with the filename of the source geojson (if available) and images for reference.

Return type

pandas.DataFrame

solaris.vector.polygon.georegister_px_df(df, im_path=None, affine_obj=None, crs=None, geom_col='geometry', precision=None, output_path=None)[source]

Convert a dataframe of geometries in pixel coordinates to a geo CRS.

Parameters
  • df (pandas.DataFrame) – A pandas.DataFrame with polygons in a column named "geometry".

  • im_path (str, optional) – A filename or rasterio.DatasetReader object containing an image that has the same bounds as the pixel coordinates in df. If not provided, affine_obj and crs must both be provided.

  • affine_obj (list or affine.Affine, optional) – An affine transformation to apply to geom in the form of an [a, b, d, e, xoff, yoff] list or an affine.Affine object. Required if not using im_path.

  • crs (valid CRS str, int, or rasterio.crs.CRS instance) – The coordinate reference system for the output GeoDataFrame as an EPSG code integer. Required if not providing a raster image to extract the information from.

  • geom_col (str, optional) – The column containing geometry in df. If not provided, defaults to "geometry".

  • precision (int, optional) – The decimal precision for output geometries. If not provided, the vertex locations won’t be rounded.

  • output_path (str, optional) – Path to save the resulting output to. If not provided, the object won’t be saved to disk.

solaris.vector.polygon.get_overlapping_subset(gdf, im=None, bbox=None, bbox_crs=None)[source]

Extract a subset of geometries in a GeoDataFrame that overlap with im.

Notes

This function uses RTree’s spatialindex, which is much faster (but slightly less accurate) than direct comparison of each object for overlap.

Parameters
  • gdf (geopandas.GeoDataFrame or str) – A geopandas.GeoDataFrame instance or a path to a geojson.

  • im (rasterio.DatasetReader or str, optional) – An image object loaded with rasterio or a path to a georeferenced image (i.e. a GeoTIFF).

  • bbox (list or shapely.geometry.Polygon, optional) – A bounding box (either a shapely.geometry.Polygon or a [bottom, left, top, right] list) from an image. Has no effect if im is provided (bbox is inferred from the image instead). If bbox is passed and im is not, a bbox_crs should be provided to ensure correct geolocation; if it isn’t, the bbox is assumed to have the same crs as gdf.

  • bbox_crs (int, optional) – The coordinate reference system that the bounding box is in as an EPSG int. If not provided, it’s assumed that the CRS is the same as im (if provided) or gdf (if not).

Returns

output_gdf – A geopandas.GeoDataFrame with all geometries in gdf that overlapped with the image at im. Coordinates are kept in the CRS of gdf.

Return type

geopandas.GeoDataFrame
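
The note about a spatial index being faster but slightly less accurate can be pictured with plain bounding boxes. The sketch below is a stdlib-only illustration, not solaris code: the helpers bounds and bounds_intersect are hypothetical, and boxes are written as (minx, miny, maxx, maxy) for this example only. It shows a geometry whose bounding box overlaps an image footprint even though the geometry itself does not.

```python
def bounds(coords):
    """Return (minx, miny, maxx, maxy) for a sequence of (x, y) vertices."""
    xs, ys = zip(*coords)
    return (min(xs), min(ys), max(xs), max(ys))


def bounds_intersect(a, b):
    """True if two (minx, miny, maxx, maxy) boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


# A right triangle whose bounding box is the square (0, 0)-(10, 10).
triangle = [(0, 0), (10, 0), (0, 10)]
# A small image footprint in the corner of that box the triangle never reaches.
image_box = (8, 8, 9, 9)

# The bounding-box comparison reports a hit...
index_hit = bounds_intersect(bounds(triangle), image_box)
# ...but every point of the triangle satisfies x + y <= 10, so a box whose
# smallest corner sum exceeds 10 cannot truly overlap it.
true_overlap = image_box[0] + image_box[1] <= 10
```

A bounding-box comparison is cheap, which is where the speed comes from; the cost is the occasional false positive like the one above.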

solaris.vector.polygon.remove_multipolygons(gdf)[source]

Filters out rows of a geodataframe containing MultiPolygons and GeometryCollections.

This function is optionally used in geojson2coco. For instance segmentation, where objects are composed of single polygons, multi part geometries need to be either removed or inspected manually to be resolved as a single geometry.

solaris.vector.graph graph and road network analysis

class solaris.vector.graph.Edge(nodes, edge_weight=None)[source]

An object to hold edge attributes.

nodes

Node instances connected by the edge.

Type

2-tuple of Nodes

weight

The weight of the edge.

Type

int or float

get_node_idxs()[source]

Return the Node.idx for the nodes in the edge.

set_edge_weight(normalize_factor=None, inverse=False)[source]

Set the edge weight based on Euclidean distance between nodes.

Note

This method does not account for spherical deformation (i.e. does not use the Haversine equation). It is a simple linear distance.

Parameters
  • normalize_factor (int or float, optional) – a number to multiply (or divide, if inverse=True) the Euclidean distance by. Defaults to None (no normalization)

  • inverse (bool, optional) – if True, the Euclidean distance weight will be divided by normalize_factor instead of multiplied by it.
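
The interaction between normalize_factor and inverse can be sketched as follows. This is a minimal stdlib illustration of the behavior described above, written as a free function on (x, y) tuples; the name edge_weight is hypothetical and it is not the method's actual implementation.

```python
import math


def edge_weight(node_a, node_b, normalize_factor=None, inverse=False):
    """Euclidean distance between two (x, y) nodes, optionally normalized."""
    distance = math.hypot(node_b[0] - node_a[0], node_b[1] - node_a[1])
    if normalize_factor is None:
        return distance  # no normalization requested
    if inverse:
        return distance / normalize_factor
    return distance * normalize_factor


plain = edge_weight((0, 0), (3, 4))
scaled = edge_weight((0, 0), (3, 4), normalize_factor=2)
divided = edge_weight((0, 0), (3, 4), normalize_factor=2, inverse=True)
```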

class solaris.vector.graph.Node(idx, x, y)[source]

An object to hold node attributes.

idx

The numerical index of the node. Used as a unique identifier when the nodes are added to the graph.

Type

int

x

Numeric x location of the node, in either a geographic CRS or in pixel coordinates.

Type

int or float

y

Numeric y location of the node, in either a geographic CRS or in pixel coordinates.

Type

int or float

class solaris.vector.graph.Path(edges=None, properties=None)[source]

An object to hold Edges with common properties.

edges

A list of Edges

Type

list of Edges

properties

A dictionary of property: value pairs that provide relevant metadata about edges along the path (e.g. road type, speed limit, etc.)

Type

dict

add_data(property, value)[source]

Add a property: value pair to the Path.properties attribute.

add_edge(edge)[source]

Add an edge to the path.

set_edge_weights(data_key=None, inverse=False, overwrite=True)[source]

Calculate edge weights for all edges in the Path.

solaris.vector.graph.geojson_to_graph(geojson, graph_name=None, retain_all=True, valid_road_types=None, road_type_field='type', edge_idx=0, first_node_idx=0, weight_norm_field=None, inverse=False, workers=1, verbose=False, output_path=None)[source]

Convert a geojson of path strings to a network graph.

Parameters
  • geojson (str) – Path to a geojson file (or any other OGR-compatible vector file) to load network edges and nodes from.

  • graph_name (str, optional) – Name of the graph. If not provided, graph will be named 'unnamed'.

  • retain_all (bool, optional) – If True, the entire graph will be returned even if some parts are not connected. Defaults to True.

  • valid_road_types (list of ints, optional) –

    The road types to permit in the graph. If not provided, it’s assumed that all road types are permitted. The possible values are integers 1-7, which map as follows:

    1: Motorway
    2: Primary
    3: Secondary
    4: Tertiary
    5: Residential
    6: Unclassified
    7: Cart track
    

  • road_type_field (str, optional) – The name of the property in the vector data that delineates road type. Defaults to 'type'.

  • edge_idx (int, optional) – The first index to use for an edge. This can be set to a higher value so that a graph’s edge indices don’t overlap with existing values in another graph.

  • first_node_idx (int, optional) – The first index to use for a node. This can be set to a higher value so that a graph’s node indices don’t overlap with existing values in another graph.

  • weight_norm_field (str, optional) – The name of a field in geojson to pass to argument data_key in Path.set_edge_weights(). Defaults to None, in which case no normalization is performed (weights are calculated solely from Euclidean distance).

  • workers (int, optional) – Number of parallel processes to run for parallelization. Defaults to 1. Should not be greater than the number of CPUs available.

  • verbose (bool, optional) – Verbose print output. Defaults to False.

  • output_path (str, optional) – Path to a pickle file to save the output graph to. Nothing will be saved to disk if not provided.

Returns

G – A networkx.MultiDiGraph containing all of the nodes and edges from the geojson (or only the largest connected component if retain_all = False). Edge lengths are weighted based on geographic distance.

Return type

networkx.MultiDiGraph
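
To make the valid_road_types and road_type_field arguments concrete, the sketch below filters GeoJSON-like feature dicts by road type using the 1-7 mapping listed above. It is an illustration only: ROAD_TYPES and filter_features are hypothetical names, and the assumption that each feature stores its road type under feature["properties"] is not taken from solaris.

```python
# Road type codes as listed in the parameter description above.
ROAD_TYPES = {
    1: "Motorway",
    2: "Primary",
    3: "Secondary",
    4: "Tertiary",
    5: "Residential",
    6: "Unclassified",
    7: "Cart track",
}


def filter_features(features, valid_road_types=None, road_type_field="type"):
    """Keep features whose road type is permitted; keep all if no filter."""
    if valid_road_types is None:
        return list(features)
    allowed = set(valid_road_types)
    return [f for f in features
            if f["properties"].get(road_type_field) in allowed]


features = [
    {"properties": {"type": 1}},
    {"properties": {"type": 5}},
    {"properties": {"type": 7}},
]
major = filter_features(features, valid_road_types=[1, 2, 3])
names = [ROAD_TYPES[f["properties"]["type"]] for f in major]
```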

solaris.vector.graph.get_nodes_paths(vector_file, first_node_idx=0, node_gdf=geopandas.GeoDataFrame, valid_road_types=None, road_type_field='type', workers=1, verbose=False)[source]

Extract nodes and paths from a vector file.

Parameters
  • vector_file (str) – Path to an OGR-compatible vector file containing line segments (e.g., JSON response from the Overpass API, or a SpaceNet GeoJSON).

  • first_path_idx (int, optional) – The first index to use for a path. This can be set to a higher value so that a graph’s path indices don’t overlap with existing values in another graph.

  • first_node_idx (int, optional) – The first index to use for a node. This can be set to a higher value so that a graph’s node indices don’t overlap with existing values in another graph.

  • node_gdf (geopandas.GeoDataFrame, optional) – A geopandas.GeoDataFrame containing nodes to add to the graph. New nodes will be added to this object incrementally during the function call.

  • valid_road_types (list of ints, optional) –

    The road types to permit in the graph. If not provided, it’s assumed that all road types are permitted. The possible values are integers 1-7, which map as follows:

    1: Motorway
    2: Primary
    3: Secondary
    4: Tertiary
    5: Residential
    6: Unclassified
    7: Cart track
    

  • road_type_field (str, optional) – The name of the attribute containing road type information in vector_file. Defaults to 'type'.

  • workers (int, optional) – Number of worker processes to use for parallelization. Defaults to 1. Should not exceed the number of CPUs available.

  • verbose (bool, optional) – Verbose print output. Defaults to False.

Returns

nodes, paths

  • nodes (list) – A list of Nodes to be added to the graph.

  • paths (list) – A list of Paths containing the Edges and Nodes to be added to the graph.

Return type

tuple of dicts

solaris.vector.graph.graph_to_geojson(G, output_path, encoding='utf-8', overwrite=False, verbose=False)[source]

Save graph to two geojsons: one containing nodes, the other edges.

Parameters
  • G (networkx.MultiDiGraph) – A graph object to save to geojson files.

  • output_path – Path to save the geojsons to. '_nodes.geojson' and '_edges.geojson' will be appended to output_path (after stripping the extension).

  • encoding (str, optional) – The character encoding for the saved files.

  • overwrite (bool, optional) – Should files at output_path be overwritten? Defaults to no (False).

  • verbose (bool, optional) – Switch to print relevant values. Defaults to no (False).

Notes

This function is based on osmnx.save_load.save_graph_shapefile, with tweaks to make it work with our graph objects. It will save two geojsons: a file containing all of the nodes and a file containing all of the edges. When writing to geojson, the coordinate reference system (crs) must be converted to a string if it’s a dict; otherwise no crs will be appended to the geojson.

Returns

Return type

None

solaris.vector.graph.linestring_to_edges(linestring, node_gdf)[source]

Collect nodes in a linestring and add them to an edge.

Parameters
  • linestring (shapely.geometry.LineString) – A shapely.geometry.LineString object to extract nodes and edges from.

  • node_series (geopandas.GeoSeries) – A geopandas.GeoSeries containing a shapely.geometry.point.Point for every node to be added to the graph.

Returns

edges – A list of Edges from linestring.

Return type

list
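
One plausible reading of "collect nodes in a linestring and add them to an edge" is that consecutive vertices are paired into edges, with previously unseen vertices registered as new nodes. The sketch below shows that idea with plain tuples and a dict. The helper coords_to_edges and the node_index dict are hypothetical stand-ins, not the solaris Node, Edge, or node lookup objects.

```python
def coords_to_edges(coords, node_index):
    """Pair consecutive vertices into (start_idx, end_idx) edges.

    node_index maps (x, y) -> integer id and is extended in place for
    vertices that have not been seen before.
    """
    for xy in coords:
        node_index.setdefault(xy, len(node_index))
    return [(node_index[a], node_index[b]) for a, b in zip(coords, coords[1:])]


node_index = {}
first = coords_to_edges([(0, 0), (1, 0), (1, 2)], node_index)
# The second line starts at an existing vertex, so that node id is reused.
second = coords_to_edges([(1, 2), (3, 2)], node_index)
```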

solaris.vector.graph.parallel_linestring_to_path(feature)[source]

Read in a feature line from a fiona-opened shapefile and get the edges.

Parameters

feature (dict) – An item from a fiona.open iterable with the key 'geometry' containing shapely.geometry.line.LineStrings or shapely.geometry.line.MultiLineStrings.

Returns

A list of Paths containing all edges in the LineString or MultiLineString.

Notes

This function depends on node_series and valid_road_types, which are passed by an initializer as items in var_dict.

solaris.vector.mask vector <-> training mask interconversion

solaris.vector.mask.boundary_mask(footprint_msk=None, out_file=None, reference_im=None, boundary_width=3, boundary_type='inner', burn_value=255, **kwargs)[source]

Convert a dataframe of geometries to a pixel mask.

Note

This function requires creation of a footprint mask before it can operate; therefore, if there is no footprint mask already present, it will create one. In that case, additional arguments for footprint_mask() (e.g. df) must be passed.

By default, this function draws boundaries within the edges of objects. To change this behavior, use the boundary_type argument.

Parameters
  • footprint_msk (numpy.array, optional) – A filled in footprint mask created using footprint_mask(). If not provided, one will be made by calling footprint_mask() before creating the boundary mask, and the required arguments for that function must be provided as kwargs.

  • out_file (str, optional) – Path to an image file to save the output to. Must be compatible with rasterio.DatasetReader. If provided, a reference_im must be provided (for metadata purposes).

  • reference_im (rasterio.DatasetReader or str, optional) – An image to extract necessary coordinate information from: the affine transformation matrix, the image extent, etc. If provided, affine_obj and shape are ignored.

  • boundary_width (int, optional) – The width of the boundary to be created in pixels. Defaults to 3.

  • boundary_type ("inner" or "outer", optional) – Where to draw the boundaries: within the object ("inner") or outside of it ("outer"). Defaults to "inner".

  • burn_value (int, optional) – The value to use for labeling objects in the mask. Defaults to 255 (the max value for uint8 arrays). The mask array will be set to the same dtype as burn_value. Ignored if burn_field is provided.

  • **kwargs (optional) – Additional arguments to pass to footprint_mask() if one needs to be created.

Returns

boundary_mask – A pixel mask with 0s for non-object pixels and the same value as the footprint mask burn_value for the boundaries of each object.

Return type

numpy.array
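
The difference between "inner" and "outer" boundaries can be seen on a tiny mask. The sketch below is a stdlib-only illustration for a one-pixel-wide boundary using 4-connected neighbors. The function name boundary is hypothetical, the mask is a list of lists rather than a numpy array, and the neighbor-based approach is an assumption, not the method solaris uses.

```python
def _neighbours(r, c):
    """4-connected neighbors of pixel (r, c)."""
    return [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]


def boundary(mask, boundary_type="inner", burn_value=255):
    """One-pixel-wide boundary of a binary footprint mask (list of lists)."""
    rows, cols = len(mask), len(mask[0])

    def filled(r, c):
        # Pixels outside the array are treated as background.
        return 0 <= r < rows and 0 <= c < cols and mask[r][c] != 0

    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            touches_bg = any(not filled(*n) for n in _neighbours(r, c))
            touches_fg = any(filled(*n) for n in _neighbours(r, c))
            if boundary_type == "inner" and filled(r, c) and touches_bg:
                out[r][c] = burn_value  # object pixel on the object's rim
            elif boundary_type == "outer" and not filled(r, c) and touches_fg:
                out[r][c] = burn_value  # background pixel hugging the object
    return out


footprint = [
    [0, 0, 0, 0, 0],
    [0, 255, 255, 255, 0],
    [0, 255, 255, 255, 0],
    [0, 255, 255, 255, 0],
    [0, 0, 0, 0, 0],
]
inner = boundary(footprint, "inner")
outer = boundary(footprint, "outer")
```

With "inner", the ring of object pixels is labeled and the object's center stays 0; with "outer", the labeled pixels lie just outside the object.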

solaris.vector.mask.buffer_df_geoms(df, buffer, meters=False, reference_im=None, geom_col='geometry', affine_obj=None)[source]

Buffer geometries within a pd.DataFrame or gpd.GeoDataFrame.

Parameters
  • df (pandas.DataFrame or geopandas.GeoDataFrame) – A pandas.DataFrame or geopandas.GeoDataFrame instance with a column containing geometries (identified by geom_col). If df lacks a crs attribute (isn’t a geopandas.GeoDataFrame) and meters=True, then reference_im must be provided for georeferencing.

  • buffer (int or float) – The amount to buffer the geometries in df. In pixel units unless meters=True. This corresponds to width/2 in mask creation functions.

  • meters (bool, optional) – Should buffers be in pixel units (default) or metric units (if meters is True)?

  • reference_im (str or rasterio.DatasetReader, optional) – The path to a reference image covering the same geographic extent as the area labeled in df. Provided for georeferencing of pixel coordinate geometries in df or conversion of georeferenced geometries to pixel coordinates as needed. Required if meters is True and df lacks a crs attribute.

  • geom_col (str, optional) – The column containing geometries in df. Defaults to "geometry".

  • affine_obj (list or affine.Affine, optional) – Affine transformation to use to convert geoms in df from a geographic crs to pixel space. Only provide this argument if df is a geopandas.GeoDataFrame with coordinates in a georeferenced coordinate space. Ignored if reference_im is provided.

Returns

buffered_df – A pandas.DataFrame in the original coordinate reference system with objects buffered per buffer.

Return type

pandas.DataFrame

See also

road_mask

Function to create road network masks.

contact_mask

Function to create masks of contact points between polygons.

solaris.vector.mask.contact_mask(df, contact_spacing=10, meters=False, out_file=None, reference_im=None, geom_col='geometry', do_transform=None, affine_obj=None, shape=(900, 900), out_type='int', burn_value=255)[source]

Create a pixel mask labeling closely juxtaposed objects.

Notes

This function identifies pixels in an image that do not correspond to objects, but fall within contact_spacing of more than one labeled object.

Parameters
  • df (pandas.DataFrame or geopandas.GeoDataFrame) – A pandas.DataFrame or geopandas.GeoDataFrame instance with a column containing geometries (identified by geom_col). If the geometries in df are not in pixel coordinates, then affine_obj or reference_im must be passed to provide the transformation to convert.

  • contact_spacing (int or float, optional) – The desired maximum distance between adjacent polygons to be labeled as contact. Will be in pixel units unless meters=True is provided.

  • meters (bool, optional) – Should contact_spacing be defined in units of meters? Defaults to no (False). If True and df is not in a CRS with metric units, the function will attempt to transform to the relevant CRS using df.to_crs() (if df is a geopandas.GeoDataFrame) or using the data provided in reference_im (if not).

  • out_file (str, optional) – Path to an image file to save the output to. Must be compatible with rasterio.DatasetReader. If provided, a reference_im must be provided (for metadata purposes).

  • reference_im (rasterio.DatasetReader or str, optional) – An image to extract necessary coordinate information from: the affine transformation matrix, the image extent, etc. If provided, affine_obj and shape are ignored.

  • geom_col (str, optional) – The column containing geometries in df. Defaults to "geometry".

  • do_transform (bool, optional) – Should the values in df be transformed from geospatial coordinates to pixel coordinates? Defaults to None, in which case the function attempts to infer whether or not a transformation is required based on the presence or absence of a CRS in df. If True, either reference_im or affine_obj must be provided as a source for the required affine transformation matrix.

  • affine_obj (list or affine.Affine, optional) – Affine transformation to use to convert from geo coordinates to pixel space. Only provide this argument if df is a geopandas.GeoDataFrame with coordinates in a georeferenced coordinate space. Ignored if reference_im is provided.

  • shape (tuple, optional) – An (x_size, y_size) tuple defining the pixel extent of the output mask. Ignored if reference_im is provided.

  • out_type ('float' or 'int') –

  • burn_value (int or float, optional) – The value to use for labeling objects in the mask. Defaults to 255 (the max value for uint8 arrays). The mask array will be set to the same dtype as burn_value.

Returns

output_arr – A pixel mask with burn_value at contact points between polygons.

Return type

numpy.array
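
The contact definition in the Notes above (background pixels within contact_spacing of more than one labeled object) can be sketched on a small label grid. This is a brute-force, stdlib-only illustration: contact_pixels is a hypothetical name, the input is a grid of integer object ids rather than a dataframe of geometries, and distances are Euclidean between pixel centers, all of which are assumptions for the example.

```python
def contact_pixels(labels, contact_spacing, burn_value=255):
    """Mark background pixels lying within contact_spacing of >1 object.

    labels is a list of lists holding 0 for background and a distinct
    positive integer for each object. Distance is Euclidean, in pixels.
    """
    rows, cols = len(labels), len(labels[0])
    objects = [(r, c, labels[r][c])
               for r in range(rows) for c in range(cols) if labels[r][c]]
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                continue  # object pixels are never contact pixels
            near = {lab for (orow, ocol, lab) in objects
                    if (orow - r) ** 2 + (ocol - c) ** 2 <= contact_spacing ** 2}
            if len(near) > 1:
                out[r][c] = burn_value
    return out


labels = [
    [1, 1, 0, 0, 2, 2, 0, 0, 0],
    [1, 1, 0, 0, 2, 2, 0, 0, 0],
]
contact = contact_pixels(labels, contact_spacing=2)
```

Only the two-pixel gap between objects 1 and 2 is labeled; the background to the right of object 2 is near a single object and stays 0.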

solaris.vector.mask.crs_is_metric(gdf)[source]

Check if a GeoDataFrame’s CRS is in metric units.

solaris.vector.mask.df_to_px_mask(df, channels=['footprint'], out_file=None, reference_im=None, geom_col='geometry', do_transform=None, affine_obj=None, shape=(900, 900), out_type='int', burn_value=255, **kwargs)[source]

Convert a dataframe of geometries to a pixel mask.

Parameters
  • df (pandas.DataFrame or geopandas.GeoDataFrame) – A pandas.DataFrame or geopandas.GeoDataFrame instance with a column containing geometries (identified by geom_col). If the geometries in df are not in pixel coordinates, then affine_obj or reference_im must be passed to provide the transformation to convert.

  • channels (list, optional) –

    The mask channels to generate. There are three values that this can contain:

    • "footprint": Create a full footprint mask, with 0s at pixels that don’t fall within geometries and burn_value at pixels that do.

    • "boundary": Create a mask with geometries outlined. Use boundary_width to set how thick the boundary will be drawn.

    • "contact": Create a mask with regions between two or more closely juxtaposed geometries labeled. Use contact_spacing to set the maximum spacing between polygons to be labeled.

    Each channel corresponds to its own plane in the output.

  • out_file (str, optional) – Path to an image file to save the output to. Must be compatible with rasterio.DatasetReader. If provided, a reference_im must be provided (for metadata purposes).

  • reference_im (rasterio.DatasetReader or str, optional) – An image to extract necessary coordinate information from: the affine transformation matrix, the image extent, etc. If provided, affine_obj and shape are ignored.

  • geom_col (str, optional) – The column containing geometries in df. Defaults to "geometry".

  • do_transform (bool, optional) – Should the values in df be transformed from geospatial coordinates to pixel coordinates? Defaults to None, in which case the function attempts to infer whether or not a transformation is required based on the presence or absence of a CRS in df. If True, either reference_im or affine_obj must be provided as a source for the required affine transformation matrix.

  • affine_obj (list or affine.Affine, optional) – Affine transformation to use to convert from geo coordinates to pixel space. Only provide this argument if df is a geopandas.GeoDataFrame with coordinates in a georeferenced coordinate space. Ignored if reference_im is provided.

  • shape (tuple, optional) – An (x_size, y_size) tuple defining the pixel extent of the output mask. Ignored if reference_im is provided.

  • burn_value (int or float) – The value to use for labeling objects in the mask. Defaults to 255 (the max value for uint8 arrays). The mask array will be set to the same dtype as burn_value.

  • kwargs – Additional arguments to pass to boundary_mask or contact_mask. See those functions for requirements.

Returns

mask – A pixel mask with 0s for non-object pixels and burn_value at object pixels. mask dtype will coincide with burn_value. Shape will be (shape[0], shape[1], len(channels)), with channels ordered per the provided channels list.

Return type

numpy.array
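
The channel ordering of the returned mask can be pictured by stacking per-channel 2-D masks along a last axis. The sketch below uses nested lists in place of a numpy array; stack_channels and the tiny hand-written masks are hypothetical and only illustrate the (rows, columns, len(channels)) layout described above.

```python
def stack_channels(channel_masks, channels):
    """Stack 2-D masks into an H x W x C nested list, ordered per channels."""
    first = channel_masks[channels[0]]
    return [[[channel_masks[name][r][c] for name in channels]
             for c in range(len(first[0]))]
            for r in range(len(first))]


# Hand-written 2 x 2 masks standing in for real footprint/boundary output.
masks = {
    "footprint": [[255, 0], [255, 255]],
    "boundary": [[255, 0], [0, 255]],
}
stacked = stack_channels(masks, ["footprint", "boundary"])
```

Indexing stacked[r][c] yields one value per requested channel, in the order the channel names were given.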

solaris.vector.mask.footprint_mask(df, out_file=None, reference_im=None, geom_col='geometry', do_transform=None, affine_obj=None, shape=(900, 900), out_type='int', burn_value=255, burn_field=None)[source]

Convert a dataframe of geometries to a pixel mask.

Parameters
  • df (pandas.DataFrame or geopandas.GeoDataFrame) – A pandas.DataFrame or geopandas.GeoDataFrame instance with a column containing geometries (identified by geom_col). If the geometries in df are not in pixel coordinates, then affine_obj or reference_im must be passed to provide the transformation to convert.

  • out_file (str, optional) – Path to an image file to save the output to. Must be compatible with rasterio.DatasetReader. If provided, a reference_im must be provided (for metadata purposes).

  • reference_im (rasterio.DatasetReader or str, optional) – An image to extract necessary coordinate information from: the affine transformation matrix, the image extent, etc. If provided, affine_obj and shape are ignored.

  • geom_col (str, optional) – The column containing geometries in df. Defaults to "geometry".

  • do_transform (bool, optional) – Should the values in df be transformed from geospatial coordinates to pixel coordinates? Defaults to None, in which case the function attempts to infer whether or not a transformation is required based on the presence or absence of a CRS in df. If True, either reference_im or affine_obj must be provided as a source for the required affine transformation matrix.

  • affine_obj (list or affine.Affine, optional) – Affine transformation to use to convert from geo coordinates to pixel space. Only provide this argument if df is a geopandas.GeoDataFrame with coordinates in a georeferenced coordinate space. Ignored if reference_im is provided.

  • shape (tuple, optional) – An (x_size, y_size) tuple defining the pixel extent of the output mask. Ignored if reference_im is provided.

  • out_type ('float' or 'int') –

  • burn_value (int or float, optional) – The value to use for labeling objects in the mask. Defaults to 255 (the max value for uint8 arrays). The mask array will be set to the same dtype as burn_value. Ignored if burn_field is provided.

  • burn_field (str, optional) – Name of a column in df that provides values for burn_value for each independent object. If provided, burn_value is ignored.

Returns

mask – A pixel mask with 0s for non-object pixels and burn_value at object pixels. mask dtype will coincide with burn_value.

Return type

numpy.array

solaris.vector.mask.geojsons_to_masks_and_fill_nodata(rtiler, vtiler, label_tile_dir, fill_value=0)[source]

Converts tiled vectors to raster labels and fills nodata values in raster and vector tiles.

This function must be run after a raster tiler and vector tiler have already been initialized and the .tile() method for each has been called to generate raster and vector tiles. Geojson labels are first converted to rasterized masks, then the labels are set to 0 where the reference image, the corresponding image tile, has nodata values. Then, nodata areas in the image tile are filled in place with the fill_value. Only works for rasterizing all geometries as a single category with a burn value of 1. See test_tiler_fill_nodata in tests/test_tile/test_tile.py for an example.

Parameters
  • rtiler (RasterTiler) – The RasterTiler that has had its .tile() method called.

  • vtiler (VectorTiler) – The VectorTiler that has had its .tile() method called.

  • label_tile_dir (str) – The folder path to save rasterized labels. This is created if it doesn’t already exist.

  • fill_value (str, optional) – The value to use to fill nodata values in images. Defaults to 0.

Returns

rasterized_label_paths – A list of the paths to the rasterized instance masks.

Return type

list
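
The two nodata steps described above (zero the labels where the image tile has nodata, then fill the image's nodata pixels with fill_value) can be sketched on small grids. This is a stdlib-only illustration using lists of lists; mask_and_fill_nodata and the nodata value -9999 are hypothetical and do not reflect how the tilers store their data.

```python
def mask_and_fill_nodata(image, label, nodata, fill_value=0):
    """Zero labels where the image is nodata, then fill the image in place.

    image and label are equally sized lists of lists; nodata is the image's
    nodata value. Returns the updated label; image is modified in place.
    """
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            if value == nodata:
                label[r][c] = 0      # no label where there is no imagery
                row[c] = fill_value  # replace nodata in the image tile
    return label


image = [[12, -9999, 40], [-9999, 7, 9]]
label = [[1, 1, 0], [0, 1, 1]]
mask_and_fill_nodata(image, label, nodata=-9999, fill_value=0)
```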

solaris.vector.mask.instance_mask(df, out_file=None, reference_im=None, geom_col='geometry', do_transform=None, affine_obj=None, shape=(900, 900), out_type='int', burn_value=255, burn_field=None, nodata_value=0)[source]

Convert a dataframe of geometries to a pixel mask.

Parameters
  • df (pandas.DataFrame or geopandas.GeoDataFrame) – A pandas.DataFrame or geopandas.GeoDataFrame instance with a column containing geometries (identified by geom_col). If the geometries in df are not in pixel coordinates, then affine_obj or reference_im must be passed to provide the transformation to convert.

  • out_file (str, optional) – Path to an image file to save the output to. Must be compatible with rasterio.DatasetReader. If provided, a reference_im must be provided (for metadata purposes).

  • reference_im (rasterio.DatasetReader or str, optional) – An image to extract necessary coordinate information from: the affine transformation matrix, the image extent, etc. If provided, affine_obj and shape are ignored.

  • geom_col (str, optional) – The column containing geometries in df. Defaults to "geometry".

  • do_transform (bool, optional) – Should the values in df be transformed from geospatial coordinates to pixel coordinates? Defaults to None, in which case the function attempts to infer whether or not a transformation is required based on the presence or absence of a CRS in df. If True, either reference_im or affine_obj must be provided as a source for the required affine transformation matrix.

  • affine_obj (list or affine.Affine, optional) – Affine transformation to use to convert from geo coordinates to pixel space. Only provide this argument if df is a geopandas.GeoDataFrame with coordinates in a georeferenced coordinate space. Ignored if reference_im is provided.

  • shape (tuple, optional) – An (x_size, y_size) tuple defining the pixel extent of the output mask. Ignored if reference_im is provided.

  • out_type ('float' or 'int') –

  • burn_value (int or float, optional) – The value to use for labeling objects in the mask. Defaults to 255 (the max value for uint8 arrays). The mask array will be set to the same dtype as burn_value. Ignored if burn_field is provided.

  • burn_field (str, optional) – Name of a column in df that provides values for burn_value for each independent object. If provided, burn_value is ignored.

  • nodata_value (int or float, optional) – The value to use for nodata pixels in the mask. Defaults to 0 (the min value for uint8 arrays). Used if the reference_im nodata value is a float. Ignored if the reference_im nodata value is an int or if reference_im is not used. Take care when visualizing these masks: labels may not be displayed if the software automatically masks nodata values.

Returns

mask – A pixel mask with 0s for non-object pixels and burn_value at object pixels. mask dtype will coincide with burn_value.

Return type

numpy.array
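
The precedence between burn_value and burn_field described above can be sketched in a few lines of plain Python. This is only an illustration of the documented rule, not solaris code: the helper burn_values_sketch, the building_id column, and the row dictionaries standing in for dataframe rows are hypothetical.

```python
def burn_values_sketch(rows, burn_value=255, burn_field=None):
    """Hypothetical helper: pick the value burned into the mask for each object."""
    if burn_field is not None:
        # burn_field wins: each object carries its own value and burn_value is ignored
        return [row[burn_field] for row in rows]
    # otherwise every object is burned with the same burn_value
    return [burn_value for _ in rows]

# Stand-ins for dataframe rows; the geometries are omitted for brevity.
rows = [
    {"geometry": None, "building_id": 3},
    {"geometry": None, "building_id": 7},
]
uniform = burn_values_sketch(rows)
per_object = burn_values_sketch(rows, burn_field="building_id")
```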

solaris.vector.mask.mask_to_poly_geojson(pred_arr, channel_scaling=None, reference_im=None, output_path=None, output_type='geojson', min_area=40, bg_threshold=0, do_transform=None, simplify=False, tolerance=0.5, **kwargs)[source]

Get polygons from an image mask.

Parameters
  • pred_arr (numpy.ndarray) – A 2D array of integers. Multi-channel masks are not supported, and must be simplified before passing to this function. An image file path can also be passed here.

  • channel_scaling (list-like, optional) –

    If pred_arr is a 3D array, this argument defines how each channel will be combined to generate a binary output. channel_scaling should be a list-like of length equal to the number of channels in pred_arr. The following operation will be performed to convert the multi-channel prediction to a 2D output:

    sum(pred_arr[channel]*channel_scaling[channel])
    

    If not provided, no scaling will be performed and channels will be summed.

  • reference_im (str, optional) – The path to a reference geotiff to use for georeferencing the polygons in the mask. Required if saving to a GeoJSON (see the output_type argument), otherwise only required if do_transform=True.

  • output_path (str, optional) – Path to save the output file to. If not provided, no file is saved.

  • output_type ('csv' or 'geojson', optional) – If output_path is provided, this argument defines what type of file will be generated - a CSV (output_type='csv') or a geojson (output_type='geojson').

  • min_area (int, optional) – The minimum area of a polygon to retain. Filtering is done AFTER any coordinate transformation, and therefore will be in destination units.

  • bg_threshold (int, optional) – The cutoff in pred_arr that denotes background (non-object). Defaults to 0.

  • simplify (bool, optional) – If True, will use the Douglas-Peucker algorithm to simplify edges, saving memory and processing time later. Defaults to False.

  • tolerance (float, optional) – The tolerance value to use for simplification with the Douglas-Peucker algorithm. Defaults to 0.5. Only has an effect if simplify=True.

Returns

gdf – A GeoDataFrame of polygons.

Return type

geopandas.GeoDataFrame
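
To make the effect of tolerance concrete, here is a standalone sketch of the Douglas-Peucker algorithm named in the simplify and tolerance descriptions. It is not taken from solaris: the function names are hypothetical, it works on an open polyline given as (x, y) tuples rather than on polygon rings, and it measures distance to the chord as a segment, which is one common variant.

```python
import math

def _point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Projection of p onto the segment, clamped to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def douglas_peucker(points, tolerance):
    """Drop interior points that lie within tolerance of the simplified line."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    distances = [_point_segment_distance(p, first, last) for p in points[1:-1]]
    max_dist = max(distances)
    if max_dist <= tolerance:
        return [first, last]          # every interior point is close enough
    split = distances.index(max_dist) + 1
    left = douglas_peucker(points[:split + 1], tolerance)
    right = douglas_peucker(points[split:], tolerance)
    return left[:-1] + right          # avoid duplicating the split point

line = [(0, 0), (1, 0.1), (2, 0), (3, 2), (4, 0)]
simplified = douglas_peucker(line, tolerance=0.5)
```

With tolerance=0.5 the small bump at (1, 0.1) is removed while the spike at (3, 2) is kept; a larger tolerance removes more vertices.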

solaris.vector.mask.preds_to_binary(pred_arr, channel_scaling=None, bg_threshold=0)[source]

Convert a set of predictions from a neural net to a binary mask.

Parameters
  • pred_arr (numpy.ndarray) – A set of predictions generated by a neural net (generally in float dtype). This can be a 2D array or a 3D array, in which case it will be converted to a 2D mask output with optional channel scaling (see the channel_scaling argument). If a filename is provided instead of an array, the image will be loaded using scikit-image.

  • channel_scaling (list-like of `float`s, optional) –

    If pred_arr is a 3D array, this argument defines how each channel will be combined to generate a binary output. channel_scaling should be a list-like of length equal to the number of channels in pred_arr. The following operation will be performed to convert the multi-channel prediction to a 2D output:

    sum(pred_arr[channel]*channel_scaling[channel])
    

    If not provided, no scaling will be performed and channels will be summed.

  • bg_threshold (int or float, optional) – The cutoff to set to distinguish between background and foreground pixels in the final binary mask. Binarization takes place after channel scaling and summation (if applicable). Defaults to 0.

Returns

mask_arr – A 2D boolean numpy array with True for foreground pixels and False for background.

Return type

numpy.ndarray
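
The scale, sum, and threshold sequence described above can be sketched with NumPy as follows. This is a hedged illustration, not the library's code: the function name preds_to_binary_sketch is hypothetical, channels are assumed to be on the last axis, and treating values strictly greater than bg_threshold as foreground is an assumption.

```python
import numpy as np

def preds_to_binary_sketch(pred_arr, channel_scaling=None, bg_threshold=0):
    """Hypothetical helper: weighted channel sum followed by thresholding."""
    pred_arr = np.asarray(pred_arr, dtype=float)
    if pred_arr.ndim == 3:
        if channel_scaling is None:
            # No scaling: every channel gets weight 1, so channels are simply summed.
            channel_scaling = np.ones(pred_arr.shape[-1])
        # Weighted sum over the channel axis (assumed to be the last axis).
        weights = np.asarray(channel_scaling, dtype=float)
        pred_arr = (pred_arr * weights).sum(axis=-1)
    # Assumption: foreground is strictly above the threshold.
    return pred_arr > bg_threshold

# Toy prediction with shape (rows, cols, channels) = (2, 2, 2).
preds = np.array([[[0.9, 0.2], [0.1, 0.4]],
                  [[0.6, 0.7], [0.0, 0.0]]])
mask = preds_to_binary_sketch(preds, channel_scaling=[1, -1], bg_threshold=0)
```

A negative weight, as in this example, lets one channel suppress another before binarization.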

solaris.vector.mask.road_mask(df, width=4, meters=False, out_file=None, reference_im=None, geom_col='geometry', do_transform=None, affine_obj=None, shape=(900, 900), out_type='int', burn_value=255, burn_field=None, min_background_value=None, verbose=False)[source]

Convert a dataframe of geometries to a pixel mask.

Parameters
  • df (pandas.DataFrame or geopandas.GeoDataFrame) – A pandas.DataFrame or geopandas.GeoDataFrame instance with a column containing geometries (identified by geom_col). If the geometries in df are not in pixel coordinates, then affine_obj or reference_im must be passed to provide the transformation to convert.

  • width (float or int, optional) – The total width to make a road (i.e. twice x if using road.buffer(x)). In pixel units unless meters is True.

  • meters (bool, optional) – Should width be defined in units of meters? Defaults to no (False). If True and df is not in a CRS with metric units, the function will attempt to transform to the relevant CRS using df.to_crs() (if df is a geopandas.GeoDataFrame) or using the data provided in reference_im (if not).

  • out_file (str, optional) – Path to an image file to save the output to. Must be compatible with rasterio.DatasetReader. If provided, a reference_im must be provided (for metadata purposes).

  • reference_im (rasterio.DatasetReader or str, optional) – An image to extract necessary coordinate information from: the affine transformation matrix, the image extent, etc. If provided, affine_obj and shape are ignored.

  • geom_col (str, optional) – The column containing geometries in df. Defaults to "geometry".

  • do_transform (bool, optional) – Should the values in df be transformed from geospatial coordinates to pixel coordinates? Defaults to None, in which case the function attempts to infer whether or not a transformation is required based on the presence or absence of a CRS in df. If True, either reference_im or affine_obj must be provided as a source for the required affine transformation matrix.

  • affine_obj (list or affine.Affine, optional) – Affine transformation to use to convert from geo coordinates to pixel space. Only provide this argument if df is a geopandas.GeoDataFrame with coordinates in a georeferenced coordinate space. Ignored if reference_im is provided.

  • shape (tuple, optional) – An (x_size, y_size) tuple defining the pixel extent of the output mask. Ignored if reference_im is provided.

  • out_type ('float' or 'int') –

  • burn_value (int or float, optional) – The value to use for labeling objects in the mask. Defaults to 255 (the max value for uint8 arrays). The mask array will be set to the same dtype as burn_value. Ignored if burn_field is provided.

  • burn_field (str, optional) – Name of a column in df that provides values for burn_value for each independent object. If provided, burn_value is ignored.

  • min_background_value (int, optional) – Minimum value for the mask background. Ignored if None. Defaults to None.

  • verbose (bool, optional) – Switch to print relevant values. Defaults to False.

Returns

mask – A pixel mask with 0s for non-object pixels and burn_value at object pixels. mask dtype will coincide with burn_value.

Return type

numpy.array
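
The relationship between width and the buffer distance (width is twice the x in road.buffer(x)) can be illustrated with a small NumPy sketch. This is not how solaris rasterizes roads: the helper road_mask_sketch, its size argument, and the rule of burning pixels whose centers fall within width / 2 of the centerline are assumptions made for the example, and the centerline is a single straight segment in pixel coordinates.

```python
import numpy as np

def road_mask_sketch(start, end, width=4, size=8, burn_value=255):
    """Hypothetical helper: burn pixels whose centers lie within width / 2 of a segment."""
    (x0, y0), (x1, y1) = start, end
    half_width = width / 2.0                 # the x in road.buffer(x)
    rows, cols = np.mgrid[0:size, 0:size]
    px, py = cols + 0.5, rows + 0.5          # pixel-center coordinates
    dx, dy = x1 - x0, y1 - y0                # assumes start and end differ
    # Position of the closest point along the segment, clamped to its ends.
    t = np.clip(((px - x0) * dx + (py - y0) * dy) / (dx * dx + dy * dy), 0.0, 1.0)
    dist = np.hypot(px - (x0 + t * dx), py - (y0 + t * dy))
    return np.where(dist <= half_width, burn_value, 0).astype(np.uint8)

# A horizontal centerline across an 8 x 8 mask, with a total road width of 4 pixels.
mask = road_mask_sketch(start=(0, 4), end=(8, 4), width=4)
```

The resulting mask has four burned rows centered on the line, so the road is 4 pixels wide in total, 2 on each side of the centerline.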