solaris.vector
API reference¶
solaris.vector.polygon
vector polygon management¶
-
solaris.vector.polygon.
affine_transform_gdf
(gdf, affine_obj, inverse=False, geom_col='geometry', precision=None)[source]¶ Perform an affine transformation on a GeoDataFrame.
- Parameters
gdf (geopandas.GeoDataFrame, pandas.DataFrame, or str) – A GeoDataFrame, a pandas DataFrame with a "geometry" column (or a different column containing geometries, identified by geom_col; note that this column will be renamed "geometry" for ease of use with geopandas), or the path to a saved file in .geojson or .csv format.
affine_obj (list or affine.Affine) – An affine transformation to apply to the geometries in gdf, in the form of an [a, b, d, e, xoff, yoff] list or an affine.Affine object.
inverse (bool, optional) – Use this argument to perform the inverse transformation.
geom_col (str, optional) – The column in gdf corresponding to the geometry. Defaults to 'geometry'.
precision (int, optional) – Decimal precision to round the geometries to. If not provided, no rounding is performed.
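The [a, b, d, e, xoff, yoff] ordering matches the coefficient order documented for shapely.affinity.affine_transform, where x' = a*x + b*y + xoff and y' = d*x + e*y + yoff. The following is a minimal pure-Python sketch of that arithmetic on plain (x, y) tuples. The helper name apply_affine_list and the sample coefficients are illustrative assumptions and are not part of solaris.

```python
def apply_affine_list(coords, affine_list):
    """Apply an [a, b, d, e, xoff, yoff] transform to a list of (x, y) tuples."""
    a, b, d, e, xoff, yoff = affine_list
    return [(a * x + b * y + xoff, d * x + e * y + yoff) for x, y in coords]


# Hypothetical georeferencing: 0.5-unit pixels, upper-left corner at (1000, 2000),
# y decreasing as the row index increases (typical for north-up rasters).
affine_list = [0.5, 0.0, 0.0, -0.5, 1000.0, 2000.0]
pixel_square = [(0, 0), (10, 0), (10, 10), (0, 10)]
geo_square = apply_affine_list(pixel_square, affine_list)
```

With these assumed coefficients, the pixel corner (10, 10) lands at (1005.0, 1995.0).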
-
solaris.vector.polygon.
convert_poly_coords
(geom, raster_src=None, affine_obj=None, inverse=False, precision=None)[source]¶ Georegister geometry objects currently in pixel coords or vice versa.
- Parameters
geom (shapely.geometry.shape or str) – A shapely.geometry.shape, or WKT string-formatted geometry object currently in pixel coordinates.
raster_src (str, optional) – Path to a raster image with georeferencing data to apply to geom. Alternatively, an opened rasterio.Band object or osgeo.gdal.Dataset object can be provided. Required if not using affine_obj.
affine_obj (list or affine.Affine) – An affine transformation to apply to geom in the form of an [a, b, d, e, xoff, yoff] list or an affine.Affine object. Required if not using raster_src.
inverse (bool, optional) – If true, will perform the inverse affine transformation, going from geospatial coordinates to pixel coordinates.
precision (int, optional) – Decimal precision for the polygon output. If not provided, rounding is skipped.
- Returns
A geometry in the same format as the input with its coordinate system transformed to match the destination object.
- Return type
out_geom
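To illustrate what inverse and precision mean for a coordinate transformation, here is a minimal pure-Python sketch that inverts a 2D affine transformation given as an [a, b, d, e, xoff, yoff] list and optionally rounds the result. The function names and sample values are hypothetical; this shows the underlying arithmetic, not the solaris implementation.

```python
def invert_affine_list(affine_list):
    """Return the inverse of an [a, b, d, e, xoff, yoff] affine transform."""
    a, b, d, e, xoff, yoff = affine_list
    det = a * e - b * d
    if det == 0:
        raise ValueError("affine transformation is not invertible")
    ia, ib = e / det, -b / det
    id_, ie = -d / det, a / det
    # The inverse offsets undo the forward translation in the inverted basis.
    ixoff = -(ia * xoff + ib * yoff)
    iyoff = -(id_ * xoff + ie * yoff)
    return [ia, ib, id_, ie, ixoff, iyoff]


def transform_coords(coords, affine_list, inverse=False, precision=None):
    """Transform (x, y) tuples; optionally invert the transform and round."""
    if inverse:
        affine_list = invert_affine_list(affine_list)
    a, b, d, e, xoff, yoff = affine_list
    out = [(a * x + b * y + xoff, d * x + e * y + yoff) for x, y in coords]
    if precision is not None:
        out = [(round(x, precision), round(y, precision)) for x, y in out]
    return out


# Assumed transform: 0.5-unit pixels with the origin at (1000, 2000).
aff = [0.5, 0.0, 0.0, -0.5, 1000.0, 2000.0]
geo = transform_coords([(10, 10)], aff)                         # pixel -> geo
px = transform_coords(geo, aff, inverse=True, precision=3)      # geo -> pixel
```

Round-tripping through the forward and inverse transforms recovers the original pixel coordinates.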
-
solaris.vector.polygon.
gdf_to_yolo
(geodataframe, image, output_dir, column='single_id', im_size=(0, 0), min_overlap=0.66, remove_no_labels=1)[source]¶ Convert a geodataframe containing polygons to yolo/yolt format.
- Parameters
geodataframe (str) – Path to a geopandas.GeoDataFrame with a column named 'geometry'. Can be created from a geojson with labels for unique objects. Can be converted to this format with geodataframe=gpd.read_file("./xView_30.geojson").
image (str) – Path to a georeferenced image (i.e. a GeoTIFF or png created with GDAL) that geolocates to the same geography as the geojson(s). If a directory, the bounds of each GeoTIFF will be loaded in and all overlapping geometries will be transformed. This function will also accept an osgeo.gdal.Dataset or rasterio.DatasetReader with georeferencing information in this argument.
output_dir (str) – Path to an output directory where all of the yolo readable text files will be placed.
column (str, optional) – The column name that contains a unique integer id for each object class.
im_size (tuple, optional) – A tuple specifying the x and y size of an image. If specified as (0,0) (the default), then the size is determined automatically.
min_overlap (float, optional) – A float value ranging from 0 to 1, interpreted as a fraction of the polygon. If a polygon does not overlap the image by at least min_overlap, the polygon is discarded (e.g. 0.66 = 66%). Defaults to 0.66.
remove_no_labels (int, optional) – An int value of 0 or 1. If 1, any image not containing any objects will be moved to a directory in the same root path as your input image. If 0, no images will be moved. Default value of 1.
- Returns
gdf – The txt file will be written to the output_dir; however, the output gdf itself is also returned.
- Return type
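As a rough illustration of the label format and of the min_overlap filter, the sketch below converts pixel bounding boxes into lines following the common YOLO convention (class id, then box center and size normalized by the image size). It works on axis-aligned boxes rather than polygons, and every name in it is hypothetical, so treat it as an approximation of the idea rather than what gdf_to_yolo does internally.

```python
def overlap_fraction(bbox, im_size):
    """Fraction of a (xmin, ymin, xmax, ymax) box that lies inside the image."""
    xmin, ymin, xmax, ymax = bbox
    im_w, im_h = im_size
    clipped_w = max(0.0, min(xmax, im_w) - max(xmin, 0.0))
    clipped_h = max(0.0, min(ymax, im_h) - max(ymin, 0.0))
    area = (xmax - xmin) * (ymax - ymin)
    return (clipped_w * clipped_h) / area if area > 0 else 0.0


def bbox_to_yolo_line(class_id, bbox, im_size):
    """Format one box as 'class x_center y_center width height' (normalized)."""
    xmin, ymin, xmax, ymax = bbox
    im_w, im_h = im_size
    x_center = (xmin + xmax) / 2 / im_w
    y_center = (ymin + ymax) / 2 / im_h
    width = (xmax - xmin) / im_w
    height = (ymax - ymin) / im_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


def boxes_to_yolo_lines(boxes, im_size, min_overlap=0.66):
    """Drop boxes that overlap the image too little, clip the rest, format them."""
    im_w, im_h = im_size
    lines = []
    for class_id, bbox in boxes:
        if overlap_fraction(bbox, im_size) < min_overlap:
            continue  # discarded, mirroring the documented min_overlap rule
        xmin, ymin, xmax, ymax = bbox
        clipped = (max(xmin, 0), max(ymin, 0), min(xmax, im_w), min(ymax, im_h))
        lines.append(bbox_to_yolo_line(class_id, clipped, im_size))
    return lines


boxes = [(3, (100, 200, 300, 400)), (1, (-50, 0, 50, 100))]
lines = boxes_to_yolo_lines(boxes, im_size=(1000, 1000))
```

The second box is only half inside the assumed 1000 x 1000 image, so it falls below the 0.66 threshold and is dropped.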
-
solaris.vector.polygon.
geojson_to_px_gdf
(geojson, im_path, geom_col='geometry', precision=None, output_path=None, override_crs=False)[source]¶ Convert a geojson or set of geojsons from geo coords to px coords.
- Parameters
geojson (str) – Path to a geojson. This function will also accept a pandas.DataFrame or geopandas.GeoDataFrame with a column named 'geometry' in this argument.
im_path (str) – Path to a georeferenced image (i.e. a GeoTIFF) that geolocates to the same geography as the geojson(s). This function will also accept an osgeo.gdal.Dataset or rasterio.DatasetReader with georeferencing information in this argument.
geom_col (str, optional) – The column containing geometry in geojson. If not provided, defaults to "geometry".
precision (int, optional) – The decimal precision for output geometries. If not provided, the vertex locations won’t be rounded.
output_path (str, optional) – Path to save the resulting output to. If not provided, the object won’t be saved to disk.
override_crs (bool, optional) – Useful if the geojsons generated by the vector tiler or otherwise were saved out with a non-EPSG code projection. True sets the gdf CRS to that of the image; the inputs should have the same underlying projection for this to work. If False, and the gdf does not have an EPSG code, this function will fail.
- Returns
output_df – A pandas.DataFrame with all geometries in geojson that overlapped with the image at im_path converted to pixel coordinates. Additional columns are included with the filename of the source geojson (if available) and images for reference.
- Return type
-
solaris.vector.polygon.
georegister_px_df
(df, im_path=None, affine_obj=None, crs=None, geom_col='geometry', precision=None, output_path=None)[source]¶ Convert a dataframe of geometries in pixel coordinates to a geo CRS.
- Parameters
df (pandas.DataFrame) – A pandas.DataFrame with polygons in a column named "geometry".
im_path (str, optional) – A filename or rasterio.DatasetReader object containing an image that has the same bounds as the pixel coordinates in df. If not provided, affine_obj and crs must both be provided.
affine_obj (list or affine.Affine, optional) – An affine transformation to apply to the geometries in df, in the form of an [a, b, d, e, xoff, yoff] list or an affine.Affine object. Required if not using im_path.
crs (valid CRS str, int, or rasterio.crs.CRS instance) – The coordinate reference system for the output GeoDataFrame as an EPSG code integer. Required if not providing a raster image to extract the information from.
geom_col (str, optional) – The column containing geometry in df. If not provided, defaults to "geometry".
precision (int, optional) – The decimal precision for output geometries. If not provided, the vertex locations won’t be rounded.
output_path (str, optional) – Path to save the resulting output to. If not provided, the object won’t be saved to disk.
-
solaris.vector.polygon.
get_overlapping_subset
(gdf, im=None, bbox=None, bbox_crs=None)[source]¶ Extract a subset of geometries in a GeoDataFrame that overlap with im.
Notes
This function uses RTree’s spatialindex, which is much faster (but slightly less accurate) than direct comparison of each object for overlap.
- Parameters
gdf (geopandas.GeoDataFrame) – A geopandas.GeoDataFrame instance or a path to a geojson.
im (rasterio.DatasetReader or str, optional) – An image object loaded with rasterio or a path to a georeferenced image (i.e. a GeoTIFF).
bbox (list or shapely.geometry.Polygon, optional) – A bounding box (either a shapely.geometry.Polygon or a [bottom, left, top, right] list) from an image. Has no effect if im is provided (bbox is inferred from the image instead). If bbox is passed and im is not, a bbox_crs should be provided to ensure correct geolocation; if it isn’t, it will be assumed to have the same crs as gdf.
bbox_crs (int, optional) – The coordinate reference system that the bounding box is in as an EPSG int. If not provided, it’s assumed that the CRS is the same as im (if provided) or gdf (if not).
- Returns
output_gdf – A geopandas.GeoDataFrame with all geometries in gdf that overlapped with the image at im. Coordinates are kept in the CRS of gdf.
- Return type
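The speed/accuracy trade-off mentioned in the Notes comes from the fact that an R-tree spatial index compares bounding boxes rather than exact shapes. The pure-Python sketch below shows a geometry whose bounding box intersects an image extent even though the geometry itself lies outside it. The helpers and the (minx, miny, maxx, maxy) tuple layout are this sketch's own conventions, not part of solaris.

```python
def bounds(coords):
    """Bounding box (minx, miny, maxx, maxy) of a list of (x, y) vertices."""
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs), max(ys))


def bounds_intersect(a, b):
    """True if two (minx, miny, maxx, maxy) boxes touch or overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


image_bounds = (0, 0, 10, 10)

# A triangle sitting just beyond the image's upper-right corner: every point
# in it satisfies x + y >= 21, while the image only reaches x + y = 20.
triangle = [(9, 12), (12, 12), (12, 9)]
candidate = bounds_intersect(bounds(triangle), image_bounds)

# A square that is clearly far away is rejected by the same cheap test.
far_square = [(50, 50), (60, 50), (60, 60), (50, 60)]
rejected = not bounds_intersect(bounds(far_square), image_bounds)
```

Here the triangle is reported as a candidate by the bounding-box test although it does not truly overlap the image, which is the kind of false positive an index-only comparison can produce.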
-
solaris.vector.polygon.
remove_multipolygons
(gdf)[source]¶ Filters out rows of a geodataframe containing MultiPolygons and GeometryCollections.
This function is optionally used in geojson2coco. For instance segmentation, where objects are composed of single polygons, multi part geometries need to be either removed or inspected manually to be resolved as a single geometry.
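A minimal sketch of the filtering idea, written against plain GeoJSON-style feature dictionaries rather than a GeoDataFrame so that it runs without geopandas. The helper name drop_multipart and the sample features are hypothetical.

```python
MULTI_PART_TYPES = {"MultiPolygon", "GeometryCollection"}


def drop_multipart(features):
    """Keep only features whose geometry is not a multi-part type."""
    return [f for f in features if f["geometry"]["type"] not in MULTI_PART_TYPES]


features = [
    {"id": 0, "geometry": {"type": "Polygon"}},
    {"id": 1, "geometry": {"type": "MultiPolygon"}},
    {"id": 2, "geometry": {"type": "GeometryCollection"}},
    {"id": 3, "geometry": {"type": "Polygon"}},
]
kept_ids = [f["id"] for f in drop_multipart(features)]
```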
solaris.vector.graph
graph and road network analysis¶
-
class
solaris.vector.graph.
Edge
(nodes, edge_weight=None)[source]¶ An object to hold edge attributes.
-
set_edge_weight
(normalize_factor=None, inverse=False)[source]¶ Get the edge weight based on Euclidean distance between nodes.
Note
This method does not account for spherical deformation (i.e. does not use the Haversine equation). It is a simple linear distance.
- Parameters
normalize_factor (int or float, optional) – A number to multiply (or divide, if inverse=True) the Euclidean distance by. Defaults to None (no normalization).
inverse (bool, optional) – If True, the Euclidean distance weight will be divided by normalize_factor instead of multiplied by it.
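The weighting rule described above can be sketched in a few lines of standard-library Python. The function below takes two (x, y) tuples instead of Node objects and is a hypothetical stand-in used only to show the arithmetic.

```python
import math


def euclidean_edge_weight(node_a, node_b, normalize_factor=None, inverse=False):
    """Straight-line distance between two (x, y) points, optionally scaled."""
    weight = math.hypot(node_b[0] - node_a[0], node_b[1] - node_a[1])
    if normalize_factor is not None:
        if inverse:
            weight /= normalize_factor
        else:
            weight *= normalize_factor
    return weight


plain = euclidean_edge_weight((0, 0), (3, 4))
scaled = euclidean_edge_weight((0, 0), (3, 4), normalize_factor=2)
divided = euclidean_edge_weight((0, 0), (3, 4), normalize_factor=2, inverse=True)
```

Because this is a planar distance, coordinates in degrees would give weights in degrees, consistent with the note that no spherical correction is applied.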
-
-
class
solaris.vector.graph.
Node
(idx, x, y)[source]¶ An object to hold node attributes.
-
idx
¶ The numerical index of the node. Used as a unique identifier when the nodes are added to the graph.
- Type
-
x
¶ Numeric x location of the node, in either a geographic CRS or in pixel coordinates.
- Type
int or float
-
y
¶ Numeric y location of the node, in either a geographic CRS or in pixel coordinates.
- Type
int or float
-
-
class
solaris.vector.graph.
Path
(edges=None, properties=None)[source]¶ An object to hold Edges with common properties.
-
properties
¶ A dictionary of property: value pairs that provide relevant metadata about edges along the path (e.g. road type, speed limit, etc.)
- Type
-
-
solaris.vector.graph.
geojson_to_graph
(geojson, graph_name=None, retain_all=True, valid_road_types=None, road_type_field='type', edge_idx=0, first_node_idx=0, weight_norm_field=None, inverse=False, workers=1, verbose=False, output_path=None)[source]¶ Convert a geojson of path strings to a network graph.
- Parameters
geojson (str) – Path to a geojson file (or any other OGR-compatible vector file) to load network edges and nodes from.
graph_name (str, optional) – Name of the graph. If not provided, graph will be named 'unnamed'.
retain_all (bool, optional) – If True, the entire graph will be returned even if some parts are not connected. Defaults to True.
valid_road_types (list of ints, optional) – The road types to permit in the graph. If not provided, it’s assumed that all road types are permitted. The possible values are integers 1-7, which map as follows:
1: Motorway
2: Primary
3: Secondary
4: Tertiary
5: Residential
6: Unclassified
7: Cart track
road_type_field (str, optional) – The name of the property in the vector data that delineates road type. Defaults to 'type'.
edge_idx (int, optional) – The first index to use for an edge. This can be set to a higher value so that a graph’s edge indices don’t overlap with existing values in another graph.
first_node_idx (int, optional) – The first index to use for a node. This can be set to a higher value so that a graph’s node indices don’t overlap with existing values in another graph.
weight_norm_field (str, optional) – The name of a field in geojson to pass to argument data_key in Path.set_edge_weights(). Defaults to None, in which case no weighting is performed (weights calculated solely using Euclidean distance).
workers (int, optional) – Number of parallel processes to run for parallelization. Defaults to 1. Should not be greater than the number of CPUs available.
verbose (bool, optional) – Verbose print output. Defaults to False.
output_path (str, optional) – Path to a pickle file to save the output graph to. Nothing will be saved to disk if not provided.
- Returns
G – A networkx.MultiDiGraph containing all of the nodes and edges from the geojson (or only the largest connected component if retain_all=False). Edge lengths are weighted based on geographic distance.
- Return type
networkx.MultiDiGraph
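The valid_road_types and road_type_field parameters together describe a simple attribute filter. The sketch below reproduces the 1-7 mapping listed above as a dictionary and filters GeoJSON-style features by it. The helper name and the sample features are hypothetical, and the real function may treat missing or unexpected values differently.

```python
ROAD_TYPES = {
    1: "Motorway",
    2: "Primary",
    3: "Secondary",
    4: "Tertiary",
    5: "Residential",
    6: "Unclassified",
    7: "Cart track",
}


def filter_by_road_type(features, valid_road_types=None, road_type_field="type"):
    """Keep features whose road type is permitted; None permits everything."""
    if valid_road_types is None:
        return list(features)
    allowed = set(valid_road_types)
    return [f for f in features if f["properties"].get(road_type_field) in allowed]


features = [
    {"id": "a", "properties": {"type": 1}},
    {"id": "b", "properties": {"type": 5}},
    {"id": "c", "properties": {"type": 7}},
]
major_roads = filter_by_road_type(features, valid_road_types=[1, 2, 3])
everything = filter_by_road_type(features)
```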
-
solaris.vector.graph.
get_nodes_paths
(vector_file, first_node_idx=0, node_gdf=geopandas.GeoDataFrame, valid_road_types=None, road_type_field='type', workers=1, verbose=False)[source]¶ Extract nodes and paths from a vector file.
- Parameters
vector_file (str) – Path to an OGR-compatible vector file containing line segments (e.g., JSON response from the Overpass API, or a SpaceNet GeoJSON).
first_path_idx (int, optional) – The first index to use for a path. This can be set to a higher value so that a graph’s path indices don’t overlap with existing values in another graph.
first_node_idx (int, optional) – The first index to use for a node. This can be set to a higher value so that a graph’s node indices don’t overlap with existing values in another graph.
node_gdf (geopandas.GeoDataFrame, optional) – A geopandas.GeoDataFrame containing nodes to add to the graph. New nodes will be added to this object incrementally during the function call.
valid_road_types (list of ints, optional) – The road types to permit in the graph. If not provided, it’s assumed that all road types are permitted. The possible values are integers 1-7, which map as follows:
1: Motorway
2: Primary
3: Secondary
4: Tertiary
5: Residential
6: Unclassified
7: Cart track
road_type_field (str, optional) – The name of the attribute containing road type information in vector_file. Defaults to 'type'.
workers (int, optional) – Number of worker processes to use for parallelization. Defaults to 1. Should not exceed the number of CPUs available.
verbose (bool, optional) – Verbose print output. Defaults to False.
- Returns
nodes, paths –
- Return type
tuple of dict s
-
solaris.vector.graph.
graph_to_geojson
(G, output_path, encoding='utf-8', overwrite=False, verbose=False)[source]¶ Save graph to two geojsons: one containing nodes, the other edges.
- Parameters
G (networkx.MultiDiGraph) – A graph object to save to geojson files.
output_path – Path to save the geojsons to. '_nodes.geojson' and '_edges.geojson' will be appended to output_path (after stripping the extension).
Notes
This function is based on osmnx.save_load.save_graph_shapefile, with tweaks to make it work with our graph objects. It will save two geojsons: a file containing all of the nodes and a file containing all of the edges. When writing to geojson, the coordinate reference system (crs) must be converted to a string if it’s a dict; otherwise no crs will be appended to the geojson.
- Returns
- Return type
-
solaris.vector.graph.
linestring_to_edges
(linestring, node_gdf)[source]¶ Collect nodes in a linestring and add them to an edge.
- Parameters
linestring (shapely.geometry.LineString) – A shapely.geometry.LineString object to extract nodes and edges from.
node_series (geopandas.GeoSeries) – A geopandas.GeoSeries containing a shapely.geometry.point.Point for every node to be added to the graph.
- Returns
edges – A list of Edges from linestring.
- Return type
-
solaris.vector.graph.
parallel_linestring_to_path
(feature)[source]¶ Read in a feature line from a fiona-opened shapefile and get the edges.
- Parameters
feature (dict) – An item from a fiona.open iterable with the key 'geometry' containing shapely.geometry.line.LineStrings or shapely.geometry.line.MultiLineStrings.
- Returns
A list of Paths containing all edges in the LineString or MultiLineString.
Notes
This function depends on node_series and valid_road_types, which are passed by an initializer as items in var_dict.
solaris.vector.mask
vector <-> training mask interconversion¶
-
solaris.vector.mask.
boundary_mask
(footprint_msk=None, out_file=None, reference_im=None, boundary_width=3, boundary_type='inner', burn_value=255, **kwargs)[source]¶ Convert a dataframe of geometries to a pixel mask.
Note
This function requires creation of a footprint mask before it can operate; therefore, if there is no footprint mask already present, it will create one. In that case, additional arguments for footprint_mask() (e.g. df) must be passed.
By default, this function draws boundaries within the edges of objects. To change this behavior, use the boundary_type argument.
- Parameters
footprint_msk (numpy.array, optional) – A filled in footprint mask created using footprint_mask(). If not provided, one will be made by calling footprint_mask() before creating the boundary mask, and the required arguments for that function must be provided as kwargs.
out_file (str, optional) – Path to an image file to save the output to. Must be compatible with rasterio.DatasetReader. If provided, a reference_im must be provided (for metadata purposes).
reference_im (rasterio.DatasetReader or str, optional) – An image to extract necessary coordinate information from: the affine transformation matrix, the image extent, etc. If provided, affine_obj and shape are ignored.
boundary_width (int, optional) – The width of the boundary to be created in pixels. Defaults to 3.
boundary_type ("inner" or "outer", optional) – Where to draw the boundaries: within the object ("inner") or outside of it ("outer"). Defaults to "inner".
burn_value (int, optional) – The value to use for labeling objects in the mask. Defaults to 255 (the max value for uint8 arrays). The mask array will be set to the same dtype as burn_value. Ignored if burn_field is provided.
**kwargs (optional) – Additional arguments to pass to footprint_mask() if one needs to be created.
- Returns
boundary_mask – A pixel mask with 0s for non-object pixels and the same value as the footprint mask burn_value for the boundaries of each object.
- Return type
numpy.array
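To make the "inner" boundary idea concrete, the sketch below derives a boundary from a footprint mask by eroding the footprint boundary_width times (4-connected) and keeping the object pixels that the erosion removed. It uses nested lists so it runs without numpy. The erosion approach, the function names, and the choice to treat pixels on the image edge as boundary are assumptions of this sketch; the library's own method may differ.

```python
def erode(mask):
    """One step of 4-connected binary erosion on a nested-list mask of 0/1."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            if not mask[r][c]:
                continue
            neighbours = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            # A pixel survives only if all four neighbours are inside the
            # array and belong to the object.
            if all(0 <= rr < h and 0 <= cc < w and mask[rr][cc]
                   for rr, cc in neighbours):
                out[r][c] = 1
    return out


def inner_boundary(footprint, boundary_width=1, burn_value=255):
    """Object pixels lying within boundary_width erosion steps of the edge."""
    eroded = [[1 if v else 0 for v in row] for row in footprint]
    for _ in range(boundary_width):
        eroded = erode(eroded)
    return [
        [burn_value if footprint[r][c] and not eroded[r][c] else 0
         for c in range(len(footprint[0]))]
        for r in range(len(footprint))
    ]


# A 7 x 7 mask with a filled 5 x 5 square in the middle.
footprint = [[255 if 1 <= r <= 5 and 1 <= c <= 5 else 0 for c in range(7)]
             for r in range(7)]
boundary = inner_boundary(footprint, boundary_width=1)
boundary_pixel_count = sum(v == 255 for row in boundary for v in row)
```

With a width of 1, the result is the one-pixel ring around the square: the 25 object pixels minus the 9 interior pixels.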
-
solaris.vector.mask.
buffer_df_geoms
(df, buffer, meters=False, reference_im=None, geom_col='geometry', affine_obj=None)[source]¶ Buffer geometries within a pd.DataFrame or gpd.GeoDataFrame.
- Parameters
df (pandas.DataFrame or geopandas.GeoDataFrame) – A pandas.DataFrame or geopandas.GeoDataFrame instance with a column containing geometries (identified by geom_col). If df lacks a crs attribute (isn’t a geopandas.GeoDataFrame) and meters=True, then reference_im must be provided for georeferencing.
buffer (int or float) – The amount to buffer the geometries in df. In pixel units unless meters=True. This corresponds to width/2 in mask creation functions.
meters (bool, optional) – Should buffers be in pixel units (default) or metric units (if meters is True)?
reference_im (str or rasterio.DatasetReader, optional) – The path to a reference image covering the same geographic extent as the area labeled in df. Provided for georeferencing of pixel coordinate geometries in df or conversion of georeferenced geometries to pixel coordinates as needed. Required if meters is True and df lacks a crs attribute.
geom_col (str, optional) – The column containing geometries in df. Defaults to "geometry".
affine_obj (list or affine.Affine, optional) – Affine transformation to use to convert geoms in df from a geographic crs to pixel space. Only provide this argument if df is a geopandas.GeoDataFrame with coordinates in a georeferenced coordinate space. Ignored if reference_im is provided.
- Returns
buffered_df – A pandas.DataFrame in the original coordinate reference system with objects buffered per buffer.
- Return type
See also
road_mask
Function to create road network masks.
contact_mask
Function to create masks of contact points between polygons.
-
solaris.vector.mask.
contact_mask
(df, contact_spacing=10, meters=False, out_file=None, reference_im=None, geom_col='geometry', do_transform=None, affine_obj=None, shape=(900, 900), out_type='int', burn_value=255)[source]¶ Create a pixel mask labeling closely juxtaposed objects.
Notes
This function identifies pixels in an image that do not correspond to objects, but fall within contact_spacing of >1 labeled object.
- Parameters
df (pandas.DataFrame or geopandas.GeoDataFrame) – A pandas.DataFrame or geopandas.GeoDataFrame instance with a column containing geometries (identified by geom_col). If the geometries in df are not in pixel coordinates, then affine_obj or reference_im must be passed to provide the transformation to convert.
contact_spacing (int or float, optional) – The desired maximum distance between adjacent polygons to be labeled as contact. Will be in pixel units unless meters=True is provided.
meters (bool, optional) – Should contact_spacing be defined in units of meters? Defaults to no (False). If True and df is not in a CRS with metric units, the function will attempt to transform to the relevant CRS using df.to_crs() (if df is a geopandas.GeoDataFrame) or using the data provided in reference_im (if not).
out_file (str, optional) – Path to an image file to save the output to. Must be compatible with rasterio.DatasetReader. If provided, a reference_im must be provided (for metadata purposes).
reference_im (rasterio.DatasetReader or str, optional) – An image to extract necessary coordinate information from: the affine transformation matrix, the image extent, etc. If provided, affine_obj and shape are ignored.
geom_col (str, optional) – The column containing geometries in df. Defaults to "geometry".
do_transform (bool, optional) – Should the values in df be transformed from geospatial coordinates to pixel coordinates? Defaults to None, in which case the function attempts to infer whether or not a transformation is required based on the presence or absence of a CRS in df. If True, either reference_im or affine_obj must be provided as a source for the required affine transformation matrix.
affine_obj (list or affine.Affine, optional) – Affine transformation to use to convert from geo coordinates to pixel space. Only provide this argument if df is a geopandas.GeoDataFrame with coordinates in a georeferenced coordinate space. Ignored if reference_im is provided.
shape (tuple, optional) – An (x_size, y_size) tuple defining the pixel extent of the output mask. Ignored if reference_im is provided.
out_type ('float' or 'int') –
burn_value (int or float, optional) – The value to use for labeling objects in the mask. Defaults to 255 (the max value for uint8 arrays). The mask array will be set to the same dtype as burn_value.
- Returns
output_arr – A pixel mask with burn_value at contact points between polygons.
- Return type
numpy.array
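The definition in the Notes (background pixels that fall within contact_spacing of more than one labeled object) can be sketched directly on a small instance-labeled grid. The brute-force function below is a hypothetical pixel-based illustration that measures Euclidean distance between pixel centers. It assumes the input is a nested list in which 0 is background and each object has its own positive integer label, which is not the input type contact_mask itself takes.

```python
import math


def contact_pixels(labels, contact_spacing, burn_value=255):
    """Mark background pixels lying within contact_spacing of >1 object."""
    h, w = len(labels), len(labels[0])
    objects = [(r, c, labels[r][c])
               for r in range(h) for c in range(w) if labels[r][c]]
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            if labels[r][c]:
                continue  # object pixels are never contact pixels
            near = {lab for rr, cc, lab in objects
                    if math.hypot(rr - r, cc - c) <= contact_spacing}
            if len(near) > 1:
                out[r][c] = burn_value
    return out


# Two 2 x 2 objects (labels 1 and 2) separated by a two-pixel gap.
labels = [
    [1, 1, 0, 0, 2, 2],
    [1, 1, 0, 0, 2, 2],
    [0, 0, 0, 0, 0, 0],
]
contact = contact_pixels(labels, contact_spacing=2)
```

Only the gap pixels between the two objects are marked. The bottom row stays empty because each of its pixels is within the spacing of at most one object.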
-
solaris.vector.mask.
df_to_px_mask
(df, channels=['footprint'], out_file=None, reference_im=None, geom_col='geometry', do_transform=None, affine_obj=None, shape=(900, 900), out_type='int', burn_value=255, **kwargs)[source]¶ Convert a dataframe of geometries to a pixel mask.
- Parameters
df (pandas.DataFrame or geopandas.GeoDataFrame) – A pandas.DataFrame or geopandas.GeoDataFrame instance with a column containing geometries (identified by geom_col). If the geometries in df are not in pixel coordinates, then affine_obj or reference_im must be passed to provide the transformation to convert.
channels (list, optional) – The mask channels to generate. There are three values that this can contain:
"footprint": Create a full footprint mask, with 0s at pixels that don’t fall within geometries and burn_value at pixels that do.
"boundary": Create a mask with geometries outlined. Use boundary_width to set how thick the boundary will be drawn.
"contact": Create a mask with regions between >= 2 closely juxtaposed geometries labeled. Use contact_spacing to set the maximum spacing between polygons to be labeled.
Each channel corresponds to its own shape plane in the output.
out_file (str, optional) – Path to an image file to save the output to. Must be compatible with rasterio.DatasetReader. If provided, a reference_im must be provided (for metadata purposes).
reference_im (rasterio.DatasetReader or str, optional) – An image to extract necessary coordinate information from: the affine transformation matrix, the image extent, etc. If provided, affine_obj and shape are ignored.
geom_col (str, optional) – The column containing geometries in df. Defaults to "geometry".
do_transform (bool, optional) – Should the values in df be transformed from geospatial coordinates to pixel coordinates? Defaults to None, in which case the function attempts to infer whether or not a transformation is required based on the presence or absence of a CRS in df. If True, either reference_im or affine_obj must be provided as a source for the required affine transformation matrix.
affine_obj (list or affine.Affine, optional) – Affine transformation to use to convert from geo coordinates to pixel space. Only provide this argument if df is a geopandas.GeoDataFrame with coordinates in a georeferenced coordinate space. Ignored if reference_im is provided.
shape (tuple, optional) – An (x_size, y_size) tuple defining the pixel extent of the output mask. Ignored if reference_im is provided.
burn_value (int or float) – The value to use for labeling objects in the mask. Defaults to 255 (the max value for uint8 arrays). The mask array will be set to the same dtype as burn_value.
kwargs – Additional arguments to pass to boundary_mask or contact_mask. See those functions for requirements.
- Returns
mask – A pixel mask with 0s for non-object pixels and burn_value at object pixels. mask dtype will coincide with burn_value. Shape will be (shape[0], shape[1], len(channels)), with channels ordered per the provided channels list.
- Return type
numpy.array
-
solaris.vector.mask.
footprint_mask
(df, out_file=None, reference_im=None, geom_col='geometry', do_transform=None, affine_obj=None, shape=(900, 900), out_type='int', burn_value=255, burn_field=None)[source]¶ Convert a dataframe of geometries to a pixel mask.
- Parameters
df (pandas.DataFrame or geopandas.GeoDataFrame) – A pandas.DataFrame or geopandas.GeoDataFrame instance with a column containing geometries (identified by geom_col). If the geometries in df are not in pixel coordinates, then affine_obj or reference_im must be passed to provide the transformation to convert.
out_file (str, optional) – Path to an image file to save the output to. Must be compatible with rasterio.DatasetReader. If provided, a reference_im must be provided (for metadata purposes).
reference_im (rasterio.DatasetReader or str, optional) – An image to extract necessary coordinate information from: the affine transformation matrix, the image extent, etc. If provided, affine_obj and shape are ignored.
geom_col (str, optional) – The column containing geometries in df. Defaults to "geometry".
do_transform (bool, optional) – Should the values in df be transformed from geospatial coordinates to pixel coordinates? Defaults to None, in which case the function attempts to infer whether or not a transformation is required based on the presence or absence of a CRS in df. If True, either reference_im or affine_obj must be provided as a source for the required affine transformation matrix.
affine_obj (list or affine.Affine, optional) – Affine transformation to use to convert from geo coordinates to pixel space. Only provide this argument if df is a geopandas.GeoDataFrame with coordinates in a georeferenced coordinate space. Ignored if reference_im is provided.
shape (tuple, optional) – An (x_size, y_size) tuple defining the pixel extent of the output mask. Ignored if reference_im is provided.
out_type ('float' or 'int') –
burn_value (int or float, optional) – The value to use for labeling objects in the mask. Defaults to 255 (the max value for uint8 arrays). The mask array will be set to the same dtype as burn_value. Ignored if burn_field is provided.
burn_field (str, optional) – Name of a column in df that provides values for burn_value for each independent object. If provided, burn_value is ignored.
- Returns
mask – A pixel mask with 0s for non-object pixels and burn_value at object pixels. mask dtype will coincide with burn_value.
- Return type
numpy.array
-
solaris.vector.mask.
geojsons_to_masks_and_fill_nodata
(rtiler, vtiler, label_tile_dir, fill_value=0)[source]¶ Converts tiled vectors to raster labels and fills nodata values in raster and vector tiles.
This function must be run after a raster tiler and vector tiler have already been initialized and the .tile() method for each has been called to generate raster and vector tiles. Geojson labels are first converted to rasterized masks, then the labels are set to 0 where the reference image, the corresponding image tile, has nodata values. Then, nodata areas in the image tile are filled in place with the fill_value. Only works for rasterizing all geometries as a single category with a burn value of 1. See test_tiler_fill_nodata in tests/test_tile/test_tile.py for an example.
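The two nodata steps described above (zero the labels where the image has nodata, then fill the image's nodata pixels in place) can be sketched on small nested lists. The function name, the single-band layout, and the -9999 nodata value are assumptions made for illustration; the real function operates on tiled raster files.

```python
def mask_labels_and_fill_nodata(image, labels, nodata, fill_value=0):
    """Zero labels at nodata pixels, then overwrite those pixels in place."""
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            if value == nodata:
                labels[r][c] = 0         # no label where there is no imagery
                image[r][c] = fill_value  # fill the image tile in place
    return image, labels


# Hypothetical single-band tile with -9999 as its nodata value, and a
# rasterized label mask using a burn value of 1.
image = [[5, -9999], [-9999, 7]]
labels = [[1, 1], [0, 1]]
mask_labels_and_fill_nodata(image, labels, nodata=-9999, fill_value=0)
```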
- Parameters
rtiler (RasterTiler) – The RasterTiler that has had its .tile() method called.
vtiler (VectorTiler) – The VectorTiler that has had its .tile() method called.
label_tile_dir (str) – The folder path to save rasterized labels. This is created if it doesn’t already exist.
fill_value (str, optional) – The value to use to fill nodata values in images. Defaults to 0.
- Returns
rasterized_label_paths – A list of the paths to the rasterized instance masks.
- Return type
-
solaris.vector.mask.
instance_mask
(df, out_file=None, reference_im=None, geom_col='geometry', do_transform=None, affine_obj=None, shape=(900, 900), out_type='int', burn_value=255, burn_field=None, nodata_value=0)[source]¶ Convert a dataframe of geometries to a pixel mask.
- Parameters
df (pandas.DataFrame or geopandas.GeoDataFrame) – A pandas.DataFrame or geopandas.GeoDataFrame instance with a column containing geometries (identified by geom_col). If the geometries in df are not in pixel coordinates, then affine_obj or reference_im must be passed to provide the transformation to convert.
out_file (str, optional) – Path to an image file to save the output to. Must be compatible with rasterio.DatasetReader. If provided, a reference_im must be provided (for metadata purposes).
reference_im (rasterio.DatasetReader or str, optional) – An image to extract necessary coordinate information from: the affine transformation matrix, the image extent, etc. If provided, affine_obj and shape are ignored.
geom_col (str, optional) – The column containing geometries in df. Defaults to "geometry".
do_transform (bool, optional) – Should the values in df be transformed from geospatial coordinates to pixel coordinates? Defaults to None, in which case the function attempts to infer whether or not a transformation is required based on the presence or absence of a CRS in df. If True, either reference_im or affine_obj must be provided as a source for the required affine transformation matrix.
affine_obj (list or affine.Affine, optional) – Affine transformation to use to convert from geo coordinates to pixel space. Only provide this argument if df is a geopandas.GeoDataFrame with coordinates in a georeferenced coordinate space. Ignored if reference_im is provided.
shape (tuple, optional) – An (x_size, y_size) tuple defining the pixel extent of the output mask. Ignored if reference_im is provided.
out_type ('float' or 'int') –
burn_value (int or float, optional) – The value to use for labeling objects in the mask. Defaults to 255 (the max value for uint8 arrays). The mask array will be set to the same dtype as burn_value. Ignored if burn_field is provided.
burn_field (str, optional) – Name of a column in df that provides values for burn_value for each independent object. If provided, burn_value is ignored.
nodata_value (int or float, optional) – The value to use for nodata pixels in the mask. Defaults to 0 (the min value for uint8 arrays). Used if reference_im nodata value is a float. Ignored if reference_im nodata value is an int or if reference_im is not used. Take care when visualizing these masks: the nodata value may cause labels to not be visualized if nodata values are automatically masked by the software.
- Returns
mask – A pixel mask with 0s for non-object pixels and burn_value at object pixels. mask dtype will coincide with burn_value.
- Return type
numpy.ndarray
-
solaris.vector.mask.
mask_to_poly_geojson
(pred_arr, channel_scaling=None, reference_im=None, output_path=None, output_type='geojson', min_area=40, bg_threshold=0, do_transform=None, simplify=False, tolerance=0.5, **kwargs)[source]¶ Get polygons from an image mask.
- Parameters
pred_arr (numpy.ndarray) – A 2D array of integers. Multi-channel masks are not supported, and must be simplified before passing to this function. Can also pass an image file path here.
channel_scaling (list-like, optional) – If pred_arr is a 3D array, this argument defines how each channel will be combined to generate a binary output. channel_scaling should be a list-like of length equal to the number of channels in pred_arr. The following operation will be performed to convert the multi-channel prediction to a 2D output:
sum(pred_arr[channel]*channel_scaling[channel])
If not provided, no scaling will be performed and channels will be summed.
reference_im (str, optional) – The path to a reference geotiff to use for georeferencing the polygons in the mask. Required if saving to a GeoJSON (see the output_type argument), otherwise only required if do_transform=True.
output_path (str, optional) – Path to save the output file to. If not provided, no file is saved.
output_type ('csv' or 'geojson', optional) – If output_path is provided, this argument defines what type of file will be generated: a CSV (output_type='csv') or a GeoJSON (output_type='geojson').
min_area (int, optional) – The minimum area of a polygon to retain. Filtering is done AFTER any coordinate transformation, and therefore will be in destination units.
bg_threshold (int, optional) – The cutoff in mask_arr that denotes background (non-object). Defaults to 0.
simplify (bool, optional) – If True, will use the Douglas-Peucker algorithm to simplify edges, saving memory and processing time later. Defaults to False.
tolerance (float, optional) – The tolerance value to use for simplification with the Douglas-Peucker algorithm. Defaults to 0.5. Only has an effect if simplify=True.
- Returns
gdf – A GeoDataFrame of polygons.
- Return type
geopandas.GeoDataFrame
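To give a sense of how tolerance interacts with simplify=True, the following is a minimal pure-Python sketch of the Douglas-Peucker algorithm applied to an open polyline. It is an illustration only and not the code solaris runs; the helper names point_segment_distance and douglas_peucker and the sample edge coordinates are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (all 2D tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        # Degenerate segment: fall back to point-to-point distance.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, tolerance):
    """Drop vertices that lie within `tolerance` of the simplified line."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    # Find the interior vertex farthest from the first-last chord.
    max_dist, max_idx = 0.0, 0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > max_dist:
            max_dist, max_idx = d, i
    if max_dist <= tolerance:
        # Every interior vertex is close enough: keep only the endpoints.
        return [first, last]
    # Otherwise keep the farthest vertex and recurse on both halves.
    left = douglas_peucker(points[:max_idx + 1], tolerance)
    right = douglas_peucker(points[max_idx:], tolerance)
    return left[:-1] + right


# Hypothetical jagged edge, e.g. from a traced mask boundary.
edge = [(0, 0), (1, 0.2), (2, -0.1), (3, 3), (4, 0)]
coarse = douglas_peucker(edge, tolerance=0.5)   # drops the small wiggle at (1, 0.2)
fine = douglas_peucker(edge, tolerance=0.05)    # keeps every vertex
```

A larger tolerance removes more vertices, which is why the parameter is expressed in the same units as the polygon coordinates.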
-
solaris.vector.mask.
preds_to_binary
(pred_arr, channel_scaling=None, bg_threshold=0)[source]¶ Convert a set of predictions from a neural net to a binary mask.
- Parameters
pred_arr (numpy.ndarray) – A set of predictions generated by a neural net (generally in float dtype). This can be a 2D array or a 3D array, in which case it will be converted to a 2D mask output with optional channel scaling (see the channel_scaling argument). If a filename is provided instead of an array, the image will be loaded using scikit-image.
channel_scaling (list-like of floats, optional) – If pred_arr is a 3D array, this argument defines how each channel will be combined to generate a binary output. channel_scaling should be a list-like of length equal to the number of channels in pred_arr. The following operation will be performed to convert the multi-channel prediction to a 2D output:
sum(pred_arr[channel]*channel_scaling[channel])
If not provided, no scaling will be performed and channels will be summed.
bg_threshold (int or float, optional) – The cutoff to set to distinguish between background and foreground pixels in the final binary mask. Binarization takes place after channel scaling and summation (if applicable). Defaults to 0.
- Returns
mask_arr – A 2D boolean numpy array with True for foreground pixels and False for background.
- Return type
numpy.ndarray
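The channel-scaling and thresholding steps described above can be sketched with plain NumPy. This is a hedged illustration rather than the library's implementation: the function name binarize_sketch is hypothetical, and two details are assumptions not stated in this reference, namely that channels sit on the last axis and that a pixel counts as foreground when its summed value is strictly greater than bg_threshold.

```python
import numpy as np


def binarize_sketch(pred_arr, channel_scaling=None, bg_threshold=0):
    """Collapse a 2D or 3D prediction array into a 2D boolean mask."""
    pred_arr = np.asarray(pred_arr, dtype=float)
    if pred_arr.ndim == 3:
        n_channels = pred_arr.shape[-1]  # assumption: channels-last layout
        if channel_scaling is None:
            # No scaling requested: channels are simply summed.
            channel_scaling = np.ones(n_channels)
        channel_scaling = np.asarray(channel_scaling, dtype=float)
        if channel_scaling.shape != (n_channels,):
            raise ValueError("channel_scaling needs one value per channel")
        # sum(pred_arr[channel] * channel_scaling[channel]) over channels
        pred_arr = (pred_arr * channel_scaling).sum(axis=-1)
    # Assumption: strictly greater than the threshold means foreground.
    return pred_arr > bg_threshold


# Hypothetical 2x2 prediction with two channels per pixel.
preds = np.array([[[0.9, 0.2], [0.1, 0.3]],
                  [[0.4, 0.4], [0.0, 0.0]]])
# Subtract the second channel from the first, then threshold at 0.
mask = binarize_sketch(preds, channel_scaling=[1, -1], bg_threshold=0)
```

Negative scaling values, as in the usage above, let one channel suppress another before thresholding.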
-
solaris.vector.mask.
road_mask
(df, width=4, meters=False, out_file=None, reference_im=None, geom_col='geometry', do_transform=None, affine_obj=None, shape=(900, 900), out_type='int', burn_value=255, burn_field=None, min_background_value=None, verbose=False)[source]¶ Convert a dataframe of geometries to a pixel mask.
- Parameters
df (pandas.DataFrame or geopandas.GeoDataFrame) – A pandas.DataFrame or geopandas.GeoDataFrame instance with a column containing geometries (identified by geom_col). If the geometries in df are not in pixel coordinates, then affine_obj or reference_im must be passed to provide the transformation to convert.
width (float or int, optional) – The total width to make a road (i.e. twice x if using road.buffer(x)). In pixel units unless meters is True.
meters (bool, optional) – Should width be defined in units of meters? Defaults to no (False). If True and df is not in a CRS with metric units, the function will attempt to transform to the relevant CRS using df.to_crs() (if df is a geopandas.GeoDataFrame) or using the data provided in reference_im (if not).
out_file (str, optional) – Path to an image file to save the output to. Must be compatible with rasterio.DatasetReader. If provided, a reference_im must be provided (for metadata purposes).
reference_im (rasterio.DatasetReader or str, optional) – An image to extract necessary coordinate information from: the affine transformation matrix, the image extent, etc. If provided, affine_obj and shape are ignored.
geom_col (str, optional) – The column containing geometries in df. Defaults to "geometry".
do_transform (bool, optional) – Should the values in df be transformed from geospatial coordinates to pixel coordinates? Defaults to None, in which case the function attempts to infer whether or not a transformation is required based on the presence or absence of a CRS in df. If True, either reference_im or affine_obj must be provided as a source for the required affine transformation matrix.
affine_obj (list or affine.Affine, optional) – Affine transformation to use to convert from geo coordinates to pixel space. Only provide this argument if df is a geopandas.GeoDataFrame with coordinates in a georeferenced coordinate space. Ignored if reference_im is provided.
shape (tuple, optional) – An (x_size, y_size) tuple defining the pixel extent of the output mask. Ignored if reference_im is provided.
out_type ('float' or 'int') –
burn_value (int or float, optional) – The value to use for labeling objects in the mask. Defaults to 255 (the max value for uint8 arrays). The mask array will be set to the same dtype as burn_value. Ignored if burn_field is provided.
burn_field (str, optional) – Name of a column in df that provides values for burn_value for each independent object. If provided, burn_value is ignored.
min_background_value (int, optional) – Minimum value for the mask background. Ignored if None. Defaults to None.
verbose (bool, optional) – Switch to print relevant values. Defaults to False.
- Returns
mask – A pixel mask with 0s for non-object pixels and burn_value at object pixels. mask dtype will coincide with burn_value.
- Return type
numpy.ndarray
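To make the meaning of width and burn_value concrete, here is a small NumPy sketch that burns a single straight road segment into a mask by marking every pixel whose center lies within width / 2 of the centerline. It is a simplified stand-in and not how solaris builds the mask: the function name road_mask_sketch, the start and end arguments, the (rows, cols) reading of shape, and the convention of placing pixel centers at +0.5 are all assumptions made for this illustration.

```python
import numpy as np


def road_mask_sketch(start, end, width=4, shape=(10, 10), burn_value=255):
    """Burn one straight road segment, given in pixel coordinates, into a mask."""
    rows, cols = shape  # assumption: shape is read as (rows, cols) here
    ys, xs = np.mgrid[0:rows, 0:cols]
    px, py = xs + 0.5, ys + 0.5  # assumption: pixel centers at +0.5
    (ax, ay), (bx, by) = start, end
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        # Degenerate segment: every pixel projects onto the start point.
        t = np.zeros(shape)
    else:
        # Project each pixel center onto the segment, clamped to its ends.
        t = np.clip(((px - ax) * dx + (py - ay) * dy) / seg_len_sq, 0.0, 1.0)
    dist = np.hypot(px - (ax + t * dx), py - (ay + t * dy))
    # The mask takes its dtype from burn_value, as the reference describes.
    mask = np.zeros(shape, dtype=np.asarray(burn_value).dtype)
    # Total road width is `width`, so the buffer on each side is width / 2.
    mask[dist <= width / 2] = burn_value
    return mask


# A horizontal road across a 10x10 mask, 4 pixels wide in total.
mask = road_mask_sketch(start=(0, 5), end=(10, 5), width=4)
```

With width=4, the buffer distance on each side of the centerline is 2 pixels, which matches the "twice x if using road.buffer(x)" description of the width parameter.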