maplearn.filehandler package
File handlers
Read and write data from different kinds of files:
- Csv: tabular data as a text file
- Excel: tabular data as a Microsoft Excel file
- Shapefile: geographical vector file
- ImageGeo: geographical raster file
- FileHandler: abstract class to handle files
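To illustrate how the handlers listed above relate to file types, the sketch below picks a handler name from a file extension. The `HANDLERS` table and the `guess_handler` function are hypothetical helpers written for this example. They are not part of maplearn, and the list of extensions is an assumption.

```python
import os

# Hypothetical mapping from file extension to the handler class names
# listed above; maplearn itself may select handlers differently.
HANDLERS = {
    ".csv": "Csv",
    ".txt": "Csv",
    ".xls": "Excel",
    ".xlsx": "Excel",
    ".shp": "Shapefile",
    ".tif": "ImageGeo",
    ".tiff": "ImageGeo",
}


def guess_handler(path):
    """Return the name of the handler class suited to `path`."""
    ext = os.path.splitext(path)[1].lower()
    try:
        return HANDLERS[ext]
    except KeyError:
        raise ValueError("No handler known for extension %r" % ext)


print(guess_handler(os.path.join("datasets", "landsat_rennes.tif")))
```

An unknown extension raises `ValueError` rather than silently falling back to a default handler, which keeps mistakes visible early.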
Submodules
maplearn.filehandler.csv module
CSV file reader and writer
With this class, you can read a text file or write a new one with your own dataset (pandas DataFrame).
Examples:
- Read an existing file
>>> exch = Csv(os.path.join('maplearn path', 'datasets', 'ex1.csv'))
>>> exch.read()
>>> print(exch.data)
- Write a new CSV file from scratch
>>> exc = Csv(None)
>>> out_file = os.path.join('maplearn path', 'tmp', 'scratch.csv')
>>> df = pd.DataFrame({'A': 1,
...                    'B': pd.Timestamp('20130102'),
...                    'C': pd.Series(2, index=list(range(4))),
...                    'D': np.array([3] * 4, dtype='int64')})
>>> exc.write(path=out_file, data=df)
class maplearn.filehandler.csv.Csv(path)
Bases: maplearn.filehandler.filehandler.FileHandler
Handler to read and write attributes in a text file. It inherits from the abstract class FileHandler.
Attributes:
- FileHandler’s attributes
Args:
- path (str): path to the Csv file to open
open_()
Opens the CSV file specified in dsn['path'].
read()
Reads the content of the CSV file.
write(path=None, data=None, overwrite=True, **kwargs)
Writes the specified attributes to a text file.
Args:
- path (str): path to the CSV file to create and write
- data (pandas DataFrame): dataset to write to the CSV file
- overwrite (bool): whether the output CSV file should be overwritten
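The read and write cycle described above, including the `overwrite` flag, can be sketched with the standard library alone. The `write_rows` and `read_rows` functions are hypothetical stand-ins, not maplearn code. In particular, raising an error when `overwrite` is false and the file already exists is an assumption about reasonable behaviour, not a documented guarantee of `Csv.write`.

```python
import csv
import os
import tempfile


def write_rows(path, header, rows, overwrite=True):
    """Write tabular data to a CSV file (hypothetical helper).

    Assumption: when `overwrite` is False and the file already exists,
    refuse to write instead of replacing the file.
    """
    if os.path.exists(path) and not overwrite:
        raise FileExistsError("%s already exists" % path)
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)


def read_rows(path):
    """Read a CSV file back as a list of mappings (one per row)."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))


# Write a small table to a temporary directory, then read it back
out_file = os.path.join(tempfile.mkdtemp(), "scratch.csv")
write_rows(out_file, ["A", "B"], [[1, 2], [3, 4]])
print(read_rows(out_file))
```

Values come back as strings, because a CSV file carries no type information. A handler built on pandas would typically infer numeric types when reading.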
maplearn.filehandler.excel module
Excel file reader and writer
With this class, you can read an Excel file or write a new one with your own dataset (pandas DataFrame).
Examples:
- Read an existing Excel file
>>> exch = Excel(os.path.join('maplearn path', 'datasets', 'ex1.xlsx'))
>>> exch.read()
>>> print(exch.data)
- Write a new Excel file from scratch
>>> exc = Excel(None)
>>> out_file = os.path.join('maplearn path', 'tmp', 'scratch.xlsx')
>>> df = pd.DataFrame({'A': 1,
...                    'B': pd.Timestamp('20130102'),
...                    'C': pd.Series(2, index=list(range(4))),
...                    'D': np.array([3] * 4, dtype='int64')})
>>> exc.write(path=out_file, data=df)
class maplearn.filehandler.excel.Excel(path, sheet=None)
Bases: maplearn.filehandler.filehandler.FileHandler
Handler to read and write attributes in an Excel file. It inherits from the abstract class FileHandler.
Attributes:
- FileHandler’s attributes
Args:
- path (str): path to the Excel file to open
- sheet (str): name of the sheet to open
open_()
Opens the Excel file specified in dsn['path'].
read()
Reads the content of the opened Excel file.
write(path=None, data=None, overwrite=True, **kwargs)
Writes the specified attributes to an Excel file.
Args:
- path (str): path to the Excel file to create and write
- data (pandas DataFrame): dataset to write to the Excel file
- overwrite (bool): whether the output Excel file should be overwritten
maplearn.filehandler.imagegeo module
Geographic Images (raster)
This class handles raster data with geographic dimension (projection system, bounding box expressed with coordinates).
Raster data relies on:
- a matrix of pixels (data)
- geographic data (where to put this matrix on earth)
Example:
>>> img = ImageGeo(os.path.join('maplearn_path', 'datasets', 'landsat_rennes.tif'))
>>> img.read()
>>> print(img.data)
class maplearn.filehandler.imagegeo.ImageGeo(path=None, fmt='GTiff')
Bases: maplearn.filehandler.filehandler.FileHandler
Handler of geographical rasters
Args:
- path (str): path to the raster file to read
- fmt (str): format of the raster file ('GTiff'… see GDAL documentation)
Attributes:
- Several attributes are inherited from the FileHandler class
data
The dataset read from a file, or to be written to a file.
data_2_img(data, overwrite=False, na=None)
Transforms a data set (dataframe) into a matrix in order to export it as an image (the inverse operation of the img_2_data() method).
Args:
- data (dataframe): the dataset to transform
- overwrite (bool): whether the result should overwrite the data property
Returns:
- matrix: transformed dataset
img_2_data()
Transforms the data set to make it easier to handle in the following steps.
Converts the data set (matrix) into a 2-dimensional dataframe (where 1 row = 1 individual and 1 column = 1 feature).
Returns:
- dataframe: transformed dataset (2 dimensions)
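The two conversions above are inverses of each other, which a small dependency-free sketch can make concrete. It assumes the image is stored as rows x columns x bands; the layout actually used by ImageGeo is not stated in this page, and the names `image_to_table` and `table_to_image` are hypothetical. With NumPy, the same idea would normally be expressed as a `reshape`.

```python
def image_to_table(image):
    """Flatten a rows x cols x bands image into a pixels x bands table.

    Each pixel becomes one row (individual), each band one column (feature).
    """
    return [list(pixel) for row in image for pixel in row]


def table_to_image(table, n_rows, n_cols):
    """Rebuild the rows x cols x bands image from a pixels x bands table."""
    if len(table) != n_rows * n_cols:
        raise ValueError("table size does not match image dimensions")
    return [table[r * n_cols:(r + 1) * n_cols] for r in range(n_rows)]


# Toy image: 2 rows x 3 columns x 2 bands (values are made up)
image = [
    [[1, 10], [2, 20], [3, 30]],
    [[4, 40], [5, 50], [6, 60]],
]
table = image_to_table(image)
print(len(table), len(table[0]))  # 6 pixels, 2 features each
print(table_to_image(table, 2, 3) == image)
```

The image dimensions have to be kept alongside the table, since the flattened form no longer records how many rows and columns the raster had.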
init_data(dims, dtype=None)
Creates an empty matrix with the specified dimensions.
Args:
- dims (list): dimensions of the image to create
- dtype (str): numerical type of pixels
open_()
Opens the geographical image to get information about the projection system…
pixel2xy(j, i)
Computes the geographic coordinates (X, Y) corresponding to the specified position in an image (column, row).
It does the inverse calculation of xy2pixel, and uses a GDAL geomatrix.
Source: http://geospatialpython.com/2011/02/clip-raster-using-shapefile.html
Args:
- j (int): column position
- i (int): row position
Returns:
- list: geographical coordinates of the pixel (lon and lat)
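The conversion from a pixel position to geographic coordinates follows the standard six-coefficient GDAL geotransform. The sketch below applies that formula directly. The function name `pixel_to_xy` and the sample coefficients are made up for illustration, and the result is the top-left corner of the pixel; whether maplearn returns the corner or the centre is not stated here.

```python
def pixel_to_xy(transf, col, row):
    """Return [x, y] of the top-left corner of pixel (col, row).

    `transf` is a GDAL-style geotransform:
    [x origin, pixel width, rotation, y origin, rotation, pixel height].
    """
    x = transf[0] + col * transf[1] + row * transf[2]
    y = transf[3] + col * transf[4] + row * transf[5]
    return [x, y]


# Made-up north-up raster: 30 m pixels, origin at (350000, 6790000)
transf = (350000.0, 30.0, 0.0, 6790000.0, 0.0, -30.0)
print(pixel_to_xy(transf, 0, 0))   # [350000.0, 6790000.0]
print(pixel_to_xy(transf, 10, 5))  # [350300.0, 6789850.0]
```

The pixel height is negative for a north-up image because row numbers increase downwards while Y coordinates increase upwards.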
read(dtype=None)
Reads the raster file and puts the matrix in the data property.
Args:
- dtype (str): type of the values stored in pixels (int, float…)
set_geo(transf=None, prj=None)
Sets the geographical dimension of a raster:
- the projection system
- the bounding box, whose coordinates are compatible with the given projection system
Args:
- transf (list): affine function to translate an image
- prj (str): projection system
Definition of 'transf' (to translate an image to the right place):
- [0] = top left x (x Origin)
- [1] = w-e pixel resolution (pixel Width)
- [2] = rotation, 0 if image is “north up”
- [3] = top left y (y Origin)
- [4] = rotation, 0 if image is “north up”
- [5] = n-s pixel resolution (pixel Height)
TODO:
- Check compatibility between bounding box and image size
- Add the EPSG code corresponding to prj in __geo
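To make the six `transf` coefficients concrete, the sketch below builds a north-up transform from an origin and a pixel size, then derives the bounding box implied by a given image size. Both helpers (`make_transf`, `bounding_box`) and the sample values are hypothetical. The second helper shows one possible way to approach the bounding box check mentioned in the TODO, not how maplearn does it.

```python
def make_transf(x_origin, y_origin, pixel_width, pixel_height):
    """Build a north-up geotransform (both rotation terms set to 0)."""
    # The n-s resolution is stored as a negative number: rows go downwards
    return [x_origin, pixel_width, 0.0, y_origin, 0.0, -abs(pixel_height)]


def bounding_box(transf, n_cols, n_rows):
    """Return (x_min, y_min, x_max, y_max) of a north-up raster."""
    x_min = transf[0]
    y_max = transf[3]
    x_max = x_min + n_cols * transf[1]
    y_min = y_max + n_rows * transf[5]
    return (x_min, y_min, x_max, y_max)


# Made-up raster: 100 columns x 50 rows of 30 m pixels
transf = make_transf(350000.0, 6790000.0, 30.0, 30.0)
print(bounding_box(transf, n_cols=100, n_rows=50))
# (350000.0, 6788500.0, 353000.0, 6790000.0)
```

Comparing this derived box with an expected one is a cheap consistency test before writing a raster. It only holds when both rotation terms are 0.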
write(path=None, data=None, overwrite=True, **kwargs)
Writes data to a raster file.
Args:
- path (str): raster file to write data into
- data (array): data to write
- overwrite (bool): whether the raster file should be overwritten
xy2pixel(lon, lat)
Computes the position in an image (column, row), given a geographic coordinate.
Uses a GDAL geomatrix (gdal.GetGeoTransform()) to calculate the pixel location of a geospatial coordinate (http://geospatialpython.com/2011/02/clip-raster-using-shapefile.html)
Args:
- lon (float): longitude (X)
- lat (float): latitude (Y)
Returns:
- list with the position in the image (column, row)
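Going from a geographic coordinate back to a pixel position means inverting the affine geotransform. The sketch below inverts the general 2 x 2 system, so it also works when the rotation terms are not 0. This is an illustrative variant with a hypothetical name (`xy_to_pixel`) and made-up coefficients; the linked article, and possibly maplearn itself, uses a simpler north-up formula.

```python
import math


def xy_to_pixel(transf, x, y):
    """Return [col, row] of the pixel containing the point (x, y).

    `transf` is a GDAL-style geotransform:
    [x origin, pixel width, rotation, y origin, rotation, pixel height].
    """
    det = transf[1] * transf[5] - transf[2] * transf[4]
    if det == 0:
        raise ValueError("geotransform cannot be inverted")
    dx = x - transf[0]
    dy = y - transf[3]
    col = (dx * transf[5] - dy * transf[2]) / det
    row = (dy * transf[1] - dx * transf[4]) / det
    # Floor so that any point inside a pixel maps to that pixel's index
    return [int(math.floor(col)), int(math.floor(row))]


# Made-up north-up raster: 30 m pixels, origin at (350000, 6790000)
transf = (350000.0, 30.0, 0.0, 6790000.0, 0.0, -30.0)
print(xy_to_pixel(transf, 350300.0, 6789850.0))  # [10, 5]
print(xy_to_pixel(transf, 350315.0, 6789840.0))  # [10, 5]
```

Points outside the raster yield negative or too-large indices, so callers still need to check the result against the image size.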
maplearn.filehandler.shapefile module
Shapefile reader and writer
With this class, you can read a shapefile or more precisely get attributes from a shapefile. You can also write a new shapefile using geometry from an original shapefile and adding the attributes you want.
Examples:
>>> shp = Shapefile(os.path.join('maplearn path', 'datasets',
'echantillon.shp'))
>>> shp.read()
>>> print(shp.data)
- TODO:
- Guess character encoding in shapefile’s attributes
class maplearn.filehandler.shapefile.Shapefile(path)
Bases: maplearn.filehandler.filehandler.FileHandler
Handler to read and write attributes in a shapefile. It inherits from the abstract class FileHandler.
Attributes:
- FileHandler’s attributes
- str_type (str): kind of geometry (polygon, point…)
- lst_flds (list): list of fields in dataset
open_()
Opens the shapefile and stores it in the __ds attribute, so that attributes can then be read.
read()
Reads the attributes associated with the entities in the shapefile.
Returns:
- pandas DataFrame: data (attributes) available in the shapefile
write(path=None, data=None, overwrite=True, **kwargs)
Writes attributes (and only attributes) to a new shapefile, using the geometries of an original shapefile.
Args:
- path (str): path to the shapefile to create and write
- data (pandas DataFrame): dataset to write to the shapefile
- overwrite (bool): whether the output shapefile should be overwritten
maplearn.filehandler.filehandler module
Handling files (abstract class)
This class handles generic files. FileHandler is not supposed to be called directly. Instead, use one of the classes that inherit from it (ImageGeo, Excel, Shapefile…).
class maplearn.filehandler.filehandler.FileHandler(path=None, **kwargs)
Bases: object
Reads data from a generic file or writes data into it.
Attributes:
- _drv (object): driver to communicate with a file (necessary for some formats)
- _data (numpy array or pandas dataframe): dataset read from a file or to be written into it. See the data property.
- opened (bool): whether the file is opened
Args:
- path (str): path to the file to read data from
- **kwargs: additional settings to specify how to load data from the file
data
The dataset read from a file, or to be written to a file.
dsn
Dictionary containing information about the data source. For example, path contains the path of the file to get data from. Other items can exist, which are specific to the data type (raster, vector or tabular, geographical or not…).
open_()
Opens a file prior to writing to it.
read()
Reads the dataset from the file specified at initialization.
write(path=None, data=None, overwrite=True, **kwargs)
Writes data to a file.
Args:
- path (str): path to the file to write into
- data (numpy array or pandas dataframe): the data to write
- overwrite (bool): whether the file should be overwritten if it exists
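The interface described above (a `dsn` dictionary, an `opened` flag, a `data` property, and `open_`, `read` and `write` methods) can be sketched as an abstract base class with one toy subclass. `BaseHandler` and `MemoryHandler` are hypothetical names written for this example. The real FileHandler may be organised differently, and treating `overwrite=False` as an error on an existing target is an assumption.

```python
from abc import ABC, abstractmethod


class BaseHandler(ABC):
    """Hypothetical stand-in for an abstract file handler."""

    def __init__(self, path=None, **kwargs):
        # Description of the data source: the path plus format-specific items
        self.dsn = dict(path=path, **kwargs)
        self.opened = False
        self._data = None

    @property
    def data(self):
        """The dataset read from the source, or to be written to it."""
        return self._data

    def open_(self):
        self.opened = True

    @abstractmethod
    def read(self):
        """Load the dataset from the source named in dsn['path']."""

    @abstractmethod
    def write(self, path=None, data=None, overwrite=True, **kwargs):
        """Store `data` at `path`."""


class MemoryHandler(BaseHandler):
    """Toy subclass that reads from and writes to a shared dictionary."""

    store = {}

    def read(self):
        self.open_()
        self._data = self.store.get(self.dsn["path"])
        return self._data

    def write(self, path=None, data=None, overwrite=True, **kwargs):
        path = path or self.dsn["path"]
        # Assumption: refuse to replace an existing entry unless asked to
        if path in self.store and not overwrite:
            raise FileExistsError(path)
        self.store[path] = data


MemoryHandler().write(path="demo", data=[1, 2, 3])
handler = MemoryHandler("demo")
print(handler.read())  # [1, 2, 3]
```

Because `read` and `write` are abstract, instantiating `BaseHandler` directly raises `TypeError`, which mirrors the advice to use a concrete subclass rather than the base class.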