
Hi! We are BNIA-JFI.
This package was made to help with data handling.
Included
- Functions built and used by BNIA for day to day tasks.
- Made to be shared via IPYNB/Google Colab notebooks with built-in examples using 100% publicly accessible data and links.
- Online documentation and PyPi libraries created from the notebooks.
About this Tutorial:
You can use these docs to learn from or as documentation when using the attached library.
TIPS
- Content covered in previous tutorials will be used in later tutorials.
- New code and/or information should have explanations and/or descriptions attached.
- Concepts or code covered in previous tutorials will be used without being explained in full.
- If content cannot be found in the current tutorial and is not covered in previous tutorials, please let me know.
- This notebook has been optimized for Google Colab running in a Chrome browser.
- Statements found on the index page on views expressed, responsibility, errors and omissions, use at risk, and licensing extend throughout the tutorial.
Objectives
By the end of this tutorial users should have an understanding of:
- Importing data with pandas and geopandas
- Querying data from Esri
- Retrieving data programmatically
- This module assumes the data needs no handling prior to intake
- Loading data in a variety of formats
- Visualizing said data
Usage Instructions
Install the Package
The code is on PyPI, so you can install the scripts as a Python library using the command:
!pip install dataplay geopandas
Important: Contributors should follow the maintenance instructions and will not need to run this step.
Their modules will be retrieved from the VitalSigns-GDrive repo they have mounted into their Colab environment.
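After installing, one way to confirm that the packages are visible to your interpreter is to ask Python for their installed versions. This is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and is not part of dataplay.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Report on the two packages named in the install command.
for name in ("dataplay", "geopandas"):
    version = installed_version(name)
    print(name, version if version else "is not installed")
```

If a package is reported as not installed in Colab, re-running the install cell and restarting the runtime is usually the first thing to try.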
Then...
Import Modules
- Import the installed module into your code:
from VitalSigns.acsDownload import retrieve_acs_data
- Use it:
retrieve_acs_data(state, county, tract, tableId, year, saveAcs)
Now you could do something like merge it to another dataset!
from dataplay.merge import mergeDatasets
mergeDatasets(left_ds=False, right_ds=False, crosswalk_ds=False, use_crosswalk=True, left_col=False, right_col=False, crosswalk_left_col=False, crosswalk_right_col=False, merge_how=False, interactive=True)
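The idea behind a crosswalk merge is to join two tables that share no common key by routing through a third table that maps one key to the other. The sketch below illustrates that idea with plain Python dictionaries. It is not how `mergeDatasets` is implemented, and the helper name `merge_with_crosswalk`, the table contents, and the column names (`tract`, `CSA2010`, `population`, `median_income`) are made up for illustration.

```python
# Left table keyed by census tract (hypothetical values).
left_rows = [
    {"tract": "010100", "population": 3000},
    {"tract": "010200", "population": 2500},
    {"tract": "999999", "population": 100},  # has no crosswalk entry
]
# Right table keyed by area name (hypothetical values).
right_rows = [
    {"CSA2010": "Area A", "median_income": 90000},
    {"CSA2010": "Area B", "median_income": 85000},
]
# Crosswalk mapping the left key (tract) to the right key (area name).
crosswalk = {"010100": "Area A", "010200": "Area B"}


def merge_with_crosswalk(left, right, xwalk, left_col, right_col):
    """Left-join `left` to `right` through a {left key: right key} crosswalk."""
    right_by_key = {row[right_col]: row for row in right}
    merged = []
    for row in left:
        right_key = xwalk.get(row[left_col])
        match = right_by_key.get(right_key, {})
        # Every left row is kept; unmatched rows get right_col=None
        # and no additional columns.
        merged.append({**row, right_col: right_key, **match})
    return merged


for row in merge_with_crosswalk(left_rows, right_rows, crosswalk, "tract", "CSA2010"):
    print(row)
```

In pandas, the same result would typically come from two chained `DataFrame.merge` calls with `how='left'`: first left table to crosswalk, then that result to the right table.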
Getting Help
You can get information on the package, modules, and methods by using the help command.
Here we look at the package's modules:
help(dataplay)
Help on package dataplay:

NAME
    dataplay

PACKAGE CONTENTS
    _nbdev
    corr
    geoms
    gifmap
    html
    intaker
    merge

VERSION
    0.0.37

FILE
    c:\python311\lib\site-packages\dataplay\__init__.py

Let's take a look at what functions the geoms module provides:
help(dataplay.geoms)
Help on module dataplay.geoms in dataplay:

NAME
    dataplay.geoms - # AUTOGENERATED! DO NOT EDIT! File to edit: notebooks/03_Map_Basics_Intake_and_Operations.ipynb (unless otherwise specified).

FUNCTIONS
    map_points(data, lat_col='POINT_Y', lon_col='POINT_X', zoom_start=11, plot_points=True, cluster_points=False, pt_radius=15, draw_heatmap=False, heat_map_weights_col=None, heat_map_weights_normalize=True, heat_map_radius=15, popup=False)
        Creates a map given a dataframe of points. Can also produce a heatmap overlay

        Arg:
            df: dataframe containing points to maps
            lat_col: Column containing latitude (string)
            lon_col: Column containing longitude (string)
            zoom_start: Integer representing the initial zoom of the map
            plot_points: Add points to map (boolean)
            pt_radius: Size of each point
            draw_heatmap: Add heatmap to map (boolean)
            heat_map_weights_col: Column containing heatmap weights
            heat_map_weights_normalize: Normalize heatmap weights (boolean)
            heat_map_radius: Size of heatmap point

        Returns:
            folium map object

    readInGeometryData(url=False, porg=False, geom=False, lat=False, lng=False, revgeocode=False, save=False, in_crs=4326, out_crs=False)
        # reverseGeoCode, readFile, getGeoParams, main

    workWithGeometryData(method=False, df=False, polys=False, ptsCoordCol=False, polygonsCoordCol=False, polyColorCol=False, polygonsLabel='polyOnPoint', pntsClr='red', polysClr='white', interactive=False)
        # Cell
        #
        # Work With Geometry Data
        # Description: geomSummary, getPointsInPolygons, getPolygonOnPoints, mapPointsInPolygons, getCentroids

DATA
    __all__ = ['workWithGeometryData', 'readInGeometryData', 'map_points']

FILE
    c:\python311\lib\site-packages\dataplay\geoms.py
And here we can look at an individual function and what it expects:
help(VitalSigns.acsDownload.retrieve_acs_data)
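`help` prints to the screen. When you want the same information as a value, the standard library's `inspect` module can return a function's signature and docstring. The sketch below uses a stand-in function, `example_download`, which is hypothetical and only mimics the parameter list of `retrieve_acs_data` shown earlier; the `saveAcs=False` default is an assumption.

```python
import inspect


def example_download(state, county, tract, tableId, year, saveAcs=False):
    """Stand-in for a real download function (hypothetical)."""
    return None


signature = inspect.signature(example_download)
print(signature)                         # the full parameter list
print(inspect.getdoc(example_download))  # the cleaned-up docstring

# Parameters without a default value must be supplied by the caller.
required = [
    name
    for name, param in signature.parameters.items()
    if param.default is inspect.Parameter.empty
]
print(required)
```

The same calls work on any imported function, so you can swap the stand-in for a real one once the library is installed.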
Examples
Here's an example:
Import your modules
Read in some data
Define our download parameters.
More information on these parameters can be found in the tutorials!
county = '510'
state = '24'
tableId = 'B19001'
year = '17'
saveAcs = False
And download the Baltimore City ACS data using the imported VitalSigns library.
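For orientation, ACS tables are published through the U.S. Census Bureau's public data API, and parameters like the ones above map naturally onto a query URL. The sketch below only builds such a URL and makes no request. Whether `retrieve_acs_data` builds its query this way is an assumption, as are the helper name `build_acs_url`, the choice of the 5-year dataset (`acs5`), the default of all tracts, and reading the two-digit year `'17'` as 2017.

```python
def build_acs_url(state, county, table_id, year, tract="*"):
    """Build a Census API URL for every column in one ACS table (sketch)."""
    full_year = 2000 + int(year)  # assumption: '17' means 2017
    return (
        f"https://api.census.gov/data/{full_year}/acs/acs5"
        f"?get=NAME,group({table_id})"
        f"&for=tract:{tract}&in=state:{state}&in=county:{county}"
    )


# State 24 and county 510 are the FIPS codes for Maryland and Baltimore City.
url = build_acs_url(state="24", county="510", table_id="B19001", year="17")
print(url)
```

Pasting the printed URL into a browser is a quick way to check that the parameters describe the table and geography you expect before downloading anything in code.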
Here we can import and display a geospatial dataset with special intake requirements.
Here we pull a list of Baltimore City's CSAs
Now in this example we will load in a bunch of coordinates
geoloom_gdf = dataplay.geoms.readInGeometryData(url=geoloom_gdf_url, porg=False, geom='geometry', lat=False, lng=False, revgeocode=False, save=False, in_crs=4326, out_crs=False)
geoloom_gdf = geoloom_gdf.dropna(subset=['geometry'])
# geoloom_gdf = geoloom_gdf.drop(columns=['POINT_X','POINT_Y'])
geoloom_gdf.head(1)
And here we get the number of points in each of our corresponding CSAs (polygons)
And we plot it with a legend
What would happen if I wanted to create an interactive click map with the label of each CSA (polygon) on each point?
Well we just run the reverse operation!
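The two directions described above, counting points per polygon and labeling each point with its polygon, rest on the same point-in-polygon test. The sketch below shows that relationship in plain Python with a ray-casting test. It is an illustration only: dataplay works with geopandas geometries and its implementation may differ, and the helper `point_in_polygon`, the square polygons, their names, and the points are all made up.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside `polygon`, a list of (x, y)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's y value can be crossed
        # by a horizontal ray cast to the right of the point.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Two adjacent squares standing in for CSA polygons (hypothetical).
polygons = {
    "West": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "East": [(2, 0), (4, 0), (4, 2), (2, 2)],
}
points = [(0.5, 0.5), (1.5, 1.0), (3.0, 1.0), (5.0, 5.0)]

# Reverse operation: label each point with the polygon containing it (or None).
labels = [
    next((name for name, poly in polygons.items() if point_in_polygon(x, y, poly)), None)
    for x, y in points
]

# Forward operation: count the points that fall inside each polygon.
counts = {name: labels.count(name) for name in polygons}

print(labels)
print(counts)
```

Points that fall outside every polygon receive the label `None` and contribute to no count, which is worth checking for when a dataset extends beyond the city boundary.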
And then we can visualize it like:
pt_radius=1, draw_heatmap=True, heat_map_weights_col=None, heat_map_weights_normalize=True, heat_map_radius=15, popup='CSA2010')
These interactive visualizations can be exported to HTML using a tool found later in this document.
It's how I made this page!
If you like what you see, there is more in the package; you will just have to explore.