Today we will run through a few experiments to work with data.
We will be using a library created by BNIA, among others.

    import pandas as pd
    import geopandas as gpd
    import matplotlib.pyplot as plt
    import networkx as nx
    import warnings

    warnings.filterwarnings('ignore')

Let's start from where we left off last time.

    import csv

    # `shortname`, the service name in the URL, is assumed to be set already.
    baseurl = "https://services1.arcgis.com/mVFRs7NF4iFitgbY/ArcGIS/rest/services/"
    slug = "/FeatureServer/0/query?where=1%3D1&outFields=*&returnGeometry=true&f=pgeojson"
    url = baseurl + shortname + slug

    # Load the layer, index it by CSA, and drop the bookkeeping columns.
    gdf = (
        gpd.read_file(url)
        .set_index('CSA2010')
        .drop(columns=['OBJECTID', 'Shape__Area', 'Shape__Length'])
    )

    # Save the attribute table (without geometry) as a fully quoted CSV.
    gdf.drop(columns=['geometry']).to_csv(shortname + '.csv', quoting=csv.QUOTE_ALL)

CSA2010 | libcard11 | libcard12 | libcard13 | libcard14 | libcard15 | libcard16 | libcard17 | libcard18 | libcard19 |
---|---|---|---|---|---|---|---|---|---|
Allendale/Irvington/S. Hilton | 194.672258 | 206.326694 | 185.546032 | 318.616267 | 328.975766 | 276.006660 | 229.882222 | 233.397052 | 214.527964 |
Beechfield/Ten Hills/West Hills | 153.212655 | 153.131115 | 140.410959 | 249.510763 | 261.986301 | 225.782779 | 178.082192 | 175.391389 | 167.482061 |
Belair-Edison | 319.418925 | 310.289389 | 261.311438 | 443.959577 | 463.711530 | 401.067983 | 337.792834 | 343.706936 | 333.486449 |
Brooklyn/Curtis Bay/Hawkins Point | 229.726883 | 195.464439 | 187.109457 | 307.589693 | 352.383627 | 296.145475 | 272.344309 | 269.255073 | 254.370568 |
Canton | 267.777778 | 235.308642 | 169.382716 | 284.320988 | 299.753086 | 284.938272 | 269.259259 | 288.518519 | 309.259259 |
CSA2010 | index | Allendale/Irvington/S. Hilton | Beechfield/Ten Hills/West Hills | Belair-Edison | Brooklyn/Curtis Bay/Hawkins Point | Canton | Cedonia/Frankford | Cherry Hill | Chinquapin Park/Belvedere | Claremont/Armistead | Clifton-Berea | Cross-Country/Cheswolde | Dickeyville/Franklintown | Dorchester/Ashburton | Downtown/Seton Hill | Edmondson Village | Fells Point | Forest Park/Walbrook | Glen-Fallstaff | Greater Charles Village/Barclay | Greater Govans | Greater Mondawmin | Greater Roland Park/Poplar Hill | Greater Rosemont | Greenmount East | Hamilton | Harbor East/Little Italy | Harford/Echodale | Highlandtown | Howard Park/West Arlington | Inner Harbor/Federal Hill | Lauraville | Loch Raven | Madison/East End | Medfield/Hampden/Woodberry/Remington | Midtown | Midway/Coldstream | Morrell Park/Violetville | Mount Washington/Coldspring | North Baltimore/Guilford/Homeland | Northwood | Oldtown/Middle East | Orangeville/East Highlandtown | Patterson Park North & East | Penn North/Reservoir Hill | Pimlico/Arlington/Hilltop | Poppleton/The Terraces/Hollins Market | Sandtown-Winchester/Harlem Park | South Baltimore | Southeastern | Southern Park Heights | Southwest Baltimore | The Waverlies | Upton/Druid Heights | Washington Village/Pigtown | Westport/Mount Winans/Lakeland |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | libcard11 | 194.672258 | 153.212655 | 319.418925 | 229.726883 | 267.777778 | 216.283907 | 323.579615 | 236.719959 | 182.723849 | 279.521977 | 78.256867 | 114.362351 | 206.261666 | 381.01148 | 293.037975 | 249.253236 | 279.114631 | 125.050288 | 244.402416 | 289.954124 | 239.219052 | 486.512132 | 256.295758 | 284.946237 | 224.427011 | 319.585722 | 188.550389 | 362.068966 | 141.175389 | 314.352392 | 266.601483 | 212.788191 | 330.805809 | 328.329883 | 386.08522 | 275.249377 | 67.157519 | 268.962848 | 243.987632 | 272.066334 | 313.641353 | 281.458767 | 372.671661 | 299.338022 | 163.083954 | 277.624853 | 276.517186 | 246.175461 | 173.801917 | 183.604336 | 205.870841 | 358.957823 | 293.656933 | 318.916954 | 155.499368 |
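
The notebook does not show how the wide table above was produced, nor how the `test` frame used in the plot was built. As a minimal sketch, and purely as an assumption, a transpose followed by `reset_index` and `melt` would give a frame with the same `index` and `value` columns. The name `long_df` is hypothetical, and the numbers are a few values rounded from the first table.

```python
import pandas as pd

# Toy stand-in for the library-card table: rows are CSAs, columns are years.
# Values are rounded from the first table; only three CSAs and two years are kept.
wide = pd.DataFrame(
    {"libcard11": [194.7, 153.2, 319.4], "libcard12": [206.3, 153.1, 310.3]},
    index=pd.Index(
        [
            "Allendale/Irvington/S. Hilton",
            "Beechfield/Ten Hills/West Hills",
            "Belair-Edison",
        ],
        name="CSA2010",
    ),
)

# Transpose so each year is a row, then move the year labels into a column,
# which pandas names "index" by default.
by_year = wide.T.reset_index()

# Melt to long format: one row per (year, CSA) pair, with the number in "value".
long_df = by_year.melt(id_vars="index", var_name="CSA2010", value_name="value")
print(long_df)
```

A long frame like this can be passed to seaborn with `x="value"` and `y="index"`, giving one box per year.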

The next plot is adapted from the seaborn horizontal boxplot example: https://seaborn.pydata.org/examples/horizontal_boxplot.html

    import matplotlib.pyplot as plt
    import seaborn as sns

    sns.set_theme(style="ticks")

    # Initialize the figure with a logarithmic x axis
    f, ax = plt.subplots(figsize=(7, 6))
    ax.set_xscale("log")

    # `test` is assumed to be a long-format DataFrame built earlier,
    # with the year label in "index" and the measurement in "value".
    # Plot the distribution of values for each year with horizontal boxes
    sns.boxplot(x="value", y="index", data=test, whis=[0, 100], width=.6, palette="vlag")

    # Add in points to show each observation
    sns.stripplot(x="value", y="index", data=test, size=4, color=".3", linewidth=0)

    # Tweak the visual presentation
    ax.xaxis.grid(True)
    ax.set(ylabel="")
    sns.despine(trim=True, left=True)

What we want is one record per year, with every CSA as a column. To get there, transpose the dataset, set the CSA labels (the first row) as the column names, relabel the index for clarity, and cast the values to floats.

    # Promote the first row (the CSA labels) to column names, then drop it.
    vs10to19Indt.columns = vs10to19Indt.iloc[0]
    vs10to19Indt = vs10to19Indt[1:]

    # Relabel the index and cast the values to floats.
    vs10to19Indt.index.name = 'variable'
    vs10to19Indt = vs10to19Indt.astype('float64')

    # Pairwise correlation between CSA columns.
    cor_matrix = vs10to19Indt.corr()

    # Show the first 5 rows
    cor_matrix.head(5)

CSA2010 | Oldtown/Middle East | Loch Raven | Mount Washington/Coldspring | Greater Charles Village/Barclay | Dorchester/Ashburton | Lauraville | Orangeville/East Highlandtown | Cherry Hill | Greater Govans | Edmondson Village | Chinquapin Park/Belvedere | Belair-Edison | Pimlico/Arlington/Hilltop | Downtown/Seton Hill | Greater Roland Park/Poplar Hill | North Baltimore/Guilford/Homeland | Inner Harbor/Federal Hill | Greater Rosemont | Morrell Park/Violetville | Medfield/Hampden/Woodberry/Remington | Patterson Park North & East | Cedonia/Frankford | Harford/Echodale | Harbor East/Little Italy | Midtown | Howard Park/West Arlington | Northwood | South Baltimore | Washington Village/Pigtown | Southern Park Heights | Beechfield/Ten Hills/West Hills | Hamilton | Southwest Baltimore | Cross-Country/Cheswolde | Allendale/Irvington/S. Hilton | Greater Mondawmin | Glen-Fallstaff | Fells Point | Southeastern | Poppleton/The Terraces/Hollins Market | Canton | Brooklyn/Curtis Bay/Hawkins Point | Westport/Mount Winans/Lakeland | Forest Park/Walbrook | Penn North/Reservoir Hill | Sandtown-Winchester/Harlem Park | Highlandtown | Claremont/Armistead | Dickeyville/Franklintown | Upton/Druid Heights | Clifton-Berea | Madison/East End | The Waverlies | Midway/Coldstream | Greenmount East |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Oldtown/Middle East | 1.000000 | 0.856143 | 0.773910 | 0.088855 | -0.608827 | -0.499527 | 0.814692 | 0.403881 | 0.251116 | 0.494936 | 0.199884 | -0.168226 | 0.678366 | 0.690350 | 0.528017 | 0.295356 | -0.064447 | -0.629057 | 0.304308 | 0.618626 | -0.388397 | -0.652071 | -0.082405 | -0.421950 | -0.185125 | -0.061869 | -0.172368 | -0.198497 | -0.440238 | -0.589253 | -0.500684 | -0.208732 | -0.597928 | 0.168858 | -0.541845 | -0.578843 | -0.538057 | -0.573336 | -0.146943 | -0.512577 | -0.213811 | -0.252186 | -0.653031 | -0.525645 | -0.665015 | -0.481733 | -0.309692 | -0.550863 | 0.265893 | -0.500873 | -0.681230 | -0.801715 | -0.578070 | -0.494797 | -0.267336 |
Loch Raven | 0.856143 | 1.000000 | 0.763849 | 0.099961 | -0.418352 | -0.260345 | 0.611999 | 0.627270 | 0.323862 | 0.385895 | 0.043460 | -0.061750 | 0.398526 | 0.616327 | 0.341074 | 0.376594 | -0.060724 | -0.402143 | 0.316809 | 0.521855 | -0.359277 | -0.502731 | -0.187982 | -0.524136 | -0.396105 | 0.201605 | -0.268888 | -0.413832 | -0.504003 | -0.629982 | -0.130710 | -0.343178 | -0.578836 | 0.290089 | -0.381243 | -0.510632 | -0.621275 | -0.610172 | -0.193038 | -0.659241 | -0.385994 | -0.232488 | -0.373138 | -0.578257 | -0.448557 | -0.573299 | -0.495425 | -0.710760 | -0.021265 | -0.492105 | -0.522877 | -0.790195 | -0.625502 | -0.642093 | -0.544531 |
Mount Washington/Coldspring | 0.773910 | 0.763849 | 1.000000 | 0.230172 | -0.298009 | -0.061855 | 0.720158 | 0.448142 | -0.147221 | 0.345549 | 0.191383 | 0.064190 | 0.587420 | 0.641132 | 0.504973 | 0.453759 | 0.054264 | -0.543299 | 0.292559 | 0.729633 | -0.342624 | -0.448081 | -0.115320 | -0.337317 | -0.117033 | -0.119977 | -0.045260 | -0.460318 | -0.240693 | -0.591640 | -0.451999 | -0.228413 | -0.471439 | 0.196364 | -0.302116 | -0.362125 | -0.155026 | -0.556791 | -0.029925 | -0.600164 | -0.235837 | -0.469413 | -0.424707 | -0.567754 | -0.711652 | -0.537917 | -0.194337 | -0.372815 | 0.149830 | -0.581333 | -0.889449 | -0.505319 | -0.424233 | -0.428873 | -0.260162 |
Greater Charles Village/Barclay | 0.088855 | 0.099961 | 0.230172 | 1.000000 | 0.658306 | 0.665591 | 0.188599 | 0.174670 | 0.329220 | 0.438693 | 0.663298 | 0.812766 | -0.255950 | 0.576518 | 0.570419 | 0.357486 | 0.798395 | 0.349247 | -0.042138 | 0.096213 | 0.726742 | 0.615049 | 0.440104 | 0.718092 | 0.656386 | -0.338772 | 0.443720 | 0.548300 | 0.544289 | 0.538334 | 0.321435 | 0.597435 | 0.508949 | 0.246747 | 0.589688 | 0.630992 | 0.261127 | 0.495503 | -0.071247 | 0.189300 | 0.644059 | 0.217443 | 0.163802 | 0.254908 | 0.201430 | 0.593543 | 0.529142 | 0.467451 | 0.033932 | 0.358678 | -0.172168 | 0.106381 | 0.184152 | 0.267192 | 0.520408 |
Dorchester/Ashburton | -0.608827 | -0.418352 | -0.298009 | 0.658306 | 1.000000 | 0.868057 | -0.449121 | -0.180266 | 0.071918 | -0.067083 | 0.293627 | 0.578843 | -0.663439 | -0.172799 | 0.095496 | 0.048037 | 0.595247 | 0.827268 | -0.364005 | -0.362925 | 0.805920 | 0.888107 | 0.160464 | 0.713537 | 0.622471 | -0.176538 | 0.313109 | 0.475831 | 0.814534 | 0.735080 | 0.550078 | 0.539397 | 0.612604 | -0.023456 | 0.835669 | 0.827755 | 0.542015 | 0.713982 | -0.061736 | 0.461562 | 0.470096 | 0.263571 | 0.733762 | 0.444707 | 0.570176 | 0.727140 | 0.547960 | 0.568392 | -0.208497 | 0.515962 | 0.359093 | 0.592483 | 0.350640 | 0.504560 | 0.442400 |
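
To make the header-promotion, casting, and correlation steps easier to check, here is a small self-contained sketch on a toy frame. The CSA names (`CSA A`, `CSA B`, `CSA C`), the variable names, and the numbers are all illustrative assumptions, not values from the dataset. Only the sequence of pandas calls mirrors the steps described above.

```python
import pandas as pd

# Toy frame shaped like the transposed data: the first row holds the CSA
# labels and the remaining rows hold one year each, stored as strings.
raw = pd.DataFrame(
    [
        ["CSA A", "CSA B", "CSA C"],
        ["1.0", "2.0", "9.0"],
        ["2.0", "4.0", "7.0"],
        ["3.0", "6.0", "5.0"],
    ],
    index=["CSA2010", "libcard11", "libcard12", "libcard13"],
)

# Promote the first row to column labels, then drop it from the data.
raw.columns = raw.iloc[0]
years = raw[1:].copy()

# Relabel the index and cast the string values to floats.
years.index.name = "variable"
years = years.astype("float64")

# Pearson correlation between every pair of CSA columns.
cor = years.corr()
print(cor.round(2))
```

In this toy data `CSA A` and `CSA B` rise together, so their correlation is 1, while `CSA C` falls as `CSA A` rises, so that pair correlates at -1. With only nine yearly observations per CSA in the real table, individual coefficients should be read with some caution.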