


About this Tutorial:

⚠️ The writing is a work in progress. The functions work, but the text still needs retouching. ⚠️

Please read everything on the main page before continuing, disclaimer and all.

In graph theory, a clustering coefficient is a measure of the degree to which nodes in a graph tend to cluster together. Evidence suggests that in most real-world networks, and in particular social networks, nodes tend to create tightly knit groups characterized by a relatively high density of ties; this likelihood tends to be greater than the average probability of a tie randomly established between two nodes (Holland and Leinhardt, 1971; Watts and Strogatz, 1998).

Two versions of this measure exist: the global and the local. The global version was designed to give an overall indication of the clustering in the network, whereas the local gives an indication of the embeddedness of single nodes. - GeeksforGeeks


Data Prep

Keep only the columns we want to work with.

We want one record per year, with each CSA as a column. To get there, transpose the dataset, set the CSA labels (the first row after transposing) as the column names, relabel the index for clarity, and cast the values to numeric types.
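The reshaping step described above can be sketched roughly as follows. This is a minimal illustration using pandas, not the tutorial's original cell: the frame `raw`, its column names, and all values are made up for the example.

```python
import pandas as pd

# Hypothetical sample: one row per CSA, one column per year (values are made up).
raw = pd.DataFrame({
    "CSA": ["Area A", "Area B", "Area C"],
    "2015": ["10.0", "20.0", "30.0"],
    "2016": ["11.0", "19.0", "33.0"],
    "2017": ["12.5", "18.0", "35.0"],
})

# Transpose so that each row is a year and each column is a CSA.
df = raw.T

# After transposing, the first row holds the CSA labels; use it as the header.
df.columns = df.iloc[0]
df = df.iloc[1:]

# Relabel the index for clarity and drop the leftover "CSA" header name.
df.index.name = "year"
df.columns.name = None

# Cast the values to floats and the year labels to integers.
df = df.astype(float)
df.index = df.index.astype(int)

print(df)
```

Casting matters here because a transposed frame that mixed labels and numbers ends up with `object` columns, which `corr()` and similar numeric methods will not handle as intended.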

A. Calculate the correlation matrix

cor_matrix contains the full correlation matrix. The table below shows a snapshot of the first 5 rows.
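As a rough sketch of how such a matrix might be computed, assuming the year-by-CSA frame from the data prep step and pandas' default Pearson correlation (the sample values below are invented):

```python
import pandas as pd

# Hypothetical year-by-CSA frame (values are made up for illustration).
df = pd.DataFrame(
    {
        "Area A": [10.0, 11.0, 12.0, 13.0],
        "Area B": [20.0, 19.0, 18.0, 17.0],
        "Area C": [30.0, 33.0, 35.0, 36.0],
    },
    index=[2015, 2016, 2017, 2018],
)

# Pairwise Pearson correlation between the CSA columns.
cor_matrix = df.corr()

# Snapshot of the first 5 rows (this toy frame only has 3).
print(cor_matrix.head())
```

The result is a square, symmetric table with one row and one column per CSA and ones on the diagonal.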

B. Create graph
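One possible way to turn a correlation matrix into a graph is to treat each CSA as a node and connect two nodes when their correlation is strong enough. The sketch below builds a plain edge list under that assumption. The `THRESHOLD` value and the sample matrix are hypothetical, and the tutorial's own approach may differ.

```python
from itertools import combinations

import pandas as pd

names = ["Area A", "Area B", "Area C"]

# Hypothetical correlation matrix (values are made up for illustration).
cor_matrix = pd.DataFrame(
    [[1.0, 0.9, 0.2],
     [0.9, 1.0, -0.8],
     [0.2, -0.8, 1.0]],
    index=names,
    columns=names,
)

# Assumed cutoff: keep a pair only if |correlation| is at least this large.
THRESHOLD = 0.7

# Visit each unordered pair once, so self-loops and duplicates are skipped.
edges = []
for a, b in combinations(cor_matrix.columns, 2):
    weight = float(cor_matrix.loc[a, b])
    if abs(weight) >= THRESHOLD:
        edges.append((a, b, weight))

print(edges)
```

An edge list like this can then be handed to whichever graph or plotting library is used for drawing.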

C. Styling the nodes based on the number of edges linked (degree)
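A node's degree is the number of edges attached to it. As a minimal sketch, assuming an edge list of CSA pairs and an arbitrary size formula (`BASE_SIZE` and `SIZE_PER_EDGE` are hypothetical styling constants), degree-based node sizes could be derived like this:

```python
from collections import Counter

# Hypothetical edge list of CSA pairs (made up for illustration).
edges = [
    ("Area A", "Area B"),
    ("Area B", "Area C"),
    ("Area B", "Area D"),
]

# Count how many edges touch each node: that count is the node's degree.
degree = Counter()
for a, b in edges:
    degree[a] += 1
    degree[b] += 1

# Assumed styling rule: a base size plus a fixed increment per edge.
BASE_SIZE = 100
SIZE_PER_EDGE = 50
node_size = {node: BASE_SIZE + SIZE_PER_EDGE * d for node, d in degree.items()}

print(node_size)
```

Better-connected CSAs are drawn larger, which makes hubs in the correlation network easy to spot.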

We want to fit a linear regression for each CSA, using {X: year, Y: value} for a given indicator.
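A per-CSA trend line could be fitted along these lines. This is a sketch using `numpy.polyfit` with degree 1 on an invented year-by-CSA frame. Measuring years from the first year is an assumption made here to keep the fit well conditioned, so the intercept is the fitted value at that first year.

```python
import numpy as np
import pandas as pd

# Hypothetical year-by-CSA frame for one indicator (values are made up).
df = pd.DataFrame(
    {
        "Area A": [10.0, 12.0, 14.0],
        "Area B": [30.0, 27.0, 24.0],
    },
    index=[2015, 2016, 2017],
)

# X: years, measured from the first year.
years = df.index.to_numpy(dtype=float)
x = years - years[0]

# Fit Y = slope * X + intercept separately for each CSA column.
trends = {}
for csa in df.columns:
    slope, intercept = np.polyfit(x, df[csa].to_numpy(), deg=1)
    trends[csa] = {"slope": float(slope), "intercept": float(intercept)}

print(trends)
```

The slope gives the average change in the indicator per year for that CSA.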

We may need to normalize the data for this to be usable.
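The text does not say which normalization is intended. As one possibility, min-max scaling puts every CSA column on a 0 to 1 range so that slopes are comparable across CSAs with very different magnitudes. The sketch below assumes that choice and uses invented data.

```python
import pandas as pd

# Hypothetical year-by-CSA frame (values are made up for illustration).
df = pd.DataFrame(
    {
        "Area A": [10.0, 12.0, 14.0],
        "Area B": [3000.0, 2700.0, 2400.0],
    },
    index=[2015, 2016, 2017],
)

# Min-max scaling per column: (value - min) / (max - min).
# Note: a column whose values never change would divide by zero and yield NaN.
col_min = df.min()
col_range = df.max() - col_min
normalized = (df - col_min) / col_range

print(normalized)
```

Z-score standardization, `(df - df.mean()) / df.std()`, is a common alternative when outliers make the min and max unrepresentative.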