
Intro

Start by importing pandas into the Python environment.

Click on the next 'cell' and hit Shift+Enter to execute the code within.

import pandas as pd

Now import the Excel document using the library you just imported and sneak a peek at its contents.

You will need to drag and drop the file into the virtual directory (left-hand side -> folder icon). This drag and drop feature is only permitted once a (any) code cell has been run.
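A minimal version of that cell, assuming the file is named xport3.xlsx (the name it shows under in the directory listing further down):

df = pd.read_excel('xport3.xlsx')
df.head()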

   Refuge                              Region  Comment
0  Bald Knob National Wildlife Refuge  IR1     Condition of the road system from when it was ...
1  Bald Knob National Wildlife Refuge  IR1     One complaint – can't ride a 4-wheeler in to f...
2  Bald Knob National Wildlife Refuge  IR1     Not very many handicap trails.
3  Bald Knob National Wildlife Refuge  IR1     Would like to see more area cleared on the roa...
4  Bald Knob National Wildlife Refuge  IR1     Many of the information signs, especially the ...

Here's a whole bunch of things to import all at once.

If comments aren't inlined, they're either explained later or irrelevant.
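A sketch of that import cell, guessing at the set from what the later sections actually use (word counting, plotting, image masks, and word clouds):

import string
import numpy as np
import matplotlib.pyplot as plt
from collections import Counter
from PIL import Image
from wordcloud import WordCloud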

How many records and columns were in that dataset, again?
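In pandas that's just the shape attribute:

df.shape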

(2286, 3)

Oh. Right...

I guess we only really need the Refuge/Comment pairs.
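Something like this, selecting just those two columns (in the order the output below shows):

df = df[['Comment', 'Refuge']]
df.head()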

   Comment                                             Refuge
0  Condition of the road system from when it was ...  Bald Knob National Wildlife Refuge
1  One complaint – can't ride a 4-wheeler in to f...  Bald Knob National Wildlife Refuge
2  Not very many handicap trails.                     Bald Knob National Wildlife Refuge
3  Would like to see more area cleared on the roa...  Bald Knob National Wildlife Refuge
4  Many of the information signs, especially the ...  Bald Knob National Wildlife Refuge

Nice! What Refuges are there?
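One way to list them, assuming the column is still named Refuge:

df['Refuge'].unique()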

Word Analysis

Let's take those comments and look into them a bit. Start by importing what's needed for this section.
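Presumably something like the following, which matches the download log below:

import nltk
nltk.download('stopwords')
from nltk.corpus import stopwords
stop_words = set(stopwords.words('english'))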

[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Unzipping corpora/stopwords.zip.

This next block will clean up the data. Descriptions are given above most lines of code.
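A plausible version of that clean-up block (the variable names here are guesses):

# Glue every comment into one long string
text = ' '.join(df['Comment'].astype(str))
# Lowercase so 'Road' and 'road' count as the same word
text = text.lower()
# Strip punctuation
text = text.translate(str.maketrans('', '', string.punctuation))
# Split into individual words
words = text.split()
# Drop common English stopwords ('the', 'and', 'of', ...)
words = [w for w in words if w not in stop_words]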

Now that we have our clean text, let's see those word counts!
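Counting with collections.Counter and wrapping the result in a DataFrame gives the table below:

word_counts = pd.DataFrame(Counter(words).most_common(), columns=['words', 'count'])
word_counts.head()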

   words    count
0  refuge     684
1  road       551
2  parking    399
3  would      383
4  roads      354

Awesome. Let's save it to a CSV.
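Using the filename that shows up in the directory listing later:

word_counts.to_csv('NaomiKeywordCount.csv', index=False)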

And plot these word counts horizontally as well.
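A quick horizontal bar chart of the top words, sketched with matplotlib:

top = word_counts.head(20)
plt.figure(figsize=(8, 6))
plt.barh(top['words'], top['count'])
plt.gca().invert_yaxis()  # biggest count at the top
plt.xlabel('count')
plt.show()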

Simple Word Cloud

We found this function online (attribution missing, unfortunately).

It will draw a wordcloud for you if you give it data and color specifics.

This next function will display our wordcloud
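Since we can't credit the original, here's a minimal stand-in built on the wordcloud package that does the same job (the names are ours):

def make_wordcloud(frequencies, mask=None, font_path=None, color_func=None):
    # Build the cloud from a {word: count} dict; mask, font, and colors are optional
    wc = WordCloud(background_color='white', max_words=200,
                   mask=mask, font_path=font_path, color_func=color_func)
    return wc.generate_from_frequencies(frequencies)

def show_wordcloud(wc):
    # Render with matplotlib and hide the axes
    plt.figure(figsize=(10, 8))
    plt.imshow(wc, interpolation='bilinear')
    plt.axis('off')
    plt.show()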

Let's make two.

Start by downloading the Fish and Wildlife logo.
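In Colab you can pull it down with wget; the URL here is a placeholder, since we only know the filename it ends up with:

# Placeholder URL -- point this at wherever the logo actually lives
!wget -O 8170-clarified-interpretation-could-change-u-s-fish-and-wildlife-policy.jpg "https://example.com/fws-logo.jpg"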

This function will list content in your current directory. You should see the jpg here. Take note.
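Probably just the shell listing (os.listdir('.') works too):

!ls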

8170-clarified-interpretation-could-change-u-s-fish-and-wildlife-policy.jpg NaomiKeywordCount.csv sample_data/ xport3.xlsx

Now 'open' that image into a variable, then convert the picture into an array of arrays of numbers, where each sub-array represents an RGB pixel of the picture and holds its R, G, and B values.
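With PIL and numpy, using the filename from the listing above:

logo = Image.open('8170-clarified-interpretation-could-change-u-s-fish-and-wildlife-policy.jpg')
mask = np.array(logo)  # shape (height, width, 3): one [R, G, B] triple per pixel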

You'll now need to retrieve the 'Poppins' font face from Google Fonts (if that's the font face you want to use). Be sure to upload it the same way you did the Excel sheet.

If you uploaded the raw zip file, you can unzip it using these two terminal commands.
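Assuming the download is named Poppins.zip, something like:

!unzip Poppins.zip
!ls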

Then store the font face in a variable as well
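The Google Fonts zip includes Poppins-Regular.ttf, so (the path is an assumption):

font_path = 'Poppins-Regular.ttf'  # hand this to WordCloud via font_path=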

(Two word cloud images render here.)

Very pretty. Now let's save 'em.
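The wordcloud object can write itself straight to disk; the variable and file names here are placeholders:

wc_logo.to_file('wordcloud_logo.png')    # the logo-masked cloud
wc_plain.to_file('wordcloud_plain.png')  # the plain one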

Bigram!

And this bit gets the most common bigrams
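Likely nltk.bigrams plus a Counter. Note that feeding it the raw string rather than the words list produces character pairs, which is exactly what the table below shows (pass words instead for word bigrams):

from nltk import bigrams
bigram_counts = pd.DataFrame(Counter(bigrams(text)).most_common(),
                             columns=['bigram', 'count'])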

Let's take a peek.

      bigram  count
0     (t, h)   5052
1     (h, e)   4899
2     (r, e)   4216
3     (i, n)   4033
4     (e, r)   3537
...      ...    ...
1208  ((, C)      1
1209  (U, -)      1
1210  (9, 9)      1
1211  (i, u)      1
1212  (', n)      1

1213 rows × 2 columns

Network Graph

We can save those bigrams as a CSV, but it'd also be very nice to see how each of these bigrams relates to the others. We didn't actually get to finish this, so it'd be best to skip over this entire section.

NLP Sentiment Analysis

And this is where we can run sentiment analysis

This line will install the library
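Assuming the HuggingFace transformers library, which fits the label-probability behavior described below:

!pip install transformers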

This one will import it and create a utility variable that we can use later.
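Despite the 'sentiment' name on this section, the described behavior matches zero-shot classification; classifier is our name for the utility variable:

from transformers import pipeline
classifier = pipeline('zero-shot-classification')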

This is the function that will create the predictions and return any label with a greater than 50% probability of being applicable to the comment.
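A sketch of that function, using the label list printed in the output below (multi_label=True scores each label independently):

labels = ['Road', 'Trail', 'Parking', 'Boat', 'Access', 'Sign', 'Safety', 'Maintain']

def predict_labels(comment):
    result = classifier(comment, candidate_labels=labels, multi_label=True)
    # Keep only labels the model scores above 50%
    return [label for label, score in zip(result['labels'], result['scores'])
            if score > 0.5]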

We call the function here and store the results into a csv for each Refuge.
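Roughly, grouping by refuge and writing one CSV each (the filenames are a guess):

for refuge, group in df.groupby('Refuge'):
    out = group.copy()
    out['Labels'] = out['Comment'].apply(predict_labels)
    out.to_csv(f'{refuge}.csv', index=False)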

['Road', 'Trail', 'Parking', 'Boat', 'Access', 'Sign', 'Safety', 'Maintain'] 0 1 2

And that's the end! Everything below is scrap notes and tests.

  1. Pos Neg

  2. Group by Theme

  3. Group by Refuge - Find Theme