Python and Colab

A free book covering Python data science with notebooks may be found here. It uses Jupyter Notebook, which Google Colab is built on.

Information for this section was pulled from a variety of resources. Click on the links to learn more!

The Colab Environment

Before we get into the gritty details, take a moment to explore the Colab environment.

Setup & Configuration:

Begin by visiting https://colab.research.google.com
Click 'NEW PYTHON 3 NOTEBOOK'
For the most part, that is all it takes!
Many modules are already pre-installed on the virtual environment.

The following articles can help get you started. Excerpts have been selected and shown in block quotes.

Welcome to Colaboratory

Source

Colaboratory, or "Colab" for short, allows you to write and execute Python in your browser, with:

Zero configuration required
Free access to GPUs
Easy sharing

The document you are reading is not a static web page, but an interactive environment called a Colab notebook that lets you write and execute code.

To execute the code... use the keyboard shortcut "Command/Ctrl+Enter".

Colab notebooks allow you to combine markup, executable code, and text into a single document, along with images, HTML, LaTeX and more. When you create your own Colab notebooks, they are stored in your Google Drive account. You can easily share your Colab notebooks with co-workers or friends, allowing them to comment on your notebooks or even edit them. To learn more, see Overview of Colab. To create a new Colab notebook you can use the File menu above, or use the following link: create a new Colab notebook.

Colab notebooks are Jupyter notebooks that are hosted by Colab. To learn more about the Jupyter project, see jupyter.org.

All blockquotes in the section above were pulled from the header's link.

Colab Menu Bar

Everything you need can be found in your menu bar.

Follow the brief outline below:

File (accessible on the left hand drawer)

Locate in drive
New, Open, Upload, Save, Download
Save to GitHub or Drive

Edit

Undo
Select all, Cut, Copy, Paste, Delete
Find, Replace
Show/Hide all code
Clear all code outputs

View

Table of Contents (accessible on the left hand drawer)
Executed Code History
Diff Notebooks
Collapse Sections

Insert

Code/Text Cell
Section Header
Code Snippet (accessible on the left hand drawer)

Runtime

Run - Execute all cells, or only the cells before, after, or in the current selection.
Interrupt Execution - Just in case the code is caught in an infinite loop or is hanging.
Restart (and optionally re-run all) - Installed modules are kept but must be re-imported.
Factory reset runtime - Must re-install all modules

Tools

Command Palette - Clickable menu of shortcuts
Settings
- Site - Set theming
- Editor - Set indentation, fontsize, line width
- Misc - Enable 'Corgi' and/or 'Kitty' mode.
Keyboard Shortcuts

Help

FAQ
Ask a question on Stack Overflow

Overview of Colaboratory Features

Features in the header link's article are accessible from the Menu Bar.

Colaboratory "magics" are shorthand annotations that change how a cell's text is executed.

Much more on this is covered below. For now, observe what you can do with it:

Here, we use the `%%html` magic on the first line of the code block so that the remaining lines are rendered as HTML.

With magics, you can execute terminal commands straight from a code block!

Preface your terminal command with a `!` so the interpreter knows the text is not Python.

Warning: use `cd` or `%cd` to change directories; `!cd` will not work as expected.

This is because each `!` command runs in its own temporary subshell, which means a change-directory command run with `!cd` won't persist, unless you use `%cd` instead.

Python variables can take the output of a terminal command (for example, `files = !ls`), and terminal commands can take Python variables using the `{python variable}` syntax (for example, `!echo {name}`).
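Outside a notebook, the same two ideas (passing a Python variable into a command, and capturing the command's output in a Python variable) can be sketched with the standard `subprocess` module. This is only a rough analogue of the `!` syntax, not how Colab implements it. The variable names are illustrative, and the sketch assumes a POSIX shell with `echo` available, as on a Colab virtual machine.

```python
import shlex
import subprocess

# A Python variable we want to hand to the shell (illustrative value).
name = "Colab user"

# Interpolate the variable into the command, quoting it so the shell
# treats it as a single argument.
command = f"echo Hello, {shlex.quote(name)}"

# Run the command and capture what it prints as a Python string.
result = subprocess.run(
    command, shell=True, capture_output=True, text=True, check=True
)
output = result.stdout.strip()

print(output)
```

Quoting with `shlex.quote` is a defensive choice for values that contain spaces or shell metacharacters.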

To change directories so that the change persists across cells, use:

%cd ./filepath/
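Why a directory change made in a shell command does not persist can be illustrated in plain Python. The sketch below assumes a POSIX shell and uses a throwaway temporary directory. The child shell stands in for a `!cd` command, and `os.chdir` stands in for a change made in the notebook's own process.

```python
import os
import shlex
import subprocess
import tempfile

start_dir = os.getcwd()
target_dir = os.path.realpath(tempfile.mkdtemp())

# Like `!cd`: the change happens inside a child shell and is lost
# as soon as that shell exits.
child = subprocess.run(
    f"cd {shlex.quote(target_dir)} && pwd -P",
    shell=True, capture_output=True, text=True, check=True,
)
child_dir = child.stdout.strip()
dir_after_child = os.getcwd()

# Changing the directory of the Python process itself does persist.
os.chdir(target_dir)
dir_after_chdir = os.path.realpath(os.getcwd())

print("child shell moved:", child_dir == target_dir)
print("parent unchanged after child:", dir_after_child == start_dir)
print("parent moved after os.chdir:", dir_after_chdir == target_dir)

# Clean up: go back and remove the temporary directory.
os.chdir(start_dir)
os.rmdir(target_dir)
```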

More Tricks

Other advanced code tricks include the following:

Hosting notebooks online using GitHub and myBinder.
Notebooks can also be collaboratively edited by sharing a link on Google Drive.
Colabs can connect to and run on your local machine.

Markup

In computer text processing, a markup language is a system for annotating a document in a way that is syntactically distinguishable from the text, meaning when the document is processed for display, the markup language is not shown, and is only used to format the text. - Wikipedia

Markdown Guide

A) Markdown is a lightweight markup language used to turn plain text into rich text.

B) Text cells (not code-cells) in Google Colab will automatically understand Markdown and display it appropriately.

C) Within Colabs, many HTML elements can readily be rendered within Markdown cells like the enriched text in this sentence.

This is not a given in other Markdown viewers, and it can be prevented by wrapping the HTML in backticks, as in `<u>with backticks</u>`.

Badges

Badges are (typically) action-enabled icons used to catch the reader's attention. These are often displayed using HTML or Markdown.

Pick a template and create your own badge from shields.io to get started!

More on Markdown:

Markdown Vs. Markup
Generic GitHub Guide
Basic writing and formatting syntax
Writing on GitHub

Flags: Magics and Comments

A. 'Flags' are a special form of shorthand annotation that change how code blocks are executed.

B. These annotations augment the interpreter's handling of a cell or line.

C. Flags are placed on the first line of a cell or on a per-line basis, depending on intent.

D. There are two types of flags: comments and magics.

Magics are identified by a `%` prefix: a cell magic uses two `%` characters (`%%`) on the first line of the cell, followed by the intended effect, while a line magic uses a single `%`.
Comments use a single `#` and are less favored since the `#` symbol is already overloaded.

In everyday writing, a `#` prefaces a number; what's more, Python uses `#` to begin a comment and Markdown uses `#` characters to denote a header element.

Common Uses:

A) Create section titles from within a codeblock using

#@title <TITLENAME>

B) Suppress cell output using

%%capture

C) Execute terminal commands in a cell by prefacing them with the `!` prefix.

D) Comment out a line in your code using the `#` prefix.

E) Render the whole cell as HTML or JavaScript using `%%html` or `%%javascript`, or render a single line as Markdown with `#@markdown`.

F) Create input forms by placing the comment flag

#@param {type:"DATA-TYPE"}

at the end of a variable assignment.
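As a minimal sketch of the comment flags above, the cell below combines `#@title` with two `#@param` fields. The title text, variable names, and default values are made up for illustration. Because these flags are ordinary Python comments, the same code also runs unchanged outside Colab, where the form is simply not drawn.

```python
#@title Greeting form (illustrative example)

# Each `#@param` comment asks Colab to draw a form field bound to the
# variable assigned on that line.
name = "Ada"  #@param {type:"string"}
count = 3  #@param {type:"integer"}

# The rest of the cell uses the variables like any other Python values.
greeting = " ".join([f"Hello, {name}!"] * count)
print(greeting)
```

Editing a form field rewrites the value in the code, so the cell must be run again for the new value to take effect.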

The Python Environment

Click on the following link for a quick overview of notable features.
These links are the official documentation and tutorial.
This website w3schools provides great introductory tutorials with examples.
This Python Wiki Beginners Guide provides a ton of helpful guides for programmers and nonprogrammers.

Now that we understand a bit more about Colab, we can address the following questions.

What is Python?

From the Docs

(emphasis my own)

Python is an interpreted, object-oriented, high-level programming language with dynamic semantics. Its high-level built in data structures, combined with dynamic typing and dynamic binding, make it very attractive for Rapid Application Development, as well as for use as a scripting or glue language to connect existing components together. Python's simple, easy to learn syntax emphasizes readability and therefore reduces the cost of program maintenance. Python supports modules and packages, which encourages program modularity and code reuse. The Python interpreter and the extensive standard library are available in source or binary form without charge for all major platforms, and can be freely distributed.

Often, programmers fall in love with Python because of the increased productivity it provides. Since there is no compilation step, the edit-test-debug cycle is incredibly fast. Debugging Python programs is easy: a bug or bad input will never cause a segmentation fault. Instead, when the interpreter discovers an error, it raises an exception. When the program doesn't catch the exception, the interpreter prints a stack trace. A source level debugger allows inspection of local and global variables, evaluation of arbitrary expressions, setting breakpoints, stepping through the code a line at a time, and so on. The debugger is written in Python itself, testifying to Python's introspective power. On the other hand, often the quickest way to debug a program is to add a few print statements to the source: the fast edit-test-debug cycle makes this simple approach very effective.

What makes Python high-level?

Python is high-level because it abstracts away the details of the machine: you do not write assembly or a series of ones and zeroes, and memory management is automatic.

What makes Python object-oriented?

Basically, everything in Python is an object! We will get back to this later. But for now, here's a peek.

 # import a module
 import json
 
 # create some data as text or a 'String':
 x =  '{ "name":"John", "age":30, "city":"New York"}'
 
 # use the JavaScript Object Notation (JSON) module to parse x:
 y = json.loads(x)
 
 # the result yields a Python dictionary. 
 # It was converted from a string of text (data) that was encoded using JSON notation:
 print(y["age"])
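To make the "everything is an object" claim a little more concrete, here is a small sketch using only built-in tools. The `greet` function and the sample values are made up for illustration.

```python
import json


def greet():
    return "hello"


# Numbers, strings, lists, functions, modules and classes are all objects.
things = {
    "an integer": 42,
    "a string": "text",
    "a list": [1, 2, 3],
    "a function": greet,
    "a module": json,
    "a class": int,
}

for label, thing in things.items():
    print(f"{label}: type={type(thing).__name__}, "
          f"is an object={isinstance(thing, object)}")

# Because values are objects, they carry their own methods.
capitalized = "john".capitalize()
bits_needed = (255).bit_length()
print(capitalized, bits_needed)
```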

More information on JSON:

What makes Python interpreted?

Machines run on machine code, so Python code needs some way to be translated into instructions the machine can execute.
When you execute a line of Python code, the interpreter translates it (in the standard CPython interpreter, by first compiling it to bytecode) and executes it on the fly.
All languages must ultimately be translated for the machine; because Python does this at run time rather than in a separate ahead-of-time compilation step, it is called an interpreted, as opposed to compiled, language.
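The translate-then-execute cycle can be observed from within Python using the built-in `compile` and `exec` functions and the standard `dis` module. This is a sketch that assumes the standard CPython interpreter. The bytecode listing printed by `dis` differs between Python versions, so no expected listing is shown.

```python
import dis

source = "total = 2 + 3"

# Step 1: translate the source text into a code object (bytecode).
code_object = compile(source, "<example>", "exec")

# Step 2: execute the code object; names it defines land in `namespace`.
namespace = {}
exec(code_object, namespace)
print(namespace["total"])

# Optional: show the bytecode instructions the interpreter runs.
dis.dis(code_object)
```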

'Installing python' is really just the process of installing an interpreter.

Colab comes with a built-in interpreter that runs every time a cell runs.
Use this guide to learn more about local installation.

Python files can be imported for use in other scripts or run directly using the `python` terminal command.

```
python ./path/to/file/nameOfFile.py
```

What is the difference between Python 2 and Python 3?

The difference should not matter!

It used to, but Python 2 is now deprecated; it reached end of life in January 2020. Everyone should be using Python 3.

If your computer came with Python built in, it may well have come with Python 2. Juggling two versions of Python can be a pain since they use different syntax.

With Colab, this is simply not a problem because each session is a brand-new virtualized environment.
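If you are ever unsure which version you are running, a quick check like the following can help. It also shows one well-known syntax difference: in Python 3 the `/` operator always performs true division, whereas Python 2 truncated the result for two integers. The variable names are illustrative.

```python
import sys

# The major version is 3 on any Python 3 interpreter.
major_version = sys.version_info.major
print("Running Python", sys.version.split()[0])

# Python 3: `/` is true division, `//` is floor division.
true_division = 7 / 2
floor_division = 7 // 2
print(true_division, floor_division)
```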

What are modules?

A module is a Python object with arbitrarily named attributes that you can bind and reference. Simply, a module is a file consisting of Python code. A module can define functions, classes and variables. A module can also include runnable code.... You can use any Python source file as a module by executing an import statement in some other Python source file. - TutorialsPoint

'Package' is a term often used to describe a suite of modules.
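A short sketch of what this means in practice, using the standard library's `math` module as the example: the imported module is itself an object, and the functions and variables it defines are reached as attributes.

```python
import math
import types
from math import sqrt  # import a single name from the module

# A module is an object like any other.
print(isinstance(math, types.ModuleType))

# Its functions and variables are attributes of that object.
print(math.pi)
print(sqrt(16))

# List a few of the public names the module defines.
public_names = [n for n in dir(math) if not n.startswith("_")]
print(public_names[:5])
```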

What are PIP and PyPI?

PIP is a de facto standard package-management system used to install and manage software packages written in Python. Many packages can be found in the default source for packages and their dependencies — Python Package Index (PyPI). Most distributions of Python come with PIP preinstalled. Python 2.7.9 and later (on the Python 2 series), and Python 3.4 and later include PIP (PIP3 for Python 3) by default. - Wikipedia

If you find Python code you like on GitHub, see if it can be found on PyPI.

If so, type

pip install package

into the terminal to install the module.

Once installed, you can now 'import' the package in your Python code.

For more information on PIP, check out this cool guide

To import a library that is not in Colaboratory by default, you can use `!pip install` or `!apt-get install`. - Snippets: Importing Libraries
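Before installing anything, it can be handy to check whether a module is already importable. The helper below is a hypothetical convenience function, not part of pip or Colab; it relies only on the standard `importlib` module. Note that the name used with `pip install` sometimes differs from the name used with `import`.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module with this name can be imported."""
    return importlib.util.find_spec(module_name) is not None


# `json` ships with Python; the second name is deliberately made up.
for candidate in ["json", "a_package_that_does_not_exist_12345"]:
    if is_installed(candidate):
        print(f"{candidate}: available")
    else:
        print(f"{candidate}: missing, try `pip install` first")
```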

Pandas

Colab comes with Pandas pre-installed, but elsewhere it can be installed using 'pip install pandas'

Congratulations! You've installed Pandas.

Pandas provides tools for data analysis. As an example, let's import some JSON data!

You can do awesome things with data when it is being interpreted as a 'dataframe'. Take a look!
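As a minimal sketch of both steps (loading JSON and then working with it as a DataFrame), the example below uses a small made-up JSON string rather than a real data source. It assumes pandas is installed, as it is on Colab.

```python
import json

import pandas as pd

# A small, made-up JSON document: a list of records.
raw = """
[
  {"name": "John", "age": 30, "city": "New York"},
  {"name": "Ana",  "age": 25, "city": "Baltimore"},
  {"name": "Li",   "age": 41, "city": "New York"}
]
"""

# Parse the JSON text into Python objects, then build a DataFrame
# with one row per record and one column per key.
records = json.loads(raw)
df = pd.DataFrame(records)
print(df)

# Filter rows by a condition and summarize a column.
new_yorkers = df[df["city"] == "New York"]
mean_age = df["age"].mean()
print(new_yorkers["name"].tolist())
print(mean_age)
```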

Pandas works with a bunch of great utilities like Dexplot and Geopandas for enhanced visualizations.

A more thorough introduction to pandas on Colab can be found here.

Learning Objectives:

Gain an introduction to the DataFrame and Series data structures of the pandas library
Access and manipulate data within a DataFrame and Series
Import CSV data into a pandas DataFrame
Reindex a DataFrame to shuffle data

Be sure to take a look at its online library, provided to help you along the way!

Outside Data

The simplest way to access your data is by mounting Google Drive to your virtual environment.

You can store a user's input as a value, like so:

A neat trick to get form values can be done like this:

Just be sure to re-run the cell block to update the variable values.

Putting it Together

dataguide is a package I am working on to help work with data. It provides tools and tutorials for data manipulation.

With this package, you can install ACS data with relative ease.

Software

Charles Karpati | Python and Colab