2014-09-17

"Hello World" for accessing Google Sheets with Python and OAuth

Remote Access to Google Spreadsheets using Python, GSpread and OAuth2


gspread is a high-level tool kit for accessing Google Sheets from Python.  It makes it really easy to manipulate spreadsheets if you have the username and password to get into them.

Unfortunately, its documentation for OAuth access does little more than tell you to read Google's manuals.  If you have small experience with OAuth, this can be a big problem.

This tutorial should get you going quickly, as long as you already have Python running in Ubuntu.  Everything is explained step by step.  You can get the code from gspread_HelloOAuthWorld on GitHub.

Note that that OAuth has several versions and numerous distinct sequences of operations for different purposes and circumstances. One of the trickiest parts of learning OAuth is that different authors use different terminology to refer to the same sequence or they simply fail to clarify which sequence or version they're talking about.

So to be clear . . .  this Hello World example shows you how to perform two different sequences :
  1. authenticate and authorize yourself to use your own GMail account as a mail transfer service for your Python programs. It uses the OAuth sequence described in Using OAuth 2.0 for Web Server Applications.
  2. get a third-party to authorize you to access their Google Sheets. It uses the OAuth sequence described in Using OAuth 2.0 for Devices.

authorize_SMTP.py

The included program authorize_SMTP.py works in accordance with this sequence:



It prepares and displays a URL that includes a request token, and then prompts you to enter an authorization code. When you open the URL in a browser, you will need to log in to Google (if you haven't done so earlier) and consent to the indicated permissions. You then copy the resulting code back into the command line of authorize_SMTP.py.  It will write to disk a file called "working_parameters.py", which, when imported into any other Python program, will provide access and refresh tokens for using GMail as an SMTP mail transfer service.

request_authorization.py

The included program request_authorization.py works in accordance with this sequence :



It connects to Google and obtains a URL and authorization code specific to Google Sheets, then prepares an email and sends it to the individual you specify.

The text of the email explains to them the purpose of the authorization code and provides a hyperlink to the authorization page.  They simply have to copy and paste the provided code into the field presented at that URL, and then consent to the indicated access permissions on Google Sheets.

Meanwhile, from the moment it sends the email, request_authorization.py sits in a slow loop waiting for 30 minutes, for your user to react. If they do indeed authorize you, it will prompt you to enter the URL of one of their Google Sheets workbooks.  It then writes two files to disk: a new credentials file called "creds_oa.py", and an executable Python script "gspread_HelloOAuthWorld.py". The latter uses the former and gspread to get and display the name of the first sheet of your user's workbook.

Here are the steps :
  1. Get the code
  2. Start up a Virtual Environment
  3. Resolve Dependencies
  4. Get Developer Credentials
  5. Authenticate and self-authorize GMail SMTP use.
  6. Get permission on other user's spreadsheet.
  7. Run gspread_HelloOAuthWorld.py.

Get the code.

Top
#
# Make a disposable directory
mkdir -p ~/disposable#
# Jump into it
pushd ~/disposable
#
# Get the code
wget https://github.com/martinhbramwell/gspread_HelloOAuthWorld/archive/master.zip
#
# unpack it
unzip master.zip
#
# give it its real name
mv gspread_HelloOAuthWorld-master gspread_HelloOAuthWorld
#
popd


Start up a Virtual Environment

Top
I don't want to be accused of wrecking anyone's system. Python is integral to the workings of Ubuntu and altering it willy-nilly is a bad idea. Virtual environments protect you from that.

First we check what the Python execution path looks like normally. It contains directories like /usr/local/... and /usr/lib/... etc. That's what we do not want to mess up.
python -c "import sys; print sys.path"
The included bash script prepare_virtualenv.sh automates the job of setting your machine up for virtual environment management :
pushd ~/disposable/gspread_HelloOAuthWorld
sudo ./prepare_virtualenv.sh
The scripts create new environment variables and aliases that will be available on next log in. To get them immediately, run this . . .
source ~/.bashrc
We create a shadow directory holding the secluded execution environment for our project.
mkvirtualenv hiOAWorld
Notice that your execution prompt is now prefixed with (hiOAWorld). Check if the execution path is now self-contained. You should see numerous references to a directory /home/yourself/.python_virtual_environments/hiOAWorld/
python -c "import sys; print sys.path"
Deactivate the created virtual project
deactivate
Check it again. The (hiOAWorld) prefix should be gone, and you should see the original Python system path.
python -c "import sys; print sys.path"
Reactivate the virtual project.
workon hiOAWorld
Check it again. The (hiOAWorld) prefix should be back again.
python -c "import sys; print sys.path"
Go back to where you were before
popd


Resolve Dependencies

Top
The application has a few external dependencies. Now is the time to go get them.
You don't have to be in the project directory, but do make sure you are safely "in" the virtual environment:
workon hiOAWorld
Now you can safely install the dependencies:
pip install progressbar2 gspread httplib2


Get Developer Credentials

Top
The included utility loadGoogleJSON.py reads Google's OAuth credentials out of a JSON file that you have to download from Google's Developer API console.

Follow these steps for Obtaining Authentication Credentials from Google's new style API Console.

You will need to save the file "client_secret_xxx...xxxcom.json" in the directory "~/disposable/gspread_HelloOAuthWorld".



Authenticate and self-authorize GMail SMTP use.

Top
I have several dummy GMail accounts for testing purposes.  I'll use one as the SMTP account holder and the other as the user whose Google Sheets we need to access.  For now we'll us the first one to get our email transfer capability working.

Run the program authorize_SMTP.py
./authorize_SMTP.py
The first time you run it, it will ask for all details. It remembers them so that you have less to do on subsequent operations.
No valid token pair found in working_parameters.py. Will run the wizard.
Enter the GMail address you want to authorize : <enter a GMail address here>

Provide the GMail address for which you want to be authenticated as the official user for SMTP mail transfer, and you will be prompted as follows :
To be able to request authorization from your users by email, you need to 
authorize this program to use Google's email resender in your name.
    Visit this url and follow the directions:

  https://accounts.google.com/o/oauth2/auth?client_id=18192nt.com&redirect_uri=urn%3Aietf%3Awg%3Aoauth%3A2.0%3Aoob&response_type=code&scope=https%3A%2F%2Fmail.google.com%2F

    * * * Enter verification code : <put the resulting code here>

Copy that url into a browser, follow the steps, then get the verification code and paste it into the field of the prompt "Enter verification code:"


"User login and consent" step . . .

"Authorization code" response step . . .
Shortly after pasting the code and hitting [Enter] the following should appear :
Success :
 - Access Token: = ya29.gQB-6n3zy3BzENT3Nd_3Gpk5PHtpfwZgmVlAofiGA
 - Refresh Token: 1/RevA7ZIJdKPaXVidCBRuyv93jv3FWRkSBs5l8qkv5wM
 - Access Token expires at : 2014-09-15 17:39

Appending latest tokens to the bottom of the file "working_parameters.py"... 
       ... done.

No test mail sent.

Also, a new file is written to disk, available to the next step. "working_parameters.py" contains :
# Appended automatically . . . 
google_project_client_smtp_access_token = 'ya29.gQB-6n3zy3BzENT3Nd_3Gpk5PHtpfwZgmVlAofiGA'
google_project_client_smtp_refresh_token = '1/RevA7ZIJdKPaXVidCBRuyv93jv3FWRkSBs5l8qkv5wM'
google_project_client_smtp_expiry = '2014-09-15 17:39'
google_project_client_email = 'doowa.diddee@gmail.com'
#

Get permission on other user's spreadsheet.

Top
Now that we can send email through Google SMTP, we can request access to Google products of other users.

Run the following command, replacing [some.other.user.name@gmail.com] with the GMail address of a real user, of course :
./request_authorization.py  -e some.other.user.name@gmail.com
Text like the following will be displayed :
Sending permission request . . . 
      . . permission request sent!

***  You have to verify that you DO allow this software to open your Google document space.
***  Please check your email at some.other.user.name@gmail.com.
***  The message, [doowa.diddee@gmail.com wants permission to access your Google Spreadheets with "gspread".] contains further instructions for you.

Trying to get tokens.
|>>>>            | Elapsed Time: 0:06:33 |             <<<<<|

Your user will see an email like this :
They will copy the indicated code and paste it into a screen like this :
Your user will approve by clicking the blue button . . .
. . . and see a confirmation screen like this :
If your user takes longer than 30 minutes you'll have to try again.

If your user does approve your request then the "./request_authorization.py" program will continue to the next step with a screen like this :

Identify the Google spreadsheet you want to use; use the full URL ("http://" etc, etc) 
Paste the full URL here : <full Google Sheets URL here>
Enter the URL of one your user's spreadsheets at the prompt, "Paste the full URL here : ". A quick test file will be generated :

A simple example file called gspread_HelloOAuthWorld.py was written to disk.
It lists the names of the sheets in the target spreadsheet.
Test it with:
       $  ./gspread_HelloOAuthWorld.py


Run ./gspread_HelloOAuthWorld.py.

Top
"gspread_HelloOAuthWorld.py" is an extremely simple sample that you can use as starting point for more complex projects. You run it like this :
./gspread_HelloOAuthWorld.py
It connects to your user's spreadsheet and returns the name of the contained sheets, like this :
Found sheets:
     - Sheet #1: Id = od6  Title = Sheet1

Conclusion

Top
As a freelancer, often collaborating with my clients on their spreadsheets, and interconnecting them with other data sources, I find my self needing the above functionality quite frequently.

Please let me know if you find them useful too.

2014-02-18

Step #09 : Demonstrate Google Spreadsheet to OpenERP data pump

Description

Welcome to the final step in this series.

If you have followed through from the beginning you will have a cloud-based platform with three valuable tools:
  • a SSH terminal server accessible by HTTPS
  • a Salt Stack master
  • the Google spreadsheet to OpenERP data pump.
Our last task is to understand how to use the data pump to control OpenERP instances.

There should be no need for further documentation of it here; the web site covers the whole thing in detail.  


Here's the video:


Tasks performed

  1. Update the "creds" page of the controller spreadsheet with IP address or domain name of the target OpenERP installation
  2. On the same page, also provide a name for the database to use, and the super user password of the OpenERP installation
  3. To test connectivity, run the pump, specifying the controller spreadsheet's ID key with the -k switch.  This first run should show that a connection was achieved, but no tasks run.
  4. There are eight possible subtask executions per task row.  A pair of bit maps control which of the eight are executed. If a row has 5 action steps and only the 4th should be executed. Using binary arithmetic the settings should be: complete 23, incomplete 8, length 31. Any error in that will be seen in the "Chk" column (P).
  5. To create the new OpenERP data base, we want to run the second action step in the first row.  We'll also run the chkTask action.  It simply reports that the indicated Model Class is available and has a method called chkTask.  There are two action steps in total, so the length is 3. 
  6. Run the pump, with the same command as before.
  7. Examine OpenERP to see if the database "demoVideo" was actually created.
  8. Log in to the "demoVideo" database in OpenERP.
  9. I never found a way to set the "Technical Features" check-box for the administrator via XMLRPC.  So this is an unfortunate, one-time, manual step.  A workaround is to restore a database backup, with similar default settings already done.
  10. We repeat the preceding steps for adding some national regions and languages.
  11. The remainder of the video attempts to point out how all the various parts of the pump work together.
An overview of all the steps is available here:

2014-02-17

Step #08 : Install and check OpenERP data pump

Description

In the next step in this series we will Demonstrate Google Spreadsheet to OpenERP Data Pump.

First, however, we have to install and configure it.  In this video you will see how.

For deeper understanding please read the documentation for Google Spreadsheets to OpenERP Data Pump.

Success with this step depends on the previous step Install And check GSpread.  In fact, the Salt state stack for the data pump needs to be able to execute GSpread's self-test.  Most of this video is concerned with preparing for that.

Here's the video:



Tasks performed

  1. Copy the access and refresh tokens that were acquired at the end of the previous step, and apply them to the pillar init.sls for gdata_oerp_pump
  2. Open a new or unimportant spreadsheet and copy'n paste its title, URL  and key into the pillar init.sls for gdata_oerp_pump. These control GSpread's self test
  3. Run salt 'name of the target minion' state.sls gdata_oerp_pump.
  4. Verify that the install proceeded correctly and that the embedded GSpread self-test encountered no errors.

An overview of all the steps is available here:

2014-02-16

Step #07 : Install And check GSpread

Description

Our purpose with this video series is to demonstrate how to set up a cloud-based, permanent, yet part-time, cloud control workstation.  One we can depend on to be there when we activate it, but switched off 95% of the time: casual DevOps if you will.

Our goal with the particular use-case example is to install and use the Google spreadsheets to OpenERP data pump.  That tool needs GSpread to connect to Google.

In this video we show how to use the Salt state stack we created for installing GSpread and the GData_OpenERP_Data_Pump.

We want our connection to Google to use OAuth.  That way, our username and password do not get left lying around by accident. 

Most of the following video is about preparing our Salt "pillar" file with the first phase of OAuth credentials before we run it.  Under the covers, our Salt state patches a JSON file with those data items and deposits it in GSpread's tests directory.  For the second phase of OAuth -- obtaining access and refresh tokens -- we use the tool get_google_oauth2_creds.py from the same directory.

The tool does several useful things:

  1. Guides you through authorizing GSpread to use your GMail for sending email.
  2. Sends email to a 3rd party requesting permission for GSpread to access their spreadsheets.
  3. Generates files in several formats containing the resulting access and refresh tokens.  One of those is qiktest.py; a quick check that the connection works.
Once we have these tokens, checked with qiktest.py, we can move on to the next step


Here's the video:


Tasks performed

  1. With GateOne as our SSH entry point, we no longer need SSH terminal windows in our work station
  2. Create two SSH windows in GateOne logged into the Salt master machine and become to root in one of them with sudo -sHu root
  3. As root, re-point Salt's symbolic link /srv to point to SaltStackToolSet/gdata_oerp_pump/srv
  4. Open the APIs and Auth >> Credentials sub-menu of our project in Google's API Console
  5. Edit the pillar file init.sls, providing values for redirect_uri, client_secret and client_id obtained from the API console
  6. Run Salt state.sls against the master's own minion to install just GSpread.  (We'll do the same for gdata_oerp_pump in the next video)
  7. In the tests subdirectory of the temporary gspread build directory, run get_google_oauth2_creds.py
  8. Pick the 2nd method and provide the project Google ID and the ID of the 3rd party whose spreadsheets are to be worked on
  9. Authorize the project's GMail account to be used for SMTP by obtaining the verification code and pasting it back into get_google_oauth2_creds.py
  10. Run a quick test to be sure we are connecting correctly.
An overview of all the steps is available here:

2014-02-14

Step #06 : Install and use GateOne; the HTTP terminal server

Description

After this step in the series you'll be able to continue from any machine, anywhere, even a Chromebook!

We finished the previous step, Install OpenERP using a SaltStack state file, with OpenERP up and running in a Salt minion.  Inappropriately for a production-grade service, we installed it in a disposable pay-by-the-hour virtual machine.

On the other hand, for occasionally required services such as a Salt master, pay-by-the-hour VMs like iwstack.com's are a boon for several reasons: cost and security.

I have a very lightweight VM (384MB) with a minuscule virtual disk for my administrative work.  It's dead for most of its life.  When I need to work on a client's machines -- off site back up, for example -- I attach their admin virtual disk to my little admin VM, boot it up, do my housekeeping work, like saving a backup copy, and kill it again.

I used to use SSH for that work, but really regretted it one day when a client had an emergency and I was hours away from access to any copy of my SSH private key.

That's when I found GateOne.
GateOne is a terminal server with a HTTP client.

Now, my SysAdmin work day involves :
  1. Logging into iwStack.com 
  2. Booting up my admin VM
  3. Logging into the GateOne terminal of the admin VM using my Google ID
  4. Finding all my old terminal sessions up, running and logged back in
  5. Dusting the shelves and mopping the floor
  6. Killing my admin VM
  7. Logging out of iwStack.com 

Here's the video:


Tasks performed

  1. Make the Master also be it's own minion, check that both daemons are running and have the Master accept the Minion's keys
  2. Finalize the Salt state stack for GateOne, link it into Salt's path and then run it
  3. Start up GateOne
  4. Access GateOne by IP address in the browser.
  5. Fix the missing "origin" in GateOne's configuration
  6. Restart GateOne, and gain clear anonymous access to it
  7. Ensure we are already logged in to Google
  8. Switch the "auth" configuration parameter from "none" to "google"
  9. Restart GateOne, and get automatically redirected to Google's confirmation page
  10. Confirm our trust in the GateOne page
  11. Gain clear authenticated access to GateOne
  12. Use GateOne's identity manager to create an SSH key pair.
  13. Ensure SSH is intalled and that a .ssh directory exists with correct permissions
  14. Add the public key to tour authorized keys file
  15. Open an SSH session with the server, authenticate with SSH keys instead of UID/PWD

An overview of all the steps is available here:

2014-02-13

Step #05 : Install OpenERP using a SaltStack state file

Description

We finished the previous step, Install SaltStack master and minion, with a Salt minion accepted for communication with a Salt master.

Now, at last, we can install OpenERP using the "stack" for that purpose published here : FleetingCloud's SaltStack Tool Set on github.com.


Why SaltStack?

Salt requires a bit of conceptual understanding. So here goes...

The point of Salt is to offer a standard declarative language for preparing machines to serve their intended purpose.  As much as possible it hides differences between operating systems.

In this code fragment we see how to customize the PostgreSQL configuration file:
postgresql.conf:
    file:
        - blockreplace
        - name: /etc/postgresql/9.1/main/postgresql.conf
        - marker_start: "# START SaltStack managed zone -DO-NOT-EDIT- -----"
        - marker_end: "# END SaltStack managed zone -----------------------"
        - prepend_if_not_found: True
        - backup: '.bak'
        - show_changes: True

postgresql.conf-accumulated1:
    file.accumulated:
        - filename: /etc/postgresql/9.1/main/postgresql.conf
        - name: postgresql-accum
        - text: "listen_addresses='{{ grains['ip_interfaces']['eth0'][0]}}'"
        - require_in:
            - file: postgresql.conf


The first of the two sections sets up a "managed zone" at the beginning of the file /etc/postgresql/9.1/main/postgresql.conf. The require_in phrase of the second section says that the postgresql.conf state should include the "listen_addresses" configuration item. In Salt, "grains" are named attributes of the minion. So, for example, the grain "ip_interfaces" is a list of interfaces on the minion. Listed item "eth0" has an array of attributes of which the first ("0th") is the IP address that PostgreSQL needs to be told about. 

All this may seem like a very verbose and roundabout way of patching a file, but it is advantageous because of platform independence .  Also, once you are familiar with the syntax it becomes much more readable that reams of bash script.

Finally, the main advantage is that it can all be kept on a single master server, and pushed out to hundreds or thousands of minion machines according to a wide variety of targeting and filtering rules.

The /srv Symbolic Link

While Salt's minion targeting is very powerful, it is very advanced usage.  For this video series I have placed each installation state stack in a separate named directory of SaltStackToolSet.  Each one has a single immediate subdirectory named "srv".  For each installation run of salt "*" state.highstate I simply set a symbolic link from Salt's default state stack directory named "/srv".  Not recommended for production, but clear and simple for an introductory video.

Here's the video:


Note correction:  Please refer to Update on recent changes to Video #5 for errata updates.

Tasks performed

  1. Go to GitHub to get the URL of zip'd FleetingClouds / SaltStackToolSet.
  2. Download and unzip the tools.
  3. Rename the directory to SaltStackToolSet
  4. Copy the example pillar file and edit the new one with the correct configuration parameters for OpenERP.
  5. Set up a symbolic link from Salt's default state stack directory ("/srv") to the "srv" directory of our OpenERP tool.
  6. Run salt 
  7. Restart the minion
  8. Test that OpenERP is working.
An overview of all the steps is available here:

2014-02-12

Step #04 : Install SaltStack master and minion

Description

In the first three steps we set ourselves up with a working environment "out in the cloud", thanks to iwStack's implmentation of Apache CloudStack.  Although, as underlined in Step #03 : Create template from VM and 2nd VM from template any two Ubuntu (and probably any Linux) machines would do just as well.

I have experimented with Chef, and a bit with Ansible.  I have used RunDeck a lot.  I've found that SaltStack is closest to what I want -- declarative provisioning with open source modularity in Python.

In this video, I take you through the steps to get a SaltStack master and minion installed and interacting.

SaltStack Bootstrap is Unix shell script that gives you a complete installation with a single command.  With switch "-P" it installs a minion; with the switches "-M -N", it installs a master.  We will do both, and then get them talking.

Here's the video:



Tasks performed


  1. Connect to the minion by SSH, change its "host name", tell it the name and IP address of the Salt master and then reboot it.
  2. Go to GitHub to pick up the latest SaltStack Bootstrap. Note that the usage instructions changed shortly after I made the video.  You should find, on that page, the bootstrap install that best suits your needs. I use this now.
  3. As root, paste the indicated text into a SSH terminal and let it run.
  4. Reboot to ensure the minion starts up as a working SaltStack minion.
  5. Check the log files to ensure it is trying to contact the master.
  6. Start up the master, and repeat the above.
  7. When the master is working, ensure that the minion has presented keys and accept them.


An overview of all the steps is available here: