2014-09-17

"Hello World" for accessing Google Sheets with Python and OAuth

Remote Access to Google Spreadsheets using Python, GSpread and OAuth2


gspread is a high-level tool kit for accessing Google Sheets from Python.  It makes it really easy to manipulate spreadsheets if you have the username and password to get into them.

Unfortunately, its documentation for OAuth access does little more than tell you to read Google's manuals.  If you have small experience with OAuth, this can be a big problem.

This tutorial should get you going quickly, as long as you already have Python running in Ubuntu.  Everything is explained step by step.  You can get the code from gspread_HelloOAuthWorld on GitHub.

Note that that OAuth has several versions and numerous distinct sequences of operations for different purposes and circumstances. One of the trickiest parts of learning OAuth is that different authors use different terminology to refer to the same sequence or they simply fail to clarify which sequence or version they're talking about.

So to be clear . . .  this Hello World example shows you how to perform two different sequences :
  1. authenticate and authorize yourself to use your own GMail account as a mail transfer service for your Python programs. It uses the OAuth sequence described in Using OAuth 2.0 for Web Server Applications.
  2. get a third-party to authorize you to access their Google Sheets. It uses the OAuth sequence described in Using OAuth 2.0 for Devices.

authorize_SMTP.py

The included program authorize_SMTP.py works in accordance with this sequence:



It prepares and displays a URL that includes a request token, and then prompts you to enter an authorization code. When you open the URL in a browser, you will need to log in to Google (if you haven't done so earlier) and consent to the indicated permissions. You then copy the resulting code back into the command line of authorize_SMTP.py.  It will write to disk a file called "working_parameters.py", which, when imported into any other Python program, will provide access and refresh tokens for using GMail as an SMTP mail transfer service.

request_authorization.py

The included program request_authorization.py works in accordance with this sequence :



It connects to Google and obtains a URL and authorization code specific to Google Sheets, then prepares an email and sends it to the individual you specify.

The text of the email explains to them the purpose of the authorization code and provides a hyperlink to the authorization page.  They simply have to copy and paste the provided code into the field presented at that URL, and then consent to the indicated access permissions on Google Sheets.

Meanwhile, from the moment it sends the email, request_authorization.py sits in a slow loop waiting for 30 minutes, for your user to react. If they do indeed authorize you, it will prompt you to enter the URL of one of their Google Sheets workbooks.  It then writes two files to disk: a new credentials file called "creds_oa.py", and an executable Python script "gspread_HelloOAuthWorld.py". The latter uses the former and gspread to get and display the name of the first sheet of your user's workbook.

Here are the steps :
  1. Get the code
  2. Start up a Virtual Environment
  3. Resolve Dependencies
  4. Get Developer Credentials
  5. Authenticate and self-authorize GMail SMTP use.
  6. Get permission on other user's spreadsheet.
  7. Run gspread_HelloOAuthWorld.py.

Get the code.

Top
#
# Make a disposable directory
mkdir -p ~/disposable#
# Jump into it
pushd ~/disposable
#
# Get the code
wget https://github.com/martinhbramwell/gspread_HelloOAuthWorld/archive/master.zip
#
# unpack it
unzip master.zip
#
# give it its real name
mv gspread_HelloOAuthWorld-master gspread_HelloOAuthWorld
#
popd


Start up a Virtual Environment

Top
I don't want to be accused of wrecking anyone's system. Python is integral to the workings of Ubuntu and altering it willy-nilly is a bad idea. Virtual environments protect you from that.

First we check what the Python execution path looks like normally. It contains directories like /usr/local/... and /usr/lib/... etc. That's what we do not want to mess up.
python -c "import sys; print sys.path"
The included bash script prepare_virtualenv.sh automates the job of setting your machine up for virtual environment management :
pushd ~/disposable/gspread_HelloOAuthWorld
sudo ./prepare_virtualenv.sh
The scripts create new environment variables and aliases that will be available on next log in. To get them immediately, run this . . .
source ~/.bashrc
We create a shadow directory holding the secluded execution environment for our project.
mkvirtualenv hiOAWorld
Notice that your execution prompt is now prefixed with (hiOAWorld). Check if the execution path is now self-contained. You should see numerous references to a directory /home/yourself/.python_virtual_environments/hiOAWorld/
python -c "import sys; print sys.path"
Deactivate the created virtual project
deactivate
Check it again. The (hiOAWorld) prefix should be gone, and you should see the original Python system path.
python -c "import sys; print sys.path"
Reactivate the virtual project.
workon hiOAWorld
Check it again. The (hiOAWorld) prefix should be back again.
python -c "import sys; print sys.path"
Go back to where you were before
popd


Resolve Dependencies

Top
The application has a few external dependencies. Now is the time to go get them.
You don't have to be in the project directory, but do make sure you are safely "in" the virtual environment:
workon hiOAWorld
Now you can safely install the dependencies:
pip install progressbar2 gspread httplib2


Get Developer Credentials

Top
The included utility loadGoogleJSON.py reads Google's OAuth credentials out of a JSON file that you have to download from Google's Developer API console.

Follow these steps for Obtaining Authentication Credentials from Google's new style API Console.

You will need to save the file "client_secret_xxx...xxxcom.json" in the directory "~/disposable/gspread_HelloOAuthWorld".



Authenticate and self-authorize GMail SMTP use.

Top
I have several dummy GMail accounts for testing purposes.  I'll use one as the SMTP account holder and the other as the user whose Google Sheets we need to access.  For now we'll us the first one to get our email transfer capability working.

Run the program authorize_SMTP.py
./authorize_SMTP.py
The first time you run it, it will ask for all details. It remembers them so that you have less to do on subsequent operations.
No valid token pair found in working_parameters.py. Will run the wizard.
Enter the GMail address you want to authorize : <enter a GMail address here>

Provide the GMail address for which you want to be authenticated as the official user for SMTP mail transfer, and you will be prompted as follows :
To be able to request authorization from your users by email, you need to 
authorize this program to use Google's email resender in your name.
    Visit this url and follow the directions:

  https://accounts.google.com/o/oauth2/auth?client_id=18192nt.com&redirect_uri=urn%3Aietf%3Awg%3Aoauth%3A2.0%3Aoob&response_type=code&scope=https%3A%2F%2Fmail.google.com%2F

    * * * Enter verification code : <put the resulting code here>

Copy that url into a browser, follow the steps, then get the verification code and paste it into the field of the prompt "Enter verification code:"


"User login and consent" step . . .

"Authorization code" response step . . .
Shortly after pasting the code and hitting [Enter] the following should appear :
Success :
 - Access Token: = ya29.gQB-6n3zy3BzENT3Nd_3Gpk5PHtpfwZgmVlAofiGA
 - Refresh Token: 1/RevA7ZIJdKPaXVidCBRuyv93jv3FWRkSBs5l8qkv5wM
 - Access Token expires at : 2014-09-15 17:39

Appending latest tokens to the bottom of the file "working_parameters.py"... 
       ... done.

No test mail sent.

Also, a new file is written to disk, available to the next step. "working_parameters.py" contains :
# Appended automatically . . . 
google_project_client_smtp_access_token = 'ya29.gQB-6n3zy3BzENT3Nd_3Gpk5PHtpfwZgmVlAofiGA'
google_project_client_smtp_refresh_token = '1/RevA7ZIJdKPaXVidCBRuyv93jv3FWRkSBs5l8qkv5wM'
google_project_client_smtp_expiry = '2014-09-15 17:39'
google_project_client_email = 'doowa.diddee@gmail.com'
#

Get permission on other user's spreadsheet.

Top
Now that we can send email through Google SMTP, we can request access to Google products of other users.

Run the following command, replacing [some.other.user.name@gmail.com] with the GMail address of a real user, of course :
./request_authorization.py  -e some.other.user.name@gmail.com
Text like the following will be displayed :
Sending permission request . . . 
      . . permission request sent!

***  You have to verify that you DO allow this software to open your Google document space.
***  Please check your email at some.other.user.name@gmail.com.
***  The message, [doowa.diddee@gmail.com wants permission to access your Google Spreadheets with "gspread".] contains further instructions for you.

Trying to get tokens.
|>>>>            | Elapsed Time: 0:06:33 |             <<<<<|

Your user will see an email like this :
They will copy the indicated code and paste it into a screen like this :
Your user will approve by clicking the blue button . . .
. . . and see a confirmation screen like this :
If your user takes longer than 30 minutes you'll have to try again.

If your user does approve your request then the "./request_authorization.py" program will continue to the next step with a screen like this :

Identify the Google spreadsheet you want to use; use the full URL ("http://" etc, etc) 
Paste the full URL here : <full Google Sheets URL here>
Enter the URL of one your user's spreadsheets at the prompt, "Paste the full URL here : ". A quick test file will be generated :

A simple example file called gspread_HelloOAuthWorld.py was written to disk.
It lists the names of the sheets in the target spreadsheet.
Test it with:
       $  ./gspread_HelloOAuthWorld.py


Run ./gspread_HelloOAuthWorld.py.

Top
"gspread_HelloOAuthWorld.py" is an extremely simple sample that you can use as starting point for more complex projects. You run it like this :
./gspread_HelloOAuthWorld.py
It connects to your user's spreadsheet and returns the name of the contained sheets, like this :
Found sheets:
     - Sheet #1: Id = od6  Title = Sheet1

Conclusion

Top
As a freelancer, often collaborating with my clients on their spreadsheets, and interconnecting them with other data sources, I find my self needing the above functionality quite frequently.

Please let me know if you find them useful too.