Update 2015-11-29: The described authentification method has been depredated. Thx to m@ for pointing this out.
In this note we will receive and send rows to a Google Spreadsheet.
Google Spreadsheets is a great tool to store all kinds of tables e.g. for expenses or contacts, in a human readable, editable and globally accessible form. With the Google Drive API it is possible to automate access to Google Spreadsheets and use these worksheets as little databases that can be modified from the command line.
The API is even more powerfull and supports applications to access data from third party users (e.g. for a image manipulation tool). This power brings a lot of complexity, in particular to the authentification process, that makes the documentation hard to understand. This note walks you through a simple data access example, while avoiding a lot of this complexity.
##Step 1: Install gdata Library
We will be using the gdata-python-client. You can either install it from source or using a package manager, e.g. using
$ pip install gdata
We will need to access your Google Drive data but do not require to access data from third parties. Therefore we don’t need OAuth2 but can use application specific passwords:
Generate a password at https://security.google.com/settings/security/apppasswords
Store it an a json file (do not check it into version control) e.g. using
$ cat <<EOF > GoogleAppPw.json
{
"email" : "<put your email here>",
"password" : "<put generated password here>"
}
EOF
We can read the password from a python script as follows:
If the last command did not raise an error, you have successfully connected to the Google API. Let’s print all available spreadsheets:
Expenses
Untitled spreadsheet 1
Untitled spreadsheet 2
Untitled spreadsheet 3
...
First, we read the name of the spreadsheet and the worksheet we want to access into variables. I have added them to the config file to leave the code more generic, feel free to use string literals instead.
Now we iterate through the document feed and select the spreadsheet entry with matching title:
Unfortunately, there is no easy way to obtain a spreadsheet object, from the entry which we can query for cells. Instead, we have to manually extract the key field from the id (this is no joke. cf. example) and pass it back to the client library:
Now we can iterate over the worksheets in a similar way:
Now we have found the keys of the worksheet and spreadsheet.
We retrieve the rows of a worksheet as iterator of dictionaries. For this to work, the spreadsheet has to comply to a format. The first row (header) defines the key of the dictionary the following rows provide the subsequent values.
Here is an example:
| date | comment | amount | currency |
| 2015-01-10 | Dinner | 29.30 | EUR |
| 2015-01-10 | Taxi | 12.00 | EUR |
The worksheet can be accessed as follows:
Now, we print out all rows in the sheet:
['date', 'comment', 'amount', 'currency']
---
['2015-01-10', 'Dinner', '29.30', 'EUR']
['2015-01-10', 'Taxi', '12.00', 'EUR']
This is surprisingly straight forward.
success
For future use we wrap the above functions in a few classes. In the future I might add a few more methods to it. The current version can be found on GitHub.