Introduction
Google App Engine (GAE) is a useful platform on which to develop
Python-based web applications. But a GAE application runs in a sandbox
that prevents it from opening a socket, which makes the standard Python
xmlrpclib module inoperable.
Fortunately, there’s a simple solution to this problem.
This article is broken into several parts: First, I discuss the code that’s
necessary to get xmlrpclib to work within a GAE application. Then, I show
how to enhance picoblog, the sample GAE blogging engine I developed for
my Writing Blogging Software for Google App Engine article, so that it
can send a “ping” to Technorati when a new article is published.
Getting XML-RPC to work with GAE
A Quick Overview of xmlrpclib
It’s entirely possible to do XML-RPC without the benefit of the standard
Python xmlrpclib module. But xmlrpclib makes things so much simpler
that it’d be nice to use it. Doing the job manually means building an XML
message, sending it to the remote HTTP server, reading the result XML, and
parsing that XML. xmlrpclib already does all that. But xmlrpclib
attempts to open a socket to connect to the remote HTTP server, and opening
a socket is strictly forbidden by the GAE sandbox.
Ideally, we want to use xmlrpclib, but have it connect to the remote HTTP
server using the fetch() function provided by the
google.appengine.api.urlfetch module. We could create our own hacked
version of xmlrpclib to do just that, but, luckily, the authors of
xmlrpclib thought ahead and made the library easy to extend.
Sample XML-RPC call
Typically, making an XML-RPC call through xmlrpclib requires code
like this:
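A minimal sketch of such a call follows. The ping URL, the blog name, the blog URL, and the function name `ping_directly` are assumptions chosen for illustration; the `weblogUpdates.ping` method name follows the common weblog ping convention.

```python
try:
    import xmlrpclib                      # Python 2 name, as used in this article
except ImportError:
    import xmlrpc.client as xmlrpclib     # the same module under its Python 3 name

# Assumed values, for illustration only.
PING_URL = 'http://rpc.technorati.com/rpc/ping'
BLOG_NAME = 'My Blog'
BLOG_URL = 'http://blog.example.com/'


def ping_directly():
    # Creating the proxy does not open a connection yet.
    server = xmlrpclib.ServerProxy(PING_URL)
    # The remote call looks like a local method call. This is the line that
    # opens a socket, so it is the line that fails inside the GAE sandbox.
    return server.weblogUpdates.ping(BLOG_NAME, BLOG_URL)
```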
There are a couple of things going on here. The code first sets up a
ServerProxy object that allows us to interact with the remote RPC server.
The actual remote call then looks just like a local method call. The
xmlrpclib module translates that call into a chunk of XML, which it then
sends to the remote web server.
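To see roughly what that XML looks like without touching the network, the module’s own dumps() function can be called directly. This is only a sketch; the blog name and URL are placeholder assumptions.

```python
try:
    import xmlrpclib                      # Python 2 name, as used in this article
except ImportError:
    import xmlrpc.client as xmlrpclib     # the same module under its Python 3 name

# Placeholder arguments (assumptions, not real blog details).
BLOG_NAME = 'My Blog'
BLOG_URL = 'http://blog.example.com/'

# dumps() marshals a tuple of parameters into an XML-RPC <methodCall> document.
payload = xmlrpclib.dumps((BLOG_NAME, BLOG_URL),
                          methodname='weblogUpdates.ping')
print(payload)
```

The printed document contains a methodName element naming the remote method and one param element per argument.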
It then waits for the XML-RPC response and decodes it into a dictionary of name-value pairs. Just for completeness, here’s the successful result from a Technorati ping:
And here’s the failure result:
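The decoding step can be sketched with loads(), the counterpart of dumps(). The response documents below are hand-written assumptions that follow the common weblog ping convention of a struct with `flerror` and `message` members; they are not captured Technorati output, and `decode_response` is a hypothetical helper.

```python
try:
    import xmlrpclib                      # Python 2 name, as used in this article
except ImportError:
    import xmlrpc.client as xmlrpclib     # the same module under its Python 3 name

# Assumed response shape: a single struct with "flerror" and "message".
RESPONSE_TEMPLATE = """<?xml version="1.0"?>
<methodResponse><params><param><value><struct>
<member><name>flerror</name><value><boolean>%d</boolean></value></member>
<member><name>message</name><value><string>%s</string></value></member>
</struct></value></param></params></methodResponse>"""


def decode_response(xml_text):
    # loads() returns (params, method_name); a response carries one parameter.
    params, _method_name = xmlrpclib.loads(xml_text)
    return params[0]


ok = decode_response(RESPONSE_TEMPLATE % (0, 'Thanks for the ping.'))
failed = decode_response(RESPONSE_TEMPLATE % (1, 'Ping rejected.'))
```

Each result is an ordinary dictionary, so the caller can simply test `result['flerror']`.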
xmlrpclib Transport Classes
As written, the code above won’t work, because xmlrpclib
attempts to create a socket to connect to the web server, and GAE’s
sandbox forbids creating sockets.
However, under the covers, xmlrpclib uses special transport
objects to make the connections to the remote HTTP server. The
standard transport object is an instance of the
xmlrpclib.Transport class; you can examine that class by looking
at the xmlrpclib.py source code in your Python distribution. Here’s a
portion of that class; the method we care about is request():
The class is considerably larger than that, but request() is the
only method that’s required by the interface.
As it happens, the ServerProxy class lets us pass in our own
transport object; if we don’t supply one, it creates an instance of its
own Transport class (for non-SSL connections). This is the
key to our GAE solution.
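A small sketch shows how little ServerProxy asks of a transport. The `CannedTransport` class and the example.com URLs are hypothetical; the class never touches the network and simply returns a fixed reply.

```python
try:
    import xmlrpclib                      # Python 2 name, as used in this article
except ImportError:
    import xmlrpc.client as xmlrpclib     # the same module under its Python 3 name


class CannedTransport(object):
    """Hypothetical transport that records calls and returns a fixed reply."""

    def __init__(self):
        self.calls = []

    def request(self, host, handler, request_body, verbose=0):
        # ServerProxy hands us the host, the URL path, and the XML payload.
        self.calls.append((host, handler))
        # A transport returns a tuple of already-decoded response values.
        return ({'flerror': False, 'message': 'canned reply'},)


transport = CannedTransport()
server = xmlrpclib.ServerProxy('http://rpc.example.com/rpc/ping',
                               transport=transport)
result = server.weblogUpdates.ping('My Blog', 'http://blog.example.com/')
```

Because ServerProxy only ever calls request(), any object with that one method can stand in for the socket-based default.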
The GAEXMLRPCTransport class
We can create our own transport class that uses the
google.appengine.api.urlfetch module’s fetch()
method instead of standard socket access. That class turns out to
be pretty simple:
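What follows is a sketch of such a class, not the original listing. The `fetch` and `post_method` constructor arguments are assumptions added so that the class can be exercised outside App Engine; when they are omitted, it falls back to urlfetch.fetch() and urlfetch.POST.

```python
import logging

try:
    import xmlrpclib                      # Python 2 name, as used in this article
except ImportError:
    import xmlrpc.client as xmlrpclib     # the same module under its Python 3 name


class GAEXMLRPCTransport(object):
    """Sketch of an xmlrpclib transport built on urlfetch.fetch()."""

    def __init__(self, use_datetime=0, fetch=None, post_method=None):
        self._use_datetime = use_datetime
        if fetch is None:
            # Only importable on App Engine or its development server.
            from google.appengine.api import urlfetch
            fetch, post_method = urlfetch.fetch, urlfetch.POST
        self._fetch = fetch
        self._post = post_method

    def request(self, host, handler, request_body, verbose=0):
        url = 'http://%s%s' % (host, handler)
        try:
            response = self._fetch(url,
                                   payload=request_body,
                                   method=self._post,
                                   headers={'Content-Type': 'text/xml'})
        except Exception:
            # Report any fetch failure as an xmlrpclib error.
            msg = 'Failed to fetch %s' % url
            logging.error(msg)
            raise xmlrpclib.ProtocolError(host + handler, 500, msg, {})

        if response.status_code != 200:
            logging.error('%s returned status code %s',
                          url, response.status_code)
            raise xmlrpclib.ProtocolError(host + handler,
                                          response.status_code,
                                          '',
                                          response.headers)

        # The whole body is already in memory, so parse it in one go.
        parser, unmarshaller = xmlrpclib.getparser(
            use_datetime=self._use_datetime)
        parser.feed(response.content)
        parser.close()
        return unmarshaller.close()
```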
There are several things to note about the request() method.
- It uses the fetch() method from the GAE API.
- It scrupulously raises xmlrpclib exceptions on error conditions.
- It uses the xmlrpclib function getparser() to parse the result. Unlike the response parsing logic in the xmlrpclib.Transport class, ours is much simpler, since it has the entire response in hand and doesn’t have to read it a bufferful at a time.
Using the GAEXMLRPCTransport() class, we can now make our XML-RPC
client code work within GAE:
Changes to picoblog
Finally, as a proof of concept, let’s change the picoblog
software (see Related Brizzled Articles, below) to ping
Technorati whenever an article is published for the first time.
New xmlrpc.py module
First, put the GAEXMLRPCTransport class in its own xmlrpc.py
file, and put that file at the root of the picoblog source tree.
Changes to defs.py
Next, we add a few things to the defs.py module:
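A sketch of what those additions might look like is shown here. The blog URL is a placeholder, and the `running_on_gae` helper is a hypothetical name; the check assumes that production App Engine sets the SERVER_SOFTWARE environment variable to a value starting with "Google App Engine", while the development server reports "Development".

```python
import os

# Placeholder value; substitute the real public URL of the blog.
CANONICAL_BLOG_URL = 'http://blog.example.com/'


def running_on_gae(environ=None):
    """Return True in production App Engine, False on the dev server."""
    if environ is None:
        environ = os.environ
    # Assumption: production reports "Google App Engine/x.y.z" here,
    # and the local development server reports "Development/x.y".
    return environ.get('SERVER_SOFTWARE', '').startswith('Google App Engine')


ON_GAE = running_on_gae()
```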
The CANONICAL_BLOG_URL constant defines the URL of our blog; we
have to include that information in the Technorati ping. (We
could figure that out from the request that posts the article to
be saved, but using a constant is simpler for now.) The second
block of code sets ON_GAE to True if we’re running on App
Engine, and False if we’re running within the local development
server. During tests on the development server, we’ll ping a fake
URL; see below.
Changes to admin.py
SaveArticleHandler
The bulk of the changes are in the admin.py module. First, we have to
modify the SaveArticleHandler class to detect when an article is
published and notify Technorati when that happens. (GAE invokes an instance
of the SaveArticleHandler class to process the “save article” action.)
We’ll use a simple definition of “published”: an article is published when
its “draft” flag is cleared.
Here’s the new version of SaveArticleHandler:
Here are the relevant changes:
Determining that the article was just published is trivial. If it has
just been published, we call a (new) alert_the_media() function.
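The check itself can be sketched as a small pure function. The names `just_published` and `save_article` are hypothetical, and treating a brand-new article that is saved as a non-draft as “just published” is an assumption that goes slightly beyond the definition given above.

```python
def just_published(existing_draft_flag, new_draft_flag):
    """Return True if this save clears the draft flag.

    existing_draft_flag is None when the article did not exist before.
    """
    if existing_draft_flag is None:
        # Assumption: a new article saved as a non-draft counts as published.
        return not new_draft_flag
    return bool(existing_draft_flag and not new_draft_flag)


def save_article(existing_draft_flag, new_draft_flag, notify):
    # ... the article would be stored here ...
    if just_published(existing_draft_flag, new_draft_flag):
        notify()   # for example, alert_the_media
```

An article that was already published and is merely edited does not trigger another notification.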
The alert_the_media() function
This function sends the appropriate alerts to whichever external web sites we think should hear about new articles. Currently, that’s only Technorati, but we might want to add more later, so it doesn’t hurt to put this logic in a separate function.
The alert_the_media() function is simple enough:
The ping_technorati() function
Finally, we get to the function that uses our XML-RPC coolness. It’s not a whole lot different from the sample XML-RPC call at the top of the article:
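The following is a sketch of such a function rather than the original listing. Both URL values are assumptions (the fake one is purely hypothetical), and the `on_gae`, `transport`, `blog_name`, and `blog_url` parameters are added here so the sketch stays self-contained; in picoblog those values would come from defs.py and xmlrpc.py.

```python
import logging

try:
    import xmlrpclib                      # Python 2 name, as used in this article
except ImportError:
    import xmlrpc.client as xmlrpclib     # the same module under its Python 3 name

# Assumed values for the two constants.
TECHNORATI_PING_RPC_URL = 'http://rpc.technorati.com/rpc/ping'
FAKE_TECHNORATI_PING_RPC_URL = 'http://localhost/technorati-ping.xml'


def ping_technorati(on_gae, transport, blog_name, blog_url):
    if on_gae:
        url = TECHNORATI_PING_RPC_URL
    else:
        url = FAKE_TECHNORATI_PING_RPC_URL

    logging.debug('Pinging Technorati at: %s', url)
    try:
        server = xmlrpclib.ServerProxy(url, transport=transport)
        result = server.weblogUpdates.ping(blog_name, blog_url)
    except xmlrpclib.Error as ex:
        # Covers ProtocolError and Fault raised by the transport or server.
        logging.error('XML-RPC error pinging Technorati: %s', ex)
        return False

    if result.get('flerror', False):
        logging.error('Technorati ping error from server: %s',
                      result.get('message', '(no message)'))
        return False
    return True
```

On App Engine, the transport argument would be a GAEXMLRPCTransport instance.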
Note the first four lines, though. They say:
- If we’re running on GAE, use the real Technorati ping URL.
- Otherwise, use a fake one.
Those constants are defined at the top of admin.py:
The fake URL is nothing more than a canned page. On my development machine, I run an instance of the Apache HTTP server. In my personal web page area, I created a static XML file containing the canned result of a Technorati ping. (See above.) That way, I can test the XML-RPC logic without actually pinging Technorati for real.
And that’s all there is to it.
Potential Problems
Note that XML-RPC calls can fail for several reasons, including:
- The XML-RPC response is too large. GAE defines a ResponseTooLargeError that is raised when the response data exceeds the maximum allowed size and the allow_truncated parameter passed to fetch() was False. Passing allow_truncated=True to fetch() isn’t especially helpful, since a truncated XML-RPC response can’t be parsed, so there isn’t much we can do about this error.
- The remote HTTP server takes too long to respond. There’s not much we can do about this error, either.
Getting the Code
The code used in this article is available at http://software.clapper.org/python/picoblog/.
Related Brizzled Articles
Additional Reading
- XML-RPC HOWTO
- The xmlrpclib.py source code
- The xmlrpclib documentation
- Building Scalable Web Applications with Google App Engine (presentation), by Google’s Brett Slatkin.
- Google App Engine documentation