WSGI, PSGI, Rack - learning some new backend stuff

(This is one of those articles I sometimes fear writing, because it reveals a vast gap in my knowledge. I've spent most of the past decade programming in PHP, with numerous forays into other languages and frameworks, but mostly on the app dev end. I have to deploy my own code, though, and that's pushed me lower into the stack.)

I've been learning the Django REST Framework, and it's nice, but deploying it was difficult for me, because I'm not an experienced Pythonista. Unlike PHP, which generally comes integrated with Apache on most Linux distros, Python web apps need to be built and deployed the old-fashioned way, by editing some config files to wire up the various parts that make up the app.

The integration between Apache and Python is called the Web Server Gateway Interface, WSGI. It's a lot like CGI, if CGI were crossed with Unix pipes. WSGI also seems to be influenced by FastCGI, which keeps the application running between requests to avoid the startup overhead.
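To make the CGI comparison concrete, here's a minimal sketch of a WSGI application as I understand it: a callable that gets a dict of CGI-style variables plus a `start_response` callback, and returns an iterable of bytes. The names `hello_app` and `call_app` are mine, made up for illustration; only `wsgiref` comes from the standard library.

```python
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    """A WSGI application: a callable taking environ and start_response."""
    # environ is a dict of CGI-style variables (REQUEST_METHOD, PATH_INFO, ...).
    path = environ.get("PATH_INFO", "/")
    body = f"Hello from {path}\n".encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    # The response body is an iterable of byte strings.
    return [body]


def call_app(app, path="/"):
    """Hypothetical helper: invoke a WSGI app in-process, with no web server."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a minimal, valid environ
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body


if __name__ == "__main__":
    print(call_app(hello_app, "/demo"))
```

For a quick local test, the standard library's `wsgiref.simple_server.make_server("", 8000, hello_app).serve_forever()` should serve the same callable over HTTP. A real deployment would hand it to something like mod_wsgi instead.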

(FastCGI never went away, but it seemed to fade into the background when ASP, JSP, PHP, and ColdFusion dominated web dev in the early 2000s. The FastCGI model of execution, however, was never obsolete. It's just inherently more difficult to program because, unlike the "page with embedded code" languages, the programmer must clean up the execution environment after each response. The advantage is that it's faster than most other systems, and it also has a more sound, decoupled architecture than something like mod_perl, mod_python or even mod_php.)

WSGI is like a "framework for web app frameworks" because it introduces the idea of "middleware", or software that transforms data and passes it on to other software. WSGI defines programs as taking input - a Request - and producing output - a Response. Requests can be modified, and then passed down to another layer of middleware. Responses are received, and can be modified, before being passed back up to another layer of middleware. It's like Unix pipes, but you have two streams of data being transformed rather than just one.
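Here's a rough sketch of what one such middleware layer might look like in Python. It wraps an inner app, tweaks the request on the way down (trimming a trailing slash) and the response on the way back up (uppercasing the body). The names `uppercase_middleware` and `inner_app` are invented for this example, and I've simplified things: the replacement `start_response` doesn't return the `write` callable the spec allows for.

```python
def inner_app(environ, start_response):
    """The wrapped application: reports the path it was given."""
    body = f"path={environ['PATH_INFO']}".encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def uppercase_middleware(app):
    """Wrap a WSGI app, altering the request going in and the response coming out."""
    def wrapped(environ, start_response):
        # Request side: strip a trailing slash before the inner app sees it.
        environ["PATH_INFO"] = environ.get("PATH_INFO", "/").rstrip("/") or "/"

        # Hold back the inner app's status and headers until the body is rewritten.
        captured = {}

        def capture_start_response(status, headers, exc_info=None):
            captured["status"] = status
            captured["headers"] = headers

        chunks = app(environ, capture_start_response)
        try:
            # Response side: transform the body on its way back up.
            body = b"".join(chunks).upper()
        finally:
            close = getattr(chunks, "close", None)
            if close is not None:
                close()

        # The body changed, so recompute Content-Length.
        headers = [(k, v) for k, v in captured["headers"]
                   if k.lower() != "content-length"]
        headers.append(("Content-Length", str(len(body))))
        start_response(captured["status"], headers)
        return [body]

    return wrapped


app = uppercase_middleware(inner_app)
```

The wrapped `app` is itself a WSGI callable, which is what lets layers nest: the server can't tell whether it's talking to the application or to a wrapper around it.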

A framework can be factored into different layers of middleware, so that each layer can be disaggregated and swapped with other layers.

WSGI influenced the creation of Rack for Ruby (which Ruby on Rails runs on), and PSGI for Perl. Other languages have similar *SGI interfaces, and this has led to the development of uWSGI, which is an app server container for middleware that supports several languages, and manages each layer.

If the framework supports it, non-WSGI applications can also be run within a WSGI application, if they act like CGI apps. The middleware alters the requests, calls the CGI, and then alters the response.

So, the pattern in the *SGI systems is to have layers of services. You have a security layer, a routing layer, a url rewriting layer, a data sanitization layer, etc. On the return trip, if the application at the bottom of the stack returns well-formed XML, you can have XML-transforms on the response as it is forwarded up the stack.
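As a sketch of that layering, here are two toy layers (a security check and a URL rewriter) stacked on top of a trivial app. Everything here is hypothetical: the `X-Token` header, the `"secret"` value, the `/old/` to `/new/` rewrite, and the `build_stack` and `run` helpers are all made up for illustration.

```python
from functools import reduce


def echo_path_app(environ, start_response):
    """The application at the bottom of the stack: echoes the path it sees."""
    body = environ.get("PATH_INFO", "/").encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


def require_token(app):
    """Hypothetical security layer: reject requests without the right X-Token header."""
    def wrapped(environ, start_response):
        if environ.get("HTTP_X_TOKEN") != "secret":
            body = b"forbidden"
            start_response("403 Forbidden", [("Content-Type", "text/plain"),
                                             ("Content-Length", str(len(body)))])
            return [body]  # short-circuit: lower layers never run
        return app(environ, start_response)
    return wrapped


def rewrite_legacy_urls(app):
    """Hypothetical URL rewriting layer: map /old/... onto /new/..."""
    def wrapped(environ, start_response):
        path = environ.get("PATH_INFO", "/")
        if path.startswith("/old/"):
            environ["PATH_INFO"] = "/new/" + path[len("/old/"):]
        return app(environ, start_response)
    return wrapped


def build_stack(app, layers):
    """Wrap app so that layers[0] is outermost and sees the request first."""
    return reduce(lambda inner, layer: layer(inner), reversed(layers), app)


def run(app, path, token=None):
    """Call a WSGI app in-process and return (status, body)."""
    environ = {"REQUEST_METHOD": "GET", "PATH_INFO": path}
    if token is not None:
        environ["HTTP_X_TOKEN"] = token
    result = {}

    def start_response(status, headers, exc_info=None):
        result["status"] = status

    body = b"".join(app(environ, start_response))
    return result["status"], body


stack = build_stack(echo_path_app, [require_token, rewrite_legacy_urls])
```

Swapping or reordering layers is just a matter of changing the list passed to `build_stack`, which is the disaggregation idea in miniature.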

Having read and, I think, grokked WSGI, it's become a little easier to read the Django REST Framework sources. It's not that the framework is full of WSGI - it's not. It's based on Django, which predates WSGI, but was written in a post-WSGI world. You get a general feel about the patterns used to build contemporary Python software.

Other Systems

WSGI is a successor to CGI, and doesn't seek to displace application servers directly. Rather, it's something that will allow the creation of new application servers by allowing more vendors into the space. In the very old model, web servers had ways to extend the server via modules: Apache modules, NSAPI, and ISAPI. The modules would be written in C, and were fast, but hard to write. CGIs were easier to write, and thus, dominated for a while.

Then, the *SP/PHP languages took over. JSP was easiest to extend, with tag libraries written in Java, which ran right in the server. You only needed some jar/war files. ASP was fairly easy to extend if you could program ActiveX controls. PHP was hard to extend, because you needed to know C, write to their spec, and modify and build PHP from sources. Consequently, people just wrote OO classes in PHP rather than write extensions in C unless absolutely necessary.

WSGI is implemented above or outside of the *SP/PHP languages. It's more akin to something that intrudes into the Apache modules, NSAPI and ISAPI space. However, the natural trend will be to move framework features out of the framework and into a WSGI layer. Each such layer could lead to forking different, competing layers.

Looking at the list of middleware, it looks like this stuff is still in flux, and some things are dead. It could be that some things are suited to be middleware, and some things are still better implemented within the application's codebase. And given that ops don't want to mix and match allegedly compatible layers, you end up with suites like TurboGears, which looks nice.

On the flip side, you have Node.js and Go, which are a different style of application server. It seems like the current action is in Node.js and Go. Python frameworks are somewhat moribund compared to these new systems. web2py, Django, and TurboGears seem to be the most active. Django REST Framework is also active. At the same time, Python is running a lot of high-traffic websites, and is still in the game - but I have to wonder if it, like Ruby, has scaling limits that have to be overcome with additional tech.