If you have watched my posts over the last few years, you know I’m a fan of Django for web site development. Despite its many redeeming qualities (ORM, server integration, MVC, and many others), Django’s URL mapping file (urls.py) is problematic for me because it’s targeted at the needs of the original developers. Normally, this wouldn’t be a problem, but unfortunately the mapping needs of the original developers don’t translate across most web sites.
Why It’s Messed Up
Django’s developers managed news-oriented sites that, for the most part, display news stories. If you think about most news sites, a single dynamic HTML page serves thousands upon thousands of different stories. A news site has very few actual programmed pages because the story content comes from the database. Perhaps I’m stretching things a bit here — news sites certainly have a number of dynamic pages — but I think it’s fair to say that news sites rely upon fewer page designs (with more database-driven content) than most sites.
In contrast to news sites, most web sites contain many different dynamic pages. For example, my current project is a learning management system for university courses. While the content is database driven so users see their own assignments, exams, and files, the system is made up of hundreds (and eventually thousands) of different dynamic pages.
Here’s the rub: The Django framework requires a line entry in urls.py for every, single, dynamic page. This entry tells the Django system where to send requests based on their URL patterns. It is powerful, but it is definitely tedious. The regular expressions are primarily meant to pass parameters to view functions (i.e. to make pretty urls) rather than route many different HTML pages through a single line. When a site has thousands of pages, this file becomes:
- Large and unwieldy. As development and maintenance continue over time, the file easily gets out of sync with the actual functions and pages it is referencing. It takes a serious effort to ensure it stays updated, clean, and correct. Too much configuration was one of the reasons I left Java for Python!
- Painful. This is the real issue, IMO. I can deal with #1 through good practices, but who wants to write a line in the urls.py file each time a page is added? Certainly not me. Isn’t this exactly what “design by convention” is all about? With a convention, each new page simply needs to match a prescribed format to be found by the system.
Is this really such a big deal? It certainly isn’t a reason to leave Django in favor of the many other options out there. I happen to really like the rest of the Django ecosystem. And, there’s an easy fix. The pattern matching involved in urls.py makes adding a central, convention-based router (controller) really easy.
The Fix: A Convention-Based Controller
Most web languages route requests based on directory and file extension. The web server handles files with extensions like .png or .css. It passes urls ending with .jsp or .php to their respective interpreters. We just need to tell Django to do the same thing. Let’s come up with a file extension to represent a Django page. In my site, I simply use .html because every page in my site is dynamic; both color-coding editors and users like seeing .html in the filename. But for this post, I’ll use .dj to keep clear which pages should be routed by our new controller.
Step 1: Configure your web server
First ensure that Apache or IIS or Nginx knows to route filename.dj requests to your Django system. This is different with each server, but it’s a simple modification to the regular Django instructions. I trust you can figure it out.
Step 2: Add the controller to urls.py
We could add a single line to handle every .dj file, but in keeping with Django’s design, I’ll limit our controller to a single app. Suppose we had an application named “/account”. In the master urls.py file, add a line similar to the following:
(r'^account/(?P<path>.*)\.dj(?P<urlparams>/.*)?$', 'account.views.route_request' ),
The above line tells Django to route any .dj file in the account app to the route_request view function.
Step 3: The request_route view function
Add the following to your account/views.py file:
import sys
from django.http import HttpResponse, HttpResponseRedirect, Http404
from django.shortcuts import render_to_response
def route_request(request, path, urlparams):
parameters = urlparams and urlparams[1:].split('/') or []
funcname = 'process_request__%s' % path
try:
function = getattr(sys.modules[__name__], funcname)
except AttributeError:
return render_to_response('account/%s.dj' % path, { 'parameters': parameters })
return function(request, parameters)
Following is an explanation of the above code:
- Its primary parameter is the path variable, which contains the view function to be called. (I’ll get to urlparams in a minute). If a browser requests /account/test.dj, the path is equal to test.
- The function add process_request__ to the front of this path, making the full function name process_request__test in our example. The reason for prepending this seemingly arbitrary text to the beginning of the request is security. Since we’re routing expressly based on the URL pattern, a rogue user could try different names to call functions in our views.py file that were not meant to be web accessible. Prepending process_request__ ensures that only those methods starting with this text can be called from the web. It also helps us as programmers know which functions will be web accessible and which will not.
- The function tries to find the named function in the views.py file. If successful, it calls that secondary function and returns its response to the Django system. The secondary function acts like a normal view function that could have been called directly from a urls.py pattern.
- If the function isn’t found, it tries to find a template with the given name. More on this in Improvement 2 below.
My secondary function follows. Add this function to your account/views.py file.
def process_request__test(request, parameters):
return HttpResponse('Got to test view function with parameters: %s' % parameters)
This secondary function should do all the typical work of a view function, then it returns an HttpResponse object with the HTML to return to the browser. Again, note that test in the name of the function matches the test.dj given in the browser request URL. To add new view functions, simply add new process_request__* functions to the file.
Improvement 1: URL Parameters
Django’s documentation makes a big deal out of “pretty” URLs. The urls.py file uses patterns to embed parameters within the URL itself, freeing the user from unsightly ?a=b&c=d parameters. The urlparams pattern in our urls.py line above enables something similar.
The URL /account/test.dj/first/second/third/ still calls process_request__test, but it also adds the python list [ 'first', 'second', 'third', '' ] as parameters to the function call. I hope you agree that this is much prettier than the alternative, and it keeps with Django’s culture.
The drawback to this mechanism is the url-based parameters are given in a numbered list rather than in a named dictionary like traditional parameters. You can decide which is most important to you.
Improvement 2: Automatic Template Rendering
The standard Django system assumes we want to call a view function before every template. This is normally the case, but it isn’t true across all the pages on my site. Some pages, such as the “about us” and “contact us” pages don’t require any processing before rendering. They still have dynamic content in them, so I need dynamic templates rather than regular static HTML pages. But I don’t need a view function to precede their calls.
The AttributeError code in the route_request function handles these cases nicely. If a process_request__* function is not found, the function tries to render a template with the given name. Ensure your TEMPLATE_DIRS variable is set correctly in settings.py, then put the following in <templates_dir>/account/test2.dj:
Got to test2.dj template with parameters: {{ parameters }}
Now take your browser to /account/test2.dj. It should directly render the template — no view function required!
You might be wondering if this compromises security. First, note that this only calls .dj files, which are expected to be renderable. Then, if you really want to ensure a page cannot be called without its view function, name the template something other than .dj, such as test.hdj for a hidden django file. With a different extension, the page can never be rendered automatically, but it can still be rendered easily from a view function by a call to render_to_response(‘/account/test.hdj’, {}).
Improvement 3: Subrouting
This one is my favorite. Another (smaller) problem I have with standard Django is placing all the view functions in a single file. On a large project, the views.py grows, and grows, and grows, and…gets really convoluted. There’s just too much in a given views.py file. One solution to this problem is to split the project into more applications, but I’d rather split applications based on design principles rather than pragmatism. A better solution is to split the views.py file into a number of smaller files.
Instead of looking in a views.py file, my controller looks for view functions in a views/ directory. For example, the URL /account/test.dj in my system goes to /account/views/test.py and looks for a standard method called process_request() within the test.py module. This allows me to split my view functions across many files rather than a huge views.py file.
With this change, I get five views directory files for five pages. A single function in each file responds to the request for each page.
But perhaps that goes a little too far the other direction. Let’s pull back a bit. I added an exclamation point to the routing pattern to allow multiple view functions each file. In other words, each views directory file contains a primary response function called process_request(), plus any number of additional process_request__*() functions.
The best use case for this seemingly complex improvement is Ajax. Most modern web pages have multiple ajax calls that occur, and it makes sense to place all these ajax view functions in the same module as the main function. This might make more sense with an example:
- Suppose my test.dj page contains three ajax requests within it. The page is essentially made up of four requests: the main page and three sub-requests. I want all these view functions in the same views/test.py file.
- The first request is the main one: /account/test.dj/1/2/3/. This URL is routed to views/test.py -> process_request() with [ '1', '2', '3' ] as the indexed parameters.
- The second (ajax) call requests /account/test__first.dj/1/2/3/. This URL is routed to views/test.py -> process_request__first(), again with [ '1', '2', '3' ] as the indexed parameters.
- The third and fourth calls are routed to process_request__*() based on the characters after the exclamation point.
It might be argued that this third improvement loses all sense of pretty URLs, but it makes for a really clean project. View functions for a given app are separated into different modules, but related view functions can still be combined. I get the best of both worlds.
More Improvements
While I started with the above, my central controller is much more complex than what I’ve described in this post. Here’s a few ideas for additional improvements:
- I’ve replaced the Django template system with Mako templates. In other words, render_to_response() is replaced with a Mako-based template.render() call. Mako allows real Python in the templates, which might be an abomination to some, but is a relief to me. Why would I want to learn a new, severely limited templating language when I already know and love Python?
- A central controller gives a single place to add common behavior that should happen to all requests. For example, we could check every parameter value for SQL injection phrases like “DROP TABLE” and cut things short immediately, right at the controller. Post-processing can also go here. However, I should note that Django provides the idea of middleware which can also act on every request going through the system, so I’d suggest limiting the central controller to routing behavior.
- Rather than creating a controller for each app in my project, I actually created a central controller class that has the behavior above embedded in one of its methods. This enabled the router to adapt to each app without being rewritten each time.
After all the adjustments above, you might be wondering why I’m still using Django. These are really very minor adjustments to the Django system. I’m simply adding another step in the routing, right after urls.py. The other big change is using Mako for templating. Django is much, much bigger than these few changes. For the most part, the Django developers created an amazing system that produces clean, fast development efforts. I love Django’s wonderful form system, the object-relational system (across several hundred view functions, still not one line of SQL), the model-view-controller separation, the middleware layer, the separation into applications, the built-in admin, the user system, etc. Web development has come a long way since the CGI script days!
Comments