Super-simple Caching of Class Properties

If you have a class (as in Python class), and you have a property whose value is not likely to change during the instance’s life, but very likely to be accessed multiple times, it helps if you can somehow cache the value so that it is only calculated the first time you access it. This is very simple to do, and here’s how.

In my Djassets project, I have a class called AssetNode. It’s a custom template tag for managing code assets (JavaScript and CSS). The class has an attribute called file_objects, which returns a list of AssetFile instances, each representing a single asset file from the list provided by the user. I didn’t like the idea of AssetFile objects being created again and again every time I access this property. Here’s how I got around it.

class AssetNode(template.Node):
    def __init__(self, params):
        # do something with the params
        self._cached_objects = [] # where we store the cached objects

    # ....

    @property
    def file_objects(self):
        if not self._cached_objects:
            file_objects = []
            for file in self.files:
                file_objects.append(AssetFile(file))
            self._cached_objects = file_objects
        return self._cached_objects

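An empty list is falsy in Python, so I can safely use self._cached_objects directly in the if statement. As you can see, the method is fairly simple. If we have the cached version, we just return that. If nothing is cached yet, we build the list and cache it. One caveat: if self.files is empty, the cache stays empty too, so the loop runs again on every access. That costs next to nothing here, but a None sentinel would avoid it.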

Also, note that the code block inside if never returns anything. The return line is common for both cases. It’s like saying “If such and such condition is met, just make a detour and return to regular activities afterwards.” In our case, regular activity is returning the cached value. It always returns the cached value. You only place something in the cache the first time, and that’s our detour.

I’m still trying to figure out a good way to further abstract this method, since it’s an ever recurring pattern. There must be a shorter way to say the same thing. Know any?
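One possible answer, as a sketch: if you are on Python 3.8 or newer, the standard library ships functools.cached_property, which wraps exactly this compute-once-then-reuse pattern. The AssetFile and AssetNode classes below are simplified stand-ins made up for illustration (the real AssetNode extends template.Node and gets its files differently), and the created counter exists only to show how often objects get built.

```python
from functools import cached_property


class AssetFile:
    """Hypothetical stand-in for the real AssetFile class."""
    created = 0  # counts how many instances have been built

    def __init__(self, path):
        self.path = path
        AssetFile.created += 1


class AssetNode:
    """Simplified stand-in; the real class extends template.Node."""

    def __init__(self, files):
        self.files = files

    @cached_property
    def file_objects(self):
        # Runs once per instance; the result is stored on the instance
        # and handed back directly on every later access.
        return [AssetFile(f) for f in self.files]


node = AssetNode(['site.css', 'app.js'])
first = node.file_objects
second = node.file_objects
print(first is second)    # True: the same list object is reused
print(AssetFile.created)  # 2: each AssetFile was built only once
```

Unlike the empty-list check, this also caches an empty result, because the decorator tracks whether the value has been computed at all rather than whether it is truthy.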


Flattening Nested Code by Short-Circuiting

Nested code is ugly. It’s not just ugly, it is also bad for your health. Imagine having to deal with dozens of binary if blocks nested into each other. It can drive you crazy just imagining it. But there is hope. Using short-circuiting, you can flatten your nested blocks, and write some very nice code.

Here’s a simplified example:

if x > 1:
    if y < 5:
        return a
    else:
        return c
else:
    return None

We exploit the fact that return ejects you out of the block, so:

if x <= 1:
    return None

if y >= 5:
    return c

return a

This is what I call ‘short-circuiting’ (and correct me if I’m wrong about the name, but I thought it was fitting). We basically trap the execution with if clauses that return as soon as their condition matches. For example, if x is indeed less than or equal to 1, we don’t even worry about the rest of the code. We’ve effectively ‘short-circuited’. On the other hand, we don’t need any specific test for the case when x is greater than 1, because if it weren’t, we wouldn’t even reach the next line.

You now have beautifully readable code, and still get the same effect. It’s much easier to maintain because it’s far less intimidating.
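To check that the two shapes really behave the same, here is a small runnable sketch. The function names nested and flattened, and the placeholder values for a and c, are made up for illustration; the bodies are the two snippets above wrapped in functions so that return has somewhere to return from.

```python
def nested(x, y, a='a', c='c'):
    # The original, nested version.
    if x > 1:
        if y < 5:
            return a
        else:
            return c
    else:
        return None


def flattened(x, y, a='a', c='c'):
    # Each test handles one case and leaves the function right away.
    if x <= 1:
        return None
    if y >= 5:
        return c
    return a


# Compare both versions across the boundaries x = 1 and y = 5.
for x in range(-1, 4):
    for y in range(3, 8):
        assert nested(x, y) == flattened(x, y)
```

The loop finishes silently, which means both versions agree for every pair of values on either side of the two boundaries.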


authentication.py Summary of the 1st Week

It’s late Saturday night (actually closer to Sunday morning), and it’s time to summarize the development of authentication.py this week.

A short intro is in order. authentication.py is a microframework for user/group/permission management for web.py. It’s meant to cover a wide range of scenarios involving user authentication, user account and group management, and permissions. It handles most aspects of these activities including sending out confirmation e-mails for account creation, password reset, account removal, and account suspension. It also handles encryption and storage of passwords, and generating the URL suffix for confirmation and activation.

What’s done so far is:

  • user account creation with activation e-mail
  • user account activation
  • username and e-mail validation
  • password minimum length constraint check
  • user detail modification and record update
  • password resetting with confirmation e-mail
  • messaging (sending arbitrary e-mails to users)
  • retrieval of user account records by username and/or e-mail

Everything that’s been implemented so far should work and is also covered by automated tests.


Python Bits: setdefault

You may have encountered the get method available to all dict instances before. Here’s an example:

>>> d = {'a': 1}
>>> d.get('a')
1

That’s the same as d['a'], except that it returns an alternative value (None by default) instead of raising KeyError when there is no such key. I’ve just discovered that there is another way to do this, and that’s the setdefault method. In addition to returning the default value, it also sets it, adding the requested key to the dictionary. Here’s a session that demonstrates the difference:

>>> d = {'a': 1}
>>> d.get('a')
1
>>> d.get('b')
>>> d
{'a': 1}
>>> d.setdefault('a')
1
>>> d.setdefault('b')
>>> d
{'a': 1, 'b': None}
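Both get and setdefault also accept a second argument that is used as the default instead of None. A small sketch of where that becomes handy is grouping values under a key; the words list below is made up for illustration.

```python
words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']

groups = {}
for word in words:
    # Create an empty list the first time a letter shows up,
    # then append to whichever list is stored under that letter.
    groups.setdefault(word[0], []).append(word)

print(groups)
# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}
```

With get you would have to store the list back into the dictionary yourself; setdefault does the lookup and the insertion in one call.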


How to Mess up a URL Before Request Happens

This whole question came up when I wanted to add .xml to my /posts URL so that what is usually an HTML page with a list of posts gets returned as RSS when addressed as /posts.xml. This is not a head-scratcher. But then I also wanted to have a date-based archive for the posts. Here’s the whole list of URL patterns:

urls = (
    '/', 'home',
    '/posts/?', 'posts',
    '/posts(.xml)/?', 'posts',
    '/posts/(\d{4})/?', 'posts',
    '/posts/(\d{4})/(\d{2})/?', 'posts',
    '/posts/(\d{4})/(\d{2})/(\d{2})/?', 'posts',
)

What’s the catch? web.py is limited to positional arguments in URLs. In this case, it means that there would be a clash between the (.xml) and (\d{4}) patterns. And that’s the head-scratcher. I could have just said that the year argument, which is otherwise optional, can be either a 4-digit number or ‘.xml’, but that would be very ugly, and I hate ugliness in code. Here’s what I did instead.

First, I removed the offending .xml URL, because I would use that same trick elsewhere as well, so I wanted to handle it in a more generalized way. Then I added this bit of code to my app.py:

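import re
import web

ACCEPT_FORMATS = ('html', 'xml')
# Matches a trailing '.html' or '.xml' and captures the bare extension.
formats_re = re.compile(r'\.(%s)$' % '|'.join(ACCEPT_FORMATS))

def format_hook():
    web.ctx.format = None
    matches = formats_re.search(web.ctx.path)
    if matches:
        # Remember the requested format, then strip it from the path.
        web.ctx.format = matches.group(1)
        web.ctx.path = formats_re.sub('', web.ctx.path)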

This clever little piece of code looks for .html or .xml at the end of the request path. If it finds one, it saves the part after the dot (html or xml, respectively) in the request context as web.ctx.format, and strips the extension from the path. This is a so-called load hook: a function that runs before each request is handled.

Let’s hook this load hook up to our application:

app.add_processor(web.loadhook(format_hook))

Now, if we start the app and point our browser to http://0.0.0.0:8080/posts.xml, the /posts URL will be called instead of /posts.xml, and web.ctx.format will contain xml.
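If you want to poke at the matching logic without starting web.py, the same regular expression can be exercised on plain strings. This is only a sketch: split_format is a hypothetical helper name, not part of web.py or of my app, and it returns the values that the hook would otherwise store on web.ctx.

```python
import re

ACCEPT_FORMATS = ('html', 'xml')
formats_re = re.compile(r'\.(%s)$' % '|'.join(ACCEPT_FORMATS))


def split_format(path):
    """Return (path_without_extension, format_or_None)."""
    match = formats_re.search(path)
    if match is None:
        return path, None
    return formats_re.sub('', path), match.group(1)


print(split_format('/posts.xml'))      # ('/posts', 'xml')
print(split_format('/posts/2010/05'))  # ('/posts/2010/05', None)
print(split_format('/posts.json'))     # ('/posts.json', None)
```

Note that the pattern is anchored to the end of the path, so an extension followed by a trailing slash (as in /posts.xml/) is left alone.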


Assert Your Defense Against Bugs

In Python, assert does a bit more than it looks like it would, but its basic function is to make sure a condition is met. In defensive programming, we try to make sure every bit of input is carefully filtered and verified, and allow our software to function even if the input is broken. You can detect broken input and correct or replace it yourself, and assert then adds a safety net that catches anything you might have overlooked.

Here’s a simple example:

>>> def sum_up(int_list):
...     result = 0
...     for i in int_list:
...         result += i
...     return result
...
>>> l = [1, 2, 3]
>>> sum_up(l)
6

The above function returns the sum of all integers in a supplied list. We simply assume that the input will be a list of integer values, but we can’t be sure, can we? Someone might do something silly like this:

>>> sum_up(12)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in sum_up
TypeError: 'int' object is not iterable

Let’s defend against this potential bug. Let’s make sure we are getting a list and a list of integers at that.

>>> def sum_up(int_list):
...     if not hasattr(int_list, '__iter__'):
...         return 0
...     result = 0
...     for i in int_list:
...         try:
...             result += int(i)
...         except (ValueError, TypeError):
...             pass
...     return result
...

The important thing to remember is that we’re making sure our function can withstand all kinds of faulty input that you might imagine someone shoving into it. The above code is still not perfect, but we’ll get there. Note that we’re testing whether int_list is an iterable, rather than a list. The reason we do this is that the code will then work with such things as tuples, sets, and generators, as long as they can be iterated. This is called duck typing, and we employ it to maximize the chance of our code surviving malformed input by demanding only the minimum needed for correct operation.

>>> sum_up([1,2,3])
6
>>> sum_up([1, '2', 3])
6
>>> sum_up([1, 'a', 3])
4
>>> sum_up('abc')
0

Not only does it work when you pass it an occasional string, but it also tries to convert that string into an integer. Items that cannot be converted are skipped. Also, if you pass it a non-iterable, it returns 0. This may be desirable or not, but it’s your choice. Sometimes it can make the code more robust, when you expect a lot of abuse.

Now, let’s handle one more case. When the input is not iterable, we can still try to return its value as a sum:

>>> def sum_up(int_list):
...     if not hasattr(int_list, '__iter__'):
...         try:
...             int_list = int(int_list)
...             return int_list
...         except (ValueError, TypeError):
...             return 0
...     result = 0
...     for i in int_list:
...         try:
...             result += int(i)
...         except (ValueError, TypeError):
...             pass
...     return result
...

Now all safety nets are in place, and our code will handle most inputs we can think of. The only case left is when there is no input at all. Provided you even want to allow no input, we can simply assign a default value, and we’re done.

>>> def sum_up(int_list=[]):
...     if not hasattr(int_list, '__iter__'):
...         try:
...             int_list = int(int_list)
...             return int_list
...         except (ValueError, TypeError):
...             return 0
...     result = 0
...     for i in int_list:
...         try:
...             result += int(i)
...         except (ValueError, TypeError):
...             pass
...     return result
...

Let’s briefly test our code:

>>> sum_up([1, 2, 3])
6
>>> sum_up(['d', '3', 2, 'abc', 5.4])
10
>>> sum_up('bogus')
0
>>> sum_up('12')
12
>>> sum_up(12)
12
>>> sum_up(None)
0
>>> sum_up()
0
>>>

Finally, in case we’ve missed something, we can use assert to make sure we’ve covered all the angles.

>>> def sum_up(int_list=[]):
...     if not hasattr(int_list, '__iter__'):
...         try:
...             int_list = int(int_list)
...             return int_list
...         except (ValueError, TypeError):
...             return 0
...     assert hasattr(int_list, '__iter__')
...     assert hasattr(int_list.__iter__(), 'next')
...     result = 0
...     for i in int_list:
...         try:
...             result += int(i)
...         except (ValueError, TypeError):
...             pass
...     return result
...

The two assert lines check that (a) the input looks like an iterable, and (b) the iterator object returned by calling its __iter__() method has the next attribute (that is the Python 2 name; in Python 3 the method is called __next__). Of course, the tests are not definitive, and there might still be edge cases where the next attribute is something other than we expect, but this is, I think, more than enough as an illustration.
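The sessions above were run on Python 2. As a rough sketch of how the same idea might look on Python 3, here is an adapted version, not the original code. Two things are my own assumptions: strings are checked explicitly, because in Python 3 they have __iter__ and would otherwise be summed character by character, and the default value is an empty tuple rather than a list.

```python
def sum_up(values=()):
    """Sum whatever can be read as an integer and ignore the rest."""
    # Treat strings and bytes as single values, not as sequences of characters.
    if isinstance(values, (str, bytes)) or not hasattr(values, '__iter__'):
        try:
            return int(values)
        except (ValueError, TypeError):
            return 0
    iterator = iter(values)
    # Safety net: iter() should always hand back something with __next__.
    assert hasattr(iterator, '__next__'), 'expected an iterator'
    result = 0
    for item in iterator:
        try:
            result += int(item)
        except (ValueError, TypeError):
            pass  # skip anything that cannot be converted
    return result


print(sum_up([1, 2, 3]))                      # 6
print(sum_up(['d', '3', 2, 'abc', 5.4]))      # 10
print(sum_up('12'))                           # 12
print(sum_up(None))                           # 0
```

Keep in mind that assert statements are skipped entirely when Python runs with the -O flag, so they work as a development-time safety net and not as a replacement for the input handling around them.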


Write CSS fast: DumbCSS

Have you ever heard of CleverCSS? Well, this post is not about that. Sorry. This is about DumbCSS. It’s actually not that dumb, but it’s still about writing plain ol’ CSS, with a few bells and whistles.

So, this guy from RDBHost, an interesting relational database hosting service, wanted me to do some CSS templates for his app. RDBHost is actually quite a cool service (nooo, I’m not paid to say that, honest). It’s a Postgres database on Google App Engine that you access through a provided HTTP API. I don’t wanna babble on about it all day, so let’s just say ‘DB access using JavaScript, anyone?’
