Overriding Django Model fields for good

2020, May 14    

Django’s ORM is very flexible and offers easy ways to interact with your database without explicitly writing SQL. But this flexibility has downsides in teams whose members have varied experience with and understanding of the ORM. This discussion is an attempt to point out some of those issues and approaches to tackling them.

Many developers who are new to Django’s ORM, once they start writing joins (debatable) with it, often forget or fail to understand that referencing an object through a foreign key relationship actually triggers a new query. Django has a helper, select_related, to avoid this extra round trip to the database. I’ve noticed developers skipping it and referencing foreign key relations directly. Now, let me show you the code.

article = Article.objects.get(id=12345)
author = article.author  # extra query: SELECT * FROM authors WHERE id = :whatever

This pattern costs an extra round trip to the database to fetch the author. Some developers mistakenly think they already got the author on the first line and are now just using it. This usually happens in big teams where code is shipped in different styles and is rarely reviewed for this kind of mistake.

Later, when the codebase is considerably bigger, a team using a cloud database service like AWS Aurora can see a large database bill attributable to unnecessary database I/O. This is where many companies using Django end up when they run a cost audit and try to reduce their bills. Ideally, when the intent is to use the author information as well, the above query should be written as

# author info too, in one round trip to the DB
article = Article.objects.select_related('author').get(id=12345)
author = article.author  # no extra query, the author was loaded with the article
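To make the difference in round trips concrete, here is a minimal sketch that uses Python's built-in sqlite3 module instead of Django. The table layout, the sample rows, and the fetch_one helper are assumptions made up for illustration; the join query only approximates the kind of SQL that select_related produces.

```python
import sqlite3

# In-memory database with a hypothetical articles/authors schema.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE articles (
        id INTEGER PRIMARY KEY,
        title TEXT,
        author_id INTEGER REFERENCES authors(id)
    );
    INSERT INTO authors VALUES (1, 'Ada');
    INSERT INTO articles VALUES (12345, 'Hello', 1);
    """
)

counter = {"round_trips": 0}


def fetch_one(sql, params):
    """Run one query and count it as one round trip (illustrative helper)."""
    counter["round_trips"] += 1
    return conn.execute(sql, params).fetchone()


# Lazy access: one query for the article, a second one for its author.
article = fetch_one(
    "SELECT id, title, author_id FROM articles WHERE id = ?", (12345,)
)
author = fetch_one("SELECT id, name FROM authors WHERE id = ?", (article[2],))
lazy_round_trips = counter["round_trips"]

# Join: article and author columns arrive together in a single query.
counter["round_trips"] = 0
joined = fetch_one(
    "SELECT a.id, a.title, au.id, au.name "
    "FROM articles a JOIN authors au ON au.id = a.author_id "
    "WHERE a.id = ?",
    (12345,),
)
joined_round_trips = counter["round_trips"]

print(lazy_round_trips, joined_round_trips)
```

The lazy path issues one query per object touched, while the join path pays for a single query regardless of how many columns come back.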

But with a huge codebase, you realize that changing every usage of this pattern is tedious. Still, your New Relic and DB I/O statistics push you to optimize to save costs. So you come up with an idea: cache the author and return the cached copy whenever someone requests an associated author through this pattern. So, how do you achieve that? One way is to hide the field behind a property of the same name, illustrated here with a simple date field:

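import datetime
import logging

from django.db import models

logger = logging.getLogger(__name__)

# Before: the field is exposed directly.
class MyClass(models.Model):
    my_date = models.DateField()

# After: same column, but every access goes through a property.
class MyClass(models.Model):
    _my_date = models.DateField(
        db_column="my_date",  # keeps the existing column, avoiding a migration to a different one
    )

    @property
    def my_date(self):
        return self._my_date

    @my_date.setter
    def my_date(self, value):
        if value > datetime.date.today():
            logger.warning("The date chosen was in the future.")
        self._my_date = value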
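The same property trick can be pointed back at the author example. Below is a minimal sketch in plain Python, with no Django involved, of a property that serves the author from a shared cache. Every name in it (fetch_author, AUTHOR_TABLE, _author_cache, and the stripped-down Article and Author classes) is hypothetical, and cache invalidation is deliberately left out.

```python
class Author:
    def __init__(self, author_id, name):
        self.id = author_id
        self.name = name


# Hypothetical stand-in for the authors table, plus a counter of lookups.
AUTHOR_TABLE = {7: "Ada"}
lookups = {"count": 0}

# Shared cache keyed by author id (never invalidated in this sketch).
_author_cache = {}


def fetch_author(author_id):
    """Pretend database query: each call counts as one round trip."""
    lookups["count"] += 1
    return Author(author_id, AUTHOR_TABLE[author_id])


class Article:
    def __init__(self, article_id, author_id):
        self.id = article_id
        self.author_id = author_id

    @property
    def author(self):
        # Serve the cached copy when there is one, otherwise fetch and store.
        cached = _author_cache.get(self.author_id)
        if cached is None:
            cached = fetch_author(self.author_id)
            _author_cache[self.author_id] = cached
        return cached


first = Article(1, 7)
second = Article(2, 7)
names = [first.author.name, second.author.name, first.author.name]
print(names, lookups["count"])
```

Existing call sites keep writing article.author, which is the point of the override. Django already keeps the related object on an instance after its first access, so a cache only pays off when it is shared across instances, as the dictionary is here. A real version would also need a plan for stale authors.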
