
Overriding Django Model fields for good
Django’s ORM is very flexible and offers an easy way to interact with your database without writing SQL explicitly. But this flexibility has downsides on teams whose members have varied experience with and understanding of the ORM. This post is an attempt to point out some of those issues and approaches to tackling them.
Many developers who are new to Django’s ORM, once they start writing joins with it (debatable), often forget or fail to understand that referencing an object through a foreign key relationship triggers a new query unless the related object has already been loaded. Django has a helper, select_related, to avoid this extra round trip to the database. I’ve noticed developers bypassing it and referencing foreign key relations directly. Now, let me show you the code.
article = Article.objects.get(id=12345)
author = article.author  # extra query: SELECT * FROM authors WHERE id = :whatever
This usage costs an extra round trip to the database to fetch the author. Some developers mistakenly think they already got the author on the first line and are now just using it. This usually happens in big teams where code ships in different styles and is reviewed less carefully for this kind of mistake.
Later, when the codebase has grown considerably, a team using a cloud database service such as AWS Aurora may see a large database bill attributable to unnecessary database I/O. This is where many companies using Django end up when they do a cost audit and try to reduce their bills. Ideally, when the author information is needed as well, the above query should be written as
# author info too, in one round trip to the DB
article = Article.objects.select_related('author').get(id=12345)
author = article.author
But with a huge codebase, changing every usage of this pattern is a tedious task. Yet your New Relic and DB I/O statistics push you to optimize this to save costs. So you come up with an idea: cache the author and return the cached copy whenever someone accesses an associated author this way. So, how do you achieve that?
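import datetime
import logging

from django.db import models

logger = logging.getLogger(__name__)

# Before: a plain model field.
class MyClass(models.Model):
    my_date = models.DateField()

# After: the field is renamed and exposed through a property with the old name.
class MyClass(models.Model):
    _my_date = models.DateField(
        db_column="my_date",  # keeps the existing column, so no column migration is needed
    )

    @property
    def my_date(self):
        return self._my_date

    @my_date.setter
    def my_date(self, value):
        if value > datetime.date.today():
            logger.warning("The date chosen was in the future.")
        self._my_date = value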
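
The same property trick could in principle be applied to the author lookup discussed earlier. What follows is only a rough sketch in plain Python, so that it runs without a Django project: `AUTHORS_TABLE`, `AUTHOR_CACHE`, `QUERY_LOG`, and `fetch_author` are hypothetical stand-ins for the database, a cache, and the query the ORM would issue, and the `Article` class here is an ordinary class, not a Django model.

```python
from dataclasses import dataclass

# Hypothetical stand-ins for the authors table and a process-local cache.
AUTHORS_TABLE = {7: "Ada"}
AUTHOR_CACHE = {}
QUERY_LOG = []  # records every simulated round trip to the database


@dataclass(frozen=True)
class Author:
    id: int
    name: str


def fetch_author(author_id):
    """Simulate `SELECT * FROM authors WHERE id = ...` (one round trip)."""
    QUERY_LOG.append(author_id)
    return Author(id=author_id, name=AUTHORS_TABLE[author_id])


class Article:
    def __init__(self, id, author_id):
        self.id = id
        self.author_id = author_id

    @property
    def author(self):
        # Serve the cached copy when there is one; otherwise hit the
        # simulated database once and remember the result.
        cached = AUTHOR_CACHE.get(self.author_id)
        if cached is None:
            cached = fetch_author(self.author_id)
            AUTHOR_CACHE[self.author_id] = cached
        return cached


first = Article(id=12345, author_id=7)
second = Article(id=67890, author_id=7)
first.author
second.author
print(len(QUERY_LOG))  # number of simulated queries for the two lookups
```

Existing call sites keep writing article.author, which is the point of hiding the lookup behind a property. A real cache would also need to be invalidated when an author changes, which this sketch does not attempt.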