Think twice before you soft delete

10 Apr 2019

This is a code-heavy blog post about a specific pattern and a third-party library that implements it. I want to document the behaviour that other Django developers should expect if they are not careful when introducing a concept called soft deletion.

Soft deletion, on the face of it, is a neat thing to be able to do. Say you have to write an application to track books in a library. You might create a Book model in your database and have a Loan object that refers to it. Say “1984” gets borrowed ten times; there would be ten Loans referring to the one Book, all having different dates.

Then you have to handle deletion. The book gets lost; you want to delete it to prevent it from being available for future borrowing. Since databases apply constraints (a foreign key in this case), you have to delete the Loan rows as well as the Book, or allow a NULL value. But you don’t want the historical statistics to be altered: the report page showing how many times that book was borrowed shouldn’t change. This is when soft deletion is handy. Introduce an is_removed column into the Book model, and set it to True for books that are no longer available. You just have to remember, wherever the application shows a list of available books, to use .exclude(is_removed=True).
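To make the database side of that concrete, here is a minimal sketch using only Python's built-in sqlite3 module rather than Django. The table and column names are my own assumptions, chosen to mirror the Book and Loan models described above; the WHERE clause at the end plays the role of .exclude(is_removed=True).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema mirroring the Book and Loan models.
conn.execute(
    "CREATE TABLE book ("
    " id INTEGER PRIMARY KEY,"
    " title TEXT NOT NULL,"
    " is_removed INTEGER NOT NULL DEFAULT 0)"
)
conn.execute(
    "CREATE TABLE loan ("
    " id INTEGER PRIMARY KEY,"
    " book_id INTEGER NOT NULL REFERENCES book(id),"
    " borrowed_on TEXT NOT NULL)"
)

# One book, borrowed ten times on different dates.
conn.execute("INSERT INTO book (id, title) VALUES (1, '1984')")
conn.executemany(
    "INSERT INTO loan (book_id, borrowed_on) VALUES (1, ?)",
    [(f"2019-01-{day:02d}",) for day in range(1, 11)],
)

# A hard delete is refused while Loan rows still point at the book.
try:
    conn.execute("DELETE FROM book WHERE id = 1")
    hard_delete_blocked = False
except sqlite3.IntegrityError:
    hard_delete_blocked = True

# Soft delete: flag the row instead of removing it.
conn.execute("UPDATE book SET is_removed = 1 WHERE id = 1")

# Listing of available books, with the exclusion applied by hand.
available_titles = [
    row[0] for row in conn.execute("SELECT title FROM book WHERE is_removed = 0")
]

# The historical report still sees every loan.
loan_count = conn.execute(
    "SELECT COUNT(*) FROM loan WHERE book_id = 1"
).fetchone()[0]
```

The book disappears from the listing, but the loan history is untouched, which is the whole appeal of the approach.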

Depending on the size and complexity of the codebase, being sure that you’ve implemented that exclusion absolutely everywhere might be difficult. It’s therefore appealing to use a “feature” of Django and do it in one step by overriding the default manager on the Book model. If you do that, then Book.objects.all() ceases to include books for which is_removed=True. You still have to remember not to call .delete() on Book objects or QuerySets anywhere (call .update(is_removed=True) instead). There’s a library called django-model-utils which does all of this for you. It overrides querying and deletion methods in a model class called SoftDeletableModel, so it even takes care of .delete() on model instances and QuerySets.

I admit it: I’ve done this trick in Django codebases I’ve worked on (overridden objects on a model, I mean, like here). I’m sorry.

Despite seeming to solve the soft deletion problem in one fell swoop, and obviating the need to search the entire codebase for lists or deletions, taking this approach causes all sorts of complications down the line. The Django documentation recommends against it (“Don’t filter away any results”) and explains some of the problems that result (“…aren’t used when querying on related models…”).

It’s all rather difficult to follow, because of the abstract language. To help convince myself (and you, if you’re still reading!), I put together a small demonstration application to illustrate just a few of these complications.

Here’s a link: soft-deletable-models

And here’s the main exploratory Python notebook

Tags: code, python, django