Bringing more security to multi-tenant Django applications with django-scopes

pretix is a multi-tenant application: With one software installation, it can handle lots of companies and institutions selling tickets. In pretix, they are called organizers, but in the more general case, we usually speak of “tenants” in the software industry. Building pretix this way is a design choice: we could just as well have created software that only handles one company and run it many times on logically or physically separate systems, one for every event organizer. We decided to go with multi-tenancy in the software many years ago for a number of reasons.

One of them is that we want to have the ability to provide some features across multiple tenants. Some of our power users manage a large number of events split over multiple organizers and it’s really convenient for them that we can offer global search and reporting. A second reason is the performance pattern inherent to ticket sales: Most of our clients do not cause any load 99% of the time and we want them to take up as little memory and disk space as possible, but they sometimes require a lot of resources for a short time when they have a big onsale. This performance pattern is easier to handle when running all customers on the same large instance instead of every customer in their own.

This leaves us with the challenge of isolating customers effectively to protect their data from unwanted leaks. There are some fancy approaches, e.g. based on having multiple PostgreSQL schemas, but since pretix is capable of running with different database backends, this is not a solution for us.

Multi-tenancy with foreign keys

The way we (and many other developers) ended up implementing multi-tenancy is to keep a record of which database object belongs to which tenant in a special database column. Let’s look at an example:

class Tenant(models.Model):
    name = models.CharField(max_length=190)

class Page(models.Model):
    tenant = models.ForeignKey(Tenant, on_delete=models.PROTECT)
    title = models.CharField(max_length=190)

Then, every time we run a query on that model, we need to filter by the current tenant:

Page.objects.filter(
    tenant=get_tenant(request)
)

get_tenant could look up the tenant based on the current URL or based on who is logged in. This works, but it’s really easy to accidentally miss the filter() part in one of your queries.
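As a rough illustration of what such a helper might look like, here is a minimal sketch in plain Python. The `Request` class, the `TENANTS` table and the `user_tenant` attribute are hypothetical stand-ins for this example only; a real project would work with Django's request object and query the `Tenant` model instead.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tenant:
    slug: str


@dataclass
class Request:
    """Stand-in for an HTTP request with only what this sketch needs."""
    host: str
    user_tenant: Optional[Tenant] = None  # hypothetical: tenant of the logged-in user


# Hypothetical lookup table; a real project would query the database.
TENANTS = {"acme": Tenant("acme"), "globex": Tenant("globex")}


def get_tenant(request: Request) -> Tenant:
    """Resolve the tenant from the subdomain, else from the logged-in user."""
    subdomain = request.host.split(".")[0]
    if subdomain in TENANTS:
        return TENANTS[subdomain]
    if request.user_tenant is not None:
        return request.user_tenant
    # Failing loudly is safer than silently continuing without a tenant.
    raise LookupError(f"no tenant for host {request.host!r}")
```

Raising an error when no tenant can be determined is a deliberate choice in this sketch: a helper that quietly returns nothing invites exactly the unfiltered queries we are trying to avoid.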

On explicit queries, one gets used to spotting these things in code review, but they still slip through once in a while. It’s especially easy to miss it since local development installations usually do not contain much data for multiple tenants and therefore this does not become visible during manual testing. It gets even more dangerous when Django auto-generates queries for you, such as when using forms or generic views:

class LinkForm(forms.ModelForm):
    class Meta:
        model = Link
        fields = ('page', 'text')

This will automatically create a ModelChoiceField that shows a selection of all page objects of all tenants. That’s a data leak! Would you have spotted the missing .filter() call in code review?

We believe that this type of data leak is the most dangerous security vulnerability in any multi-tenant Django application. Django does a good job providing you with sane defaults for authentication and protecting you from SQL injections, but there’s no built-in mechanism to protect you from this class of errors. In our history, we had three known vulnerabilities of this type, including a very critical incident where we almost leaked lots of personal data, and we know of friends with similar experiences.

In the aftermath of that incident, we did deploy a few defense-in-depth approaches on pretix.eu to detect large-scale exploitation of these kinds of issues at a database level. We don’t really want to talk about them too much, since some security by obscurity is naturally involved in these things.

However, we always wanted to have a more thorough solution to tackle this problem in an idiomatic way that makes it hard to build these bugs in the first place. After years of thinking about this problem, we think we have found a viable solution that we released as a standalone Python library.

Explicit is better than implicit: django-scopes

The solution is a combination of custom managers and Python-level context managers. With our new package django-scopes, you can define a relation to a tenant by using a special manager class, ScopedManager, and defining the lookup that allows you to reach the tenant model:

class Page(models.Model):
    tenant = models.ForeignKey(Tenant, ...)
    title = models.CharField(...)

    objects = ScopedManager(tenant='tenant')

class Comment(models.Model):
    page = models.ForeignKey(Page, ...)
    text = models.CharField(...)

    objects = ScopedManager(tenant='page__tenant')

In this case, the name of our keyword argument tenant= defines our scope dimension. We can call it anything we want, and we can have multiple scope dimensions in the same project or model, but let’s stick to the simple case here. You can find examples using the full power of the library in its documentation.

Our new model manager will refuse operation by default:

>>> Comment.objects.all()
ScopeError: A scope on dimension "tenant" needs to be active for this query.

In order to use our model, we need to explicitly activate a tenant by using a context manager:

with scope(tenant=get_tenant(request)):
    Comment.objects.all()

This will not only allow us to run queries on all objects related to the given tenant, it will even automatically add filters to any query in case we forget them. The previous code snippet is equivalent to:

Comment.objects.filter(page__tenant=get_tenant(request)).all()
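To make the mechanism more tangible, here is a toy sketch in plain Python of how a context manager and a manager class can cooperate in this way. It is not the django-scopes implementation: `ToyScopedManager` and the dict-based rows are made up for this example, and the sketch handles a single dimension only and does not model disabling scopes.

```python
from contextlib import contextmanager
from contextvars import ContextVar

# Context-local storage for the active tenant; None means "no scope active".
_active_tenant = ContextVar("active_tenant", default=None)


class ScopeError(Exception):
    pass


@contextmanager
def scope(tenant):
    """Activate a tenant for the duration of the with block."""
    token = _active_tenant.set(tenant)
    try:
        yield
    finally:
        # Restore whatever was active before, so nested blocks work.
        _active_tenant.reset(token)


class ToyScopedManager:
    """Holds plain dicts instead of database rows."""

    def __init__(self, rows):
        self._rows = rows

    def all(self):
        tenant = _active_tenant.get()
        if tenant is None:
            raise ScopeError(
                'A scope on dimension "tenant" needs to be active for this query.'
            )
        # The tenant filter is applied even though the caller did not ask for it.
        return [row for row in self._rows if row["tenant"] == tenant]


comments = ToyScopedManager([
    {"tenant": "acme", "text": "first"},
    {"tenant": "globex", "text": "second"},
])

with scope(tenant="acme"):
    visible = comments.all()  # only the row belonging to "acme"
```

The key point of the design is that the manager has no way to return data unless a tenant has been activated, so forgetting the filter becomes an error instead of a leak.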

Sounds tedious to add in all those with statements? It absolutely isn’t: In reality you’ll often find it suitable to just call the context manager in a middleware, either based on the URL or on the authenticated user. Such a middleware could be as simple as this one:

class ScopingMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        with scope(tenant=get_tenant(request)):
            return self.get_response(request)

Of course, you can opt out of the functionality at any time if you want. with scope(tenant=None) will disable the protection for a given dimension, or you can use with scopes_disabled() to temporarily disable the protection across all dimensions:

with scopes_disabled():
    ...

# OR

@scopes_disabled()
def fun(...):
    ...
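If you wonder how one name can serve as both a context manager and a decorator: Python's standard library supports this directly. The following toy sketch shows the pattern using `contextlib.contextmanager`; the flag and the `protection_active` helper are invented for this example and say nothing about how django-scopes is implemented internally.

```python
from contextlib import contextmanager
from contextvars import ContextVar

# Toy flag standing in for "is scope protection currently enforced?"
_scopes_enabled = ContextVar("scopes_enabled", default=True)


@contextmanager
def scopes_disabled():
    """Switch protection off inside the block and restore it afterwards."""
    token = _scopes_enabled.set(False)
    try:
        yield
    finally:
        _scopes_enabled.reset(token)


def protection_active():
    return _scopes_enabled.get()


# Context managers created with @contextmanager also work as decorators.
@scopes_disabled()
def maintenance_job():
    return protection_active()


with scopes_disabled():
    inside_block = protection_active()
after_block = protection_active()
inside_job = maintenance_job()
```

In both forms the protection is restored as soon as the block or function exits, even if an exception is raised inside it.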

The library can deal with more complex cases as well, such as activating multiple tenants at once, using custom manager base classes, nested invocations, and more. Some caution is required with testing, model forms and django-filters. See the Caveats section of the documentation for more information.

What django-scopes isn’t

django-scopes does not try to be a fine-grained permission system. It’s not a replacement for thinking about the queries you write and we recommend still writing all the .filter() calls even if they get inserted automatically. You’re unlikely to notice any performance overhead at all.

In pretix, we use a role-based access control system which allows every tenant to define roles (“teams”) and assign them to users to selectively grant them access to some of the data. So even though we activate scope(tenant=get_tenant(request)) in our middleware, this doesn’t mean the user has access to all data of that tenant and we still need to apply filters based on the permissions of the current user. Here’s some real-life code from pretix retrieving all orders a user is allowed to see:

qs = Order.objects.filter(
    Q(event__organizer_id__in=
        self.request.user.teams.filter(
            all_events=True,
            can_view_orders=True
        ).values_list('organizer', flat=True)
    ) | Q(event_id__in=
        self.request.user.teams.filter(
            can_view_orders=True
        ).values_list('limit_events__id', flat=True)
    )
)
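For readers less familiar with Q objects, the rule this query expresses can be restated in plain Python. This is only a sketch of our reading of the query: the `Team` and `Order` dataclasses are simplified stand-ins, not the pretix models.

```python
from dataclasses import dataclass, field


@dataclass
class Team:
    organizer: int                       # id of the organizer (tenant)
    can_view_orders: bool = False
    all_events: bool = False
    limit_events: set = field(default_factory=set)  # ids of individual events


@dataclass
class Order:
    event_id: int
    organizer_id: int


def can_view(order, teams):
    """Return True if any of the user's teams grants access to the order."""
    # First Q(): a team with order access to all events of the order's organizer.
    via_all_events = any(
        t.all_events and t.can_view_orders and t.organizer == order.organizer_id
        for t in teams
    )
    # Second Q(): a team with order access that lists this specific event.
    via_limit_events = any(
        t.can_view_orders and order.event_id in t.limit_events
        for t in teams
    )
    return via_all_events or via_limit_events
```

A user whose team covers all events of organizer 1 can see every order of that organizer, while a team limited to event 20 only grants access to orders of event 20.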

We don’t want django-scopes to handle this kind of separation. It would lead to incomprehensible code paths and lots of very complex and implicit logic.

We see django-scopes purely as a safety net. We want it to limit our exposure by limiting the size of any data leaks to the currently active tenant in case anything else fails.

Real-world experience

We think this is a great way to make this class of vulnerability much less likely. It doesn’t replace cautious programming or good code review in any way, but it certainly makes us sleep a little better. We’re already running it in production (although only for a very short period), and our friends at pretalx do, too.

Integrating this approach into pretix, a mature Django application with approx. 55,000 lines of Python code, was a matter of two or three hours. Wrapping all the queries in our 33,500 lines of tests with context managers took a little more work and caused the pull request to get quite large, but the total integration effort was still around two full days.

We’d love to hear your thoughts on whether this is something you would use!

Find django-scopes on PyPI