Home Speeding up pytest + sqlalchemy
Post
Cancel

Speeding up pytest + sqlalchemy

A large chunk of python projects use SQLAlchemy as ORM, and in doing so may actually be doubling their test runs.

One of the more interesting things I discovered while running tests with profiling on is the fact that the most expensive operations were actually involving database fixtures and user auth. This didn’t make any sense at first, because it was supposed to be a very cheap set of operations on a very small scale.

However, on closer inspection I realised that this wasn’t SQLAlchemy per se, it was actually bcrypt! As it is a decently written cryptography library, bcrypt operations are somewhat expensive (although I’m not sure at this point that it is not vulnerable to timing attacks?), and in the context of unit tests, it turns out that they were the most expensive operations.

What this means in the practical sense, is that all of the sqlalchemy_utils.PasswordType columns will spend a significant amount of time when doing almost anything useful (inserts, updates, comparisons). Since we don’t really care about password security during tests, this is simply a waste of time, and a waste of CPU cycles in your pipeline.

We solve this problem by doing some metaprogramming, and replacing the PasswordType with a plain old String column. The rest of your code should be none the wiser.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
from flask_sqlalchemy import SQLAlchemy
from flask_sqlalchemy.model import DefaultMeta
from sqlalchemy.ext.declarative import as_declarative
from sqlalchemy_utils import PasswordType

from yourproject.config import ActiveConfig

db = SQLAlchemy()


class BaseModelMetaclass(DefaultMeta):
    def __new__(cls, name, bases, dct):
        obj = super().__new__(cls, name, bases, dct)

        if ActiveConfig.TESTING:
            for key, value in dct.items():
                if not isinstance(value, db.Column):
                    continue
                if isinstance(value.type, PasswordType):
                    value.type = db.String(100)

        return obj


@as_declarative(
    metaclass=BaseModelMetaclass,
)
class BaseModel(db.Model):
    __abstract__ = True

This example uses flask-sqlalchemy, but obviously it will work without the flask wrappers. The key element here is the ActiveConfig.TESTING flag, so you don’t accidentally replace all the password fields in production as well.

Results may vary of course, but for the first project where we applied this, test runs went from ~14.5 minutes to ~6 for less than a 1000 tests. For large projects, time savings could be even more significant.

This post is licensed under CC BY 4.0 by the author.