Site do Guilherme

On maintaining large SSR applications

A collection of insightful talks and essays about maintaining large applications.

The focus is mainly on traditional SSR web applications using MVC-esque frameworks (such as Django, RoR, Laravel, …), so if you work on frontend-only applications (such as SPAs using React, Vue, etc.) this probably won’t be very helpful :).

Django structure for scale and longevity

Reference: https://www.youtube.com/watch?v=yG3ZdxBb1oo

Where do we put the business logic?

What is “business logic”?

Models

They define the relations to be used in the “business logic”. Initial business logic can be written in model validation or added in the save or clean methods.

BUT models take care primarily of the data model & relations. Avoid fat models!

Views & APIs

They call code from the core of the application.

(DRF specific)

Serializers should:

What about in the APIView?

class EntityCreateAPI(APIView):
    def post(self, request):
        serializer = ...
        serializer.is_valid(...)


        # bunch of business logic

        return Response(status=...)

But what if we need the same logic elsewhere? For example in a command, in another API path, in a regular view?

Existing boxes (from Django)

None of them is a great place to put business logic.

Services

app/services.py or multiple files at app/services/…py

Example:

def create_user(
    *,
    email: str,
    name: str
) -> User:
    user = User(email=email)
    user.full_clean()
    user.save()

    create_profile(user=user, name=name)
    send_confirmation_email(user=user)

    return user

Every non-trivial operation, where objects are being created, should be a service.

@transaction.atomic
def create_complex_thing_with_dependences():
    ...
    # does a bunch of database creating

No ORM code in the view layer! Not in the viewset, not in the APIView.
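As a rough illustration of why this helps, the sketch below shows two thin entry points sharing one service. It is plain Python so that it runs without Django: `InMemoryUsers`, `api_post`, and `cli_command` are hypothetical stand-ins for the ORM, an API view, and a management command, and the email rule is invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    email: str
    name: str


@dataclass
class InMemoryUsers:
    """Hypothetical stand-in for the ORM so the sketch runs without Django."""
    rows: list = field(default_factory=list)

    def add(self, user: User) -> None:
        self.rows.append(user)


def create_user(*, users: InMemoryUsers, email: str, name: str) -> User:
    """Service: the only place that knows the rules for creating a user."""
    if "@" not in email:
        raise ValueError(f"invalid email: {email!r}")
    user = User(email=email.lower(), name=name)
    users.add(user)
    return user


def api_post(users: InMemoryUsers, payload: dict) -> int:
    """Thin API-style entry point: read input, call the service, map to a status."""
    try:
        create_user(users=users, email=payload["email"], name=payload["name"])
    except (KeyError, ValueError):
        return 400
    return 201


def cli_command(users: InMemoryUsers, argv: list) -> str:
    """Thin command-style entry point reusing the same service."""
    email, name = argv
    user = create_user(users=users, email=email, name=name)
    return f"created {user.email}"
```

Neither entry point knows how a user is stored or validated, so a change to those rules happens in one function only.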

Selectors

def get_users(*, fetched_by: User) -> Iterable[User]:
    user_ids = get_visible_users_for(user=fetched_by)

    query = Q(id__in=user_ids)

    return User.objects.filter(query)

Selectors vs Model properties

If a model property starts doing queries on the model’s relations, or outside them, it should be a selector.

Example (better as a selector):

class Lecture(models.Model):
    ...
    course = models.ForeignKey(Course, ...)
    ...
    @property
    def not_present_students(self):
        present_ids = self.present_students.values_list('id', flat=True)

        return self.course.students.exclude(id__in=present_ids)

If we are listing all Lectures, this can easily become an N+1 problem (for every lecture, we need to query all course students). So this should probably be moved into a selector.
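A minimal sketch of what such a selector could look like, written in plain Python so it runs without a database. `FakeDb` and its two lookup methods are hypothetical; they only count calls to make the difference visible. In Django the same idea would typically be expressed with `prefetch_related` or an annotated queryset.

```python
class FakeDb:
    """Hypothetical in-memory stand-in that counts 'queries'."""

    def __init__(self, enrolled, present):
        self.enrolled = enrolled  # course_id -> set of student ids
        self.present = present    # lecture_id -> set of student ids
        self.query_count = 0

    def students_for_courses(self, course_ids):
        self.query_count += 1  # one batched lookup for all courses
        return {cid: set(self.enrolled.get(cid, ())) for cid in course_ids}

    def present_for_lectures(self, lecture_ids):
        self.query_count += 1  # one batched lookup for all lectures
        return {lid: set(self.present.get(lid, ())) for lid in lecture_ids}


def get_not_present_students(*, db, lectures):
    """Selector: absent students for many lectures using two batched lookups.

    `lectures` is a list of (lecture_id, course_id) pairs.
    """
    enrolled = db.students_for_courses({course_id for _, course_id in lectures})
    present = db.present_for_lectures({lecture_id for lecture_id, _ in lectures})
    return {
        lecture_id: enrolled[course_id] - present[lecture_id]
        for lecture_id, course_id in lectures
    }
```

The number of lookups stays at two regardless of how many lectures are listed, whereas the property version issues queries once per lecture.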

APIs

Example:

class CourseListApi(AuthMixin, APIView):
    class OutputSerializer(serializers.ModelSerializer):
        class Meta:
            model = Course
            fields = ('id', 'name', 'start_date', 'end_date')

    def get(self, request, course_id):
        course = get_course(id=course_id)
        data = self.OutputSerializer(course).data
        return Response(data)

class CourseCreateApi(AuthMixin, APIView):
    class InputSerializer(serializers.Serializer):
        name = serializers.CharField()
        start_date = serializers.DateField()
        end_date = serializers.DateField()

    def post(self, request):
        serializer = self.InputSerializer(data=request.data)
        serializer.is_valid(raise_exception=True)

        create_course(**serializer.validated_data)

        return Response(status=status.HTTP_201_CREATED)

APIs - Serializers

Testing models

If there is no complex logic on the model methods, tests don’t actually need to touch the database. Test custom validation logic.
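A small sketch of testing validation logic on an instance that is never saved. The `Course` class here is a hypothetical plain-Python stand-in, and the date rule is an assumption made for the example; a real Django model would raise `ValidationError` from `clean` rather than `ValueError`.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Course:
    """Hypothetical stand-in for a model with custom validation."""
    name: str
    start_date: date
    end_date: date

    def clean(self) -> None:
        # Custom rule under test: a course cannot end before it starts.
        if self.end_date < self.start_date:
            raise ValueError("end_date cannot be before start_date")


def is_valid(course: Course) -> bool:
    """Helper for tests: True when clean() accepts the instance."""
    try:
        course.clean()
    except ValueError:
        return False
    return True
```

The test only builds objects in memory and calls `clean`, so no database setup is involved.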

Testing services

Testing APIs

TL;DR

Avoid business logic in

Selectors & Services:

Follow up talks

On separation of concerns and identifying boundaries in the system: Ruby Conf 12 - Boundaries by Gary Bernhardt.

On testing: Proper Django Testing by Martin Angelov

Boundaries

Reference: https://www.youtube.com/watch?v=yTkzNHF6rMs

Testing in isolation involves mocking everything that interacts with the feature under test.

Mocking can be deceitful

There are three main advantages to mocking/stubbing:

These are balanced out by the drawback of testing your implementation in a fake ecosystem: with mocking, you can never be fully sure that your mocks model the real system faithfully.

This creates situations where tests pass but the system breaks under the tested conditions.
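The following sketch illustrates that failure mode with `unittest.mock.Mock`. `PaymentGateway` and `checkout` are hypothetical names invented for the example; the bug is that `checkout` calls a method the real class does not have.

```python
from unittest.mock import Mock


class PaymentGateway:
    """Hypothetical real dependency; note the method is called `charge`."""

    def charge(self, amount_cents: int) -> bool:
        return amount_cents > 0


def checkout(gateway, amount_cents: int) -> str:
    # Bug: the real class has no `charge_card` method.
    ok = gateway.charge_card(amount_cents)
    return "paid" if ok else "declined"


def run_with_mock() -> str:
    gateway = Mock()  # accepts any attribute, so the wrong name goes unnoticed
    gateway.charge_card.return_value = True
    return checkout(gateway, 500)


def run_with_real() -> str:
    try:
        return checkout(PaymentGateway(), 500)
    except AttributeError:
        return "crashed"
```

The mocked run reports a successful payment while the same code crashes against the real class.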

How to reduce the problem

There are multiple approaches to making mock and stub design more reliable.

Contract & Collaboration tests

This involves another layer of testing: we try to guarantee that our test doubles (mocks and stubs) reproduce the expected behaviour.

Contract and Collaboration tests are medium sized tests that validate how one’s own code interacts with an external dependency. Internal APIs are adapters that insulate most of your code from changes in such dependencies. (Mike Bland - Contract/Collaboration Tests and Internal APIs)

This valid alternative reinforces the definition of interfaces, both for internal and external dependencies, with the downside of increasing codebase complexity and adding additional layers of indirection.
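One way to picture a contract test is a single set of expectations that is run against both the real adapter and the test double. The sketch below assumes a hypothetical user store interface (`add`, `get_name`); the "real" implementation uses an in-memory SQLite database purely so the example is runnable.

```python
import sqlite3


class SqliteUserStore:
    """'Real' adapter, here backed by an in-memory SQLite database."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE users (email TEXT PRIMARY KEY, name TEXT)")

    def add(self, email, name):
        self.conn.execute("INSERT INTO users VALUES (?, ?)", (email, name))

    def get_name(self, email):
        row = self.conn.execute(
            "SELECT name FROM users WHERE email = ?", (email,)
        ).fetchone()
        return row[0] if row else None


class FakeUserStore:
    """Test double used by fast unit tests."""

    def __init__(self):
        self.rows = {}

    def add(self, email, name):
        self.rows[email] = name

    def get_name(self, email):
        return self.rows.get(email)


def check_user_store_contract(store) -> None:
    """Contract: expectations every implementation, real or fake, must meet."""
    assert store.get_name("ana@example.com") is None
    store.add("ana@example.com", "Ana")
    assert store.get_name("ana@example.com") == "Ana"
```

Both implementations pass this contract as written. Adding a duplicate-email case would expose a divergence, since the fake overwrites silently while SQLite raises `IntegrityError`, which is exactly the kind of drift such tests are meant to surface.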

The tools approach

In the Ruby ecosystem there’s rspec-fire, which attempts to validate that mocked contracts (such as methods or return values) match the real implementations.

Python has something similar with unittest.mock’s create_autospec, but it only validates that the mocked methods exist and are called with a compatible signature, not their return values.
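A short sketch of what `create_autospec` does and does not catch. The `Mailer` class is hypothetical; the `unittest.mock` calls are from the standard library.

```python
from unittest.mock import create_autospec


class Mailer:
    """Hypothetical real class being mocked."""

    def send(self, to: str, subject: str) -> bool:
        return True


def autospec_findings() -> dict:
    mailer = create_autospec(Mailer, instance=True)
    findings = {}

    try:
        mailer.sned("a@example.com", "hi")  # misspelled method name
    except AttributeError:
        findings["unknown_method"] = "rejected"

    try:
        mailer.send("a@example.com")  # missing the `subject` argument
    except TypeError:
        findings["wrong_signature"] = "rejected"

    # The real method returns a bool, but nothing stops this:
    mailer.send.return_value = "not a bool"
    findings["wrong_return_type"] = mailer.send("a@example.com", "hi")
    return findings
```

Unknown methods and mismatched call signatures are rejected, while a return value of the wrong type is accepted without complaint.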

Typing

By making the test double a subclass of the real class, a static type checker can enforce type safety and catch both invalid methods and invalid return types.
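A minimal sketch of this idea, assuming a hypothetical `Mailer` class and a type checker such as mypy run separately from the tests. The checking itself happens statically; at runtime the fake simply records calls.

```python
class Mailer:
    """Hypothetical real class with a side effect we want to avoid in tests."""

    def send(self, to: str, subject: str) -> bool:
        raise RuntimeError("would talk to a real mail server")


class FakeMailer(Mailer):
    """Test double that inherits the real interface."""

    def __init__(self) -> None:
        self.sent: list = []

    # If this override changed the parameters or the return type, a type
    # checker would report it as incompatible with the supertype.
    def send(self, to: str, subject: str) -> bool:
        self.sent.append((to, subject))
        return True


def notify(mailer: Mailer, email: str) -> bool:
    # FakeMailer is accepted here because it is a Mailer.
    return mailer.send(email, "Your subscription expired")
```

Calling a method on the fake that does not exist on `Mailer` would likewise be flagged by the checker before any test runs.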

Integration testing

See Integrated Tests Are A Scam by J.B. Rainsberger. The point is that integrated tests probably do not cover your codebase thoroughly enough, so you still need unit tests.

How to not need mocks

If a function has no side effects, meaning it receives a value, manipulates it, and returns a new value, and it has no dependencies, then we don’t need to mock anything.

How can we modify existing code to approach this theoretical function?

By using Values as Boundaries.

Compare this initial implementation:

class Sweeper:
    def sweep(self):
        for user in User.objects.all():
            if user.active and user.paid_at < datetime.now() - timedelta(30):
                UserMailer.billing_problem(user)


@fixture
def bob():
    # In libraries such as factory_boy, .create() instantiates a Python
    # object and saves it to the database.
    return UserFactory.create(active=True, paid_at=datetime.now() - timedelta(60))


@mark.describe("Sweeper")
class TestSweeper:
    @mark.context("When a subscription is expired")
    @mark.it("emails the user")
    def test_emails_users_with_expired_subs(self, bob, mocker):
        billing_problem_mock = mocker.patch("UserMailer.billing_problem", return_value=None)
        sweeper = Sweeper()
        sweeper.sweep()
        billing_problem_mock.assert_called_once_with(bob)

To this modified version:

class ExpiredUsers:
    def for_users(self, users):
        expired = []
        for user in users:
            if user.active and user.paid_at < datetime.now() - timedelta(30):
                expired.append(user)
        return expired


class Sweeper:
    def sweep(self):
        expired_users = ExpiredUsers()
        for user in expired_users.for_users(User.objects.all()):
            UserMailer.billing_problem(user)


@fixture
def bob():
    # .build() should instantiate the python object without
    # saving to the database
    return UserFactory.build(active=True, paid_at=datetime.now() - timedelta(60))

@mark.describe("ExpiredUsers")
class TestExpiredUsers:
    @mark.context("When a list of users is received")
    @mark.it("returns the ones that are expired")
    def test_returns_expired_users(self, bob):
        expired_users = ExpiredUsers()
        assert expired_users.for_users([bob]) == [bob]


@mark.describe("Sweeper")
class TestSweeper:
    @mark.context("When a subscription is expired")
    @mark.it("emails the user")
    def test_emails_users_with_expired_subs(self, bob, mocker):
        mocker.patch("ExpiredUsers.for_users", return_value=[bob])
        billing_problem_mock = mocker.patch("UserMailer.billing_problem", return_value=None)

        sweeper = Sweeper()
        sweeper.sweep()
        billing_problem_mock.assert_called_once_with(bob)

While we still need to use a mock, we have a much clearer separation of concerns.

The ExpiredUsers class has a pure method (no side effects, no dependency on external state). The Sweeper class is now responsible for managing the dependencies.
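One remaining detail is that `for_users` still reads the clock through `datetime.now()`. A possible refinement, shown here as a sketch that goes beyond the code above, is to pass the current time in as a value. The `User` dataclass, the function name `expired_users`, and the `grace` parameter are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class User:
    """Hypothetical plain value standing in for the model."""
    name: str
    active: bool
    paid_at: datetime


def expired_users(users, *, now: datetime, grace: timedelta = timedelta(days=30)):
    """Pure: the result depends only on the arguments."""
    cutoff = now - grace
    return [user for user in users if user.active and user.paid_at < cutoff]
```

With the time supplied by the caller, the test can use fixed dates and needs neither a mock nor a database:

```python
now = datetime(2024, 1, 31)
bob = User("bob", active=True, paid_at=now - timedelta(days=60))
ann = User("ann", active=True, paid_at=now - timedelta(days=5))
assert expired_users([bob, ann], now=now) == [bob]
```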

In a way, the Sweeper class is an orchestration layer (imperative shell) around a logical layer (the functional core).

The Core should be heavy on paths (possible code outcomes, ifs, etc) and light on dependencies, meaning it is more easily isolated.

The Shell should be the opposite: heavy on dependencies and light on paths, which makes it a great candidate for integration testing.
