Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed #35967 -- Deferred test suite fixtures serialization after all dbs setup. #18949

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

charettes
Copy link
Member

Trac ticket number

ticket-35967

Branch description

Prevent unintended (and hard to control) interactions with non-test databases during test databases in-memory fixtures serialization.

@charettes charettes marked this pull request as ready for review December 18, 2024 20:48
Copy link
Member

@jacobtylerwalls jacobtylerwalls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, left a minor question or two about tests.

Comment on lines 862 to 855
The ``create_test_db`` function use to accept a ``serialize``
parameter that would automatically call ``serialize_db_to_string`` but
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The ``create_test_db`` function use to accept a ``serialize``
parameter that would automatically call ``serialize_db_to_string`` but
The ``create_test_db`` function used to accept a ``serialize``
parameter that would automatically call ``serialize_db_to_string``, but

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks will change!

Comment on lines 941 to 947
with mock.patch(
"django.db.backends.dummy.base.DatabaseWrapper.creation_class"
) as mocked_db_creation:
with mock.patch("django.test.utils.connections", new=tested_connections):
self.runner_instance.setup_databases()
with (
mock.patch(
"django.db.backends.dummy.base.DatabaseWrapper.creation_class"
) as mocked_db_creation,
mock.patch("django.test.utils.connections", new=tested_connections),
):
self.runner_instance.setup_databases()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could probably avoid doing this cleanup?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, I'll revert.

warnings.warn(
"DatabaseCreation.create_test_db(serialize) is deprecated. "
"Call DatabaseCreation.serialize_test_db() once all test "
"databases are setup instead if you need fixtures persistence "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"databases are setup instead if you need fixtures persistence "
"databases are set up instead if you need fixtures persistence "

# This slightly horrific process is so people who are testing on databases
# without transactions or using TransactionTestCase still get a clean
# database on every test run.
for serialize_connection in serialize_connections:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing we discussed in the ticket was the fact that this block needs to come after the test mirrors just above. Moving this back (to the wrong place) doesn't cause a test failure as far as I can tell. Do you think it's worth adding some coverage for that, or it is too complicated?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think it's worth adding some coverage for that, or it is too complicated?

I did initially try by having execute against the test database connections raise exceptions but I couldn't get it to work properly under all circumstances because of the complexity revolving around creating a second nesting of test database connections.

self.assertIs(mark_expected_failures_and_skips.called, False)
finally:
with mock.patch.object(creation, "_destroy_test_db"):
creation.destroy_test_db(old_database_name, verbosity=0)

@mock.patch("django.db.migrations.executor.MigrationExecutor.migrate")
@mock.patch.object(BaseDatabaseCreation, "serialize_db_to_string")
def test_serialize_deprecation(self, serialize_db_to_string, *mocked_objects):
Copy link
Member

@jacobtylerwalls jacobtylerwalls Dec 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you mentioned on the ticket the ambient strategy for the surrounding tests is to use mocking, so that seems fine to me for the current purpose.

I did, however, take a stab at asserting against the result and came up with this. Would you have an idea about why this fails? Have I mocked something only partially? (It passes without the database router override.)

diff --git a/tests/migration_test_data_persistence/migrations/0003_add_editor.py b/tests/migration_test_data_persistence/migrations/0003_add_editor.py
new file mode 100644
index 0000000000..3a803280ed
--- /dev/null
+++ b/tests/migration_test_data_persistence/migrations/0003_add_editor.py
@@ -0,0 +1,43 @@
+from django.db import migrations, models
+
+
+def add_editor(apps, schema_editor):
+    editor = (
+        apps.get_model("migration_test_data_persistence", "Editor")
+        .objects.using(
+            schema_editor.connection.alias,
+        )
+        .create()
+    )
+    apps.get_model("migration_test_data_persistence", "Book").objects.using(
+        schema_editor.connection.alias,
+    ).first().editors.add(editor)
+
+
+class Migration(migrations.Migration):
+    dependencies = [("migration_test_data_persistence", "0002_add_book")]
+
+    operations = [
+        migrations.CreateModel(
+            name="Editor",
+            fields=[
+                (
+                    "id",
+                    models.AutoField(
+                        verbose_name="ID",
+                        primary_key=True,
+                        serialize=False,
+                        auto_created=True,
+                    ),
+                ),
+            ],
+            options={},
+            bases=(models.Model,),
+        ),
+        migrations.AddField(
+            "Book",
+            name="editors",
+            field=models.ManyToManyField(to="Editor"),
+        ),
+        migrations.RunPython(add_editor, migrations.RunPython.noop),
+    ]
diff --git a/tests/migration_test_data_persistence/models.py b/tests/migration_test_data_persistence/models.py
index c1572d5ddf..30a44571c7 100644
--- a/tests/migration_test_data_persistence/models.py
+++ b/tests/migration_test_data_persistence/models.py
@@ -1,8 +1,12 @@
 from django.db import models
 
 
+class Editor(models.Model): ...  # noqa
+
+
 class Book(models.Model):
     title = models.CharField(max_length=100)
+    editors = models.ManyToManyField(Editor)
 
 
 class Unmanaged(models.Model):
diff --git a/tests/test_runner/tests.py b/tests/test_runner/tests.py
index 2a4fdc3fca..265abdb0ba 100644
--- a/tests/test_runner/tests.py
+++ b/tests/test_runner/tests.py
@@ -3,6 +3,7 @@ Tests for django test runner
 """
 
 import collections.abc
+import json
 import multiprocessing
 import os
 import sys
@@ -31,6 +32,8 @@ from django.test.utils import (
     dependency_ordered,
     get_unique_databases_and_mirrors,
     iter_test_cases,
+    modify_settings,
+    override_settings,
 )
 from django.utils.version import PY312
 
@@ -930,25 +933,35 @@ class SetupDatabasesTests(unittest.TestCase):
                 tested_connections["default"].settings_dict["NAME"], old_name
             )
 
-    def test_serialization(self):
+    def test_serialization_with_router(self):
+        class ReadReplicaRouter:
+            def db_for_read(self, model, **hints):
+                return "replica"
+
         tested_connections = db.ConnectionHandler(
             {
                 "default": {
-                    "ENGINE": "django.db.backends.dummy",
+                    "ENGINE": "django.db.backends.sqlite3",
+                },
+                "replica": {
+                    "ENGINE": "django.db.backends.sqlite3",
+                    "TEST": {"MIRROR": "default"},
                 },
             }
         )
         with (
-            mock.patch(
-                "django.db.backends.dummy.base.DatabaseWrapper.creation_class"
-            ) as mocked_db_creation,
             mock.patch("django.test.utils.connections", new=tested_connections),
+            mock.patch("django.db.models.query.connections", new=tested_connections),
+            override_settings(DATABASE_ROUTERS=[ReadReplicaRouter()]),
+            modify_settings(
+                INSTALLED_APPS={"append": "migration_test_data_persistence"}
+            ),
         ):
-            self.runner_instance.setup_databases()
-        mocked_db_creation.return_value.create_test_db.assert_called_once_with(
-            verbosity=0, autoclobber=False, keepdb=False
-        )
-        mocked_db_creation.return_value.serialize_db_to_string.assert_called_once_with()
+            tuples = DiscoverRunner(verbosity=0).setup_databases()
+        default_wrapper = tuples[0][0]
+        serialized = default_wrapper._test_serialized_contents
+        books = [obj for obj in json.loads(serialized) if obj["model"].endswith("book")]
+        self.assertEqual(books[0]["fields"]["editors"], [1])
 
 
 @skipUnlessDBFeature("supports_sequence_reset")

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the late answer here @jacobtylerwalls but from my local testing when I was trying to also trigger the failure without heavily relying on mocking my conclusion was that there was way to many combination of variables such as in-memory vs on-disk databases, transaction wrapping or not, the fact we're dealing with nested levels of test connections to achieve a of integration testing here.

I know this is not a satisfactory answer so I'll try to pull your test and get to the bottom of it.

Copy link
Member Author

@charettes charettes Jan 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok so starting from the following rebased patch

diff --git a/tests/migration_test_data_persistence/migrations/0003_add_editor.py b/tests/migration_test_data_persistence/migrations/0003_add_editor.py
new file mode 100644
index 0000000000..3a803280ed
--- /dev/null
+++ b/tests/migration_test_data_persistence/migrations/0003_add_editor.py
@@ -0,0 +1,43 @@
+from django.db import migrations, models
+
+
+def add_editor(apps, schema_editor):
+    editor = (
+        apps.get_model("migration_test_data_persistence", "Editor")
+        .objects.using(
+            schema_editor.connection.alias,
+        )
+        .create()
+    )
+    apps.get_model("migration_test_data_persistence", "Book").objects.using(
+        schema_editor.connection.alias,
+    ).first().editors.add(editor)
+
+
+class Migration(migrations.Migration):
+    dependencies = [("migration_test_data_persistence", "0002_add_book")]
+
+    operations = [
+        migrations.CreateModel(
+            name="Editor",
+            fields=[
+                (
+                    "id",
+                    models.AutoField(
+                        verbose_name="ID",
+                        primary_key=True,
+                        serialize=False,
+                        auto_created=True,
+                    ),
+                ),
+            ],
+            options={},
+            bases=(models.Model,),
+        ),
+        migrations.AddField(
+            "Book",
+            name="editors",
+            field=models.ManyToManyField(to="Editor"),
+        ),
+        migrations.RunPython(add_editor, migrations.RunPython.noop),
+    ]
diff --git a/tests/migration_test_data_persistence/models.py b/tests/migration_test_data_persistence/models.py
index c1572d5ddf..3d4921e4e9 100644
--- a/tests/migration_test_data_persistence/models.py
+++ b/tests/migration_test_data_persistence/models.py
@@ -1,8 +1,13 @@
 from django.db import models


+class Editor(models.Model):
+    pass
+
+
 class Book(models.Model):
     title = models.CharField(max_length=100)
+    editors = models.ManyToManyField(Editor)


 class Unmanaged(models.Model):
diff --git a/tests/test_runner/tests.py b/tests/test_runner/tests.py
index d034cd7301..6096eaa4a4 100644
--- a/tests/test_runner/tests.py
+++ b/tests/test_runner/tests.py
@@ -3,6 +3,7 @@
 """

 import collections.abc
+import json
 import multiprocessing
 import os
 import sys
@@ -31,6 +32,8 @@
     dependency_ordered,
     get_unique_databases_and_mirrors,
     iter_test_cases,
+    modify_settings,
+    override_settings,
 )
 from django.utils.version import PY312

@@ -948,6 +951,36 @@ def test_serialization(self):
         )
         mocked_db_creation.return_value.serialize_db_to_string.assert_called_once_with()

+    def test_serialization_with_router(self):
+        class ReadReplicaRouter:
+            def db_for_read(self, model, **hints):
+                return "replica"
+
+        tested_connections = db.ConnectionHandler(
+            {
+                "default": {
+                    "ENGINE": "django.db.backends.sqlite3",
+                },
+                "replica": {
+                    "ENGINE": "django.db.backends.sqlite3",
+                    "TEST": {"MIRROR": "default"},
+                },
+            }
+        )
+        with (
+            mock.patch("django.test.utils.connections", new=tested_connections),
+            mock.patch("django.db.models.query.connections", new=tested_connections),
+            override_settings(DATABASE_ROUTERS=[ReadReplicaRouter()]),
+            modify_settings(
+                INSTALLED_APPS={"append": "migration_test_data_persistence"}
+            ),
+        ):
+            tuples = DiscoverRunner(verbosity=0).setup_databases()
+        default_wrapper = tuples[0][0]
+        serialized = default_wrapper._test_serialized_contents
+        books = [obj for obj in json.loads(serialized) if obj["model"].endswith("book")]
+        self.assertEqual(books[0]["fields"]["editors"], [1])
+

 @skipUnlessDBFeature("supports_sequence_reset")
 class AutoIncrementResetTest(TransactionTestCase):

The first change that needs to be made is to mock another reference to django.db.models.connections this time in django.db.models.sql.query. Any potential reference to connections has to be mocked

diff --git a/tests/test_runner/tests.py b/tests/test_runner/tests.py
index 6096eaa4a4..7e32ed22bb 100644
--- a/tests/test_runner/tests.py
+++ b/tests/test_runner/tests.py
@@ -969,6 +969,9 @@ def db_for_read(self, model, **hints):
         )
         with (
             mock.patch("django.test.utils.connections", new=tested_connections),
+            mock.patch(
+                "django.db.models.sql.query.connections", new=tested_connections
+            ),
             mock.patch("django.db.models.query.connections", new=tested_connections),
             override_settings(DATABASE_ROUTERS=[ReadReplicaRouter()]),
             modify_settings(

you then run into a django.db.utils.OperationalError: database table is locked error which too me indicated that there was remaining references to the original django.db.models.connections interfering so I adjusted the mocking strategy to target django.db.connections.__dict__ instead to avoid the whack-a-whole game of mocking each reference through imports.

diff --git a/tests/test_runner/tests.py b/tests/test_runner/tests.py
index 6096eaa4a4..ab7a81ef78 100644
--- a/tests/test_runner/tests.py
+++ b/tests/test_runner/tests.py
@@ -968,8 +968,9 @@ def db_for_read(self, model, **hints):
             }
         )
         with (
-            mock.patch("django.test.utils.connections", new=tested_connections),
-            mock.patch("django.db.models.query.connections", new=tested_connections),
+            mock.patch.dict(
+                db.connections.__dict__, tested_connections.__dict__, clear=True
+            ),
             override_settings(DATABASE_ROUTERS=[ReadReplicaRouter()]),
             modify_settings(
                 INSTALLED_APPS={"append": "migration_test_data_persistence"}

And the test is now passing and still failing against main 🎉

…dbs setup.

While the top-level objects fed to serialization are bound to the test database
being created nothing prevents code invoked during serialization from
performing queries against other connections entries that haven't been swapped
yet.

The reported example of that is a database router directing all reads to a test
mirror for a set of models involving auto-created many-to-many fields. It might
be tempting to address the many-to-many field case but this a symptom of a
larger problem where the test framework yields the flow execution to user code
that could interact with non-test databases in unexpected ways.

Deferring test database fixture serialization until the point where all
connections entries have been swapped for their test equivalent ensures that no
code triggered during serialization can interact with non-test databases.

Thanks Jake Howard for the report and Jacob Walls for the initial
investigation.
…ze).

Given there are no longer any internal usages of serialize=True and it poses
a risk to non-test databases integrity it seems appropriate to deprecate it.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants