-
-
Notifications
You must be signed in to change notification settings - Fork 32.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixed #36088 -- Avoided unnecessary DEFAULT usage on bulk_create(). #18909
base: main
Are you sure you want to change the base?
Conversation
e645795
to
9861770
Compare
Just out of technical curiosity: how much of a performance gain can we expect from this? I imagine it depends on the number of inserted models, but maybe you ran tests with a fairly large sample dataset? |
@alexandernst this PR is an extension of #18847 that allows the optimization implemented over there to be enabled even when the model includes fields making use of I haven't tested how much of a speed up not specifying As for the |
It might make sense to split out the commit adding |
Good idea @LilyFoote; I'll work on that later today. |
9861770
to
8303ded
Compare
8303ded
to
b7dcafb
Compare
b7dcafb
to
73934aa
Compare
django/db/models/sql/compiler.py
Outdated
elif not supports_default_keyword_in_bulk_insert: | ||
# If the field cannot be excluded from the INSERT for the | ||
# reasons listed above and the backend doesn't support the | ||
# DEFAULT keyword each values must be expanded into their | ||
# underlying expressions. | ||
prepared_db_default = field_prepare(field.db_default) | ||
field_values = [ | ||
( | ||
prepared_db_default | ||
if isinstance(value, DatabaseDefault) | ||
else value | ||
) | ||
for value in field_values | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This branch is already covered by tests and has simply be moved over from _prepare_for_bulk_create
.
73934aa
to
4e4f400
Compare
django/db/models/sql/compiler.py
Outdated
prepared_db_default = field_prepare(field.db_default) | ||
field_values = [ | ||
( | ||
prepared_db_default |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
prepared_db_default = field_prepare(field.db_default) | |
field_values = [ | |
( | |
prepared_db_default | |
field_values = [ | |
( | |
field_prepare(field.db_default) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is done explicitly. field.db_default
is the same for all instances and field_prepare
is not cheap so it's better to prepare it once and use the same value than prepare it N times.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, gotcha!
django/db/models/sql/compiler.py
Outdated
if field.has_db_default(): | ||
# If all values are DEFAULT don't include the field and its | ||
# values in the query as they are redundant and could prevent | ||
# optimizations. This cannot be done if we're dealing with the | ||
# last field as INSERT statements require at least one. | ||
if len(fields) > 1 and all( | ||
isinstance(value, DatabaseDefault) for value in field_values | ||
): | ||
fields.remove(field) | ||
continue | ||
elif not supports_default_keyword_in_bulk_insert: | ||
# If the field cannot be excluded from the INSERT for the | ||
# reasons listed above and the backend doesn't support the | ||
# DEFAULT keyword each values must be expanded into their | ||
# underlying expressions. | ||
prepared_db_default = field_prepare(field.db_default) | ||
field_values = [ | ||
( | ||
prepared_db_default | ||
if isinstance(value, DatabaseDefault) | ||
else value | ||
) | ||
for value in field_values | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be easier to read if we negate some checks and explicitly continue
:
if field.has_db_default(): | |
# If all values are DEFAULT don't include the field and its | |
# values in the query as they are redundant and could prevent | |
# optimizations. This cannot be done if we're dealing with the | |
# last field as INSERT statements require at least one. | |
if len(fields) > 1 and all( | |
isinstance(value, DatabaseDefault) for value in field_values | |
): | |
fields.remove(field) | |
continue | |
elif not supports_default_keyword_in_bulk_insert: | |
# If the field cannot be excluded from the INSERT for the | |
# reasons listed above and the backend doesn't support the | |
# DEFAULT keyword each values must be expanded into their | |
# underlying expressions. | |
prepared_db_default = field_prepare(field.db_default) | |
field_values = [ | |
( | |
prepared_db_default | |
if isinstance(value, DatabaseDefault) | |
else value | |
) | |
for value in field_values | |
] | |
if not field.has_db_default(): | |
value_cols.append(field_values) | |
continue | |
# If all values are DEFAULT don't include the field and its | |
# values in the query as they are redundant and could prevent | |
# optimizations. This cannot be done if we're dealing with the | |
# last field as INSERT statements require at least one. | |
if len(fields) > 1 and all( | |
isinstance(value, DatabaseDefault) for value in field_values | |
): | |
fields.remove(field) | |
continue | |
if supports_default_keyword_in_bulk_insert: | |
value_cols.append(field_values) | |
continue | |
# If the field cannot be excluded from the INSERT for the | |
# reasons listed above and the backend doesn't support the | |
# DEFAULT keyword each values must be expanded into their | |
# underlying expressions. | |
prepared_db_default = field_prepare(field.db_default) | |
field_values = [ | |
( | |
prepared_db_default | |
if isinstance(value, DatabaseDefault) | |
else value | |
) | |
for value in field_values | |
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like it!
django/db/models/sql/compiler.py
Outdated
if len(fields) > 1 and all( | ||
isinstance(value, DatabaseDefault) for value in field_values | ||
): | ||
fields.remove(field) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is mutating self.query.fields
here safe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. In practice InsertQuery
is terminal and not chained but it's just bad practice to mutate an object like that.
4e4f400
to
bb42041
Compare
Thanks for the review @LilyFoote. All comments should be addressed. |
When all values of a field with a db_default are DatabaseDefault, which is the case most of the time, there is no point in specifying explicit DEFAULT for all INSERT VALUES as that's what the database will do anyway if not specified. In the case of Postgresql doing so can even be harmful as it prevents the usage of the UNNEST strategy and in the case of Oracle, which doesn't support the usage of the DEFAULT keyword, it unnecessarily requires providing literal db defaults. Thanks Lily Foote for the review.
bb42041
to
b9d920e
Compare
buildbot, test on oracle. |
Trac ticket number
ticket-36088
Branch description
This is still a draft that builds on top of #18847 to demonstrate the benefit of avoid to use
DEFAULT
when unneessary.For example, given the following model
Then
will result in the following SQL
while
DEFAULT
could be entirely omitted since well, it's the database default?I discovered this case when testing the
UNNSET
strategy as unfortunately the unnecessary usage ofDEFAULT
disables the optimization.