Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add pickle support to dynamically created models and generics #1686

Merged

Conversation

Bobronium
Copy link
Contributor

@Bobronium Bobronium commented Jul 5, 2020

Change Summary

  • Always set __module__ to created class (try to find the actual one, in case of failure fallback to pydantic.main)
  • Set reference of created concrete model to it's module to allow pickling (not applied to models created in functions)
  • Add pydantic.utils.ensure_picklable which will allow to make model, created inside a functions or/and referenced with a different name from class name pickable

Related issue number

Closes #1667
Closes #1734

Checklist

  • Unit tests for the changes exist
  • Tests pass on CI and coverage remains at 100%
  • Documentation reflects the changes where applicable
  • changes/<pull request or issue id>-<github username>.md file added describing change
    (see changes/README.md for details)

@codecov
Copy link

codecov bot commented Jul 5, 2020

Codecov Report

Merging #1686 into master will decrease coverage by 0.05%.
The diff coverage is 100.00%.

@@             Coverage Diff             @@
##            master    #1686      +/-   ##
===========================================
- Coverage   100.00%   99.94%   -0.06%     
===========================================
  Files           21       21              
  Lines         3925     3948      +23     
  Branches       788      790       +2     
===========================================
+ Hits          3925     3946      +21     
- Misses           0        1       +1     
- Partials         0        1       +1     
Impacted Files Coverage Δ
pydantic/generics.py 100.00% <100.00%> (ø)
pydantic/typing.py 98.11% <0.00%> (-1.89%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 29e3877...028444b. Read the comment docs.

@PrettyWood
Copy link
Collaborator

PrettyWood commented Aug 7, 2020

@MrMrRobat I reckon your PR also fixes #1734 so #1757 can probably be closed. I let it open in the meantime but maybe you could add my test in your PR and add #1734 in closed issues ;)

tests/test_utils.py Outdated Show resolved Hide resolved
@Bobronium Bobronium force-pushed the allow-dynamically-created-models-pickling branch from 8201b6e to 3daa8df Compare August 7, 2020 12:26
Bobronium and others added 2 commits August 7, 2020 15:28
Co-authored-by: PrettyWood <em.jolibois@gmail.com>
@Bobronium
Copy link
Contributor Author

@PrettyWood, test I found in #1757 seems to work regardless of changes I made 🤔.
I assume example from #1734 will work as a test, as it reproduce the issue

@PrettyWood
Copy link
Collaborator

@MrMrRobat weird 🤔 I just tried with the test on master and it actually raises

>       globalns = sys.modules[cls.__module__].__dict__.copy()
E       KeyError: None

pydantic/main.py:684: KeyError

@Bobronium
Copy link
Contributor Author

Ah, sorry, now I get it. I just removed setting model attribute to a module, but kept setting of a module to a model

@Bobronium
Copy link
Contributor Author

Bobronium commented Aug 25, 2020

So, pickling can't be done if model has no global reference somewhere in code. Pickle threats path in __qualname__ of as such reference.

Dynamically created models don't have global references therefore can not be pickled, unless referenced manually.

We can't set global references to each dynamically created model, as it will either override previous created model, or/and lead to memory leaks.

But turns out, it's not trivial task to determine if model was created globally or locally.

It can be done like this in normal conditions:

def is_call_from_module():
    import inspect

    return inspect.stack()[2].function == '<module>'

However, if code is executed in debugger, value of a inspect.stack()[2].function would be evaluate_expression. Obviously, we can't predict any situation like this.

So, we either need a stable way to detect such differences, or think about other implementations.

@Bobronium
Copy link
Contributor Author

typing.Generic, for example, doesn't create new classes for concrete types, but instead uses special class instances and therefore can be pickled. However I'm not sure how it can be used for pydantic models.

@Bobronium
Copy link
Contributor Author

Bobronium commented Aug 29, 2020

I finally found the way to determine call from module, but it won't work with compiled code, but it mostly needed in generics, and since they won't work in compiled code anyway, it's not a problem.

However get_caller_module_name doesn't work in compiled code either, and that's bring inconsistency if we will use it in create_model

The question is: do we need that functionality for non-compiled code, or we better stick with the way it will work the same on both versions?

What will work in compiled and non-compiled code:

  • Add default value to create_model argument __module__=__name__ ('pydantic.main')
  • Add ensure_picklable function wicth will help to ensure that dynamically created models are pickable (all it actually does is create reference in model.__module__ by model.__name__, if it's not used by another object)
import pickle
from pydantic import create_model
# from pydantic.utils import ensure_picklable

# this should not be done this way. Reference should be set globally, not dynamically,
# otherwise there is no guarantee that reference will be created by the time model will be unpickled
# model = ensure_picklable(create_model('FooModel'))
# pickle.dumps(model)  # works

BarModel = create_model('BarModel', __module__=__name__)
pickle.dumps(BarModel)  # works

Additional things that will work in non-compiled code:

  • Determine in what module model was created, so this will work without specifying module:
BarModel = create_model('BarModel')
pickle.dumps(BarModel) 

@remidebette
Copy link

Hi!

Is this blocked? Can we help?

Thank you

@Bobronium
Copy link
Contributor Author

Bobronium commented Sep 14, 2020

@remidebette, I'm trying to figure out consistent way of solving the problem acroas all platforms. Hacky staff with frames seems to not do the job. Hope, I'll have a success with implementing this with __reduce__ or __reduce_ex__ method, but need more time for that.

Until issue is resolved, you can still fix such issues by using create_model only on module level, declaring variable with the name of the model and using __module__ arg like this:

from pydantic import create_model

MyModel = create_model('MyModel', __module__=__name__)

Such models won't have any problems with pickling/forward refs/etc.

For generics you have to manually set reference to base model module, like so:

from typing import Generic, TypeVar

from pydantic.generics import GenericModel

T = TypeVar('T')
class MyGenericModel(Generic[T]. GenericModel):
    t: T

IntModel = MyGenericModel[int]
setattr(MyGenericModel.__module__, IntModel.__name__, IntModel)

Code above might be implemented in your base classe's __class_getitem__ like so:

class MyAutoGenericModel(Generic[T]. GenericModel):
    t: T

    def __class_getitem__(cls, *args, **kwargs):
        created_model = super().__class_getitem__(*args, **kwargs)
        setattr(cls.__module__, created_model.__name__, created_model)
        return created_model

@remidebette
Copy link

Thanks for the detailed explanation!

Copy link
Member

@samuelcolvin samuelcolvin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking good, let me know if I can do anything to help fix it.

I assume it's not as simple as moving is_call_from_module to genetics.py which is never compiled?

pydantic/main.py Outdated
@@ -796,12 +796,12 @@ def __values__(self) -> 'DictStrAny':
_is_base_model_class_defined = True


def create_model(
def create_model( # noqa: C901 (ignore complexity)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this needed if create_model hasn't been modified?

@@ -241,7 +243,8 @@ def unique_list(input_list: Union[List[T], Tuple[T, ...]]) -> List[T]:


def update_normalized_all(
item: Union['AbstractSetIntStr', 'MappingIntStrAny'], all_items: Union['AbstractSetIntStr', 'MappingIntStrAny'],
item: Union['AbstractSetIntStr', 'MappingIntStrAny'],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

anything changed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems like inconsistency between black versions. Which one should I use?

Comment on lines 569 to 570
except IndexError:
raise RuntimeError('This function must be used inside another function')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
except IndexError:
raise RuntimeError('This function must be used inside another function')
except IndexError as e:
raise RuntimeError('This function must be used inside another function') from e

Comment on lines 551 to 552
except IndexError:
raise RuntimeError('This function must be used inside another function')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
except IndexError:
raise RuntimeError('This function must be used inside another function')
except IndexError as e:
raise RuntimeError('This function must be used inside another function') from e



@pytest.mark.skipif(
compiled,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this skip required? Surely the code below in create_module is never compiled so should always work.

If is_call_from_module itself needs to be not compiled, I guess you need to move it into genetics.py?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, somehow I never thought about this. I did so and CI just passed!

Just to be clear: compiled code may not contain any use of GenericModel, right?

def get_current_module_name():
return get_caller_module_name()

assert get_current_module_name() == __name__
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
assert get_current_module_name() == __name__
module_name = get_current_module_name()
assert module_name == __name__, (module_name, __name__)

so we can better understand the error?

Copy link
Contributor Author

@Bobronium Bobronium Oct 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesn't pytest provide such introspection in case of error? 🤔

upd. it doesn't :(

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot to apply suggestions before moving functions and tests to generics, so have to make changes manually

@samuelcolvin
Copy link
Member

this looks great, please could you:

  • add a change description
  • fix the coverage

@Bobronium
Copy link
Contributor Author

@samuelcolvin, master CI is failing again 😕. I guess PR fromatted with different black version was merged and broke CI.

@samuelcolvin
Copy link
Member

master should now be fixed.

@Bobronium
Copy link
Contributor Author

Bobronium commented Oct 10, 2020

@samuelcolvin, FYI: test_module fixture can be used to test try_assert.py without having a separate command for that:

def test_assert_in_validators(test_module):
    test_module("try_assert.py")

Upd. Actually, in case of try_assert.py, it could be simply renamed to test_assert.py and runned by pytest withoht any problems, just by putting """PYTEST_DONT_REWRITE""" on top of the module

As for current tests, I decided to rewrite them and enhanced create_module and added run_as_module that runs tests in subprocess

pydantic/main.py Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Dynamically create_model that contains forward refs How to pickle a generic pydantic class?
4 participants