.to_dict() and .to_json() attribute names are always camel cased #63

Open
circulon opened this issue Jul 12, 2022 · 5 comments
Labels
acknowledged documentation Improvements or additions to documentation enhancement New feature or request

Comments

@circulon

circulon commented Jul 12, 2022

  • Dataclass Wizard version: 0.22.1
  • Python version: 3.9
  • Operating System: macOS 12.4

Description

The documentation is confusing and notes that attributes names are returned as camel case from .to_json() regardless of the actual attribute name in the dataclass definition.
Why is this behaviour used and not just return the attributes as per the repr()?

This breaks many logic points as the returned attributes (from .to_dict and .to_json()) do not actually reflect the attribute names in the actual dataclass.

additional observations

I really like this module as it does an excellent job of handling Types that are not handled properly by the basic json module.
Unfortunately, I would now have to write a serialization hook to restore the original attribute names so that my projects function as expected,
or build a custom mixin (which I would prefer not to do, as this module already does this) that uses simplejson for better type handling than the standard json module.
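
A serialization hook of the kind mentioned above could look roughly like the following sketch. It is only one possible approach and is not part of Dataclass Wizard: the helper names camel_to_snake and restore_snake_keys are hypothetical, string keys are assumed, and the regex covers only simple camelCase keys like the ones in the example output (acronyms are not special-cased).

```python
import re
from typing import Any

# Zero-width boundary between a lowercase letter or digit and an uppercase letter.
_CAMEL_BOUNDARY = re.compile(r"(?<=[a-z0-9])(?=[A-Z])")


def camel_to_snake(name: str) -> str:
    """Convert a simple camelCase key such as 'accessToken' to 'access_token'."""
    return _CAMEL_BOUNDARY.sub("_", name).lower()


def restore_snake_keys(value: Any) -> Any:
    """Recursively rename dict keys (assumed to be strings); leave other values as-is."""
    if isinstance(value, dict):
        return {camel_to_snake(k): restore_snake_keys(v) for k, v in value.items()}
    if isinstance(value, list):
        return [restore_snake_keys(v) for v in value]
    return value


# A dict shaped like the camel-cased to_dict() output reported in this issue.
camel = {
    "userId": "1235-1235",
    "accessToken": "abcd-12345-hjgas-12365",
    "expires": 3600,
    "someType": "hello",
}
restored = restore_snake_keys(camel)
print(restored)
```

Post-processing like this adds a second pass over every payload, which is part of why a built-in option would be preferable.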

What I Did

example

from dataclasses import dataclass
from typing import Optional
from dataclass_wizard import JSONWizard

@dataclass
class Something(JSONWizard):
    user_id: Optional[str]
    access_token: str
    expires: int 
    some_type: str 

some = Something(
  expires=3600,
  some_type="hello",
  user_id="1235-1235",
  access_token="abcd-12345-hjgas-12365",
)

>>> print(f"class reps: {repr(some)}")
class reps: Something(user_id='1235-1235', access_token='abcd-12345-hjgas-12365', expires=3600, some_type='hello')
>>> print(f"class dict: {some.to_dict()}")
class dict: {'userId': '1235-1235', 'accessToken': 'abcd-12345-hjgas-12365', 'expires': 3600, 'someType': 'hello'}
>>> print(f"class json: {some.to_json()}")
class json: {"userId": "1235-1235", "accessToken": "abcd-12345-hjgas-12365", "expires": 3600, "someType": "hello"}
>>> 

expected results

The repr(), .to_dict() & .to_json() should all have the same attribute names
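
For comparison, here is a minimal sketch using only the standard library that shows the behaviour expected here: dataclasses.asdict() and json.dumps() keep the field names exactly as declared, so the output can be passed straight back to the constructor. It uses no Dataclass Wizard code, so it does not get that module's extra type handling.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class Something:
    user_id: Optional[str]
    access_token: str
    expires: int
    some_type: str


some = Something(
    user_id="1235-1235",
    access_token="abcd-12345-hjgas-12365",
    expires=3600,
    some_type="hello",
)

# Keys match the dataclass field names, in declaration order.
as_dict = asdict(some)
as_json = json.dumps(as_dict)

# Because the keys are unchanged, the payload round-trips through the constructor.
round_tripped = Something(**json.loads(as_json))
print(as_dict)
print(round_tripped == some)
```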

@circulon circulon changed the title .to_dict() and .to_json() output is always camal cased .to_dict() and .to_json() attribute names are always camel cased Jul 12, 2022
@rnag
Owner

rnag commented Aug 15, 2022

Hi @circulon, thanks for opening this issue. I was curious to know if this handy workaround that was posted earlier in another issue could work for you, at least in the meantime.

Also including it below, just for completeness.

from dataclass_wizard import JSONWizard, DumpMeta

class JSONSnakeWizard(JSONWizard):
    """Helper for JSONWizard that ensures dumping to JSON puts keys in snake_case"""
    def __init_subclass__(cls, str=True):
        """Method for binding child class to DumpMeta"""
        super().__init_subclass__(str)
        DumpMeta(key_transform='SNAKE').bind_to(cls)

Then the only other change would be to update code to subclass from JSONSnakeWizard instead:

class Something(JSONSnakeWizard):
    ...

@rnag rnag added enhancement New feature or request acknowledged labels Aug 15, 2022
@rnag rnag added the documentation Improvements or additions to documentation label Oct 11, 2022
@rnag
Owner

rnag commented Oct 11, 2022

The documentation is confusing and notes that attributes names are returned as camel case from .to_json() regardless of the actual attribute name in the dataclass definition.

Why is this behaviour used and not just return the attributes as per the repr()?

This breaks many logic points as the returned attributes (from .to_dict and .to_json()) do not actually reflect the attribute names in the actual dataclass.

This is a very good point, and the short answer is that when I was originally designing this library, it just "made sense" at the time. I.e., when dealing with JSON (which stands for JavaScript Object Notation), it made sense to use the JS convention for key names, which is camelCase rather than snake_case. Of course, I can now understand why that would be confusing when working in Python, where attribute names are snake-cased by convention.

So, just adding a note: the plan is that this case will likely be addressed in the next major release (still TBD). I.e., attribute or key names will be returned "unchanged" as part of the dump process, by default. For example, if attribute or field names are snake-cased, they should also be snake-cased in the JSON object returned when to_dict or to_json is called; if field names are camel-cased, they should likewise be retained as camel-cased in the JSON output.

I plan to add a milestone to track this. Note, however, that it will likely need to be implemented in a major version release (rather than a minor release), as this will be a "breaking" change, so to speak. I definitely agree this is a good change to implement, so that there is less confusion overall.

@rnag rnag closed this as completed Oct 11, 2022
@rnag
Owner

rnag commented Nov 27, 2024

I know it's been a while, but it's 2024 and a lot of changes have been made, and on the roadmap for V1 is to ensure no key transform in the dump process.

Accordingly, I've added a Mixin class JSONPyWizard that does exactly this, and also added a note that this will be the default behavior in V1.

@rnag
Owner

rnag commented Nov 27, 2024

Re-opening this issue because I do 100% understand where you're coming from. I've also had similar trouble lately, and realized that the design decision of camelCase was perhaps an ill-advised choice 😞 .

That said, the year 2024 is winding down, and on the roadmap for V1 is to ensure no key transform in the dump process.

Accordingly, I've added a Mixin class JSONPyWizard that does exactly this, and also added a note that this will be the default behavior in V1.

Feel free to follow my announcement on #153 to keep up-to-date on what's expected in V1. Thanks!

@rnag rnag reopened this Nov 27, 2024
@circulon
Author

circulon commented Nov 27, 2024

@rnag
Thanks for reopening this, I ended up using something else...

In my case I needed performance for serialization of complex nested dataclasses.
I found a performance enhancement for asdict() and astuple() in Python 3.12.
I based a gist dataclass_util.py on this for use with Python 3.11 with additional enhancements.
This reduced runtimes in my AWS Lambda functions considerably.

Cheers
