Pack/unpack Python dict
s into/out of MariaDB's Dynamic Columns format.
A quick example:
>>> mariadb_dyncol.pack({"key": "value"})
b'\x04\x01\x00\x03\x00\x00\x00\x03\x00key!value'
>>> mariadb_dyncol.unpack(mariadb_dyncol.pack({"key": "value"}))
{'key': 'value'}
Use pip:
python -m pip install mariadb-dyncol
Python 3.7 to 3.10 supported.
Working on a Django project? Check out my book Boost Your Django DX which covers many ways to improve your development experience.
- Sensible type mapping from Python to SQL
- Tested against examples from MariaDB, including property/fuzz testing with hypothesis (which is amazing and found many bugs)
The normal way for adding data into dynamic columns fields is with the
COLUMN_CREATE
function, and its relatives. This allows you to do things
like:
INSERT INTO mytable (attrs) VALUES (COLUMN_CREATE('key', 'value'))
Unfortunately the Django ORM is restricted and cannot use database functions like this in every instance, at least not until Django 1.9. It was this limitation I hit whilst implementing a dynamic columns field for my project django-mysql that spurred the creation of this library.
By pre-packing the dynamic columns, the above query can just insert the blob of data directly:
INSERT INTO mytable (attrs) VALUES (X'0401000300000003006B65792176616C7565')
Asides from being more easily implemented with the Django ORM, this approach of packing/unpacking dynamic columns in Python also has some advantages:
- All data types are properly preserved in Python. The only way MariaDB
provides of pulling back all values for a dynamic columns field is to call
COLUMN_JSON
, but JSON only supports strings and integers. AlsoCOLUMN_JSON
has a depth limit of 10, but the format has no actual limit. - The CPU overhead of packing/unpacking the dynamic columns is moved from you database server to your (presumably more scalable) clients.
All functions and names are accessible as attributes of the mariadb_dyncol
module, which you can import with import mariadb_dyncol
.
Packs the given mapping (a dict
) into the MariaDB Dynamic Columns
format for named columns and returns it as a byte string (Python 3's bytes
,
Python 2's str
). This is suitable for then inserting into a table as part
of a normal query.
The dict
's keys must all be unicode strings, and the values must all be
one of the supported data types:
int
between-(2 ** 32) + 1
and(2 ** 64) - 1
(Python 2:long
is supported too)str
up to 4GB encoded in UTF-8 (Python 2:unicode
)float
- anything exceptNaN
or+/- inf
datetime.datetime
- full range supporteddatetime.date
- full range supporteddatetime.time
- full range supported- Any
dict
that is valid by these rules, allowing nested keys. There is no nesting limit except from for MariaDB'sCOLUMN_JSON
function which restricts the depth to 10
Note that this does not support the DECIMAL
type that MariaDB does (and
would naturally map to Python's Decimal
) - it is a little more fiddly to
pack/unpack, though certainly possible, and pull requests are welcomed. If you
try and pack a Decimal
, a DynColNotSupported
exception will be raised.
There are other restrictions on the UTF-8 encoded column names as documented in MariaDB:
- The maximum length of a column name is 16383 bytes
- The maximum length of all column names (at one level in nested hierarchies) is 65535 bytes
All other unsupported types will raise a DynColTypeError
. Out of range
values will raise a DynColValueError
.
Examples:
>>> mariadb_dyncol.pack({"a": 1})
b'\x04\x01\x00\x01\x00\x00\x00\x00\x00a\x02'
>>> mariadb_dyncol.pack({"a": "💩"})
b'\x04\x01\x00\x01\x00\x00\x00\x03\x00a!\xf0\x9f\x92\xa9'
Unpacks MariaDB dynamic columns data encoded byte string into a dict; the types
you can expect back are those listed above. This is suitable for fetching the
data direct from MariaDB and decoding in Python as opposed to with MariaDB's
COLUMN_JSON
function, preserving the types that JSON discards.
As noted above, DECIMAL
values are not supported, and unpacking this
will raise DynColNotSupported
. Also strings will only be decoded with the
MySQL charsets utf8
or utf8mb4
; strings with other charsets will raise
DynColNotSupported
as well.
Unsupported column formats, for example the old MariaDB numbered dynamic
columns format, or corrupt data, will raise DynColValueError
.
Examples:
>>> mariadb_dyncol.unpack(b"\x04\x01\x00\x01\x00\x00\x00\x03\x00a!\xf0\x9f\x92\xa9")
{"a": "💩"}
>>> mariadb_dyncol.unpack(b"\x04\x01\x00\x01\x00\x00\x00\x00\x00a\x02")
{"a": 1}