Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

using native geometric data types #392

Open
greyltc opened this issue Sep 27, 2022 · 7 comments
Open

using native geometric data types #392

greyltc opened this issue Sep 27, 2022 · 7 comments

Comments

@greyltc
Copy link

greyltc commented Sep 27, 2022

I've read https://www.postgresql.org/docs/current/datatype-geometric.html and I'm having some trouble inserting the native circle type shown there with psycopg.

I can insert a native box data type with:

cur.execute('INSERT INTO my_table(box) VALUES (%(box)s)', {"box": (2.2, 2.3, 2.4, 2.5)})

no problems there. However, when I try to do the exact same thing with the native circle data type via

cur.execute('INSERT INTO my_table(cir) VALUES (%(cir)s)', {"cir": (2.2, 2.3, 2.4)})

that errors out with

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python3.10/site-packages/psycopg/cursor.py", line 725, in execute
    raise ex.with_traceback(None)
psycopg.errors.InvalidTextRepresentation: invalid input syntax for type circle: "(2.2,2.3,2.4)"
CONTEXT:  unnamed portal parameter $1 = '...'

I can't understand why the box case works but the circle case does not.
Have I made a mistake? Is this a bug in something?

@dvarrazzo
Copy link
Member

dvarrazzo commented Sep 27, 2022

Yes, you are making an error. According to postgres docs you can specify a circle using

< ( x , y ) , r >
( ( x , y ) , r )
  ( x , y ) , r
    x , y   , r

whereas you are using (x, y, r).

piro=# select '(2.2,2.3,2.4,2.5)'::box;
┌─────────────────────┐
│         box         │
├─────────────────────┤
│ (2.4,2.5),(2.2,2.3) │
└─────────────────────┘
(1 row)

piro=# select '(2.2,2.3,2.4)'::circle;
ERROR:  invalid input syntax for type circle: "(2.2,2.3,2.4)"
LINE 1: select '(2.2,2.3,2.4)'::circle;
               ^

piro=# select '((2.2,2.3),2.4)'::circle;
┌─────────────────┐
│     circle      │
├─────────────────┤
│ <(2.2,2.3),2.4> │
└─────────────────┘
(1 row)

However, using the tuple ((x, y), r) will not work, and your box example only works by chance. Python tuples are dumped to postgres using the row syntax, and the syntax for nested rows involves using quotes:

>>> conn.execute('SELECT %(cir)s::circle', {"cir": ((2.2, 2.3), 2.4)})
InvalidTextRepresentation: invalid input syntax for type circle: "("(2.2,2.3)",2.4)"

For the boxes it happens to work because a flat 4-element tuple happens to match an accepted syntax for box, but it's totally by chance.

psycopg 3, at the moment, doesn't offer mappings of geometric types. They might be added: for instance:

from dataclasses import dataclass

@dataclass
class Circle:
    x: float
    y: float
    r: float
    
class CircleDumper(psycopg.adapt.Dumper):
    oid = psycopg.adapters.types["circle"].oid
    def dump(self, obj):
        return f"<({obj.x},{obj.y}),{obj.r}>".encode()

conn.adapters.register_dumper(Circle, CircleDumper)

conn.execute('SELECT %(cir)s::circle', {"cir": Circle(2.2, 2.3, 2.4)}).fetchone()
('<(2.2,2.3),2.4>',)

there is no immediate plan to do it and we haven't received many requests for it.

So we can do one of the following:

  • in your application, you create wrappers for the geometric types you use, and dumpers/loaders for them. Your program will work, and you are happy.
  • you can make a MR to add support for all the Postgres geometric types, we can add it to psycopg, your program will work, everyone is happy, and you are a hero.
  • you can wait for us to do it, but likely it will take a few months because we are still solving teething problems of psycopg 3.1 and this will only happen in psycopg 3.2.

Let me know what you prefer to do :)

@greyltc
Copy link
Author

greyltc commented Sep 27, 2022

Hi @dvarrazzo thanks for the thoughtful reply!

I've just done this:

import psycopg
from psycopg.adapt import Loader, Dumper
import ast


class Circle(object):
    def __init__(self, center: tuple[float, float], radius: float):
        self.center = (float(center[0]), float(center[1]))
        self.radius = float(radius)

    def __eq__(self, other):
        comp = False
        if isinstance(other, Circle):
            if (self.center == other.center) and (self.radius == other.radius):
                comp = True
        return comp

    def __ne__(self, other):
        x = self.__eq__(other)
        if x is not NotImplemented:
            return not x
        return NotImplemented

    def __hash__(self):
        return hash(tuple(sorted(self.__dict__.items())))


class CircleDumper(Dumper):
    oid = psycopg.adapters.types["circle"].oid

    def dump(self, co: Circle) -> bytes:
        return f"<{co.center}, {co.radius}>".encode()


class CircleLoader(Loader):
    def load(self, co: memoryview) -> Circle:
        bo = bytes(co.obj).strip(b"<>")
        po = ast.literal_eval(bo.decode())
        return Circle(po[0], po[1])


with psycopg.connect("postgres://") as conn:
    with conn.cursor() as cur:
        cur.adapters.register_dumper(Circle, CircleDumper)
        cur.adapters.register_loader("circle", CircleLoader)

        cur.execute(
            """
            CREATE TABLE test (
                id serial PRIMARY KEY,
                num integer,
                cir circle,
                data text)
            """
        )

        ping = (100, Circle((2.0, 3.0), 4.0), "abc'def")
        cur.execute("INSERT INTO test (num, cir, data) VALUES (%s, %s, %s)", ping)

        cur.execute("SELECT * FROM test")
        pong = cur.fetchone()[1::]

        assert ping == pong, "ping != pong"

        # conn.commit()

I think this matches your first suggestion.
I'm keen to do what needs to be done to turn it into a merge request myself. What do you think?

@dvarrazzo
Copy link
Member

dvarrazzo commented Sep 27, 2022

Hello! yes, that looks a good start. It highlight a few points that would likely become design choices:

  • is Circle (and other geometric object) immutable/hashable? Yours is hashable but mutable, which is not a good combination (you can change center/radius of your point, the hash will change, and whatever dict using it would be extremely confused by it).
  • you implement eq/ne. Are they needed (yes they are, otherwise they can't be used as dict keys). Are they the only needed (you can't sort bidimentional objects, so I guess that's all we need yes).
  • I don't think that eval is really needed to parse a few numbers, the hash implementation can be improved... etc

However, it's a good start and it seems it's putting you on the right track for your project. In my experience you can only design good interfaces if you also use them. So, I propose that you continue what you are doing for yourself, introducing the types you need, and see experiment how they work on your side. When you consider them mature enough, we can work on a merge request.

Does it sound ok for you?

@greyltc
Copy link
Author

greyltc commented Sep 27, 2022

Thanks for the suggestions. That plan sounds good.

I'm still not sure why the box case works actually though.
From the docs, Box supports the following syntaxes:

( ( x1 , y1 ) , ( x2 , y2 ) )
  ( x1 , y1 ) , ( x2 , y2 )
    x1 , y1   ,   x2 , y2

and we expect

cur.execute('INSERT INTO my_table(box) VALUES (%(box)s)', {"box": (2.2, 2.3, 2.4, 2.5)})

to go in as (2.2, 2.3, 2.4, 2.5), right? so why should that work? It's not in the list.

@dvarrazzo
Copy link
Member

You are right, it's not in the list, but postgres accepts it nonetheless.

piro=# select '(2.2, 2.3, 2.4, 2.5)'::box;
┌─────────────────────┐
│         box         │
├─────────────────────┤
│ (2.4,2.5),(2.2,2.3) │
└─────────────────────┘
(1 row)

It seems an undocumented gift, not to be relied upon. Although any input syntax would be good to write a parser, it seems nice and symmetric to use the same used in output (the second one, in the list they provide in the docs).

@greyltc
Copy link
Author

greyltc commented Sep 28, 2022

Using dataclasses is a much better idea than the manual class I had. Here's what I've got now for the full set of native geometric types, everything seems to work alright:

import psycopg
from dataclasses import dataclass
from psycopg.adapt import Loader, Dumper
import ast


@dataclass(frozen=True)
class Point:
    xy: tuple[float, float]


class PointDumper(Dumper):
    oid = psycopg.adapters.types["point"].oid

    def dump(self, po: Point) -> bytes:
        return f"({po.xy[0]},{po.xy[1]})".encode()


class PointLoader(Loader):
    def load(self, co: memoryview) -> Point:
        bo = bytes(co.obj)
        evald = ast.literal_eval(bo.decode())
        return Point(evald)


@dataclass(frozen=True)
class Line:
    A: float
    B: float
    C: float


class LineDumper(Dumper):
    oid = psycopg.adapters.types["line"].oid

    def dump(self, lo: Line) -> bytes:
        return f"{{{lo.A},{lo.B},{lo.C}}}".encode()


class LineLoader(Loader):
    def load(self, co: memoryview) -> Line:
        bo = bytes(co.obj).strip(b"{}")
        evald = ast.literal_eval(bo.decode())
        return Line(*evald)


@dataclass(frozen=True)
class LineSegment:
    xy1: tuple[float, float]
    xy2: tuple[float, float]


class LineSegmentDumper(Dumper):
    oid = psycopg.adapters.types["lseg"].oid

    def dump(self, lso: LineSegment) -> bytes:
        return f"[{lso.xy1},{lso.xy2}]".encode()


class LineSegmentLoader(Loader):
    def load(self, co: memoryview) -> LineSegment:
        bo = bytes(co.obj).strip(b"[]")
        evald = ast.literal_eval(bo.decode())
        return LineSegment(*evald)


@dataclass(frozen=True)
class Path:
    xys: tuple[tuple[float, float], ...]
    closed: bool = True


class PathDumper(Dumper):
    oid = psycopg.adapters.types["path"].oid

    def dump(self, po: Path) -> bytes:
        if po.closed:
            return f"{po.xys}".encode()
        else:
            return f"{list(po.xys)}".encode()


class PathLoader(Loader):
    def load(self, co: memoryview) -> Path:
        bo = bytes(co.obj)
        evald = ast.literal_eval(bo.decode())
        if isinstance(evald, tuple):
            closed = True
        else:
            evald = tuple(evald)
            closed = False
        return Path(evald, closed)


@dataclass(frozen=True)
class Polygon:
    xys: tuple[tuple[float, float], ...]


class PolygonDumper(Dumper):
    oid = psycopg.adapters.types["polygon"].oid

    def dump(self, po: Polygon) -> bytes:
        return f"{po.xys}".encode()


class PolygonLoader(Loader):
    def load(self, co: memoryview) -> Polygon:
        bo = bytes(co.obj)
        evald = ast.literal_eval(bo.decode())
        return Polygon(evald)


@dataclass(frozen=True)
class Circle:
    center: tuple[float, float]
    radius: float


class CircleDumper(Dumper):
    oid = psycopg.adapters.types["circle"].oid

    def dump(self, co: Circle) -> bytes:
        return f"<{co.center},{co.radius}>".encode()


class CircleLoader(Loader):
    def load(self, co: memoryview) -> Circle:
        bo = bytes(co.obj).strip(b"<>")
        evald = ast.literal_eval(bo.decode())
        return Circle(*evald)


with psycopg.connect("postgres://") as conn:
    with conn.cursor() as cur:
        cur.adapters.register_dumper(Point, PointDumper)
        cur.adapters.register_loader("point", PointLoader)
        cur.adapters.register_dumper(Line, LineDumper)
        cur.adapters.register_loader("line", LineLoader)
        cur.adapters.register_dumper(LineSegment, LineSegmentDumper)
        cur.adapters.register_loader("lseg", LineSegmentLoader)
        cur.adapters.register_dumper(Path, PathDumper)
        cur.adapters.register_loader("path", PathLoader)
        cur.adapters.register_dumper(Polygon, PolygonDumper)
        cur.adapters.register_loader("polygon", PolygonLoader)
        cur.adapters.register_dumper(Circle, CircleDumper)
        cur.adapters.register_loader("circle", CircleLoader)

        cur.execute(
            """
            CREATE TABLE test (
                id serial PRIMARY KEY,
                num integer,
                p point,
                lin line,
                lins lseg,
                cir circle,
                cpth path,
                opth path,
                poly polygon,
                data text)
            """
        )

        ping = (
            100,
            Point((3, 2)),
            Line(2, 3.1, 4),
            LineSegment((3, 2), (8.9, 10)),
            Circle((2.0, 3.0), 4.0),
            Path(((2.0, 3.0), (2, 6.0), (2, 7.6), (2, 3)), True),
            Path(((2.0, 3.0), (2, 6.0), (2, 7.6)), False),
            Polygon(((65.2, 28), (44, 1), (-98.33, 21))),
            "abc'def",
        )
        cur.execute(f'INSERT INTO test (num, p, lin, lins, cir, cpth, opth, poly, data) VALUES ({", ".join(["%s"]*len(ping))})', ping)

        cur.execute("SELECT * FROM test")
        pong = cur.fetchone()[1::]

        assert ping == pong, "ping != pong"

        conn.rollback()
        # conn.commit()

EDIT: just realized I forgot box. It's just about identical to lseg

@dvarrazzo
Copy link
Member

Hello @greyltc! I was wondering if you have used your types more, and are satisfied about their behaviour.

Maybe you would like to work to a MR for Psycopg 3.2 to add these objects?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants