-
-
Notifications
You must be signed in to change notification settings - Fork 169
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Build parameters in Composables #339
Comments
Looking at your proposal, it seems that most The added argument can be a sequence of values or a mapping of values. As a special case might be a single value, to be interpreted as Then comes the difficult part: when you create a Would we be able to tell apart the following? It doesn't seem easy because the sql.SQL("where x = %s and {}", value=10).format(sql.SQL("y = %s", value=20))
sql.SQL("where {} and x = %s", value=10).format(sql.SQL("y = %s", value=20)) By the look of it, it seems that the parameters should "bubble up", aggregated, to the root object of the tree, so that This is some real(ish) code I have which would benefit of this feature: async def _fetch(
self,
conn: AsyncConnection[Any],
*,
id: UUID | None = None,
group_id: UUID | None = None,
category: UUID | None = None,
deleted: bool = False,
) -> list[Res]:
conds: list[sql.Composable] = []
args: dict[str, Any] = {}
if id is not None:
conds.append(sql.SQL("r.id = %(id)s"))
args["id"] = id
if group_id is not None:
conds.append(sql.SQL("r.group_id = %(group_id)s"))
args["group_id"] = group_id
if category is not None:
conds.append(sql.SQL("r.category = %(category)s"))
args["category"] = category
if not deleted:
conds.append(sql.SQL("r.deleted_at is null"))
if not conds:
conds.append(sql.SQL("true"))
objs_sql = sql.SQL(
"""
select
id, name, description, ...
from {res_table} r
where {filters}
"""
).format(
res_table=self.res_table_name,
filters=sql.SQL(" and ").join(conds),
)
async with conn.cursor(row_factory=class_row(self.res_class)) as cur:
await cur.execute(objs_sql, args)
recs = await cur.fetchall()
return recs Using your proposal, this code wouldn't need a separate I don't like very much the loss of separation that The changes to the code could be: --- tmp-old.py 2022-07-19 00:01:29.143346787 +0100
+++ tmp-new.py 2022-07-19 00:03:43.588569435 +0100
@@ -8,16 +8,12 @@
deleted: bool = False,
) -> list[Res]:
conds: list[sql.Composable] = []
- args: dict[str, Any] = {}
if id is not None:
- conds.append(sql.SQL("r.id = %(id)s"))
- args["id"] = id
+ conds.append(sql.SQL("r.id = %s", value=id)) # value= optional?
if group_id is not None:
- conds.append(sql.SQL("r.group_id = %(group_id)s"))
- args["group_id"] = group_id
+ conds.append(sql.SQL("r.group_id = %s", group_id))
if category is not None:
- conds.append(sql.SQL("r.category = %(category)s"))
- args["category"] = category
+ conds.append(sql.SQL("r.category = %s", category))
if not deleted:
conds.append(sql.SQL("r.deleted_at is null"))
if not conds:
@@ -36,7 +32,7 @@
)
async with conn.cursor(row_factory=class_row(self.res_class)) as cur:
- await cur.execute(objs_sql, args)
+ await cur.execute(objs_sql, objs_sql.values)
recs = await cur.fetchall()
return recs So, uhm... It's not an earth-shattering feature, but I think I'd just be an old grumpy fart if I just said "I don't like that! I can do it with an extra var!"... I am inclined to give this idea a chance, unless the implementation is not so awfully messy to suggest it's the wrong direction. The case of Would you like to give a try to implement it? |
It seems to me that the response you gave is the best one I could have hoped for! Thank you for the work you put into responding to my idea. Here are some thoughts in response, in no particular order.
I have the opposite preference, and precisely for the reason you mention. If we wanted to make a more ergonomic spelling, we could have another class that was a shorthand for constructing the identifier to placeholder mapping, but I think that sounds tricky, as you'd likely want to specify the operator as well. I think your suggestion of putting it on the identifier would lead to a messy code smell in the implementation.
These are constructor arguments, and I'd expect subclasses to decide what their optimal function signature would be. A Placeholder would take a single value, while an As a related aside, it's odd to me that there is an
Mixing sequences and mappings
combining overlapping mappingsNon-overlapping mappings can and should be combined. I see three possible options for how we can handle overlapping keys in mapping params.
Its not clear to me which option is preferable, but I think I'm inclined toward raising an error. While its quite common to wish to re-use a keyword parameter, I think that validation of sameness is error prone, and picking one without the choice coming from the user seems like a bad design decision. combining interspersed sequencesAs you've identified, this is particularly tricky. A first pass solution is to disallow complex combinations to use sequence params. This would significantly reduce the applicability of this feature, but potentially could get us a proof of concept more quickly. But I do think that this is a solvable problem. The detecting mismatch arguments to placeholdersIf we got to the place that we were parsing fragments as described above, it could allow us to notice if the parameters passed in for any particular fragment didn't match. Passing extra sequence or mapping params seems like it has no obvious semantic meaning, so it would be helpful if it would notice that and raise a helpful error for the developer to get early. Missing params for placeholdersIf an incomplete number of params is given, such as a smaller sequence or a mapping that's missing named keys, I can see potential utility for that. Even if we would want that, we might prefer to still disallow it for the first proof-of-concept implementation. But it might be helpful to consider what utility that might have, and what semantics we'd expect if it were useful. I think that it would mean "I'd like to pass the missing parameters in later myself". If this was a worthwhile use-case, it might suggest that the mechanism for retrieving the gathered parameters after the query is built should be a function rather than a property, so that missing params from fragments can be filled in the order they are present, even if other later fragments have additional params. If that were what we wanted, then if the function was called without additional params, it could give placeholders for positional arguments that needed to be filled, with some new constant with a helpful repr. I think the ugliest part might be if we'd end up needing to have a check for that new constant to make sure that threw an error if passed into
I'll be careful about making promises I haven't always been able to keep, but it at least sounds interesting to me, and I'd like to! You're right that this isn't earth-shattering, and it doesn't fit as well as the similar feature from SQLAlchemy does. SQLAlchemy's position is different, because it's in a place to completely control how all the parameters are referenced or named, so it doesn't have to worry about what to do in the edge-cases we've identified above. |
Would this (real) example be what you are looking for? filters = {'a': 1, 'b': 2}
query = (
sql.SQL("SELECT * FROM {tbl} WHERE {filters}")
.format(
tbl=sql.Identifier('mytable'),
filters=sql.SQL(' AND ').join([sql.Identifier(key) + sql.SQL('=') + sql.Literal(value) for key, value in filters.items()])
)
) Apologies if there are additional complexities here that I don't realize. |
@madsruben I believe your example would be vulnerable to SQL injection. Also, I want to give a big 👍 to this enhancement idea overall. Binding parameters with queries makes it a lot easier to compose queries, for example you can write reusable functions for subqueries: def active_users_subquery(last_active_date):
return sql.SQL("SELECT * FROM USERS WHERE last_active > {last_active}")
.format(last_active=Parameter(last_active))
# re-use elsewhere
active_users = active_users_subquery(last_active)
query = SQL("""
SELECT * FROM COMMENTS
INNER JOIN ({users}) as active_users
ON COMMENTS.user_uuid = USERS.user_uuid
""").format(users=active_users)
cursor.execute(query) This example is trivial and could be accomplished with bound parameters, but complex queries with several optionally provided parameters would be difficult to compose without this feature. |
This is also how https://github.com/Drizin/DapperQueryBuilder works, although that's able to use string interpolation directly because it's c#. |
Hey @dpikt. Would you care to comment further on why you believe this code would be vulnerable? I've also tested this thoroughly without finding any issues, so please enlighten us as to why you believe it to be different. Thanks. |
@madsruben you're completely right - I looked at the docs again and I was misunderstanding what With that being the case, the reason to use |
@dpikt thank you for the update. I agree that this feature with server-side parameter binding would be awesome. |
@madsruben @dpikt would either of you be interested in reviewing the pull request I made, and see what you think? The tests aren't passing and quite out of date at the moment, but it may give you a sense of what I'm going for. #441 |
@ryanhiebert I'd be happy give it a look, although I'd like to check with the maintainer first that this is still a feature he's interested in adding (it seems like it's been lingering a while). |
He never got around to looking at it, and he’s not as keen on it as we are, but until he says something different I think it reasonable to think he still sees merit in it generally. I’m sure he’s trying to manage various priorities and suspect he may not be super responsive to us on this, and that seems okay to me. If we come up with an implementation that pleases us, I think that will likely make his job easier when he goes to look at it. |
SQL
andComposable
are great for composing queries. Coming from SQLAlchemy, I made an incorrect assumption that it would allow me to build up the query, including parameters at the same time. It doesn't, but I'm not understanding any technical limitation that would make that a bad idea. Am I missing a good reason to avoid that approach, or some technical limits that make that more difficult than it appears to me?Here's a rough example, ignorant of the details of what would need to happen, of the kind of thing I'm talking about:
There's sure to be some complexity regarding when to use positional and keyword
%
parameters. And more complexity with composing the parameters, especially if they are positional parameters and we need to figure out where the%
parameters are compared to the{}
type formatting as they are composed. This seems like a really nice way to build dynamic queries, at least from my admittedly limited perspective.The text was updated successfully, but these errors were encountered: