Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement IPv6 support in the inet extension. #10839

Merged
merged 7 commits into from
Mar 11, 2024

Conversation

troycurtisjr
Copy link
Contributor

@troycurtisjr troycurtisjr commented Feb 25, 2024

I've started to use the duckdb and have several cases where the inet module has been really nice. However, it doesn't currently support IPv6, so this PR aims to add that support. There is also a somewhat related discussion about using uhugeint to store IPv6.

The CI runs for the latest rebase are in progress, but there was already a mostly successful back before my latest rebase. There were a few failures in the NightlyTests action, but nothing seemed related to the inet extension. I'm hoping the new run will have pulled in fixes from main.

Corresponding doc PR: duckdb-web

Copy link
Collaborator

@Mytherin Mytherin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! Looks great, well done!

Some minor comments below. Postgres also supports the family function which returns whether or not a value is IPv4 or IPv6 - perhaps it makes sense to add support for that as well?

if (addr_type == IPAddressType::IP_ADDRESS_V6) {
// The top bit is flipped when storing as the signed hugeint so that sorting
// works correctly. Flip it back here to have a proper unsigned value.
return retval ^ (uhugeint_t(1) << 127);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bitwise ops on uhugeint are somewhat expensive - perhaps we can do the bitwise operation directly on the component that is affected instead (e.g. new_addr.upper or new_addr.lower)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sense, you usually do pay a price for magic 😉 . I've started some implementations, and I'll take a look at some of the other bit operations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the implementation to perform bit ops directly on the two halves of the hugeint object.

using INET_TYPE = StructTypeTernary<uint8_t, hugeint_t, uint16_t>;

static uhugeint_t from_compat_addr(hugeint_t compat_addr, IPAddressType addr_type) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Could we use the naming conventions we use elsewhere in the codebase for functions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, just snake cased it out of habit. Updated.

#pragma GCC diagnostic ignored "-Wunused-function"
#endif

#include "duckdb/common/operator/numeric_cast.hpp"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this include here necessary at all? Can we perhaps include duckdb/common/operator/cast_operators.hpp instead? Then we might also be able to get rid of all of the pragmas.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turns out it isn't necessary, uhugeint has a built in cast operator from hugeint, and going from always positive hugeint to uhugeint has no risk of overflow, so the lack of checks isn't a concern.

# overflow
statement error
select INET '255.255.255.255' + 1
----
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add an expected error message here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, I'll work on adding these cases.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added expected outputs to various error tests.

# quibble with too many digits
statement error
SELECT '2001:db8:100:0:c59b:cef3:a0d189:6de4'::INET;
----
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add expected errors here?

* Fix function naming to match the project style
* Optimize some uhugeint bit operations
* Drop the unnecessary numeric_cast header and associated pragmas
@github-actions github-actions bot marked this pull request as draft February 27, 2024 02:59
@troycurtisjr
Copy link
Contributor Author

Thanks for the PR! Looks great, well done!

Some minor comments below. Postgres also supports the family function which returns whether or not a value is IPv4 or IPv6 - perhaps it makes sense to add support for that as well?

I have a vague idea of implementing many of the helpers from postgres inet features actually, but wanted to keep this PR focused on the basic support. Still this should be an easy add and provides a programatic way to detect v4 versus v6 now that both are supported, so it seems sensible. I'll get it added in.

@Mytherin Mytherin marked this pull request as ready for review February 28, 2024 09:08
@github-actions github-actions bot marked this pull request as draft March 2, 2024 14:54
@troycurtisjr
Copy link
Contributor Author

I implemented the review comments, including adding the family() function. It is ready for another round of review.

@troycurtisjr troycurtisjr marked this pull request as ready for review March 2, 2024 14:56
@carlopi
Copy link
Contributor

carlopi commented Mar 7, 2024

Single failure is on Checks extension entries since family function needs to be declared also there.
I recently made some improvement there, I would say merging (or rebasing) with main and adding this line:

     {"drop_fts_index", "fts", CatalogType::PRAGMA_FUNCTION_ENTRY},
     {"dsdgen", "tpcds", CatalogType::TABLE_FUNCTION_ENTRY},
     {"excel_text", "excel", CatalogType::SCALAR_FUNCTION_ENTRY},
+    {"family", "inet", CatalogType::SCALAR_FUNCTION_ENTRY},
     {"from_json", "json", CatalogType::SCALAR_FUNCTION_ENTRY},
     {"from_json_strict", "json", CatalogType::SCALAR_FUNCTION_ENTRY},
     {"from_substrait", "substrait", CatalogType::TABLE_FUNCTION_ENTRY},

to src/include/duckdb/main/extension_entries.hpp should make this work properly.

I am testing the changes here: https://github.com/carlopi/duckdb/actions/runs/8185666386

@Mytherin Mytherin merged commit dea2294 into duckdb:main Mar 11, 2024
48 of 49 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants