Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sys/posix: add module posix_netdb #16853

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

HendrikVE
Copy link
Contributor

@HendrikVE HendrikVE commented Sep 14, 2021

Contribution description

This PR adds a new posix module which provides netdb functions like gethostbyname or getaddrinfo. It does not come with its own implementation.

Testing procedure

You can test this PR e.g. in conjunction with everything else in #15969.

Issues/PRs references

Missing posix feature, see #7801.
This PR is split out from the chunky #15969.

Dependencies:

@github-actions github-actions bot added Area: network Area: Networking Area: pkg Area: External package ports Area: sys Area: System labels Sep 14, 2021
@HendrikVE HendrikVE added the Type: new feature The issue requests / The PR implemements a new feature for RIOT label Sep 14, 2021
@HendrikVE HendrikVE mentioned this pull request Sep 14, 2021
8 tasks
@HendrikVE HendrikVE added the Area: POSIX Area: POSIX API wrapper label Sep 14, 2021
miri64
miri64 previously requested changes Sep 14, 2021
Copy link
Member

@miri64 miri64 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also added the option to use posix sockets for the existing sock_dns module. Without using posix sockets I had extremely poor success rates for address resolution (approximately 1 out of 20).

NACK. 1. We rather should find the reason, why sock_udp performs so poorly compared to POSIX sockets (which below use sock_udp, so this fact seems weird to me). 2. sock_dns using POSIX sockets turns everything on its head, now we have a sock module using POSIX sockets using a sock module. Conceptually alone this sounds very dirty.

@HendrikVE
Copy link
Contributor Author

HendrikVE commented Sep 16, 2021

I have to correct myself. Either I did not remember correctly (has been some months now) or it is due to the rebase, but without using posix sockets it is not working at all. I traced it back to the function sys_arch_mbox_fetch at pkg/lwip/contrib/sys_arch.c which is repeated a few times and then gets stuck for some reason. I agree it is dirty, I should have mentioned that this is one of those hacks that should be removed in the end ;) But I needed this for now in order to proceed with the wolfmqtt port.
I pushed a throwaway commit to #15969. In the file dns.c you will find the line #define USE_POSIX 1 where you can switch between the original module and the posix extension. Any idea what could cause this?

@miri64
Copy link
Member

miri64 commented Sep 16, 2021

Ok, so the problem is with lwIP's sock implementation. That at least tells me, that sock is not completely broken ;-). As to sys_arch_mbox_fetch: It is very similar to what gnrc_sock_recv() does in gnrc_sock.c (the first is just more generalized). Could you check if there is any significant difference (e.g. #16831 was merged recently there) that could make a difference in your case? Maybe it is about time we add a xtimer/ztimer function for that kind of functionality (xtimer_mbox_get_timeout()...).

@miri64
Copy link
Member

miri64 commented Sep 16, 2021

In the file dns.c you will find the line #define USE_POSIX 1 where you can switch between the original module and the posix extension.

Still a dirty hack that should not go into sock_dns. Then I would rather have a new posix_dns module. Due to the DNS message being factored out to dns_msg now, that should be easy enough :-). But I guess in that case it makes more sense to use implement it in posix_netdb directly.

@miri64
Copy link
Member

miri64 commented Sep 16, 2021

Due to the DNS message being factored out to dns_msg now, that should be easy enough :-).

Speaking of, I really think you should rebase this PR ;-).

@HendrikVE
Copy link
Contributor Author

Ok, so the problem is with lwIP's sock implementation. That at least tells me, that sock is not completely broken ;-). As to sys_arch_mbox_fetch: It is very similar to what gnrc_sock_recv() does in gnrc_sock.c (the first is just more generalized). Could you check if there is any significant difference (e.g. #16831 was merged recently there) that could make a difference in your case? Maybe it is about time we add a xtimer/ztimer function for that kind of functionality (xtimer_mbox_get_timeout()...).

I'll have a look into it

@HendrikVE
Copy link
Contributor Author

In the file dns.c you will find the line #define USE_POSIX 1 where you can switch between the original module and the posix extension.

Still a dirty hack that should not go into sock_dns. Then I would rather have a new posix_dns module. Due to the DNS message being factored out to dns_msg now, that should be easy enough :-). But I guess in that case it makes more sense to use implement it in posix_netdb directly.

This was just so one could test this with a working wolfmqtt application ;) I would rather find out why this is happening, but if I can't find out the reason in reasonable time and if you say it is a viable option I might resort to implement a posix_dns module.

@miri64
Copy link
Member

miri64 commented Sep 16, 2021

This was just so one could test this with a working wolfmqtt application ;) I would rather find out why this is happening, but if I can't find out the reason in reasonable time and if you say it is a viable option I might resort to implement a posix_dns module.

Again, I think the more viable option in the latter case would be, I think to put the functionality directly into posix_netdb / gethostbyname instead of adding another module that would implement non-standard functions.

sys/posix/netdb/netdb.c Outdated Show resolved Hide resolved
@fjmolinas fjmolinas added this to the Release 2022.01 milestone Nov 18, 2021
@HendrikVE HendrikVE force-pushed the pr/wolfmqtt_split_5 branch from a972bb8 to 96aeab2 Compare January 9, 2022 14:07
@github-actions github-actions bot added Area: doc Area: Documentation Area: tests Area: tests and testing framework labels Jan 9, 2022
@github-actions github-actions bot removed the Area: build system Area: Build system label Apr 9, 2022
@miri64 miri64 dismissed their stale review May 23, 2022 11:43

My comments were addressed. Sorry for the long blocking.

@HendrikVE HendrikVE force-pushed the pr/wolfmqtt_split_5 branch from 965fd34 to 8efa4d4 Compare October 6, 2022 09:35
@github-actions github-actions bot removed the Area: pkg Area: External package ports label Oct 6, 2022
@riot-ci
Copy link

riot-ci commented Oct 6, 2022

Murdock results

FAILED

af92ba3 tests/posix_netdb: add test

Build failures (1)
Application Target Toolchain Runtime (s) Worker
tests/posix_netdb esp32-wroom-32 gnu 2.41 beast

Artifacts

This only reflects a subset of all builds from https://ci-prod.riot-os.org. Please refer to https://ci.riot-os.org for a complete build for now.

Copy link
Contributor

@benpicco benpicco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the description needs an update, but what's LWIP specific about this?

sys/posix/netdb/netdb.c Outdated Show resolved Hide resolved
sys/posix/netdb/netdb.c Outdated Show resolved Hide resolved
@HendrikVE HendrikVE force-pushed the pr/wolfmqtt_split_5 branch from 8efa4d4 to 81e3f63 Compare October 6, 2022 11:47
@benpicco benpicco removed this from the Release 2022.04 milestone Oct 6, 2022
@HendrikVE HendrikVE force-pushed the pr/wolfmqtt_split_5 branch from 81e3f63 to af92ba3 Compare October 6, 2022 12:44
@benpicco
Copy link
Contributor

benpicco commented Oct 7, 2022

CI is unhappy

@HendrikVE
Copy link
Contributor Author

CI is unhappy

Seems like the esp32 platform, in contrast to the other platforms, defines its own errno.h in ../../../build/pkg/esp32_sdk/components/newlib/platform_include/errno.h. We still need do provide these values for all the other platforms though. So either we check for every value whether it is set already or we think about a way how to handle errno.h in general, like providing a RIOT-global errno.h. I'd prefer the latter.

Copy link
Contributor

@benpicco benpicco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you mean by providing errno.h in RIOT - it's already provided by libc

/**
* @brief The name could not be resolved at this time. Future attempts may succeed.
*/
#define EAI_AGAIN 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#define EAI_AGAIN 1
#ifndef EAI_AGAIN
#define EAI_AGAIN 1
#endif

would be the simplest solution

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could lead to overlapping or diverging number spaces, when the libc/IDE provides defines differ from the defines here. Rather, it should just be avoided to include external headers (as we did already e.g. for defines in net/inet.h).

@HendrikVE
Copy link
Contributor Author

I'm not sure what you mean by providing errno.h in RIOT - it's already provided by libc

If this source is correct, then libc defines EAGAIN but not EAI_AGAIN. Futhermore if one looks for it, EAI_AGAIN is directly associated with getaddrinfo/posix. It's strange that the esp32 would include these posix errors in its errno.h then, isn't it?

@HendrikVE
Copy link
Contributor Author

For some weird reason the three error codes EAI_SOCKTYPE, EAI_AGAIN and EAI_BADFLAGS are declared in esp32's errno.h. ifndef would not fix this issue, since I would need to adapt to the same values as given by the esp32. Since esp32 defines EAI_BADFLAGS=3 it would have the same value as my EAI_FAIL which is really bad.

@gschorcht Can we get rid of these error codes for the esp32?

@HendrikVE HendrikVE force-pushed the pr/wolfmqtt_split_5 branch from af92ba3 to 3d3fc4e Compare October 7, 2022 15:58
@HendrikVE HendrikVE removed the CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR label Oct 7, 2022
@github-actions github-actions bot removed the Area: doc Area: Documentation label Oct 7, 2022
@HendrikVE HendrikVE force-pushed the pr/wolfmqtt_split_5 branch from 3d3fc4e to 7cec42c Compare May 30, 2023 15:46
Comment on lines +161 to +167
* @retval EAI_AGAIN Recoverable error, try again later
* @retval EAI_BADFLAGS Invalid value in flag parameters (hints->ai_flags)
* @retval EAI_FAIL Non-recoverable error
* @retval EAI_FAMILY Unsupported address family
* @retval EAI_MEMORY No memory available to receive data or to allocate result object
* @retval EAI_NONAME Name does not resolve
* @retval EAI_SERVICE Service does not resolve
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

POSIX spec just says a non-zero return value indicates failure, but the errno should be set. The implementation should reflect the specified behavior.

*/
#define INADDR_ANY ((in_addr_t)0x00000000)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This constant is specified for this header. Please do not remove it.

Comment on lines +96 to +97
/* invalid name length */
return EAI_FAIL;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here and in other instances of early return

Suggested change
/* invalid name length */
return EAI_FAIL;
/* invalid name length */
errno = EAI_FAIL
return EAI_FAIL;

Copy link
Member

@miri64 miri64 May 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe even use an error goto?

At the very end of your function put

error:
    return errno;

And then, whenever there is an error (such in this case) use

Suggested change
/* invalid name length */
return EAI_FAIL;
/* invalid name length */
errno = EAI_FAIL;
goto error;

Comment on lines +176 to +195
uint8_t addr_buf[16] = { 0 };
int res = dns_query(nodename, addr_buf, hints->ai_family);
switch (res) {
case -ETIMEDOUT:
/* fall-through */
case -EAGAIN:
return EAI_AGAIN;

case -ENOMEM:
return EAI_MEMORY;

case INADDRSZ:
addr.type = AF_INET;
memcpy(&addr.ipv4, addr_buf, INADDRSZ);
break;

case IN6ADDRSZ:
addr.type = AF_INET6;
memcpy(&addr.ipv6, addr_buf, IN6ADDRSZ);
break;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use addr directly?

Make ipvx_addr_t the following:

typedef struct ipvx_addr {
    sa_family_t type;
    union {
        struct in_addr ipv4;
        struct in6_addr ipv6;
        uint8_t u8[16];
    };
} ipvx_addr_t;

Then just do

Suggested change
uint8_t addr_buf[16] = { 0 };
int res = dns_query(nodename, addr_buf, hints->ai_family);
switch (res) {
case -ETIMEDOUT:
/* fall-through */
case -EAGAIN:
return EAI_AGAIN;
case -ENOMEM:
return EAI_MEMORY;
case INADDRSZ:
addr.type = AF_INET;
memcpy(&addr.ipv4, addr_buf, INADDRSZ);
break;
case IN6ADDRSZ:
addr.type = AF_INET6;
memcpy(&addr.ipv6, addr_buf, IN6ADDRSZ);
break;
int res = dns_query(nodename, addr->u8, hints->ai_family);
switch (res) {
case -ETIMEDOUT:
/* fall-through */
case -EAGAIN:
return EAI_AGAIN;
case -ENOMEM:
return EAI_MEMORY;
case INADDRSZ:
addr.type = AF_INET;
break;
case IN6ADDRSZ:
addr.type = AF_INET6;
break;

And safe both a little ROM and RAM

Comment on lines +238 to +253
if (addr.type == AF_INET) {
ai->ai_family = AF_INET;

struct sockaddr_in *sa4 = (struct sockaddr_in*) sas;
sa4->sin_addr = addr.ipv4;
sa4->sin_family = AF_INET;
sa4->sin_port = htons(port);
}
else if (addr.type == AF_INET6) {
ai->ai_family = AF_INET6;

struct sockaddr_in6 *sa6 = (struct sockaddr_in6*) sas;
sa6->sin6_addr = addr.ipv6;
sa6->sin6_family = AF_INET6;
sa6->sin6_port = htons(port);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (addr.type == AF_INET) {
ai->ai_family = AF_INET;
struct sockaddr_in *sa4 = (struct sockaddr_in*) sas;
sa4->sin_addr = addr.ipv4;
sa4->sin_family = AF_INET;
sa4->sin_port = htons(port);
}
else if (addr.type == AF_INET6) {
ai->ai_family = AF_INET6;
struct sockaddr_in6 *sa6 = (struct sockaddr_in6*) sas;
sa6->sin6_addr = addr.ipv6;
sa6->sin6_family = AF_INET6;
sa6->sin6_port = htons(port);
}
ai->ai_family = addr.type;
struct sockaddr_in6 *sa = (struct sockaddr_in6*) sas;
sa->sin6_addr = addr.ipv6;
sa->sin6_family = addr.type;
sa->sin6_port = htons(port);

Should suffice.

return EAI_MEMORY;
}

memcpy(ai->ai_canonname, nodename, name_len);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: Use strncpy() instead? You are copying a string here after all.

Copy link
Contributor

@benpicco benpicco May 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines +6 to +8
USEMODULE += sock_udp
USEMODULE += gnrc_ipv6

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need the whole stack if you mock DNS anyway?

TEST_ASSERT_EQUAL_INT(AF_INET, result->ai_family);
TEST_ASSERT_NOT_NULL(inet_ntop(AF_INET, &((struct sockaddr_in*)(result->ai_addr))->sin_addr,
addr4_str, sizeof(addr4_str)));
TEST_ASSERT_EQUAL_INT(0, strcmp(addr4_str, test_ipv4));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here and in other instances where you used TEST_ASSERT_EQUAL_INT(0, strcmp(...))

Suggested change
TEST_ASSERT_EQUAL_INT(0, strcmp(addr4_str, test_ipv4));
TEST_ASSERT_EQUAL_STRING(addr4_str, test_ipv4);

Comment on lines +192 to +193
TEST_ASSERT_NOT_NULL(inet_ntop(AF_INET6, &((struct sockaddr_in6*)(result->ai_addr))->sin6_addr,
addr6_str, sizeof(addr6_str)));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe use a temporary variable here for sockaddr_in6 for better readability?


switch (result->ai_family) {
case AF_INET:
memcpy(&addr_ipv4, &((struct sockaddr_in*)(result->ai_addr))->sin_addr.s_addr, sizeof(ipv4_addr_t));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dito (also: in its current state the line is longer than 100 chars)

Comment on lines +168 to +169


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

Comment on lines +1 to +17
include ../Makefile.tests_common

USEMODULE += embunit

# make sure we have an implementation of sock_types.h
USEMODULE += sock_udp
USEMODULE += gnrc_ipv6

USEMODULE += ipv4_addr
USEMODULE += ipv6_addr

# use mock data for the test
USEMODULE += sock_dns_mock

USEMODULE += posix_netdb

include $(RIOTBASE)/Makefile.include
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make -C tests/posix_netdb/ all-valgrind term-valgrind

yields

--105511-- warning: evaluate_Dwarf3_Expr: unhandled DW_OP_ 0x9b
--105511-- warning: evaluate_Dwarf3_Expr: unhandled DW_OP_ 0x9b
--105511-- warning: evaluate_Dwarf3_Expr: unhandled DW_OP_ 0x9b
--105511-- warning: evaluate_Dwarf3_Expr: unhandled DW_OP_ 0x9b
--105511-- warning: evaluate_Dwarf3_Expr: unhandled DW_OP_ 0x9b
--105511-- warning: evaluate_Dwarf3_Expr: unhandled DW_OP_ 0x9b
--105511-- warning: evaluate_Dwarf3_Expr: unhandled DW_OP_ 0x9b
--105511-- warning: evaluate_Dwarf3_Expr: unhandled DW_OP_ 0x9b
--105511-- warning: evaluate_Dwarf3_Expr: unhandled DW_OP_ 0x9b
--105511-- warning: evaluate_Dwarf3_Expr: unhandled DW_OP_ 0x9b
--105511-- warning: evaluate_Dwarf3_Expr: unhandled DW_OP_ 0x9b
==105511== Invalid read of size 4
==105511==    at 0x804BC0C: ??? (in /home/mlenders/Repositories/RIOT-OS/RIOT/tests/posix_netdb/bin/native/tests_posix_netdb.elf)
==105511==  Address 0x806c790 is in the brk data segment 0x8079000-0x8078fff
==105511== 
{ "threads": [{ "name": "main", "stack_size": 12288, "stack_used": 2444}]}

Sadly no line number but somewhere is an out of bounds read error to one of the malloc'd memory areas in this test (or the netdb code).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: network Area: Networking Area: POSIX Area: POSIX API wrapper Area: sys Area: System Area: tests Area: tests and testing framework Type: new feature The issue requests / The PR implemements a new feature for RIOT
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants