Unify DEBUG_BPF logging to be handled in libbpf #1433

palmtenor · 2017-11-04T00:10:36Z

In BCC we have a few debug flags. Except for DEBUG_BPF, all other debug flags are handled at ClangLoader / Clang / LLVM layer.

DEBUG_BPF is for log message come from BPF_PROG_LOAD syscall, and it's current logic is a bit messy:

libbpf's bpf_prog_load enables this logging (log_level = 1) whenever non-empty log_buf is provided. On the other hand, the Python API always provide such buffer, causing unnecessary work when debug is not enabled.
Python API has its own logic to print the message (only when DEBUG_BPF is enabled), and to incrementally double the size of the log buffer if the buffer size is insufficient.
On the other hand, libbpf's bpf_prog_load also has the logic to incrementally double the size of the log buffer if the buffer size is insufficient, but it only does it on error, and will print the log message along with some hints. However, it will not copy this buffer to user provided log_buf even if provided.
C++ and Lua API doesn't have the logic to handle DEBUG_BPF flag, and if we want, we'll need to replicate the printing logic as Python does.

This PR tries to address these issues:

Added a new bpf_prog_load_flag interface. Instead of log_buf and log_buf_size, it simply takes a log_level argument. When setting to 1, it would automatically create temporary log buffer, increase the size if not sufficient, and print the log message with hints. This way, Python / C++ / Lua / etc. APIs won't need to handle these logic separately, and only need to pass the log_level flag correctly. This behavior is more consistent with other debug flags such as DEBUG_SOURCE, etc.
The original bpf_prog_load is a very commonly used interface and should not break for this non-critical feature change. So its been kept and behavior remain unchanged, except it will not do the auto-increment of log buffer anymore.

The new interface is applied in Python, C++ and Lua API, and it worked well when tested with different debug flag options.

yonghong-song · 2017-11-06T23:38:31Z

src/cc/libbpf.c

-    buffer_size = LOG_BUF_SIZE;
-    // caller did not specify log_buf but failure should be printed,
-    // so repeat the syscall and print the result to stderr
+    // User did not provide log buffer. We will try to increase size of


Do you want to check whether the error is caused by insufficient buffer or not here?

At this point, we already know the load syscall failed, and user did not provide a buffer. Regardless of the reason, we will try allocate temporary buffer size from LOG_BUF_SIZE, and increase it until ENOSPC no longer happens.

At the worse case is user have specified log_level = 1 and we already tried temporary buffer size being LOG_BUF_SIZE and it turned out not to be enough. In such case, we just do an extra iteration of attempted load. It only happens on error and if we want to get around that this part of logic would be more complicated. I don't feel it's worth it......

yonghong-song · 2017-11-06T23:40:17Z

src/cc/libbpf.c

+  return bpf_prog_load_impl(prog_type, name, insns, prog_len, license,
+                            kern_version, NULL, 0, log_level);
+}
+


Now we have two APIs, bpf_prog_load and bpf_prog_load_flag. I think this is not a good idea. Can we consolidate both into one?

We can. But as I stated in the description, for folks using C API directly, bpf_prog_load almost certainly going to be used and if we change its function signature, all those call sites would break. That's why I added the new interface. I would mark the old one with DEPRECATED attribute. What do you think?

yonghong-song · 2017-11-06T23:43:45Z

src/python/bcc/__init__.py

-                break
-
-        if self.debug & DEBUG_BPF and log_buf.value:
-            print(log_buf.value.decode(), file=sys.stderr)


Not sure whether we have issue or not. The log buffer passed back to python and python can print out. If you printout in C/C++, not sure whether it will actually print out. I have numerous cases where when running python tools, if I added some fprintf in C++ and I did not see the output in the screen, could you double check this?

Yes I did. I tried both a few BCC tools, and FB internal build using Python logging library and everything works well. If you have example that this happens I could help debug.

yonghong-song · 2017-11-07T17:25:43Z

src/cc/libbpf.c

+    if (tmp_log_buf)
+      free(tmp_log_buf);
+    tmp_log_buf_size = LOG_BUF_SIZE;
+    attr.log_level = 1;


One more thing. Do we want integer log_level or boolean do_logging? Providing log_level is more flexible for users as log_level=2 can print out more info than log_level=1 in kernel. Maybe the above line needs to inherit user log_level?

Good point. log_level is already integer, I will do another patch to add a new flag to do log_level=2 for more detailed register state inspection.

If user_buf != NULL, we will have
if (input_log_level == 0)
log_level = 1
else
log_level = input_log_level

Can we do the same thing here?
attr.log_level = (log_level == 0) ? 1 : log_level.

This will cover log_level=2 case for C++ API, where people are more expert...
For Python side, we have log_level=1 based on debug flag settting, I feel this is good enough.
log_level=2 prints reg state for every insn, this is mostly useful to track verifier bug and often happens with bigger programs (C++ side) than python side.

I thing enabling log_level based on that user provided log_buf makes the logic is a bit messy. In brief words, the implemented logic in this patch is:

If user specified positive log_level, use the value for log_level.

Otherwise if load failed, set log_level to be 1.

When log_level enabled due to any above case:

If user provided log_buf, use user's log_buf.

Otherwise allocate use our temporary log buffer.

Do you feel that's a reasonable logic?

This sounds okay, even with "Otherwise if load failed, set log_level to be 1". This is probably what we did earlier. Could you add this brief algorithm as the comments?

@yonghong-song Yeah I was trying to keep current behavior mostly unchanged. Documentation is added on libbpf.h, do you have any suggestion on how to make it better?

yonghong-song

LGTM. Lets keep the existing load_prog error reporting mechanism. It is very useful to people even if they do not specify debug flag.

palmtenor requested review from 4ast, brendangregg, drzaeus77, goldshtn and yonghong-song as code owners November 4, 2017 00:10

palmtenor force-pushed the logmsg branch 2 times, most recently from 4df8a31 to ebe63ea Compare November 4, 2017 02:53

yonghong-song reviewed Nov 6, 2017

View reviewed changes

yonghong-song reviewed Nov 7, 2017

View reviewed changes

palmtenor force-pushed the logmsg branch from ebe63ea to 84810cf Compare November 7, 2017 23:51

palmtenor mentioned this pull request Nov 8, 2017

Add flag to enable verifier log_level 2 #1436

Merged

palmtenor added 2 commits November 7, 2017 19:51

Refactor log message handling when loading program

797669f

Use bpf_prog_load_flag in APIs

eea5528

palmtenor force-pushed the logmsg branch from 84810cf to eea5528 Compare November 8, 2017 03:51

yonghong-song approved these changes Nov 8, 2017

View reviewed changes

4ast merged commit 2d33f35 into iovisor:master Nov 8, 2017

palmtenor deleted the logmsg branch November 8, 2017 06:45

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Unify DEBUG_BPF logging to be handled in libbpf #1433

Unify DEBUG_BPF logging to be handled in libbpf #1433

palmtenor commented Nov 4, 2017 •

edited

Loading

yonghong-song Nov 6, 2017

palmtenor Nov 7, 2017

yonghong-song Nov 6, 2017

palmtenor Nov 7, 2017

yonghong-song Nov 6, 2017

palmtenor Nov 7, 2017

yonghong-song Nov 7, 2017

palmtenor Nov 7, 2017

yonghong-song Nov 8, 2017

palmtenor Nov 8, 2017

yonghong-song Nov 8, 2017

palmtenor Nov 8, 2017

yonghong-song left a comment

Unify DEBUG_BPF logging to be handled in libbpf #1433

Unify DEBUG_BPF logging to be handled in libbpf #1433

Conversation

palmtenor commented Nov 4, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

yonghong-song left a comment

Choose a reason for hiding this comment

palmtenor commented Nov 4, 2017 •

edited

Loading