Counters incompatibilities? #2

Closed
marmeladema opened this issue Apr 17, 2020 · 6 comments

@marmeladema

Hello!

First of all, thank you very much for this crate, it's exactly what I was looking for.

I was trying to use various counters and I noticed that some counters appear to be incompatible.
For example, I tried to add the CPU_CYCLES counter to https://github.com/jimblandy/perf-event/blob/master/examples/group.rs and suddenly all values are 0:

    let cycles = Builder::new().group(&group).kind(Hardware::CPU_CYCLES).build()?;

Is this a known issue with this crate that can be fixed? If so, I'd be happy to work on a PR with some guidance.
I believe, but I might be wrong, that perf itself supports it?

Thank you again!

@repk

repk commented Apr 17, 2020

Hi,

Just to confirm that I can reproduce that.

$ ./target/debug/examples/group
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51]
L1D cache misses/references: 110 / 10904 (1%)
branch prediction misses/total: 182 / 8447 (2%)
Counter id 218 has value 10904
Counter id 219 has value 110
Counter id 220 has value 8447
Counter id 221 has value 182
$ vim examples/group.rs
$ cargo build --examples
   Compiling perf-event v0.4.2 (/tmp/perf-event)
warning: unused variable: `cycles`
  --> examples/group.rs:17:9
   |
17 |     let cycles = Builder::new().group(&group).kind(Hardware::CPU_CYCLES).build()?;
   |         ^^^^^^ help: consider prefixing with an underscore: `_cycles`
   |
   = note: `#[warn(unused_variables)]` on by default

    Finished dev [unoptimized + debuginfo] target(s) in 0.37s
$ ./target/debug/examples/group
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51]
L1D cache misses/references: 0 / 0 (NaN%)
branch prediction misses/total: 0 / 0 (NaN%)
Counter id 223 has value 0
Counter id 224 has value 0
Counter id 225 has value 0
Counter id 226 has value 0
Counter id 227 has value 0

Also perf seems to handle that fine:

$ sudo perf stat -e cpu-cycles,instructions,cache-misses -a sleep 10

 Performance counter stats for 'system wide':

     3,203,979,163      cpu-cycles                                                  
     2,737,470,443      instructions              #    0.85  insn per cycle         
        28,993,403      cache-misses                                                

      10.003352898 seconds time elapsed

Thanks

@jimblandy
Owner

Thanks for the bug report - sorry for the slow reply!

If I remove other counters from that group, then I can add Hardware::CPU_CYCLES to the group and still get counts. It seems to be a problem with the group being large?

@jimblandy
Owner

I am not seeing any error codes returned by the kernel. However, it does say that the period of time for which the group was enabled was zero, which suggests that we're running into multiplexing:

         Total time the event was enabled and running. Normally these values are the same. If more
         events are started, then available counter slots on the PMU, then multiplexing happens and
         events run only part of the time. In that case, the time_enabled and time running values can
         be used to scale an estimated value for the count.
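
The scaling described in that excerpt can be sketched in plain Rust. This is only an illustration under assumptions: `RawCount` and `scaled_count` are hypothetical names, not types or functions from the perf-event crate, and the three fields simply mirror the value, time_enabled, and time_running figures the kernel reports. A running time of zero is treated as "never scheduled", which matches the all-zero readings shown above.

```rust
/// Raw figures for one counter, as the kernel might report them.
/// Hypothetical struct for illustration; not part of the perf-event crate.
#[derive(Debug, Clone, Copy, PartialEq)]
struct RawCount {
    /// Events actually counted while the counter was on the PMU.
    value: u64,
    /// Time the counter was enabled (nanoseconds).
    time_enabled: u64,
    /// Time the counter was actually running on the PMU (nanoseconds).
    time_running: u64,
}

/// Estimate what the count would have been without multiplexing.
/// Returns `None` when the counter never ran, since no estimate is possible.
fn scaled_count(raw: RawCount) -> Option<u64> {
    if raw.time_running == 0 {
        // The counter (or its whole group) was never scheduled.
        return None;
    }
    // Widen to u128 so the intermediate product cannot overflow.
    let scaled =
        (raw.value as u128) * (raw.time_enabled as u128) / (raw.time_running as u128);
    Some(scaled as u64)
}
```

A caller could treat a `None` result as a hint that the group did not fit on the PMU, rather than reporting a count of zero.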

@jimblandy
Owner

If you make the CPU_CYCLES counter an independent counter, and don't include it in the group, then it works.

@jimblandy
Owner

This is a kernel limitation, not a problem with the library. From the perf kernel documentation:

Globally pinned events can limit the number of counters available for
other groups. On x86 systems, the NMI watchdog pins a counter by default.
The nmi watchdog can be disabled as root with

    echo 0 > /proc/sys/kernel/nmi_watchdog

If I disable the NMI watchdog as suggested, then a group that contains the CPU_CYCLES counter works fine.
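
For anyone who wants to check the watchdog state from a program before building a large group, here is a minimal sketch that reads the same /proc file. It assumes the file holds `0` or `1` followed by a newline; `parse_nmi_watchdog` and `nmi_watchdog_enabled` are hypothetical helper names, not part of the perf-event crate.

```rust
use std::fs;
use std::io;

/// Path quoted in the kernel documentation above.
const NMI_WATCHDOG_PATH: &str = "/proc/sys/kernel/nmi_watchdog";

/// Interpret the file contents: "0" means disabled, "1" means enabled.
/// Anything else yields `None` rather than a guess.
fn parse_nmi_watchdog(contents: &str) -> Option<bool> {
    match contents.trim() {
        "0" => Some(false),
        "1" => Some(true),
        _ => None,
    }
}

/// Read the current watchdog setting.
/// Fails with an I/O error if the file is missing or unreadable.
fn nmi_watchdog_enabled() -> io::Result<Option<bool>> {
    let contents = fs::read_to_string(NMI_WATCHDOG_PATH)?;
    Ok(parse_nmi_watchdog(&contents))
}
```

If this reports `Some(true)` and a group reads back all zeros, the watchdog holding a counter is one plausible cause worth ruling out.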

I'll make the documentation mention this.

The Linux perf utility seems to be able to recognize when this has happened, and suggest disabling the watchdog (that's how I figured this out). I don't really understand how it knows when something has gone wrong; there are no errors returned from the kernel when the groups example isn't working. The perf source code responsible for the hint isn't clear to me.

@jimblandy
Owner

The Linux perf utility seems to be able to recognize when this has happened, and suggest disabling the watchdog (that's how I figured this out). I don't really understand how it knows when something has gone wrong; there are no errors returned from the kernel when the groups example isn't working. The perf source code responsible for the hint isn't clear to me.

This may be covered by #5.
