-
Notifications
You must be signed in to change notification settings - Fork 2.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor process parsing to accommodate commands with spaces #2751
Refactor process parsing to accommodate commands with spaces #2751
Conversation
4af47c1
to
e9385cd
Compare
manager/container.go
Outdated
if err != nil { | ||
return nil, fmt.Errorf("invalid pid %q: %v", fields[1], err) | ||
} | ||
cmd += fmt.Sprintf(" %s", fields[11]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because I have forgotten that strings.Join()
exists :)
manager/container.go
Outdated
if err != nil { | ||
return nil, fmt.Errorf("invalid pid %q: %v", fields[1], err) | ||
fieldNo++ | ||
psr, err = strconv.Atoi(fields[fieldNo]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I'm understanding the logic here, the idea is that the cmd should be at index 10 and the psr should be at index 11 ASSUMING there is no spaces in the cmd name (this was old logic).
The fix here is basically assume that cmd==index10
and psr==index11
, but if you parse psr at index11 and get error from strconv.Atoi
it means you have might space in the cmd name. In that case, it means psr is at some greater index than 11 and keep advancing fieldNo
(which is assumed to represent psr index) until you don't get an error from strconv.Atoi
.
Am I understanding the workaround here correctly? If so, I feel like this a bit fragile... what if the command name has a space and the string after the space is a number, e.g. mycmd 1
. In that case, wouldn't you assume that "1" actually represents psr value, but actually it's just part of the command name and thus it would parse the output incorrectly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that parsing logic has been significantly improved and simplified now :)
t.Run(fmt.Sprintf("iteration %d", i), func(tt *testing.T) { | ||
cd := &containerData{} | ||
_, err := cd.parseProcessList("/", true, ps) | ||
// checking *only* parsing errors - otherwise /proc would have to be emulated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we have some unit tests that check the actual parsing logic and ensure correct data is returned after the parse?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately not. Loop in parseProcessList()
besides parsing out put of ps
:
reads from cgroup fs- reads from proc fs
I will try to break it even more so that at least some of the code can be tested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is new shiny test function just below :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
awesome, thanks for adding tests here!
Left a few comments above. BTW, side question (out of scope for this PR): why are we even parsing the results from running |
I was as surprised as you are. Looking at what I did here makes me think it would be good idea to get rid of output parsing. |
@bobbypage, I hope you will find this code less enigmatic and prone to error :) |
// According to `man ps`: The following user-defined format specifiers may contain spaces: args, cmd, comm, command, | ||
// fname, ucmd, ucomm, lstart, bsdstart, start. | ||
// Therefore we need to be able to parse comm that consists of multiple space-separated parts. | ||
info.Cmd = strings.Join(fields[10:len(fields)-2], " ") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good idea re. using fields[10:len(fields)-2]
as cmd :)
Thanks for your changes here, I think the updated parsing logic is a lot more clear and less error prone! Left a few smaller comments. |
@bobbypage, as you can see this PR fixes more than one issue now. I came across #2770 while trying to verify that process are really displayed as expected. They have not been, so I started debugging. I hope it's fine with you - these are still closely related changes to same file. |
// Memory has been chosen, as it is one of the default cgroups that is enabled for most containers. | ||
var cgroupPathRegExp = regexp.MustCompile(`memory[^:]*:(.*?)[,;$]`) | ||
// Memory has been chosen, as it is one of the default cgroups that is enabled for most containers... | ||
var cgroupMemoryPathRegExp = regexp.MustCompile(`memory[^:]*:(.*?)[,;$]`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
random comment, out of scope for this PR: I wonder if this regex is really needed here. The format seems pretty easily parsable,
e.g. from my system
11:pids:/system.slice/ssh.service,10:devices:/system.slice/ssh.service,4:cpu,cpuacct:/system.slice/ssh.service,2:memory:/system.slice/ssh.service,1:name=systemd:/system.slice/ssh.service,0::/system.slice/ssh.service
I think you could just split the string by commas and remove the leading hierarchy-ID
, e.g. 11:
. We can also see which cgroup controller mounted by reading /proc/cgroups
. Regex is totally fine for now (and what is used currently) but maybe worth a TODO as future improvement :)
Few small comments, but LGTM otherwise. Thanks again for all your work on this! Can you please squash into maybe two commits, one for the changes of parsing and one for the memory cgroup fix? |
76aae80
to
c4b6683
Compare
Signed-off-by: Maciej "Iwan" Iwanowski <maciej.iwanowski@critical.today>
c4b6683
to
5dbc0d9
Compare
Signed-off-by: Maciej "Iwan" Iwanowski <maciej.iwanowski@critical.today>
5dbc0d9
to
0f6fe07
Compare
Thanks again for all your work on this! LGTM! |
Fixes #2750. Fixes #2770.