Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor process parsing to accommodate commands with spaces #2751

Merged
merged 2 commits into from
Jan 1, 2021

Conversation

iwankgb
Copy link
Collaborator

@iwankgb iwankgb commented Dec 4, 2020

Fixes #2750. Fixes #2770.

@google-cla google-cla bot added the cla: yes label Dec 4, 2020
@iwankgb iwankgb force-pushed the space_is_allowed_in_command_name branch 3 times, most recently from 4af47c1 to e9385cd Compare December 4, 2020 00:30
@Creatone
Copy link
Collaborator

@bobbypage

@bobbypage bobbypage self-assigned this Dec 17, 2020
if err != nil {
return nil, fmt.Errorf("invalid pid %q: %v", fields[1], err)
}
cmd += fmt.Sprintf(" %s", fields[11])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this necessary?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because I have forgotten that strings.Join() exists :)

if err != nil {
return nil, fmt.Errorf("invalid pid %q: %v", fields[1], err)
fieldNo++
psr, err = strconv.Atoi(fields[fieldNo])
Copy link
Collaborator

@bobbypage bobbypage Dec 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm understanding the logic here, the idea is that the cmd should be at index 10 and the psr should be at index 11 ASSUMING there is no spaces in the cmd name (this was old logic).

The fix here is basically assume that cmd==index10 and psr==index11, but if you parse psr at index11 and get error from strconv.Atoi it means you have might space in the cmd name. In that case, it means psr is at some greater index than 11 and keep advancing fieldNo (which is assumed to represent psr index) until you don't get an error from strconv.Atoi.

Am I understanding the workaround here correctly? If so, I feel like this a bit fragile... what if the command name has a space and the string after the space is a number, e.g. mycmd 1. In that case, wouldn't you assume that "1" actually represents psr value, but actually it's just part of the command name and thus it would parse the output incorrectly?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that parsing logic has been significantly improved and simplified now :)

t.Run(fmt.Sprintf("iteration %d", i), func(tt *testing.T) {
cd := &containerData{}
_, err := cd.parseProcessList("/", true, ps)
// checking *only* parsing errors - otherwise /proc would have to be emulated.
Copy link
Collaborator

@bobbypage bobbypage Dec 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we have some unit tests that check the actual parsing logic and ensure correct data is returned after the parse?

Copy link
Collaborator Author

@iwankgb iwankgb Dec 26, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately not. Loop in parseProcessList() besides parsing out put of ps:

  • reads from cgroup fs
  • reads from proc fs

I will try to break it even more so that at least some of the code can be tested.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is new shiny test function just below :)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awesome, thanks for adding tests here!

@bobbypage
Copy link
Collaborator

Left a few comments above. BTW, side question (out of scope for this PR): why are we even parsing the results from running ps? Wouldn't it make more sense to gather this data from /proc or something like that directly?

@iwankgb
Copy link
Collaborator Author

iwankgb commented Dec 21, 2020

I was as surprised as you are. Looking at what I did here makes me think it would be good idea to get rid of output parsing.

@iwankgb
Copy link
Collaborator Author

iwankgb commented Dec 26, 2020

@bobbypage, I hope you will find this code less enigmatic and prone to error :)

manager/container.go Outdated Show resolved Hide resolved
manager/container.go Outdated Show resolved Hide resolved
// According to `man ps`: The following user-defined format specifiers may contain spaces: args, cmd, comm, command,
// fname, ucmd, ucomm, lstart, bsdstart, start.
// Therefore we need to be able to parse comm that consists of multiple space-separated parts.
info.Cmd = strings.Join(fields[10:len(fields)-2], " ")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good idea re. using fields[10:len(fields)-2]as cmd :)

@bobbypage
Copy link
Collaborator

Thanks for your changes here, I think the updated parsing logic is a lot more clear and less error prone! Left a few smaller comments.

@iwankgb
Copy link
Collaborator Author

iwankgb commented Dec 29, 2020

@bobbypage, as you can see this PR fixes more than one issue now. I came across #2770 while trying to verify that process are really displayed as expected. They have not been, so I started debugging. I hope it's fine with you - these are still closely related changes to same file.

// Memory has been chosen, as it is one of the default cgroups that is enabled for most containers.
var cgroupPathRegExp = regexp.MustCompile(`memory[^:]*:(.*?)[,;$]`)
// Memory has been chosen, as it is one of the default cgroups that is enabled for most containers...
var cgroupMemoryPathRegExp = regexp.MustCompile(`memory[^:]*:(.*?)[,;$]`)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

random comment, out of scope for this PR: I wonder if this regex is really needed here. The format seems pretty easily parsable,

e.g. from my system

11:pids:/system.slice/ssh.service,10:devices:/system.slice/ssh.service,4:cpu,cpuacct:/system.slice/ssh.service,2:memory:/system.slice/ssh.service,1:name=systemd:/system.slice/ssh.service,0::/system.slice/ssh.service

I think you could just split the string by commas and remove the leading hierarchy-ID, e.g. 11:. We can also see which cgroup controller mounted by reading /proc/cgroups. Regex is totally fine for now (and what is used currently) but maybe worth a TODO as future improvement :)

@bobbypage
Copy link
Collaborator

Few small comments, but LGTM otherwise. Thanks again for all your work on this!

Can you please squash into maybe two commits, one for the changes of parsing and one for the memory cgroup fix?

@iwankgb iwankgb force-pushed the space_is_allowed_in_command_name branch from 76aae80 to c4b6683 Compare December 31, 2020 10:43
Signed-off-by: Maciej "Iwan" Iwanowski <maciej.iwanowski@critical.today>
@iwankgb iwankgb force-pushed the space_is_allowed_in_command_name branch from c4b6683 to 5dbc0d9 Compare December 31, 2020 10:46
Signed-off-by: Maciej "Iwan" Iwanowski <maciej.iwanowski@critical.today>
@iwankgb iwankgb force-pushed the space_is_allowed_in_command_name branch from 5dbc0d9 to 0f6fe07 Compare December 31, 2020 10:52
@bobbypage
Copy link
Collaborator

Thanks again for all your work on this! LGTM!

@bobbypage bobbypage changed the title Command may contain spaces Refactor process parsing to accommodate commands with spaces Jan 1, 2021
@bobbypage bobbypage merged commit 7a9ceab into google:master Jan 1, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

memory cgroup controller disabled by default on some systems Spaces in command name
3 participants