read gluster log to surface glusterfs plugin errors properly in describe events #24808
Conversation
Can one of the admins verify that this patch is reasonable to test? If so, please reply "ok to test". This message may repeat a few times in short succession due to jenkinsci/ghprb-plugin#292. Sorry. Otherwise, if this message is too spammy, please complain to ixdy.
@screeley44 the new message is a lot more helpful -- how do you feel about this vs. your other patch?
@pmorie - I like this approach much better; it's much cleaner and gives a better error message
// adding log-level ERROR to remove noise
// and more specific log path so each pod has
// its own log based on PV + Pod
log := path.Join(p, b.pod.Name + "-glusterfs.log")
the log file is now computed in two places; perhaps this should be a property of the mounter
@pmorie - not sure I understand what you mean by 'computed in two places' - can you explain a bit more?
adding Pod.Name could raise the risk of exceeding the PATH_MAX limit, see commit 8bbc86d
@rootfs - yeah, I was thinking of those cases. Although probably rare, multiple pods could use the same PV, which would clobber the log and maybe even make it hard to pinpoint specific errors - so I was thinking of a way to better isolate the log while still letting the user find it easily. Maybe it's overkill? The pod name cannot be longer than 253 characters, so what if I substringed the pod name down to the first 32 characters whenever it was over some length X?
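For illustration, a minimal sketch of that substring idea (the helper name and the 32-character cap are hypothetical choices, not code from this PR):

```go
package main

import "fmt"

// truncatePodName caps the pod-name component used in the log path so a
// long pod name cannot push the full path toward PATH_MAX.
// (Hypothetical helper; the cap of 32 is an illustrative choice.)
func truncatePodName(podName string, max int) string {
	if len(podName) <= max {
		return podName
	}
	return podName[:max]
}

func main() {
	long := "a-very-long-pod-name-that-keeps-going-and-going"
	fmt.Println(truncatePodName(long, 32)) // prints the first 32 characters
}
```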
yes, this could be rare; let's keep it that way then
+1
@kubernetes/sig-storage
// it all goes in a log file, we will read the log file
logerror := readGlusterLog(log, b.pod.Name)
if logerror != nil {
	glog.Errorf("glusterfs: mount failed: %v", logerror)
don't need to log this twice - the error is propagated to the caller
Hmm.. this log parsing business makes me very uncomfortable. Have we really exhausted all options? If so, I think we should file a bug (or post a patch?) against the appropriate glusterfs util.
@swagiaal - yes, from what I can tell, given the current functionality and options available, there are no parameters you can pass to the gluster mount to get the errors back; the default behavior is to send them to log files. You either specify the log file as a parameter or it creates a default one for you.
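For context, a rough sketch of how the mount options end up carrying the log settings, since the log file is the only channel for error output (option names follow the mount.glusterfs man page; the exact wiring in this PR may differ):

```go
package main

import (
	"fmt"
	"path"
)

func main() {
	// Illustrative values; in the plugin these come from the mounter.
	dir := "/var/lib/kubelet/pods/<uid>/volumes/kubernetes.io~glusterfs/myvol"
	podName := "mypod"

	// mount.glusterfs cannot return error text to the caller; it only writes
	// to a log, so we choose a per-PV+Pod log path and pass it as a mount
	// option, with log-level=ERROR to cut noise.
	log := path.Join(dir, podName+"-glusterfs.log")
	options := []string{"log-level=ERROR", "log-file=" + log}
	fmt.Println(options)
}
```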
// Since gluster does not return error text
// it all goes in a log file, we will read the log file
logerror := readGlusterLog(log, b.pod.Name)
if logerror != nil {
if no /sbin/mount.glusterfs is on your node, mount will fail (as reported by errs) and readGlusterLog will return an error (failed to open log file), but since errs is not returned, the real cause is not reported.
good point... added errs back in @rootfs and updated the new describe event messages above
There is some work being done to add a --verbose flag, but it is still unclear at this time whether it will be able to send the error-specific log entries back to the caller and/or give us what we are looking for: http://review.gluster.org/#/c/11469/. It might be worthwhile to follow this PR's approach until the upstream solution is fully worked out, and then change over at that time.
That would be fine, just make sure to open a tracking issue to remove this code and use the verbose flag once that option is available.
+1 @screeley44 - if you are unsure, file an issue upstream asking for exactly what you need. If it is covered by that patch, they will point that out.
@swagiaal - yep, @rootfs and I have been in conversations with them today and will continue to follow up and create the necessary issues as needed. Thanks
That is great... thanks, guys!
// readGlusterLog will take the last 2 lines of the log file
// on failure of gluster SetUp and return those so kubelet can
// properly expose them
// return nil on any failure
func readGlusterLog(path string, podName string) error {
Check out https://github.com/hpcloud/tail
@pmorie - I tested with hpcloud/tail, and although it did tail the log, it was a bit clunky in how it acted with our volumes and pretty limited in overall functionality. It seems like overkill for what we are attempting to accomplish, and I would argue it's not worth integrating another dependency for what is hopefully a short-term fix until mount.glusterfs does what we want. Thoughts?
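To show why a dependency feels like overkill here, a self-contained sketch of collecting the last couple of log lines with plain bufio (a simplified illustration, not the exact readGlusterLog from this PR):

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// lastLogLines returns the final n lines of the file at path so they can be
// wrapped into the mount error. (Simplified illustration of the approach.)
func lastLogLines(path string, n int) (string, error) {
	file, err := os.Open(path)
	if err != nil {
		return "", fmt.Errorf("could not open log file %s: %v", path, err)
	}
	defer file.Close()

	var tail []string
	scanner := bufio.NewScanner(file)
	for scanner.Scan() {
		tail = append(tail, scanner.Text())
		if len(tail) > n {
			tail = tail[1:]
		}
	}
	if err := scanner.Err(); err != nil {
		return "", fmt.Errorf("error reading log file %s: %v", path, err)
	}
	return strings.Join(tail, "\n"), nil
}

func main() {
	lines, err := lastLogLines("/tmp/example-glusterfs.log", 2)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(lines)
}
```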
Added reminder issue #26367 to revert this PR after the new functionality is added to the mount.glusterfs call
logerror := readGlusterLog(log, b.pod.Name)
if logerror != nil {
	// return fmt.Errorf("glusterfs: mount failed: %v", logerror)
	return fmt.Errorf("glusterfs: mount failed: %v the following error information was pulled from the log to help resolve this issue: %v", errs, logerror)
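To make the resulting message concrete, a tiny illustration of how the two errors combine (both error values below are invented placeholders, not real mount or gluster output):

```go
package main

import (
	"errors"
	"fmt"
)

func main() {
	// Placeholders standing in for the real mount error and the text
	// pulled from the glusterfs log.
	errs := errors.New("mount failed: exit status 1")
	logerror := errors.New("could not reach gluster server (placeholder log line)")

	// Same shape as the format string in the diff above: the original mount
	// error is kept, and the log-derived detail is appended.
	err := fmt.Errorf("glusterfs: mount failed: %v the following error information was pulled from the log to help resolve this issue: %v", errs, logerror)
	fmt.Println(err)
}
```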
Clarify that this is the glusterfs log and I will tag this LGTM.
@pmorie - does that look better?
@screeley44 one last nit and I will tag. Sorry for the delay; I think this is a great patch.
@k8s-bot test this issue: #IGNORE
I couldn't determine the root cause of the failure and couldn't find a matching flake.
@screeley44 need to run gofmt
@screeley44 (i.e. the build log above: http://pr-test.k8s.io/24808/kubernetes-pull-build-test-e2e-gce/42856/build-log.txt) - see the build log buttons
@pmorie can you tag again? I ran gofmt and had to change a few things
@screeley44 verify-boilerplate failed:
@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]
GCE e2e build/test passed for commit a36cd3d.
Automatic merge from submit-queue
glusterfs.go does not properly expose errors, since all mount errors go to a log file. I propose we read the log file to expose the errors without asking users to 'go look at this log'.
This PR does the following:
old behavior: a failed mount surfaced only a generic error, leaving the user to go find and read the glusterfs log file
improved behavior: (updated after suggestions from community) the relevant error lines are read from the glusterfs log and included in the mount failure message, so they show up directly in describe events
This PR is also an alternate approach to #24624.