GridEngine parallel jobs/tasks show up as occupying only one core #289
Comments
@mightybigcar: thanks for raising this. We've been seeing this for a while, and it has been discussed before, because it does make you think that only 1 out of X (X = 12, 16, 20, whatever) cores is being allocated. When we had a first crack at it, it turned out that identifying when the whole node is allocated was not very reliable under SGE. Do you have a reliable means to establish that? (We'd like to hear a method.) I could be a beta tester for this, because the need here is similar. |
@mightybigcar also, could you kindly provide an XML file showing how -pe make 16 manifests in there? |
hi @mightybigcar : in PR #295, using the queue names |
@sfranky , Here's the fragment I think you're looking for:
I've also attached the full qstat output. Cheers, |
Fsck! I banana-fingered the touchpad and accidentally closed this. Is it possible to reopen it? Sorry about that... |
Thanks for that, I'll incorporate it into the system! issue reopened 👍 |
btw the queue names that follow this rule are customizable in qtopconf.yaml |
@fgeorgatos Here's a qstat.xml output with the parallel environment info (look for requested_pe). |
Hi @fgeorgatos,
For determining whether a node is fully allocated, I use a rather simplistic approach and look at the qstat.xml for a given node. For example:
|
A job or task submitted with a parallel environment specification such as
-pe make 16
will occupy 16 cores, but qtop only shows them as occupying a single core. This leads naive users to think the cluster is being under-utilized.
PS I might chase this bug myself, but it could take a while. If someone else with actual experience with the qtop code wants to jump on it in the meantime, feel free.
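For anyone picking this up, here is a minimal sketch of the counting I have in mind. It is not taken from the qtop code. It assumes each running job appears in the qstat XML as a `job_list` element with `queue_name` and `slots` children, and that `queue_name` has the form `queue@host`. The sample fragment, the host names, and the `cores_per_host` helper are made up for illustration, and a parallel job spread over several hosts is not handled.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical fragment modelled on qstat XML output. The element names
# (job_list, queue_name, slots, requested_pe) are assumptions and should be
# checked against the XML your Grid Engine version actually produces.
SAMPLE = """<job_info>
  <queue_info>
    <job_list state="running">
      <JB_job_number>1001</JB_job_number>
      <queue_name>all.q@node01</queue_name>
      <slots>16</slots>
      <requested_pe name="make">16</requested_pe>
    </job_list>
    <job_list state="running">
      <JB_job_number>1002</JB_job_number>
      <queue_name>all.q@node02</queue_name>
      <slots>1</slots>
    </job_list>
  </queue_info>
</job_info>"""


def cores_per_host(xml_text):
    """Sum the slots of every listed job, grouped by execution host."""
    root = ET.fromstring(xml_text)
    used = Counter()
    for job in root.iter("job_list"):
        queue = job.findtext("queue_name", default="")
        # "all.q@node01" -> "node01"; a bare name is kept unchanged.
        host = queue.split("@", 1)[-1]
        # Count the job's slots rather than one core per job.
        used[host] += int(job.findtext("slots", default="1"))
    return dict(used)


if __name__ == "__main__":
    for host, slots in sorted(cores_per_host(SAMPLE).items()):
        print(host, slots)
```

The point is only that the per-job slot count, rather than the number of job entries, is what should drive how many cores get drawn as occupied.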