3D Classification gets stuck immediately after noise estimation #473
Closed
Description
I am experiencing a problem with a particular data.star file during 3D classification. The run always hangs immediately after noise estimation when launched with that file, regardless of what else is changed (see attached log).
A 3D refinement launched with the same data.star file runs fine. However, using the data.star file written by this refinement as input to another 3D classification again stalls the classification run.
So far I have not been able to find an obvious problem with the data.
Thanks for your help,
Clemens
RELION version: 3.0.5
Precision: BASE=double, CUDA-ACC=single
=== RELION MPI setup ===
- Number of MPI processes = 6
- Number of threads per MPI process = 8
- Total number of threads therefore = 48
- Master (0) runs on host = wbbc148
- Slave 1 runs on host = wbbc148
- Slave 2 runs on host = wbbc148
- Slave 3 runs on host = wbbc148
- Slave 4 runs on host = wbbc148
- Slave 5 runs on host = wbbc148
=================
Running CPU instructions in double precision.
Estimating initial noise spectra
11.82/11.82 min ............................................................~~(,_,">
Setting subset size to 4500 particles
HANGS HERE <<<<<<<<<<<<<<<<<<<<<
Activity
biochem-fan commented on May 23, 2019
Can you find out which particle is problematic? For example, you can split your input particles in half and see which half causes the issue. Repeat this until the dataset becomes sufficiently small.
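The halving step can be scripted instead of done by hand. The following is only a minimal sketch: it assumes a STAR file with a single `loop_` block, as a RELION 3.0 data.star typically has, and with no comment lines among the data rows. The function names and the `_half_a` / `_half_b` output names are made up for this illustration and are not part of RELION.

```python
from pathlib import Path


def split_header_and_rows(lines):
    """Return (header, rows) for a STAR file that holds one loop_ block."""
    # Assumption: the header ends at the last label line (labels start with "_").
    last_label = max(
        i for i, line in enumerate(lines) if line.lstrip().startswith("_")
    )
    header = lines[: last_label + 1]
    # Everything after the labels is particle data; blank lines are dropped.
    rows = [line for line in lines[last_label + 1:] if line.strip()]
    return header, rows


def halve_star(lines):
    """Split the particle rows in two and repeat the header in each half."""
    header, rows = split_header_and_rows(lines)
    mid = len(rows) // 2
    return header + rows[:mid], header + rows[mid:]


def halve_star_file(path):
    """Write <name>_half_a.star and <name>_half_b.star next to the input."""
    path = Path(path)
    halves = halve_star(path.read_text().splitlines())
    out_paths = []
    for suffix, half in zip(("a", "b"), halves):
        out = path.with_name(f"{path.stem}_half_{suffix}.star")
        out.write_text("\n".join(half) + "\n")
        out_paths.append(out)
    return out_paths
```

Calling `halve_star_file("data.star")` and then running the classification on each half, repeating on whichever half still shows the problem, narrows the search by a factor of two per round.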
biochem-fan commented on May 23, 2019
Possibly related: https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=CCPEM;4363c392.1905
clemensgrimm commented on May 23, 2019
biochem-fan commented on May 23, 2019
Unfortunately no.
clemensgrimm commented on May 23, 2019
OK, after splitting the data into four parts, all four parts work. I do not see any NaNs in the data.
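A NaN check like this can also be done programmatically rather than by eye. The sketch below makes the same assumption of a single `loop_` block with whitespace-separated columns; `find_nonfinite` is a hypothetical helper written for this illustration, not a RELION tool. It reports every numeric field that parses to NaN or infinity.

```python
import math


def find_nonfinite(lines):
    """Return (row_index, label, token) for every NaN or inf value found."""
    # Column labels are the first word of each line starting with "_".
    labels = [line.split()[0] for line in lines if line.lstrip().startswith("_")]
    last_label = max(
        i for i, line in enumerate(lines) if line.lstrip().startswith("_")
    )
    rows = [line.split() for line in lines[last_label + 1:] if line.strip()]

    hits = []
    for row_index, tokens in enumerate(rows):
        for label, token in zip(labels, tokens):
            try:
                value = float(token)
            except ValueError:
                continue  # non-numeric column, e.g. an image name
            if not math.isfinite(value):
                hits.append((row_index, label, token))
    return hits
```

Row indices are zero-based and count data rows only, so an entry `(1, "_rlnDefocusU", "nan")` points at the second particle in the file.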
clemensgrimm commented on May 23, 2019
... there was an issue during splitting which limited the subset size to 100 lines.
According to the documentation, the --size_split option should be ignored when giving --nr_split. This might be a bug ...
biochem-fan commented on May 23, 2019
You are right. It is a bug in relion_star_handler. Thank you very much for reporting. I will fix this in the next update. Meanwhile, you can split files (roughly) in half with a text editor.
clemensgrimm commented on May 23, 2019
clemensgrimm commented on May 23, 2019
biochem-fan commented on May 23, 2019
What is the box size and the angular sampling? What happens if you use the non-MPI version?
clemensgrimm commented on May 23, 2019
... after a while the small datasets (1567) have proceeded to the iterations. So 'stalled' is probably better described as pausing for at least several hours. As far as I can remember, there used to be nearly no lag time between noise estimation and the first expectation iteration, at least with related datasets and earlier RELION versions ...
clemensgrimm commented on May 23, 2019
The box size is 504. I am currently trying the non-MPI version ...
clemensgrimm commented on May 23, 2019
clemensgrimm commented on May 23, 2019
As the issue could have been just a lack of patience, I will give the original dataset another try and wait overnight ...
clemensgrimm commented on May 24, 2019