Unused resources / stuck progress #139
Same here.
Hi, thanks for testing. Yes, there are some bugs with high values of RAM. At first I thought this was related to the mutex used to construct the bloom filter with a high number of threads, but it seems to be a different bug. I will try to figure out how to resolve this ASAP. Best regards!
Why this bug is present: the bloom filter generation needs a mutex to support multi-threaded access. That mutex means the threads don't use 100% of the CPU. The first 5% of the bPtable generation is slower because it is building 2 bloom filters at the same time; after that point the speed is close to 100% of the CPU. I just added some changes to the code to get better control over the multithread switching in the bP table generation; once I finish some basic tests I will publish the code to see if the bug is solved.
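For anyone wondering what that mutex bottleneck looks like, here is a minimal pthread sketch. This is not keyhunt's actual code; `bloom_add`, `worker_insert` and the struct layout are illustrative. The point is that a single lock around the shared bloom-filter insert serializes the worker threads, which is why CPU usage stays below 100% during that phase of the bP table construction.

```c
/* Minimal sketch of a mutex-protected bloom-filter insert (illustrative,
   not keyhunt's real implementation). */
#include <pthread.h>
#include <stdint.h>

typedef struct {
    uint8_t  *bits;   /* shared bit array */
    uint64_t  nbits;  /* number of bits in the filter */
} bloom_t;

static pthread_mutex_t bloom_lock = PTHREAD_MUTEX_INITIALIZER;

/* Set k hash positions for one element; this writes to shared memory,
   so concurrent calls must be synchronized. */
static void bloom_add(bloom_t *bf, uint64_t h1, uint64_t h2, int k) {
    for (int i = 0; i < k; i++) {
        uint64_t pos = (h1 + (uint64_t)i * h2) % bf->nbits;
        bf->bits[pos >> 3] |= (uint8_t)(1u << (pos & 7));
    }
}

/* Each worker thread computes its own share of baby-step points, but the
   shared insert is guarded by one mutex: threads queue here instead of
   running at full speed. */
void worker_insert(bloom_t *bf, uint64_t h1, uint64_t h2, int k) {
    pthread_mutex_lock(&bloom_lock);
    bloom_add(bf, h1, h2, k);
    pthread_mutex_unlock(&bloom_lock);
}
```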
There was a bug
Well, the recommended number of threads is the number of threads your CPU supports. With your 416 CPUs I'm not sure; it may be a good idea to set -t to 416 minus one or two, just to leave some headroom for OS thread switching, but that is up to you. About the values for k and n: I see you already found a combination that fills 8 terabytes of RAM (-n 0x100000000000000 -k 8192). You may want to test some experimental K values like 10240 or 9216, but those values (not exact 2^X values) have had some bugs in the past, see #130 as a reference. I solve those bugs as users report them, because I want users to be able to use any value of K, so if you try those values, test against some known public keys to check speed and hits. Once the 49% bug is solved, I recommend using the -S option to generate the files for your bPTable so they can be read on the fly without generating the table again. The catch is that the 3 generated files together take about as much disk space as the RAM usage, so make sure you have the same 8 or 11 TB available in disk storage to save them.
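As a hypothetical example of how those settings fit together on the 416-core box (only the flags discussed above are shown; the rest of the command line is whatever you already use):

```
./keyhunt <your usual mode/target options> -t 414 -k 8192 -n 0x100000000000000 -S
```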
@Farizov I didn't make it to the end of loading. Today's update, plus the fact that I had not set -S and had not provided enough disk space, meant that today I restarted with the parameters -k 65536 and -n 0x4000000000000. It loads very slowly, but you can see something is happening. I also set -t 208. So far, after 4 hours:
So far, the most I have managed is 3 Ekeys/s (3551908604685192864) with the parameters -k 512 -n 0x4000000000000, which uses only a small part of the available RAM. Well... I will have to wait... I hope that after 5% it will actually go smoothly.
Yeap, 3 Ekeys/s looks like nothing for the resources you have.
Not sure, but I guess his RAM is DDR3 and the CPU is perhaps from an old Xeon generation, so in a real-world comparison he will probably barely get 18 to 22 Ekeys/s. But with modern DDR4 and the newest generation of CPUs he could break 100 Ekeys/s.
Well, I'm thinking of making some changes here. A while ago, at the beginning of this project, I figured out how to reduce the size of the bPtable to about 5% of its original size; the trick was to implement a second bloom filter.
Without it, the bP table would be 20 times bigger. But now, seeing the huge amount of memory your bP table needs (~1.5 terabytes), I think I will add a third bloom filter. So instead of a bP table of 1.5 terabytes, there would be a third bloom filter of 19 GB and a bP table of just 72 GB, which is only 91 GB instead of 1.5 terabytes. This will help a lot because you get more space for the main bloom filter; each item in a bloom filter takes about 28 bits. Indeed, I could extend this trick of chained bloom filters ad infinitum, but what is the optimal number of levels, and how does it affect overall performance? (See the sketch after this comment for the general idea.)
I was talking with @iceland2k14 and we estimated a speed of around 32 to 64 Exakeys/s, but that was just a quick calculation; a real estimate needs the CPU clock and RAM speed, and I don't know much about the hardware specs. Regards!
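To make the "chained bloom filters" idea concrete, here is a rough C sketch. It is not keyhunt's real data layout; `bloom_t`, `bloom_maybe_contains` and `cascade_lookup` are illustrative names. Each level is a cheap probabilistic pre-filter, and only candidates that pass every level reach the exact, much smaller bP-table search.

```c
/* Rough sketch of a chained (cascaded) bloom-filter lookup, illustrative only. */
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    const uint8_t *bits;     /* bit array of this level */
    uint64_t       nbits;    /* size in bits */
    int            nhashes;  /* number of hash probes per element */
} bloom_t;

/* Returns false if the element is definitely absent, true if it may be present. */
static bool bloom_maybe_contains(const bloom_t *bf, uint64_t h1, uint64_t h2) {
    for (int i = 0; i < bf->nhashes; i++) {
        uint64_t pos = (h1 + (uint64_t)i * h2) % bf->nbits;
        if (!(bf->bits[pos >> 3] & (1u << (pos & 7))))
            return false;    /* definitely not in the set */
    }
    return true;             /* possibly in the set */
}

/* Cascade: bloom #1 -> bloom #2 -> bloom #3 -> exact bP-table search.
   Most negative lookups are rejected by the cheap bit arrays and never
   touch the large exact table. */
bool cascade_lookup(const bloom_t *levels, int nlevels,
                    uint64_t h1, uint64_t h2,
                    bool (*bp_table_search)(uint64_t, uint64_t)) {
    for (int i = 0; i < nlevels; i++)
        if (!bloom_maybe_contains(&levels[i], h1, h2))
            return false;                 /* early out */
    return bp_table_search(h1, h2);       /* rare: resolve the candidate exactly */
}
```

The trade-off each extra level introduces is exactly the question raised above: every added filter costs a few more hash probes per lookup, but it lets the exact table behind it shrink dramatically, so the optimal number of levels depends on how often lookups actually fall through to the last stage.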
I don't know what your point is; that speed is incorrect, it is WRONG for that version with that K value, please READ THE README. If you exceed the max value of K, the program can show unknown behavior: it can have suboptimal performance, or in the worst cases you can miss some hits and get an incorrect SPEED. For the default N value, the maximum K value is 4096.
In the last commit 3381843 I added the 3rd bloom filter; it saves 20% of RAM. I also changed the thread synchronization for the bPtable construction a little (3 bloom filters and the bPtable). @ZielarSRC Those changes will help fill the 11 TB of RAM faster, and now you can choose any K value, even values that are not exact 2^X values. Regards!
Hi,
Having some spare time, I decided to test your project on hardware with 416 CPUs and 11 TB of RAM.
So far, the setting "-k 512 -n 0x4000000000000" has gone quite smoothly and produced 3 Ekeys/s (3551908604685192864 keys/s), but that uses only a small part of the available memory. At the moment I am trying "-k 2048 -n 0x4000000000000", but it has been 24 hours since the start and the process is stuck at 49%:
When I checked earlier, out of the 413 available CPUs only number 11 was at 100% utilization, and now I can see that none of them are doing anything. The process list also shows that resources are still available... So what's wrong?
In your opinion, what configuration would be most effective for such hardware? -k, -n, -t?