dataset

In addition to the above small files, we have tested eGap on the following collections:

| Name | Size (GB) | Num Docs | Max DocLen | Ave DocLen | Max LCP | Ave LCP | Download Link |
|---|---|---|---|---|---|---|---|
| Shortreads | 8.0 | 85,899,345 | 100 | 100 | 99 | 27.90 | .tar.gz |
| Longreads | 8.0 | 28,633,115 | 300 | 300 | 299 | 90.28 | .tar.gz |
| Pacbio.1000 | 8.0 | 8,589,934 | 1,000 | 1,000 | 876 | 18.05 | .tar.gz |
| Pacbio | 8.0 | 942,248 | 71,561 | 9,116 | 3,084 | 18.32 | .tar.gz |

We have also used versions of the above collections shortened to 1GB. The shortened versions can be obtained from the above files using simple command-line instructions. Check all md5sums after downloading and extraction.
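The exact commands used for shortening and checksum verification are not given here, so the following is only a minimal Python sketch under stated assumptions: that "shortened to 1GB" means keeping the first 2^30 bytes of the extracted file, and that the reference md5sums are published alongside the downloads. The function names `md5_of_file` and `write_prefix`, and the file names in the usage note, are hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_prefix(src, dst, num_bytes=1 << 30, chunk_size=1 << 20):
    """Copy the first num_bytes bytes of src to dst and return the bytes written.

    Assumption: a shortened collection is a plain byte prefix of the full one.
    If src is smaller than num_bytes, the whole file is copied.
    """
    remaining = num_bytes
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while remaining > 0:
            chunk = fin.read(min(chunk_size, remaining))
            if not chunk:  # reached end of file before num_bytes
                break
            fout.write(chunk)
            remaining -= len(chunk)
    return num_bytes - remaining
```

For example, after extracting an archive to a file named `shortreads` (a placeholder name), one might compare `md5_of_file("shortreads")` against the published checksum and then call `write_prefix("shortreads", "shortreads.1GB")`. Note that a byte prefix may cut the last document in the middle, so the final document of a shortened file may need separate handling.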