Hello,
I've been using kallisto/sleuth for DE analysis of a virus/host dataset. It has been working well for analyzing the full transcript set, but now I would like to normalize and analyze the viral genes on their own. I have tried the following code to subset the kallisto objects:
# old_path and new_path point to locations of existing and to-be-created HDF5 files
# viral_ids holds the target IDs of the viral transcripts to keep
dir.create(dirname(new_path), recursive = TRUE)
read_kallisto_h5(old_path, read_bootstrap = TRUE) %>%
  subset_kallisto(target_ids = viral_ids) %>%
  write_kallisto_hdf5(fname = new_path)
This runs without error, but after troubleshooting subsequent sleuth errors and dumping the contents of the newly written HDF5 files, I found that all of the datasets in the new HDF5 file are empty except for bias_observed, bias_normalized, fld, and the bootstrap datasets. For instance, the ids dataset is empty, which was the initial source of my downstream problems. I did double-check that the viral_ids were correct and matched those used in kallisto, and sleuth reports the expected number of transcripts filtered.
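For anyone who wants to reproduce the check, here is a minimal sketch of one way to list the datasets in an HDF5 file and see their dimensions. It assumes the Bioconductor package rhdf5 is installed; the helper name list_h5_datasets is hypothetical and only for illustration, and this is not necessarily the exact way the files were dumped above.

```r
# Sketch only: assumes the rhdf5 package (Bioconductor) is installed.
library(rhdf5)

# Hypothetical helper: list every group and dataset in an HDF5 file.
# The "dim" column of the result shows the size of each dataset,
# so empty datasets can be spotted by eye.
list_h5_datasets <- function(path) {
  contents <- h5ls(path)
  contents[, c("group", "name", "otype", "dim")]
}
```

Calling list_h5_datasets(new_path) and list_h5_datasets(old_path) and comparing the two listings side by side should show which datasets lost their contents after subsetting.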
It appears the problem is in subset_kallisto(), since if I skip the subsetting and only read and write directly, the created HDF5 file seems to be valid. I have seen similar issues, e.g. #204, but I'm not sure whether they are directly related.
I just tried with sleuth installed from GitHub today using devtools, with the same result.
Thanks in advance for any help.