subset not working as expected

Hello,

I've been using kallisto/sleuth for DE analysis of a virus/host dataset. It has been working well for analyzing the full transcript set, but now I would like to normalize and analyze only the viral genes separately. I tried the following code to subset the kallisto objects:

```R
library(sleuth)
library(magrittr)  # provides %>%

# old_path: existing HDF5 file; new_path: HDF5 file to be created
dir.create(dirname(new_path), recursive=TRUE)
read_kallisto_h5(old_path, read_bootstrap=TRUE) %>%
    subset_kallisto(target_ids=viral_ids) %>%
    write_kallisto_hdf5(fname=new_path)
```

This runs without error, but after troubleshooting subsequent sleuth errors and dumping the contents of the newly written HDF5 files, I found that all of the datasets in the new HDF5 are empty except for `bias_observed`, `bias_normalized`, `fld`, and the bootstrap datasets. For instance, the `ids` dataset is empty, which was the initial source of my downstream problems. I did double-check that `viral_ids` were correct and matched those used in kallisto, and sleuth reports the expected number of transcripts filtered.
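
For anyone who wants to reproduce the check, here is a minimal sketch of one way to compare dataset dimensions between the original and subsetted files. It assumes the Bioconductor `rhdf5` package is installed; `compare_h5_dims` is a hypothetical helper name used only for illustration and is not part of sleuth.

```R
library(rhdf5)

# Hypothetical helper: list every object in two HDF5 files side by side,
# so datasets whose dimensions changed (or vanished) are easy to spot.
compare_h5_dims <- function(old_path, new_path) {
    cols <- c("group", "name", "dim")
    old_contents <- h5ls(old_path)[, cols]
    new_contents <- h5ls(new_path)[, cols]
    # all=TRUE keeps objects that are present in only one of the files
    merge(old_contents, new_contents, by=c("group", "name"),
          all=TRUE, suffixes=c(".old", ".new"))
}
```

Calling `compare_h5_dims(old_path, new_path)` returns a data frame with `dim.old` and `dim.new` columns; rows where the two differ unexpectedly point to the affected datasets.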

It appears the problem is in `subset_kallisto()`, since if I don't subset and only read/write directly, the created HDF5 file seems to be valid. I have seen similar issues, e.g. #204, but I'm not sure if they are directly related or not.

I just tried with `sleuth` installed from GitHub today using `devtools`, with the same result.

Thanks in advance for any help.

subset not working as expected #258