subset not working as expected #258

Open
@jvolkening

Description

Hello,

I've been using kallisto/sleuth for DE analysis of a virus/host dataset. It has been working well for analysing the full transcript set, but now I would like to normalize and analyze only the viral genes separately. I have tried to use the following code to subset the kallisto objects:

# old_path and new_path point to locations of existing and to-be-created HDF5 files
dir.create(dirname(new_path), recursive=TRUE)
read_kallisto_h5(old_path, read_bootstrap=TRUE) %>%
    subset_kallisto(target_ids=viral_ids) %>%
    write_kallisto_hdf5(fname=new_path)

This runs without error. However, after troubleshooting subsequent sleuth errors and dumping the contents of the newly written HDF5 files, I found that all of the datasets in the new HDF5 are empty except for bias_observed, bias_normalized, fld, and the bootstrap datasets. For instance, the ids dataset is empty, which was the initial source of my downstream problems. I double-checked that viral_ids are correct and match the IDs used in kallisto, and sleuth reports the expected number of transcripts filtered.
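For anyone who wants to repeat the check, here is a minimal sketch of one way to list the datasets in the written file. It assumes the Bioconductor package rhdf5 is installed (that package is not mentioned above; any HDF5 dump tool would do), and list_datasets is a hypothetical helper name used only for illustration.

```r
library(rhdf5)

# Hypothetical helper: return one row per dataset in an HDF5 file.
# h5ls() lists every object; the `dim` column shows each dataset's size,
# which is where an empty dataset should be visible.
list_datasets <- function(path) {
  listing <- h5ls(path)
  listing[listing$otype == "H5I_DATASET", c("group", "name", "dclass", "dim")]
}

# Usage, with new_path as defined in the snippet above:
# print(list_datasets(new_path))
```

Comparing this listing for the original file and the subsetted file side by side should show which datasets lost their contents.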

It appears the problem is in subset_kallisto(), since if I don't subset and only read/write directly, the created HDF5 file seems to be valid. I have seen similar issues, e.g. #204, but I'm not sure if they are directly related or not.

I just tried with sleuth installed from GitHub today using devtools, with the same result.

Thanks in advance for any help.
