-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add data read #85
Add data read #85
Conversation
@stephprince when you get a chance ,could you please do a first code review of this PR to make sure this is heading in the right direction. I now have a first outline of one possible solution for how we might implement read. There is still a lot more work to be done before this PR is ready, but it would be useful if you could take a look before I go any further with this approach. I would start by looking at:
|
@stephprince I just added a documentation page as well, which hopefully helps explain the current proposed design for read so we can review and discuss. |
…t_open Make calling io->open() explicit
Update read example
@stephprince Looking through the remaining ToDo items, I think we can reasonably create separate issues and PRs for those. I think this is now reasonably well tested that we can merge this and create new issues for the missing items. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds good to me! 🎉
This PR is to try and implement the proposed approach for data read from #83 to see if this approach is viable. This PR is experimental right now and should not be merged.
2. Proposed Implementation for reading data arrays
BaseReadData
ReadDatasetWrapper
andReadAttributeWrapper
classes for reading data. This is modified from the proposal, which suggested a singleBaseReadData
class for reading any array (datasets, attributes) from a file.BaseRecordingData
to inherit fromReadDataWrapper
because the two are not compatible right now.BaseReadData
uses the io and the path to access the data, whereasBaseRecordingData
leaves that choice to the I/O backend and stores the references to the dataset. We may or may not want to change this.BaseIO
BaseIO
(or more accurately I remove them) because: 1) I wanted to use the shared_ptr to the I/O rather than a raw pointer, which I can't get from BaseIO, and 2) with the ReadDatasetWrapper this is more approbriately done in the Container class directly.HDF5IO
BaseReadData
forHDF5
but left read logic toHDF5IO
itself so that theReadDatasetWrapper
can remain generic. To make this more manageable, I definedReadAttributeWrapper
separatelyHDF5ReadDataSet
andHDF5ReadAttribute
wrappers can call can call for readgetObjectType
method for getting the storage object type (Group, Dataset, Attribute) for a given pathContainer
io
object on theContainer
so that we can callio->readDataset
andio->readAttribute
in the read methodsNWB types:
TimeSeries
,ElectricalSeries
etc.Container
classes and replace them with access methods that returnBaseReadData
objects instead. This allows for reading in both read and write mode and avoids keeping data in memory that we have already written to disk. For example, inTimeSeries
, these variables would need to change to properties:aqnwb/src/nwb/base/TimeSeries.hpp
Lines 91 to 140 in e873d95
3. Proposed implementation for reading whole
Containers
(e.g., to read anElectricalSeries
)Container
that owns the respective objects, e.g.,NWBFile
owningElectricalSeries
objects to retrieve the objectContainer
to create an instance of the specificContainer
type using only theio
andpath
for theContainer
as input. The specificContainer
classes, such asTimeSeries
will then need to implement a corresponding constructor that usesio
andpath
as input.Step 1: Define the Template Factory Method in
Container
Step 2: Implement the constructors on the specific
Container
classes (e.g.,TimeSeries
)4. Proposed implementation for reading untyped groups (e.g.,
/acquisition
)I'm not sure we'll need do this, since a group by itself does not define data. To access the contents we could define access methods on the parent
Container
class (e.g.,NWBFile
) that owns the untyped group to access its contents.TODO
ReadDatasetWrapper
,ReadAttributeWrapper
(See Add read for neurodata_types, e.g., Container, TimeSeries #91)ReadDatasetWrapper
andReadAttributeWrapper
(I think this could be useful)Next steps
TimeSeries
, VectorData,
Data,
ElementIdentifers,
Device` #124HDF5IO
-specific functions (i.e., not inherited fromBaseIO
:HDF5IO::getH5ObjectType
. Implemented in Add HDF5IO tests for add_read PR #118HDF5IO::getNativeType
. Implemented in Add HDF5IO tests for add_read PR #118HDF5IO::getH5Type
. Implemented in Add HDF5IO tests for add_read PR #118HDF5IO
functions (which are core functions new inBaseIO
). While some of these are partially covered by theDEFINE_FIELDS
read tests, it will be useful to more fully test these separatelyHDF5IO::readDataset
(incl. with start, count, stride, and block parameters set). This is to test that reading subsets of larger arrays works as expected. Implemented in Add HDF5IO tests for add_read PR #118HDF5IO::open(FileMode mode)
test that file modes are being set correctly. Implemented in Add HDF5IO tests for add_read PR #118HDF5IO::readAttribute
. Implemented in Add HDF5IO tests for add_read PR #118HDF5IO::getStorageObjectType
. Implemented in Add HDF5IO tests for add_read PR #118HDF5IO::getGroupObjects
. Implemented in Add HDF5IO tests for add_read PR #118HDF5IO::attributeExists
. Implemented in Add HDF5IO tests for add_read PR #118HDF5IO::objectExists
. Implemented in Add HDF5IO tests for add_read PR #118HDF5IO::readDataset
to make sure it supports all supported types of BaseDataType, in partiuclar also strings and update theTEST_CASE("HDF5IO; read dataset", "[hdf5io]")
andTEST_CASE("HDF5IO; read dataset subset", "[hdf5io]")
(See Add HDF5IO tests for add_read PR #118 )cd build/dev/tests/data
and runnwbinspector . --threshold BEST_PRACTICE_VIOLATION
) see address validation errors in add_read #125DEFINE_FIELD
create virtual functions and resolve this comment https://github.com/NeurodataWithoutBorders/aqnwb/pull/85/files#r1894223365 NOTE: I created Allow virtual function for DEFINE_FIELD #117 for this and added a note to the docs to example this. I think it is fine to address this later.'boost/multi_array.hpp': No such file or directory
DataBlockGeneric
and use onlyDataBlock
https://github.com/NeurodataWithoutBorders/aqnwb/pull/85/files#r1773994249 (see Discuss options to avoid the need forDataBlockGeneric
and use onlyDataBlock
#132)Items moved to new issues
REGISTER_SUBCLASS_WITH_TYPENAME
#115colNames
fromDynamicTable
constructor toinitialize
#129 and also related to Adding columns to DynamicTable results in invalid colNames #107DynamicTable
#130Container
#131DataBlockGeneric
and use onlyDataBlock
#132