Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move implementation of the metadata pool to core #280

Closed
Closed
Show file tree
Hide file tree
Changes from 1 commit
Commits
Show all changes
27 commits
Select commit Hold shift + click to select a range
f8e349f
Attach object type to basic_object
ErlendHaa Jun 17, 2020
df589fd
Formalize and enrich the index
ErlendHaa Jun 30, 2020
f715cc7
Add lrsh consistency checks to the index_entry
ErlendHaa Jun 30, 2020
ff05946
Pass logical file directly as lookup parameter
achaikou Jul 15, 2020
fc4f6a8
Replace on-fly object creation in tests with files
achaikou Jul 15, 2020
47f73d6
Use dl:basic_object as attic for Python types
ErlendHaa Jul 9, 2020
4153edf
Move the channel -> frame referencing out of load
ErlendHaa Jul 9, 2020
6c6d6f0
Make dl::object_set a self-parsing type
ErlendHaa Jul 9, 2020
29f6559
Add dl::object_set::at
ErlendHaa Jul 10, 2020
e7a16dd
Single out encrypted records in dl::findoffsets
ErlendHaa Jul 10, 2020
ebddc00
Add a queryable pool of metadata object
ErlendHaa Jul 10, 2020
bed2b74
Add dl::pool:types
ErlendHaa Jul 13, 2020
71189aa
Let dl::pool::match use external matchers
ErlendHaa Jul 22, 2020
8715bbb
Add an regex_matcher that uses python's re module
ErlendHaa Jul 23, 2020
0563a0f
Fix dl::value_vector comparison for monostate
ErlendHaa Jul 22, 2020
f6b0442
Let dl::pool be the driver for metadata in dlis
ErlendHaa Jul 13, 2020
cb9c4bc
Set infocodes on dl::object_attribute when parsing
ErlendHaa Jul 15, 2020
63b8500
Move object_attribute.invariant to attribute.info
ErlendHaa Jul 17, 2020
ee36163
Update encodings tests
ErlendHaa Jul 22, 2020
3e6582e
fixup! Set infocodes on dl::object_attribute when parsing
ErlendHaa Jul 23, 2020
f0b7402
fixup! Let dl::pool::match use external matchers
ErlendHaa Jul 24, 2020
785a1db
Pass std::vector< dl::index_entry >& to findfdata
ErlendHaa Jul 24, 2020
272a09d
Make std::vector< dl::index_entry > an opaque type
ErlendHaa Jul 24, 2020
4a4c4af
fixup! Set infocodes on dl::object_attribute when parsing
ErlendHaa Jul 24, 2020
b99f75f
Use dlis.object when creating obj-to-obj links
ErlendHaa Jul 24, 2020
12c69a1
Create fdata_index in findfdata
ErlendHaa Jul 24, 2020
028adfb
Add a dl::pool::match that only matches on type
ErlendHaa Jul 24, 2020
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
Add lrsh consistency checks to the index_entry
When index_entry contains attribute and code it makes sense that the
sanity checking of these header fields are done when creating the index.
Also these checks help validate the index, which is useful if we want to
allow an incomplete index to propagate further.
  • Loading branch information
ErlendHaa committed Jul 21, 2020
commit f715cc737930cd02dc3b6aee2049f7aa7c54a092
14 changes: 14 additions & 0 deletions lib/extension/dlisio/ext/types.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -498,11 +498,25 @@ const char* parse_set_component( const char*,
* Bodies. Rather, it contains the position of a Logical Record within a file,
* the Logical Record Segment Header attributes - and code (logical record
* type).
*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A general thing about this commit.
It gives the impression that with this commit "consistent" attribute is supposed to work.
But given how attr_consistent and type_consistent are implemented, this is misleading...

So maybe either implement it, or put a note into commit message and docs for now and implement later (update issue #224 to include not only explicitness problem)?

If something, one of the test files (the only one driven from real file situation) is under test_physical_layout/test_lrs_atributes_inconsistency
Can add more if needed

(Other option is just to drop this commit from this PR altogether and solve consistency + predecessor things separately)

* consistency indicates if the header information code and attributes are
* consistent across all Logical Record Segment Headers.
*
* I.e. code should hold the same value in all lrsh. While all the bits in
* attributes should be the same across all lrsh, with exception of the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nope. There is no requirement for trailing length, checksum and especially padding to be consistent across all attributes.
Basically it's only explicitness and encryption (not even encryption packet) that must be the same.

* predecessor and successor bits which should have the following structure:
*
* lrsh predecessor successor
* 0 0 1
* 1 1 1
* ..
* n 1 0
*/
struct index_entry {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just as a formal complain: I still believe this must be a part of the record now. You even repeat stuff like is_explicit and is_encrypted.
But I am fine if that's planned, doesn't have to be in this PR.

long long tell;
std::uint8_t code; // Logical Record Type - Appendix A
std::uint8_t attributes;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mention explicitly that these are attributes of the very first segment, because they won't be the same across all the segments, but you still store them all. Basically explicit and encrypted are the only ones that make sense from the outside as they are (supposed to be) consistent.

bool consistent;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a thought: as we are digging deeper into errors, that might be possible to have consistent bit set separately for all 3 things that should be consistent (explicitness, encryption, code).
On the other side, inconsistency is not the problem that happens in real files, so this is probably too big overshooting.


bool isexplicit() const noexcept (true);
bool isencrypted() const noexcept (true);
Expand Down
42 changes: 33 additions & 9 deletions lib/src/io.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -397,44 +397,68 @@ record& extract(stream& file, long long tell, long long bytes, record& rec) noex

dl::stream_offsets findoffsets(dl::stream& file) noexcept (false) {
dl::stream_offsets idx;
shortvec< std::uint8_t > attributes;
shortvec< int > types;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uint8_t to avoid surprises?

bool consistent = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need it. Or at least call it const


std::int64_t offset = 0;
std::int64_t offset_segment = 0;
std::int64_t offset_record = offset_segment;
char buffer[ DLIS_LRSH_SIZE ];

int len = 0;
while (true) {
file.seek(offset);
file.seek(offset_segment);
file.read(buffer, DLIS_LRSH_SIZE);
if (file.eof())
if (file.eof()) {
// TODO: check attributes and types -> should be empty
break;
}

int type;
std::uint8_t attrs;
dlis_lrsh( buffer, &len, &attrs, &type );

attributes.push_back( attrs );
types.push_back( type );

int isexplicit = attrs & DLIS_SEGATTR_EXFMTLR;

// Start of a new record
if (not (attrs & DLIS_SEGATTR_PREDSEG)) {
if (isexplicit and type == 0 and idx.explicits.size()) {
/*
* Wrap up when we encounter a EFLR of type FILE-HEADER that is
* NOT the first Logical Record. More precisely we expect the
* NOT the first Logical offset_record. More precisely we expect the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, still Logical Record

* _first_ LR we encounter to be a FILE-HEADER. We gather up
* this LR and all successive LR's until we encounter another
* FILE-HEADER.
*/
file.seek( offset );
file.seek( offset_segment );
break;
}
offset_record = offset_segment;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we also check here that attrs and types have just 1 element (or, if you move adding them to the list for after this check - if they are empty?)
Because if they are not, something is screwed up, as this is supposed to be the first record.
I am thinking about situation from #224 (and test test_physical_layout/test_lrs_atributes_inconsistency )
Right now this file would seem normal (tell would be in correct place, consistency is not checked - and even if it would be, inconsistent marker would be set for wrong index entry tell.)

Also, checking that if there is a predecessor, then attrs and types must not be empty would help dealing with situation in #222 . Otherwise we might get a very confusing index

But may be taken as a separate PR along with consistency. I can create missing test files.


offset_segment += len;

// We reached the last LRS in the current LR - check consistency and
// add to index
if (not (attrs & DLIS_SEGATTR_SUCCSEG)) {
index_entry entry;
entry.tell = offset;
entry.code = type;
entry.attributes = attrs;
entry.tell = offset_record;
entry.code = types.front();
entry.attributes = attributes.front();
entry.consistent = consistent;

if (not attr_consistent( attributes )) entry.consistent = false;
if (not type_consistent( types )) entry.consistent = false;

if (entry.isexplicit()) idx.explicits.push_back(entry);
else idx.implicits.push_back(entry);

attributes.clear();
types.clear();
}
offset += len;
}
return idx;
}
Expand Down