-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move implementation of the metadata pool to core #280
Changes from 1 commit
f8e349f
df589fd
f715cc7
ff05946
fc4f6a8
47f73d6
4153edf
6c6d6f0
29f6559
e7a16dd
ebddc00
bed2b74
71189aa
8715bbb
0563a0f
f6b0442
cb9c4bc
63b8500
ee36163
3e6582e
f0b7402
785a1db
272a09d
4a4c4af
b99f75f
12c69a1
028adfb
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
When index_entry contains attribute and code it makes sense that the sanity checking of these header fields are done when creating the index. Also these checks help validate the index, which is useful if we want to allow an incomplete index to propagate further.
- Loading branch information
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -498,11 +498,25 @@ const char* parse_set_component( const char*, | |
* Bodies. Rather, it contains the position of a Logical Record within a file, | ||
* the Logical Record Segment Header attributes - and code (logical record | ||
* type). | ||
* | ||
* consistency indicates if the header information code and attributes are | ||
* consistent across all Logical Record Segment Headers. | ||
* | ||
* I.e. code should hold the same value in all lrsh. While all the bits in | ||
* attributes should be the same across all lrsh, with exception of the | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. nope. There is no requirement for trailing length, checksum and especially padding to be consistent across all attributes. |
||
* predecessor and successor bits which should have the following structure: | ||
* | ||
* lrsh predecessor successor | ||
* 0 0 1 | ||
* 1 1 1 | ||
* .. | ||
* n 1 0 | ||
*/ | ||
struct index_entry { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Just as a formal complain: I still believe this must be a part of the record now. You even repeat stuff like is_explicit and is_encrypted. |
||
long long tell; | ||
std::uint8_t code; // Logical Record Type - Appendix A | ||
std::uint8_t attributes; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Mention explicitly that these are attributes of the very first segment, because they won't be the same across all the segments, but you still store them all. Basically explicit and encrypted are the only ones that make sense from the outside as they are (supposed to be) consistent. |
||
bool consistent; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Just a thought: as we are digging deeper into errors, that might be possible to have consistent bit set separately for all 3 things that should be consistent (explicitness, encryption, code). |
||
|
||
bool isexplicit() const noexcept (true); | ||
bool isencrypted() const noexcept (true); | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -397,44 +397,68 @@ record& extract(stream& file, long long tell, long long bytes, record& rec) noex | |
|
||
dl::stream_offsets findoffsets(dl::stream& file) noexcept (false) { | ||
dl::stream_offsets idx; | ||
shortvec< std::uint8_t > attributes; | ||
shortvec< int > types; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. uint8_t to avoid surprises? |
||
bool consistent = true; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You don't need it. Or at least call it const |
||
|
||
std::int64_t offset = 0; | ||
std::int64_t offset_segment = 0; | ||
std::int64_t offset_record = offset_segment; | ||
char buffer[ DLIS_LRSH_SIZE ]; | ||
|
||
int len = 0; | ||
while (true) { | ||
file.seek(offset); | ||
file.seek(offset_segment); | ||
file.read(buffer, DLIS_LRSH_SIZE); | ||
if (file.eof()) | ||
if (file.eof()) { | ||
// TODO: check attributes and types -> should be empty | ||
break; | ||
} | ||
|
||
int type; | ||
std::uint8_t attrs; | ||
dlis_lrsh( buffer, &len, &attrs, &type ); | ||
|
||
attributes.push_back( attrs ); | ||
types.push_back( type ); | ||
|
||
int isexplicit = attrs & DLIS_SEGATTR_EXFMTLR; | ||
|
||
// Start of a new record | ||
if (not (attrs & DLIS_SEGATTR_PREDSEG)) { | ||
if (isexplicit and type == 0 and idx.explicits.size()) { | ||
/* | ||
* Wrap up when we encounter a EFLR of type FILE-HEADER that is | ||
* NOT the first Logical Record. More precisely we expect the | ||
* NOT the first Logical offset_record. More precisely we expect the | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. no, still Logical Record |
||
* _first_ LR we encounter to be a FILE-HEADER. We gather up | ||
* this LR and all successive LR's until we encounter another | ||
* FILE-HEADER. | ||
*/ | ||
file.seek( offset ); | ||
file.seek( offset_segment ); | ||
break; | ||
} | ||
offset_record = offset_segment; | ||
} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Could we also check here that attrs and types have just 1 element (or, if you move adding them to the list for after this check - if they are empty?) Also, checking that if there is a predecessor, then attrs and types must not be empty would help dealing with situation in #222 . Otherwise we might get a very confusing index But may be taken as a separate PR along with consistency. I can create missing test files. |
||
|
||
offset_segment += len; | ||
|
||
// We reached the last LRS in the current LR - check consistency and | ||
// add to index | ||
if (not (attrs & DLIS_SEGATTR_SUCCSEG)) { | ||
index_entry entry; | ||
entry.tell = offset; | ||
entry.code = type; | ||
entry.attributes = attrs; | ||
entry.tell = offset_record; | ||
entry.code = types.front(); | ||
entry.attributes = attributes.front(); | ||
entry.consistent = consistent; | ||
|
||
if (not attr_consistent( attributes )) entry.consistent = false; | ||
if (not type_consistent( types )) entry.consistent = false; | ||
|
||
if (entry.isexplicit()) idx.explicits.push_back(entry); | ||
else idx.implicits.push_back(entry); | ||
|
||
attributes.clear(); | ||
types.clear(); | ||
} | ||
offset += len; | ||
} | ||
return idx; | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A general thing about this commit.
It gives the impression that with this commit "consistent" attribute is supposed to work.
But given how attr_consistent and type_consistent are implemented, this is misleading...
So maybe either implement it, or put a note into commit message and docs for now and implement later (update issue #224 to include not only explicitness problem)?
If something, one of the test files (the only one driven from real file situation) is under
test_physical_layout/test_lrs_atributes_inconsistency
Can add more if needed
(Other option is just to drop this commit from this PR altogether and solve consistency + predecessor things separately)