Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix #1505, #998, #1225 - results after remove are correct #1528

Merged
merged 6 commits into from
Sep 2, 2021

Conversation

sc1f
Copy link
Contributor

@sc1f sc1f commented Sep 2, 2021

This PR fixes #1505, #998, and #1225 by making changes to t_gstate::get_pkeyed_table, which is used to "seed" new views with the state of the underlying table. However, if the table had removed rows and a string primary key, it would enter a block of code that rebuilt the index column and did so with an offset of 0 if an empty string ("") was in the primary keys, and 1 otherwise. This caused the internal psp_pkey index column to be off-by-one, which does not appear as a problem visually but caused issues when filtering, as Perspective was now returning the wrong rows entirely in the filter.

By throughly testing the logic and removing the offending code block, the issue is fixed and filters after removes and updates should now return the correct rows. This behavior also only materializes when a new view is created after the remove/update, as get_pkeyed_table is only used for new views. Thus, we've added a battery of tests asserting the equality of behavior when a new view is created and when we reuse existing views. This provides confidence that removing the block of code is, in fact, the correct solution and guarantees correct behavior.

Additionally, I've fixed #1225 by making table.size() return the size of the mapping of the gnode state instead of the gnode state master table itself.

A moment to opine; all t_data_tables are allocated as a contiguous chunk of memory, and when rows are removed, the physical row is not removed from the master table, as it would cause a re-allocation. Instead, the row is cleared (its values removed) and its primary key removed from m_mapping, which is a mapping of primary key scalars to row indices on the master table. The row on the master table is then kept for later updates, which will seek to reuse existing rows that have been removed instead of appending new rows and thus causing more allocations.

This means thatt_gstate.m_table.size() is always the totality of all rows appended throughout its lifetime, and will always be >= to the size of m_mapping. In contrast, m_mapping is a hashmap and primary keys can be evicted at will. Thus, m_mapping.size() is always appended rows - removed rows, and is the correct size() of the table to be provided to the external user-facing API. I've held off on making this change for some time in order to better understand the remove mechanics, and I think this batch of fixes is a good chance to correct that behavior.

Finally I had to make some ESLint changes after rebasing to master, and this PR includes those changes as well to Javascript test files (commenting out some unused imports).

@sc1f sc1f added bug Concrete, reproducible bugs C++ labels Sep 2, 2021
@texodus texodus linked an issue Sep 2, 2021 that may be closed by this pull request
Copy link
Member

@texodus texodus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Thanks for the PR!

The arsenal of tests accompanying this change IMO are great! This will help a lot to validate futures changes to the engine with confidence.

@texodus texodus merged commit 5ad08ca into master Sep 2, 2021
@texodus texodus deleted the fix-updates branch September 2, 2021 21:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Concrete, reproducible bugs C++
Projects
None yet
2 participants