Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SEARCH-300 (BTS-789) remove database directory for arangosearch #18420

Merged
merged 14 commits into from
Mar 24, 2023

Conversation

MBkkt
Copy link
Contributor

@MBkkt MBkkt commented Mar 23, 2023

Scope & Purpose

SEARCH-300 remove database directory for arangosearch
and a lot of refactoring

  • 💩 Bugfix
  • 🍕 New feature
  • 🔥 Performance improvement
  • 🔨 Refactoring/simplification

Checklist

  • Tests
    • Regression tests
    • C++ Unit tests
    • integration tests
    • resilience tests
  • 📖 CHANGELOG entry made
  • 📚 documentation written (release notes, API changes, ...)
  • Backports
    • Backport for 3.10: (Please link PR)
    • Backport for 3.9: (Please link PR)
    • Backport for 3.8: (Please link PR)

Related Information

(Please reference tickets / specification / other PRs etc)

  • Docs PR:
  • Enterprise PR:
  • GitHub issue / Jira ticket:
  • Design document:

and a lot of refactoring
@MBkkt MBkkt requested review from jsteemann, gnusi and Dronplane March 23, 2023 13:22
@cla-bot cla-bot bot added the cla-signed label Mar 23, 2023
@MBkkt MBkkt changed the title SEARCH-300 remove database directory for arangosearch SEARCH-300 (BTS-789) remove database directory for arangosearch Mar 23, 2023
@MBkkt MBkkt marked this pull request as ready for review March 23, 2023 13:55
@MBkkt MBkkt requested a review from a team as a code owner March 23, 2023 13:55
@MBkkt MBkkt requested a review from a team March 23, 2023 13:55
arangod/IResearch/IResearchFeature.cpp Outdated Show resolved Hide resolved
arangod/IResearch/IResearchFeature.cpp Outdated Show resolved Hide resolved
arangod/RestServer/FlushFeature.cpp Show resolved Hide resolved
}
return true;
});
stale = static_cast<size_t>(end - it);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT

Suggested change
stale = static_cast<size_t>(end - it);
stale = static_cast<size_t>(std::distance(it, end));

Copy link
Contributor Author

@MBkkt MBkkt Mar 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I completely disagree, if distance will do for each, I want instead of this code will be written ++stale in cycle above

arangod/V8Server/V8DealerFeature.h Outdated Show resolved Hide resolved
Comment on lines +149 to +150
// TODO(MBkkt) Why shouldn't we remove database data
// if exists database with same name?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because this thread executes the deletion at some point after the user issued a db._dropDatabase() call.
Consider the following situation:

  • user executes db._dropDatabase("foo");
  • user executes db._createDatabase("foo");
  • DatabaseManagerThread now finds the deleted database "foo". Shall it remove its application directory now? We cannot be sure about it, because it may be the new database's application directory.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand, but we use id for new databases in path for v8 stuff (query register still use name :()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We unfortunately use the database name for the Foxx app directories.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not always, please check following code

/// @brief return either the name of the database to be used as a folder name,
/// or its id if its name contains special characters and is not fully supported
/// in every OS
[[nodiscard]] static std::string_view getDatabaseDirName(std::string_view name,
                                                         std::string_view id) {
  bool const isOldStyleName = DatabaseNameValidator::isAllowedName(
      /*allowSystem=*/true, /*extendedNames=*/false, name);
  return (isOldStyleName || id.empty()) ? name : id;
}

// compare-exchange failed. try again!
expected = _refCount.load(std::memory_order_relaxed);
}
} while (!_refCount.compare_exchange_weak(v, v + 2, std::memory_order_acquire,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the memory order acquire here and not release?

Copy link
Contributor Author

@MBkkt MBkkt Mar 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we probably want to synchronize with markAsDropped (if don't want should be relaxed), and our +2 not needed for synchronization code.

You can look on intrusive_ptr as an a good example of this pattern
https://www.boost.org/doc/libs/1_81_0/libs/atomic/doc/html/atomic/usage_examples.html#boost_atomic.usage_examples.example_reference_counters

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For -2 we need release because we want to synchronize with isDangling

}

void TRI_vocbase_t::forceUse() { _refCount += 2; }
void TRI_vocbase_t::forceUse() noexcept {
_refCount.fetch_add(2, std::memory_order_relaxed);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it be memory order release here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not

Comment on lines +484 to +486
auto const v = _refCount.load(std::memory_order_acquire);
TRI_ASSERT((v & 1) == 0 || !isSystem());
return v == 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this could be a behavior change and may be unsafe.
It someone manages to set the drop bit on the system database, this function will now return true, but previously it always returned false. It could have unintended side-effects for example in the DatabaseManagerThread, which will remove the database then if this function here returns true.
ofc it would be useful to not allow to set the drop bit for the system database in the first place, but this function here previously provided some extra security.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, please see assert in markAsDropped.
System database is never isDropped, so it's never isDangling

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess ASSERT than should be our choice. As system database should never have delete bit set.

Copy link
Contributor

@Dronplane Dronplane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM in general. See comments and this PR needs tests

arangod/VocBase/vocbase.cpp Show resolved Hide resolved
arangod/VocBase/vocbase.cpp Outdated Show resolved Hide resolved
@MBkkt MBkkt requested review from Dronplane and jsteemann March 23, 2023 16:34
.clang-format Outdated Show resolved Hide resolved
Copy link
Contributor

@Dronplane Dronplane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@jsteemann jsteemann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@KVS85 KVS85 added this to the devel milestone Mar 24, 2023
@KVS85 KVS85 added the 3 ArangoSearch Views label Mar 24, 2023
@Dronplane Dronplane merged commit f9ccaee into devel Mar 24, 2023
@Dronplane Dronplane deleted the bugfix/search-300 branch March 24, 2023 17:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants