[core] Fix GCS health check manager TSAN #49471

Merged: 3 commits, Dec 30, 2024

Contributor

@dentiny dentiny commented Dec 28, 2024

Fix issue: #49469

The bug is:

  • The GCS health check manager has two event loops: one for the asio io context and one for grpc.
  • In the unit test, and elsewhere in production code, we only synchronize on the io context via io_context::stop, not on grpc.
  • As a result, grpc can still access GcsHealthCheckManager while we mistakenly assume all async operations have been properly synchronized.

In this PR, I use a shared pointer to make sure every access to the GCS health check manager is valid, even after the io context has been stopped. The PR also fixes the data member declaration order to respect the usage dependency.
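To illustrate the idea, here is a minimal sketch, not the actual Ray code. Manager, ScheduleOn, and the vector used as a stand-in for the grpc event loop are all hypothetical names. It shows how a task that captures a shared pointer keeps the manager alive after the owner has dropped its reference:

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <vector>

// Hypothetical stand-in for the health check manager; not Ray's actual class.
class Manager : public std::enable_shared_from_this<Manager> {
 public:
  explicit Manager(bool *destroyed) : destroyed_(destroyed) {}
  ~Manager() { *destroyed_ = true; }

  // Queues work on a second "event loop" (here just a vector of tasks).
  // The task owns a shared_ptr, so the manager outlives the task.
  void ScheduleOn(std::vector<std::function<void()>> &loop) {
    loop.push_back([self = shared_from_this()]() { self->checks_done_++; });
  }

 private:
  bool *destroyed_;
  int checks_done_ = 0;
};

// Returns true if the manager stayed alive while a task was pending and was
// destroyed only once the pending task had been released.
bool ManagerOutlivesPendingTask() {
  bool destroyed = false;
  std::vector<std::function<void()>> loop;  // stand-in for the grpc event loop

  auto manager = std::make_shared<Manager>(&destroyed);
  manager->ScheduleOn(loop);
  manager.reset();  // the owner lets go while a task is still pending

  const bool alive_while_pending = !destroyed;
  assert(alive_while_pending);     // the task still holds a reference
  for (auto &task : loop) task();  // safe: the manager is still alive
  loop.clear();                    // last reference released here
  return alive_while_pending && destroyed;
}
```

With a raw pointer captured instead of self, the manager would already be gone after manager.reset() and running the task would be a use-after-free.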

Signed-off-by: dentiny <dentinyhao@gmail.com>
@dentiny dentiny added the go add ONLY when ready to merge, run all tests label Dec 28, 2024
@dentiny dentiny requested a review from a team as a code owner December 28, 2024 08:26
grpc::ClientContext context_;
::grpc::health::v1::HealthCheckRequest request_;
::grpc::health::v1::HealthCheckResponse response_;

/// gRPC related fields
std::unique_ptr<::grpc::health::v1::Health::Stub> stub_;
Contributor Author

Reorder the data member declarations to respect the usage dependency.
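For context, C++ destroys non-static data members in reverse declaration order, so a member that others depend on should be declared first. A small sketch with made-up names (Tracer, Holder, resource, and user are hypothetical, not the members touched in this PR):

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Records its name when destroyed so the destruction order can be inspected.
struct Tracer {
  Tracer(std::string name, std::vector<std::string> *log)
      : name_(std::move(name)), log_(log) {}
  ~Tracer() { log_->push_back(name_); }
  std::string name_;
  std::vector<std::string> *log_;
};

// Hypothetical layout: `user` depends on `resource`, so `resource` is
// declared first and therefore destroyed last.
struct Holder {
  explicit Holder(std::vector<std::string> *log)
      : resource("resource", log), user("user", log) {}
  Tracer resource;  // constructed first, destroyed last
  Tracer user;      // constructed last, destroyed first
};

// Returns the member names in the order their destructors ran.
std::vector<std::string> DestructionOrder() {
  std::vector<std::string> log;
  { Holder holder(&log); }  // holder goes out of scope here
  assert(log.size() == 2);
  return log;
}
```

Here DestructionOrder() yields "user" followed by "resource", the reverse of the declaration order.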

@jjyao
Collaborator

jjyao commented Dec 28, 2024

So this is caused by the previous shared ptr to unique ptr change PR?

@dentiny
Contributor Author

dentiny commented Dec 28, 2024

> So this is caused by the previous shared ptr to unique ptr change PR?

No. Checking the git history, the class has never derived from std::enable_shared_from_this: https://github.com/ray-project/ray/commits/master/src/ray/gcs/gcs_server/gcs_health_check_manager.h

In other words, the problem is that we capture a raw pointer to gcs_health_check_manager, instead of a shared pointer, in the grpc event loop.

I suspect it's a long-standing problem and we were just lucky.

stub_->async()->Check(
&context_, &request_, &response_, [this, start = now](::grpc::Status status) {
Contributor

@rynewang rynewang Dec 28, 2024

Q: What if we don't change this lambda? The captured this is a raw pointer to a ctx, which holds a shared ptr to the manager. When this grpc callback is called, the ctx is guaranteed to still be alive, so it is guaranteed to hold a refcount on the manager. I think we don't need another shared ptr captured in the lambda.

Contributor Author

@dentiny dentiny Dec 29, 2024

No.

Consider the example below:

manager_->io_service_.post(
    [this, status]() {
      if (stopped_) {
        delete this;
        return;
      }
      // other things omitted

After delete this, the health check manager could be destroyed because no other reference count remains, so our test case destructor could proceed without waiting for the io context operation to complete, which results in a TSAN failure on io context access.

Contributor Author

In other words, the reason to capture the manager in the lambda is to make sure the manager is only destructed after the async grpc callback finishes, and only then is the unit test class destructed.
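A minimal sketch of that difference, using hypothetical names (Manager, Context, RawCallback, OwningCallback) rather than the real classes. The context deletes itself inside the callback; whether the manager survives that depends on what the lambda captured:

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical manager that flags its own destruction.
struct Manager {
  explicit Manager(bool *destroyed) : destroyed_(destroyed) {}
  ~Manager() { *destroyed_ = true; }
  bool *destroyed_;
};

// Hypothetical self-deleting context that holds the only manager reference.
class Context {
 public:
  explicit Context(std::shared_ptr<Manager> manager)
      : manager_(std::move(manager)) {}

  // Captures only `this`: after `delete this` nothing keeps the manager alive.
  std::function<bool()> RawCallback(bool *destroyed) {
    return [this, destroyed]() {
      delete this;         // drops the context's manager reference
      return !*destroyed;  // reports whether the manager is still alive
    };
  }

  // Also captures a shared_ptr: the manager lives as long as the callback.
  std::function<bool()> OwningCallback(bool *destroyed) {
    return [this, manager = manager_, destroyed]() {
      delete this;
      return manager != nullptr && !*destroyed;
    };
  }

 private:
  std::shared_ptr<Manager> manager_;
};

// Runs one callback and reports whether the manager survived `delete this`.
bool ManagerAliveAfterDeleteThis(bool capture_manager) {
  bool destroyed = false;
  auto *ctx = new Context(std::make_shared<Manager>(&destroyed));
  std::function<bool()> cb = capture_manager ? ctx->OwningCallback(&destroyed)
                                             : ctx->RawCallback(&destroyed);
  assert(!destroyed);  // the context still holds a reference at this point
  return cb();
}
```

ManagerAliveAfterDeleteThis(false) returns false because the manager is destroyed inside the callback, while ManagerAliveAfterDeleteThis(true) returns true because the captured shared pointer defers destruction until the callback object itself is released.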

Contributor Author

If you are interested in trying it, you will get another TSAN failure without capturing the shared pointer here.

WARNING: ThreadSanitizer: data race (pid=12)
  Read of size 8 at 0x7b2c00000418 by main thread:
    #0 std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count() /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/shared_ptr_base.h:729:6 (gcs_health_check_manager_test+0x118e72)
    #1 std::__shared_ptr<GuardedEventStats, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr() /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/shared_ptr_base.h:1169:31 (libsrc_Sray_Scommon_Slibasio.so+0xd2252)
    #2 std::shared_ptr<GuardedEventStats>::~shared_ptr() /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/shared_ptr.h:103:11 (libsrc_Sray_Scommon_Slibasio.so+0xd220b)
    #3 std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >::~pair() /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/stl_iterator.h:1283:12 (libsrc_Sray_Scommon_Slibasio.so+0xd21c9)
    #4 void __gnu_cxx::new_allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > >::destroy<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> > >(std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >*) /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/ext/new_allocator.h:152:10 (libsrc_Sray_Scommon_Slibasio.so+0xd2184)
    #5 void std::allocator_traits<std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >::destroy<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> > >(std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > >&, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >*) /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/alloc_traits.h:496:8 (libsrc_Sray_Scommon_Slibasio.so+0xd213b)
    #6 void absl::lts_20230802::container_internal::map_slot_policy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >::destroy<std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >(std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > >*, absl::lts_20230802::container_internal::map_slot_type<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >*) /proc/self/cwd/external/com_google_absl/absl/container/internal/container_memory.h:419:7 (libsrc_Sray_Scommon_Slibasio.so+0xd20eb)
    #7 void absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >::destroy<std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >(std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > >*, absl::lts_20230802::container_internal::map_slot_type<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >*) /proc/self/cwd/external/com_google_absl/absl/container/flat_hash_map.h:578:5 (libsrc_Sray_Scommon_Slibasio.so+0xd2098)
    #8 void absl::lts_20230802::container_internal::common_policy_traits<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >, void>::destroy<std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >(std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > >*, absl::lts_20230802::container_internal::map_slot_type<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >*) /proc/self/cwd/external/com_google_absl/absl/container/internal/common_policy_traits.h:50:5 (libsrc_Sray_Scommon_Slibasio.so+0xd2038)
    #9 absl::lts_20230802::container_internal::raw_hash_set<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >::destroy_slots() /proc/self/cwd/external/com_google_absl/absl/container/internal/raw_hash_set.h:1946:9 (libsrc_Sray_Scommon_Slibasio.so+0xd1bca)
    #10 absl::lts_20230802::container_internal::raw_hash_set<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >::~raw_hash_set() /proc/self/cwd/external/com_google_absl/absl/container/internal/raw_hash_set.h:1885:5 (libsrc_Sray_Scommon_Slibasio.so+0xd1975)
    #11 absl::lts_20230802::container_internal::raw_hash_map<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >::~raw_hash_map() /proc/self/cwd/external/com_google_absl/absl/container/internal/raw_hash_map.h:31:7 (libsrc_Sray_Scommon_Slibasio.so+0xd18eb)
    #12 absl::lts_20230802::flat_hash_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats>, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >::~flat_hash_map() /proc/self/cwd/external/com_google_absl/absl/container/flat_hash_map.h:113:7 (libsrc_Sray_Scommon_Slibasio.so+0xcfefb)
    #13 EventTracker::~EventTracker() /proc/self/cwd/bazel-out/k8-dbg/bin/src/ray/common/_virtual_includes/event_stats/ray/common/event_stats.h:97:7 (libsrc_Sray_Scommon_Slibasio.so+0xd25cb)
    #14 void __gnu_cxx::new_allocator<EventTracker>::destroy<EventTracker>(EventTracker*) /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/ext/new_allocator.h:152:10 (libsrc_Sray_Scommon_Slibasio.so+0xd2574)
    #15 void std::allocator_traits<std::allocator<EventTracker> >::destroy<EventTracker>(std::allocator<EventTracker>&, EventTracker*) /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/alloc_traits.h:496:8 (libsrc_Sray_Scommon_Slibasio.so+0xd24db)
    #16 std::_Sp_counted_ptr_inplace<EventTracker, std::allocator<EventTracker>, (__gnu_cxx::_Lock_policy)2>::_M_dispose() /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/shared_ptr_base.h:557:2 (libsrc_Sray_Scommon_Slibasio.so+0xcfad2)
    #17 std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/shared_ptr_base.h:155:6 (gcs_health_check_manager_test+0x118f18)
    #18 std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count() /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/shared_ptr_base.h:730:11 (gcs_health_check_manager_test+0x118e98)
    #19 std::__shared_ptr<EventTracker, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr() /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/shared_ptr_base.h:1169:31 (gcs_health_check_manager_test+0x119532)
    #20 std::shared_ptr<EventTracker>::~shared_ptr() /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/shared_ptr.h:103:11 (gcs_health_check_manager_test+0x1194eb)
    #21 instrumented_io_context::~instrumented_io_context() /proc/self/cwd/bazel-out/k8-dbg/bin/src/ray/common/_virtual_includes/asio/ray/common/asio/instrumented_io_context.h:27:7 (gcs_health_check_manager_test+0x117aa9)
    #22 GcsHealthCheckManagerTest::~GcsHealthCheckManagerTest() /proc/self/cwd/src/ray/gcs/gcs_server/test/gcs_health_check_manager_test.cc:56:40 (gcs_health_check_manager_test+0x11743b)
    #23 GcsHealthCheckManagerTest_MarkHealthAndSkipCheck_Test::~GcsHealthCheckManagerTest_MarkHealthAndSkipCheck_Test() /proc/self/cwd/src/ray/gcs/gcs_server/test/gcs_health_check_manager_test.cc:166:1 (gcs_health_check_manager_test+0x114ddb)
    #24 GcsHealthCheckManagerTest_MarkHealthAndSkipCheck_Test::~GcsHealthCheckManagerTest_MarkHealthAndSkipCheck_Test() /proc/self/cwd/src/ray/gcs/gcs_server/test/gcs_health_check_manager_test.cc:166:1 (gcs_health_check_manager_test+0x114e1f)
    #25 testing::Test::DeleteSelf_() /proc/self/cwd/external/com_google_googletest/googletest/include/gtest/gtest.h:336:24 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x140fea)
    #26 void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2612:10 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x168fac)
    #27 void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2648:14 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x13fff1)
    #28 testing::TestInfo::Run() /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2842:5 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x111b76)
    #29 testing::TestSuite::Run() /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:3015:30 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x11296c)
    #30 testing::internal::UnitTestImpl::RunAllTests() /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:5920:44 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x12ccb1)
    #31 bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2612:10 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x17263c)
    #32 bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2648:14 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x1446b7)
    #33 testing::UnitTest::Run() /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:5484:10 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x12c4be)
    #34 RUN_ALL_TESTS() /proc/self/cwd/external/com_google_googletest/googletest/include/gtest/gtest.h:2317:73 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest_Umain.so+0xda7)
    #35 main /proc/self/cwd/external/com_google_googletest/googlemock/src/gmock_main.cc:71:10 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest_Umain.so+0xd27)

  Previous write of size 8 at 0x7b2c00000418 by thread T51 (mutexes: write M408836992013439344):
    #0 std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count() /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/shared_ptr_base.h:616:45 (gcs_health_check_manager_test+0x117522)
    #1 std::__shared_ptr<GuardedEventStats, (__gnu_cxx::_Lock_policy)2>::__shared_ptr(std::__shared_ptr<GuardedEventStats, (__gnu_cxx::_Lock_policy)2>&&) /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/shared_ptr_base.h:1177:29 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x3f3d3)
    #2 std::shared_ptr<GuardedEventStats>::shared_ptr(std::shared_ptr<GuardedEventStats>&&) /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/shared_ptr.h:255:9 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x3f268)
    #3 std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, 0ul, std::shared_ptr<GuardedEventStats>&&, 0ul>(std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>&, std::tuple<std::shared_ptr<GuardedEventStats>&&>&, std::_Index_tuple<0ul>, std::_Index_tuple<0ul>) /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/tuple:1674:9 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x4a61c)
    #4 std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::shared_ptr<GuardedEventStats>&&>(std::piecewise_construct_t, std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>, std::tuple<std::shared_ptr<GuardedEventStats>&&>) /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/tuple:1661:9 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x4a348)
    #5 void __gnu_cxx::new_allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > >::construct<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >, std::piecewise_construct_t const&, std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>, std::tuple<std::shared_ptr<GuardedEventStats>&&> >(std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >*, std::piecewise_construct_t const&, std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>&&, std::tuple<std::shared_ptr<GuardedEventStats>&&>&&) /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/ext/new_allocator.h:146:23 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x4a239)
    #6 void std::allocator_traits<std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >::construct<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >, std::piecewise_construct_t const&, std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>, std::tuple<std::shared_ptr<GuardedEventStats>&&> >(std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > >&, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >*, std::piecewise_construct_t const&, std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>&&, std::tuple<std::shared_ptr<GuardedEventStats>&&>&&) /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/alloc_traits.h:483:8 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x4a150)
    #7 void absl::lts_20230802::container_internal::map_slot_policy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >::construct<std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > >, std::piecewise_construct_t const&, std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>, std::tuple<std::shared_ptr<GuardedEventStats>&&> >(std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > >*, absl::lts_20230802::container_internal::map_slot_type<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >*, std::piecewise_construct_t const&, std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>&&, std::tuple<std::shared_ptr<GuardedEventStats>&&>&&) /proc/self/cwd/external/com_google_absl/absl/container/internal/container_memory.h:386:7 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x4a07e)
    #8 void absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >::construct<std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > >, std::piecewise_construct_t const&, std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>, std::tuple<std::shared_ptr<GuardedEventStats>&&> >(std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > >*, absl::lts_20230802::container_internal::map_slot_type<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >*, std::piecewise_construct_t const&, std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>&&, std::tuple<std::shared_ptr<GuardedEventStats>&&>&&) /proc/self/cwd/external/com_google_absl/absl/container/flat_hash_map.h:573:5 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x49fa0)
    #9 void absl::lts_20230802::container_internal::common_policy_traits<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >, void>::construct<std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > >, std::piecewise_construct_t const&, std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>, std::tuple<std::shared_ptr<GuardedEventStats>&&> >(std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > >*, absl::lts_20230802::container_internal::map_slot_type<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >*, std::piecewise_construct_t const&, std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>&&, std::tuple<std::shared_ptr<GuardedEventStats>&&>&&) /proc/self/cwd/external/com_google_absl/absl/container/internal/common_policy_traits.h:43:5 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x49c30)
    #10 void absl::lts_20230802::container_internal::raw_hash_set<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >::emplace_at<std::piecewise_construct_t const&, std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>, std::tuple<std::shared_ptr<GuardedEventStats>&&> >(unsigned long, std::piecewise_construct_t const&, std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>&&, std::tuple<std::shared_ptr<GuardedEventStats>&&>&&) /proc/self/cwd/external/com_google_absl/absl/container/internal/raw_hash_set.h:2699:5 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x46415)
    #11 std::pair<absl::lts_20230802::container_internal::raw_hash_set<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >::iterator, bool> absl::lts_20230802::container_internal::raw_hash_map<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >::try_emplace_impl<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::shared_ptr<GuardedEventStats> >(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::shared_ptr<GuardedEventStats>&&) /proc/self/cwd/external/com_google_absl/absl/container/internal/raw_hash_map.h:204:13 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x45de2)
    #12 std::pair<absl::lts_20230802::container_internal::raw_hash_set<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >::iterator, bool> absl::lts_20230802::container_internal::raw_hash_map<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >::try_emplace<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats>, 0>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::shared_ptr<GuardedEventStats>&&) /proc/self/cwd/external/com_google_absl/absl/container/internal/raw_hash_map.h:139:12 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x34d2f)
    #13 EventTracker::GetOrCreate(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /proc/self/cwd/src/ray/common/event_stats.cc:175:29 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x2dbf6)
    #14 EventTracker::RecordStart(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, long) /proc/self/cwd/src/ray/common/event_stats.cc:64:16 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x2d737)
    #15 instrumented_io_context::post(std::function<void ()>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, long) /proc/self/cwd/src/ray/common/asio/instrumented_io_context.cc:95:39 (libsrc_Sray_Scommon_Slibasio.so+0x9ebba)
    #16 ray::gcs::GcsHealthCheckManager::HealthCheckContext::StartHealthCheck()::$_1::operator()(grpc::Status) const /proc/self/cwd/src/ray/gcs/gcs_server/gcs_health_check_manager.cc:145:31 (liblibgcs_Userver_Ulib.so+0xf77df9)
    #17 std::_Function_handler<void (grpc::Status), ray::gcs::GcsHealthCheckManager::HealthCheckContext::StartHealthCheck()::$_1>::_M_invoke(std::_Any_data const&, grpc::Status&&) /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/std_function.h:300:2 (liblibgcs_Userver_Ulib.so+0xf77802)
    #18 std::function<void (grpc::Status)>::operator()(grpc::Status) const /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/std_function.h:688:14 (liblibnode_Umanager_Ucc_Ugrpc.so+0xb5b6f)
    #19 void grpc::internal::CatchingCallback<std::function<void (grpc::Status)>, grpc::Status>(std::function<void (grpc::Status)>&&, grpc::Status&&) /proc/self/cwd/external/com_github_grpc_grpc/include/grpcpp/support/callback_common.h:43:5 (liblibnode_Umanager_Ucc_Ugrpc.so+0xb5a5b)
    #20 grpc::internal::CallbackWithStatusTag::Run(bool) /proc/self/cwd/external/com_github_grpc_grpc/include/grpcpp/support/callback_common.h:128:5 (liblibnode_Umanager_Ucc_Ugrpc.so+0xb57db)
    #21 grpc::internal::CallbackWithStatusTag::StaticRun(grpc_completion_queue_functor*, int) /proc/self/cwd/external/com_github_grpc_grpc/include/grpcpp/support/callback_common.h:112:46 (liblibnode_Umanager_Ucc_Ugrpc.so+0xb550e)
    #22 grpc::(anonymous namespace)::CallbackAlternativeCQ::Ref()::'lambda'(void*)::operator()(void*) const /proc/self/cwd/external/com_github_grpc_grpc/src/cpp/common/completion_queue_cc.cc:94:17 (libexternal_Scom_Ugithub_Ugrpc_Ugrpc_Slibgrpc++_Ubase.so+0x1d1ccd)
    #23 grpc::(anonymous namespace)::CallbackAlternativeCQ::Ref()::'lambda'(void*)::__invoke(void*) /proc/self/cwd/external/com_github_grpc_grpc/src/cpp/common/completion_queue_cc.cc:62:13 (libexternal_Scom_Ugithub_Ugrpc_Ugrpc_Slibgrpc++_Ubase.so+0x1d1a08)
    #24 grpc_core::(anonymous namespace)::ThreadInternalsPosix::ThreadInternalsPosix(char const*, void (*)(void*), void*, bool*, grpc_core::Thread::Options const&)::'lambda'(void*)::operator()(void*) const /proc/self/cwd/external/com_github_grpc_grpc/src/core/lib/gprpp/posix/thd.cc:145:11 (libexternal_Scom_Ugithub_Ugrpc_Ugrpc_Slibgpr.so+0x1ea8e)
    #25 grpc_core::(anonymous namespace)::ThreadInternalsPosix::ThreadInternalsPosix(char const*, void (*)(void*), void*, bool*, grpc_core::Thread::Options const&)::'lambda'(void*)::__invoke(void*) /proc/self/cwd/external/com_github_grpc_grpc/src/core/lib/gprpp/posix/thd.cc:115:9 (libexternal_Scom_Ugithub_Ugrpc_Ugrpc_Slibgpr.so+0x1e888)

  Location is heap block of size 176 at 0x7b2c00000370 allocated by main thread:
    #0 malloc <null> (gcs_health_check_manager_test+0x7cf14)
    #1 operator new(unsigned long) <null> (libstdc++.so.6+0xaab28)
    #2 std::allocator_traits<std::allocator<absl::lts_20230802::container_internal::AlignedType<8ul> > >::allocate(std::allocator<absl::lts_20230802::container_internal::AlignedType<8ul> >&, unsigned long) /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/alloc_traits.h:443:20 (liblibgcs_Userver_Ulib.so+0xd12f3a)
    #3 void* absl::lts_20230802::container_internal::Allocate<8ul, std::allocator<char> >(std::allocator<char>*, unsigned long) /proc/self/cwd/external/com_google_absl/absl/container/internal/container_memory.h:65:13 (liblibgcs_Userver_Ulib.so+0xd128b5)
    #4 void absl::lts_20230802::container_internal::InitializeSlots<std::allocator<char>, 48ul, 8ul>(absl::lts_20230802::container_internal::CommonFields&, std::allocator<char>) /proc/self/cwd/external/com_google_absl/absl/container/internal/raw_hash_set.h:1407:7 (liblibgcs_Userver_Ulib.so+0xd1a27b)
    #5 absl::lts_20230802::container_internal::raw_hash_set<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >::initialize_slots() /proc/self/cwd/external/com_google_absl/absl/container/internal/raw_hash_set.h:2505:5 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x477db)
    #6 absl::lts_20230802::container_internal::raw_hash_set<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >::resize(unsigned long) /proc/self/cwd/external/com_google_absl/absl/container/internal/raw_hash_set.h:2515:5 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x46d93)
    #7 absl::lts_20230802::container_internal::raw_hash_set<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >::rehash_and_grow_if_necessary() /proc/self/cwd/external/com_google_absl/absl/container/internal/raw_hash_set.h:2603:7 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x4729e)
    #8 absl::lts_20230802::container_internal::raw_hash_set<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >::prepare_insert(unsigned long) /proc/self/cwd/external/com_google_absl/absl/container/internal/raw_hash_set.h:2678:7 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x4697b)
    #9 std::pair<unsigned long, bool> absl::lts_20230802::container_internal::raw_hash_set<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >::find_or_prepare_insert<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /proc/self/cwd/external/com_google_absl/absl/container/internal/raw_hash_set.h:2659:13 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x4626c)
    #10 std::pair<absl::lts_20230802::container_internal::raw_hash_set<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >::iterator, bool> absl::lts_20230802::container_internal::raw_hash_map<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >::try_emplace_impl<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::shared_ptr<GuardedEventStats> >(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::shared_ptr<GuardedEventStats>&&) /proc/self/cwd/external/com_google_absl/absl/container/internal/raw_hash_map.h:202:22 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x45d44)
    #11 std::pair<absl::lts_20230802::container_internal::raw_hash_set<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >::iterator, bool> absl::lts_20230802::container_internal::raw_hash_map<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >::try_emplace<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats>, 0>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::shared_ptr<GuardedEventStats>&&) /proc/self/cwd/external/com_google_absl/absl/container/internal/raw_hash_map.h:139:12 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x34d2f)
    #12 EventTracker::GetOrCreate(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /proc/self/cwd/src/ray/common/event_stats.cc:175:29 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x2dbf6)
    #13 EventTracker::RecordStart(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, long) /proc/self/cwd/src/ray/common/event_stats.cc:64:16 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x2d737)
    #14 instrumented_io_context::dispatch(std::function<void ()>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /proc/self/cwd/src/ray/common/asio/instrumented_io_context.cc:114:37 (libsrc_Sray_Scommon_Slibasio.so+0x9f107)
    #15 ray::gcs::GcsHealthCheckManager::MarkNodeHealthy(ray::NodeID const&) /proc/self/cwd/src/ray/gcs/gcs_server/gcs_health_check_manager.cc:86:15 (liblibgcs_Userver_Ulib.so+0xf730a7)
    #16 GcsHealthCheckManagerTest_MarkHealthAndSkipCheck_Test::TestBody() /proc/self/cwd/src/ray/gcs/gcs_server/test/gcs_health_check_manager_test.cc:174:17 (gcs_health_check_manager_test+0x10c5c0)
    #17 void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2612:10 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x168fac)
    #18 void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2648:14 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x13fff1)
    #19 testing::Test::Run() /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2687:5 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x110901)
    #20 testing::TestInfo::Run() /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2836:11 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x111afd)
    #21 testing::TestSuite::Run() /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:3015:30 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x11296c)
    #22 testing::internal::UnitTestImpl::RunAllTests() /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:5920:44 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x12ccb1)
    #23 bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2612:10 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x17263c)
    #24 bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2648:14 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x1446b7)
    #25 testing::UnitTest::Run() /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:5484:10 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x12c4be)
    #26 RUN_ALL_TESTS() /proc/self/cwd/external/com_google_googletest/googletest/include/gtest/gtest.h:2317:73 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest_Umain.so+0xda7)
    #27 main /proc/self/cwd/external/com_google_googletest/googlemock/src/gmock_main.cc:71:10 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest_Umain.so+0xd27)

  Mutex M408836992013439344 is already destroyed.

  Thread T51 'nexting_thread' (tid=92, running) created by main thread at:
    #0 pthread_create <null> (gcs_health_check_manager_test+0x7e7db)
    #1 grpc_core::(anonymous namespace)::ThreadInternalsPosix::ThreadInternalsPosix(char const*, void (*)(void*), void*, bool*, grpc_core::Thread::Options const&) /proc/self/cwd/external/com_github_grpc_grpc/src/core/lib/gprpp/posix/thd.cc:113:30 (libexternal_Scom_Ugithub_Ugrpc_Ugrpc_Slibgpr.so+0x1e20e)
    #2 grpc_core::Thread::Thread(char const*, void (*)(void*), void*, bool*, grpc_core::Thread::Options const&) /proc/self/cwd/external/com_github_grpc_grpc/src/core/lib/gprpp/posix/thd.cc:199:15 (libexternal_Scom_Ugithub_Ugrpc_Ugrpc_Slibgpr.so+0x1db36)
    #3 void __gnu_cxx::new_allocator<grpc_core::Thread>::construct<grpc_core::Thread, char const (&) [15], grpc::(anonymous namespace)::CallbackAlternativeCQ::Ref()::'lambda'(void*), grpc::CompletionQueue*&>(grpc_core::Thread*, char const (&) [15], grpc::(anonymous namespace)::CallbackAlternativeCQ::Ref()::'lambda'(void*)&&, grpc::CompletionQueue*&) /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/ext/new_allocator.h:146:23 (libexternal_Scom_Ugithub_Ugrpc_Ugrpc_Slibgrpc++_Ubase.so+0x1d196a)
    #4 void std::allocator_traits<std::allocator<grpc_core::Thread> >::construct<grpc_core::Thread, char const (&) [15], grpc::(anonymous namespace)::CallbackAlternativeCQ::Ref()::'lambda'(void*), grpc::CompletionQueue*&>(std::allocator<grpc_core::Thread>&, grpc_core::Thread*, char const (&) [15], grpc::(anonymous namespace)::CallbackAlternativeCQ::Ref()::'lambda'(void*)&&, grpc::CompletionQueue*&) /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/alloc_traits.h:483:8 (libexternal_Scom_Ugithub_Ugrpc_Ugrpc_Slibgrpc++_Ubase.so+0x1d1390)
    #5 grpc_core::Thread& std::vector<grpc_core::Thread, std::allocator<grpc_core::Thread> >::emplace_back<char const (&) [15], grpc::(anonymous namespace)::CallbackAlternativeCQ::Ref()::'lambda'(void*), grpc::CompletionQueue*&>(char const (&) [15], grpc::(anonymous namespace)::CallbackAlternativeCQ::Ref()::'lambda'(void*)&&, grpc::CompletionQueue*&) /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/vector.tcc:115:6 (libexternal_Scom_Ugithub_Ugrpc_Ugrpc_Slibgrpc++_Ubase.so+0x1d1213)
    #6 grpc::(anonymous namespace)::CallbackAlternativeCQ::Ref() /proc/self/cwd/external/com_github_grpc_grpc/src/cpp/common/completion_queue_cc.cc:60:26 (libexternal_Scom_Ugithub_Ugrpc_Ugrpc_Slibgrpc++_Ubase.so+0x1d0c0c)
    #7 grpc::CompletionQueue::CallbackAlternativeCQ() /proc/self/cwd/external/com_github_grpc_grpc/src/cpp/common/completion_queue_cc.cc:194:36 (libexternal_Scom_Ugithub_Ugrpc_Ugrpc_Slibgrpc++_Ubase.so+0x1d0a19)
    #8 grpc::Server::CallbackCQ() /proc/self/cwd/external/com_github_grpc_grpc/src/cpp/server/server_cc.cc:1386:19 (libexternal_Scom_Ugithub_Ugrpc_Ugrpc_Slibgrpc++_Ubase.so+0x2288d1)
    #9 grpc::Server::RegisterService(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const*, grpc::Service*) /proc/self/cwd/external/com_github_grpc_grpc/src/cpp/server/server_cc.cc:1074:35 (libexternal_Scom_Ugithub_Ugrpc_Ugrpc_Slibgrpc++_Ubase.so+0x225e2d)
    #10 grpc::Server::Start(grpc::ServerCompletionQueue**, unsigned long) /proc/self/cwd/external/com_github_grpc_grpc/src/cpp/server/server_cc.cc:1172:5 (libexternal_Scom_Ugithub_Ugrpc_Ugrpc_Slibgrpc++_Ubase.so+0x227123)
    #11 grpc::ServerBuilder::BuildAndStart() /proc/self/cwd/external/com_github_grpc_grpc/src/cpp/server/server_builder.cc:446:11 (libexternal_Scom_Ugithub_Ugrpc_Ugrpc_Slibgrpc++_Ubase.so+0x2089ce)
    #12 ray::rpc::GrpcServer::Run() /proc/self/cwd/src/ray/rpc/grpc_server.cc:122:21 (liblibgrpc_Ucommon_Ulib.so+0x60476)
    #13 GcsHealthCheckManagerTest::AddServer(bool) /proc/self/cwd/src/ray/gcs/gcs_server/test/gcs_health_check_manager_test.cc:91:13 (gcs_health_check_manager_test+0x1125c3)
    #14 GcsHealthCheckManagerTest_MarkHealthAndSkipCheck_Test::TestBody() /proc/self/cwd/src/ray/gcs/gcs_server/test/gcs_health_check_manager_test.cc:167:18 (gcs_health_check_manager_test+0x10c3a1)
    #15 void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2612:10 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x168fac)
    #16 void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2648:14 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x13fff1)
    #17 testing::Test::Run() /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2687:5 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x110901)
    #18 testing::TestInfo::Run() /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2836:11 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x111afd)
    #19 testing::TestSuite::Run() /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:3015:30 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x11296c)
    #20 testing::internal::UnitTestImpl::RunAllTests() /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:5920:44 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x12ccb1)
    #21 bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2612:10 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x17263c)
    #22 bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2648:14 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x1446b7)
    #23 testing::UnitTest::Run() /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:5484:10 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x12c4be)
    #24 RUN_ALL_TESTS() /proc/self/cwd/external/com_google_googletest/googletest/include/gtest/gtest.h:2317:73 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest_Umain.so+0xda7)
    #25 main /proc/self/cwd/external/com_google_googletest/googlemock/src/gmock_main.cc:71:10 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest_Umain.so+0xd27)

SUMMARY: ThreadSanitizer: data race /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/shared_ptr_base.h:729:6 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count()

@dentiny dentiny requested a review from rynewang December 29, 2024 04:15
Signed-off-by: dentiny <dentinyhao@gmail.com>
Signed-off-by: dentiny <dentinyhao@gmail.com>
@dentiny dentiny requested a review from jjyao December 29, 2024 06:38
@rynewang
Contributor

I am not 100% convinced that capturing the shared ptr in the lambda is necessary, but it's not too bad to keep it there. Approving.

@rynewang rynewang merged commit 66cec2f into ray-project:master Dec 30, 2024
5 checks passed
&context_,
&request_,
&response_,
[this, start = now, manager = manager](::grpc::Status status) {
Collaborator

We don't need to capture manager here. Since we already capture this, we have the weak_ptr to the manager and just need to call weak_ptr.lock() again inside the lambda.
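
A minimal sketch of the pattern suggested above, assuming the context object stores a weak_ptr to its manager and outlives the callback. `Manager`, `Context`, `OnNodeFailed` and `MakeCallback` are hypothetical stand-ins for illustration, not the actual Ray classes or signatures.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical stand-in for the health check manager.
class Manager {
 public:
  void OnNodeFailed() { ++failures_; }
  int failures() const { return failures_; }

 private:
  int failures_ = 0;
};

// Hypothetical stand-in for the per-node health check context.
// Assumption: the context itself stays alive until the callback has run.
class Context {
 public:
  explicit Context(std::weak_ptr<Manager> manager)
      : manager_(std::move(manager)) {}

  // Builds the callback an RPC layer would invoke later, possibly from
  // another event loop. Only `this` is captured; the manager is reached
  // through the stored weak_ptr and locked again inside the lambda.
  std::function<bool()> MakeCallback() {
    return [this]() {
      std::shared_ptr<Manager> manager = manager_.lock();
      if (manager == nullptr) {
        // The manager has already been destroyed: skip all accesses.
        return false;
      }
      // The local shared_ptr keeps the manager alive for this call.
      manager->OnNodeFailed();
      return true;
    };
  }

 private:
  std::weak_ptr<Manager> manager_;
};
```

With this shape the callback returns true and updates the manager while it is alive, and returns false without touching it once the last shared_ptr to the manager has been released. The trade-off against capturing a shared_ptr directly is that the lambda no longer extends the manager's lifetime between scheduling and execution.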

Contributor Author

ok, #49493

srinathk10 pushed a commit that referenced this pull request Jan 3, 2025