[core] Fix GCS health check manager TSAN #49471
Signed-off-by: dentiny <dentinyhao@gmail.com>
    grpc::ClientContext context_;
    ::grpc::health::v1::HealthCheckRequest request_;
    ::grpc::health::v1::HealthCheckResponse response_;

    /// gRPC related fields
    std::unique_ptr<::grpc::health::v1::Health::Stub> stub_;
Reorder the data member declarations to respect their usage dependencies.
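As background for why declaration order can matter: C++ constructs non-static data members in declaration order and destroys them in reverse. The sketch below illustrates that rule with hypothetical `Tracer` and `Context` types (neither is from the Ray code base); it assumes, for illustration only, that the stub-like member depends on the buffer-like member and should therefore be destroyed first.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: records its own construction and destruction in a log.
struct Tracer {
  Tracer(std::string name, std::vector<std::string> &log)
      : name_(std::move(name)), log_(log) {
    log_.push_back("ctor " + name_);
  }
  ~Tracer() { log_.push_back("dtor " + name_); }
  std::string name_;
  std::vector<std::string> &log_;
};

// Hypothetical context: `stub_` is assumed to use `buffers_`.
struct Context {
  explicit Context(std::vector<std::string> &log)
      : buffers_("buffers", log), stub_("stub", log) {}
  Tracer buffers_;  // declared first: constructed first, destroyed last
  Tracer stub_;     // declared last: constructed last, destroyed first
};

// Creates and destroys a Context, returning the recorded event order.
std::vector<std::string> DestructionOrderDemo() {
  std::vector<std::string> log;
  { Context ctx(log); }  // ctx is destroyed at the end of this scope
  return log;
}
```

Calling `DestructionOrderDemo()` returns the events in the order they happened, which makes the reverse-order destruction visible.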
So this is caused by the previous PR that changed the shared pointer to a unique pointer?
No. Checking the git history, the class has never been a subclass of `std::enable_shared_from_this`: https://github.com/ray-project/ray/commits/master/src/ray/gcs/gcs_server/gcs_health_check_manager.h In other words, the problem is what we capture. I suspect it's a long-existing problem and we were just lucky.
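For readers unfamiliar with the pattern mentioned above, here is a minimal sketch of how a class derived from `std::enable_shared_from_this` can hand a callback a co-owning pointer to itself. The `Worker` class and function names are hypothetical and unrelated to the Ray manager; this only illustrates the standard-library facility, not what the PR does.

```cpp
#include <cassert>
#include <functional>
#include <memory>

// Hypothetical class that can produce a shared_ptr to itself.
class Worker : public std::enable_shared_from_this<Worker> {
 public:
  // Builds a callback that co-owns this object.
  // Precondition: this object is already managed by a std::shared_ptr.
  std::function<long()> MakeCallback() {
    auto self = shared_from_this();
    // The lambda stores a copy of `self`, so the object outlives its other owners
    // for as long as the callback exists.
    return [self] { return self.use_count(); };
  }
};

// Returns the owner count observed from inside the callback.
long UseCountSeenByCallback() {
  auto worker = std::make_shared<Worker>();
  auto callback = worker->MakeCallback();
  return callback();  // counts `worker` plus the copy held by the callback
}
```

A class that does not derive from `std::enable_shared_from_this` has no such facility, so a callback that captures only a raw `this` does not extend the object's lifetime.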
    stub_->async()->Check(
        &context_, &request_, &response_, [this, start = now](::grpc::Status status) {
Q: What if we don't change this lambda? `this` is a raw pointer to a ctx, which holds a shared pointer to the manager. When this gRPC callback is called, `this` is guaranteed to still be alive, so it is guaranteed to hold a refcount on the manager. I don't think we need another shared pointer captured in the lambda.
No. Consider the example below:
    manager_->io_service_.post(
        [this, status]() {
          if (stopped_) {
            delete this;
            return;
          }
          // omit other things
After `delete this`, the health check manager could be destroyed because nothing else holds a reference to it, so the test fixture's destructor could proceed without waiting for the io context operation to complete, which results in a TSAN report on the io context access.
In other words, the reason to capture `manager` in the lambda is to make sure `manager` is destroyed only after the async gRPC callback finishes, followed by destruction of the unit test class.
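The lifetime argument above can be sketched in isolation. The example below is a simplified, single-threaded model, not the Ray implementation: `Manager`, `LifetimeObservation`, and the `pending` vector (a stand-in for queued async callbacks) are all hypothetical names. It shows that an object co-owned by a callback's captured `std::shared_ptr` stays alive until that callback is released.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <vector>

// Hypothetical stand-in for the manager; `alive_` mirrors whether it still exists.
struct Manager {
  explicit Manager(bool &alive) : alive_(alive) { alive_ = true; }
  ~Manager() { alive_ = false; }
  bool &alive_;
};

struct LifetimeObservation {
  bool alive_while_pending;
  bool alive_in_callback;
  bool alive_after_release;
};

LifetimeObservation SharedCaptureDemo() {
  bool alive = false;
  LifetimeObservation seen{false, false, false};
  std::vector<std::function<void()>> pending;  // stand-in for queued async callbacks

  auto manager = std::make_shared<Manager>(alive);
  // The lambda copies the shared_ptr, so it co-owns the manager.
  pending.emplace_back([manager, &seen] { seen.alive_in_callback = manager->alive_; });

  manager.reset();                   // the original owner lets go
  seen.alive_while_pending = alive;  // still true: the callback holds a reference

  for (auto &callback : pending) callback();
  pending.clear();                   // destroying the callback drops the last reference
  seen.alive_after_release = alive;  // now false: the manager has been destroyed
  return seen;
}
```

In this model the manager survives the owner's `reset()` and is destroyed only once the pending callback has run and been cleared, which mirrors the ordering the comment asks for.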
If you're interested in trying it, you will get another TSAN failure if the shared pointer is not captured here:
WARNING: ThreadSanitizer: data race (pid=12)
Read of size 8 at 0x7b2c00000418 by main thread:
#0 std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count() /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/shared_ptr_base.h:729:6 (gcs_health_check_manager_test+0x118e72)
#1 std::__shared_ptr<GuardedEventStats, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr() /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/shared_ptr_base.h:1169:31 (libsrc_Sray_Scommon_Slibasio.so+0xd2252)
#2 std::shared_ptr<GuardedEventStats>::~shared_ptr() /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/shared_ptr.h:103:11 (libsrc_Sray_Scommon_Slibasio.so+0xd220b)
#3 std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >::~pair() /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/stl_iterator.h:1283:12 (libsrc_Sray_Scommon_Slibasio.so+0xd21c9)
#4 void __gnu_cxx::new_allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > >::destroy<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> > >(std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >*) /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/ext/new_allocator.h:152:10 (libsrc_Sray_Scommon_Slibasio.so+0xd2184)
#5 void std::allocator_traits<std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >::destroy<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> > >(std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > >&, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >*) /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/alloc_traits.h:496:8 (libsrc_Sray_Scommon_Slibasio.so+0xd213b)
#6 void absl::lts_20230802::container_internal::map_slot_policy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >::destroy<std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >(std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > >*, absl::lts_20230802::container_internal::map_slot_type<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >*) /proc/self/cwd/external/com_google_absl/absl/container/internal/container_memory.h:419:7 (libsrc_Sray_Scommon_Slibasio.so+0xd20eb)
#7 void absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >::destroy<std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >(std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > >*, absl::lts_20230802::container_internal::map_slot_type<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >*) /proc/self/cwd/external/com_google_absl/absl/container/flat_hash_map.h:578:5 (libsrc_Sray_Scommon_Slibasio.so+0xd2098)
#8 void absl::lts_20230802::container_internal::common_policy_traits<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >, void>::destroy<std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >(std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > >*, absl::lts_20230802::container_internal::map_slot_type<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >*) /proc/self/cwd/external/com_google_absl/absl/container/internal/common_policy_traits.h:50:5 (libsrc_Sray_Scommon_Slibasio.so+0xd2038)
#9 absl::lts_20230802::container_internal::raw_hash_set<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >::destroy_slots() /proc/self/cwd/external/com_google_absl/absl/container/internal/raw_hash_set.h:1946:9 (libsrc_Sray_Scommon_Slibasio.so+0xd1bca)
#10 absl::lts_20230802::container_internal::raw_hash_set<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >::~raw_hash_set() /proc/self/cwd/external/com_google_absl/absl/container/internal/raw_hash_set.h:1885:5 (libsrc_Sray_Scommon_Slibasio.so+0xd1975)
#11 absl::lts_20230802::container_internal::raw_hash_map<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >::~raw_hash_map() /proc/self/cwd/external/com_google_absl/absl/container/internal/raw_hash_map.h:31:7 (libsrc_Sray_Scommon_Slibasio.so+0xd18eb)
#12 absl::lts_20230802::flat_hash_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats>, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >::~flat_hash_map() /proc/self/cwd/external/com_google_absl/absl/container/flat_hash_map.h:113:7 (libsrc_Sray_Scommon_Slibasio.so+0xcfefb)
#13 EventTracker::~EventTracker() /proc/self/cwd/bazel-out/k8-dbg/bin/src/ray/common/_virtual_includes/event_stats/ray/common/event_stats.h:97:7 (libsrc_Sray_Scommon_Slibasio.so+0xd25cb)
#14 void __gnu_cxx::new_allocator<EventTracker>::destroy<EventTracker>(EventTracker*) /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/ext/new_allocator.h:152:10 (libsrc_Sray_Scommon_Slibasio.so+0xd2574)
#15 void std::allocator_traits<std::allocator<EventTracker> >::destroy<EventTracker>(std::allocator<EventTracker>&, EventTracker*) /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/alloc_traits.h:496:8 (libsrc_Sray_Scommon_Slibasio.so+0xd24db)
#16 std::_Sp_counted_ptr_inplace<EventTracker, std::allocator<EventTracker>, (__gnu_cxx::_Lock_policy)2>::_M_dispose() /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/shared_ptr_base.h:557:2 (libsrc_Sray_Scommon_Slibasio.so+0xcfad2)
#17 std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/shared_ptr_base.h:155:6 (gcs_health_check_manager_test+0x118f18)
#18 std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count() /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/shared_ptr_base.h:730:11 (gcs_health_check_manager_test+0x118e98)
#19 std::__shared_ptr<EventTracker, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr() /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/shared_ptr_base.h:1169:31 (gcs_health_check_manager_test+0x119532)
#20 std::shared_ptr<EventTracker>::~shared_ptr() /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/shared_ptr.h:103:11 (gcs_health_check_manager_test+0x1194eb)
#21 instrumented_io_context::~instrumented_io_context() /proc/self/cwd/bazel-out/k8-dbg/bin/src/ray/common/_virtual_includes/asio/ray/common/asio/instrumented_io_context.h:27:7 (gcs_health_check_manager_test+0x117aa9)
#22 GcsHealthCheckManagerTest::~GcsHealthCheckManagerTest() /proc/self/cwd/src/ray/gcs/gcs_server/test/gcs_health_check_manager_test.cc:56:40 (gcs_health_check_manager_test+0x11743b)
#23 GcsHealthCheckManagerTest_MarkHealthAndSkipCheck_Test::~GcsHealthCheckManagerTest_MarkHealthAndSkipCheck_Test() /proc/self/cwd/src/ray/gcs/gcs_server/test/gcs_health_check_manager_test.cc:166:1 (gcs_health_check_manager_test+0x114ddb)
#24 GcsHealthCheckManagerTest_MarkHealthAndSkipCheck_Test::~GcsHealthCheckManagerTest_MarkHealthAndSkipCheck_Test() /proc/self/cwd/src/ray/gcs/gcs_server/test/gcs_health_check_manager_test.cc:166:1 (gcs_health_check_manager_test+0x114e1f)
#25 testing::Test::DeleteSelf_() /proc/self/cwd/external/com_google_googletest/googletest/include/gtest/gtest.h:336:24 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x140fea)
#26 void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2612:10 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x168fac)
#27 void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2648:14 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x13fff1)
#28 testing::TestInfo::Run() /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2842:5 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x111b76)
#29 testing::TestSuite::Run() /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:3015:30 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x11296c)
#30 testing::internal::UnitTestImpl::RunAllTests() /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:5920:44 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x12ccb1)
#31 bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2612:10 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x17263c)
#32 bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2648:14 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x1446b7)
#33 testing::UnitTest::Run() /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:5484:10 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x12c4be)
#34 RUN_ALL_TESTS() /proc/self/cwd/external/com_google_googletest/googletest/include/gtest/gtest.h:2317:73 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest_Umain.so+0xda7)
#35 main /proc/self/cwd/external/com_google_googletest/googlemock/src/gmock_main.cc:71:10 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest_Umain.so+0xd27)
Previous write of size 8 at 0x7b2c00000418 by thread T51 (mutexes: write M408836992013439344):
#0 std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count() /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/shared_ptr_base.h:616:45 (gcs_health_check_manager_test+0x117522)
#1 std::__shared_ptr<GuardedEventStats, (__gnu_cxx::_Lock_policy)2>::__shared_ptr(std::__shared_ptr<GuardedEventStats, (__gnu_cxx::_Lock_policy)2>&&) /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/shared_ptr_base.h:1177:29 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x3f3d3)
#2 std::shared_ptr<GuardedEventStats>::shared_ptr(std::shared_ptr<GuardedEventStats>&&) /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/shared_ptr.h:255:9 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x3f268)
#3 std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, 0ul, std::shared_ptr<GuardedEventStats>&&, 0ul>(std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>&, std::tuple<std::shared_ptr<GuardedEventStats>&&>&, std::_Index_tuple<0ul>, std::_Index_tuple<0ul>) /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/tuple:1674:9 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x4a61c)
#4 std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::shared_ptr<GuardedEventStats>&&>(std::piecewise_construct_t, std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>, std::tuple<std::shared_ptr<GuardedEventStats>&&>) /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/tuple:1661:9 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x4a348)
#5 void __gnu_cxx::new_allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > >::construct<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >, std::piecewise_construct_t const&, std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>, std::tuple<std::shared_ptr<GuardedEventStats>&&> >(std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >*, std::piecewise_construct_t const&, std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>&&, std::tuple<std::shared_ptr<GuardedEventStats>&&>&&) /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/ext/new_allocator.h:146:23 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x4a239)
#6 void std::allocator_traits<std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >::construct<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >, std::piecewise_construct_t const&, std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>, std::tuple<std::shared_ptr<GuardedEventStats>&&> >(std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > >&, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >*, std::piecewise_construct_t const&, std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>&&, std::tuple<std::shared_ptr<GuardedEventStats>&&>&&) /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/alloc_traits.h:483:8 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x4a150)
#7 void absl::lts_20230802::container_internal::map_slot_policy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >::construct<std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > >, std::piecewise_construct_t const&, std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>, std::tuple<std::shared_ptr<GuardedEventStats>&&> >(std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > >*, absl::lts_20230802::container_internal::map_slot_type<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >*, std::piecewise_construct_t const&, std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>&&, std::tuple<std::shared_ptr<GuardedEventStats>&&>&&) /proc/self/cwd/external/com_google_absl/absl/container/internal/container_memory.h:386:7 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x4a07e)
#8 void absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >::construct<std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > >, std::piecewise_construct_t const&, std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>, std::tuple<std::shared_ptr<GuardedEventStats>&&> >(std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > >*, absl::lts_20230802::container_internal::map_slot_type<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >*, std::piecewise_construct_t const&, std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>&&, std::tuple<std::shared_ptr<GuardedEventStats>&&>&&) /proc/self/cwd/external/com_google_absl/absl/container/flat_hash_map.h:573:5 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x49fa0)
#9 void absl::lts_20230802::container_internal::common_policy_traits<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >, void>::construct<std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > >, std::piecewise_construct_t const&, std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>, std::tuple<std::shared_ptr<GuardedEventStats>&&> >(std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > >*, absl::lts_20230802::container_internal::map_slot_type<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >*, std::piecewise_construct_t const&, std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>&&, std::tuple<std::shared_ptr<GuardedEventStats>&&>&&) /proc/self/cwd/external/com_google_absl/absl/container/internal/common_policy_traits.h:43:5 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x49c30)
#10 void absl::lts_20230802::container_internal::raw_hash_set<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >::emplace_at<std::piecewise_construct_t const&, std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>, std::tuple<std::shared_ptr<GuardedEventStats>&&> >(unsigned long, std::piecewise_construct_t const&, std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>&&, std::tuple<std::shared_ptr<GuardedEventStats>&&>&&) /proc/self/cwd/external/com_google_absl/absl/container/internal/raw_hash_set.h:2699:5 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x46415)
#11 std::pair<absl::lts_20230802::container_internal::raw_hash_set<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >::iterator, bool> absl::lts_20230802::container_internal::raw_hash_map<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >::try_emplace_impl<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::shared_ptr<GuardedEventStats> >(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::shared_ptr<GuardedEventStats>&&) /proc/self/cwd/external/com_google_absl/absl/container/internal/raw_hash_map.h:204:13 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x45de2)
#12 std::pair<absl::lts_20230802::container_internal::raw_hash_set<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >::iterator, bool> absl::lts_20230802::container_internal::raw_hash_map<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >::try_emplace<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats>, 0>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::shared_ptr<GuardedEventStats>&&) /proc/self/cwd/external/com_google_absl/absl/container/internal/raw_hash_map.h:139:12 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x34d2f)
#13 EventTracker::GetOrCreate(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /proc/self/cwd/src/ray/common/event_stats.cc:175:29 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x2dbf6)
#14 EventTracker::RecordStart(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, long) /proc/self/cwd/src/ray/common/event_stats.cc:64:16 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x2d737)
#15 instrumented_io_context::post(std::function<void ()>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, long) /proc/self/cwd/src/ray/common/asio/instrumented_io_context.cc:95:39 (libsrc_Sray_Scommon_Slibasio.so+0x9ebba)
#16 ray::gcs::GcsHealthCheckManager::HealthCheckContext::StartHealthCheck()::$_1::operator()(grpc::Status) const /proc/self/cwd/src/ray/gcs/gcs_server/gcs_health_check_manager.cc:145:31 (liblibgcs_Userver_Ulib.so+0xf77df9)
#17 std::_Function_handler<void (grpc::Status), ray::gcs::GcsHealthCheckManager::HealthCheckContext::StartHealthCheck()::$_1>::_M_invoke(std::_Any_data const&, grpc::Status&&) /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/std_function.h:300:2 (liblibgcs_Userver_Ulib.so+0xf77802)
#18 std::function<void (grpc::Status)>::operator()(grpc::Status) const /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/std_function.h:688:14 (liblibnode_Umanager_Ucc_Ugrpc.so+0xb5b6f)
#19 void grpc::internal::CatchingCallback<std::function<void (grpc::Status)>, grpc::Status>(std::function<void (grpc::Status)>&&, grpc::Status&&) /proc/self/cwd/external/com_github_grpc_grpc/include/grpcpp/support/callback_common.h:43:5 (liblibnode_Umanager_Ucc_Ugrpc.so+0xb5a5b)
#20 grpc::internal::CallbackWithStatusTag::Run(bool) /proc/self/cwd/external/com_github_grpc_grpc/include/grpcpp/support/callback_common.h:128:5 (liblibnode_Umanager_Ucc_Ugrpc.so+0xb57db)
#21 grpc::internal::CallbackWithStatusTag::StaticRun(grpc_completion_queue_functor*, int) /proc/self/cwd/external/com_github_grpc_grpc/include/grpcpp/support/callback_common.h:112:46 (liblibnode_Umanager_Ucc_Ugrpc.so+0xb550e)
#22 grpc::(anonymous namespace)::CallbackAlternativeCQ::Ref()::'lambda'(void*)::operator()(void*) const /proc/self/cwd/external/com_github_grpc_grpc/src/cpp/common/completion_queue_cc.cc:94:17 (libexternal_Scom_Ugithub_Ugrpc_Ugrpc_Slibgrpc++_Ubase.so+0x1d1ccd)
#23 grpc::(anonymous namespace)::CallbackAlternativeCQ::Ref()::'lambda'(void*)::__invoke(void*) /proc/self/cwd/external/com_github_grpc_grpc/src/cpp/common/completion_queue_cc.cc:62:13 (libexternal_Scom_Ugithub_Ugrpc_Ugrpc_Slibgrpc++_Ubase.so+0x1d1a08)
#24 grpc_core::(anonymous namespace)::ThreadInternalsPosix::ThreadInternalsPosix(char const*, void (*)(void*), void*, bool*, grpc_core::Thread::Options const&)::'lambda'(void*)::operator()(void*) const /proc/self/cwd/external/com_github_grpc_grpc/src/core/lib/gprpp/posix/thd.cc:145:11 (libexternal_Scom_Ugithub_Ugrpc_Ugrpc_Slibgpr.so+0x1ea8e)
#25 grpc_core::(anonymous namespace)::ThreadInternalsPosix::ThreadInternalsPosix(char const*, void (*)(void*), void*, bool*, grpc_core::Thread::Options const&)::'lambda'(void*)::__invoke(void*) /proc/self/cwd/external/com_github_grpc_grpc/src/core/lib/gprpp/posix/thd.cc:115:9 (libexternal_Scom_Ugithub_Ugrpc_Ugrpc_Slibgpr.so+0x1e888)
Location is heap block of size 176 at 0x7b2c00000370 allocated by main thread:
#0 malloc <null> (gcs_health_check_manager_test+0x7cf14)
#1 operator new(unsigned long) <null> (libstdc++.so.6+0xaab28)
#2 std::allocator_traits<std::allocator<absl::lts_20230802::container_internal::AlignedType<8ul> > >::allocate(std::allocator<absl::lts_20230802::container_internal::AlignedType<8ul> >&, unsigned long) /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/alloc_traits.h:443:20 (liblibgcs_Userver_Ulib.so+0xd12f3a)
#3 void* absl::lts_20230802::container_internal::Allocate<8ul, std::allocator<char> >(std::allocator<char>*, unsigned long) /proc/self/cwd/external/com_google_absl/absl/container/internal/container_memory.h:65:13 (liblibgcs_Userver_Ulib.so+0xd128b5)
#4 void absl::lts_20230802::container_internal::InitializeSlots<std::allocator<char>, 48ul, 8ul>(absl::lts_20230802::container_internal::CommonFields&, std::allocator<char>) /proc/self/cwd/external/com_google_absl/absl/container/internal/raw_hash_set.h:1407:7 (liblibgcs_Userver_Ulib.so+0xd1a27b)
#5 absl::lts_20230802::container_internal::raw_hash_set<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >::initialize_slots() /proc/self/cwd/external/com_google_absl/absl/container/internal/raw_hash_set.h:2505:5 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x477db)
#6 absl::lts_20230802::container_internal::raw_hash_set<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >::resize(unsigned long) /proc/self/cwd/external/com_google_absl/absl/container/internal/raw_hash_set.h:2515:5 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x46d93)
#7 absl::lts_20230802::container_internal::raw_hash_set<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >::rehash_and_grow_if_necessary() /proc/self/cwd/external/com_google_absl/absl/container/internal/raw_hash_set.h:2603:7 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x4729e)
#8 absl::lts_20230802::container_internal::raw_hash_set<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >::prepare_insert(unsigned long) /proc/self/cwd/external/com_google_absl/absl/container/internal/raw_hash_set.h:2678:7 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x4697b)
#9 std::pair<unsigned long, bool> absl::lts_20230802::container_internal::raw_hash_set<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >::find_or_prepare_insert<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /proc/self/cwd/external/com_google_absl/absl/container/internal/raw_hash_set.h:2659:13 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x4626c)
#10 std::pair<absl::lts_20230802::container_internal::raw_hash_set<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >::iterator, bool> absl::lts_20230802::container_internal::raw_hash_map<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >::try_emplace_impl<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::shared_ptr<GuardedEventStats> >(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::shared_ptr<GuardedEventStats>&&) /proc/self/cwd/external/com_google_absl/absl/container/internal/raw_hash_map.h:202:22 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x45d44)
#11 std::pair<absl::lts_20230802::container_internal::raw_hash_set<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >::iterator, bool> absl::lts_20230802::container_internal::raw_hash_map<absl::lts_20230802::container_internal::FlatHashMapPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats> >, absl::lts_20230802::container_internal::StringHash, absl::lts_20230802::container_internal::StringEq, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<GuardedEventStats> > > >::try_emplace<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<GuardedEventStats>, 0>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::shared_ptr<GuardedEventStats>&&) /proc/self/cwd/external/com_google_absl/absl/container/internal/raw_hash_map.h:139:12 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x34d2f)
#12 EventTracker::GetOrCreate(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /proc/self/cwd/src/ray/common/event_stats.cc:175:29 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x2dbf6)
#13 EventTracker::RecordStart(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, long) /proc/self/cwd/src/ray/common/event_stats.cc:64:16 (libsrc_Sray_Scommon_Slibevent_Ustats.so+0x2d737)
#14 instrumented_io_context::dispatch(std::function<void ()>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /proc/self/cwd/src/ray/common/asio/instrumented_io_context.cc:114:37 (libsrc_Sray_Scommon_Slibasio.so+0x9f107)
#15 ray::gcs::GcsHealthCheckManager::MarkNodeHealthy(ray::NodeID const&) /proc/self/cwd/src/ray/gcs/gcs_server/gcs_health_check_manager.cc:86:15 (liblibgcs_Userver_Ulib.so+0xf730a7)
#16 GcsHealthCheckManagerTest_MarkHealthAndSkipCheck_Test::TestBody() /proc/self/cwd/src/ray/gcs/gcs_server/test/gcs_health_check_manager_test.cc:174:17 (gcs_health_check_manager_test+0x10c5c0)
#17 void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2612:10 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x168fac)
#18 void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2648:14 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x13fff1)
#19 testing::Test::Run() /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2687:5 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x110901)
#20 testing::TestInfo::Run() /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2836:11 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x111afd)
#21 testing::TestSuite::Run() /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:3015:30 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x11296c)
#22 testing::internal::UnitTestImpl::RunAllTests() /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:5920:44 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x12ccb1)
#23 bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2612:10 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x17263c)
#24 bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2648:14 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x1446b7)
#25 testing::UnitTest::Run() /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:5484:10 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x12c4be)
#26 RUN_ALL_TESTS() /proc/self/cwd/external/com_google_googletest/googletest/include/gtest/gtest.h:2317:73 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest_Umain.so+0xda7)
#27 main /proc/self/cwd/external/com_google_googletest/googlemock/src/gmock_main.cc:71:10 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest_Umain.so+0xd27)
Mutex M408836992013439344 is already destroyed.
Thread T51 'nexting_thread' (tid=92, running) created by main thread at:
#0 pthread_create <null> (gcs_health_check_manager_test+0x7e7db)
#1 grpc_core::(anonymous namespace)::ThreadInternalsPosix::ThreadInternalsPosix(char const*, void (*)(void*), void*, bool*, grpc_core::Thread::Options const&) /proc/self/cwd/external/com_github_grpc_grpc/src/core/lib/gprpp/posix/thd.cc:113:30 (libexternal_Scom_Ugithub_Ugrpc_Ugrpc_Slibgpr.so+0x1e20e)
#2 grpc_core::Thread::Thread(char const*, void (*)(void*), void*, bool*, grpc_core::Thread::Options const&) /proc/self/cwd/external/com_github_grpc_grpc/src/core/lib/gprpp/posix/thd.cc:199:15 (libexternal_Scom_Ugithub_Ugrpc_Ugrpc_Slibgpr.so+0x1db36)
#3 void __gnu_cxx::new_allocator<grpc_core::Thread>::construct<grpc_core::Thread, char const (&) [15], grpc::(anonymous namespace)::CallbackAlternativeCQ::Ref()::'lambda'(void*), grpc::CompletionQueue*&>(grpc_core::Thread*, char const (&) [15], grpc::(anonymous namespace)::CallbackAlternativeCQ::Ref()::'lambda'(void*)&&, grpc::CompletionQueue*&) /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/ext/new_allocator.h:146:23 (libexternal_Scom_Ugithub_Ugrpc_Ugrpc_Slibgrpc++_Ubase.so+0x1d196a)
#4 void std::allocator_traits<std::allocator<grpc_core::Thread> >::construct<grpc_core::Thread, char const (&) [15], grpc::(anonymous namespace)::CallbackAlternativeCQ::Ref()::'lambda'(void*), grpc::CompletionQueue*&>(std::allocator<grpc_core::Thread>&, grpc_core::Thread*, char const (&) [15], grpc::(anonymous namespace)::CallbackAlternativeCQ::Ref()::'lambda'(void*)&&, grpc::CompletionQueue*&) /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/alloc_traits.h:483:8 (libexternal_Scom_Ugithub_Ugrpc_Ugrpc_Slibgrpc++_Ubase.so+0x1d1390)
#5 grpc_core::Thread& std::vector<grpc_core::Thread, std::allocator<grpc_core::Thread> >::emplace_back<char const (&) [15], grpc::(anonymous namespace)::CallbackAlternativeCQ::Ref()::'lambda'(void*), grpc::CompletionQueue*&>(char const (&) [15], grpc::(anonymous namespace)::CallbackAlternativeCQ::Ref()::'lambda'(void*)&&, grpc::CompletionQueue*&) /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/vector.tcc:115:6 (libexternal_Scom_Ugithub_Ugrpc_Ugrpc_Slibgrpc++_Ubase.so+0x1d1213)
#6 grpc::(anonymous namespace)::CallbackAlternativeCQ::Ref() /proc/self/cwd/external/com_github_grpc_grpc/src/cpp/common/completion_queue_cc.cc:60:26 (libexternal_Scom_Ugithub_Ugrpc_Ugrpc_Slibgrpc++_Ubase.so+0x1d0c0c)
#7 grpc::CompletionQueue::CallbackAlternativeCQ() /proc/self/cwd/external/com_github_grpc_grpc/src/cpp/common/completion_queue_cc.cc:194:36 (libexternal_Scom_Ugithub_Ugrpc_Ugrpc_Slibgrpc++_Ubase.so+0x1d0a19)
#8 grpc::Server::CallbackCQ() /proc/self/cwd/external/com_github_grpc_grpc/src/cpp/server/server_cc.cc:1386:19 (libexternal_Scom_Ugithub_Ugrpc_Ugrpc_Slibgrpc++_Ubase.so+0x2288d1)
#9 grpc::Server::RegisterService(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const*, grpc::Service*) /proc/self/cwd/external/com_github_grpc_grpc/src/cpp/server/server_cc.cc:1074:35 (libexternal_Scom_Ugithub_Ugrpc_Ugrpc_Slibgrpc++_Ubase.so+0x225e2d)
#10 grpc::Server::Start(grpc::ServerCompletionQueue**, unsigned long) /proc/self/cwd/external/com_github_grpc_grpc/src/cpp/server/server_cc.cc:1172:5 (libexternal_Scom_Ugithub_Ugrpc_Ugrpc_Slibgrpc++_Ubase.so+0x227123)
#11 grpc::ServerBuilder::BuildAndStart() /proc/self/cwd/external/com_github_grpc_grpc/src/cpp/server/server_builder.cc:446:11 (libexternal_Scom_Ugithub_Ugrpc_Ugrpc_Slibgrpc++_Ubase.so+0x2089ce)
#12 ray::rpc::GrpcServer::Run() /proc/self/cwd/src/ray/rpc/grpc_server.cc:122:21 (liblibgrpc_Ucommon_Ulib.so+0x60476)
#13 GcsHealthCheckManagerTest::AddServer(bool) /proc/self/cwd/src/ray/gcs/gcs_server/test/gcs_health_check_manager_test.cc:91:13 (gcs_health_check_manager_test+0x1125c3)
#14 GcsHealthCheckManagerTest_MarkHealthAndSkipCheck_Test::TestBody() /proc/self/cwd/src/ray/gcs/gcs_server/test/gcs_health_check_manager_test.cc:167:18 (gcs_health_check_manager_test+0x10c3a1)
#15 void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2612:10 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x168fac)
#16 void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2648:14 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x13fff1)
#17 testing::Test::Run() /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2687:5 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x110901)
#18 testing::TestInfo::Run() /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2836:11 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x111afd)
#19 testing::TestSuite::Run() /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:3015:30 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x11296c)
#20 testing::internal::UnitTestImpl::RunAllTests() /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:5920:44 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x12ccb1)
#21 bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2612:10 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x17263c)
#22 bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2648:14 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x1446b7)
#23 testing::UnitTest::Run() /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:5484:10 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so+0x12c4be)
#24 RUN_ALL_TESTS() /proc/self/cwd/external/com_google_googletest/googletest/include/gtest/gtest.h:2317:73 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest_Umain.so+0xda7)
#25 main /proc/self/cwd/external/com_google_googletest/googlemock/src/gmock_main.cc:71:10 (libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest_Umain.so+0xd27)
SUMMARY: ThreadSanitizer: data race /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/shared_ptr_base.h:729:6 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count()
Signed-off-by: dentiny <dentinyhao@gmail.com>
I am not 100% convinced by the necessity of capturing the shared ptr in the lambda, but it's not too bad to keep it there. Approving.
&context_,
&request_,
&response_,
[this, start = now, manager = manager](::grpc::Status status) {
We don't need to capture `manager` here. Since we captured `this`, we have the weak_ptr of the manager and we just need to do another `weak_ptr.lock()` inside the lambda.
ok, #49493
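The suggestion in the comment above (re-lock the weak pointer inside the callback instead of capturing an extra shared pointer) can be sketched roughly as follows. This is a minimal illustration, not the code from this PR or from #49493: `Manager`, `CheckContext`, `MakeCallback`, and `RunWeakPtrDemo` are hypothetical stand-ins, and the sketch assumes the context object itself outlives the callback, since the lambda captures `this`.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical stand-in for the health check manager.
struct Manager {
  int healthy_reports = 0;
};

// Hypothetical stand-in for a per-node health check context.
struct CheckContext {
  explicit CheckContext(std::weak_ptr<Manager> manager)
      : manager_(std::move(manager)) {}

  // Builds a callback similar in shape to the one handed to an async RPC.
  // It captures only `this`; the manager is reached through manager_.lock().
  std::function<bool(bool)> MakeCallback() {
    return [this](bool ok) {
      std::shared_ptr<Manager> manager = manager_.lock();
      if (manager == nullptr) {
        return false;  // Manager already destroyed: skip the update.
      }
      if (ok) {
        ++manager->healthy_reports;
      }
      return true;  // Manager was alive for the whole update.
    };
  }

  std::weak_ptr<Manager> manager_;
};

// Runs the callback once while the manager is alive and once after it is gone.
// Returns true when both calls behave as described in the comments above.
bool RunWeakPtrDemo() {
  auto manager = std::make_shared<Manager>();
  CheckContext context(manager);
  auto callback = context.MakeCallback();

  bool first = callback(true);                 // Manager alive.
  bool counted = manager->healthy_reports == 1;
  manager.reset();                             // Owner drops the manager.
  bool second = callback(true);                // lock() now fails.
  return first && counted && !second;
}
```

The trade-off is that the callback does nothing once the manager is gone, whereas capturing a shared pointer would keep the manager alive until the callback finishes.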
Fix issue: #49469

The bug is:
- We have two event loops in the GCS health check manager, one for the asio io context and one for gRPC.
- In the unit test and in production code elsewhere, we only synchronize on the io context via `io_context::stop`, but not on gRPC.
- This leads to gRPC still accessing `GcsHealthCheckManager` while we **mistakenly think** all async operations have been properly synchronized.

In this PR, I use a shared pointer to make sure all accesses to the GCS health check manager are valid, even if the io context has been stopped. The PR also contains a fix to the data member declaration order to respect the usage dependency.

---------

Signed-off-by: dentiny <dentinyhao@gmail.com>
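The lifetime problem described in this PR can be illustrated with a small standalone sketch. It is not Ray's code: `HealthManager` and `LateCallbackSeesLiveManager` are hypothetical names, and a plain `std::thread` stands in for the gRPC callback thread. The point it tries to show is that a callback holding a `std::shared_ptr` keeps the manager alive even after the owning side has released it, so a late callback does not touch a destroyed object.

```cpp
#include <atomic>
#include <cassert>
#include <future>
#include <memory>
#include <thread>

// Hypothetical stand-in for the manager; records its own destruction.
struct HealthManager {
  explicit HealthManager(std::atomic<bool>* destroyed) : destroyed_(destroyed) {}
  ~HealthManager() { destroyed_->store(true); }
  void OnReply() { ++replies; }

  int replies = 0;
  std::atomic<bool>* destroyed_;
};

// Returns true if the manager was still alive when the late callback ran
// and was destroyed only after the callback released its reference.
bool LateCallbackSeesLiveManager() {
  std::atomic<bool> destroyed{false};
  auto manager = std::make_shared<HealthManager>(&destroyed);

  std::promise<void> owner_released;
  std::future<void> released = owner_released.get_future();
  bool alive_in_callback = false;

  // Stand-in for the second event loop: it runs its callback only after the
  // owner has already dropped its reference to the manager.
  std::thread rpc_thread(
      [manager, &released, &destroyed, &alive_in_callback]() mutable {
        released.wait();
        alive_in_callback = !destroyed.load();  // Still alive: we hold a ref.
        manager->OnReply();
        manager.reset();  // Last reference goes away here.
      });

  manager.reset();             // Owner side "shuts down" first.
  owner_released.set_value();  // Let the late callback run.
  rpc_thread.join();

  return alive_in_callback && destroyed.load();
}
```

With a raw pointer capture instead of the shared pointer, the same ordering would have the callback dereference an object that the owner already destroyed, which matches the kind of race reported in the trace above.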