[BUG]: Intermittent test failures for clang & gcc-coverage #379
Description
Version
23.11
Which installation method(s) does this occur on?
Source
Describe the bug.
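The RPC and control plane tests intermittently fail with a SIGSEGV in the clang and gcc-coverage builds. In every trace below the crash is a jump to a null address (PC 0x0), reached through boost::fibers::promise<>::set_value() and boost::fibers::wait_queue::notify_all().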
Examples:
clang:
[ RUN ] TestRPC.StreamingPingPong
*** Aborted at 1694016732 (unix time) try "date -d @1694016732" if you are using GNU date ***
PC: @ 0x0 (unknown)
*** SIGSEGV (@0x0) received by PID 4166 (TID 0x7f84edca8000) from PID 0; stack trace: ***
@ 0x7f84fc542197 google::(anonymous namespace)::FailureSignalHandler()
@ 0x7f84fc8ce420 (unknown)
@ 0x7f84fc4f4f2e boost::fibers::wait_queue::notify_all()
@ 0x7f84fc4f2ee3 boost::fibers::condition_variable_any::notify_all()
@ 0x7f84fcbe6202 boost::fibers::promise<>::set_value()
@ 0x7f84fcbe50de _ZNK5rxcpp6detail17specific_observerIN3mrc3rpc13ProgressEventENS_8observerIS4_NS0_22stateless_observer_tagEZNS2_4node11GenericSinkIS4_NS2_8runnable7ContextEEC1EvEUlS4_E_NS0_12OnErrorEmptyEvEEvE7on_nextEOS4_
@ 0x7f84fcbe5b2a _ZNK5rxcpp6detail17specific_observerIN3mrc3rpc13ProgressEventENS_8observerIS4_NS0_22stateless_observer_tagEZNS2_4node6RxSinkIS4_NS2_8runnable7ContextEE12do_subscribeERNS_22composite_subscriptionEEUlS4_E_ZNSB_12do_subscribeESD_EUlNSt15__exception_ptr13exception_ptrEE_ZNSB_12do_subscribeESD_EUlvE_EEvE7on_nextEOS4_
@ 0x7f84fcbd7d3a rxcpp::subscriber<>::nextdetacher::operator()<>()
@ 0x7f84fcbe1b61 mrc::node::RxSinkBase<>::progress_engine()
@ 0x7f84fcbe12f7 _ZN5rxcpp12on_exceptionIZNKS_7sources6detail6createIN3mrc3rpc13ProgressEventEZNS4_4node10RxSinkBaseIS6_EC1EvEUlNS_10subscriberIS6_NS_8observerIS6_vvvvEEEEE_E12on_subscribeISD_EEvT_EUlvE_SD_EENSt9enable_ifIXsr13is_subscriberIT0_EE5valueENS_6detail17maybe_from_resultISH_E4typeEE4typeERKSH_RKSK_
@ 0x7f84fcbe11c5 _ZSt13__invoke_implIvRZN5rxcpp18dynamic_observableIN3mrc3rpc13ProgressEventEE9constructINS0_7sources6detail6createIS4_ZNS2_4node10RxSinkBaseIS4_EC1EvEUlNS0_10subscriberIS4_NS0_8observerIS4_vvvvEEEEE_EEEEvOT_ONS7_10tag_sourceEEUlSG_E_JSG_EESJ_St14__invoke_otherOT0_DpOT1_
@ 0x7f84fcbd8c6b rxcpp::detail::safe_subscriber<>::subscribe()
@ 0x7f84fcbdd43a rxcpp::schedulers::detail::action_tailrecurser::operator()()
@ 0x559f17344986 rxcpp::schedulers::current_thread::current_worker::schedule()
@ 0x7f84fcbd896b _ZNK5rxcpp10schedulers6worker8scheduleIRNS_6detail15safe_subscriberINS_18dynamic_observableIN3mrc3rpc13ProgressEventEEENS_10subscriberIS8_NS_8observerIS8_vvvvEEEEEEJEEENSt9enable_ifIXaaoosr6detail18is_action_functionIT_EE5valuesr15is_subscriptionISH_EE5valuentsr14is_schedulableISH_EE5valueEvE4typeEOSH_DpOT0_
@ 0x7f84fcbd84d7 rxcpp::observable<>::detail_subscribe<>()
@ 0x7f84fcbe5978 rxcpp::observable<>::subscribe<>()
@ 0x7f84fcbdef91 mrc::node::RxSink<>::do_subscribe()
@ 0x559f17337d04 mrc::node::RxRunnable<>::run()
@ 0x7f84fcbcfabe mrc::runnable::RunnableWithContext<>::main()
@ 0x7f84fcdcb9e5 std::_Function_handler<>::_M_invoke()
@ 0x7f84fccf5a76 boost::fibers::detail::task_object<>::run()
@ 0x7f84fccf5db2 _ZN5boost6fibers6detail11task_objectIZN3mrc4core14FiberTaskQueue7enqueueISt8functionIFvvEEJEEENS0_6futureINSt9result_ofIFT_DpT0_EE4typeEEEONS3_13FiberMetaDataEOSC_DpOSD_EUlvE_SaINS0_13packaged_taskIS8_EEEvJEE3runEv
@ 0x7f84fcd1409e boost::fibers::worker_context<>::run_()
@ 0x7f84fcd141bd boost::context::detail::fiber_entry<>()
@ 0x7f84fc4e511f make_fcontext
gcc-coverage:
[ RUN ] TestControlPlane.DoubleClientConnectExchangeDisconnect
*** Aborted at 1694020510 (unix time) try "date -d @1694020510" if you are using GNU date ***
PC: @ 0x0 (unknown)
*** SIGSEGV (@0x0) received by PID 26896 (TID 0x7f95767fc000) from PID 0; stack trace: ***
@ 0x7f963010b197 google::(anonymous namespace)::FailureSignalHandler()
@ 0x7f96304a0420 (unknown)
@ 0x7f96300bdf2e boost::fibers::wait_queue::notify_all()
@ 0x7f96300bbee3 boost::fibers::condition_variable_any::notify_all()
@ 0x7f963107bf20 boost::fibers::condition_variable::notify_all()
@ 0x7f963107c12f boost::fibers::detail::shared_state_base::mark_ready_and_notify_()
@ 0x7f96310b686b boost::fibers::detail::shared_state<>::set_value_()
@ 0x7f96310aa8fc boost::fibers::detail::shared_state<>::set_value()
@ 0x7f963109ca83 boost::fibers::promise<>::set_value()
@ 0x7f963108f5c3 mrc::rpc::PromiseHandler::on_data()
@ 0x7f96310ad02a _ZZN3mrc4node11GenericSinkINS_3rpc13ProgressEventENS_8runnable7ContextEEC4EvENKUlS3_E_clES3_
@ 0x7f9631115470 _ZNK5rxcpp8observerIN3mrc3rpc13ProgressEventENS_6detail22stateless_observer_tagEZNS1_4node11GenericSinkIS3_NS1_8runnable7ContextEEC4EvEUlS3_E_NS4_12OnErrorEmptyEvE7on_nextEOS3_
@ 0x7f9631107c92 _ZNK5rxcpp6detail17specific_observerIN3mrc3rpc13ProgressEventENS_8observerIS4_NS0_22stateless_observer_tagEZNS2_4node11GenericSinkIS4_NS2_8runnable7ContextEEC4EvEUlS4_E_NS0_12OnErrorEmptyEvEEvE7on_nextEOS4_
@ 0x7f96310db796 rxcpp::observer<>::on_next<>()
@ 0x7f963110e90c _ZZN3mrc4node6RxSinkINS_3rpc13ProgressEventENS_8runnable7ContextEE12do_subscribeERN5rxcpp22composite_subscriptionEENKUlS3_E_clES3_
@ 0x7f9631162662 _ZNK5rxcpp8observerIN3mrc3rpc13ProgressEventENS_6detail22stateless_observer_tagEZNS1_4node6RxSinkIS3_NS1_8runnable7ContextEE12do_subscribeERNS_22composite_subscriptionEEUlS3_E_ZNSA_12do_subscribeESC_EUlNSt15__exception_ptr13exception_ptrEE0_ZNSA_12do_subscribeESC_EUlvE1_E7on_nextEOS3_
@ 0x7f963115cbfc _ZNK5rxcpp6detail17specific_observerIN3mrc3rpc13ProgressEventENS_8observerIS4_NS0_22stateless_observer_tagEZNS2_4node6RxSinkIS4_NS2_8runnable7ContextEE12do_subscribeERNS_22composite_subscriptionEEUlS4_E_ZNSB_12do_subscribeESD_EUlNSt15__exception_ptr13exception_ptrEE0_ZNSB_12do_subscribeESD_EUlvE1_EEvE7on_nextEOS4_
@ 0x7f96310db796 rxcpp::observer<>::on_next<>()
@ 0x7f96310d249a rxcpp::subscriber<>::nextdetacher::operator()<>()
@ 0x7f96310c8d86 rxcpp::subscriber<>::on_next<>()
@ 0x7f96310c06c3 mrc::node::RxSinkBase<>::progress_engine()
@ 0x7f96310b75ee _ZZN3mrc4node10RxSinkBaseINS_3rpc13ProgressEventEEC4EvENKUlN5rxcpp10subscriberIS3_NS5_8observerIS3_vvvvEEEEE_clES9_
@ 0x7f96310dbc82 _ZZNK5rxcpp7sources6detail6createIN3mrc3rpc13ProgressEventEZNS3_4node10RxSinkBaseIS5_EC4EvEUlNS_10subscriberIS5_NS_8observerIS5_vvvvEEEEE_E12on_subscribeISC_EEvT_ENKUlvE_clEv
@ 0x7f96310e348f _ZN5rxcpp12on_exceptionIZNKS_7sources6detail6createIN3mrc3rpc13ProgressEventEZNS4_4node10RxSinkBaseIS6_EC4EvEUlNS_10subscriberIS6_NS_8observerIS6_vvvvEEEEE_E12on_subscribeISD_EEvT_EUlvE_SD_EENSt9enable_ifIXsrNS_13is_subscriberIT0_EE5valueENS_6detail17maybe_from_resultISH_E4typeEE4typeERKSH_RKSL_
@ 0x7f96310dbd68 _ZNK5rxcpp7sources6detail6createIN3mrc3rpc13ProgressEventEZNS3_4node10RxSinkBaseIS5_EC4EvEUlNS_10subscriberIS5_NS_8observerIS5_vvvvEEEEE_E12on_subscribeISC_EEvT_
@ 0x7f96310d2a34 _ZZN5rxcpp18dynamic_observableIN3mrc3rpc13ProgressEventEE9constructINS_7sources6detail6createIS3_ZNS1_4node10RxSinkBaseIS3_EC4EvEUlNS_10subscriberIS3_NS_8observerIS3_vvvvEEEEE_EEEEvOT_ONS6_10tag_sourceEENUlSF_E_clESF_
@ 0x7f96310f498f _ZSt13__invoke_implIvRZN5rxcpp18dynamic_observableIN3mrc3rpc13ProgressEventEE9constructINS0_7sources6detail6createIS4_ZNS2_4node10RxSinkBaseIS4_EC4EvEUlNS0_10subscriberIS4_NS0_8observerIS4_vvvvEEEEE_EEEEvOT_ONS7_10tag_sourceEEUlSG_E_JSG_EESJ_St14__invoke_otherOT0_DpOT1_
@ 0x7f96310ef937 _ZSt10__invoke_rIvRZN5rxcpp18dynamic_observableIN3mrc3rpc13ProgressEventEE9constructINS0_7sources6detail6createIS4_ZNS2_4node10RxSinkBaseIS4_EC4EvEUlNS0_10subscriberIS4_NS0_8observerIS4_vvvvEEEEE_EEEEvOT_ONS7_10tag_sourceEEUlSG_E_JSG_EENSt9enable_ifIX16is_invocable_r_vISJ_T0_DpT1_EESJ_E4typeEOSQ_DpOSR_
@ 0x7f96310ea198 _ZNSt17_Function_handlerIFvN5rxcpp10subscriberIN3mrc3rpc13ProgressEventENS0_8observerIS4_vvvvEEEEEZNS0_18dynamic_observableIS4_E9constructINS0_7sources6detail6createIS4_ZNS2_4node10RxSinkBaseIS4_EC4EvEUlS7_E_EEEEvOT_ONSC_10tag_sourceEEUlS7_E_E9_M_invokeERKSt9_Any_dataOS7_
@ 0x7f963113115a std::function<>::operator()()
@ 0x7f963112d1cb rxcpp::dynamic_observable<>::on_subscribe()
@ 0x7f9631126d2d rxcpp::detail::safe_subscriber<>::subscribe()
clang:
[ RUN ] TestControlPlane.SingleClientConnectDisconnect
*** Aborted at 1694020471 (unix time) try "date -d @1694020471" if you are using GNU date ***
PC: @ 0x0 (unknown)
*** SIGSEGV (@0x0) received by PID 4165 (TID 0x7fcce0ff9000) from PID 0; stack trace: ***
@ 0x7fcd8ebd6197 google::(anonymous namespace)::FailureSignalHandler()
@ 0x7fcd8ef62420 (unknown)
@ 0x7fcd8eb88f2e boost::fibers::wait_queue::notify_all()
@ 0x7fcd8eb86ee3 boost::fibers::condition_variable_any::notify_all()
@ 0x7fcd8f27a202 boost::fibers::promise<>::set_value()
@ 0x7fcd8f2790de _ZNK5rxcpp6detail17specific_observerIN3mrc3rpc13ProgressEventENS_8observerIS4_NS0_22stateless_observer_tagEZNS2_4node11GenericSinkIS4_NS2_8runnable7ContextEEC1EvEUlS4_E_NS0_12OnErrorEmptyEvEEvE7on_nextEOS4_
@ 0x7fcd8f279b2a _ZNK5rxcpp6detail17specific_observerIN3mrc3rpc13ProgressEventENS_8observerIS4_NS0_22stateless_observer_tagEZNS2_4node6RxSinkIS4_NS2_8runnable7ContextEE12do_subscribeERNS_22composite_subscriptionEEUlS4_E_ZNSB_12do_subscribeESD_EUlNSt15__exception_ptr13exception_ptrEE_ZNSB_12do_subscribeESD_EUlvE_EEvE7on_nextEOS4_
@ 0x7fcd8f26bd3a rxcpp::subscriber<>::nextdetacher::operator()<>()
@ 0x7fcd8f275b61 mrc::node::RxSinkBase<>::progress_engine()
@ 0x7fcd8f2752f7 _ZN5rxcpp12on_exceptionIZNKS_7sources6detail6createIN3mrc3rpc13ProgressEventEZNS4_4node10RxSinkBaseIS6_EC1EvEUlNS_10subscriberIS6_NS_8observerIS6_vvvvEEEEE_E12on_subscribeISD_EEvT_EUlvE_SD_EENSt9enable_ifIXsr13is_subscriberIT0_EE5valueENS_6detail17maybe_from_resultISH_E4typeEE4typeERKSH_RKSK_
@ 0x7fcd8f2751c5 _ZSt13__invoke_implIvRZN5rxcpp18dynamic_observableIN3mrc3rpc13ProgressEventEE9constructINS0_7sources6detail6createIS4_ZNS2_4node10RxSinkBaseIS4_EC1EvEUlNS0_10subscriberIS4_NS0_8observerIS4_vvvvEEEEE_EEEEvOT_ONS7_10tag_sourceEEUlSG_E_JSG_EESJ_St14__invoke_otherOT0_DpOT1_
@ 0x7fcd8f26cc6b rxcpp::detail::safe_subscriber<>::subscribe()
@ 0x7fcd8f27143a rxcpp::schedulers::detail::action_tailrecurser::operator()()
@ 0x559f67bd5986 rxcpp::schedulers::current_thread::current_worker::schedule()
@ 0x7fcd8f26c96b _ZNK5rxcpp10schedulers6worker8scheduleIRNS_6detail15safe_subscriberINS_18dynamic_observableIN3mrc3rpc13ProgressEventEEENS_10subscriberIS8_NS_8observerIS8_vvvvEEEEEEJEEENSt9enable_ifIXaaoosr6detail18is_action_functionIT_EE5valuesr15is_subscriptionISH_EE5valuentsr14is_schedulableISH_EE5valueEvE4typeEOSH_DpOT0_
@ 0x7fcd8f26c4d7 rxcpp::observable<>::detail_subscribe<>()
@ 0x7fcd8f279978 rxcpp::observable<>::subscribe<>()
@ 0x7fcd8f272f91 mrc::node::RxSink<>::do_subscribe()
@ 0x559f67bc8d04 mrc::node::RxRunnable<>::run()
@ 0x7fcd8f263abe mrc::runnable::RunnableWithContext<>::main()
@ 0x7fcd8f45f9e5 std::_Function_handler<>::_M_invoke()
@ 0x7fcd8f389a76 boost::fibers::detail::task_object<>::run()
@ 0x7fcd8f389db2 _ZN5boost6fibers6detail11task_objectIZN3mrc4core14FiberTaskQueue7enqueueISt8functionIFvvEEJEEENS0_6futureINSt9result_ofIFT_DpT0_EE4typeEEEONS3_13FiberMetaDataEOSC_DpOSD_EUlvE_SaINS0_13packaged_taskIS8_EEEvJEE3runEv
@ 0x7fcd8f3a809e boost::fibers::worker_context<>::run_()
@ 0x7fcd8f3a81bd boost::context::detail::fiber_entry<>()
@ 0x7fcd8eb7911f make_fcontext
clang:
[ RUN ] TestControlPlane.SingleClientConnectDisconnect
*** Aborted at 1694020205 (unix time) try "date -d @1694020205" if you are using GNU date ***
PC: @ 0x0 (unknown)
*** SIGSEGV (@0x0) received by PID 347426 (TID 0x7f51a9ffb000) from PID 0; stack trace: ***
@ 0x7f5260053197 google::(anonymous namespace)::FailureSignalHandler()
@ 0x7f52603df420 (unknown)
@ 0x7f5260005f2e boost::fibers::wait_queue::notify_all()
@ 0x7f5260003ee3 boost::fibers::condition_variable_any::notify_all()
@ 0x7f52606f7202 boost::fibers::promise<>::set_value()
@ 0x7f52606f60de _ZNK5rxcpp6detail17specific_observerIN3mrc3rpc13ProgressEventENS_8observerIS4_NS0_22stateless_observer_tagEZNS2_4node11GenericSinkIS4_NS2_8runnable7ContextEEC1EvEUlS4_E_NS0_12OnErrorEmptyEvEEvE7on_nextEOS4_
@ 0x7f52606f6b2a _ZNK5rxcpp6detail17specific_observerIN3mrc3rpc13ProgressEventENS_8observerIS4_NS0_22stateless_observer_tagEZNS2_4node6RxSinkIS4_NS2_8runnable7ContextEE12do_subscribeERNS_22composite_subscriptionEEUlS4_E_ZNSB_12do_subscribeESD_EUlNSt15__exception_ptr13exception_ptrEE_ZNSB_12do_subscribeESD_EUlvE_EEvE7on_nextEOS4_
@ 0x7f52606e8d3a rxcpp::subscriber<>::nextdetacher::operator()<>()
@ 0x7f52606f2b61 mrc::node::RxSinkBase<>::progress_engine()
@ 0x7f52606f22f7 _ZN5rxcpp12on_exceptionIZNKS_7sources6detail6createIN3mrc3rpc13ProgressEventEZNS4_4node10RxSinkBaseIS6_EC1EvEUlNS_10subscriberIS6_NS_8observerIS6_vvvvEEEEE_E12on_subscribeISD_EEvT_EUlvE_SD_EENSt9enable_ifIXsr13is_subscriberIT0_EE5valueENS_6detail17maybe_from_resultISH_E4typeEE4typeERKSH_RKSK_
@ 0x7f52606f21c5 _ZSt13__invoke_implIvRZN5rxcpp18dynamic_observableIN3mrc3rpc13ProgressEventEE9constructINS0_7sources6detail6createIS4_ZNS2_4node10RxSinkBaseIS4_EC1EvEUlNS0_10subscriberIS4_NS0_8observerIS4_vvvvEEEEE_EEEEvOT_ONS7_10tag_sourceEEUlSG_E_JSG_EESJ_St14__invoke_otherOT0_DpOT1_
@ 0x7f52606e9c6b rxcpp::detail::safe_subscriber<>::subscribe()
@ 0x7f52606ee43a rxcpp::schedulers::detail::action_tailrecurser::operator()()
@ 0x55f060ea13d6 rxcpp::schedulers::current_thread::current_worker::schedule()
@ 0x7f52606e996b _ZNK5rxcpp10schedulers6worker8scheduleIRNS_6detail15safe_subscriberINS_18dynamic_observableIN3mrc3rpc13ProgressEventEEENS_10subscriberIS8_NS_8observerIS8_vvvvEEEEEEJEEENSt9enable_ifIXaaoosr6detail18is_action_functionIT_EE5valuesr15is_subscriptionISH_EE5valuentsr14is_schedulableISH_EE5valueEvE4typeEOSH_DpOT0_
@ 0x7f52606e94d7 rxcpp::observable<>::detail_subscribe<>()
@ 0x7f52606f6978 rxcpp::observable<>::subscribe<>()
@ 0x7f52606eff91 mrc::node::RxSink<>::do_subscribe()
@ 0x55f060e94754 mrc::node::RxRunnable<>::run()
@ 0x7f52606e0abe mrc::runnable::RunnableWithContext<>::main()
@ 0x7f52608dc9e5 std::_Function_handler<>::_M_invoke()
@ 0x7f5260806a76 boost::fibers::detail::task_object<>::run()
@ 0x7f5260806db2 _ZN5boost6fibers6detail11task_objectIZN3mrc4core14FiberTaskQueue7enqueueISt8functionIFvvEEJEEENS0_6futureINSt9result_ofIFT_DpT0_EE4typeEEEONS3_13FiberMetaDataEOSC_DpOSD_EUlvE_SaINS0_13packaged_taskIS8_EEEvJEE3runEv
@ 0x7f526082509e boost::fibers::worker_context<>::run_()
@ 0x7f52608251bd boost::context::detail::fiber_entry<>()
@ 0x7f525fff611f make_fcontext
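Many frames in the traces above are still mangled (the `_ZN...` and `_ZZN...` entries), which makes it hard to see which lambda or constructor is on the stack. As a reading aid only, here is a minimal sketch of a demangling helper. It assumes a GCC or Clang toolchain that provides `abi::__cxa_demangle` from `<cxxabi.h>`; the function name `demangle` is hypothetical and not part of MRC.

```cpp
#include <cxxabi.h>

#include <cstdlib>
#include <memory>
#include <string>

// Hypothetical helper (not part of MRC): returns the human-readable form of an
// Itanium-ABI mangled symbol, or the input unchanged if it cannot be demangled.
std::string demangle(const std::string& symbol)
{
    int status = 0;

    // __cxa_demangle allocates the result with malloc, so release it with free.
    std::unique_ptr<char, decltype(&std::free)> result(
        abi::__cxa_demangle(symbol.c_str(), nullptr, nullptr, &status), &std::free);

    // status == 0 means success; any other value leaves the symbol as-is.
    if (status == 0 && result)
    {
        return std::string(result.get());
    }
    return symbol;
}
```

For example, `demangle("_ZN3foo3barEv")` yields `foo::bar()`. Passing one of the mangled frames from the traces should expand it into the full template and lambda signature, which helps when comparing the clang and gcc-coverage stacks side by side. The `c++filt` command-line tool does the same job if a shell is more convenient.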
Minimum reproducible example
Open a PR
https://github.com/nv-morpheus/MRC/pull/378
Relevant log output
No response
Full env printout
No response
Other/Misc.
No response
Code of Conduct
- I agree to follow MRC's Code of Conduct
- I have searched the open bugs and have found no duplicates for this bug report
Status
Done