KMS: terminate called after throwing an instance of '__gnu_cxx::recursive_init_error' #27727
Comments
Can you provide the following additional information:
It's likely this will need to be reported upstream to the gRPC team (https://github.com/grpc/grpc) as that's where the assertion failure appears to be coming from, but it would be good to get some of this Ruby-side runtime context in a report. Thanks!
Thanks for your quick reply @dazuma.
Also probably worth noting that the Alpine image we're running is musl-based (rather than glibc-based). I've just set off a test run with the gems updated (apologies - they've been updated within the last week) to see if the issue is still present.
This is from within a Gem - I'll try monkeypatching it out and get back to you.
From my last test run, 1824 calls were successful and returned correct results. 1 failed.
We're running Puma, but with a multi-thread configuration rather than multi-process. There could potentially be a threading issue?
Possibly?
I've run some more tests. Upgrading the gems to the latest (compatible) versions (noting Removing the I did, however, make the following change (appreciating that this is a toy example):
require "google/cloud/kms"
kms_mutex = Mutex.new
client = Google::Cloud::Kms.key_management_service do |config|
  config.timeout = 2
end
kms_mutex.synchronize do
  client.decrypt(name: key_id, ciphertext: encrypted_data_key).plaintext
end
This does appear to have resolved the issue. I'm proposing that there is some sort of multi-threading issue at play here.
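For anyone wanting to try the same workaround from several threads (as under Puma), here is a minimal sketch of the idea: one shared Mutex wrapped around every decrypt call. `FakeKmsClient` and `SerializedDecryptor` are hypothetical names invented for this illustration, and the fake client only mimics the shape of `decrypt(name:, ciphertext:)` so the snippet runs without credentials or the grpc gem. It shows the locking pattern only and does not reproduce the crash.

```ruby
# Hypothetical stand-in for the real KMS client (an assumption for this
# sketch). It reverses the ciphertext so results are easy to check.
class FakeKmsClient
  Response = Struct.new(:plaintext)

  def decrypt(name:, ciphertext:)
    Response.new(ciphertext.reverse)
  end
end

# Hypothetical wrapper: funnels every decrypt call through a single Mutex,
# so at most one thread is inside the wrapped client at any time.
class SerializedDecryptor
  def initialize(client)
    @client = client
    @mutex = Mutex.new
  end

  def decrypt(name:, ciphertext:)
    @mutex.synchronize do
      @client.decrypt(name: name, ciphertext: ciphertext).plaintext
    end
  end
end

decryptor = SerializedDecryptor.new(FakeKmsClient.new)

# Eight threads each issue 200 calls through the shared wrapper.
threads = 8.times.map do |i|
  Thread.new do
    200.times.map { |j| decryptor.decrypt(name: "key", ciphertext: "#{i}-#{j}") }
  end
end

# Thread#value joins each thread and returns its block's result.
results = threads.flat_map(&:value)
puts "#{results.size} calls completed"
```

With the real client, passing it to the wrapper in place of the fake one would serialize all KMS calls, at the cost of losing any concurrency between them.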
Your threading issue theory does seem to be supported by the text of the exception coming out of grpc ("__gnu_cxx::recursive_init_error"). "recursive init" sounds to me like a reentrancy issue. And if it's coming from C++, we wouldn't have much visibility from the Ruby side. One other thing I notice, though. My understanding is that the |
Yeah - we've I'll see if I can convince bundler to not use the pre-compiled version. |
Okay. Building from source also seems to have made the problem go away. Gemfile snippet for completeness:
You weren't wrong, though - this added about 10 minutes to our build. I guess we could look into pre-building gRPC / google-protobuf for the correct platform to mitigate this. Really odd that this is the way it's failing, though, and that the mutex also appears to have made the problem go away.
My guess, from what we're seeing so far, is that there's a thread safety issue in the allocator in
It would of course be better if we could get musl-based binary gem releases of google-protobuf and grpc. Not sure how feasible that will be, but we can look into it.
We're seeing an intermittent Ruby crash after calling the KMS service.
Please see example logs below:
gRPC appears to be the only C++ native gem we use, which is used exclusively by Google Cloud KMS:
Environment details
google-cloud-kms@2.8.2
Steps to reproduce
Code example
Full backtrace
(Yes, that really is it. There's no Ruby backtrace. I've nothing else to go on.)