
Unhandled exception Type=Segmentation error #101

Open
Jenson3210 opened this issue Dec 16, 2024 · 7 comments

@Jenson3210

Hi,

We're using Open Liberty 24.0.0.11-full-java17-openj9-ubi to run our Java EE EAR application.
However, we think 24.0.0.12-full-java17-openj9-ubi would not work either.

We're seeing crashes with the following error log:

Unhandled exception
Type=Segmentation error vmState=0x00040000
J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000001
Handler1=00007F780A6CFD60 Handler2=00007F780BE5C750 InaccessibleAddress=0000000000000000
RDI=00007F77A6D04090 RSI=0000000000000000 RAX=0000000000000000 RBX=00007F78040F2040
RCX=00007F77A6D04120 RDX=000000000000FFFF R8=00007F78040F2040 R9=0000000000000050
R10=00000000FFFFFFFF R11=0000000000000001 R12=00007F77A6D04120 R13=00007F77A6D04070
R14=00007F77A6D04090 R15=0000000000000001
RIP=00007F7808D6024A GS=0000 FS=0000 RSP=00007F77A6D04000
EFlags=0000000000010246 CS=0033 RBP=00007F77A6D04090 ERR=0000000000000004
TRAPNO=000000000000000E OLDMASK=0000000000000000 CR2=0000000000000000
xmm0=0bf42630b44dd101 (f: 3024998656.000000, d: 4.397249e-251)
xmm1=0bf4263000000000 (f: 0.000000, d: 4.397246e-251)
xmm2=00000000b7654321 (f: 3076866816.000000, d: 1.520174e-314)
xmm3=0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm4=00000000ff000000 (f: 4278190080.000000, d: 2.113707e-314)
xmm5=0000003000000020 (f: 32.000000, d: 1.018558e-312)
xmm6=00000000ff000000 (f: 4278190080.000000, d: 2.113707e-314)
xmm7=0000000002e6a0a0 (f: 48668832.000000, d: 2.404560e-316)
xmm8=00000000c0111de8 (f: 3222347264.000000, d: 1.592051e-314)
xmm9=24245b52fc756d8c (f: 4235554304.000000, d: 1.400361e-134)
xmm10=24245b52fc756dbc (f: 4235554304.000000, d: 1.400361e-134)
xmm11=0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm12=0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm13=0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm14=0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm15=0000000000000000 (f: 0.000000, d: 0.000000e+00)
Module=/opt/java/openjdk/lib/default/libj9vrb29.so
Module_base_address=00007F7808D4F000
Target=2_90_20241015_886 (Linux 5.14.0-284.57.1.el9_2.x86_64)
CPU=amd64 (10 logical CPUs) (0x1f718e9000 RAM)
----------- Stack Backtrace -----------
decodeStackFrameDataFromStackMapTable+0x1a (0x00007F7808D6024A [libj9vrb29.so+0x1124a])
generateJ9RtvExceptionDetails+0x9a5 (0x00007F7808D5F0F5 [libj9vrb29.so+0x100f5])
j9bcv_createVerifyErrorString+0x332 (0x00007F780A80C932 [libj9vm29.so+0x17c932])
classInitStateMachine+0xab5 (0x00007F780A6B3225 [libj9vm29.so+0x23225])
resolveStaticMethodRefInto+0x2c8 (0x00007F780A707BD8 [libj9vm29.so+0x77bd8])
resolveStaticMethodRef+0x22 (0x00007F780A707DB2 [libj9vm29.so+0x77db2])
_ZN37VM_DebugBytecodeInterpreterCompressed3runEP10J9VMThread+0xc379 (0x00007F780A785AA9 [libj9vm29.so+0xf5aa9])
debugBytecodeLoopCompressed+0xd2 (0x00007F780A779722 [libj9vm29.so+0xe9722])
 (0x00007F780A7E5C42 [libj9vm29.so+0x155c42])
---------------------------------------

We think it might be related to this bug.
According to the details there, it should be resolved in Semeru 17.0.14.0.
As the stack trace does not match 100%, we want to be sure that we're not eagerly waiting for a release expecting it to fix our issue, only to have to raise a new issue at that time.

Regarding the container images that ship with Semeru, it seems there is no image available that contains Semeru 17.0.14, other than the one used by Open Liberty in its Dockerfile.

@Jenson3210
Author

We also had a question about this line:

CPU=amd64 (10 logical CPUs) (0x1f718e9000 RAM)

If I assign a 1 CPU limit/request in Kubernetes, this still shows 10 (the number of logical CPUs available on the host machine).
Shouldn't the detected amount of RAM/CPU be the assigned RAM/CPU instead of the host values?
Or are we misconfiguring something?

@pshipton
Member

We can't confirm your crash is the same unless you obtain and provide a core file that has been processed with jpackcore (https://eclipse.dev/openj9/docs/tool_jextract/#dump-extractor-jpackcore).

If you want to try an early access build of 17.0.14, I've provided the Milestone 2 build below. Hopefully we will publish all of these M2 builds shortly and announce them in the OpenJ9 Slack, but here is a direct link in case that is delayed:
https://ibm.box.com/s/pzf29u4vc1j4hnaznjwin9jho0ydf6v9
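
If you want a quick way to test the EA build without waiting for a container image, one option (a sketch; the archive name is a placeholder, and it assumes the image keeps its JDK at /opt/java/openjdk, as the crash log shows, and can be pulled as open-liberty:24.0.0.11-full-java17-openj9-ubi) is to unpack the build and bind-mount it over the JDK path:

# unpack the EA JDK into a local directory (drop the top-level directory from the archive)
mkdir ea-jdk
tar -xzf <semeru-17.0.14-m2-download>.tar.gz -C ea-jdk --strip-components=1
# run the existing Liberty image with the EA JDK mounted over the bundled one
docker run --rm -v "$PWD/ea-jdk":/opt/java/openjdk:ro \
    open-liberty:24.0.0.11-full-java17-openj9-ubi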

About the logical CPUs, I haven't checked the code, but I expect it's reporting details of the machine rather than details of the container.
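
For what it's worth, a quick way to compare the two numbers from inside the pod (a sketch; it assumes a cgroup v2 node and that nproc is available in the image — on cgroup v1 the equivalent files are cpu.cfs_quota_us and cpu.cfs_period_us under /sys/fs/cgroup/cpu/):

# logical CPUs visible to the process -- typically still the host's count
kubectl exec <pod> -- nproc
# the CFS quota that a 1 CPU limit actually imposes, e.g. "100000 100000"
kubectl exec <pod> -- cat /sys/fs/cgroup/cpu.max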

@pshipton
Member

The M2 builds are officially published (same build I already provided).
https://github.com/ibmruntimes/semeru17-binaries/releases/tag/jdk-17.0.14%2B6_openj9-0.49.0-m2

@Jenson3210
Author

Thanks! We're currently checking how to extract the dumps from our Kubernetes environment.

@Jenson3210
Author

@pshipton thanks for the patience here. We were finally able to get the dump from our worker nodes.
We extracted a zstd file from our OpenShift cluster, then decompressed it using zstd --rm -d ....

We volume-mounted this dump into a Liberty server container (so the runtime matches) and processed the decompressed file with jpackcore ./mounted/file ./mounted/dump.zip

This generates a dump.zip on my host machine, which can be analysed locally after installing 17.0.13-sem (sdk install java 17.0.13-sem), using ~/.sdkman/candidates/java/17.0.13-sem/bin/jdmpview ./dump.zip
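
For reference, the full sequence we ran (paths below are placeholders for our environment):

# on the extraction host, after copying the core off the worker node
zstd --rm -d <core-file>.zst

# inside a Liberty container with the same runtime, with the decompressed core volume-mounted
jpackcore ./mounted/<core-file> ./mounted/dump.zip

# locally, after: sdk install java 17.0.13-sem
~/.sdkman/candidates/java/17.0.13-sem/bin/jdmpview ./dump.zip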

As this file appears to contain a lot of data (some of it sensitive), do you have any advice on how best to continue here?
I can run specific commands and share the results if that would be helpful.

@pshipton
Member

pshipton commented Jan 9, 2025

Since you are using Liberty, are you entitled to IBM support? You could open a service case, and the core file would then be handled by IBM service and kept isolated.

@pshipton
Member

pshipton commented Jan 9, 2025

If you can't go the service route, you could get a native stack trace with a debugger. You can open the core file under gdb and run where to get the stack trace. If you are using a machine other than the one that produced the core, use the gdb command set sysroot to point at the libraries gathered by jpackcore, so that resolving the stack trace will work.
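
A minimal sketch of that (assuming the dump.zip produced by jpackcore has been unpacked to ./extracted and that the archive preserves the original paths — adjust if the layout differs):

gdb ./extracted/opt/java/openjdk/bin/java ./extracted/<core-file>
(gdb) set sysroot ./extracted
(gdb) where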

You'll need to add the debug libraries, which I assume are the following. The files in bin and lib should go in the bin and lib directories that contain the JVM (from the jpackcore output).
https://github.com/ibmruntimes/semeru17-binaries/releases/download/jdk-17.0.13%2B11_openj9-0.48.0/ibm-semeru-open-debugimage_x64_linux_17.0.13_11_openj9-0.48.0.tar.gz
(from https://github.com/ibmruntimes/semeru17-binaries/releases/tag/jdk-17.0.13%2B11_openj9-0.48.0)
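
Something along these lines (a sketch; directory names are placeholders for wherever the debug image and the jpackcore output were unpacked):

# unpack the debug image, then copy its bin/ and lib/ contents next to the JVM from the jpackcore output
tar -xzf ibm-semeru-open-debugimage_x64_linux_17.0.13_11_openj9-0.48.0.tar.gz
cp -r <debug-image-dir>/bin/* ./extracted/opt/java/openjdk/bin/
cp -r <debug-image-dir>/lib/* ./extracted/opt/java/openjdk/lib/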

You can process the diagnostic Snap file with traceformat and either provide the formatted output (which you can do privately; I'm not sure it contains any sensitive info), or look for entries near the end, from processing a VerifyException, to share. There may be some entries that contain the following:
j9vm.294 Entry >setCurrentException
j9bcverify(j9vm).121 Exception * j9rtv_verifyArguments
j9bcverify(j9vm).29 Exception * verifyBytecodes
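
A sketch of that step (it assumes the traceformat tool from the same JDK is on the PATH and takes the output file as its second argument; the Snap file name is a placeholder):

traceformat <snap-file>.trc snap_formatted.txt
grep -n -e setCurrentException -e j9rtv_verifyArguments -e verifyBytecodes snap_formatted.txt | tail -n 20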
