[oe] [meta-java] Sporadic segfaults when compiling jamvm-native and classpath-native

Kaaria, Erkka erkka.kaaria at intel.com
Fri Nov 6 11:44:00 UTC 2015


Hi everyone,

There seems to be something wrong with the cacao-initial-native JIT compiler that causes sporadic segfaults when compiling jamvm-native or classpath-native. This is fairly rare: I estimate that around 1% of builds fail due to a segfault. My build machine runs 64-bit Ubuntu 15.04 with GCC 4.9.2.

Example do_compile log:

make[6]: Entering directory '<path-snip>/build/tmp/work/x86_64-linux/jamvm-native/1.5.5+1.6.0-devel+gitAUTOINC+ebd11bde0a-r0/build/src/classlib/gnuclasspath/lib'
mkdir classes
<snip, ecj-initial compiling lots of java files>
LOG: [0x00007f32c46a7700] We received a SIGSEGV and tried to handle it, but we were
LOG: [0x00007f32c46a7700] unable to find a Java method at:
LOG: [0x00007f32c46a7700]
LOG: [0x00007f32c46a7700] PC=0x0000000000440c09
LOG: [0x00007f32c46a7700]
LOG: [0x00007f32c46a7700] Dumping the current stacktrace:
    <<No stacktrace available>>
LOG: [0x00007f32c46a7700] Exiting...
Aborted (core dumped)
Makefile:662: recipe for target 'classes.zip' failed


Examination of the core dump in this particular case indicates that the segfault is caused by a null pointer dereference in builtin_new at builtin.c:764:

<snip, signal handlers, java stacktrace code etc>
#5 <signal handler called>
#6 builtin_new (c=0x0) at builtin.c:764
#7 0x00007f23badb5a5f in ?? ()
#8 0x0000000000000000 in ?? ()

GDB is unable to show the calling method. Since the saved instruction pointer (frame #7) does not match any shared library address, I think the call originates from JIT-compiled code.

The thread at http://c1.complang.tuwien.ac.at/pipermail/cacao/2011-March/001350.html seems to indicate that there could be a race condition in the cacao JIT compiler when multiple Java threads are involved. I forced ecj to use only one thread, and the segfaults seem to have stopped.

Given the non-deterministic nature of the bug, I am not sure whether I actually fixed this or have just been lucky. I would therefore appreciate help to:

a) Verify the presence of the bug.
b) Verify that forcing ecj to use only a single thread fixes the issue.

Multithreading can be disabled by adding -Djdt.compiler.useSingleThread=true to the third line of the "ecj-initial" script in build/tmp/sysroots/<build machine folder>/usr/bin/

Example line:
${RUNTIME} -Xmx512m -Djdt.compiler.useSingleThread=true -cp ${ECJ_JAR} org.eclipse.jdt.internal.compiler.batch.Main ${1+"$@"}
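For anyone who prefers not to edit the script by hand, here is a minimal sketch of a helper that applies the same change. The function name add_single_thread_flag is made up for illustration, and the sketch assumes the unmodified launch line contains -Xmx512m exactly as in the example line above and that GNU sed is available. It is meant to be called with the path to your own ecj-initial script as its only argument.

```shell
#!/bin/bash
# Hypothetical helper (not part of meta-java): add the single-thread flag
# to an ecj-initial style launcher script passed as the first argument.
# Assumption: the launch line contains "-Xmx512m", as in the example line.
add_single_thread_flag() {
  local script="$1"
  local flag='-Djdt.compiler.useSingleThread=true'

  # Do nothing if the flag is already there, so re-running is harmless.
  if grep -qF -- "$flag" "$script"; then
    return 0
  fi

  # Insert the flag right after -Xmx512m on whichever line carries it.
  sed -i "s/-Xmx512m/-Xmx512m $flag/" "$script"
}
```

Running the helper twice leaves the file unchanged the second time, which may be convenient if it is called from a loop script.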


This is the script I use for the recompilation. I usually get a failure within 90 minutes, but this naturally varies between computers. Run it from the build folder.

#!/bin/bash
# Rebuild jamvm-native from scratch until a compile fails.
while true; do
  date
  echo "Cleaning..."
  bitbake -c cleansstate jamvm-native > /dev/null
  echo "Compiling..."
  bitbake -c compile jamvm-native > /dev/null || { echo "Stopping"; exit 1; }
done
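To compare runs with and without the single-thread flag, a fixed number of rebuilds with a failure count may be more useful than stopping at the first failure. The following is only a sketch: count_failures and rebuild_jamvm are names made up here, and rebuild_jamvm simply wraps the two bitbake commands from the loop above.

```shell
#!/bin/bash
# Hypothetical helper: run a command a fixed number of times and report
# how many of those runs exited with a non-zero status.
count_failures() {
  local runs="$1"; shift
  local failures=0 i
  for ((i = 1; i <= runs; i++)); do
    # "$@" is the command under test; its output is discarded.
    if ! "$@" > /dev/null 2>&1; then
      failures=$((failures + 1))
    fi
  done
  echo "$failures/$runs runs failed"
}

# Hypothetical wrapper around the clean + compile steps used above.
rebuild_jamvm() {
  bitbake -c cleansstate jamvm-native && bitbake -c compile jamvm-native
}

# Example (run from the build folder): count_failures 300 rebuild_jamvm
```

If the failure rate really is around 1%, a short clean streak proves little: the chance of 300 consecutive successes by luck alone is 0.99^300, or roughly 5%, so a few hundred runs per configuration are probably needed before drawing conclusions.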

Best regards,
Erkka Kääriä



