Emacs 25.1 build failures on Ubuntu arm64

Emacs 25.1 build failures on Ubuntu arm64

Barry Warsaw
I'm not sure how much of this email will be actionable, but I wanted to at
least let y'all know about a problem we've seen building Emacs 25.1 on Ubuntu
17.04 arm64.

Quick background: In Ubuntu we have a few minor changes above the package in
Debian, which we mostly inherit.  One of those changes is that we build with
-j1 on arm64 only.  There are a couple of other changes that shouldn't have an
impact on the problem I'll describe, but the changelogs can be easily
inspected.

In any case, for several weeks now we've seen Emacs segfault on arm64 builders
consistently in what I think are the CEDET tests.  The full traceback, when
we've been able to capture it, is here:

https://launchpadlibrarian.net/302409391/gdb-bt-full.txt

The full build log is here:

https://launchpadlibrarian.net/310046964/buildlog_ubuntu-zesty-arm64.emacs25_25.1+1-3ubuntu3~ppa0_BUILDING.txt.gz

(scroll to the bottom to see the crash)

and the bug that tracks this is here:

https://bugs.launchpad.net/ubuntu/+source/emacs25/+bug/1656474

We're using gcc based on 6.2.1/6.3.0 (depending on the build).

All the other architectures have been building fine, but arm64 crashes
consistently on our build machines.  Attempts to reproduce the crash on
development machines, even on arm64 chroots and bare metal, have not been
successful, so debugging the problem isn't easy.  Based on some research into
other crashes, various other attempts to work around the problem, and some
discussions among Ubuntu developers on IRC, we wondered whether turning off
optimization would help.  Indeed, adding -O0 to CFLAGS only on arm64 solved
the problem in a test build.
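
As a rough illustration of the per-architecture workaround described above, here is a minimal shell sketch. It is not the actual packaging change (that lives in the package's build rules); the function name `cflags_for_arch` is hypothetical. It relies on GCC honouring the last -O option given, so appending -O0 overrides an earlier -O2.

```shell
#!/bin/sh
# Hypothetical helper (not taken from the Ubuntu or Debian packaging):
# print the given CFLAGS, with -O0 appended only when the architecture is arm64.
cflags_for_arch() {
    arch=$1
    flags=$2
    case $arch in
        # GCC uses the last -O option on the command line, so this
        # disables optimization even if $flags already contains -O2.
        arm64) printf '%s\n' "$flags -O0" ;;
        *)     printf '%s\n' "$flags" ;;
    esac
}

# Possible use on a Debian-style system, where dpkg-architecture reports
# the host architecture:
#   CFLAGS=$(cflags_for_arch "$(dpkg-architecture -qDEB_HOST_ARCH)" "$CFLAGS")
```

All other architectures keep their flags unchanged, which matches the intent of limiting the workaround to the one failing platform.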

I've now uploaded a new build to the archive with -O0 on arm64 and I expect
this to build and get published once it's approved (Ubuntu is currently in
final beta freeze).

I'll watch this thread via Gmane and try to answer any additional questions
you might have.  As I said, I'm not sure what the Emacs developers can do
about it, but I wanted you to know about the problem and how we think we've
solved it.

Cheers,
-Barry

Re: Emacs 25.1 build failures on Ubuntu arm64

Eli Zaretskii
> From: Barry Warsaw <[hidden email]>
> Date: Tue, 28 Mar 2017 20:30:40 -0400
>
> I'm not sure how much of this email will be actionable, but I wanted to at
> least let y'all know about a problem we've seen building Emacs 25.1 on Ubuntu
> 17.04 arm64.

Did this configuration build correctly with Emacs 24.5?

> In any case, for several weeks now we've seen Emacs segfault on arm64 builders
> consistently in what I think are the CEDET tests.  The full traceback, when
> we've been able to capture it, is here:
>
> https://launchpadlibrarian.net/302409391/gdb-bt-full.txt

I see this there:

  #5  handle_sigsegv (sig=11, siginfo=<optimized out>, arg=<optimized out>) at sysdep.c:1695
          fatal = <optimized out>
  #6  <signal handler called>
  No symbol table info available.
  #7  unchain_marker (marker=marker@entry=0x130ad18) at marker.c:605
          tail = <optimized out>
          prev = 0x676e696e6e6967e5
          b = 0x130acf0
  #8  0x00000000005334a4 in free_marker (marker=marker@entry=19967257) at alloc.c:3850
  No locals.

The value of 'prev' in frame #7 looks garbled: it's ASCII text
"ginning", probably was "beginning" at some point.  This might
indicate memory corruption, either some dynamically allocated memory
or the stack got smashed.
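
For anyone who wants to check that reading of 'prev', here is a small sketch that decodes a 64-bit word the way it would sit in memory on a little-endian machine (the aarch64-linux-gnu target is little-endian), lowest address first. The function name `decode_le_word` is made up for this example; non-printable bytes are shown as \xNN.

```shell
#!/bin/sh
# Hypothetical helper: show the bytes of a hex word in little-endian
# memory order, printing printable ASCII as characters and the rest as \xNN.
decode_le_word() {
    hex=${1#0x}
    # Pad to an even number of hex digits so the loop always consumes pairs.
    case $(( ${#hex} % 2 )) in 1) hex=0$hex ;; esac
    out=
    while [ -n "$hex" ]; do
        byte=${hex#"${hex%??}"}   # last two digits = lowest remaining address
        hex=${hex%??}
        code=$((0x$byte))
        if [ "$code" -ge 32 ] && [ "$code" -lt 127 ]; then
            # Emit the character via a portable octal escape.
            out=$out$(printf "\\$(printf '%03o' "$code")")
        else
            out=$out'\x'$byte
        fi
    done
    printf '%s\n' "$out"
}

decode_le_word 0x676e696e6e6967e5   # prints: \xe5ginning
```

The first byte, 0xe5, is not printable ASCII; the remaining seven bytes spell "ginning".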

Does your build compile ralloc.c?  If so, could you try configuring
with REL_ALLOC=no?

> and the bug that tracks this is here:
>
> https://bugs.launchpad.net/ubuntu/+source/emacs25/+bug/1656474

That bug mentions several fixes made by Debian; did you try them?

Re: Emacs 25.1 build failures on Ubuntu arm64

Barry Warsaw
On Mar 29, 2017, at 06:03 PM, Eli Zaretskii wrote:

>> I'm not sure how much of this email will be actionable, but I wanted to at
>> least let y'all know about a problem we've seen building Emacs 25.1 on
>> Ubuntu 17.04 arm64.
>
>Did this configuration build correctly with Emacs 24.5?

It did.  Ubuntu 17.04 does still have Emacs 24.5, and its arm64 build did
succeed.  It was last built in the archive back in February but I just tried an
experimental rebuild with today's toolchain and arm64 passed.

>  prev = 0x676e696e6e6967e5
>
>The value of 'prev' in frame #7 looks garbled: it's ASCII text "ginning",
>probably was "beginning" at some point.  This might indicate memory
>corruption, either some dynamically allocated memory or the stack got
>smashed.

Indeed, that's interesting.

>Does your build compile ralloc.c?  If so, could you try configuring
>with REL_ALLOC=no?

We already build with REL_ALLOC=no for both emacs24 and emacs25.

>> and the bug that tracks this is here:
>>
>> https://bugs.launchpad.net/ubuntu/+source/emacs25/+bug/1656474 
>
>That bug mentions several fixes made by Debian; did you try them?

I haven't; here is the list of Ubuntu deltas.  It's possible of course that
one of these introduces a problem, but I'm more suspicious of the toolchain.
The last upload into Debian unstable did successfully build arm64.

- build with parallel=1 on arm64
- Rebuild against new imagemagick 6.9.7.0. (no changes to emacs)
- Don't build-depend on gconf. emacs has supported gsettings for years

I'll see if I can do a build of the Debian version of 25.1 in an Ubuntu 17.04
PPA to see if there are any different results.

Cheers,
-Barry

Re: Emacs 25.1 build failures on Ubuntu arm64

Barry Warsaw
On Mar 29, 2017, at 04:46 PM, Barry Warsaw wrote:

>I'll see if I can do a build of the Debian version of 25.1 in an Ubuntu 17.04
>PPA to see if there are any different results.

Indeed, Debian unstable's version of Emacs 25.1 also fails with the same
segfault in the Ubuntu 17.04 PPA, as expected.  We can rule out any Ubuntu
deltas.

https://launchpadlibrarian.net/313465977/buildlog_ubuntu-zesty-arm64.emacs25_25.1+1-4~ppa0_BUILDING.txt.gz

Cheers,
-Barry

Re: Emacs 25.1 build failures on Ubuntu arm64

Rob Browning
Barry Warsaw <[hidden email]> writes:

> On Mar 29, 2017, at 04:46 PM, Barry Warsaw wrote:
>
>>I'll see if I can do a build of the Debian version of 25.1 in an Ubuntu 17.04
>>PPA to see if there are any different results.
>
> Indeed, Debian unstable's version of Emacs 25.1 also fails with the same
> segfault in the Ubuntu 17.04 PPA, as expected.  We can rule out any Ubuntu
> deltas.
>
> https://launchpadlibrarian.net/313465977/buildlog_ubuntu-zesty-arm64.emacs25_25.1+1-4~ppa0_BUILDING.txt.gz

I spent a bit of time on one of the porterboxes and gathered a little
more information, though in the end, we just added -O0 on arm64 for now
as well.

I was able to reproduce this on asachi using a git checkout of
origin/emacs-25.2 (i.e. clean upstream tree, no debian adjustments),
though I did emulate our VPATH build (see below).

One notable oddity -- while trying to narrow down the cause, I found
that the crash could be reliably triggered by adding the .git dir to the
(copied) build tree.  For example, this either builds, or crashes,
depending on whether or not the .git dir is introduced.

  rm -rf debian
  mkdir -p debian/build-src
  cp -a $(ls -A | egrep -v '^(\.git|\.pc|debian)$') debian/build-src

  # If this line is removed, the build works fine, otherwise it crashes
  cp -a .git debian/build-src/

  pushd debian/build-src
  ./autogen.sh
  popd
  mkdir debian/build-x
  cd debian/build-x
  ../build-src/configure ...
  make -j

When the build crashes, it's always while trying to produce c-by.el, and
the crash looks similar to the one reported earlier in this thread,
i.e.:

  Starting program: /home/rlb/git/emacs25-25.2+1/debian/build-x/src/emacs -batch --no-site-file --no-site-lisp -l semantic/bovine/grammar -f bovine-batch-make-parser -o /home/rlb/git/emacs25-25.2+1/debian/build-x/../../../emacs25-25.2+1/debian/build-src/lisp/cedet/semantic/bovine/c-by.el /home/rlb/git/emacs25-25.2+1/debian/build-x/../../../emacs25-25.2+1/debian/build-src/admin/grammars/c.by
  [Thread debugging using libthread_db enabled]
  Using host libthread_db library "/lib/aarch64-linux-gnu/libthread_db.so.1".
  [New Thread 0xffffb36dbe10 (LWP 22399)]
  ../../../build-src/lisp/emacs-lisp/eieio.el: `eieio-object-name-string' is an obsolete generic function (as of 25.1); use `eieio-named' instead.
  ../../../build-src/lisp/emacs-lisp/eieio-base.el: `eieio-object-name-string' is an obsolete generic function (as of 25.1); use `eieio-named' instead.
  ../../../build-src/lisp/cedet/semantic/db-ref.el: Obsolete name arg "DEBUG" to constructor semanticdb-ref-adebug

  Thread 1 "emacs" received signal SIGSEGV, Segmentation fault.
  unchain_marker (marker=marker@entry=0x188d850) at ./debian/build-src/src/marker.c:605
  605     ./debian/build-src/src/marker.c: No such file or directory.
  (gdb) where
  #0  unchain_marker (marker=marker@entry=0x188d850) at ./debian/build-src/src/marker.c:605
  #1  0x000000000053608c in free_marker (marker=marker@entry=25745489)
      at ./debian/build-src/src/alloc.c:3850
  #2  0x0000000000508688 in signal_before_change (preserve_ptr=0x0, end_int=9894256,
      start_int=9890072) at ./debian/build-src/src/insdel.c:2041
  ...

I wondered if the presence of the .git dir was altering the behavior of
autogen.sh (or something else) in a way that exposes the problem.

Hope this helps
--
Rob Browning
rlb @defaultvalue.org and @debian.org
GPG as of 2011-07-10 E6A9 DA3C C9FD 1FF8 C676 D2C4 C0F0 39E9 ED1B 597A
GPG as of 2002-11-03 14DD 432F AE39 534D B592 F9A0 25C8 D377 8C7E 73A4

Re: Emacs 25.1 build failures on Ubuntu arm64

Byung-Hee HWANG (황병희)
Dear Rob and Barry,

I don't know much about the technical side of building.  However, I'm
interested in this issue because I pre-ordered an ARM64 (MT8173)
Chromebook three weeks ago.

Cheer up, and thanks!!!
 
--
^고맙습니다 _布德天下_ 감사합니다_^))//

