bug#47596: File descriptor error when exiting emacs on android 11

classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|

bug#47596: File descriptor error when exiting emacs on android 11

Henrik Grimler
Hi,

Android 10 introduced a new "file descriptor sanitizer" (fdsan, see [1]
for docs) which detects issues when opening and closing files across
multiple threads. On android 10 warnings were given, while on android
11 programs are killed instantly if issues are detected. 

Starting and then exiting emacs, or simply just running for example
`emacs --version`, gives an error. `gdb --args emacs --version` gives
this backtrace:

```
Starting program: /data/data/com.termux/files/usr/bin/emacs --version
GNU Emacs 27.2
Copyright (C) 2021 Free Software Foundation, Inc.
GNU Emacs comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GNU Emacs
under the terms of the GNU General Public License.
For more information about these matters, see the file named COPYING.
fdsan: attempted to close file descriptor 2, expected to be unowned,
actually owned by FILE* 0xb6c8800c

Program received signal SIGABRT, Aborted.
0xb63789e8 in fdsan_error(char const*, ...) ()
   from /apex/com.android.runtime/lib/bionic/libc.so
(gdb) bt full
#0  0xb63789e8 in fdsan_error(char const*, ...) ()
   from /apex/com.android.runtime/lib/bionic/libc.so
No symbol table info available.
#1  0xb63786fe in android_fdsan_close_with_tag ()
   from /apex/com.android.runtime/lib/bionic/libc.so
No symbol table info available.
#2  0xb63b83c0 in __sclose () from
/apex/com.android.runtime/lib/bionic/libc.so
No symbol table info available.
#3  0xb63b8f24 in __FILE_close(__sFILE*) ()
   from /apex/com.android.runtime/lib/bionic/libc.so
No symbol table info available.
#4  0x7f7e6a76 in close_stream (stream=0xb63cde44 <__sF+168>)
    at /home/builder/.termux-build/emacs/src/lib/close-stream.c:61
        some_pending = false
        prev_fail = false
        fclose_fail = 243
#5  0x7f68a872 in close_output_streams ()
    at /home/builder/.termux-build/emacs/src/src/sysdep.c:2840
        err = false
#6  0xb63c10d6 in __cxa_finalize ()
   from /apex/com.android.runtime/lib/bionic/libc.so
No symbol table info available.
#7  0xb63bc7b8 in exit () from
/apex/com.android.runtime/lib/bionic/libc.so
No symbol table info available.
#8  0x7f655d92 in main (argc=2, argv=0xbefff504)
    at /home/builder/.termux-build/emacs/src/src/emacs.c:1132
        version = 0xb5e68133 "27.2"
        copyright = 0xb5e68138 "Copyright (C) 2021 Free Software
Foundation, Inc."
        stack_bottom_variable = 0x7f655a61 <main>
        do_initial_setlocale = false
        no_loadup = false
        junk = 0x0
        dname_arg = 0x0
        ch_to_dir = 0x0
        original_pwd = 0x0
        dump_mode = 0x0
        skip_args = 1
        temacs = 0x0
        attempt_load_pdump = true
        sockfd = -1225460908
        module_assertions = 244
```

The emacs here was a debug build from the 27.2 tag, configured with:


 'configure --disable-dependency-tracking
 --prefix=/data/data/com.termux/files/usr
 --libdir=/data/data/com.termux/files/usr/lib
 --sbindir=/data/data/com.termux/files/usr/bin --disable-rpath
 --disable-rpath-hack --host=arm-linux-androideabi --disable-autodepend
 --with-gif=no --with-gnutls --with-jpeg=no --without-gconf
 --without-gsettings --without-lcms2 --without-x --with-png=no
 --with-tiff=no --with-xml2 --with-xpm=no --without-dbus
 --without-selinux --with-modules --with-pdumper=yes --with-
dumping=none
 --enable-checking=yes,glyphs --enable-check-lisp-object-type
 emacs_cv_sanitize_address=yes emacs_cv_prog_cc_no_pie=no
 ac_cv_lib_elf_elf_begin=no gl_cv_func_dup2_works=no
 ac_cv_func_setrlimit=no emacs_cv_func__setjmp=no
 emacs_cv_func_sigsetjmp=no --disable-nls --enable-shared
 --enable-static --libexecdir=/data/data/com.termux/files/usr/libexec
 'CFLAGS= -march=armv7-a -mfpu=neon -mfloat-abi=softfp -mthumb
 -fstack-protector-strong -g3 -O0 -Wall -gdwarf-4' 'CPPFLAGS=
 -D_FORTIFY_SOURCE=2 -D__USE_FORTIFY_LEVEL=2
 -I/data/data/com.termux/files/usr/include'
 'LDFLAGS=-L/data/data/com.termux/files/usr/lib
 -Wl,-rpath=/data/data/com.termux/files/usr/lib -march=armv7-a -fopenmp
 -static-openmp -Wl,--enable-new-dtags -Wl,--as-needed
 -Wl,-z,relro,-z,now''

Debugging these issues are very tedious in my experience so far
(probably easier for actual android apps). I will try to boil down the
relevant emacs code into a smaller program that still reproduces the
error.

Based on the backtrace it seem to be how stderr is opened (and closed)
that is problematic somehow.

Please let me know if you have any insights, or if I can provide
additional useful information.

Best regards,
Henrik Grimler

[1]
https://android.googlesource.com/platform/bionic/+/master/docs/fdsan.md




Reply | Threaded
Open this post in threaded view
|

bug#47596: File descriptor error when exiting emacs on android 11

Eli Zaretskii
> From: Henrik Grimler <[hidden email]>
> Date: Sun, 04 Apr 2021 21:20:35 +0200
>
> Debugging these issues are very tedious in my experience so far
> (probably easier for actual android apps). I will try to boil down the
> relevant emacs code into a smaller program that still reproduces the
> error.

Yes, please.

> Based on the backtrace it seem to be how stderr is opened (and closed)
> that is problematic somehow.

Any idea what was "the other thread" involved in this situation?
Emacs is generally a single-threaded program.



Reply | Threaded
Open this post in threaded view
|

bug#47596: File descriptor error when exiting emacs on android 11

Henrik Grimler
Hi,

Compiling this, on any optimisation level, is enough to trigger the
error:

```
#include <stdio.h>
int main()
{
  fdopen (2, "w");
  fclose (stderr);
}
```

In emacs fdopen is run in init_standard_fds, where we have

```
[...]

  force_open (STDERR_FILENO, O_RDONLY);

  /* Set buferr if possible on platforms defining _PC_PIPE_BUF, as
     they support the notion of atomic writes to pipes.  */
  #ifdef _PC_PIPE_BUF
    buferr = fdopen (STDERR_FILENO, "w");
    if (buferr)
      setvbuf (buferr, NULL, _IOLBF, 0);
  #endif
}
```

so I suppose there is either some very fundamental issue with 
`fdopen (STDERR_FILENO, "w")` here, or the file descriptor sanitizer on
android is broken.

If I avoid that part of init_standard_fds, i.e. change the ifdef to
something like `#if defined(_PC_PIPE_BUF) && !defined(__ANDROID__)`, I
get an emacs that seem to fully work on android.

Best regards,
Henrik Grimler




Reply | Threaded
Open this post in threaded view
|

bug#47596: File descriptor error when exiting emacs on android 11

Henrik Grimler
Hi again,


> Compiling this, on any optimisation level, is enough to trigger the
> error:
>
> ```
> #include <stdio.h>
> int main()
> {
>   fdopen (2, "w");
>   fclose (stderr);
> }
> ```

Changing to this:

```

#include <stdio.h>
int main()
{
  FILE *err = fdopen (2, "w");
  fclose (err);
}
```

makes it work. So I suppose the sanitizer does not like that stderr is
closed with `fclose (stderr)` instead of by using the fd obtained from
fdopen (which was thrown away in my minimal example).

Still not sure if this is actually problematic, but at least now I
understand how the sanitizer "thinks".

Best regards,
Henrik Grimler




Reply | Threaded
Open this post in threaded view
|

bug#47596: File descriptor error when exiting emacs on android 11

Eli Zaretskii
> From: Henrik Grimler <[hidden email]>
> Cc: [hidden email]
> Date: Mon, 05 Apr 2021 10:59:55 +0200
>
> > #include <stdio.h>
> > int main()
> > {
> >   fdopen (2, "w");
> >   fclose (stderr);
> > }
> > ```
>
> Changing to this:
>
> ```
>
> #include <stdio.h>
> int main()
> {
>   FILE *err = fdopen (2, "w");
>   fclose (err);
> }
> ```
>
> makes it work.

Which again makes no sense, because the program that works is a no-op:
it creates a copy of stderr and immediately closes it.

> So I suppose the sanitizer does not like that stderr is
> closed with `fclose (stderr)` instead of by using the fd obtained from
> fdopen (which was thrown away in my minimal example).
>
> Still not sure if this is actually problematic, but at least now I
> understand how the sanitizer "thinks".

If that's what it thinks, it's a clear bug in the sanitizer, IMO.



Reply | Threaded
Open this post in threaded view
|

bug#47596: File descriptor error when exiting emacs on android 11

Eli Zaretskii
In reply to this post by Henrik Grimler
> From: Henrik Grimler <[hidden email]>
> Cc: [hidden email]
> Date: Mon, 05 Apr 2021 11:48:50 +0200
>
> A better way to fix the error is to use buferr instead of stderr when
> calling `close_streams` in `close_output_streams`. I cannot see any
> obvious problems with this, so I propose this patch:

You are operating on the wrong stream, so I do see a problem with this
patch.  The purpose of this fragment is to close stderr, not the
buffered copy of stderr.

> --- ./src/sysdep.c.orig 2021-04-05 09:06:33.847835653 +0000
> +++ ./src/sysdep.c 2021-04-05 09:05:47.957856162 +0000
> @@ -2837,8 +2837,8 @@
>       sanitizer might report to stderr after this function is invoked.
> */
>    bool err = buferr && (fflush (buferr) != 0 || ferror (buferr));
>    if (err | (ADDRESS_SANITIZER
> -     ? fflush (stderr) != 0 || ferror (stderr)
> -     : close_stream (stderr) != 0))
> +     ? fflush (buferr) != 0 || ferror (buferr)
> +     : close_stream (buferr) != 0))
>      _exit (EXIT_FAILURE);
>  }
>  ^L
> ```
>
> With it I can start and exit emacs on android without the fdsan error.

Is ADDRESS_SANITIZER defined in your build?

In any case, I do want to wait for the Android developers to respond
to the issue in their tracker.

Thanks.



Reply | Threaded
Open this post in threaded view
|

bug#47596: File descriptor error when exiting emacs on android 11

Henrik Grimler

Hi,

> You are operating on the wrong stream, so I do see a problem with this
> patch.  The purpose of this fragment is to close stderr, not the
> buffered copy of stderr.

>
Thanks for the feedback!

> Is ADDRESS_SANITIZER defined in your build?

Yes, it is configured with emacs_cv_sanitize_address=yes. Setting this
option has been done for android since emacs 25, and supposedly fixed
some malloc issue back then [1]. I tried building without setting
emacs_cv_sanitize_address=yes just now (configure then sets =no), but
it does not seem to influence this fdsan error.

> In any case, I do want to wait for the Android developers to respond
> to the issue in their tracker.

I will update the issue and add the minimal non-working example
obtained here as well.

Best regards,
Henrik Grimler

[1]
https://github.com/termux/termux-packages/commit/8ce98d7fe9130cd151cc2ec34b3b1e7ecb0aeace#diff-8fd6b84ed67057870fa4f209d2309b53bf3c097b732cb3b8e639f58a37d138c1R7




Reply | Threaded
Open this post in threaded view
|

bug#47596: File descriptor error when exiting emacs on android 11

Henrik Grimler
In reply to this post by Henrik Grimler
Hi,

I received feedback on the android issue tracker 
https://issuetracker.google.com/issues/184380442 
that the snippet:

> > #include <stdio.h>
> > int main()
> > {
> >   fdopen (2, "w");
> >   fclose (stderr);
> > }

is problematic, because:

> yes. you have two FILE*s that both think they own file descriptor 2.
> depending on what you're actually trying to do, you probably meant to
> use freopen(3) instead?
https://man7.org/linux/man-pages/man3/fopen.3.html

Does freopen make sense for buferr?

> > In emacs fdopen is run in init_standard_fds, where we have
> >
> >   force_open (STDERR_FILENO, O_RDONLY);
> >
> >   /* Set buferr if possible on platforms defining _PC_PIPE_BUF, as
> >      they support the notion of atomic writes to pipes.  */
> >   #ifdef _PC_PIPE_BUF
> >     buferr = fdopen (STDERR_FILENO, "w");
> >     if (buferr)
> >       setvbuf (buferr, NULL, _IOLBF, 0);
> >   #endif
> > }
>
> This just creates a copy of stderr that has special buffering. 
> Again,
> entirely valid for a C program to do that.

I probably should have included a bit more context in the android bug
report to better show what the code does.

Best regards,
Henrik Grimler