bug#39977: 28.0.50; Unhelpful stack trace

classic Classic list List threaded Threaded
33 messages Options
12
Reply | Threaded
Open this post in threaded view
|

bug#39977: 28.0.50; Unhelpful stack trace

Madhu-8

Thanks Eli, Following up on the emacs-devel message, I recompiled emacs
and get substantially the same stack trace. This is with my ~/.emacs and
all my local customizations loaded.

when evaluating a form in a sly lisp buffer with
 (break)
M-x sly-eval-last-sexp
pops up a sly debug window in a new frame
quitting the window (and frame) tries to select the previous window
and the crash apparently happens at this point. I'm attaching the
backtraces as attachments

With a little guidance I expect to be able to investigate this further



In GNU Emacs 28.0.50 (build 3, x86_64-pc-linux-gnu, GTK+ Version 3.24.13, cairo version 1.16.0)
 of 2020-03-07 built on leonis4
Repository revision: 64c791cecf6a9a8593c6e818b0007fcba5cc1549
Repository branch: madhu-tip
Windowing system distributor 'The X.Org Foundation', version 11.0.12006000
System Description: Gentoo/Linux

Configured using:
 'configure -C --with-harfbuzz --with-cairo --with-x-toolkit=gtk
 'CFLAGS=-ggdb -O0''

Configured features:
XPM JPEG TIFF GIF PNG RSVG CAIRO SOUND GPM DBUS GSETTINGS GLIB NOTIFY
INOTIFY ACL LIBSELINUX GNUTLS LIBXML2 FREETYPE HARFBUZZ M17N_FLT LIBOTF
ZLIB TOOLKIT_SCROLL_BARS GTK3 X11 XDBE XIM MODULES THREADS JSON PDUMPER
LCMS2 GMP

Important settings:
  value of $LC_COLLATE: C
  value of $LANG: en_US.utf8
  locale-coding-system: utf-8-unix

Major mode: Lisp Interaction

Minor modes in effect:
  savehist-mode: t
  xclip-mode: t
  elisp-slime-nav-mode: t
  ivy-prescient-mode: t
  prescient-persist-mode: t
  ivy-mode: t
  save-place-mode: t
  recentf-mode: t
  show-paren-mode: t
  shell-dirtrack-mode: t
  minibuffer-depth-indicate-mode: t
  display-time-mode: t
  which-function-mode: t
  foo-clear-output-mode: t
  foo-translate-kbd-paren-mode: t
  new-shell-activate-mode: t
  foo-mode: t
  tooltip-mode: t
  eldoc-mode: t
  mouse-wheel-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t
  transient-mark-mode: t

Features:
(shadow sort mail-extr emacsbug message rmc puny rfc822 mml mml-sec epa
epg epg-config mm-decode mm-bodies mm-encode mail-parse rfc2231
mailabbrev gmm-utils mailheader sendmail lw-manual lww61-manual-data
savehist xclip elisp-slime-nav gnus nnheader gnus-util rmail
rmail-loaddefs rfc2047 rfc2045 ietf-drums text-property-search
mail-utils mm-util mail-prsvr company pcase cus-start cus-load
ivy-prescient prescient counsel xdg swiper ivy delsel colir color
ivy-overlay ggtags etags fileloop generator xref project compile ewoc
zenicb-color zenicb-whereis zenicb-complete zenicb-stamp zenicb-history
zenicb-away zenicb zenirc-sasl erc-goodies erc erc-backend erc-compat pp
erc-loaddefs zenirc-color zenirc-stamp zenirc-trigger zenirc-notify
zenirc-netsplit zenirc-ignore zenirc-history zenirc-format zenirc-dcc
zenirc-complete zenirc-command-queue zenirc-away zenirc org-mew mew-auth
mew-config mew-imap2 mew-imap mew-nntp2 mew-nntp mew-pop mew-smtp
mew-ssl mew-ssh mew-net mew-highlight mew-sort mew-fib mew-ext
mew-refile mew-demo mew-attach mew-draft mew-message mew-thread
mew-virtual mew-summary4 mew-summary3 mew-summary2 mew-summary
mew-search mew-pick mew-passwd mew-scan mew-syntax mew-bq mew-smime
mew-pgp mew-header mew-exec mew-mark mew-mime mew-unix mew-edit
mew-decode mew-encode mew-cache mew-minibuf mew-complete mew-addrbook
mew-local mew-vars3 mew-vars2 mew-vars mew-env mew-mule3 mew-mule
mew-gemacs mew-key mew-func mew-blvs mew-const mew server winner
windmove whitespace tramp-sh tramp tramp-loaddefs trampver
tramp-integration files-x tramp-compat ls-lisp ange-ftp term disp-table
ehelp saveplace recentf tree-widget wid-edit paren ob-lisp ob-shell
shell org ob ob-tangle ob-ref ob-lob ob-table ob-exp org-macro
org-footnote org-src ob-comint org-pcomplete pcomplete comint ring
org-list org-faces org-entities noutline outline org-version
ob-emacs-lisp ob-core ob-eval org-table ol org-keys org-compat org-macs
org-loaddefs format-spec find-func cal-menu calendar cal-loaddefs
rng-nxml rng-valid rng-loc rng-uri rng-parse nxml-parse rng-match rng-dt
rng-util rng-pttrn nxml-ns nxml-mode nxml-outln nxml-rap sgml-mode dom
nxml-util nxml-enc xmltok mb-depth ffap thingatpt battery dbus xml time
so-long which-func imenu parse-time iso8601 time-date cookie1 diff
generic ansi-color derived easy-mmode edmacro kmacro advice cl-extra
help-mode dired-x dired dired-loaddefs cl gh-common marshal eieio-compat
rx info package easymenu browse-url url-handlers url-parse auth-source
cl-seq eieio eieio-core cl-macs eieio-loaddefs password-cache json
subr-x map url-vars seq byte-opt gv bytecomp byte-compile cconv
cl-loaddefs cl-lib tooltip eldoc electric uniquify ediff-hook vc-hooks
lisp-float-type mwheel term/x-win x-win term/common-win x-dnd tool-bar
dnd fontset image regexp-opt fringe tabulated-list replace newcomment
text-mode elisp-mode lisp-mode prog-mode register page tab-bar menu-bar
rfn-eshadow isearch timer select scroll-bar mouse jit-lock font-lock
syntax facemenu font-core term/tty-colors frame minibuffer cl-generic
cham georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao
korean japanese eucjp-ms cp51932 hebrew greek romanian slovak czech
european ethiopic indian cyrillic chinese composite charscript charprop
case-table epa-hook jka-cmpr-hook help simple abbrev obarray
cl-preloaded nadvice loaddefs button faces cus-face macroexp files
text-properties overlay sha1 md5 base64 format env code-pages mule
custom widget hashtable-print-readable backquote threads dbusbind
inotify lcms2 dynamic-setting system-font-setting font-render-setting
cairo move-toolbar gtk x-toolkit x multi-tty make-network-process emacs)

Memory information:
((conses 16 496211 16301)
 (symbols 48 37922 1)
 (strings 32 133899 4857)
 (string-bytes 1 3903977)
 (vectors 16 38303)
 (vector-slots 8 455102 14242)
 (floats 8 588 350)
 (intervals 56 1694 0)
 (buffers 1000 12))



bt.txt (2K) Download Attachment
bt-full.txt (82K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

bug#39977: 28.0.50; Unhelpful stack trace

Eli Zaretskii
> From: Madhu <[hidden email]>
> Date: Sat, 07 Mar 2020 23:13:16 +0530
>
> Thanks Eli, Following up on the emacs-devel message, I recompiled emacs
> and get substantially the same stack trace. This is with my ~/.emacs and
> all my local customizations loaded.
>
> when evaluating a form in a sly lisp buffer with
>  (break)
> M-x sly-eval-last-sexp
> pops up a sly debug window in a new frame
> quitting the window (and frame) tries to select the previous window
> and the crash apparently happens at this point. I'm attaching the
> backtraces as attachments
>
> With a little guidance I expect to be able to investigate this further

Thanks.  Please try the patch below, and tell if it makes the crash
go away.

diff --git a/src/window.c b/src/window.c
index 8cdad27..863fac4 100644
--- a/src/window.c
+++ b/src/window.c
@@ -541,8 +541,11 @@ select_window (Lisp_Object window, Lisp_Object norecord,
   else
     redisplay_other_windows ();
 
-  sf = SELECTED_FRAME ();
-  if (XFRAME (WINDOW_FRAME (w)) != sf)
+  if (FRAMEP (selected_frame) && FRAME_LIVE_P (XFRAME (selected_frame)))
+    sf = XFRAME (selected_frame);
+  else
+    sf = NULL;
+  if (!sf || XFRAME (WINDOW_FRAME (w)) != sf)
     {
       fset_selected_window (XFRAME (WINDOW_FRAME (w)), window);
       /* Use this rather than Fhandle_switch_frame



Reply | Threaded
Open this post in threaded view
|

bug#39977: 28.0.50; Unhelpful stack trace

Eli Zaretskii
Ping!  Can you please try the proposed patch?

Martin, any thoughts or comments about this?

> Date: Sat, 07 Mar 2020 20:50:42 +0200
> From: Eli Zaretskii <[hidden email]>
> Cc: [hidden email]
>
> > From: Madhu <[hidden email]>
> > Date: Sat, 07 Mar 2020 23:13:16 +0530
> >
> > Thanks Eli, Following up on the emacs-devel message, I recompiled emacs
> > and get substantially the same stack trace. This is with my ~/.emacs and
> > all my local customizations loaded.
> >
> > when evaluating a form in a sly lisp buffer with
> >  (break)
> > M-x sly-eval-last-sexp
> > pops up a sly debug window in a new frame
> > quitting the window (and frame) tries to select the previous window
> > and the crash apparently happens at this point. I'm attaching the
> > backtraces as attachments
> >
> > With a little guidance I expect to be able to investigate this further
>
> Thanks.  Please try the patch below, and tell if it makes the crash
> go away.
>
> diff --git a/src/window.c b/src/window.c
> index 8cdad27..863fac4 100644
> --- a/src/window.c
> +++ b/src/window.c
> @@ -541,8 +541,11 @@ select_window (Lisp_Object window, Lisp_Object norecord,
>    else
>      redisplay_other_windows ();
>  
> -  sf = SELECTED_FRAME ();
> -  if (XFRAME (WINDOW_FRAME (w)) != sf)
> +  if (FRAMEP (selected_frame) && FRAME_LIVE_P (XFRAME (selected_frame)))
> +    sf = XFRAME (selected_frame);
> +  else
> +    sf = NULL;
> +  if (!sf || XFRAME (WINDOW_FRAME (w)) != sf)
>      {
>        fset_selected_window (XFRAME (WINDOW_FRAME (w)), window);
>        /* Use this rather than Fhandle_switch_frame
>
>
>
>



Reply | Threaded
Open this post in threaded view
|

bug#39977: 28.0.50; Unhelpful stack trace

martin rudalics
 > Martin, any thoughts or comments about this?

The selected frame must be invariantly live.  Madhu, could you find out
why we apparently manage to return from delete_frame in frame.c without
selecting another frame?

The dividing area is the part written as

   /* At this point, we are committed to deleting the frame.
      There is no more chance for errors to prevent it.  */
   minibuffer_selected = EQ (minibuf_window, selected_window);
   sf = SELECTED_FRAME ();
   /* Don't let the frame remain selected.  */
   if (f == sf)

starting around line 2012 in delete_frame.  Put a breakpoint anywhere
there and run your sly function.  If the (f == sf) check is not true, we
are lost.  Otherwise, try to step through the following FOR_EACH_FRAME
and tell us why it doesn't break out of that loop (and the subsequent
one).  It requires a bit of intuition, but since you probably will not
have more than one frame you should be able to find out quickly.

Other than that I cannot imagine what could have gone wrong here and/or
how to test this.  In either case sf = NULL; is not TRT but I think you
are aware of that.

martin



Reply | Threaded
Open this post in threaded view
|

bug#39977: 28.0.50; Unhelpful stack trace

Eli Zaretskii
> Cc: [hidden email]
> From: martin rudalics <[hidden email]>
> Date: Fri, 13 Mar 2020 17:28:53 +0100
>
> In either case sf = NULL; is not TRT but I think you are aware of
> that.

No, I don't think I'm aware of that.  It's just a local variable, so
why assigning NULL could not be TRT?



Reply | Threaded
Open this post in threaded view
|

bug#39977: 28.0.50; Unhelpful stack trace

martin rudalics
 >> In either case sf = NULL; is not TRT but I think you are aware of
 >> that.
 >
 > No, I don't think I'm aware of that.  It's just a local variable, so
 > why assigning NULL could not be TRT?

Because it hides the underlying error.  The abort in SELECTED_FRAME is
there so we can find its cause and that's why I said you are aware of
it.  Obviously, we can set it to NULL to avoid an abort when, as in the
case at hand, we construct the mode line or the frame title.  But in
general doing such a thing in select_window is not TRT.  At least that's
what I learned from you.

martin



Reply | Threaded
Open this post in threaded view
|

bug#39977: 28.0.50; Unhelpful stack trace

martin rudalics
Maybe the following two changes would not harm:

(1) In fast_set_selected_frame check whether FRAME is live before doing
     selected_frame = frame;

(2) In display_mode_lines check whether new_frame is live before doing

     selected_frame = new_frame;

     and maybe also for old_selected_frame before

     selected_frame = old_selected_frame;

Maybe display_mode_lines should better use

    record_unwind_protect (fast_set_selected_frame, selected_frame);

What was the rationale for protecting frame reselection when drawing the
tab bar or the tool bar and not protecting it when drawing mode lines?

I have no idea whether these could help in any way but since we are in
redisplay and the selected frame has become dead all of a sudden ...

Another point is obviously the

       do_switch_frame (frame1, 0, 1, Qnil);
       sf = SELECTED_FRAME ();

combination in delete_frame itself.  If frame1 is dead, we select the
frame we are about to delete.  But this should not produce the abort at
hand since the assignment to selected_frame happens in do_switch_frame.
Guarding that assignment would not harm either and then we're done IMO.
WDYT?

martin



Reply | Threaded
Open this post in threaded view
|

bug#39977: 28.0.50; Unhelpful stack trace

Eli Zaretskii
> From: martin rudalics <[hidden email]>
> Cc: [hidden email], [hidden email]
> Date: Sat, 14 Mar 2020 19:55:34 +0100
>
> Maybe the following two changes would not harm:
>
> (1) In fast_set_selected_frame check whether FRAME is live before doing
>      selected_frame = frame;
>
> (2) In display_mode_lines check whether new_frame is live before doing
>
>      selected_frame = new_frame;
>
>      and maybe also for old_selected_frame before
>
>      selected_frame = old_selected_frame;

And what do you suggest to do if the frame at RHS is not live?

> I have no idea whether these could help in any way but since we are in
> redisplay and the selected frame has become dead all of a sudden ...

Why do you think this is a problem for redisplay?



Reply | Threaded
Open this post in threaded view
|

bug#39977: 28.0.50; Unhelpful stack trace

martin rudalics
 > And what do you suggest to do if the frame at RHS is not live?

Sorry, what is RHS?  As far as xdisp.c is concerned it simply must not
set selected_frame to a dead frame.  Never ever.  As far as frame.c is
concerned, it should do something like in the attached patch.  In the
worst case this might make it impossible to remove a specific frame but
this can usually be fixed by C-x 5 2 followed by C-x 5 1 unless things
are awfully broken.  But at least this is a place from where to continue
investigating.  I have no better idea.

 >> I have no idea whether these could help in any way but since we are in
 >> redisplay and the selected frame has become dead all of a sudden ...
 >
 > Why do you think this is a problem for redisplay?

I didn't say that this is a problem for redisplay.  What I wanted to say
is that your fix which is supposed to handle a (maybe only temporary)
problem inside redisplay might cause more serious problems when
selecting a dead frame outside of redisplay.  But maybe I'm confusing
things.

martin

frame.c.diff (1K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

bug#39977: 28.0.50; Unhelpful stack trace

Eli Zaretskii
> Cc: [hidden email], [hidden email]
> From: martin rudalics <[hidden email]>
> Date: Sun, 15 Mar 2020 18:49:21 +0100
>
>  > And what do you suggest to do if the frame at RHS is not live?
>
> Sorry, what is RHS?

Right-hand side.

> As far as xdisp.c is concerned it simply must not set selected_frame
> to a dead frame.

I don't think that's possible in xdisp.c cases you've shown.

> Never ever.

Why not?

> As far as frame.c is concerned, it should do something like in the
> attached patch.

We cannot punt like that in the display engine.

>  > Why do you think this is a problem for redisplay?
>
> I didn't say that this is a problem for redisplay.  What I wanted to say
> is that your fix which is supposed to handle a (maybe only temporary)
> problem inside redisplay might cause more serious problems when
> selecting a dead frame outside of redisplay.  But maybe I'm confusing
> things.

So you are saying that selecting such a frame will cause trouble to
some other code, not to the display engine?



Reply | Threaded
Open this post in threaded view
|

bug#39977: 28.0.50; Unhelpful stack trace

Madhu-8
In reply to this post by martin rudalics
[Sorry for the delay in checking my email until now. I wasn't
subscribe to the debbgugs newsgroup]

* I didn't get Eli's message #8 (Sat, 07 Mar 2020 20:50:42 +0200) in
my mailbox.  The patch supplied in this message does indeed make the
crash go away.

* I did get Eli's message #11 (Fri, 13 Mar 2020 11:55:26 +0200) to my
mailbox. Subsequent messages seem to be delivered to my email address.

* Re Martin's message #14 (Fri, 13 Mar 2020 17:28:53 +0100), the check
(f == sf) in delet_frame is not true when I trigger the crash.

* Re Eli's message #23 (Sat, 14 Mar 2020 12:10:09 +0200): It seems out
that I am going out of the way to trigger the crash - I may be
introducing a "bug" in SLY code, or exposing a defect in SLY design.

Presumably under normal circumstances the crash should not occur.

I haven't understood the sequence of events which causes leads to this
crash case.  I'm a little embarrased to reveal it but I can try to
pass on the recipe to Martin.

* Re Martin's message #35 (Sun, 15 Mar 2020 18:49:21 +0100), the patch
frame.c.diff does make the crash go away.




Reply | Threaded
Open this post in threaded view
|

bug#39977: 28.0.50; Unhelpful stack trace

martin rudalics
In reply to this post by Eli Zaretskii
 >> As far as xdisp.c is concerned it simply must not set selected_frame
 >> to a dead frame.
 >
 > I don't think that's possible in xdisp.c cases you've shown.
 >
 >> Never ever.
 >
 > Why not?

Because it might shift the abort to the next instance of SELECTED_FRAME.

 >> As far as frame.c is concerned, it should do something like in the
 >> attached patch.
 >
 > We cannot punt like that in the display engine.

Why not?  At least one of the frame restorations is unprotected anyway
and might leave the temporarily selected frame selected.

 > So you are saying that selecting such a frame will cause trouble to
 > some other code, not to the display engine?

Not "will" but "may".  The problem is that it then might be harder
to find the cause.

With emacs -Q evaluate

(defvar foo
   '(:eval
     (when (> (length (frame-list)) 1)
       (delete-frame (next-frame)))))

(setq-default mode-line-format foo)

and do C-x 5 2.  The backtrace I get here is


#0  0x000000000063f7a3 in terminate_due_to_signal (sig=6, backtrace_limit=40) at ../../src/emacs.c:371
#1  0x000000000068ac8a in emacs_abort () at ../../src/sysdep.c:2448
#2  0x00000000004ee088 in select_window (window=XIL(0x1be5745), norecord=XIL(0x30), inhibit_point_swap=false) at ../../src/window.c:544
#3  0x00000000004ee2f9 in Fselect_window (window=XIL(0x1be5745), norecord=XIL(0x30)) at ../../src/window.c:630
#4  0x0000000000484c33 in gui_consider_frame_title (frame=XIL(0x1be5505)) at ../../src/xdisp.c:12318
#5  0x00000000004974b6 in redisplay_window (window=XIL(0x1be5745), just_this_one_p=false) at ../../src/xdisp.c:18940
#6  0x000000000048cb00 in redisplay_window_0 (window=XIL(0x1be5745)) at ../../src/xdisp.c:16179
#7  0x00000000007b10dd in internal_condition_case_1 (bfun=0x48cabe <redisplay_window_0>, arg=XIL(0x1be5745), handlers=XIL(0x7ffff40bafbb), hfun=0x48ca86 <redisplay_window_error>) at ../../src/eval.c:1379
#8  0x000000000048ca58 in redisplay_windows (window=XIL(0x1be5745)) at ../../src/xdisp.c:16159
#9  0x000000000048b486 in redisplay_internal () at ../../src/xdisp.c:15627
#10 0x0000000000489084 in redisplay () at ../../src/xdisp.c:14854
#11 0x0000000000650828 in read_char (commandflag=1, map=XIL(0x17ce993), prev_event=XIL(0), used_mouse_menu=0x7fffffffe13f, end_time=0x0) at ../../src/keyboard.c:2493
#12 0x0000000000663705 in read_key_sequence (keybuf=0x7fffffffe2d0, prompt=XIL(0), dont_downcase_last=false, can_return_switch_frame=true, fix_current_buffer=true, prevent_redisplay=false) at ../../src/keyboard.c:9549
#13 0x000000000064ccee in command_loop_1 () at ../../src/keyboard.c:1350
#14 0x00000000007b1002 in internal_condition_case (bfun=0x64c872 <command_loop_1>, handlers=XIL(0x90), hfun=0x64be81 <cmd_error>) at ../../src/eval.c:1355
#15 0x000000000064c457 in command_loop_2 (ignore=XIL(0)) at ../../src/keyboard.c:1091
#16 0x00000000007b04b6 in internal_catch (tag=XIL(0xd0e0), func=0x64c42a <command_loop_2>, arg=XIL(0)) at ../../src/eval.c:1116
#17 0x000000000064c3f5 in command_loop () at ../../src/keyboard.c:1070
#18 0x000000000064b968 in recursive_edit_1 () at ../../src/keyboard.c:714
#19 0x000000000064bb60 in Frecursive_edit () at ../../src/keyboard.c:786
#20 0x0000000000641f98 in main (argc=2, argv=0x7fffffffe7c8) at ../../src/emacs.c:2035


which is almost the same as Madhu's.  So maybe the display engine should
simply set a global boolean inhibit_frame_changes while evaluating mode
lines or frame titles and have at least delete_frame not delete a frame
when that variable is set.  If we decide that fixing such a problem is
urgent.

martin



Reply | Threaded
Open this post in threaded view
|

bug#39977: 28.0.50; Unhelpful stack trace

martin rudalics
In reply to this post by Madhu-8
 > I haven't understood the sequence of events which causes leads to this
 > crash case.  I'm a little embarrased to reveal it but I can try to
 > pass on the recipe to Martin.

Do you ever evaluate something that could cause a frame deletion when
setting a mode or header-line string or a frame title.  Do you use the
tab-bar?  Where in your code does the "quitting the window (and frame)"
you mentioned earlier happen?

martin



Reply | Threaded
Open this post in threaded view
|

bug#39977: 28.0.50; Unhelpful stack trace

Eli Zaretskii
In reply to this post by martin rudalics
> Cc: [hidden email], [hidden email]
> From: martin rudalics <[hidden email]>
> Date: Mon, 16 Mar 2020 10:24:14 +0100
>
>  >> As far as xdisp.c is concerned it simply must not set selected_frame
>  >> to a dead frame.
>  >
>  > I don't think that's possible in xdisp.c cases you've shown.
>  >
>  >> Never ever.
>  >
>  > Why not?
>
> Because it might shift the abort to the next instance of SELECTED_FRAME.

Why does it matter which SELECTED_FRAME crashes?

Anyway, my point was a different one: it was that we cannot simply
"not select" such a frame, we need to do something else.  What exactly
is not trivial, and I didn't understand what you were suggesting to
do.

>  >> As far as frame.c is concerned, it should do something like in the
>  >> attached patch.
>  >
>  > We cannot punt like that in the display engine.
>
> Why not?

Because we must have a frame that we were supposed to redisplay.

> At least one of the frame restorations is unprotected anyway
> and might leave the temporarily selected frame selected.

The display engine doesn't select frames to show them to the user, it
selects them to redraw their windows.  So the considerations what to
do in this case are different from those we need to consider when the
user selects a frame.

>  > So you are saying that selecting such a frame will cause trouble to
>  > some other code, not to the display engine?
>
> Not "will" but "may".  The problem is that it then might be harder
> to find the cause.
>
> With emacs -Q evaluate
>
> (defvar foo
>    '(:eval
>      (when (> (length (frame-list)) 1)
>        (delete-frame (next-frame)))))
>
> (setq-default mode-line-format foo)
>
> and do C-x 5 2.  The backtrace I get here is

Which just means we need to add the protection to SELECTED_FRAME
itself, so that it runs everywhere.



Reply | Threaded
Open this post in threaded view
|

bug#39977: 28.0.50; Unhelpful stack trace

martin rudalics
 > Why does it matter which SELECTED_FRAME crashes?

Because the next crash may happen at some time in the future.  Why not
cure the first crash we have right away?

 > Anyway, my point was a different one: it was that we cannot simply
 > "not select" such a frame, we need to do something else.  What exactly
 > is not trivial, and I didn't understand what you were suggesting to
 > do.
 >
 >>   >> As far as frame.c is concerned, it should do something like in the
 >>   >> attached patch.
 >>   >
 >>   > We cannot punt like that in the display engine.
 >>
 >> Why not?
 >
 > Because we must have a frame that we were supposed to redisplay.

Either we are miscommunicating or I' m just dumb.  I would in no way
restrict the display engine in choosing whatever live frame it wants to
redisplay.

 > The display engine doesn't select frames to show them to the user, it
 > selects them to redraw their windows.  So the considerations what to
 > do in this case are different from those we need to consider when the
 > user selects a frame.

As I said above: This is not about the frame its windows it has to
redraw.  It's about the display engine trying to select a frame after
it has redrawn (parts of) another frame's windows.

 >> Not "will" but "may".  The problem is that it then might be harder
 >> to find the cause.
 >>
 >> With emacs -Q evaluate
 >>
 >> (defvar foo
 >>     '(:eval
 >>       (when (> (length (frame-list)) 1)
 >>         (delete-frame (next-frame)))))
 >>
 >> (setq-default mode-line-format foo)
 >>
 >> and do C-x 5 2.  The backtrace I get here is
 >
 > Which just means we need to add the protection to SELECTED_FRAME
 > itself, so that it runs everywhere.

But SELECTED_FRAME is not the cause of this problem.  The cause of the
problem is AFAICT the fact that :eval is allowed to do silly things
while the display engine tries to redraw windows.  The example above is
only a mirror of

----------------------
With emacs -Q evaluate

(defvar foo
   '(:eval
     (when (> (length (frame-list)) 1)
       (delete-frame))))

(setq-default mode-line-format foo)

and do C-x 5 2.
---------------

where I'm told that

:eval deleted the frame being displayed

So the display engine is, in principle, aware of one incarnation of the
problem - the one where an :eval tries to delete under its feet the
frame it currently tries to redraw and the comment correctly says that

   This is a nonsensical thing to do,
   and signaling an error from redisplay might be
   dangerous, but we cannot continue with an invalid frame.

So here the display engine bows out.  OTOH we allow it to set
selected_frame to an equally invalid frame.  Isn't that a bit selfish?

martin



Reply | Threaded
Open this post in threaded view
|

bug#39977: 28.0.50; Unhelpful stack trace

Eli Zaretskii
> Cc: [hidden email], [hidden email]
> From: martin rudalics <[hidden email]>
> Date: Tue, 17 Mar 2020 10:38:11 +0100
>
>  > Why does it matter which SELECTED_FRAME crashes?
>
> Because the next crash may happen at some time in the future.  Why not
> cure the first crash we have right away?

If we can cure it, sure.  But I don't yet see what kind of cure are
you suggesting.  And in any case, the cure is not in SELECTED_FRAME.

>  >>   >> As far as frame.c is concerned, it should do something like in the
>  >>   >> attached patch.
>  >>   >
>  >>   > We cannot punt like that in the display engine.
>  >>
>  >> Why not?
>  >
>  > Because we must have a frame that we were supposed to redisplay.
>
> Either we are miscommunicating or I' m just dumb.  I would in no way
> restrict the display engine in choosing whatever live frame it wants to
> redisplay.

The original crash, and the crash you reported a couple of messages
upthread, are both in redisplay, though.  So I'm looking for a
solution to those.  Assigning some arbitrary value to a local variable
and/or switching to a different frame can be such solutions, albeit
not optimal ones; the changes you propose for frame.c cannot.

So I'm still unsure what exactly would you propose for the display
engine to do when it needs to examine the selected frame and discovers
that this frame is invalid.

>  > The display engine doesn't select frames to show them to the user, it
>  > selects them to redraw their windows.  So the considerations what to
>  > do in this case are different from those we need to consider when the
>  > user selects a frame.
>
> As I said above: This is not about the frame its windows it has to
> redraw.  It's about the display engine trying to select a frame after
> it has redrawn (parts of) another frame's windows.

The display engine selects a frame because it needs to display
something related to that frame.  If it cannot select it, it should do
something about that, not just punt.

> :eval deleted the frame being displayed
>
> So the display engine is, in principle, aware of one incarnation of the
> problem - the one where an :eval tries to delete under its feet the
> frame it currently tries to redraw and the comment correctly says that
>
>    This is a nonsensical thing to do,
>    and signaling an error from redisplay might be
>    dangerous, but we cannot continue with an invalid frame.

You are proposing that we find all the places where SELECTED_FRAME is
used and fix them one by one?  I thought it could be better to fix
them all at once as part of SELECTED_FRAME.



Reply | Threaded
Open this post in threaded view
|

bug#39977: 28.0.50; Unhelpful stack trace

martin rudalics
 > You are proposing that we find all the places where SELECTED_FRAME is
 > used and fix them one by one?  I thought it could be better to fix
 > them all at once as part of SELECTED_FRAME.

We are still miscommunicating.  I only want to fix the parts that
restore selected_frame so to make sure that they never set it to a dead
frame.  See the attached selected_frame.diff.

martin

selected_frame.diff (3K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

bug#39977: 28.0.50; Unhelpful stack trace

Eli Zaretskii
> Cc: [hidden email], [hidden email]
> From: martin rudalics <[hidden email]>
> Date: Tue, 17 Mar 2020 18:31:18 +0100
>
>  > You are proposing that we find all the places where SELECTED_FRAME is
>  > used and fix them one by one?  I thought it could be better to fix
>  > them all at once as part of SELECTED_FRAME.
>
> We are still miscommunicating.  I only want to fix the parts that
> restore selected_frame so to make sure that they never set it to a dead
> frame.

Why?



Reply | Threaded
Open this post in threaded view
|

bug#39977: 28.0.50; Unhelpful stack trace

martin rudalics
> Why?

So that SELECTED_FRAME does not abort.

martin





Reply | Threaded
Open this post in threaded view
|

bug#39977: 28.0.50; Unhelpful stack trace

martin rudalics
 > I'm sorry, I think I no longer know what we are discussing.

For me the present abort is just another instance of Bug#29726 where you
said:

   The reason for the crash is that the ':eval' form which you have on
   the header-line can delete the frame whose header-line Emacs is
   redrawing!  The Lisp-level backtrace below shows how delete-frame is
   called from your code; hopefully, this backtrace will allow you to fix
   your code so it doesn't do such nonsensical things.

Only that in the case at hand 'delete-frame' does not try to delete the
frame the display engine is working on and so your fix for Bug#29726
won't catch it.  Rather, the frame that gets deleted is the frame that
was selected before the display engine started to process the :eval
form.  When, after processing the :eval form and the containing mode
line or title bar format, the display engine wants to restore the
previously selected frame, it sets selected_frame to a dead frame.  And
the next attempt to use selected_frame via SELECTED_FRAME results in the
abort.

 > Feel free
 > to fix this (whatever it is) as you see fit.

The longer I'm looking into this, the more I think that we should be
much more restrictive wrt what an :eval form in mode line or title name
processing should be allowed to do.  Tab bars could provide even more
confusion.  I think we should disallow any such :eval to kill buffers
and delete windows or frames at the very least.

Maybe it should be also disallowed to select a window or frame or
whatever the display engine tries to restore after processing these
forms.  Such selections would be usually undone anyway by the display
engine.  Probably, we should disallow such :eval forms to modify
"anything" at all but I have no idea how to do that.

martin



12