bug#44007: 26.3; Strange shell mode performance

classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

bug#44007: 26.3; Strange shell mode performance

Herman
I'm experiencing strange shell (comint) mode performance. If I cat a 10
MB file, it takes 1:20 to cat it. Emacs uses only 30% cpu. But, if I
press enter after I send the cat command, it speeds up, and it only
takes 8 sec (emacs cpu usage goes up to 100%).

Same thing: if I execute "seq 100000" in a small emacs window, it takes
~3 seconds. But, if I press enter after executing seq, it takes only 0.4
sec (this is not 100% reproducible, sometimes pressing enter doesn't
make a difference).
(I executed comint-clear-buffer before measurements)

It happens with 26.3 and ~3-week-old master branch as well, without any
extra config (emacs -Q).


In GNU Emacs 26.3 (build 2, x86_64-pc-linux-gnu, GTK+ Version 3.24.20)
  of 2020-05-17, modified by Debian built on x86-csail-01
Windowing system distributor 'The X.Org Foundation', version 11.0.12008000
System Description:    Debian GNU/Linux bullseye/sid

Recent messages:
Overwrite mode enabled in current buffer
Overwrite mode disabled in current buffer
Mark set [2 times]
Setting ‘send-mail-function’ temporarily since "emacs -q" would
overwrite customizations
Sending...
Mark set [2 times]
Sending via mail...
Sending...done
mwheel-scroll: Beginning of buffer [8 times]
mwheel-scroll: End of buffer [6 times]
mwheel-scroll: Beginning of buffer [6 times]
Configured using:
  'configure --build x86_64-linux-gnu --prefix=/usr
  --sharedstatedir=/var/lib --libexecdir=/usr/lib
  --localstatedir=/var/lib --infodir=/usr/share/info
  --mandir=/usr/share/man --enable-libsystemd --with-pop=yes
  --enable-locallisppath=/etc/emacs:/usr/local/share/emacs/26.3/site-lisp:/usr/local/share/emacs/site-lisp:/usr/share/emacs/26.3/site-lisp:/usr/share/emacs/site-lisp
  --with-sound=alsa --without-gconf --with-mailutils --build
  x86_64-linux-gnu --prefix=/usr --sharedstatedir=/var/lib
  --libexecdir=/usr/lib --localstatedir=/var/lib
  --infodir=/usr/share/info --mandir=/usr/share/man --enable-libsystemd
  --with-pop=yes
  --enable-locallisppath=/etc/emacs:/usr/local/share/emacs/26.3/site-lisp:/usr/local/share/emacs/site-lisp:/usr/share/emacs/26.3/site-lisp:/usr/share/emacs/site-lisp
  --with-sound=alsa --without-gconf --with-mailutils --with-x=yes
  --with-x-toolkit=gtk3 --with-toolkit-scroll-bars 'CFLAGS=-g -O2
  -fdebug-prefix-map=/build/emacs-mHAik2/emacs-26.3+1=.
-fstack-protector-strong
  -Wformat -Werror=format-security -Wall' 'CPPFLAGS=-Wdate-time
  -D_FORTIFY_SOURCE=2' LDFLAGS=-Wl,-z,relro'

Configured features:
XPM JPEG TIFF GIF PNG RSVG IMAGEMAGICK SOUND GPM DBUS GSETTINGS GLIB
NOTIFY ACL LIBSELINUX GNUTLS LIBXML2 FREETYPE M17N_FLT LIBOTF XFT ZLIB
TOOLKIT_SCROLL_BARS GTK3 X11 XDBE XIM THREADS LIBSYSTEMD LCMS2

Important settings:
   value of $LC_ALL: C.UTF-8
   value of $LANG: en_US.UTF-8
   locale-coding-system: utf-8-unix

Major mode: Lisp Interaction

Minor modes in effect:
   tooltip-mode: t
   global-eldoc-mode: t
   eldoc-mode: t
   electric-indent-mode: t
   mouse-wheel-mode: t
   tool-bar-mode: t
   menu-bar-mode: t
   file-name-shadow-mode: t
   global-font-lock-mode: t
   font-lock-mode: t
   blink-cursor-mode: t
   auto-composition-mode: t
   auto-encryption-mode: t
   auto-compression-mode: t
   line-number-mode: t
   transient-mark-mode: t

Load-path shadows:
None found.

Features:
(mailalias mailclient browse-url cus-edit cus-start cus-load wid-edit
help-mode pp shadow sort mail-extr emacsbug message rmc puny seq
byte-opt gv bytecomp byte-compile cconv cl-loaddefs cl-lib dired
dired-loaddefs format-spec rfc822 mml easymenu mml-sec password-cache
epa derived epg epg-config gnus-util rmail rmail-loaddefs mm-decode
mm-bodies mm-encode mail-parse rfc2231 mailabbrev gmm-utils mailheader
sendmail rfc2047 rfc2045 ietf-drums mm-util mail-prsvr mail-utils
elec-pair time-date mule-util tooltip eldoc electric uniquify ediff-hook
vc-hooks lisp-float-type mwheel term/x-win x-win term/common-win x-dnd
tool-bar dnd fontset image regexp-opt fringe tabulated-list replace
newcomment text-mode elisp-mode lisp-mode prog-mode register page
menu-bar rfn-eshadow isearch timer select scroll-bar mouse jit-lock
font-lock syntax facemenu font-core term/tty-colors frame cl-generic
cham georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao
korean japanese eucjp-ms cp51932 hebrew greek romanian slovak czech
european ethiopic indian cyrillic chinese composite charscript charprop
case-table epa-hook jka-cmpr-hook help simple abbrev obarray minibuffer
cl-preloaded nadvice loaddefs button faces cus-face macroexp files
text-properties overlay sha1 md5 base64 format env code-pages mule
custom widget hashtable-print-readable backquote threads dbusbind
inotify lcms2 dynamic-setting system-font-setting font-render-setting
move-toolbar gtk x-toolkit x multi-tty make-network-process emacs)

Memory information:
((conses 16 121337 21158)
  (symbols 48 22080 0)
  (miscs 40 64 154)
  (strings 32 32731 1152)
  (string-bytes 1 852243)
  (vectors 16 15681)
  (vector-slots 8 516540 10022)
  (floats 8 64 228)
  (intervals 56 612 0)
  (buffers 992 14))




Reply | Threaded
Open this post in threaded view
|

bug#44007: 26.3; Strange shell mode performance

Eli Zaretskii
> From: [hidden email], Géza <[hidden email]>
> Date: Thu, 15 Oct 2020 11:53:16 +0200
>
> I'm experiencing strange shell (comint) mode performance. If I cat a 10
> MB file, it takes 1:20 to cat it. Emacs uses only 30% cpu. But, if I
> press enter after I send the cat command, it speeds up, and it only
> takes 8 sec (emacs cpu usage goes up to 100%).
>
> Same thing: if I execute "seq 100000" in a small emacs window, it takes
> ~3 seconds. But, if I press enter after executing seq, it takes only 0.4
> sec (this is not 100% reproducible, sometimes pressing enter doesn't
> make a difference).
> (I executed comint-clear-buffer before measurements)

Can you please describe a detailed reproduction recipe, starting from
"emacs -Q"?  It is not exactly clear from the above whether you do
this in a shell-mode buffer or in some other mode, where exactly do
you press Enter, etc.  A detailed recipe will resolve all those
unclear aspects.

Thanks.



Reply | Threaded
Open this post in threaded view
|

bug#44007: 26.3; Strange shell mode performance

Herman, Geza
Sure!

1. execute emacs by "emacs -Q"
2. M-x shell RET
3. in the shell, execute "seq 100000" (or cat a large file)
4. immediately after you hit enter on "seq 100000" press enter again

It doesn't reproduce 100% unfortunately. It seems that if it doesn't
reproduce, you need to start from step 1., so it has a higher chance to
happen.

Another strange thing is that enter can have the opposite effect as
well: if I press it during "seq 100000" several times (not immediately
after "seq 100000", but after ~1 sec), it can make seq slower. Instead
of the usual 3-4 sec, it slows down to >10 sec.
(I measured by "time seq 100000"). It seems that the more enter I press,
the slower it gets. But maybe it's a different issue.

On 2020-10-15 16:10, Eli Zaretskii wrote:

>> From: [hidden email], Géza <[hidden email]>
>> Date: Thu, 15 Oct 2020 11:53:16 +0200
>>
>> I'm experiencing strange shell (comint) mode performance. If I cat a 10
>> MB file, it takes 1:20 to cat it. Emacs uses only 30% cpu. But, if I
>> press enter after I send the cat command, it speeds up, and it only
>> takes 8 sec (emacs cpu usage goes up to 100%).
>>
>> Same thing: if I execute "seq 100000" in a small emacs window, it takes
>> ~3 seconds. But, if I press enter after executing seq, it takes only 0.4
>> sec (this is not 100% reproducible, sometimes pressing enter doesn't
>> make a difference).
>> (I executed comint-clear-buffer before measurements)
> Can you please describe a detailed reproduction recipe, starting from
> "emacs -Q"?  It is not exactly clear from the above whether you do
> this in a shell-mode buffer or in some other mode, where exactly do
> you press Enter, etc.  A detailed recipe will resolve all those
> unclear aspects.
>
> Thanks.




Reply | Threaded
Open this post in threaded view
|

bug#44007: 26.3; Strange shell mode performance

Herman
In reply to this post by Eli Zaretskii
It reproduces for me, both 26.3 and 28.0.50. And both for the X and
console version (by console version, I mean the usual version, but with
DISPLAY unset).

I attached a text file (hopefully it gets through).

cat-ing this file takes ~18 sec. But with enter, it takes just ~3 sec
(these measurements are done with the console version). It behaves like
this 100% of the time (it's a more reliable test than the "seq" test).

And it also speeds up, if I don't press enter, but constantly move the
mouse (using the X version).

It seems that emacs waits for some kind of event here (hence the low cpu
utilization). But if there's some event (mouse, or an unprocessed
input), then it speeds up (and uses 100% cpu).

test.txt.xz (84K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

bug#44007: 26.3; Strange shell mode performance

Eli Zaretskii
> From: [hidden email], Géza <[hidden email]>
> Date: Fri, 16 Oct 2020 12:44:32 +0200
>
> It reproduces for me, both 26.3 and 28.0.50. And both for the X and
> console version (by console version, I mean the usual version, but with
> DISPLAY unset).
>
> I attached a text file (hopefully it gets through).
>
> cat-ing this file takes ~18 sec. But with enter, it takes just ~3 sec
> (these measurements are done with the console version). It behaves like
> this 100% of the time (it's a more reliable test than the "seq" test).

Does it help to play with the value of process-adaptive-read-buffering?
Or with read-process-output-max?



Reply | Threaded
Open this post in threaded view
|

bug#44007: 26.3; Strange shell mode performance

Herman

> Does it help to play with the value of process-adaptive-read-buffering?
> Or with read-process-output-max?
It seems it doesn't. I tried all 4 combinations of
read-process-output-max = 4096/1 MB and process-adaptive-read-buffering
= nil/t, emacs behaves the same.



Reply | Threaded
Open this post in threaded view
|

bug#44007: 26.3; Strange shell mode performance

Herman
In reply to this post by Eli Zaretskii
I've debugged the issue a little bit, and the cause of it is
process-adaptive-read-buffering indeed. I misused this variable before:
I tried to set it to nil before executing "cat", but it needs to be set
before "M-x shell".

Setting it to nil completely fixes the issue. If its value is "t", my
"cat test.txt" runs in 18 sec. Setting it to "nil", cat runs in ~2 sec.

Looking at the logic which does adaptive reading, I see why it can be
suboptimal. If a process sends less than 256 bytes at first, then delay
is increased, and it will be decreased only if the process can send
bytes which completely fills the buffer. But this may never be true
(because some other part of the system already throttles the process,
because emacs didn't read enough from it - I suppose this can depend on
the value read-process-output-max).

The issue can be made less severe by modifying emacs_intr_read to read
as much as data fit into the buffer (execute read() until it reports
failure/EOF). It solves the issue completely for me, if
read-process-output-max is 4096. If it's 1MB, this modification makes
the issue less severe: "cat test.txt" runs in 6 sec.

Here's the modification I did to emacs_intr_read, put this after the
do/while loop:

   if (result > 0) {
       for (;;) {
           ssize_t r = read(fd, buf+result, nbyte-result);
           if (r<=0) break;
           if (interruptible) maybe_quit();
           result += r;
       }
   }

(I'm not saying that this is the correct solution, I put this here so
you can repro my results)

The reason that it's needed that read() returns 4095 for some reason,
even though nbyte is 4096. An additional read() call will provide the
missing 1 byte. As far as I know, this is a valid and allowed, yet
strange behavior. I'm not sure what causes it.

I think that the logic around decreasing the delay is not good. The
current "decrease delay when buffer is full" is a fragile condition. To
me, it seems logical to clear/decrease delay if more than 256 bytes
arrived. It may not be 100% optimal, but it's near to it. On the other
hand, the current behavior can cause severe performance problems for
fast streams. Having read-process-output-max a large value is essential:
"cat test.txt" runs in 0.3 sec, when this value is 1MB (adaptive is
turned off). If read-process-output-max is 4K, it runs in 1.4 sec
(adaptive is turned off).