bug#30735: 25.3; slow comment c++-mode

classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|

bug#30735: 25.3; slow comment c++-mode

Nil Geisweiller
1. Open the following file (make sure you have write privilege)

https://raw.githubusercontent.com/opencog/atomspace/master/tests/unify/
UnifyUTest.cxxtest

2. Enable c++-mode (M-x c++-mode)

3. Mark a large region of that file (say l.175 to the end)

4. Comment out that region (M-;)

5. Meditate in the church of Emacs for a few minutes

If c++-mode is disabled (using fundamental-mode for instance)
commenting
that same region is instantaneous. After running some profiling it
seems
most of the CPU resources is spent in c-syntactic-skip-backward.

The same slowness occurs with Emacs 25.1. I couldn't try older versions
due to temacs Segfault compiling issue (unresolved by disabling
randomize_va_space).


In GNU Emacs 25.3.1 (x86_64-redhat-linux-gnu, GTK+ Version 3.22.19)
 of 2017-09-15 built on buildvm-31.phx2.fedoraproject.org
Windowing system distributor 'Fedora Project', version 11.0.11906000
Configured using:
 'configure --build=x86_64-redhat-linux-gnu
 --host=x86_64-redhat-linux-gnu --program-prefix=
 --disable-dependency-tracking --prefix=/usr --exec-prefix=/usr
 --bindir=/usr/bin --sbindir=/usr/sbin --sysconfdir=/etc
 --datadir=/usr/share --includedir=/usr/include --libdir=/usr/lib64
 --libexecdir=/usr/libexec --localstatedir=/var
 --sharedstatedir=/var/lib --mandir=/usr/share/man
 --infodir=/usr/share/info --with-dbus --with-gif --with-jpeg --with-
png
 --with-rsvg --with-tiff --with-xft --with-xpm --with-x-toolkit=gtk3
 --with-gpm=no --with-xwidgets --with-modules
 build_alias=x86_64-redhat-linux-gnu host_alias=x86_64-redhat-linux-gnu
 'CFLAGS=-DMAIL_USE_LOCKF -O2 -g -pipe -Wall -Werror=format-security
 -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong
 --param=ssp-buffer-size=4 -grecord-gcc-switches
 -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -m64 -mtune=generic'
 LDFLAGS=-Wl,-z,relro
 PKG_CONFIG_PATH=:/usr/lib64/pkgconfig:/usr/share/pkgconfig'

Configured features:
XPM JPEG TIFF GIF PNG RSVG IMAGEMAGICK SOUND DBUS GCONF GSETTINGS
NOTIFY
ACL LIBSELINUX GNUTLS LIBXML2 FREETYPE M17N_FLT LIBOTF XFT ZLIB
TOOLKIT_SCROLL_BARS GTK3 X11 MODULES XWIDGETS

Important settings:
  value of $LANG: en_US.UTF-8
  value of $XMODIFIERS: @im=ibus
  locale-coding-system: utf-8-unix

Major mode: C++/l

Minor modes in effect:
  tooltip-mode: t
  global-eldoc-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t
  transient-mark-mode: t
  abbrev-mode: t

Recent messages:
Saved text until ", ts_expected));
}

#undef al
#undef an
"
Mark set [2 times]
Saved text from "#include <opencog/util/Logger.h>

#inclu"

Load-path shadows:
None found.

Features:
(shadow sort mail-extr emacsbug message dired format-spec rfc822 mml
mml-sec password-cache epg epg-config gnus-util mm-decode mm-bodies
mm-encode mail-parse rfc2231 mailabbrev gmm-utils mailheader sendmail
rfc2047 rfc2045 ietf-drums mm-util help-fns mail-prsvr mail-utils
cl-extra help-mode cc-mode cc-fonts easymenu cc-guess cc-menus cc-cmds
cc-styles cc-align cc-engine cc-vars cc-defs cl-loaddefs pcase cl-lib
time-date mule-util tooltip eldoc electric uniquify ediff-hook vc-hooks
lisp-float-type mwheel x-win term/common-win x-dnd tool-bar dnd fontset
image regexp-opt fringe tabulated-list newcomment elisp-mode lisp-mode
prog-mode register page menu-bar rfn-eshadow timer select scroll-bar
mouse jit-lock font-lock syntax facemenu font-core frame cl-generic
cham
georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao
korean japanese eucjp-ms cp51932 hebrew greek romanian slovak czech
european ethiopic indian cyrillic chinese charscript case-table epa-
hook
jka-cmpr-hook help simple abbrev minibuffer cl-preloaded nadvice
loaddefs button faces cus-face macroexp files text-properties overlay
sha1 md5 base64 format env code-pages mule custom widget
hashtable-print-readable backquote dbusbind inotify dynamic-setting
system-font-setting font-render-setting xwidget-internal move-toolbar
gtk x-toolkit x multi-tty make-network-process emacs)

Memory information:
((conses 16 136237 5145)
 (symbols 48 21890 0)
 (miscs 40 56 157)
 (strings 32 19783 4769)
 (string-bytes 1 778551)
 (vectors 16 14190)
 (vector-slots 8 458261 5239)
 (floats 8 167 43)
 (intervals 56 9646 10)
 (buffers 976 18))



Reply | Threaded
Open this post in threaded view
|

bug#30735: (no subject)

Nil Geisweiller
Important info: the slow down occurs in 24.4.1 but does not in 23.3.1



Reply | Threaded
Open this post in threaded view
|

bug#30735: (no subject)

Nil Geisweiller
In reply to this post by Nil Geisweiller
In case it helps, I should mention that the slow down occurs in 24.3.1
as well.



Reply | Threaded
Open this post in threaded view
|

bug#30735: 25.3; slow comment c++-mode

Alan Mackenzie
In reply to this post by Nil Geisweiller
Hello, Nil.

In article <[hidden email]> you wrote:
> 1. Open the following file (make sure you have write privilege)

> https://raw.githubusercontent.com/opencog/atomspace/master/tests/unify/
> UnifyUTest.cxxtest

> 2. Enable c++-mode (M-x c++-mode)

> 3. Mark a large region of that file (say l.175 to the end)

> 4. Comment out that region (M-;)

> 5. Meditate in the church of Emacs for a few minutes

> If c++-mode is disabled (using fundamental-mode for instance)
> commenting that same region is instantaneous. After running some
> profiling it seems most of the CPU resources is spent in
> c-syntactic-skip-backward.

Thanks for taking the trouble to report this bug.

Commenting out 2300 lines in fundamental mode is going to be
instantaneous.  However, in C++ Mode, the mode has to check whether the
syntax (i.e. whether in a string/comment, and particularly, whether any
< or > characters are template delimiters) has changed, at any buffer
modification.  It also has to check whether any "starting position" of a
(possibly nested) declaration or function has changed.  This all takes
time.

However, it is taking too much time.  The problem is in a routine which
determines a backward search limit n non-literal characters back (where
"literal" means string or comment).  The lines being commented contain
no such non-literals at all, so each backward limit was being determined
to be n characters before line 175.  Thus the time taken was quadratic
with the size of the section to be commented out.

I've amended the two critical parts of the code so that there is an
absolute limit to how far back this backward search limit may be.  The
code is now significantly faster.

The timings for commenting out those lines with the latest version of CC
Mode running on the unreleased Emacs-26.0.91 are:
  o - (without patch) 622.9 seconds;
  o - (with patch) 40.3 seconds.

40 seconds may still feel too long, but we are talking about over 2000
lines.

> The same slowness occurs with Emacs 25.1. I couldn't try older versions
> due to temacs Segfault compiling issue (unresolved by disabling
> randomize_va_space).

I've adapted the patch to apply to Emacs 25.3.  Would you please try
applying it, recompiling the two changed files (which are in
.../emacs/lisp/progmodes/) and let me know whether the result is
satisfactory.  (If you want any help applying the patch or byte
compiling the lisp files, feel free to ask me by private mail.)

> In GNU Emacs 25.3.1 (x86_64-redhat-linux-gnu, GTK+ Version 3.22.19)
>  of 2017-09-15 built on buildvm-31.phx2.fedoraproject.org

Here's the patch:


--- cc-engine.el~ 2018-03-11 10:11:17.488821748 +0000
+++ cc-engine.el 2018-03-11 09:59:50.592847587 +0000
@@ -4677,10 +4677,10 @@
       (t 'c))) ; Assuming the range is valid.
     range))
 
-(defsubst c-determine-limit-get-base (start try-size)
+(defsubst c-determine-limit-get-base (start try-size &optional abs-limit)
   ;; Get a "safe place" approximately TRY-SIZE characters before START.
-  ;; This doesn't preserve point.
-  (let* ((pos (max (- start try-size) (point-min)))
+  ;; This defsubst doesn't preserve point.
+  (let* ((pos (max (- start try-size) (point-min) (or abs-limit 0)))
  (base (c-state-semi-safe-place pos))
  (s (save-restriction
       (widen)
@@ -4688,18 +4688,18 @@
  (cand (if (or (nth 4 s) (nth 3 s)) ; comment or string
    (nth 8 s)
  pos)))
-    (if (>= cand (point-min))
+    (if (>= cand (max (point-min) (or abs-limit 0)))
  cand
       (parse-partial-sexp pos start nil nil s 'syntax-table)
       (point))))
 
-(defun c-determine-limit (how-far-back &optional start try-size)
+(defun c-determine-limit (how-far-back &optional abs-limit start try-size)
   ;; Return a buffer position HOW-FAR-BACK non-literal characters from
   ;; START (default point).  The starting position, either point or
   ;; START may not be in a comment or string.
   ;;
-  ;; The position found will not be before POINT-MIN and won't be in a
-  ;; literal.
+  ;; The position found will not be before POINT-MIN, won't be before
+  ;; ABS-LIMIT, and won't be in a literal.
   ;;
   ;; We start searching for the sought position TRY-SIZE (default
   ;; twice HOW-FAR-BACK) bytes back from START.
@@ -4708,7 +4708,7 @@
   (save-excursion
     (let* ((start (or start (point)))
    (try-size (or try-size (* 2 how-far-back)))
-   (base (c-determine-limit-get-base start try-size))
+   (base (c-determine-limit-get-base start try-size abs-limit))
    (pos base)
 
    (s (parse-partial-sexp pos pos)) ; null state.
@@ -4760,10 +4760,11 @@
  (+ (car elt) (- count how-far-back)))
        ((eq base (point-min))
  (point-min))
-       ((> base (- start try-size)) ; Can only happen if we hit point-min.
- (car elt))
+       ((> base (- start try-size)) ; Can only happen if we hit
+    ; point-min or ABS-LIMIT.
+ (or (car elt) base))
        (t
- (c-determine-limit (- how-far-back count) base try-size))))))
+ (c-determine-limit (- how-far-back count) abs-limit base try-size))))))
 
 (defun c-determine-+ve-limit (how-far &optional start-pos)
   ;; Return a buffer position about HOW-FAR non-literal characters forward
@@ -5621,7 +5622,7 @@
       ;; Locate the earliest < after the barrier before the changed region,
       ;; which isn't already marked as a paren.
       (goto-char  (if beg-lit-limits (car beg-lit-limits) beg))
-      (setq beg-limit (c-determine-limit 512))
+      (setq beg-limit (c-determine-limit 512 (- (point) 1024)))
 
       ;; Remove the syntax-table/category properties from each pertinent <...>
       ;; pair.  Firstly, the ones with the < before beg and > after beg....
--- cc-mode.el~ 2018-03-11 10:12:03.419820020 +0000
+++ cc-mode.el 2018-03-11 10:02:01.447842665 +0000
@@ -1195,7 +1195,7 @@
     (goto-char (c-point 'bol new-pos))
     (when lit-limits ; Comment or string.
       (goto-char (car lit-limits)))
-    (setq bod-lim (c-determine-limit 500))
+    (setq bod-lim (c-determine-limit 500 (- (point) 1000)))
 
     (while
  ;; Go to a less nested declaration each time round this loop.



--
Alan Mackenzie (Nuremberg, Germany).




Reply | Threaded
Open this post in threaded view
|

bug#30735: (no subject)

Nil Geisweiller
In reply to this post by Nil Geisweiller
Thank you soooooooo much for taking the trouble to fix it! I tried
myself but I'm very far behind.

Unfortunately, I couldn't apply the patch completely because I couldn't
get my hand on the source code of 25.3.1, only 25.3. I did patch as far
as possible, the result is that it comments out really fast but the
syntax highlighting is wrong for the last few lines.

So I'm sorry I can't give you feedback to test your fix on 25.3.1. That
is said, I'm perfectly happy if you directly push your fix on the
master, I'll use HEAD instead of 25.3.1.



Reply | Threaded
Open this post in threaded view
|

bug#30735: 25.3; slow comment c++-mode

Noam Postavsky
In reply to this post by Nil Geisweiller
On Sun, Mar 11, 2018 at 4:15 PM, Nil Geisweiller
<[hidden email]> wrote:

> Unfortunately, I couldn't apply the patch completely because I couldn't get
> my hand on the source code of 25.3.1, only 25.3.

25.3 is the same as 25.3.1, the ".1" just refers to the build number
(i.e., how many times you've built that version of Emacs).

That said, it seems that the patch is against emacs-26 or master, not
25.3. In particular, the first hunk:

@@ -4677,10 +4677,10 @@
           (t 'c)))            ; Assuming the range is valid.
     range))

-(defsubst c-determine-limit-get-base (start try-size)
+(defsubst c-determine-limit-get-base (start try-size &optional abs-limit)
   ;; Get a "safe place" approximately TRY-SIZE characters before START.
-  ;; This doesn't preserve point.
-  (let* ((pos (max (- start try-size) (point-min)))
+  ;; This defsubst doesn't preserve point.
+  (let* ((pos (max (- start try-size) (point-min) (or abs-limit 0)))
      (base (c-state-semi-safe-place pos))
      (s (save-restriction
           (widen)

In 25.3, the end context would look like this:

     (base (c-state-semi-safe-place pos))
     (s (parse-partial-sexp base pos)))
    (if (or (nth 4 s) (nth 3 s))    ; comment or string



Reply | Threaded
Open this post in threaded view
|

bug#30735: (no subject)

Nil Geisweiller
In reply to this post by Nil Geisweiller
With a bit of tweaking I managed to apply the entire patch to HEAD and
it works!!!

I get a similar result as yours, 40s with the patch, 10m without. A
massive speed increase, and IMO completely acceptable given that
commenting out that many lines is rather rare.

Thanks again!



Reply | Threaded
Open this post in threaded view
|

bug#30735: (no subject)

Alan Mackenzie
In reply to this post by Nil Geisweiller
Bug (excessively slow comment-region in CC-Mode) fixed by these commits:

eb0d10d567af76967d8e738e51a79ef4998470b7
424103a6e351a6d2d8b94f86998c90fdf6afea27
b393ecf8e288f1e1b6a8ac55006715fa1046a5d4

.

--
Alan Mackenzie (Nuremberg, Germany).