bug#33887: 26.1; Emacs hangs for several seconds when going to the end of an XML file in nXML mode

Vincent Lefevre-10

When I open a large XML file and immediately go to the end of the
file with '<ESC> >', Emacs hangs for several seconds. For instance,
on /usr/share/xml/iso-codes/iso_639-3.xml from iso-codes in Debian
(a 1-MB file), it takes 5 seconds. On a 4-MB personal XML file, it
takes 15 seconds.

This is a regression: Emacs 25 did not hang at all.
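
A rough way to put a number on the delay without an interactive session is sketched below. It assumes that most of the pause comes from syntax propertization of the whole buffer, so it only approximates what `<ESC> >` triggers through redisplay. The helper name `my-time-xml-propertize` is hypothetical and not part of Emacs.

```elisp
(require 'benchmark)
(require 'nxml-mode)

(defun my-time-xml-propertize (file)
  "Return the seconds spent propertizing FILE up to its end in `nxml-mode'.
Hypothetical helper for illustration only."
  (with-temp-buffer
    (insert-file-contents file)
    (nxml-mode)
    ;; `benchmark-run' returns (ELAPSED GC-COUNT GC-ELAPSED);
    ;; keep only the elapsed wall-clock time.
    (car (benchmark-run 1
           (syntax-propertize (point-max))))))
```

For example, `(my-time-xml-propertize "/usr/share/xml/iso-codes/iso_639-3.xml")` could be evaluated in both Emacs versions to compare them.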


In GNU Emacs 26.1 (build 2, x86_64-pc-linux-gnu, GTK+ Version 3.24.2)
 of 2018-12-26, modified by Debian built on x86-ubc-01
Windowing system distributor 'The X.Org Foundation', version 11.0.12003000
System Description: Debian GNU/Linux buster/sid

Recent messages:
Loading /etc/emacs/site-start.d/50latex-cjk-common.el (source)...done
Loading /etc/emacs/site-start.d/50latex-cjk-thai.el (source)...done
Loading /etc/emacs/site-start.d/50maxima-emacs.el (source)...done
Loading /etc/emacs/site-start.d/50psvn.el (source)...done
Loading /etc/emacs/site-start.d/50python-docutils.el (source)...done
Loading /etc/emacs/site-start.d/50texlive-lang-english.el (source)...done
Loading /etc/emacs/site-start.d/50why3.el (source)...done
Loading /home/vinc17/share/emacs/site-lisp/mutteditor.el (source)...done
Loading time...done
For information about GNU Emacs and the GNU system, type C-h C-a.

Configured using:
 'configure --build x86_64-linux-gnu --prefix=/usr
 --sharedstatedir=/var/lib --libexecdir=/usr/lib
 --localstatedir=/var/lib --infodir=/usr/share/info
 --mandir=/usr/share/man --enable-libsystemd --with-pop=yes
 --enable-locallisppath=/etc/emacs:/usr/local/share/emacs/26.1/site-lisp:/usr/local/share/emacs/site-lisp:/usr/share/emacs/26.1/site-lisp:/usr/share/emacs/site-lisp
 --with-sound=alsa --without-gconf --with-mailutils --build
 x86_64-linux-gnu --prefix=/usr --sharedstatedir=/var/lib
 --libexecdir=/usr/lib --localstatedir=/var/lib
 --infodir=/usr/share/info --mandir=/usr/share/man --enable-libsystemd
 --with-pop=yes
 --enable-locallisppath=/etc/emacs:/usr/local/share/emacs/26.1/site-lisp:/usr/local/share/emacs/site-lisp:/usr/share/emacs/26.1/site-lisp:/usr/share/emacs/site-lisp
 --with-sound=alsa --without-gconf --with-mailutils --with-x=yes
 --with-x-toolkit=gtk3 --with-toolkit-scroll-bars 'CFLAGS=-g -O2
 -fdebug-prefix-map=/build/emacs-3ThesY/emacs-26.1+1=.
 -fstack-protector-strong -Wformat -Werror=format-security -Wall'
 'CPPFLAGS=-Wdate-time -D_FORTIFY_SOURCE=2' LDFLAGS=-Wl,-z,relro'

Configured features:
XPM JPEG TIFF GIF PNG RSVG IMAGEMAGICK SOUND GPM DBUS GSETTINGS NOTIFY
ACL LIBSELINUX GNUTLS LIBXML2 FREETYPE M17N_FLT LIBOTF XFT ZLIB
TOOLKIT_SCROLL_BARS GTK3 X11 THREADS LIBSYSTEMD LCMS2

Important settings:
  value of $LC_COLLATE: POSIX
  value of $LC_CTYPE: en_US.UTF-8
  value of $LC_TIME: en_DK
  value of $LANG: POSIX
  locale-coding-system: utf-8-unix

Major mode: Lisp Interaction

Minor modes in effect:
  display-time-mode: t
  show-paren-mode: t
  tooltip-mode: t
  global-eldoc-mode: t
  eldoc-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  column-number-mode: t
  line-number-mode: t
  transient-mark-mode: t

Load-path shadows:
/usr/share/emacs/site-lisp/llvm-3.5/tablegen-mode hides /usr/share/emacs/site-lisp/llvm-3.6/tablegen-mode
/usr/share/emacs/site-lisp/llvm-3.5/llvm-mode hides /usr/share/emacs/site-lisp/llvm-3.6/llvm-mode
/usr/share/emacs/site-lisp/llvm-3.5/emacs hides /usr/share/emacs/site-lisp/llvm-3.6/emacs
/usr/share/emacs/site-lisp/llvm-3.5/tablegen-mode hides /usr/share/emacs/site-lisp/llvm-3.7/tablegen-mode
/usr/share/emacs/site-lisp/llvm-3.5/llvm-mode hides /usr/share/emacs/site-lisp/llvm-3.7/llvm-mode
/usr/share/emacs/site-lisp/llvm-3.5/emacs hides /usr/share/emacs/site-lisp/llvm-3.7/emacs
/usr/share/emacs/site-lisp/llvm-3.5/tablegen-mode hides /usr/share/emacs/site-lisp/llvm-3.8/tablegen-mode
/usr/share/emacs/site-lisp/llvm-3.5/llvm-mode hides /usr/share/emacs/site-lisp/llvm-3.8/llvm-mode
/usr/share/emacs/site-lisp/llvm-3.5/emacs hides /usr/share/emacs/site-lisp/llvm-3.8/emacs
/usr/share/emacs/site-lisp/llvm-3.5/tablegen-mode hides /usr/share/emacs/site-lisp/llvm-3.9/tablegen-mode
/usr/share/emacs/site-lisp/llvm-3.5/llvm-mode hides /usr/share/emacs/site-lisp/llvm-3.9/llvm-mode
/usr/share/emacs/site-lisp/llvm-3.5/emacs hides /usr/share/emacs/site-lisp/llvm-3.9/emacs
/usr/share/emacs/site-lisp/llvm-3.5/tablegen-mode hides /usr/share/emacs/site-lisp/llvm-4.0/tablegen-mode
/usr/share/emacs/site-lisp/llvm-3.5/llvm-mode hides /usr/share/emacs/site-lisp/llvm-4.0/llvm-mode
/usr/share/emacs/site-lisp/llvm-3.5/emacs hides /usr/share/emacs/site-lisp/llvm-4.0/emacs
/usr/share/emacs/site-lisp/rst hides /usr/share/emacs/26.1/lisp/textmodes/rst
/usr/share/emacs/site-lisp/latex-cjk-thai/thai-word hides /usr/share/emacs/26.1/lisp/language/thai-word

Features:
(shadow sort mail-extr warnings emacsbug message rmc puny seq byte-opt
gv bytecomp byte-compile cconv dired dired-loaddefs format-spec rfc822
mml easymenu mml-sec password-cache epa derived epg epg-config gnus-util
rmail rmail-loaddefs mm-decode mm-bodies mm-encode mail-parse rfc2231
mailabbrev gmm-utils mailheader sendmail rfc2047 rfc2045 ietf-drums
mm-util mail-prsvr mail-utils elec-pair time cus-start cus-load paren
cc-styles cc-align cc-engine cc-vars cc-defs edmacro kmacro cl-loaddefs
cl-lib time-date mule-util tooltip eldoc electric uniquify ediff-hook
vc-hooks lisp-float-type mwheel term/x-win x-win term/common-win x-dnd
tool-bar dnd fontset image regexp-opt fringe tabulated-list replace
newcomment text-mode elisp-mode lisp-mode prog-mode register page
menu-bar rfn-eshadow isearch timer select scroll-bar mouse jit-lock
font-lock syntax facemenu font-core term/tty-colors frame cl-generic
cham georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao
korean japanese eucjp-ms cp51932 hebrew greek romanian slovak czech
european ethiopic indian cyrillic chinese composite charscript charprop
case-table epa-hook jka-cmpr-hook help simple abbrev obarray minibuffer
cl-preloaded nadvice loaddefs button faces cus-face macroexp files
text-properties overlay sha1 md5 base64 format env code-pages mule
custom widget hashtable-print-readable backquote dbusbind inotify lcms2
dynamic-setting system-font-setting font-render-setting move-toolbar gtk
x-toolkit x multi-tty make-network-process emacs)

Memory information:
((conses 16 118562 10618)
 (symbols 48 23199 1)
 (miscs 40 54 133)
 (strings 32 34944 2101)
 (string-bytes 1 946046)
 (vectors 16 15937)
 (vector-slots 8 510844 4784)
 (floats 8 56 97)
 (intervals 56 279 0)
 (buffers 992 12))



Eli Zaretskii
> From: Vincent Lefevre <[hidden email]>
> Date: Thu, 27 Dec 2018 11:13:06 +0100
>
> When I open a large XML file and immediately go to the end of the
> file with '<ESC> >', Emacs hangs for several seconds. For instance,
> on /usr/share/xml/iso-codes/iso_639-3.xml from iso-codes in Debian
> (a 1-MB file), it takes 5 seconds. On a 4-MB personal XML file, it
> takes 15 seconds.
>
> This is a regression: Emacs 25 did not hang at all.

Confirmed, thanks.

The profile (see below) blames syntax-ppss called by
sgml-syntax-propertize, so I suspect commit 0055190, which added
sgml-syntax-propertize-inside to sgml-syntax-propertize.

CC'ing Stefan who made those changes.

Here's the profile:

  - command-execute                                                 532  77%
   - call-interactively                                             532  77%
    - funcall-interactively                                         522  75%
     - end-of-buffer                                                500  72%
      - recenter                                                    496  71%
       - jit-lock-function                                          496  71%
        - jit-lock-fontify-now                                      496  71%
         - jit-lock--run-functions                                  496  71%
          - run-hook-wrapped                                        496  71%
           - #<compiled 0x200000000b3a7fd0>                         496  71%
            - font-lock-fontify-region                              496  71%
             - font-lock-default-fontify-region                     496  71%
              - nxml-extend-region                                  496  71%
               - skip-syntax-forward                                496  71%
                - internal--syntax-propertize                       496  71%
                 - syntax-propertize                                496  71%
                  - sgml-syntax-propertize                          490  71%
                     syntax-ppss                                    445  64%
        push-mark                                                     1   0%
     - find-file                                                     20   2%
      - find-file-noselect                                           20   2%
       - find-file-noselect-1                                        19   2%
        - after-find-file                                            17   2%
         - normal-mode                                               17   2%
          - set-auto-mode                                            17   2%
           - set-auto-mode-0                                         17   2%
            - xml-mode                                               17   2%
             - byte-code                                             14   2%
              - require                                              12   1%
               - byte-code                                           11   1%
                - require                                            10   1%
                 - byte-code                                          9   1%
                  - require                                           6   0%
                   - byte-code                                        6   0%
                    - cl-generic-define-method                        4   0%
                     - cl--generic-make-function                      4   0%
                      - cl--generic-make-next-function                  4   0%
                       - cl--generic-get-dispatcher                   4   0%
                        - byte-compile                                3   0%
                           byte-code                                  1   0%
                         - #<compiled 0x200000000b325048>                  1   0%
                            byte-compile-top-level                    1   0%
                  - custom-declare-variable                           1   0%
                   - custom-initialize-reset                          1   0%
                    - eval                                            1   0%
                     - funcall                                        1   0%
                      - #<compiled 0x200000000b3c88b8>                  1   0%
                       - executable-find                              1   0%
                          locate-file                                 1   0%
                 file-truename                                        1   0%
             - rng-nxml-mode-init                                     2   0%
              - rng-validate-mode                                     2   0%
               - rng-auto-set-schema                                  2   0%
                - rng-locate-schema-file                              2   0%
                 - rng-locate-schema-file-using                       2   0%
                  - rng-get-parsed-schema-locating-file                  2   0%
                   - rng-parse-schema-locating-file                   1   0%
                    - rng-parse-validate-file                         1   0%
                     - nxml-parse-instance                            1   0%
                        nxml-parse-instance-1                         1   0%
             - file-truename                                          1   0%
              - file-truename                                         1   0%
               - file-truename                                        1   0%
                  file-truename                                       1   0%
        - insert-file-contents                                        1   0%
           xml-find-file-coding-system                                1   0%
     - execute-extended-command                                       1   0%
      - sit-for                                                       1   0%
         redisplay                                                    1   0%
     - minibuffer-complete                                            1   0%
      - completion-in-region                                          1   0%
       - completion--in-region                                        1   0%
        - #<compiled 0x2000000001b04c20>                              1   0%
         - apply                                                      1   0%
          - #<compiled 0x20000000013baac8>                            1   0%
           - completion--in-region-1                                  1   0%
            - completion--do-completion                               1   0%
             - completion-try-completion                              1   0%
              - completion--nth-completion                            1   0%
               - completion--some                                     1   0%
                - #<compiled 0x2000000001b0bd20>                      1   0%
                 - completion-basic-try-completion                    1   0%
                  - try-completion                                    1   0%
                     completion-file-name-table                       1   0%
    - byte-code                                                      10   1%
     - read-extended-command                                          9   1%
      - completing-read                                               9   1%
       - completing-read-default                                      9   1%
          read-from-minibuffer                                        9   1%
     - find-file-read-args                                            1   0%
      - read-file-name                                                1   0%
       - read-file-name-default                                       1   0%
        - completing-read                                             1   0%
         - completing-read-default                                    1   0%
          - read-from-minibuffer                                      1   0%
           - redisplay_internal (C function)                          1   0%
              find-image                                              1   0%
  - ...                                                             158  22%
     Automatic GC                                                   156  22%
   - macroexp--all-forms                                              1   0%
    - macroexp--expand-all                                            1   0%
     - #<compiled 0x2000000001375130>                                 1   0%
      - macroexp--all-forms                                           1   0%
       - macroexp--expand-all                                         1   0%
        - macroexp--all-forms                                         1   0%
         - macroexp--expand-all                                       1   0%
          - #<compiled 0x2000000001375130>                            1   0%
           - macroexp--all-forms                                      1   0%
            - macroexp--expand-all                                    1   0%
             - #<compiled 0x2000000001375068>                         1   0%
              - macroexp--all-forms                                   1   0%
               - macroexp--expand-all                                 1   0%
                - macroexp-macroexpand                                1   0%
                 - macroexpand                                        1   0%
                    #<compiled 0x20000000013f0600>                    1   0%
   - rng-compute-start-tag-open-deriv                                 1   0%
    - rng-element-get-child                                           1   0%
     - rng-compile                                                    1   0%
      - apply                                                         1   0%
       - rng-compile-group                                            1   0%
        - mapcar                                                      1   0%
         - rng-compile                                                1   0%
          - apply                                                     1   0%
           - rng-compile-attribute                                    1   0%
            - rng-compile                                             1   0%
             - apply                                                  1   0%
              - rng-compile-ref                                       1   0%
               - rng-compile                                          1   0%
                - apply                                               1   0%
                 - rng-compile-data                                   1   0%
                    rng-compile-dt                                    1   0%



Stefan Monnier
>> When I open a large XML file and immediately go to the end of the
>> file with '<ESC> >', Emacs hangs for several seconds. For instance,
>> on /usr/share/xml/iso-codes/iso_639-3.xml from iso-codes in Debian
>> (a 1-MB file), it takes 5 seconds. On a 4-MB personal XML file, it
>> takes 15 seconds.
>>
>> This is a regression: Emacs 25 did not hang at all.
>
> Confirmed, thanks.
>
> The profile (see below) blames syntax-ppss called by
> sgml-syntax-propertize, so I suspect commit 0055190, which added
> sgml-syntax-propertize-inside to sgml-syntax-propertize.

Sounds right, but I'm not sure what to do about this.
I do wonder why so much time is spent in syntax-ppss, which is
generally expected to be relatively fast.
Maybe sgml-syntax-propertize is called too often (I see it's mostly
called from skip-syntax-forward; maybe we should call syntax-propertize
explicitly beforehand with a more distant position so
sgml-syntax-propertize is called just once).
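
A minimal sketch of that idea, under the assumption that propertizing one larger chunk ahead of point is cheaper than many small incremental calls. The helper name and the LOOKAHEAD distance are hypothetical; this is not what nxml-extend-region currently does.

```elisp
(defun my-skip-syntax-forward-prepropertized (syntax lookahead)
  "Propertize LOOKAHEAD characters past point at once, then skip SYNTAX.
Hypothetical helper; LOOKAHEAD is an arbitrary illustrative distance.
Return the distance traveled, like `skip-syntax-forward'."
  ;; One explicit call covering a distant position, instead of letting
  ;; `skip-syntax-forward' request propertization piecemeal.
  (syntax-propertize (min (point-max) (+ (point) lookahead)))
  (skip-syntax-forward syntax))
```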


        Stefan





Eli Zaretskii
> From: Stefan Monnier <[hidden email]>
> Cc: Vincent Lefevre <[hidden email]>, [hidden email]
> Date: Thu, 27 Dec 2018 11:39:06 -0500
>
> > The profile (see below) blames syntax-ppss called by
> > sgml-syntax-propertize, so I suspect commit 0055190, which added
> > sgml-syntax-propertize-inside to sgml-syntax-propertize.
>
> Sounds right, but I'm not sure what to do about this.
> I do wonder why so much time is spent in syntax-ppss, which is
> generally expected to be relatively fast.

Why was sgml-syntax-propertize-inside added?  Is its effect an
absolute must, or merely a nice-to-have feature?  If the latter,
perhaps a defcustom that could disable that call will be an okay
solution, at least as a stopgap?
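
Such a stopgap might look roughly like the sketch below. The option name and the wrapper are hypothetical, and wiring the wrapper in as advice is only one possible way to do it.

```elisp
(defcustom my-sgml-propertize-inside-enabled t
  "Non-nil means let `sgml-syntax-propertize-inside' do its work.
Hypothetical option for illustration only."
  :type 'boolean
  :group 'sgml)

(defun my-sgml-maybe-propertize-inside (fn &rest args)
  "Call FN with ARGS unless the hypothetical option above is nil."
  (when my-sgml-propertize-inside-enabled
    (apply fn args)))
```

It could then be installed with `(advice-add 'sgml-syntax-propertize-inside :around #'my-sgml-maybe-propertize-inside)`.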



Stefan Monnier
> Why was sgml-syntax-propertize-inside added?  Is its effect an
> absolute must, or merely a nice-to-have feature?

It's needed for correctness in the presence of <?...?> or <![CDATA[...]]>

> If the latter, perhaps a defcustom that could disable that call will
> be an okay solution, at least as a stopgap?

I don't think it should be terribly expensive, so I'd rather first try
and better understand the performance issue.


        Stefan



Eli Zaretskii
> From: Stefan Monnier <[hidden email]>
> Cc: [hidden email], [hidden email]
> Date: Thu, 27 Dec 2018 12:32:21 -0500
>
> > If the latter, perhaps a defcustom that could disable that call will
> > be an okay solution, at least as a stopgap?
>
> I don't think it should be terribly expensive, so I'd rather first try
> and better understand the performance issue,

Sure.  I thought you already did ;-)



Vincent Lefevre-10
In reply to this post by Stefan Monnier
On 2018-12-27 12:32:21 -0500, Stefan Monnier wrote:
> > Why was sgml-syntax-propertize-inside added?  Is its effect an
> > absolute must, or merely a nice-to-have feature?
>
> It's needed for correctness in the presence of <?...?> or <![CDATA[...]]>

I use both in some of my XML files and I have never found any issue
with them. Or perhaps this is just for particular cases?

--
Vincent Lefèvre <[hidden email]> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)



Stefan Monnier
>> > Why was sgml-syntax-propertize-inside added?  Is its effect an
>> > absolute must, or merely a nice-to-have feature?
>> It's needed for correctness in the presence of <?...?> or <![CDATA[...]]>
> I use both in some of my XML files and I have never found any issue
> with them. Or perhaps this is just for particular cases?

Yes, it only makes a real difference when the content of those things
ends up confusing the parser (e.g. it looks like an unclosed tag, or
things along these lines).
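
To check a particular case, something like the following sketch could be used. It reports the nesting depth that syntax-ppss sees at the end of a piece of text in sgml-mode; the helper name is hypothetical, and no particular result is claimed here.

```elisp
(require 'sgml-mode)

(defun my-sgml-depth-at-end (text)
  "Return the nesting depth `syntax-ppss' reports at the end of TEXT.
TEXT is inserted into a temporary `sgml-mode' buffer.
Hypothetical helper for illustration only."
  (with-temp-buffer
    (insert text)
    (sgml-mode)
    (syntax-propertize (point-max))
    ;; The first element of the parser state is the depth.
    (car (syntax-ppss (point-max)))))
```

For instance, `(my-sgml-depth-at-end "<a><![CDATA[ <b \" ]]></a>")` exercises a CDATA section whose content looks like an unclosed tag.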


        Stefan



Fernando Jascovich
In reply to this post by Vincent Lefevre-10
Hi everyone, this is my first email to bug-gnu-emacs, so please let me
know if I am making any mistakes.
For no special reason, I picked this bug as a way to get to know Emacs'
code.
After following and confirming the details of the bug, I found that the
performance issue was indeed introduced by commit 0055190174, but not
because of the introduction of `sgml-syntax-propertize-inside`.
The problem is with the last rule:
```
("\"" (0 (if (prog1 (zerop (car (syntax-ppss (match-beginning 0))))
                    (goto-char (match-end 0)))
                  (string-to-syntax "."))))
```
I can't see what this rule actually achieves: I tested XML parsing without
it and it works fine, marking double quotes inside tags as
expected, without the performance issue.
Do we need to target double quotes outside tags explicitly?

--
Fernando Jascovich
developer
m: +54 9 3548 63 9833
github: https://github.com/fernando-jascovich/
linkedin: https://www.linkedin.com/in/fernandojascovich/



Eli Zaretskii
> From: Fernando Jascovich <[hidden email]>
> Date: Tue, 08 Jan 2019 19:11:02 -0300
>
> Hi everyone, this is my first email to bug-gnu-emacs, so please let me
> know if I am making some mistake.
> For no special reason, I took this bug in order to start to know  emacs'
> code.
> Following and confirming the details of the bug, I found that indeed the
> performance issue is introduced at commit 0055190174, but not because of
> the introduction of `sgml-syntax-propertize-inside`.
> The problem is with the last rule:
> ```
> ("\"" (0 (if (prog1 (zerop (car (syntax-ppss (match-beginning 0))))
>                     (goto-char (match-end 0)))
>                   (string-to-syntax "."))))
> ```
> I can't see the real effect of this rule, I tested xml parsing without
> this rule and it works fine, marking double quotes inside tags as
> expected without this performance issue.
> Do we need to target double quotes outside tags explicitly?

Stefan, any comments?



Stefan Monnier
In reply to this post by Eli Zaretskii
> The profile (see below) blames syntax-ppss called by
> sgml-syntax-propertize, so I suspect commit 0055190, which added
> sgml-syntax-propertize-inside to sgml-syntax-propertize.

Hmm... actually, the syntax-ppss calls that take time are made directly
from within sgml-syntax-propertize rather than from within
sgml-syntax-propertize-inside, which doesn't even appear in your profile
(in my profile I get 8099 units of time in sgml-syntax-propertize, of
which 7611 are in syntax-ppss and only 77 in sgml-syntax-propertize-inside).
The problem seems to come from the following syntax propertize rule:

     ;; Double quotes outside of tags should not introduce strings.
     ;; Be careful to call `syntax-ppss' on a position before the one we're
     ;; going to change, so as not to need to flush the data we just computed.
     ("\"" (0 (if (prog1 (zerop (car (syntax-ppss (match-beginning 0))))
                    (goto-char (match-end 0)))
                  (string-to-syntax "."))))

If I comment it out, the delay is *much* smaller.

The problem is that " characters are quite common in XML files, so
the regexp matches often and every match triggers a call to syntax-ppss,
so we end up calling syntax-ppss very often.

I'm trying to figure out how to avoid calling syntax-ppss for every
" character.  I'm thinking of looking at pairs of " chars and only doing
extra work if there's a < or > between the two.
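
As a rough sketch of that idea (not the patch that was eventually pushed),
the following Emacs Lisp counts how many double-quote matches could be
skipped and how many would still need the expensive check.  The helper name
`demo-classify-double-quotes' and the regexp are assumptions made for
illustration; nothing here is taken from sgml-mode.el.

```elisp
;; Illustration only: `demo-classify-double-quotes' is a hypothetical helper,
;; not part of sgml-mode.el.
(defun demo-classify-double-quotes (text)
  "Return (CHEAP . COSTLY) counts of double-quote matches in TEXT.
A match is CHEAP when the opening quote reaches a closing quote with no
< or > in between; otherwise it is COSTLY and would need `syntax-ppss'."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (let ((cheap 0) (costly 0))
      ;; Match a quote, then everything up to the next quote, < or >.
      (while (re-search-forward "\"[^\"<>]*[<>\"]" nil t)
        (if (eq (char-before) ?\")
            (setq cheap (1+ cheap))     ; plain pair of quotes: leave alone
          (setq costly (1+ costly))))   ; hit < or > first: needs a closer look
      (cons cheap costly))))

;; Two well-formed pairs and one quote that runs into a tag:
;; (demo-classify-double-quotes "<a b=\"c\">\"d\"</a>\"e<")  ; => (2 . 1)
```

In typical XML most quotes come in plain pairs (attribute values), so under
this scheme only a small share of matches would reach syntax-ppss.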


        Stefan



Stefan Monnier
In reply to this post by Eli Zaretskii
>> From: Fernando Jascovich <[hidden email]>
>> Date: Tue, 08 Jan 2019 19:11:02 -0300
>>
>> Hi everyone, this is my first email to bug-gnu-emacs, so please let me
>> know if I am making some mistake.
>> For no special reason, I took this bug in order to start to know  emacs'
>> code.
>> Following and confirming the details of the bug, I found that indeed the
>> performance issue is introduced at commit 0055190174, but not because of
>> the introduction of `sgml-syntax-propertize-inside`.
>> The problem is with the last rule:
>> ```
>> ("\"" (0 (if (prog1 (zerop (car (syntax-ppss (match-beginning 0))))
>>                     (goto-char (match-end 0)))
>>                   (string-to-syntax "."))))
>> ```
>> I can't see the real effect of this rule, I tested xml parsing without
>> this rule and it works fine, marking double quotes inside tags as
>> expected without this performance issue.
>> Do we need to target double quotes outside tags explicitly?
>
> Stefan, any comments?

Yes, he's exactly right.

I just pushed a patch to master which should significantly reduce
this delay.


        Stefan



Noam Postavsky
In reply to this post by Vincent Lefevre-10
Vincent Lefevre <[hidden email]> writes:

> This is a regression: Emacs 25 did not hang at all.

Should we backport Stefan's fix to emacs-26?  Or specifically, backport
[1: e7e92dc5d2], which is Stefan's fix on top of my fix for the
loss-of-single-quote-fontification bug (Bug#35381).

[1: e7e92dc5d2]: 2019-05-15 19:04:14 -0400
  Fix merge of sgml-syntax-propertize-rules
  https://git.savannah.gnu.org/cgit/emacs.git/commit/?id=e7e92dc5d24ac3bcde69732bab6a6c3c0d9de97b



Vincent Lefevre-10
Hi,

On 2019-05-15 19:53:08 -0400, Noam Postavsky wrote:

> Vincent Lefevre <[hidden email]> writes:
>
> > This is a regression: Emacs 25 did not hang at all.
>
> Should we backport Stefan's fix to emacs-26?  Or specifically, backport
> [1: e7e92dc5d2], which is Stefan's fix on top of my fix for the
> loss-of-single-quote-fontification bug (Bug#35381).
>
> [1: e7e92dc5d2]: 2019-05-15 19:04:14 -0400
>   Fix merge of sgml-syntax-propertize-rules
>   https://git.savannah.gnu.org/cgit/emacs.git/commit/?id=e7e92dc5d24ac3bcde69732bab6a6c3c0d9de97b

It would be nice if this could be fixed quickly in emacs-26,
hoping that it could be fixed in Debian before the next stable
release.

(I'm still using Emacs 25 because of this bug.)

--
Vincent Lefèvre <[hidden email]> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)



Noam Postavsky
In reply to this post by Noam Postavsky
Noam Postavsky <[hidden email]> writes:

> [1: e7e92dc5d2]: 2019-05-15 19:04:14 -0400
>   Fix merge of sgml-syntax-propertize-rules
>   https://git.savannah.gnu.org/cgit/emacs.git/commit/?id=e7e92dc5d24ac3bcde69732bab6a6c3c0d9de97b

Uh, I goofed that one, Stefan fixed it [2: 9a74e5666b].  The corrected patch would be as follows:



[2: 9a74e5666b]: 2019-05-15 22:21:36 -0400
  * lisp/textmodes/sgml-mode.el (sgml-syntax-propertize-rules): Fix typo
  https://git.savannah.gnu.org/cgit/emacs.git/commit/?id=9a74e5666b022098c63d0047c0df90c66e1aa64a

0001-Backport-sgml-syntax-propertize-rules-speedup-Bug-33.patch (3K) Download Attachment

Eli Zaretskii
In reply to this post by Noam Postavsky
> From: Noam Postavsky <[hidden email]>
> Date: Wed, 15 May 2019 19:53:08 -0400
> Cc: [hidden email]
>
> Vincent Lefevre <[hidden email]> writes:
>
> > This is a regression: Emacs 25 did not hang at all.
>
> Should we backport Stefan's fix to emacs-26?  Or specifically, backport
> [1: e7e92dc5d2], which is Stefan's fix on top of my fix for the
> loss-of-single-quote-fontification bug (Bug#35381).
>
> [1: e7e92dc5d2]: 2019-05-15 19:04:14 -0400
>   Fix merge of sgml-syntax-propertize-rules
>   https://git.savannah.gnu.org/cgit/emacs.git/commit/?id=e7e92dc5d24ac3bcde69732bab6a6c3c0d9de97b

I'd like to leave this fix on master for a while, so that we could
make sure it has no adverse consequences.  Can we revisit this in a
month's time, say?



Vincent Lefevre-10
In reply to this post by Noam Postavsky
On 2019-05-16 08:15:58 -0400, Noam Postavsky wrote:
> The corrected patch would be as follows:
[...]

I've tried the combination of

  ca14dd1d4628094dd33d5d94694dcf5f29e843b8
  7dab3ee7ab54b3c2e7bc24170376054786c01d6f

and this patch against Debian's current source package.

Emacs no longer hangs, but I get incorrect highlighting,
for instance on the following XML file.

<root>
<!-- comment -->
<a>"a'</a>
<!-- comment -->
</root>

Highlighting starts to be wrong at the single-quote character.
I've attached a screenshot obtained with the -Q option.

Did I miss anything?

--
Vincent Lefèvre <[hidden email]> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

nxml.png (7K) Download Attachment

Noam Postavsky
Vincent Lefevre <[hidden email]> writes:

> I've tried the combination of
>
>   ca14dd1d4628094dd33d5d94694dcf5f29e843b8
>   7dab3ee7ab54b3c2e7bc24170376054786c01d6f
>
> and this patch against Debian's current source package.
>
> Emacs no longer hangs, but I get incorrect highlighting,
> for instance on the following XML file.
>
> <root>
> <!-- comment -->
> <a>"a'</a>
> <!-- comment -->
> </root>
>
> Highlighting starts to be wrong at the single-quote character.
> I've attached a screenshot obtained with the -Q option.
>
> Did I miss anything?
Ah, I didn't get the mixed quote handling right.  Here's the fix for master:


From 4677edd8dd65b5d956732821e78794f35b275418 Mon Sep 17 00:00:00 2001
From: Noam Postavsky <[hidden email]>
Date: Sat, 18 May 2019 00:04:01 -0400
Subject: [PATCH] Fix Bug#33887 for mixed quote usage

* lisp/textmodes/sgml-mode.el (sgml-syntax-propertize-rules): Only
skip syntax-ppss for matched quotes.
* test/lisp/textmodes/sgml-mode-tests.el (sgml-tests--quotes-syntax):
Expand test.
---
 lisp/textmodes/sgml-mode.el            |  4 ++--
 test/lisp/textmodes/sgml-mode-tests.el | 17 ++++++++++++-----
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/lisp/textmodes/sgml-mode.el b/lisp/textmodes/sgml-mode.el
index 1b064fb825..e3cf56aa0e 100644
--- a/lisp/textmodes/sgml-mode.el
+++ b/lisp/textmodes/sgml-mode.el
@@ -345,8 +345,8 @@ sgml-font-lock-keywords
      ;; the resulting number of calls to syntax-ppss made it too slow
      ;; (bug#33887), so we're now careful to leave alone any pair
      ;; of quotes that doesn't hold a < or > char, which is the vast majority.
-     ("\\(?:\\(?1:\"\\)[^\"<>]*[<>\"]\\|\\(?1:'\\)[^'<>]*[<>']\\)"
-      (1 (unless (memq (char-before) '(?\' ?\"))
+     ("\\([\"']\\)[^<>\"']*[<>\"']"
+      (1 (unless (eq (char-after (match-beginning 1)) (char-before))
            ;; Be careful to call `syntax-ppss' on a position before the one
            ;; we're going to change, so as not to need to flush the data we
            ;; just computed.
diff --git a/test/lisp/textmodes/sgml-mode-tests.el b/test/lisp/textmodes/sgml-mode-tests.el
index a900e8dcf2..ffcc2cd840 100644
--- a/test/lisp/textmodes/sgml-mode-tests.el
+++ b/test/lisp/textmodes/sgml-mode-tests.el
@@ -161,11 +161,18 @@ sgml-with-content
       (should (string= "&&" (buffer-string))))))
 
 (ert-deftest sgml-tests--quotes-syntax ()
-  (with-temp-buffer
-    (sgml-mode)
-    (insert "a\"b <tag>c'd</tag>")
-    (should (= 1 (car (syntax-ppss (1- (point-max))))))
-    (should (= 0 (car (syntax-ppss (point-max)))))))
+  (dolist (str '("a\"b <t>c'd</t>"
+                 "a'b <t>c\"d</t>"
+                 "<t>\"a'</t>"
+                 "<t>'a\"</t>"
+                 "<t>\"a'\"</t>"
+                 "<t>'a\"'</t>"))
+   (with-temp-buffer
+     (sgml-mode)
+     (insert str)
+     ;; Check that last tag is parsed as a tag.
+     (should (= 1 (car (syntax-ppss (1- (point-max))))))
+     (should (= 0 (car (syntax-ppss (point-max))))))))
 
 (provide 'sgml-mode-tests)
 ;;; sgml-mode-tests.el ends here
--
2.11.0
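
To see what the regexp change does in isolation, here is a small sketch that
runs the removed and the added regexp on a string and reports whether the
match ends with the same quote that opened it (the case the rule skips).
The `demo-' names are hypothetical helpers for illustration; only the two
regexps are copied from the diff, and the real rule does more work after
matching than is shown here.

```elisp
;; Illustration only: the `demo-' names are hypothetical, not part of
;; sgml-mode.el.  The two regexps are copied from the diff above.
(defconst demo-old-quote-re
  "\\(?:\\(?1:\"\\)[^\"<>]*[<>\"]\\|\\(?1:'\\)[^'<>]*[<>']\\)"
  "Regexp removed by the patch.")

(defconst demo-new-quote-re
  "\\([\"']\\)[^<>\"']*[<>\"']"
  "Regexp added by the patch.")

(defun demo-first-quote-match (regexp text)
  "Return (MATCHED-TEXT . SAME-QUOTE-P) for the first match of REGEXP in TEXT.
SAME-QUOTE-P is non-nil when the match ends with the quote that opened it."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (when (re-search-forward regexp nil t)
      ;; After a successful search, point is at the end of the match,
      ;; so `char-before' is the character that terminated it.
      (cons (match-string 0)
            (eq (char-after (match-beginning 1)) (char-before))))))

;; (demo-first-quote-match demo-old-quote-re "<t>\"a'\"</t>")  ; => ("\"a'\"" . t)
;; (demo-first-quote-match demo-new-quote-re "<t>\"a'\"</t>")  ; => ("\"a'" . nil)
```

With the old regexp the single quote is swallowed inside a "matched" pair of
double quotes, whereas the new one stops at the first quote of either kind,
which fits the commit message's "only skip syntax-ppss for matched quotes".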



And the corresponding patch against emacs-26:


0001-Backport-sgml-syntax-propertize-rules-speedup-Bug-33.patch (3K) Download Attachment

Vincent Lefevre-10
There's still an issue. On the following XML file

<root>
<a>text</a>
<!-- ' -->
<a>text</a>
</root>

the part after the comment <!-- ' --> is highlighted as a comment.

--
Vincent Lefèvre <[hidden email]> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)



Vincent Lefevre-10
On 2019-05-18 16:47:56 +0200, Vincent Lefevre wrote:
> There's still an issue. On the following XML file
>
> <root>
> <a>text</a>
> <!-- ' -->
> <a>text</a>
> </root>
>
> the part after the comment <!-- ' --> is highlighted as a comment.

And on the following XML file too:

<root>
<!DOCTYPE root [
<!ENTITY f SYSTEM "f.xml">
]>
<a>ab'cd</a>
<a>text</a>
</root>

--
Vincent Lefèvre <[hidden email]> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)


