bug#43631: 28.0.50; CC Mode multiline strings grinds performance to a halt


Eli Zaretskii
> From: Theodor Thornhill <[hidden email]>
> Cc: [hidden email]
> Date: Sat, 26 Sep 2020 14:40:36 +0200
>
> Eli Zaretskii <[hidden email]> writes:
>
> > All the profiles posted there end prematurely, thus making it
> > impossible to make independent conclusions regarding the possible
> > culprits.  Would it be possible to post here a full profile,
> > completely expanded, obtained after loading all the relevant *.el
> > files as *.el (NOT *.elc!), so that the profile is detailed enough to
> > show the relevant parts?  It would make the discussion much more
> > focused.
> >
> > Bonus points for posting another profile, where the feature you think
> > is the main culprit is disabled.
> >
> > TIA
>
>
> Attached are two reports, one which is super slow, and one that is fast.

Thanks, but please post the text of Profiler-Report buffer, not the
internal structure it produces.




Emacs - Bugs mailing list
Eli Zaretskii <[hidden email]> writes:

[...]


>>
>> Attached are two reports, one which is super slow, and one that is fast.
>
> Thanks, but please post the text of Profiler-Report buffer, not the
> internal structure it produces.

Ok, third time's the charm:

recipe same as before.

Theodor Thornhill


Attachments: fast-without-multiline.txt (8K), slow-with-multiline.txt (14K)

Eli Zaretskii
> From: Theodor Thornhill <[hidden email]>
> Cc: [hidden email]
> Date: Sat, 26 Sep 2020 18:03:10 +0200
>
> > Thanks, but please post the text of Profiler-Report buffer, not the
> > internal structure it produces.
>
> Ok, third time's the charm:

Thanks.  This seems to indicate that this loop in
c-pps-to-string-delim is the culprit:

    (while (progn
             (parse-partial-sexp (point) end nil nil st-s 'syntax-table)
             (unless (bobp)
               (c-clear-syn-tab (1- (point))))
             (setq st-pos (point))
             (and (< (point) end)
                  (not (eq (char-before) ?\")))))

But I'm confused why the "fast" profile starts with
font-lock-fontify-region, whereas the "slow" profile doesn't have
font-lock-fontify-region anywhere...

Hopefully, Alan can take it from here.




Dmitry Gutov
On 26.09.2020 19:17, Eli Zaretskii wrote:
> But I'm confused why the "fast" profile starts with
> font-lock-fontify-region, whereas the "slow" profile doesn't have
> font-lock-fontify-region anywhere...

Because CC Mode doesn't use syntax-propertize-function (most major modes
do, so we're used to seeing "slow" syntax analysis being done inside
font-lock-fontify-region because it calls s-p-f).

CC Mode applies syntax properties inside before/after-change-functions,
and the "slow" profile reflects that: c-before-change is featured
prominently.
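
To illustrate the difference described above, here is a minimal sketch in Emacs Lisp. It is not CC Mode's code: every `demo-` name is hypothetical, and the two functions only count their calls instead of applying any syntax-table properties. It assumes nothing beyond the standard `syntax-propertize-function` variable, the `syntax-propertize` function, and the `before-change-functions` hook.

```elisp
;; All `demo-' names are hypothetical; this is not CC Mode's code.
(defvar-local demo-lazy-calls 0
  "How often the lazy (syntax-propertize-function style) analysis ran.")
(defvar-local demo-eager-calls 0
  "How often the eager (before-change-functions style) analysis ran.")

(defun demo-lazy-propertize (_start _end)
  "Stand-in for a `syntax-propertize-function'; it only counts calls."
  (setq demo-lazy-calls (1+ demo-lazy-calls)))

(defun demo-eager-before-change (_beg _end)
  "Stand-in for a before-change analysis; it only counts calls."
  (setq demo-eager-calls (1+ demo-eager-calls)))

(defun demo-compare-mechanisms ()
  "Return (EAGER . LAZY) call counts after three edits."
  (with-temp-buffer
    ;; Lazy route: runs only when something asks for syntax properties,
    ;; as font-lock does through `syntax-propertize'.
    (setq-local syntax-propertize-function #'demo-lazy-propertize)
    ;; Eager route: runs before every single buffer modification.
    (add-hook 'before-change-functions #'demo-eager-before-change nil t)
    (insert "int a;\n")
    (insert "int b;\n")
    (insert "int c;\n")
    ;; Simulate one fontification request covering the whole buffer.
    (syntax-propertize (point-max))
    (cons demo-eager-calls demo-lazy-calls)))
```

Calling `(demo-compare-mechanisms)` shows the eager hook firing once per edit, while the lazy function runs only when `syntax-propertize` is requested. That would be consistent with a profile where the cost shows up under the change hooks rather than under font-lock-fontify-region.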




Emacs - Bugs mailing list
In reply to this post by Eli Zaretskii
Eli Zaretskii <[hidden email]> writes:


[...]

>
> Thanks.  This seems to indicate that this loop in
> c-pps-to-string-delim is the culprit:
>
>     (while (progn
>              (parse-partial-sexp (point) end nil nil st-s 'syntax-table)
>              (unless (bobp)
>                (c-clear-syn-tab (1- (point))))
>              (setq st-pos (point))
>              (and (< (point) end)
>                   (not (eq (char-before) ?\")))))
>
> But I'm confused why the "fast" profile starts with
> font-lock-fontify-region, whereas the "slow" profile doesn't have
> font-lock-fontify-region anywhere...
>

Thanks for digging into this. I can add one more thing that I see. When
the multiline-string variable (c-multiline-string-start-char) is set to
some char, typing that character followed by a quote mark inserts only
one quote and fontifies the rest of the buffer as a string.

Example:

 - type @"             // (#") for pike-mode
 - see whole buffer get fontified
 - no extra quote mark is inserted to make a proper pair.

What I would expect:

 - type @"
 - see @""
 - type normally inside quote marks.

I am not sure how this is related, if at all, but found it noticeable
enough to add to this discussion.

Also, if the 'multiline-string' variable is not set, typing @" behaves
as expected: the pair is closed and nothing other than the string is
fontified.

> Hopefully, Alan can take it from here.

Theo




Alan Mackenzie
In reply to this post by Eli Zaretskii
Hello, Theo.

On Sun, Sep 27, 2020 at 13:34:32 +0200, Theodor Thornhill wrote:
> Eli Zaretskii <[hidden email]> writes:

> [...]


> > But I'm confused why the "fast" profile starts with
> > font-lock-fontify-region, whereas the "slow" profile doesn't have
> > font-lock-fontify-region anywhere...

> I see that when I remove 'c-before-change-check-unbalanced-strings from
> 'c-get-state-before-change-functions' the performance degradation
> ceases.  I'm not sure what else is affected by that change, so not sure
> if that can be counted as a fix as far as 'csharp-mode' is concerned.

I would strongly recommend you not to make such a change, at least not
without a good deal of matching changes elsewhere.  ;-)

It seems the bit in c-b-c-check-unbalanced-strings dealing with
multiline strings was written on the assumption that buffers containing
such would be small.

With multiline strings, _any_ change involving quote characters can flip
the string/non-string characterisation from point all the way to the end
of the buffer.  In the worst case, this potentially big region needs to
be analysed and have its syntax-table text properties changed
throughout.

The current problem is that c-b-c-check-u-strings is doing this analysis
for every buffer change.  This was easier to code, but has led to
performance problems on buffers which aren't small.  The solution to
this will have to involve restricting this analysis to when quote marks
or the c-multiline-string-start-char get inserted or removed.  That way,
there should only be an occasional and tolerable delay when one of these
characters is inserted/removed.
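
The restriction described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the eventual CC Mode fix: all `demo-` names are hypothetical, `demo-multiline-start-char` stands in for c-multiline-string-start-char, and `demo-expensive-scan` merely counts calls where the real code would analyse the buffer and adjust syntax-table text properties.

```elisp
;; All `demo-' names are hypothetical; this is not the CC Mode fix.
(defvar demo-multiline-start-char ?@
  "Stand-in for `c-multiline-string-start-char'.")

(defvar-local demo-expensive-scans 0
  "How many times the (placeholder) expensive analysis has run.")

(defun demo-expensive-scan ()
  "Placeholder for the analysis that may reach the end of the buffer."
  (setq demo-expensive-scans (1+ demo-expensive-scans)))

(defun demo-region-has-delimiter-p (beg end)
  "Return non-nil if BEG..END contains a quote or the start char."
  (save-excursion
    (goto-char beg)
    (catch 'found
      (while (< (point) end)
        (when (memq (char-after) (list ?\" demo-multiline-start-char))
          (throw 'found t))
        (forward-char 1))
      nil)))

(defun demo-before-change (beg end)
  "Scan only if the text about to be removed contains a delimiter."
  (when (demo-region-has-delimiter-p beg end)
    (demo-expensive-scan)))

(defun demo-after-change (beg end _old-len)
  "Scan only if the text just inserted contains a delimiter."
  (when (demo-region-has-delimiter-p beg end)
    (demo-expensive-scan)))

(defun demo-install-hooks ()
  "Install both checks buffer-locally in the current buffer."
  (add-hook 'before-change-functions #'demo-before-change nil t)
  (add-hook 'after-change-functions #'demo-after-change nil t))
```

With these hooks installed, ordinary typing never reaches `demo-expensive-scan`. Only inserting or deleting a quote mark or the start character does, which is where the occasional delay would then be paid.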

I'll be looking at this in the coming days.

> Just wanted to let you know.

> Theo

--
Alan Mackenzie (Nuremberg, Germany).