icalendar.el bug fix patch

classic Classic list List threaded Threaded
24 messages Options
12
Reply | Threaded
Open this post in threaded view
|

icalendar.el bug fix patch

Emacs - Dev mailing list
icalendar-export-region does not export multi-line Desc as it is imported by icalendar-import-file.  The following patch fixes the issue.  If acceptable, please commit. Thanks.

diff --git a/lisp/calendar/icalendar.el b/lisp/calendar/icalendar.el
index 1186ced3fb..1f4e582aa5 100644
--- a/lisp/calendar/icalendar.el
+++ b/lisp/calendar/icalendar.el
@@ -1244,7 +1244,7 @@ icalendar--parse-summary-and-rest
                      (concat "\\(" icalendar-import-format-uid "\\)??"))))
  ;; Need the \' regexp in order to detect multi-line items
         (setq s (concat "\\`"
-                        (replace-regexp-in-string "%s" "\\(.*?\\)" s nil t)
+                        (replace-regexp-in-string "%s" "\\([^z-a]*?\\)" s nil t)
                         "\\'"))
         (if (string-match s summary-and-rest)
             (let (cla des loc org sta url uid) ;; sum

Reply | Threaded
Open this post in threaded view
|

Re: icalendar.el bug fix patch

Eli Zaretskii
> Date: Wed, 23 Oct 2019 09:33:52 -0400
> From: Rajeev Narang via "Emacs development discussions." <[hidden email]>
>
> icalendar-export-region does not export multi-line Desc as it is imported by icalendar-import-file.  The following patch fixes the issue.  If acceptable, please commit. Thanks.
>
> diff --git a/lisp/calendar/icalendar.el b/lisp/calendar/icalendar.el
> index 1186ced3fb..1f4e582aa5 100644
> --- a/lisp/calendar/icalendar.el
> +++ b/lisp/calendar/icalendar.el
> @@ -1244,7 +1244,7 @@ icalendar--parse-summary-and-rest
>                       (concat "\\(" icalendar-import-format-uid "\\)??"))))
>   ;; Need the \' regexp in order to detect multi-line items
>          (setq s (concat "\\`"
> -                        (replace-regexp-in-string "%s" "\\(.*?\\)" s nil t)
> +                        (replace-regexp-in-string "%s" "\\([^z-a]*?\\)" s nil t)
>                          "\\'"))

Thanks, but is [^a-z] correct?  Do you want to reject only lower-case
ASCII characters?  What about non-ASCII?

Reply | Threaded
Open this post in threaded view
|

Re: icalendar.el bug fix patch

Mattias Engdegård-2
1 nov. 2019 kl. 10.40 skrev Eli Zaretskii <[hidden email]>:

>> -                        (replace-regexp-in-string "%s" "\\(.*?\\)" s nil t)
>> +                        (replace-regexp-in-string "%s" "\\([^z-a]*?\\)" s nil t)
>>
> Thanks, but is [^a-z] correct?  Do you want to reject only lower-case
> ASCII characters?  What about non-ASCII?

It's [^z-a], also known as 'anychar'.


Reply | Threaded
Open this post in threaded view
|

Re: icalendar.el bug fix patch

Emacs - Dev mailing list
In reply to this post by Eli Zaretskii
[^z-a] is correct. I understand it to mean any char. It is another was of writing [.\n] and is used in rx.el, so I presumed is preferred.

Reply | Threaded
Open this post in threaded view
|

Re: icalendar.el bug fix patch

Eli Zaretskii
In reply to this post by Mattias Engdegård-2
> From: Mattias Engdegård <[hidden email]>
> Date: Fri, 1 Nov 2019 11:51:49 +0100
> Cc: Rajeev Narang <[hidden email]>, [hidden email]
>
> 1 nov. 2019 kl. 10.40 skrev Eli Zaretskii <[hidden email]>:
>
> >> -                        (replace-regexp-in-string "%s" "\\(.*?\\)" s nil t)
> >> +                        (replace-regexp-in-string "%s" "\\([^z-a]*?\\)" s nil t)
> >>
> > Thanks, but is [^a-z] correct?  Do you want to reject only lower-case
> > ASCII characters?  What about non-ASCII?
>
> It's [^z-a], also known as 'anychar'.

OK, but I still wonder why it's TRT.

Reply | Threaded
Open this post in threaded view
|

Re: icalendar.el bug fix patch

Eli Zaretskii
In reply to this post by Emacs - Dev mailing list
> From: Rajeev Narang <[hidden email]>
> Cc: [hidden email]
> Date: Fri, 01 Nov 2019 07:12:08 -0400
>
> [^z-a] is correct. I understand it to mean any char. It is another was of writing [.\n] and is used in rx.el, so I presumed is preferred.

If we need to say [.\n], let's say that.  IMO, it expresses its intent
much more than the cryptic [^z-a].

Thanks.

Reply | Threaded
Open this post in threaded view
|

Re: icalendar.el bug fix patch

Mattias Engdegård-2
1 nov. 2019 kl. 14.05 skrev Eli Zaretskii <[hidden email]>:

> If we need to say [.\n], let's say that.  IMO, it expresses its intent
> much more than the cryptic [^z-a].

'.' is not special inside []; you would have to write "\\(?:.\\|\n\\)" which is slower and messier, although perhaps easier to understand. [^z-a] is mentioned in the manual, by the way.

If readability is important, consider (rx (group (*? anychar))).


Reply | Threaded
Open this post in threaded view
|

Re: icalendar.el bug fix patch

Eli Zaretskii
> From: Mattias Engdegård <[hidden email]>
> Date: Fri, 1 Nov 2019 14:24:31 +0100
> Cc: Rajeev Narang <[hidden email]>, [hidden email]
>
> 1 nov. 2019 kl. 14.05 skrev Eli Zaretskii <[hidden email]>:
>
> > If we need to say [.\n], let's say that.  IMO, it expresses its intent
> > much more than the cryptic [^z-a].
>
> '.' is not special inside []; you would have to write "\\(?:.\\|\n\\)" which is slower and messier, although perhaps easier to understand.

Yes, it's easier to understand, so I prefer that we use it.

> If readability is important, consider (rx (group (*? anychar))).

What does it translate into?

Reply | Threaded
Open this post in threaded view
|

Re: icalendar.el bug fix patch

Richard Stallman
In reply to this post by Eli Zaretskii
[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > If we need to say [.\n], let's say that.

Doesn't [.\n] match either a period or a newline?
Period is a special character in regexps, but it does not have
that special meaning inside a character class.

--
Dr Richard Stallman
Founder, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)



Reply | Threaded
Open this post in threaded view
|

Re: icalendar.el bug fix patch

Howard Melman
In reply to this post by Mattias Engdegård-2

Mattias Engdegård <[hidden email]> writes:

> '.' is not special inside []; you would have to write "\\(?:.\\|\n\\)"
> which is slower and messier, although perhaps easier to
> understand. [^z-a] is mentioned in the manual, by the way.

I did not know about this either. FWIW the manual says:

    However, the lower bound should be at most one greater
    than the upper bound; for example, ‘[c-a]’ should be
    avoided.

So [^b-a] would be preferred over [^z-a].

--

Howard


Reply | Threaded
Open this post in threaded view
|

Re: icalendar.el bug fix patch

Paul Eggert
In reply to this post by Eli Zaretskii
On 11/1/19 6:33 AM, Eli Zaretskii wrote:
>> you would have to write "\\(?:.\\|\n\\)" which is slower and messier, although perhaps easier to understand.
> Yes, it's easier to understand, so I prefer that we use it.
>

I find "[^z-a]" to be signficantly easier to understand than
"\\(?:.\\|\n\\)".

But since the concept is useful, how about if we create an escape for
it? For example, we could establish \! as a regexp that matches any
single character. This be more readable than either [^z-a] or \(?:.\|
\), and would surely help performance as well as readability.

Reply | Threaded
Open this post in threaded view
|

Re: icalendar.el bug fix patch

Mattias Engdegård-2
1 nov. 2019 kl. 22.19 skrev Paul Eggert <[hidden email]>:

> But since the concept is useful, how about if we create an escape for it? For example, we could establish \! as a regexp that matches any single character. This be more readable than either [^z-a] or \(?:.\|
> \), and would surely help performance as well as readability.

Some time ago I experimented with adding a regexp-engine opcode for anychar, but didn't observe any significant difference in performance from that of [^z-a] or \Sq. It is possible that gains could be had if the opcode were to be exploited on a deeper level, such as producing a fast scan loop for [^z-a]*STRING. One has to be careful with backtracking, however.

This is orthogonal to the addition of regexp string syntax for anychar (like \!); neither requires the other.


Reply | Threaded
Open this post in threaded view
|

Re: icalendar.el bug fix patch

Juri Linkov-2
In reply to this post by Paul Eggert
>>> you would have to write "\\(?:.\\|\n\\)" which is slower and messier, although perhaps easier to understand.
>> Yes, it's easier to understand, so I prefer that we use it.
>
> I find "[^z-a]" to be signficantly easier to understand than
> "\\(?:.\\|\n\\)".
>
> But since the concept is useful, how about if we create an escape for it?
> For example, we could establish \! as a regexp that matches any single
> character. This be more readable than either [^z-a] or \(?:.\|
> \), and would surely help performance as well as readability.

Like using the dotall modifier in other regexp engines to enable single line
mode where the dot matches all characters, including newlines, e.g. in PCRE

  /regexp/s

what would be an equivalent for specifying regexp modifiers in Emacs Lisp?
Maybe something like

  (let ((regexp-modifiers "s"))
    (string-match "." string))

Reply | Threaded
Open this post in threaded view
|

Re: icalendar.el bug fix patch

Richard Stallman
In reply to this post by Paul Eggert
[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > But since the concept is useful, how about if we create an escape for
  > it? For example, we could establish \! as a regexp that matches any
  > single character.

+1.

--
Dr Richard Stallman
Founder, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)



Reply | Threaded
Open this post in threaded view
|

Re: icalendar.el bug fix patch

Richard Stallman
In reply to this post by Juri Linkov-2
[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > what would be an equivalent for specifying regexp modifiers in Emacs Lisp?
  > Maybe something like

  >   (let ((regexp-modifiers "s"))
  >     (string-match "." string))

I don't like the idea of a global variable to alter regexp syntax.
I think that some sort of operator would be better.
It could be an escape sequence for "any character" or it could
be a kind of parenthetical grouping which alters the meaning of a period
inside it.

What would be convenient in the rx syntax?
That question might be helpful to finding the best way to
handle this in string-regexp syntax.

--
Dr Richard Stallman
Founder, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)



Reply | Threaded
Open this post in threaded view
|

RE: icalendar.el bug fix patch

Drew Adams
In reply to this post by Richard Stallman
>> But since the concept is useful, how about if we
>> create an escape for it? For example, we could
>> establish \! as a regexp that matches any single
>> character.
>
> +1.

[Interesting.  I skipped this thread, based on the
Subject.  Just stumbled on this part of it, about
having a simple pattern to match what `\\(.\\|[\n]\\)'
matches.  Wouldn't have guessed that from the Subject.]

FWIW, I proposed this back in 2006.  Actually, I
proposed a toggle, so you could use the _same_ regexp
to match either any char or any char except newline.

Nothing wrong with having a separate escape, such as
`\!', to always match what `\\(.\\|[\n]\\)' matches.

But it would still be good to (also) have a toggle,
to be able to make just `.' match the same thing.

https://lists.gnu.org/archive/html/emacs-devel/2006-03/msg00162.html

In 2012 I again voiced the proposal (in the context
of mentioning how Icicles lets you do such things):

  I proposed long ago, for instance, a toggle for a
  `.' in regexp search to match also newlines, i.e.,
  any character.  IOW, `.' in one mode state of the
  toggle, would be equivalent to "\\(.\\|[\n]\\)"
  in the other (traditional) state.

https://lists.gnu.org/archive/html/emacs-devel/2012-08/msg00832.html

So how about adding an escape, such as `\!', and
_also_ adding a toggle (variable and command to
toggle it) that treats both `\!' and plain `.' the
same; that is; have them each match any char?

Reply | Threaded
Open this post in threaded view
|

RE: icalendar.el bug fix patch

Drew Adams
In reply to this post by Richard Stallman
> I don't like the idea of a global variable to alter regexp syntax.

Why not?

> I think that some sort of operator would be better.

It need not be either-or.

> It could be an escape sequence for "any character" or it could
> be a kind of parenthetical grouping which alters the meaning of a
> period inside it.

+1 for the former (escape char).

But having also a variable to control what plain `.'
matches has an additional advantage of letting the
same regexp (using `.') have two different behaviors.

That's particularly helpful interactively, if you
can hit a key to toggle the variable value (so toggle
the behavior of `.').

Reply | Threaded
Open this post in threaded view
|

Re: icalendar.el bug fix patch

Stefan Monnier
In reply to this post by Juri Linkov-2
> what would be an equivalent for specifying regexp modifiers in Emacs Lisp?
> Maybe something like
>
>   (let ((regexp-modifiers "s"))
>     (string-match "." string))

That will affect all the regexp matching that will happen during
execution of this code, so it will require changes in debug.el and
edebug.el, and probably in elp.el and trace.el as well to "reset" the
var before running innocent code.

Also, it can be problematic for cases where we combine/concatenate
several regexp chunks.


        Stefan


Reply | Threaded
Open this post in threaded view
|

Re: icalendar.el bug fix patch

Juri Linkov-2
>> what would be an equivalent for specifying regexp modifiers in Emacs Lisp?
>> Maybe something like
>>
>>   (let ((regexp-modifiers "s"))
>>     (string-match "." string))
>
> That will affect all the regexp matching that will happen during
> execution of this code, so it will require changes in debug.el and
> edebug.el, and probably in elp.el and trace.el as well to "reset" the
> var before running innocent code.
>
> Also, it can be problematic for cases where we combine/concatenate
> several regexp chunks.

In /regexp/s syntax the modifiers are inseparable from regexp.
What would be an equivalent in Emacs, maybe text properties?

  (string-match (propertize "." 'regexp-modifiers 'newline) string)

Reply | Threaded
Open this post in threaded view
|

Re: icalendar.el bug fix patch

Stefan Monnier
> In /regexp/s syntax the modifiers are inseparable from regexp.
> What would be an equivalent in Emacs, maybe text properties?

I don't like the sound of it.
I think text markers would make more sense, like

   \(*FOO:REGEXP\)

where FOO is the "syntax modifier".  This uses a similar syntax to the
\(?..:REGEXP\) used for shy and named groups.


        Stefan


12