I think, someday, it would be nice if users could select modern
regex syntax instead of the very very old-fashioned and awkward Emacs regex syntax. The old syntax and the functions that implement it need to be kept around for legacy reasons, but one could easily set up a parallel set of new functions that use modern PCRE-style syntax, and allow users to select those instead when doing things like isearching on regexps.

Perry
--
Perry E. Metzger [hidden email]
> I think, someday, it would be nice if users could select modern
> regex syntax instead of the very very old-fashioned and awkward Emacs
> regex syntax

I agree. See https://github.com/benma/visual-regexp-steroids.el, which implements this.
On Sat, 16 Jun 2018 11:45:40 -0600 Radon Rosborough
<[hidden email]> wrote:
> > I think, someday, it would be nice if users could select modern
> > regex syntax instead of the very very old-fashioned and awkward
> > Emacs regex syntax
>
> I agree. See https://github.com/benma/visual-regexp-steroids.el,
> which implements this.

That requires python, isn't integrated into emacs, etc.

--
Perry E. Metzger [hidden email]
> On Sat, 16 Jun 2018 11:45:40 -0600 Radon Rosborough
> <[hidden email]> wrote:
>> > I think, someday, it would be nice if users could select modern
>> > regex syntax instead of the very very old-fashioned and awkward
>> > Emacs regex syntax

Right now, I'd just settle for an easier-to-type equivalent to "\_<". This sequence is the typing equivalent of a nasty cough.

I've actually been wondering how feasible it'd be to support an rx input mode.
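For what it's worth, in Lisp code (though not at an interactive prompt) rx already has spelled-out names for this construct. A minimal sketch, assuming only the stock `rx' library that ships with Emacs; "foo" is just a placeholder:

```elisp
(require 'rx)

;; `symbol-start' and `symbol-end' are the rx names for \_< and \_>.
(rx symbol-start "foo" symbol-end)
;; => "\\_<foo\\_>"

;; The result is an ordinary Emacs regexp string, usable anywhere.
(string-match-p (rx symbol-start "foo" symbol-end) "a foo b")
;; => 2
```

This does nothing for typing the sequence in isearch, which is the actual complaint; it only shows what an rx-based input mode would have to produce.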
In reply to this post by Perry E. Metzger
> That requires python, isn't integrated into emacs, etc.
It doesn't. You can select pcre2el as the Regexp syntax, which implements a good subset of PCRE on top of Emacs Regexp. However, that project hasn't seen any movement since late 2016. It'd be nice if Emacs had PCRE baked in.
In reply to this post by Perry E. Metzger
Perry E. Metzger writes:
> I think, someday, it would be nice if users could select modern
> regex syntax instead of the very very old-fashioned and awkward Emacs
> regex syntax. The old syntax and functions that implement it need to
> be kept around for legacy reasons, but one could easily set up a set
> of parallel new functions that used modern PCRE style syntax, and
> allow users to select those instead when doing things like
> isearching on regexps etc.

I just wanted to note that `rx' is in many cases much easier to write and understand than even PCRE. I'd recommend learning and using `rx' if you are annoyed about backslashes or readability.

-Jay
On Sun, Jun 17, 2018 at 12:32 AM Jay Kamat <[hidden email]> wrote:
Perry E. Metzger writes:

I only remember the PCRE syntax; that's why I use packages like `visual-regexp` and `pcre2el`. If Emacs supported it natively, that'd be a huge step forward for me.

Philippe
In reply to this post by Jay Kamat
On Sun, 17 Jun 2018, 06:32 Jay Kamat <[hidden email]> wrote:
> I just wanted to note that `rx' is in many cases much easier to write and

While I'm sure that is true for a lot of people (and for those, the newly announced xr package helps here), others prefer to use the more compact regex syntax.

However, I don't think anyone would argue that the Emacs regex syntax has any advantages compared to pcre. I certainly need to wade through the Emacs regex manual every time I want to do slightly more advanced regex matching, followed by lots of testing.

When using regexes in regular editing (as opposed to elisp programming) it's even worse.

I'm most definitely in favour of pcre.

Regards,
Elias
10 feb. 2019 kl. 10.39 skrev Elias Mårtenson <[hidden email]>:
>
> While I'm sure that is true for a lot of people (and for those, the newly announced xr package helps here), others prefer to use the more compact regex syntax.
>
> However, I don't think anyone would argue that the Emacs regex syntax has any advantages compared to pcre. I certainly need to wade through the Emacs regex manual every time I want to do slightly more advanced regex matching, followed by lots of testing.
>
> When using regexes in regular editing (as opposed to elisp programming) it's even worse.
>
> I'm most definitely in favour of pcre.

Hello Elias,

Of course you should write "-?[0-9]+" when you need it! And for interactive use -- search-and-replace, say -- the conventional notations are not bad, since they are compact to write, you have the meaning all in your head anyway, and nobody is going to look at it later on.

Where rx shines is for the complex ones. I have written page-long regexps in Perl and Python, and despite the fact that both languages permit a "structured" regexp layout, they do not come close to rx when it counts: rx can be read, understood, maintained, evolved, and composed far better, and with fewer mistakes.

I agree that the POSIX notation is probably better than the old-style version in Emacs since the former tends to be a tad lighter in backslashes. Some languages -- OCaml, Python, etc -- have some form of string literal that avoids the need to escape backslashes, but fundamentally, regexps are not strings but an algebraic notation with values and operators, and deserve some kind of higher language-level support. Larry Wall understood that.

So I suggest you give rx a go next time you need to write a complicated regexp in Elisp. If you still find it too verbose, you can use short keywords, like `+' or `1+' instead of `one-or-more'. You can even speak a hybrid dialect by injecting little regexp strings inside a big rx expression with the `(regexp ...)' syntax!

Take a look at the big `gnu' matcher in compile.el (around line 281) to see what that looks like. Careful here -- rx is addictive, and you may very well come to use it more and more.
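To make the long-keyword, short-keyword and hybrid styles mentioned above concrete, here is a small sketch. It assumes only the stock `rx' library; the "-?[0-9]+" pattern is the one from the message, and the trailing ":" in the last form is an arbitrary illustration.

```elisp
(require 'rx)

;; Long keywords.
(rx (zero-or-one "-") (one-or-more (any "0-9")))
;; => "-?[0-9]+"

;; The same regexp with short keywords.
(rx (opt "-") (+ (any "0-9")))
;; => "-?[0-9]+"

;; Hybrid dialect: a plain regexp string embedded with `(regexp ...)'
;; inside a larger rx form.
(rx line-start (group (regexp "-?[0-9]+")) ":")
```

The hybrid form is handy when migrating an existing string regexp piece by piece rather than rewriting it in one go.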
> Of course you should write "-?[0-9]+" when you need it! And for interactive use -- search-and-replace, say -- the conventional notations are not bad, since they are compact to write, you have the meaning all in your head anyway, and nobody is going to look at it later on.

I think the purpose of this thread is to ask for Emacs to support PCRE regexps in commands like `query-replace-regexp`.

Would this even be possible? I can imagine a whole lot of packages breaking if the regexp syntax changed, and changing it just for the user input in interactive functions looks a bit sketchy.
On 15/02/2019 08.42, Philippe Vaucher wrote:
> Would this even be possible? I can imagine a whole lot of packages breaking if the regexp syntax changed, and changing it just for the user input in interactive functions looks a bit sketchy.

We could just add a special tag at the beginning of a regexp to indicate that it's a pcre regexp; something like this maybe? (re-search-forward "\\(?pcre:\\)…[pcre regexp goes here]…"). This form is currently a syntax error, so there would be no ambiguity, and we could define a (pcre …) macro so that you could write (re-search-forward (pcre "…[pcre regexp goes here]…")) instead.

Alternatively, we could use an explicit tag, something like (re-search-forward (cons 'pcre "…[pcre regexp goes here]…")).

For interactive functions, I imagine you'd have a defcustom with a preferred regexp dialect.
In reply to this post by Philippe Vaucher
> From: Philippe Vaucher <[hidden email]>
> Date: Fri, 15 Feb 2019 14:42:43 +0100
> Cc: emacs-devel <[hidden email]>, Jay Kamat <[hidden email]>,
>  Elias Mårtenson <[hidden email]>,
>  "Perry E. Metzger" <[hidden email]>
>
> I think the purpose of this thread is to ask for Emacs to support PCRE regexps in commands like
> `query-replace-regexp`.
>
> Would this even be possible? I can imagine a whole lot of packages breaking if the regexp syntax changed,
> and changing it just for the user input in interactive functions looks a bit sketchy.

It should be possible if we introduce new functions for PCRE, or if we mark PCRE regexps in some special way, like put a special text property on the string.
In reply to this post by Clément Pit-Claudel
> Would this even be possible? I can imagine a whole lot of packages breaking if the regexp syntax changed, and changing it just for the user input in interactive functions looks a bit sketchy.

I like where this is going. With that and Eli's suggestion of a special text property, we have plenty of ways to implement it so that it plays nicely with the existing code. So far 3 proposals:

Given this I'm in favor of the 2nd option, but maybe I missed some points.

Philippe
On 15/02/2019 10.03, Philippe Vaucher wrote:
> Given this I'm in favor of the 2nd option, but maybe I missed some points.

Thinking more about this, there is one non-trivial issue: concatenation. It's common for code in Emacs to take a regexp, assume it's a string, and do something like (concat "\\(" some-regexp-var "\\|" some-other-regexp-var "\\)"). Solution 1 could be tweaked to wrap the whole regexp: "\\(?pcre:…[pcre regexp here]…\\)", and so could solution 3 (a text property spanning the whole length of the string), but solution 2 won't work well here.

Not to mention the fact that if the regexps are matched by different engines, we now have to make these work together :/

Clément.
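The concatenation problem can be reproduced in a few lines. This is only a sketch: the `pcre' tag symbol comes from the proposals in this thread, while the `regexp-dialect' property name is a hypothetical stand-in for whatever marker would actually be chosen.

```elisp
;; A regexp tagged as a cons cell is no longer a string, so existing
;; code that builds a bigger regexp with `concat' signals an error.
(let ((tagged (cons 'pcre "foo|bar")))
  (condition-case err
      (concat "\\(" tagged "\\)")
    (error (format "concat failed: %S" err))))

;; A regexp tagged with a text property survives `concat', but the
;; property then covers only part of the combined string, so the
;; result mixes two dialects.
(let* ((pcre (propertize "foo|bar" 'regexp-dialect 'pcre))
       (combined (concat "\\(" pcre "\\)")))
  (list (get-text-property 0 'regexp-dialect combined)   ; prefix is untagged
        (get-text-property 2 'regexp-dialect combined))) ; the "foo|bar" part
;; => (nil pcre)
```

Either way, callers that splice regexps together would need to be aware of the dialect, which is the crux of the objection above.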
In reply to this post by Eli Zaretskii
On Fri, 15 Feb 2019 16:18:12 +0200 Eli Zaretskii <[hidden email]> wrote:
> > I think the purpose of this thread is to ask for Emacs to support
> > PCRE regexps in commands like `query-replace-regexp`.
> >
> > Would this even be possible? I can imagine a whole lot of
> > packages breaking if the regexp syntax changed, and changing it
> > just for the user input in interactive functions looks a bit
> > sketchy.
>
> It should be possible if we introduce new functions for PCRE, or if
> we mark PCRE regexps in some special way, like put a special text
> property on the string.

I think the right thing is to introduce new functions for new-style regexps that parallel the old ones, and to allow users to bind things like the regexp flavors of isearch to the new-style versions if they wish. We can decide if we want the new-style versions to be the default search bindings at some distant date.

One could also very slowly replace use of old-style regexp functions with the new-style regexp functions in lisp code, but that could be done over many many years if desired.

Perry
--
Perry E. Metzger [hidden email]
>> It should be possible if we introduce new functions for PCRE, or if
>> we mark PCRE regexps in some special way, like put a special text
>> property on the string.

A simpler option is for Elisp users to write (pcre "foo") where `pcre` is a function that converts to Emacs's own format.

I think it would make a lot of sense for Emacs's search functions to accept other kinds of search specifications than regular expressions represented as strings, e.g. to also accept precompiled regexps (or NFAs/DFAs), so `pcre` could also return one of those representations if/when support for it is added.

        Stefan
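A converting function of the kind described above could start out very small. The sketch below is not an existing Emacs or pcre2el function; `my-pcre-to-emacs' and its helper are hypothetical names, and the translation covers only the escaping of ( ) | { }. Bracket expressions are copied verbatim, and PCRE-only features (lookaround, \d, backslash escapes inside brackets, and so on) are not handled.

```elisp
;;; -*- lexical-binding: t -*-

(defun my-pcre--class-end (str start)
  "Return the index just past the bracket expression opening at START in STR.
Return nil if the bracket expression is unterminated.  Hypothetical helper."
  (let ((j (1+ start))
        (n (length str)))
    ;; A leading ^ negates; a ] directly after the opening is literal.
    (when (and (< j n) (eq (aref str j) ?^)) (setq j (1+ j)))
    (when (and (< j n) (eq (aref str j) ?\])) (setq j (1+ j)))
    (while (and (< j n) (not (eq (aref str j) ?\])))
      (if (and (eq (aref str j) ?\[)
               (< (1+ j) n)
               (eq (aref str (1+ j)) ?:))
          ;; Skip a POSIX class such as [:alpha:] as one unit.
          (let ((close (string-match (regexp-quote ":]") str (+ j 2))))
            (setq j (if close (+ close 2) n)))
        (setq j (1+ j))))
    (and (< j n) (1+ j))))

(defun my-pcre-to-emacs (pcre)
  "Translate PCRE-style grouping syntax in the string PCRE to Emacs syntax.
Only the escaping of ( ) | { } is swapped.  Hypothetical sketch."
  (let ((i 0) (n (length pcre)) (out '()))
    (while (< i n)
      (let* ((c (aref pcre i))
             (class-end (and (eq c ?\[) (my-pcre--class-end pcre i))))
        (cond
         ;; Bracket expression: copy it unchanged.
         (class-end
          (push (substring pcre i class-end) out)
          (setq i class-end))
         ;; Backslash escape: \( in PCRE is a literal paren, which Emacs
         ;; writes as a bare paren; other escapes pass through.
         ((and (eq c ?\\) (< (1+ i) n))
          (let ((next (aref pcre (1+ i))))
            (push (if (memq next '(?\( ?\) ?| ?{ ?}))
                      (string next)
                    (string c next))
                  out)
            (setq i (+ i 2))))
         ;; Bare operator in PCRE: Emacs wants it backslashed.
         ((memq c '(?\( ?\) ?| ?{ ?}))
          (push (string ?\\ c) out)
          (setq i (1+ i)))
         (t
          (push (string c) out)
          (setq i (1+ i))))))
    (apply #'concat (nreverse out))))

(my-pcre-to-emacs "(foo|bar){2}")
;; => "\\(foo\\|bar\\)\\{2\\}"
```

Because the result is an ordinary string, it can be passed to `re-search-forward' or spliced with `concat' like any other Emacs regexp, which sidesteps the tagging question entirely.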
In reply to this post by Eli Zaretskii
15 feb. 2019 kl. 15.18 skrev Eli Zaretskii <[hidden email]>:
>
> It should be possible if we introduce new functions for PCRE, or if we
> mark PCRE regexps in some special way, like put a special text
> property on the string.

It would be easier if those who ask for PCRE would say exactly what they want:

(1) The syntax of PCRE -- | () {} instead of \| \(\) \{\} etc -- but restricted to the set of features of the Emacs regexp engine.
(2) The features of PCRE not present in Emacs regexps. Which ones, exactly? Lookbehind assertions? Atomic groups?
(3) PCRE for interactive use only.
(4) PCRE for general Elisp programming.

Locating and wrapping the places that ask for regexps interactively, such as `query-replace-regexp', would permit the interactive regexp syntax to become a simple user customisation -- traditional, PCRE, rx or whatnot. It would be a matter of writing a transformation function, and possibly some syntax highlighting, for each case.

I wouldn't be surprised if 99% of the requests are really about not having to escape |(){} as metacharacters in interactive use.
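A rough sketch of the "wrap the interactive prompts" idea, under stated assumptions: every `my-' name below is hypothetical, only the `emacs' and `rx' dialects are wired up (a PCRE translator would be registered in the same alist), and `query-replace-regexp' is simply called with the translated string.

```elisp
(require 'rx)

(defcustom my-interactive-regexp-dialect 'emacs
  "Regexp dialect accepted by the wrapped interactive commands.
Hypothetical user option for illustration."
  :type '(choice (const emacs) (const rx))
  :group 'matching)

(defvar my-regexp-dialect-translators
  `((emacs . identity)
    ;; Read the input as an rx form, e.g. (seq symbol-start "foo").
    (rx . ,(lambda (input) (rx-to-string (read input) t))))
  "Alist mapping a dialect symbol to a function returning an Emacs regexp.")

(defun my-translate-regexp (input)
  "Translate INPUT from `my-interactive-regexp-dialect' to an Emacs regexp."
  (let ((fn (alist-get my-interactive-regexp-dialect
                       my-regexp-dialect-translators)))
    (unless fn
      (error "No translator for dialect `%s'" my-interactive-regexp-dialect))
    (funcall fn input)))

(defun my-query-replace-regexp (input to-string)
  "Like `query-replace-regexp', but read INPUT in the preferred dialect."
  (interactive
   (list (read-string (format "Query replace (%s syntax): "
                              my-interactive-regexp-dialect))
         (read-string "Replace with: ")))
  (query-replace-regexp (my-translate-regexp input) to-string))
```

With the option set to `rx', typing (seq symbol-start "foo") at the prompt would search for "\\_<foo". Lisp code elsewhere is untouched, since only the prompt's input is translated.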
On Fri, 15 Feb 2019 17:24:18 +0100 Mattias Engdegård
<[hidden email]> wrote:
> 15 feb. 2019 kl. 15.18 skrev Eli Zaretskii <[hidden email]>:
> >
> > It should be possible if we introduce new functions for PCRE, or
> > if we mark PCRE regexps in some special way, like put a special
> > text property on the string.
>
> It would be easier if those who ask for PCRE would say exactly what
> they want:
>
> (1) The syntax of PCRE -- | () {} instead of \| \(\) \{\} etc --
> but restricted to the set of features of the Emacs regexp engine.

Modern syntax is the main one.

> (2) The features of PCRE not present in Emacs regexps. Which ones,
> exactly? Lookbehind assertions? Atomic groups?

I'm not particularly interested in those.

> (3) PCRE for interactive use only.
> (4) PCRE for general Elisp programming.

The old style syntax is repulsive. I think we should make it possible to slowly switch over to the syntax everyone using regexps has gotten used to over the last 30 years or so. BREs in the style Emacs has been using have been obsolete for longer than many Emacs users have been alive.

> Locating and wrapping the places that ask for regexps
> interactively, such as `query-replace-regexp', would permit the
> interactive regexp syntax to become a simple user customisation --
> traditional, PCRE, rx or whatnot. It would be a matter of writing a
> transformation function, and possibly some syntax highlighting, for
> each case.
>
> I wouldn't be surprised if 99% of the requests are really about not
> having to escape |(){} as metacharacters in interactive use.

No, that's a lot of my complaint. I can't even remember what the correct syntax is half the time.

Anyway, I recommend Eli's approach. We create a parallel set of modernized syntax functions, and people can slowly adopt them.

Perry
--
Perry E. Metzger [hidden email]
Hello, Perry.
On Fri, Feb 15, 2019 at 11:47:28 -0500, Perry E. Metzger wrote:
> On Fri, 15 Feb 2019 17:24:18 +0100 Mattias Engdegård
> <[hidden email]> wrote:
> > 15 feb. 2019 kl. 15.18 skrev Eli Zaretskii <[hidden email]>:

> > > It should be possible if we introduce new functions for PCRE, or
> > > if we mark PCRE regexps in some special way, like put a special
> > > text property on the string.

> > It would be easier if those who ask for PCRE would say exactly what
> > they want:

> > (1) The syntax of PCRE -- | () {} instead of \| \(\) \{\} etc --
> > but restricted to the set of features of the Emacs regexp engine.

> Modern syntax is the main one.

Such use of "modern" always gets on my nerves. "Modern" is not the same as "good", and likely has a very weak correlation with it. Why aren't we all using "modern" editors, for example?

> > (2) The features of PCRE not present in Emacs regexps. Which ones,
> > exactly? Lookbehind assertions? Atomic groups?

> I'm not particularly interested in those.

That would be the sole reason for me for any switch.

> > (3) PCRE for interactive use only.
> > (4) PCRE for general Elisp programming.

> The old style syntax is repulsive.

I disagree. But that's not important. What's important is to have a standard invariable regexp notation, otherwise confusion and unwanted unforeseen nastinesses will occur.

> I think we should make it possible to slowly switch over to the syntax
> everyone using regexps has gotten used to over the last 30 years or so.
> BREs in the style Emacs has been using have been obsolete for longer
> than many Emacs users have been alive.

They're not obsolete: they're used in grep, sed, and in Emacs.

There are several different standards for writing regexps, all of approximately the same age. None is better than any other (aside from extra facilities available in some versions).

This seems to me to be the same argument as that proposing that Emacs should change its key bindings to match those of other programs, because "everybody" knows those other bindings.

> > Locating and wrapping the places that ask for regexps
> > interactively, such as `query-replace-regexp', would permit the
> > interactive regexp syntax to become a simple user customisation --
> > traditional, PCRE, rx or whatnot. It would be a matter of writing a
> > transformation function, and possibly some syntax highlighting, for
> > each case.

Exactly. And then we've got 10 to 20 years of confusion, with several mutually incompatible regexp notations competing for attention in the same Emacs. I think this would be a thoroughly bad idea.

> > I wouldn't be surprised if 99% of the requests are really about not
> > having to escape |(){} as metacharacters in interactive use.

> No, that's a lot of my complaint. I can't even remember what the
> correct syntax is half the time.

I don't suffer that difficulty in Emacs (though I sometimes do in grep, egrep, sed and AWK, all of which have slightly different regexps). But I would begin to suffer it if there started to be a mixture of incompatible regexp notations in Emacs sources. Let's keep things simple.

> Anyway, I recommend Eli's approach. We create a parallel set of
> modernized syntax functions, and people can slowly adopt them.

I suggest we retain our current regexp notation, together with compatible tools, as the sole way of writing regexps in Emacs. This notation is not all that bad, and it is thoroughly documented and well tested. It's the approach which will cause the least confusion. It works.

> Perry
> --
> Perry E. Metzger [hidden email]

--
Alan Mackenzie (Nuremberg, Germany).
> > Modern syntax is the main one.
>
> Such use of "modern" always gets on my nerves. "Modern" is not the same
> as "good", and likely has a very weak correlation with it.

Not to mention that "modern" has been applied to the latest fashion, ephemeral or not, for at least 100 years.

Today's modernista is tomorrow morning's has-been, but s?he sometimes continues to tout the same old-fashioned modernisms.

There's absolutely nothing new about labeling something "modern" (or "old-fashioned", for that matter). Nothing new about "modern".

> Why aren't we all using "modern" editors, for example?

Why indeed? Headline: "Users of Anachronistic Editor Emacs Go 'Modern'!"

> > I think we should make it possible to slowly switch over to the syntax
> > everyone using regexps has gotten used to over the last 30 years or so.
> > BREs in the style Emacs has been using have been obsolete for longer
> > than many Emacs users have been alive.
>
> They're not obsolete: they're used in grep, sed, and in Emacs.
>
> There are several different standards for writing regexps, all of
> approximately the same age. None is better than any other (aside from
> extra facilities available in some versions).

But surely some are "modern" and others are "obsolete", Alan. ;-) (What's the equivalent of L'Academie Francaise for things technical?)

Emacs itself has been obsolete for longer than many Emacs users have been alive. Emacs is dead. Long live Emacs.

> This seems to me to be the same argument as that proposing that Emacs
> should change its key bindings to match those of other programs, because
> "everybody" knows those other bindings.

Emacs key bindings have been obsolete longer than many Emacs users have been alive. Please remember this.