[ELPA] New package: xr

Re: [ELPA] New package: xr

Richard Stallman
[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > I know, but what I don't understand is how ERC etc. are different from
  > a package stored on ELPA.

I did not say they are different.  My understanding is that we treat ALL
the packages in ELPA as part of Emacs.

--
Dr Richard Stallman
President, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)




Re: [ELPA] New package: xr

Richard Stallman
In reply to this post by Eli Zaretskii
[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > > I expect that saying "my file foo is a change to Emacs" is enough.

  > Saying where?

In email to you; you forward that to [hidden email].

Putting it in a version of the file, that you get from the contributor
and install, would also be ok.

Maybe some other ways would also be good enough, but they don't come to
my mind just now.

--
Dr Richard Stallman
President, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)




Re: [ELPA] New package: xr

Eli Zaretskii
> From: Richard Stallman <[hidden email]>
> Cc: [hidden email], [hidden email]
> Date: Sun, 10 Feb 2019 00:51:58 -0500
>
>   > > I expect that saying "my file foo is a change to Emacs" is enough.
>
>   > Saying where?
>
> In email to you; you forward that to [hidden email].

In addition to saying in the file that "this file is part of GNU
Emacs"?  Or is the latter enough?


Re: [ELPA] New package: xr

Eli Zaretskii
In reply to this post by Richard Stallman
> From: Richard Stallman <[hidden email]>
> Cc: [hidden email], [hidden email]
> Date: Sun, 10 Feb 2019 00:51:58 -0500
>
>   > I know, but what I don't understand is how ERC etc. are different from
>   > a package stored on ELPA.
>
> I did not say they are different.  My understanding is that we treat ALL
> the packages in ELPA as part of Emacs.

But for packages we add to Emacs and place into the Emacs Git
repository, we have not until now requested any statement from the
author; we assumed that being in the repository is a clear sign that
the package is part of Emacs.  We also assumed that being in the ELPA
repository is a clear sign of the same.  Now it sounds like we were
mistaken?


Re: [ELPA] New package: xr

Richard Stallman
In reply to this post by Eli Zaretskii
[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > In addition to saying in the file that "this file is part of GNU
  > Emacs"?  Or is the latter enough?

If that statement is present in the file when the contributor emails
it to you, I think that does the job.  To make sure the FSF knows this,
please forward that email to the FSF clerk.

--
Dr Richard Stallman
President, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)




Re: [ELPA] New package: xr

Michael Heerdegen
In reply to this post by Mattias Engdegård-2
Hi Mattias,

> The xr package is the inverse of rx: it transforms an Emacs regexp
> string into rx form. Its purpose is to assist in migration to rx-style
> regexps, something I think a lot more people should do, and help
> understanding difficult regexps.

Would it be possible to define a variable to control what kind of
operators `xr--postfix' chooses?  Currently the verbose names are
hardcoded, and even if there is no perfectly nice solution, it would be
good if there was some way to control this.  I'm not sure if there are
other places besides `xr--postfix' where such a variable would need to
be consulted.


Thanks,

Michael.


Re: [ELPA] New package: xr

Mattias Engdegård-2
27 feb. 2019 kl. 16.06 skrev Michael Heerdegen <[hidden email]>:
>
> Would it be possible to define a variable to control what kind of
> operators `xr--postfix' chooses?  Currently the verbose names are
> hardcoded, and even if there is no perfectly nice solution, it would be
> good if there was some way to control this.  I'm not sure if there are
> other places besides `xr--postfix' where such a variable would need to
> be consulted.

The identifiers chosen by xr will never suit the taste of everyone, but I couldn't make up my mind about what to do about it. I'd be interested in hearing what other people think.
One way would be to add an optional translation alist to xr:

(xr "a*.?"
    '((zero-or-more . 0+) (nonl . not-newline)))
=> (seq (0+ "a") (opt not-newline))

but this could be done almost as easily by post-processing the resulting list tree, maybe like this:

(defun rename-rx (substitution-alist rx)
  "Rename all identifiers in RX according to SUBSTITUTION-ALIST."
  (cond
   ((symbolp rx) (alist-get rx substitution-alist rx))
   ((consp rx)
    ;; Translate the first element of each list even if not a symbol,
    ;; to allow translation of ? and ?? (space and ?).
    (cons (alist-get (car rx) substitution-alist (car rx))
          (mapcar (lambda (elem) (rename-rx substitution-alist elem))
                  (cdr rx))))
   (t rx)))

Post-processing is more generally powerful since it can do structural replacement and not mere symbol replacement.

A customisable setting would also work, but I prefer keeping xr purely functional.
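
For illustration, here is how the post-processing approach could be used. This is only a sketch: `rename-rx' is repeated from the message so the snippet stands alone, `my-short-names' and `my-renamed' are hypothetical names, and the quoted input form is hand-written and merely assumed to resemble what xr returns for "a*.?" by default.

```elisp
;; `rename-rx' as posted above, repeated so this snippet is self-contained.
(defun rename-rx (substitution-alist rx)
  "Rename all identifiers in RX according to SUBSTITUTION-ALIST."
  (cond
   ((symbolp rx) (alist-get rx substitution-alist rx))
   ((consp rx)
    (cons (alist-get (car rx) substitution-alist (car rx))
          (mapcar (lambda (elem) (rename-rx substitution-alist elem))
                  (cdr rx))))
   (t rx)))

;; Hypothetical substitution alist; xr itself defines nothing like it.
(defvar my-short-names
  '((zero-or-more . 0+) (nonl . not-newline)))

;; Hand-written rx form standing in for xr output (an assumption).
(defvar my-renamed
  (rename-rx my-short-names '(seq (zero-or-more "a") (opt nonl))))
;; my-renamed => (seq (0+ "a") (opt not-newline))
```

Strings and any symbols missing from the alist pass through unchanged, so a partial alist is enough.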



Re: [ELPA] New package: xr

Michael Heerdegen
Mattias Engdegård <[hidden email]> writes:

> A customisable setting would also work, but I prefer keeping xr purely
> functional.

Then maybe at least define several settings somewhere and make them
available to be used as argument or replacement list or whatever.  It
would be good if it was possible to get a different behavior without too
much brain usage: Make it so that the calls needed can be kept short.


Michael.


Re: [ELPA] New package: xr

Mattias Engdegård-2
27 feb. 2019 kl. 18.09 skrev Michael Heerdegen <[hidden email]>:
>
> Then maybe at least define several settings somewhere and make them
> available to be used as argument or replacement list or whatever.  It
> would be good if it was possible to get a different behavior without too
> much brain usage: Make it so that the calls needed can be kept short.

That's probably a good idea; the optional argument could be either an alist or a symbol like 'short or 'verbose.
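
A minimal sketch of how such an argument might be resolved, assuming a table of named settings. Every name here (`my-xr-dialects', `my-xr-resolve-dialect') and the contents of the table are hypothetical; none of this is part of xr.

```elisp
;; Hypothetical table of named settings, each mapping to a substitution alist.
(defvar my-xr-dialects
  '((verbose . ((opt . zero-or-one) (nonl . not-newline)))
    (short . ((zero-or-more . 0+) (one-or-more . 1+))))
  "Alist mapping a dialect symbol to a substitution alist.")

(defun my-xr-resolve-dialect (dialect)
  "Return a substitution alist for DIALECT.
DIALECT is nil (no renaming), a symbol naming an entry in
`my-xr-dialects', or an explicit substitution alist."
  (cond
   ((null dialect) nil)
   ((symbolp dialect)
    ;; Use `assq' rather than `alist-get' so that a dialect with an
    ;; empty alist is still distinguishable from an unknown one.
    (let ((entry (assq dialect my-xr-dialects)))
      (unless entry
        (error "Unknown dialect: %S" dialect))
      (cdr entry)))
   ((listp dialect) dialect)
   (t (signal 'wrong-type-argument (list 'listp dialect)))))
```

With this shape, a call site can pass either 'short or a custom alist, which keeps the common case brief.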

Good defaults are just as important. I thought that one-or-more would be more descriptive than + or 1+ for those not familiar with the rx notation, but perhaps this was a mistake.

Which would you prefer as default symbol from each of these sets?

 one-or-more 1+ +
 zero-or-more 0+ *
 zero-or-one optional opt ?
 repeat **  (for lower-upper-bounded repetition)
 repeat =   (for exact-count repetition)
 any char in
 not-newline nonl
 group submatch
 line-start bol
 line-end eol
 string-start buffer-start bos bot
 string-end buffer-end eos eot
 word-start bow
 word-end eow
 sequence seq and :
 or |
 regexp regex




Re: [ELPA] New package: xr

Clément Pit-Claudel
Here are my picks :)

On 28/02/2019 09.10, Mattias Engdegård wrote:

>  1+
>  0+
>  ?
>  repeat ** (not sure)
>  repeat = (not sure)
>  any
>  nonl
>  group
>  bol
>  eol
>  string-start buffer-start
>  string-end buffer-end
>  word-start
>  word-end
>  seq
>  or
>  regex

I'm not sure whether it'd be better to be consistent about the 'bo…' and 'eo…' family.


Re: [ELPA] New package: xr

Stefan Monnier
In reply to this post by Mattias Engdegård-2
>  one-or-more 1+ +
>  zero-or-more 0+ *
>  zero-or-one optional opt ?

I'm definitely in favor of using the standard * + and ?

>  any char in

I think I prefer `char` or `in`.

>  line-start bol
>  line-end eol
>  string-start buffer-start bos bot
>  string-end buffer-end eos eot
>  word-start bow
>  word-end eow

I like the boX/eoX nomenclature.

The main benefit of RX is to make the structure more visible, and the
main downside is to make regexp more verbose, so I think short
identifiers are preferable.

>  sequence seq and :

I'm strongly opposed to `and` because that should mean the
conjunction/intersection of two regexps (i.e. a string matches it only
if it matches both sub-regexps) rather than the sequential concatenation.
[ lex.el supports such intersections.  ]
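
To make the objection concrete, the sketch below shows what rx does with `and' today: it is a synonym for `seq', so the operands are concatenated rather than intersected. The variable names are hypothetical and exist only for this example.

```elisp
(require 'rx)

;; In rx, `and' is a synonym for `seq': both concatenate their arguments.
(defvar my-seq-regexp (rx (seq "foo" "bar")))
(defvar my-and-regexp (rx (and "foo" "bar")))

;; The two forms produce the same regexp string.
(equal my-seq-regexp my-and-regexp)

;; Concatenation: "foobar" matches, while "foo" alone does not.
(string-match-p my-and-regexp "foobar")
(string-match-p my-and-regexp "foo")
```

Under an intersection reading, (and "foo" "bar") would match nothing at all, since no string equals both "foo" and "bar".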


        Stefan



Re: [ELPA] New package: xr

Van L
In reply to this post by Mattias Engdegård-2

> Which would you prefer as default symbol from each of these sets?

  one-or-more
  zero-or-more
  zero-or-one
  repeat
  repeat
  any
  not-newline
  group
  line-start
  line-end
  string-start
  string-end
  word-start
  word-end
  sequence
  or
  regexp

for readability in the worst case use case where an end-user/person
operates the patterns fewer than five times a year, year after year,
in an ETL scenario.



Re: [ELPA] New package: xr

Mattias Engdegård-2
In reply to this post by Stefan Monnier
1 mars 2019 kl. 00.06 skrev Stefan Monnier <[hidden email]>:

> I'm definitely in favor of using the standard * + and ?

They are suggestive but I find `?' a bit on the hacky side. For instance, you can't break the line after the operator, since it requires a space following.

>> any char in
>
> I think I prefer `char` or `in`.

Interestingly, (not (any ...)) and (not (in ...)) are valid, but (not (char ...)) isn't (although there is `not-char').

Another implementation leak: since `any' is also an alias for `not-newline', `char' and `in' are as well:
(rx any char in) => "..."

>> line-start bol
>> line-end eol
>> string-start buffer-start bos bot
>> string-end buffer-end eos eot
>> word-start bow
>> word-end eow
>
> I like the boX/eoX nomenclature.
>
> The main benefit of RX is to make the structure more visible, and the
> main downside is to make regexp more verbose, so I think short
> identifiers are preferable.

There is definitely merit to that, although I also have sympathy for the verbosity preferred by others.
Lisp seems to cope with long identifiers better than most other languages (dashes permitted, easy line breaking).

I'm leaning towards giving xr both brief and verbose options, and letting the default be somewhere in the middle.

>> sequence seq and :
>
> I'm strongly opposed to `and` because that should mean the
> conjunction/intersection of two regexps (i.e. a string matches it only
> if it matches both sub-regexps) rather than the sequential concatenation.
> [ lex.el supports such intersections.  ]

Strongly agreed!



Re: [ELPA] New package: xr

Stefan Monnier
>> I'm definitely in favor of using the standard * + and ?
> They are suggestive but I find `?' a bit on the hacky side. For instance,
> you can't break the line after the operator, since it requires
> a space following.

Oh, indeed, that's a problem.

> Interestingly, (not (any ...)) and (not (in ...)) are valid, but (not (char
> ...)) isn't (although there is `not-char').

I guess this means we should go with `in`.
Tho it should be easy to add (not (char ...)) if needed.

BTW, I don't like (not (any ...)) and (not (in ...)) in any case:
the negation should be within the `char`, `in`, or `any` not around it.
This is because the negation of a regexp RE could be defined as a regexp
which matches all strings not matched by RE (and hence when RE is (any
...), it will also match all strings of length != 1).

This form of negation is supported in lex.el, for example.
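
The difference between the two readings can be seen in a small sketch. It shows only what rx itself does with (not (any ...)), namely produce a complemented character class that matches exactly one character; `my-not-a' is a hypothetical name used for this example.

```elisp
(require 'rx)

;; rx turns (not (any "a")) into the complemented class [^a], anchored
;; here to the whole string with `bos' and `eos'.
(defvar my-not-a (rx bos (not (any "a")) eos))

;; Exactly one character other than "a" matches.
(string-match-p my-not-a "b")

;; Strings of length 0 or 2 do not match the character class, although
;; they would match under a "complement of the language" definition of
;; negation, because neither is matched by (any "a").
(string-match-p my-not-a "")
(string-match-p my-not-a "bb")
```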

> Another implementation leak: since `any' is also an alias for `not-newline',
> `char' and `in' are as well:
> (rx any char in) => "..."

But XR doesn't have to abuse this, luckily.


        Stefan



Re: [ELPA] New package: xr

Stefan Monnier
In reply to this post by Van L
> <longer form for all options>
> For readability in the worst case use case where an end-user/person
> operates the patterns fewer than five times a year, year after year,
> in an ETL scenario.

While I personally prefer short forms, there is definitely something to
be said for the longer forms.  And you could argue that people who
like it terse know where to find it.


        Stefan



Re: [ELPA] New package: xr

Michael Heerdegen
In reply to this post by Mattias Engdegård-2
Mattias Engdegård <[hidden email]> writes:

> Good defaults are just as important. I thought that one-or-more would
> be more descriptive than + or 1+ for those not familiar with the rx
> notation, but perhaps this was a mistake.

Not a mistake, just not what everybody wants.  Dunno what the majority
wants.  But I think as a default it could be right, because it's
something that at least everybody understands.

> Which would you prefer as default symbol from each of these sets?

I didn't bother with regexps very often, so count my opinion as the
opinion of some random user.

>  one-or-more 1+ +
>  zero-or-more 0+ *
>  zero-or-one optional opt ?

+, *, ?, because they are short and what I'm used to from string regexps.

>  repeat **  (for lower-upper-bounded repetition)
>  repeat =   (for exact-count repetition)
>  any char in
>  not-newline nonl
>  group submatch

Don't have preferences yet here.

>  line-start bol
>  line-end eol
>  string-start buffer-start bos bot
>  string-end buffer-end eos eot
>  word-start bow
>  word-end eow

bol, eol, bos, eos, bow, eow: short and easy to remember.

>  sequence seq and :

I guess seq.

>  or |

|.

>  regexp regex

I always look up which is the right one.  Oh - they both work?  Cool.


Michael.
