bug#26338: 26.0.50; Collect all matches for REGEXP in current buffer


bug#26338: 26.0.50; Collect all matches for REGEXP in current buffer

Tino Calancha-2
X-Debbugs-CC: Juri Linkov <[hidden email]>
Severity: wishlist

Hi,

we have `count-matches' in replace.el, which returns the
number of matches of a regexp.  Why not have a standard
function `collect-matches' as well?

I know about `xref-collect-matches', but it uses the grep program: some users
might not have grep installed, or they may prefer to use Emacs regexps.

I've been using something similar to the patch below for a while.
It probably doesn't need to be a command, just a normal function.

What do you think?
Regards,
Tino
--8<-----------------------------cut here---------------start------------->8---
commit ccc78b19aa044f6bdb27875937320ed06c2b517a
Author: Tino Calancha <[hidden email]>
Date:   Sun Apr 2 21:37:19 2017 +0900

    collect-matches: New command
   
    Collect all matches for REGEXP in current buffer (Bug#26338).
    * lisp/replace.el (collect-matches): New command.

diff --git a/lisp/replace.el b/lisp/replace.el
index a7b8ae6a34..6f2c6c9a2b 100644
--- a/lisp/replace.el
+++ b/lisp/replace.el
@@ -1002,6 +1002,44 @@ how-many
  (if (= count 1) "" "s")))
       count)))
 
+(defun collect-matches (regexp &optional region group limit)
+  "Collect matches for REGEXP following point.
+Optional arg REGION, if non-nil, means restrict the search to the
+ specified region.  Otherwise search from point to the end of the buffer.
+ REGION must be a list of (START . END) positions as returned by
+ `region-bounds'.
+Interactively, in Transient Mark mode when the mark is active, operate
+ on the contents of the region.  Otherwise, operate from point to the
+ end of (the accessible portion of) the buffer.
+Optional GROUP, if non-nil, is the regexp group to save.  Otherwise,
+ save the whole match.  Interactively, a numeric prefix sets GROUP.
+Optional LIMIT, if non-nil, means stop after that many matches.
+ Otherwise collect all of them."
+  (interactive
+   (list (read-regexp "Collect matches for regexp: ")
+         (and (use-region-p) (region-bounds))
+         (if current-prefix-arg (prefix-numeric-value current-prefix-arg) 0)
+         nil))
+  (unless group (setq group 0))
+  (let* ((count 0)
+         (start (if region (max (caar region) (point-min)) (point)))
+         (end (if region (min (cdar region) (point-max)) (point-max)))
+         res)
+    (save-excursion
+      (goto-char start)
+      (catch '--collect-matches-end
+        (while (re-search-forward regexp nil t)
+          (unless (>= (point) end)
+            (push (match-string-no-properties group) res)
+            (cl-incf count))
+          (when (or (>= (point) end)
+                    (and (natnump limit) (>= count limit)))
+            (throw '--collect-matches-end nil)))))
+    (message "%d Match%s: %s"
+             count (if (= count 1) "" "es")
+             (mapconcat 'identity (setq res (nreverse res)) " "))
+    res))
+
 
 (defvar occur-menu-map
   (let ((map (make-sparse-keymap)))
--8<-----------------------------cut here---------------end--------------->8---
In GNU Emacs 26.0.50 (build 1, x86_64-pc-linux-gnu, GTK+ Version 3.22.9)
 of 2017-04-02
Repository revision: afabe53b562675b6279cc670ceba32357fac2214




bug#26338: 26.0.50; Collect all matches for REGEXP in current buffer

Dmitry Gutov
On 02.04.2017 15:41, Tino Calancha wrote:

> we have `count-matches' in replace.el, which returns the
> number of matches of a regexp.  Why not to have an standard
> function `collect-matches' as well?
>
> I know `xref-collect-matches' but it uses grep program: some users might
> not have grep installed, or they may prefer to use Emacs regexps.
>
> I've being using for a while something similar than the patch below.
> Probably it doesn't need to be a command, just a normal function.
>
> What do you think?
When used interactively, isn't M-x occur doing something like this?

And for Elisp programs, (while (re-search-forward ...)) is usually
sufficient. That's a three-liner at worst.

And I've never had a need to limit the number of matches, personally.
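
For reference, the `(while (re-search-forward ...))' idiom mentioned above
can be sketched roughly as follows.  The name `my-collect-matches' is
hypothetical and is not part of Emacs; the sketch also assumes that REGEXP
never matches the empty string, since an empty match would not advance point.

```elisp
;; Hypothetical helper (not part of Emacs): the plain search loop,
;; wrapped in a function that returns the collected matches.
(defun my-collect-matches (regexp &optional group)
  "Return a list of the matches for REGEXP after point, in buffer order.
GROUP, if non-nil, selects a subexpression; the default is the whole match.
Assumes REGEXP does not match the empty string."
  (let (matches)
    (save-excursion
      (while (re-search-forward regexp nil t)
        (push (match-string-no-properties (or group 0)) matches)))
    ;; `push' builds the list backwards, so restore buffer order.
    (nreverse matches)))
```

The loop itself is the three lines in the middle; the rest only restores
point and puts the result in buffer order.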




bug#26338: 26.0.50; Collect all matches for REGEXP in current buffer

Juri Linkov-2
In reply to this post by Tino Calancha-2
> we have `count-matches' in replace.el, which returns the
> number of matches of a regexp.  Why not to have an standard
> function `collect-matches' as well?
>
> I know `xref-collect-matches' but it uses grep program: some users might
> not have grep installed, or they may prefer to use Emacs regexps.
>
> I've being using for a while something similar than the patch below.
> Probably it doesn't need to be a command, just a normal function.
>
> What do you think?

But there is already the occur-collect feature implemented in occur-1
and occur-read-primary-args.  Why would we need a separate command?




bug#26338: 26.0.50; Collect all matches for REGEXP in current buffer

Tino Calancha-2
In reply to this post by Dmitry Gutov


On Sun, 2 Apr 2017, Dmitry Gutov wrote:

> On 02.04.2017 15:41, Tino Calancha wrote:
>
>> we have `count-matches' in replace.el, which returns the
>> number of matches of a regexp.  Why not to have an standard
>> function `collect-matches' as well?
>>
>> I know `xref-collect-matches' but it uses grep program: some users might
>> not have grep installed, or they may prefer to use Emacs regexps.
>>
>> I've being using for a while something similar than the patch below.
>> Probably it doesn't need to be a command, just a normal function.
>>
>> What do you think?
> When used interactively, isn't M-x occur doing something like this?
>
> And for Elisp programs, (while (re-search-forward ...)) is usually
> sufficient. That's a three-liner at worst.
The same might be argued for occur.  You can just increment a counter
inside (while (re-search-forward ...))

> And I've never had a need to limit the number of matches, personally.
I often did while implementing Bug#25493.  Let's say I am interested in
the last 200 commits modifying a file foo.el.
M-x find-library RET foo RET
C-x v l
M-: (setq hashes (collect-matches "^commit \\([[:xdigit:]]+\\)" nil 1 200))

In this case there is no need to go beyond 200 matches; that's why the LIMIT
argument might be useful.

Another example:
let's say I want to know the first two defuns in subr.el
M-x find-library RET subr RET
M-: (collect-matches "^(defun \\([^[:blank:]]+\\)" nil 1 2) RET

Of course you could do:
M-: (seq-take (collect-matches "^(defun \\([^[:blank:]]+\\)" nil 1) 2) RET
;; But if you just want the first 2 defuns this is wasteful.




bug#26338: 26.0.50; Collect all matches for REGEXP in current buffer

Tino Calancha-2
In reply to this post by Juri Linkov-2


On Mon, 3 Apr 2017, Juri Linkov wrote:

>> we have `count-matches' in replace.el, which returns the
>> number of matches of a regexp.  Why not to have an standard
>> function `collect-matches' as well?
>>
>> I know `xref-collect-matches' but it uses grep program: some users might
>> not have grep installed, or they may prefer to use Emacs regexps.
>>
>> I've being using for a while something similar than the patch below.
>> Probably it doesn't need to be a command, just a normal function.
>>
>> What do you think?
>
> But there is already the occur-collect feature implemented in occur-1
> and occur-read-primary-args.  Why would we need a separate command?
Sorry, I don't know about `occur-collect'; I cannot find its definition.
It doesn't seem to be a defun in replace.el.
See my previous e-mail and let me know if `occur-collect' can serve
that purpose.




bug#26338: 26.0.50; Collect all matches for REGEXP in current buffer

Tino Calancha-2
In reply to this post by Juri Linkov-2
Juri Linkov <[hidden email]> writes:

>> we have `count-matches' in replace.el, which returns the
>> number of matches of a regexp.  Why not to have an standard
>> function `collect-matches' as well?
>>
>> I know `xref-collect-matches' but it uses grep program: some users might
>> not have grep installed, or they may prefer to use Emacs regexps.
>>
>> I've being using for a while something similar than the patch below.
>> Probably it doesn't need to be a command, just a normal function.
>>
>> What do you think?
>
> But there is already the occur-collect feature implemented in occur-1
> and occur-read-primary-args.  Why would we need a separate command?
Indeed, I don't think we need a new command for this.  I am thinking more
of a standard function.
The following:
(occur "defun\\s +\\(\\S +\\)" "\\1")

doesn't return the collected things.  It writes the matches into the *Occur*
buffer.  Then, if you want a list of the matches, you must loop
again inside *Occur*, which is sub-optimal.
To me, it makes sense to have an `occur-collect' which just returns the
list of matches.
Then we might use such a function in the implementation of occur-1,
which could give a cleaner implementation.
We might also get a LIMIT argument for occur, which might come
in handy for multi-occur with lots of input buffers (just an idea).




bug#26338: 26.0.50; Collect all matches for REGEXP in current buffer

Juri Linkov-2
>>> What do you think?
>>
>> But there is already the occur-collect feature implemented in occur-1
>> and occur-read-primary-args.  Why would we need a separate command?
> Indeed i don't think we need a new command for this.  I am thinking more
> in an standard function.
> Following:
> (occur "defun\\s +\\(\\S +\\)" "\\1")
>
> doesn't return the collected things.  It writes the matches in *Occur*
> buffer.  Then, if you want a list with the matches you must loop
> again inside *Occur* which is sub-optimal.
> For me, it has sense to have a `occur-collect' which just returns the
> list of matches.
> Then, we might use such function in the implementation of occur-1
> which could bring a cleaner implementation.
> We might get also the LIMIT argument for occur which might come
> in handy for multi-occur with lot of input buffers (just an idea).

occur-collect is intended for interactive use.  As for programmatic use,
Dmitry is right: a universal idiom is (while (re-search-forward ...)).
This is why, e.g., the docstring of ‘replace-regexp’ recommends using
an explicit loop like (while (re-search-forward ...) (replace-match ...))
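
A minimal sketch of that explicit replacement loop, for illustration only.
The function name `my-replace-all' is hypothetical; passing t as the
FIXEDCASE argument of `replace-match' is a choice made here so that the
replacement text is inserted without case conversion.

```elisp
;; Hypothetical helper (not part of Emacs) showing the loop that the
;; `replace-regexp' docstring suggests for use in Lisp programs.
(defun my-replace-all (regexp replacement)
  "Replace each match for REGEXP after point with REPLACEMENT.
Return the number of replacements made.
Assumes REGEXP does not match the empty string."
  (let ((count 0))
    (save-excursion
      (while (re-search-forward regexp nil t)
        ;; FIXEDCASE is t: insert REPLACEMENT as given.
        (replace-match replacement t)
        (setq count (1+ count))))
    count))
```

Because `replace-match' leaves point after the inserted text, the next
search continues past the replacement and the loop terminates.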




bug#26338: 26.0.50; Collect all matches for REGEXP in current buffer

Tino Calancha-2
Juri Linkov <[hidden email]> writes:

>>>> What do you think?
>>>
>>> But there is already the occur-collect feature implemented in occur-1
>>> and occur-read-primary-args.  Why would we need a separate command?
>> Indeed i don't think we need a new command for this.  I am thinking more
>> in an standard function.
>> Following:
>> (occur "defun\\s +\\(\\S +\\)" "\\1")
>>
>> doesn't return the collected things.  It writes the matches in *Occur*
>> buffer.  Then, if you want a list with the matches you must loop
>> again inside *Occur* which is sub-optimal.
>> For me, it has sense to have a `occur-collect' which just returns the
>> list of matches.
>> Then, we might use such function in the implementation of occur-1
>> which could bring a cleaner implementation.
>> We might get also the LIMIT argument for occur which might come
>> in handy for multi-occur with lot of input buffers (just an idea).
>
> occur-collect is intended for interactive use.  As for programmatic use,
> Dmitry is right: a universal idiom is (while (re-search-forward ...)).
> This is why e.g. the docstring of ‘replace-regexp’ recommends to use
> an explicit loop like (while (re-search-forward ...) (replace-match ...))
OK, thanks.  Let me make one last proposal before I go back to my dark
cave and start painting animals on the walls.

Any interest in something like this?

(defmacro with-collect-matches (regexp &optional group &rest body)
  "Collect matches for REGEXP and eval BODY for each match.
BODY is evaluated with `it' bound to the match.
Optional GROUP if non-nil, then is the regexp group to save.  Otherwise,
save the whole match."




bug#26338: 26.0.50; Collect all matches for REGEXP in current buffer

Tino Calancha-2
Tino Calancha <[hidden email]> writes:

> Juri Linkov <[hidden email]> writes:
>> occur-collect is intended for interactive use.  As for programmatic use,
>> Dmitry is right: a universal idiom is (while (re-search-forward ...)).
>> This is why e.g. the docstring of ‘replace-regexp’ recommends to use
>> an explicit loop like (while (re-search-forward ...) (replace-match ...))
> OK thanks.  Let me ask you my last proposal before come back to my dark
> cave and start painting animals in the walls.
>
> Any interest in something like this?:
>
> (defmacro with-collect-matches (regexp &optional group &rest body)
>   "Collect matches for REGEXP and eval BODY for each match.
> BODY is evaluated with `it' bound to the match.
> Optional GROUP if non-nil, then is the regexp group to save.  Otherwise,
> save the whole match."
Sorry, I was painting a mammoth and I forgot something in the docstring:

--8<-----------------------------cut here---------------start------------->8---
(defmacro with-collect-matches (regexp &optional group &rest body)
  "Collect matches for REGEXP and eval BODY for each match.
BODY is evaluated with `it' bound to the match.
Optional GROUP, if non-nil, is the regexp group to save.  Otherwise,
save the whole match.
Return a list with the matches."
--8<-----------------------------cut here---------------end--------------->8---

So, for instance:
M-x find-library replace RET
M-: (length (with-collect-matches "^(defun \\(\\S +\\)" 1)) RET
=> 52

M-x find-library replace RET
M-: (length (with-collect-matches "^(defun \\(\\S +\\)" 1
              (with-current-buffer (get-buffer-create "*Matches*")
                (when (string-match "\\`query-" it)
                  (insert (format "%s\n" it))))))
=> 52
;; Same return as before but only write into *Matches* those
;; functions with name starting with "query-".
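
The message gives only the signature and docstring of the macro.  One way
its body might look is sketched below; this is an assumption, not the
author's implementation, and it is named `my-with-collect-matches' to make
that clear.  It searches from point and assumes REGEXP never matches the
empty string.

```elisp
;; Hypothetical sketch of the proposed macro; the thread does not show
;; the real body.  `it' is bound anaphorically, as in the docstring.
(defmacro my-with-collect-matches (regexp &optional group &rest body)
  "Collect matches for REGEXP after point and eval BODY for each match.
BODY is evaluated with `it' bound to the match.
GROUP, if non-nil, is the regexp group to save; otherwise the whole match.
Return a list with the matches."
  (declare (indent 2))
  (let ((re (make-symbol "regexp"))
        (acc (make-symbol "matches")))
    `(let ((,re ,regexp)                ; evaluate REGEXP only once
           ,acc)
       (save-excursion
         (while (re-search-forward ,re nil t)
           (let ((it (match-string-no-properties ,(or group 0))))
             (push it ,acc)
             ,@body)))
       (nreverse ,acc))))
```

The match is extracted before BODY runs, so BODY may call `string-match'
(as in the example above) without affecting what is collected.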




bug#26338: 26.0.50; Collect all matches for REGEXP in current buffer

Marcin Borkowski-3
In reply to this post by Tino Calancha-2

On 2017-04-04, at 03:37, Tino Calancha <[hidden email]> wrote:

> Juri Linkov <[hidden email]> writes:
>
>>>>> What do you think?
>>>>
>>>> But there is already the occur-collect feature implemented in occur-1
>>>> and occur-read-primary-args.  Why would we need a separate command?
>>> Indeed i don't think we need a new command for this.  I am thinking more
>>> in an standard function.
>>> Following:
>>> (occur "defun\\s +\\(\\S +\\)" "\\1")
>>>
>>> doesn't return the collected things.  It writes the matches in *Occur*
>>> buffer.  Then, if you want a list with the matches you must loop
>>> again inside *Occur* which is sub-optimal.
>>> For me, it has sense to have a `occur-collect' which just returns the
>>> list of matches.
>>> Then, we might use such function in the implementation of occur-1
>>> which could bring a cleaner implementation.
>>> We might get also the LIMIT argument for occur which might come
>>> in handy for multi-occur with lot of input buffers (just an idea).
>>
>> occur-collect is intended for interactive use.  As for programmatic use,
>> Dmitry is right: a universal idiom is (while (re-search-forward ...)).
>> This is why e.g. the docstring of ‘replace-regexp’ recommends to use
>> an explicit loop like (while (re-search-forward ...) (replace-match ...))
> OK thanks.  Let me ask you my last proposal before come back to my dark
> cave and start painting animals in the walls.
>
> Any interest in something like this?:
>
> (defmacro with-collect-matches (regexp &optional group &rest body)
>   "Collect matches for REGEXP and eval BODY for each match.
> BODY is evaluated with `it' bound to the match.
> Optional GROUP if non-nil, then is the regexp group to save.  Otherwise,
> save the whole match."

Sorry if this was said already, but why a macro and not a map-like
function?

Best,

--
Marcin Borkowski




bug#26338: 26.0.50; Collect all matches for REGEXP in current buffer

Tino Calancha-2


On Tue, 4 Apr 2017, Marcin Borkowski wrote:

>> Any interest in something like this?:
>>
>> (defmacro with-collect-matches (regexp &optional group &rest body)
>>   "Collect matches for REGEXP and eval BODY for each match.
>> BODY is evaluated with `it' bound to the match.
>> Optional GROUP if non-nil, then is the regexp group to save.  Otherwise,
>> save the whole match."
>
> Sorry if this was said already, but why a macro and not a map-like
> function?
No special reason.  It's the second idea that came to my mind after
my initial proposal was declined.  Maybe because it is shorter to write:
(with-collect-matches regexp)
than
(foo-collect-matches regexp nil #'identity)

if you are just interested in the list of matches.  Implementing it as
a map function might also be nice.  I don't see much enthusiasm for
the proposal, though :-(
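
A map-like variant along the lines of the `foo-collect-matches' call above
might look like this.  The name `my-map-matches' is hypothetical; the
argument order (REGEXP GROUP FUNCTION) is taken from that call, and the
sketch assumes REGEXP never matches the empty string.

```elisp
;; Hypothetical map-like function (not part of Emacs).
(defun my-map-matches (regexp group function)
  "Call FUNCTION on each match for REGEXP after point.
GROUP, if non-nil, is the regexp group passed to FUNCTION; otherwise
the whole match is passed.  Return the list of FUNCTION's results."
  (let (results)
    (save-excursion
      (while (re-search-forward regexp nil t)
        (let ((match (match-string-no-properties (or group 0))))
          (push (funcall function match) results))))
    (nreverse results)))
```

With `#'identity' as FUNCTION this returns the plain list of matches, which
is the case the macro form above makes shorter to write.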

So far people think that it's easy to write a while loop.  I wonder
if they think the same about the existence of `dolist': that they should
never use it and always write a `while' loop instead.  I don't think they
do that anyway.

I will repeat it once more.  I find it nice to have an operator returning a
list of matches for REGEXP.  If such an operator, in addition, accepts a
body of code or a function, then I find this operator very nice
and elegant.

Regards,
Tino




bug#26338: 26.0.50; Collect all matches for REGEXP in current buffer

Noam Postavsky-2
Tino Calancha <[hidden email]> writes:

>
> So far people think that it's easy to write a while loop.  I wonder if
> they think the same about the existence of `dolist': the should
> never use it and always write a `while' loop instead.  Don't think they
> do that anyway.

Perhaps a macro that loops over matches?

    (defmacro domatches (spec &rest body)
      "Loop over matches to REGEXP.

      \(fn (MATCH-VAR [GROUP] REGEXP [BOUND]) BODY...)")

Or an addition to cl-loop that would allow doing something like

    (cl-loop for m being the matches of "foo\\|bar"
             do ...)

Then you could easily 'collect m' to get the list of matches if you want
that.
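
For illustration, a `dolist'-style macro of this kind could be sketched as
below.  This is not the implementation from the thread (none is given), and
it simplifies the proposed SPEC to (VAR REGEXP [GROUP [BOUND]]) so that the
optional parts come last and are unambiguous; the name `my-domatches' is
hypothetical.

```elisp
;; Hypothetical sketch of a loop-over-matches macro (not part of Emacs).
(defmacro my-domatches (spec &rest body)
  "Loop over matches for a regexp after point.
Evaluate BODY with VAR bound to each match in turn.
GROUP, if given, selects a subexpression; BOUND limits the search.
Assumes REGEXP does not match the empty string.  Return nil.

\(fn (VAR REGEXP [GROUP [BOUND]]) BODY...)"
  (declare (indent 1))
  (let ((var (nth 0 spec))
        (group (or (nth 2 spec) 0))
        (re (make-symbol "regexp"))
        (lim (make-symbol "bound")))
    `(let ((,re ,(nth 1 spec))          ; evaluate REGEXP once
           (,lim ,(nth 3 spec)))        ; nil means no bound
       (save-excursion
         (while (re-search-forward ,re ,lim t)
           (let ((,var (match-string-no-properties ,group)))
             ,@body))))))
```

Unlike the collecting proposals, this form returns nil, in the same way as
`dolist' without a result form; a caller who wants a list pushes onto one
in BODY.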

> I will repeat it once more.  I find nice, having an operator returning
> a list with matches for REGEXP.

I don't think that's come up for me very much, if at all.  It seems
easier to just operate on the matches directly rather than collecting
and then mapping.

> If such operator, in addition,
> accepts a body of code or a function, then i find this operator very
> nice
> and elegant.

Forcing collection on the looping operator seems inelegant to me.




bug#26338: 26.0.50; Collect all matches for REGEXP in current buffer

Juri Linkov-2
In reply to this post by Tino Calancha-2
>> Sorry if this was said already, but why a macro and not a map-like
>> function?
> No special reason.  It's the second idea which came to my mind after
> my initial proposal was declined.  Maybe because is shorter to do:
> (with-collect-matches regexp)
> than
> (foo-collect-matches regexp nil #'identity)
>
> if you are just interested in the list of matches.  Implementing it as
> a map function might be also nice.  Don't see a big enthusiasm on
> the proposal, though :-(
>
> So far people think that it's easy to write a while loop.  I wonder if they
> think the same about the existence of `dolist': the should
> never use it and always write a `while' loop instead.  Don't think they
> do that anyway.
>
> I will repeat it once more.  I find nice, having an operator returning
> a list with matches for REGEXP.  If such operator, in addition, accepts
> a body of code or a function, then i find this operator very nice
> and elegant.

A mapcar-like function presumes a lambda where you can process every
match as you need, but going this way you'd have a temptation to
implement an analogous API from other programming languages like e.g.
https://apidock.com/ruby/String/scan
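
To make the comparison concrete, a rough Emacs Lisp analogue of such a
scan over a string (rather than a buffer) might look like this.  The name
`my-string-scan' is hypothetical, and the GROUP argument is a simplification:
Ruby's method returns all groups of each match, which this sketch does not
attempt.

```elisp
;; Hypothetical string-scanning helper (not part of Emacs).
(defun my-string-scan (regexp string &optional group)
  "Return all non-overlapping matches for REGEXP in STRING, left to right.
GROUP, if non-nil, selects a subexpression; the default is the whole match."
  (let ((start 0)
        matches)
    (while (and (<= start (length string))
                (string-match regexp string start))
      (push (match-string (or group 0) string) matches)
      ;; Step past an empty match so that the loop always advances.
      (setq start (if (= (match-beginning 0) (match-end 0))
                      (1+ (match-end 0))
                    (match-end 0))))
    (nreverse matches)))
```

Working on a string avoids any dependence on point or the current buffer,
at the cost of copying the text out of the buffer first.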




bug#26338: 26.0.50; Collect all matches for REGEXP in current buffer

Tino Calancha-2
In reply to this post by Noam Postavsky-2


On Wed, 5 Apr 2017, [hidden email] wrote:

> Tino Calancha <[hidden email]> writes:
>
>>
>> So far people think that it's easy to write a while loop.  I wonder if
>> they think the same about the existence of `dolist': the should
>> never use it and always write a `while' loop instead.  Don't think they
>> do that anyway.
>
> Perhaps a macro that loops over matches?
>
>    (defmacro domatches (spec &rest body)
>      "Loop over matches to REGEXP.
>
>      \(fn (MATCH-VAR [GROUP] REGEXP [BOUND]) BODY...)")
>
> Or an addition to cl-loop that would allow doing something like
>
>    (cl-loop for m being the matches of "foo\\|bar"
>             do ...)
>
> Then you could easily 'collect m' to get the list of matches if you want
> that.
Your proposals look nice to me ;-)
>
>> I will repeat it once more.  I find nice, having an operator returning
>> a list with matches for REGEXP.
>
> I don't think that's come up for me very much, if at all.  It seems
> easier to just operate on the matches directly rather than collecting
> and then mapping.
Sometimes I want to collect matches for different purposes, such as feeding
them into other functions that accept a list.  That's why I miss a standard
operator for collecting matches.  Sure, it can be done with a `while' loop in
3-5 lines.  With the operator it would be just one function call.

>> If such operator, in addition,
>> accepts a body of code or a function, then i find this operator very
>> nice
>> and elegant.
>
> Forcing collection on the looping operator seems inelegant to me.
You know, beauty is in the eye of the beholder.  Elegance too.  Maybe
you don't like the blue jersey I am wearing now; my mum made it
for me and I love it ;-)
I suppose it depends on the name of the operator.  It is no
surprise if `collect-matches' collects matches; it is a bit of a surprise if
`domatches' does such a thing.
Thank you for your opinion :-)




bug#26338: 26.0.50; Collect all matches for REGEXP in current buffer

Drew Adams
> > Or an addition to cl-loop that would allow doing something like
> >
> >    (cl-loop for m being the matches of "foo\\|bar"
> >             do ...)
> >
> > Then you could easily 'collect m' to get the list of matches if you want
> > that.
>
> Your proposals looks nice to me ;-)

(Caveat: I have not been following this thread.)

I think that `cl-loop' should be as close to Common Lisp `loop'
as we can reasonably make it.  We should _not_ be adding other
features to it or changing its behavior away from what it is
supposedly emulating.

If you want, create a _different_ macro that is Emacs-specific,
with whatever behavior you want.  Call it whatever you want
that will not be confused with Common Lisp emulation.

Please keep `cl-' for Common Lisp emulation.  We've already
seen more than enough tampering with this - people adding
their favorite thing to the `cl-' namespace.  Not good.




bug#26338: 26.0.50; Collect all matches for REGEXP in current buffer

Tino Calancha-2
In reply to this post by Juri Linkov-2
Juri Linkov <[hidden email]> writes:

>>> Sorry if this was said already, but why a macro and not a map-like
>>> function?
>> No special reason.  It's the second idea which came to my mind after
>> my initial proposal was declined.  Maybe because is shorter to do:
>> (with-collect-matches regexp)
>> than
>> (foo-collect-matches regexp nil #'identity)
>>
>> if you are just interested in the list of matches.  Implementing it as
>> a map function might be also nice.  Don't see a big enthusiasm on
>> the proposal, though :-(
>>
>> So far people think that it's easy to write a while loop.  I wonder if they
>> think the same about the existence of `dolist': the should
>> never use it and always write a `while' loop instead.  Don't think they
>> do that anyway.
>>
>> I will repeat it once more.  I find nice, having an operator returning
>> a list with matches for REGEXP.  If such operator, in addition, accepts
>> a body of code or a function, then i find this operator very nice
>> and elegant.
>
> A mapcar-like function presumes a lambda where you can process every
> match as you need, but going this way you'd have a temptation to
> implement an analogous API from other programming languages like e.g.
> https://apidock.com/ruby/String/scan
I am not crazy about the mapcar-like implementation either.
Actually, I have changed my mind after Noam's nice suggestion.  He
mentioned the possibility of extending `cl-loop' with a new clause to iterate
over the matches for a regexp.
I think this clause fits well in cl-loop; this way we don't need to
introduce a new function/macro name.


--8<-----------------------------cut here---------------start------------->8---
commit 59e66771d13fce73ff5220ce3df677b9247c9c52
Author: Tino Calancha <[hidden email]>
Date:   Fri Apr 7 23:31:08 2017 +0900

    New clause in cl-loop to iterate in the matches of a regexp
   
    Add new clause in cl-loop facility to loop over the matches for
    REGEXP in the current buffer (Bug#26338).
    * lisp/emacs-lisp/cl-macs.el (cl--parse-loop-clause): Add new clause.
    (cl-loop): update docstring.
    * doc/misc/cl.texi (For Clauses): Document the new clause.
    * etc/NEWS: Mention this change.

diff --git a/doc/misc/cl.texi b/doc/misc/cl.texi
index 2339d57631..6c5c43ad09 100644
--- a/doc/misc/cl.texi
+++ b/doc/misc/cl.texi
@@ -2030,6 +2030,21 @@ For Clauses
 This clause iterates over a sequence, with @var{var} a @code{setf}-able
 reference onto the elements; see @code{in-ref} above.
 
+@item for @var{var} being the matches of @var{regexp}
+This clause iterates over the matches for @var{regexp} in the current buffer.
+By default, @var{var} is bound to the full match.  Optionally, @var{var}
+might be bound to a subpart of the match.  It's also possible to restrict
+the loop to a given number of matches.  For example,
+
+@example
+(cl-loop for x being the matches of "^(defun \\(\\S +\\)"
+         using '(group 1 limit 10)
+         collect x)
+@end example
+
+@noindent
+collects the next 10 function names after point.
+
 @item for @var{var} being the symbols [of @var{obarray}]
 This clause iterates over symbols, either over all interned symbols
 or over all symbols in @var{obarray}.  The loop is executed with
diff --git a/etc/NEWS b/etc/NEWS
index aaca229d5c..03f6ecb88b 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -862,6 +862,10 @@ instead of its first.
 * Lisp Changes in Emacs 26.1
 
 +++
+** New clause in cl-loop to iterate in the matches for a regexp
+in the current buffer.
+
++++
 ** Emacs now supports records for user-defined types, via the new
 functions 'copy-record', 'make-record', 'record', and 'recordp'.
 Records are now used internally to represent cl-defstruct and defclass
diff --git a/lisp/emacs-lisp/cl-macs.el b/lisp/emacs-lisp/cl-macs.el
index 25c9f99992..50596c066e 100644
--- a/lisp/emacs-lisp/cl-macs.el
+++ b/lisp/emacs-lisp/cl-macs.el
@@ -892,6 +892,7 @@ cl-loop
       the overlays/intervals [of BUFFER] [from POS1] [to POS2]
       the frames/buffers
       the windows [of FRAME]
+      the matches of/for REGEXP [using (group GROUP [limit LIMIT])]
   Iteration clauses:
     repeat INTEGER
     while/until/always/never/thereis CONDITION
@@ -1339,6 +1340,33 @@ cl--parse-loop-clause
   (push (list temp-idx `(1+ ,temp-idx))
  loop-for-steps)))
 
+               ((memq word '(match matches))
+ (let* ((_ (or (and (not (memq (car cl--loop-args) '(of for)))
+    (error "Expected `of'"))))
+      (regexp (cl--pop2 cl--loop-args))
+                      (group-limit
+                       (and (eq (car cl--loop-args) 'using)
+                            (consp (cadr cl--loop-args))
+                            (>= (length (cadr cl--loop-args)) 2)
+                            (cadr (cl--pop2 cl--loop-args))))
+                      (group
+                       (or (and group-limit
+                                (cl-find 'group group-limit)
+                                (nth (1+ (cl-position 'group group-limit)) group-limit))
+                           0))
+                      (limit
+                       (and group-limit
+                            (cl-find 'limit group-limit)
+                            (nth (1+ (cl-position 'limit group-limit)) group-limit)))
+                      (count (make-symbol "--cl-count")))
+                  (push (list count 0) loop-for-bindings)
+                  (push (list var nil) loop-for-bindings)
+                  (push `(re-search-forward ,regexp nil t) cl--loop-body)
+                  (push `(or (null ,limit) (and (natnump ,limit) (< ,count ,limit))) cl--loop-body)
+                  (push (list count `(1+ ,count)) loop-for-sets)
+                  (push (list var `(match-string-no-properties ,group))
+                        loop-for-sets)))
+
        ((memq word hash-types)
  (or (memq (car cl--loop-args) '(in of))
                     (error "Expected `of'"))
--8<-----------------------------cut here---------------end--------------->8---
In GNU Emacs 26.0.50 (build 7, x86_64-pc-linux-gnu, GTK+ Version 3.22.11)
 of 2017-04-07
Repository revision: 67aeaa74af8504f950f653136d749c6dd03a60de




bug#26338: 26.0.50; Collect all matches for REGEXP in current buffer

Noam Postavsky-2
On Fri, Apr 7, 2017 at 10:47 AM, Tino Calancha <[hidden email]> wrote:
> +
> +@example
> +(cl-loop for x being the matches of "^(defun \\(\\S +\\)"
> +         using '(group 1 limit 10)
> +         collect x)
> +@end example

You can reuse the existing 'repeat N' clause instead of 'using (limit N)'.

(cl-loop for x being the matches of "^(defun \\(\\S +\\)" using (group 1)
         repeat 10
         collect x)

Regarding Drew's concerns about extending cl-loop with more non-Common
Lisp things, I just don't see that as a problem. I suppose it would be
nice to have a more easily extensible looping macro, like iterate [1].
That would be quite a bit of work though.

[1]: https://common-lisp.net/project/iterate/
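
As a point of comparison, the same "first N matches" result can be obtained
with clauses that `cl-loop' already has, without the proposed `matches'
clause.  The function name `my-first-defuns' below is hypothetical, and the
regexp is the one used in the examples above.

```elisp
(require 'cl-lib)

;; Hypothetical helper using only existing `cl-loop' clauses.
(defun my-first-defuns (n)
  "Return the names of up to N top-level `defun' forms after point."
  (save-excursion
    (cl-loop repeat n
             ;; `repeat' is checked first, so no search runs once N
             ;; matches have been collected.
             while (re-search-forward "^(defun \\(\\S +\\)" nil t)
             collect (match-string-no-properties 1))))
```

Here `repeat' plays the role of the LIMIT argument, and `while' ends the
loop early when the buffer has fewer than N matches.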




bug#26338: 26.0.50; Collect all matches for REGEXP in current buffer

Drew Adams
> Regarding Drew's concerns about extending cl-loop with more non-Common
> Lisp things, I just don't see that as a problem.

It depends on what one considers "a problem".  I think it
is a problem for `cl-', which was intended for Common Lisp
emulation, to become a dumping ground for anyone's idea of
a cool thing to add to Emacs.  That's not what it's for.

We've already had a couple of things unrelated to CL that
were misguidedly added to `cl-'.  We should not continue
that practice (and we really should remove those from the
`cl-' namespace).

There is nothing preventing Emacs from adding any constructs
it wants.  There just is no reason why the `cl-' namespace
(and the `cl*.el' files) should be polluted with stuff that
is not Common Lisp emulation.

A user of `cl-loop' should be able to expect Common Lisp
`loop', or as close to it as we can get.

> I suppose it would be nice to have a more easily extensible
> looping macro, like iterate [1].  That would be quite a bit
> of work though.

As for `iterate': If this is what you mean:
https://common-lisp.net/project/iterate/

then I'm all in favor of it.  I much prefer it to `loop'.
But I don't see anyone stepping forward to add it to Emacs.

Even then, I would probably prefer that we add it to the
`cl-' namespace and stay as close as possible to emulating
the Common Lisp `iterate' (no, it is not part of the CL
language, but yes, it is something developed for/with CL).

There are lots of users of CL, and lots of CL code.  Both
should find a simple, straightforward path to Emacs.  We
should minimize any differences between Emacs emulations
and the things being emulated.

But again, nothing prevents Emacs adding a different
construct that does exactly what you want, with all the
bells and whistles you think are improvements over `loop'
or `iterate' or whatever.

That should not be in the `cl-' namespace, and we should
not confuse users by passing it off as (even a partial)
emulation of a Common Lisp construct.  That's all.

Just one opinion.




bug#26338: 26.0.50; Collect all matches for REGEXP in current buffer

Tino Calancha-2
In reply to this post by Drew Adams


On Fri, 7 Apr 2017, Drew Adams wrote:

>>> Or an addition to cl-loop that would allow doing something like
>>>
>>>    (cl-loop for m being the matches of "foo\\|bar"
>>>             do ...)
>>>
>>> Then you could easily 'collect m' to get the list of matches if you want
>>> that.
>>
>> Your proposals look nice to me ;-)
>
> (Caveat: I have not been following this thread.)
>
> I think that `cl-loop' should be as close to Common Lisp `loop'
> as we can reasonably make it.  We should _not_ be adding other
> features to it or changing its behavior away from what it is
> supposedly emulating.
>
> If you want, create a _different_ macro that is Emacs-specific,
> with whatever behavior you want.  Call it whatever you want
> that will not be confused with Common Lisp emulation.
>
> Please keep `cl-' for Common Lisp emulation.  We've already
> seen more than enough tampering with this - people adding
> their favorite thing to the `cl-' namespace.  Not good.
Drew, I respect your opinion, but so far the change
would just extend `cl-loop', which, as you noticed, has already been
extended before.
For instance, we have:
cl-loop for x being the overlays/buffers ...
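
As a concrete illustration of those existing Emacs-specific clauses, here is a small sketch. The scratch buffer and the overlay created in it exist only for the example.

```elisp
(require 'cl-lib)

;; Names of all live buffers, via the Emacs-specific `buffers' clause.
(cl-loop for b being the buffers
         collect (buffer-name b))

;; Start positions of the overlays in a scratch buffer, via the
;; Emacs-specific `overlays' clause.
(with-temp-buffer
  (insert "some text")
  (make-overlay 1 5)
  (cl-loop for ov being the overlays
           collect (overlay-start ov)))
```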

I don't see a problem with having those things.  We already point out in the
manual that these are Emacs-specific things, so nobody should be fooled
by that.  As long as we cover all CL clauses, what problem could there be in
having a few more?

I find it interesting to be able to do things like the following:

--8<-----------------------------cut here---------------start------------->8---
(require 'cl-lib)
(require 'find-lisp)
(let ((op "defun")
      (dir (expand-file-name "lisp" source-directory)))
  (setq funcs
        (cl-loop for f in (find-lisp-find-files dir "\\.el\\'") nconc
                 (with-temp-buffer
                   (insert-file-contents-literally f)
                   (let ((regexp (format "^(%s \\(\\S +\\)" op)))
                     ;; `matches' is the clause proposed in this thread.
                     (cl-loop for x being the matches of regexp
                              using '(group 1) collect x)))))
  (length funcs))

=> 38898 ; op: defun
=> 1256 ; op: defmacro
=> 1542 ; op: defsubst

--8<-----------------------------cut here---------------end--------------->8---




bug#26338: 26.0.50; Collect all matches for REGEXP in current buffer

Drew Adams
> >>> Or an addition to cl-loop that would allow doing something like
> >>>    (cl-loop for m being the matches of "foo\\|bar"
> >>>             do ...)
> >>> Then you could easily 'collect m' to get the list of matches if you want
> >>> that.
> >> Your proposals look nice to me ;-)
> >
> > (Caveat: I have not been following this thread.)
> >
> > I think that `cl-loop' should be as close to Common Lisp `loop'
> > as we can reasonably make it.  We should _not_ be adding other
> > features to it or changing its behavior away from what it is
> > supposedly emulating.
> >
> > If you want, create a _different_ macro that is Emacs-specific,
> > with whatever behavior you want.  Call it whatever you want
> > that will not be confused with Common Lisp emulation.
> >
> > Please keep `cl-' for Common Lisp emulation.  We've already
> > seen more than enough tampering with this - people adding
> > their favorite thing to the `cl-' namespace.  Not good.
>
> Drew, I respect your opinion, but so far the change
> would just extend `cl-loop', which, as you noticed, has already
> been extended before.  For instance, we have:
> cl-loop for x being the overlays/buffers ...
>
> I don't see a problem with having those things.  We already point out in the
> manual that these are Emacs-specific things, so nobody should be fooled
> by that.  As long as we cover all CL clauses, what problem could there be in
> having a few more?

You make a fair point when you stick to only extension and
keep compatibility for the rest.  I still disagree with it,
for the reasons given below.

And because the next enhancement proposal will perhaps just
point to this one as more precedent for making changes,
without bothering to, itself, ensure that "we cover all CL
clauses" etc.  As you pointed out: "so far"...  Little by
little, we've already seen `cl-' diluted from CL by being
incrementally "enhanced".

Here's my general opinion on this kind of thing:

I agree that such things are useful.  I have nothing against
them, and I'm glad to see Emacs have them.  My objection is
to using `cl-loop' for it.

`cl-loop' - and all of the stuff in `cl-*' - should be for
Common Lisp emulation.  Nothing more or less.  That's my
opinion.

Emacs should have its _own_, non-cl-* functions, macros,
variables, whatever.  It can take Common Lisp constructs
as a point of departure or inspiration, and extend, enhance,
limit, or in any way change that point of departure as is
most fitting and useful for Emacs.

That's normal.  I'm all for that kind of thing.  But it
should not be confused with Common Lisp emulation.  `cl-'
should be kept for Common Lisp emulation.

Users should be able to recognize when they are using CL
code (an emulation of it).  Users should be able to take
existing CL code and use it in Emacs with little or no
modification (no, we're not there yet, and never will be
completely, but it's a good goal).

Put all this stuff - and more - into an `eloop' macro.
Since it will be so much better and more Emacsy, with
specifics that are especially useful for Emacs, it is
what users will (eventually) use instead of `cl-loop'.

Since it will do everything that `cl-loop' does (and
more), eventually only the rare user who needs, or for
some reason really wants, code that is CL or close to
it will use `cl-loop'.  Everyone else will use `eloop'.
No problem.

I am sure that my opinion on this is a minority one -
perhaps a minority of one.

But going the other direction, along lines such as what
you suggest:

1. We lose the value of `cl-' as an emulation of CL.  And
   typically we lose compatibility with existing CL code.

2. We lose the ability, when seeing something `cl-', to
   know we can look it up in the (fine) Common Lisp docs.

3. Where does it stop?  What's the point of `cl-', if
   anything goes and we can stuff whatever into it?

What prevents Emacs design from doing the right thing?
What do we lose by putting non-CL stuff into an `eloop'
that extends `cl-loop' in Emacsy ways?

Sure, invent more and better and different.  But put it
in a different namespace or in no namespace.
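
To make the alternative concrete, here is a minimal sketch of a match-iteration construct that lives outside the `cl-' namespace. The macro name `my-do-matches' and its SPEC layout are hypothetical, chosen only for illustration; it is not a proposal for the actual `eloop'.

```elisp
;; -*- lexical-binding: t -*-
(defmacro my-do-matches (spec &rest body)
  "Evaluate BODY once per match of a regexp in the current buffer.
SPEC is (VAR REGEXP [GROUP]); VAR is bound to the text of GROUP
\(default 0, the whole match).  Hypothetical macro, for illustration.
Assumes REGEXP never matches the empty string and that BODY
does not move point."
  (declare (indent 1))
  (let ((re (make-symbol "regexp"))
        (grp (make-symbol "group")))
    `(let ((,re ,(nth 1 spec))
           (,grp ,(or (nth 2 spec) 0)))
       (save-excursion
         (goto-char (point-min))
         (while (re-search-forward ,re nil t)
           (let ((,(car spec) (match-string-no-properties ,grp)))
             ,@body))))))

;; Usage: collect every match, in buffer order.
(with-temp-buffer
  (insert "foo bar foo")
  (let (acc)
    (my-do-matches (m "foo\\|bar")
      (push m acc))
    (nreverse acc)))
;; => ("foo" "bar" "foo")
```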

If `cl-' is just an Emacs thing and no longer a Common
Lisp thing then why the pretense of having a CL manual;
and using a `cl-' namespace; and pointing to the CL docs
for explanation (the Emacs docs explain practically
nothing about its `cl-' constructs - there is really no
doc for `cl-' in Emacs)?

What's the point?  Leave `cl-loop' as Common Lisp's `loop'.
Create a more-and-better, more Emacsy `eloop' or whatever.
Complete freedom - do whatever.  My vote is only that `cl-'
be kept for CL (and even be cleaned up to be more like it).


