bug#37189: 25.4.1: vc-hg-ignore implementation is missing

classic Classic list List threaded Threaded
13 messages Options
Reply | Threaded
Open this post in threaded view
|

bug#37189: 25.4.1: vc-hg-ignore implementation is missing

Wolfgang Scherer
Using `vc-ignore' in a *vc-dir* buffer for mercurial does not provide a correct entry in `.hgignore.'.

The function `vc-hg-ignore'' below autodetects the correct syntax and adds a correctly quoted entry to `.hgignore'.

Feel free to incorporate it in `vc-hg.el'.


(defvar vc-hg--py-regexp-special-chars
  (mapcar
   (function
    (lambda (_c)
      (cons _c (concat "\\" (char-to-string _c)))))
   (append "()[]{}?*+-|^$\\.&~# \t\n\r\v\f" nil))
  "Characters that have special meaning in Python regular expressions.")

(defun vc-hg--py-regexp-quote (string)
  "Return a Python regexp string which matches exactly STRING and nothing else.
Ported from Python v3.7"
  (mapconcat
   (function
    (lambda (_c)
      (or (cdr (assq _c vc-hg--py-regexp-special-chars))
          (char-to-string _c))))
   string ""))

(defvar vc-hg-ignore-detect-wildcard "[*^$]"
  "Regular expresssion to detect wildcards in an ignored file specification.")

(defun vc-hg-ignore (file &optional directory remove)
  "Ignore FILE of DIRECTORY (default is `default-directory').

FILE is a file wildcard, relative to the root directory of DIRECTORY.

If FILE matches the regular expression
`vc-hg-ignore-detect-wildcard', it is appended to .hgignore as
is. Otherwise, FILE is escaped/expanded according to the active
syntax in .hgignore. If the syntax is `regexp', FILE is quoted as
anchored literal Python regexp and if FILE is a directory, the
trailing `$' is omitted.  Otherwise, if the syntax is `glob',
FILE is used unquoted and if FILE is a directory, a `*' is
appended.

When called from Lisp code, if DIRECTORY is non-nil, the
repository to use will be deduced by DIRECTORY; if REMOVE is
non-nil, remove FILE from ignored files."
  (let ((ignore (vc-hg-find-ignore-file (or directory default-directory)))
        (pattern file)
        root-dir file-path syntax)
    (unless (string-match vc-hg-ignore-detect-wildcard pattern)
      (setq root-dir (file-name-directory ignore))
      (setq file-path (expand-file-name file directory))
      (setq pattern (substring file-path (length root-dir)))
      (save-match-data
        (with-current-buffer (find-file-noselect ignore)
          (goto-char (point-max))
          (setq syntax
                (if (re-search-backward "^ *syntax: *\\(regexp\\|glob\\)$" nil t)
                    (match-string 1)
                  "regexp")))
        (setq pattern
              (if (string= syntax "regexp")
                  (concat "^" (vc-hg--py-regexp-quote pattern)
                          (and (not (file-directory-p file-path)) "$"))
                (concat pattern (and (file-directory-p file-path) "*"))))))
    (if remove
        (vc--remove-regexp pattern ignore)
      (vc--add-line pattern ignore))))

Reply | Threaded
Open this post in threaded view
|

bug#37189: Acknowledgement (25.4.1: vc-hg-ignore implementation is missing)

Wolfgang Scherer
Patch with commit message attached.


0001-Provides-vc-hg-ignore-to-make-vc-ignore-work-correct.patch (3K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

bug#37189: Acknowledgement (25.4.1: vc-hg-ignore implementation is missing)

Eli Zaretskii
> From: Wolfgang Scherer <[hidden email]>
> Date: Tue, 27 Aug 2019 01:25:59 +0200
>
> +(defun vc-hg-ignore (file &optional directory remove)
> +  "Ignore FILE of DIRECTORY (default is `default-directory').
> +
> +FILE is a file wildcard, relative to the root directory of DIRECTORY.

I think instead of "root directory of DIRECTORY" this should say "the
top-level directory of DIRECTORY's repository".

> +If FILE matches the regular expression
> +`vc-hg-ignore-detect-wildcard', it is appended to .hgignore as
> +is. Otherwise, FILE is escaped/expanded according to the active
> +syntax in .hgignore. If the syntax is `regexp', FILE is quoted as
> +anchored literal Python regexp and if FILE is a directory, the
> +trailing `$' is omitted.  Otherwise, if the syntax is `glob',
> +FILE is used unquoted and if FILE is a directory, a `*' is
> +appended.

Our convention is to leave 2 spaces between sentences in comments and
doc strings.

> +When called from Lisp code, if DIRECTORY is non-nil, the

"When called from Lisp" implies that this function can be called in
some other way, which is generally correct only with commands.  But
this function is not a command, so I'm unsure what this means here.

Thanks.



Reply | Threaded
Open this post in threaded view
|

bug#37189: *** GMX Spamverdacht *** Re: bug#37189: Acknowledgement (25.4.1: vc-hg-ignore implementation is missing)

Wolfgang Scherer
Am 27.08.19 um 09:45 schrieb Eli Zaretskii:
>> From: Wolfgang Scherer <[hidden email]>
>> Date: Tue, 27 Aug 2019 01:25:59 +0200
>>
>> +(defun vc-hg-ignore (file &optional directory remove)
>> +  "Ignore FILE of DIRECTORY (default is `default-directory').
>> +
>> +FILE is a file wildcard, relative to the root directory of DIRECTORY.
> I think instead of "root directory of DIRECTORY" this should say "the
> top-level directory of DIRECTORY's repository".
I disagree. This comment is a modified version of the comment for
`vc-default-ignore'. And the exact same phrase also pops up for
`vc-svn-ignore':

vc.el:1420:FILE is a file wildcard, relative to the root directory of DIRECTORY.
vc-svn.el:356:FILE is a file wildcard, relative to the root directory of DIRECTORY.

Also `top level' only appears twice in the `vc-' files:

vc-bzr.el:1070: A merge has been performed.\nA commit from the top-level directory
vc-cvs.el:904:;; at the top level for CVS.

while the phrases `root', `root directory', `repository root'
appear 23 times in strings and comments alone.

The sentence is actually utterly false for `vc-default-ignore',
since FILE is usually an absolute file path when called from a
*vc-dir* buffer. With relative paths the code is still wrong and
ends up as plain basename for the pattern.

Since the implementation of `vc-hg-ignore' corresponds to the
actual situation, I have changed the wording to reflect that using
the term `project root'.

>
>> +If FILE matches the regular expression
>> +`vc-hg-ignore-detect-wildcard', it is appended to .hgignore as
>> +is. Otherwise, FILE is escaped/expanded according to the active
>> +syntax in .hgignore. If the syntax is `regexp', FILE is quoted as
>> +anchored literal Python regexp and if FILE is a directory, the
>> +trailing `$' is omitted.  Otherwise, if the syntax is `glob',
>> +FILE is used unquoted and if FILE is a directory, a `*' is
>> +appended.
> Our convention is to leave 2 spaces between sentences in comments and
> doc strings.
Done.
>> +When called from Lisp code, if DIRECTORY is non-nil, the
> "When called from Lisp" implies that this function can be called in
> some other way, which is generally correct only with commands.  But
> this function is not a command, so I'm unsure what this means here.
I agree, and I fixed the remove option, so it actually does what
it says ;-)
> Thanks.

You're welcome.

Oops, I had forgotten to include the variable
`vc-hg-ignore-detect-wildcard'. So here is an overhauled patch.



0001-Provides-vc-hg-ignore-to-make-vc-ignore-work-correct.patch (3K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

bug#37189: *** GMX Spamverdacht *** Re: bug#37189: Acknowledgement (25.4.1: vc-hg-ignore implementation is missing)

Eli Zaretskii
> Cc: [hidden email]
> From: Wolfgang Scherer <[hidden email]>
> Date: Wed, 28 Aug 2019 03:46:53 +0200
>
> >> +(defun vc-hg-ignore (file &optional directory remove)
> >> +  "Ignore FILE of DIRECTORY (default is `default-directory').
> >> +
> >> +FILE is a file wildcard, relative to the root directory of DIRECTORY.
> > I think instead of "root directory of DIRECTORY" this should say "the
> > top-level directory of DIRECTORY's repository".
> I disagree. This comment is a modified version of the comment for
> `vc-default-ignore'. And the exact same phrase also pops up for
> `vc-svn-ignore':

But do you agree that the alternative text I proposed is more accurate
and more clear?  If so, we might wish to change all those places to
use the better wording.  It could be a separate commit, of course.



Reply | Threaded
Open this post in threaded view
|

bug#37189: 25.4.1: vc-hg-ignore implementation is missing

Wolfgang Scherer
In reply to this post by Wolfgang Scherer
Small fix for option remove. New version of patch attached.


0001-Provides-vc-hg-ignore-to-make-vc-ignore-work-correct.patch (3K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

bug#37189: 25.4.1: vc-hg-ignore implementation is missing

Wolfgang Scherer
In reply to this post by Eli Zaretskii

Am 28.08.19 um 08:16 schrieb Eli Zaretskii:

>> Cc: [hidden email]
>> From: Wolfgang Scherer <[hidden email]>
>> Date: Wed, 28 Aug 2019 03:46:53 +0200
>>
>>>> +(defun vc-hg-ignore (file &optional directory remove)
>>>> +  "Ignore FILE of DIRECTORY (default is `default-directory').
>>>> +
>>>> +FILE is a file wildcard, relative to the root directory of DIRECTORY.
>>> I think instead of "root directory of DIRECTORY" this should say "the
>>> top-level directory of DIRECTORY's repository".
>> I disagree. This comment is a modified version of the comment for
>> `vc-default-ignore'. And the exact same phrase also pops up for
>> `vc-svn-ignore':
> But do you agree that the alternative text I proposed is more accurate
> and more clear?  If so, we might wish to change all those places to
> use the better wording.  It could be a separate commit, of course.


I agree, that the text should be corrected (and I already did so
for `vc-hg-ignore').  I do not agree to introducing the term
`top-level directory' as replacement for `project root'.

However, the problem here is not the wording, but the
incorrect/incomplete implementation of various
commands/functions. (Actually, not a single handler
implementation is correct. Not even CVS.)

1. `vc-ignore' provides the API specification for the
   backend-specific implementations:

     "Ignore FILE under the VCS of DIRECTORY.

     Normally, FILE is a wildcard specification that matches the files
     to be ignored.  When REMOVE is non-nil, remove FILE from the list
     of ignored files."
  
   Based on this API, mentioning the project root at all does not
   make any sense.

   If the wildcard specification is relative, it is relative to
   DIRECTORY (or default-directory).  It is definitely not
   relative to the project root.
  
2. `vc-cvs-ignore' implements only the appending part, but does
   not implement the `remove' option.

   When appending, the correct .cvsignore file is identified,
   but when called with an absolute filename (which is the
   default case), it writes the entire path into the .cvsignore
   file.

   `vc-cvs-ignore' also writes duplicate strings into .cvsignore.

   I submitted a patch for these errors as bug#37215.

   (And yes, I still have a bunch of old unconverted CVS repositories ;-)).

3. When `vc-svn-ignore' starts a new `svn:ignore' property, The
   function `vc-svn-ignore-completion-table' parses an error
   message as ignore patterns. The split also matches any
   whitespace instead of lines only. I have submitted a patch for
   these errors as bug##37214.

   `vc-svn-ignore' states:

      "Ignore FILE under Subversion.
      FILE is a file wildcard, relative to the root directory of DIRECTORY."

    This is just not true. The correct description is:

      "Ignore FILE under Subversion.
      FILE is a wildcard specification, either relative to
      DIRECTORY or absolute."

    `vc-svn-ignore' gets the relative filename right (in a manner
    suitable for Git), but then fails to add the ignore spec to
    the correct subdirectory, if the relative path has at least
    one level of directories. A patch is supplied as bug#37216.

4. This leaves `vc-default-ignore'.

   Both the description and implementation are wrong.

   Only the basename of FILE is written to the ignore file. This is
   wrong for all filenames relative to project root with one or
   more parent directories.

   The remove option is also implemented incorrectly.

   A patch is supplied as bug#37217.

The mentioned commits replace all occurences of the incorrect description.





Reply | Threaded
Open this post in threaded view
|

bug#37189: 25.4.1: vc-hg-ignore implementation is missing

Wolfgang Scherer
In reply to this post by Wolfgang Scherer
A unit test showed, that the removal regexp was faulty. New version of patch attached.


0001-Provide-vc-hg-ignore-to-make-vc-ignore-work-correctl.patch (3K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

bug#37189: 25.4.1: vc-hg-ignore implementation is missing

Dmitry Gutov
Hi Wolfgang,

On 29.08.2019 18:52, Wolfgang Scherer wrote:
> +  "Ignore FILE of DIRECTORY (default is `default-directory').

IF this function needs a docstring at all (which is not obvious since it
should be following vc-ignore), I think I'd rather this followed the
latter's docstring. Where the clarification about the default is not in
the first sentence.

Also, I think saying "Ignore FILE under DIRECTORY" would be better, if
you intend to add this particular semantics to relative names.

> +Otherwise, FILE is either relative to DIRECTORY or absolute. FILE
> +is converted to a path relative to the project root of DIRECTORY.

Isn't it a bit odd that vc-ignore's docstring doesn't specify this
distinction, and yet we're trying to implement it in vc-hg-ignore?

Do you have a particular reason for that?

> +                (concat pattern (and (file-directory-p file-path) "*"))))))

I think it needs to asterisks for the glob to become recursive. At least
according to https://stackoverflow.com/a/255094/615245.

> +    (lambda (_c)
> +      (cons _c (concat "\\" (char-to-string _c)))))

Our convention says that an argument whose name starts with underscore
is unused. That's not the case here, so it shouldn't be named like that.

> +    (lambda (_c)
> +      (or (cdr (assq _c vc-hg--py-regexp-special-chars))
> +          (char-to-string _c))))

Same.



Reply | Threaded
Open this post in threaded view
|

bug#37189: 25.4.1: vc-hg-ignore implementation is missing

Wolfgang Scherer
Hi Dmitry,

Am 25.12.19 um 01:16 schrieb Dmitry Gutov:
> On 29.08.2019 18:52, Wolfgang Scherer wrote:
>> +  "Ignore FILE of DIRECTORY (default is `default-directory').
>
> IF this function needs a docstring at all (which is not obvious since it should be following vc-ignore), I think I'd rather this followed the latter's docstring. Where the clarification about the default is not in the first sentence.

vc-hg-ignore needs a docstring, since it exhibits specific behavior for the backend (e.g. glob/regex syntax), which is not and cannot be covered by the rather generic vc-ignore docstring.

Most of the description in  vc-ignore is not even applicable to the backend specific functions, e.g. there is no interactive functionality in the backend function and no backend is determined.

If a backend does not provide a vc-BACKEND-ignore function, vc-default-ignore is invoked. This function has a docstring of its own (which contains the clarification about the default in the first sentence). According to your argument, this function should also have no docstring of its own. However, I cannot see how both docstrings are equivalent.

vc-hg-ignore is modeled after vc-default-ignore, the docstring was copied from vc-default-ignore and modified to fit the implementation. I have modified the docstring further to match other backend specific ignore functions.

>
> Also, I think saying "Ignore FILE under DIRECTORY" would be better, if you intend to add this particular semantics to relative names.
All other ignore functions say "Ignore FILE under VCS". I have modified the docstring accordingly.
>
>> +Otherwise, FILE is either relative to DIRECTORY or absolute. FILE
>> +is converted to a path relative to the project root of DIRECTORY.
>
> Isn't it a bit odd that vc-ignore's docstring doesn't specify this distinction, and yet we're trying to implement it in vc-hg-ignore?

No. vc-ignore's semantic roots stem from SCCS/RCS/CVS/subversion with single directory ignore files. The ignore files for those VCS only contain file patterns.

For a VCS with subdirectories and a single ignore file at the root of the repository (Git, Mercurial), rooted and non-rooted expressions are handled differently and the single directory ignore file paradigm fails.

>
> Do you have a particular reason for that?
Yes. It is the standard use case for ignoring something from vc-dir mode. As mentioned, the vc-default-ignore produces utterly useless results for Git and Mercurial in vc-dir mode.
>
>> +                (concat pattern (and (file-directory-p file-path) "*"))))))
>
> I think it needs to asterisks for the glob to become recursive. At least according to https://stackoverflow.com/a/255094/615245.
No, according to https://stackoverflow.com/a/255094/2127439, all of "bin/", "bin/*", "bin/**" are equivalent under glob syntax. I also tested this specifically and extensivlely.
>
>> +    (lambda (_c)
>> +      (cons _c (concat "\\" (char-to-string _c)))))
>
> Our convention says that an argument whose name starts with underscore is unused. That's not the case here, so it shouldn't be named like that.
OK.
>
>> +    (lambda (_c)
>> +      (or (cdr (assq _c vc-hg--py-regexp-special-chars))
>> +          (char-to-string _c))))
>
> Same.

OK.

New patch attached.


0001-Provide-vc-hg-ignore-to-make-vc-ignore-work-correctl.patch (16K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

bug#37189: 25.4.1: vc-hg-ignore implementation is missing

Andreas Schwab-2
On Jan 05 2020, Wolfgang Scherer wrote:

> diff --git a/lisp/vc/vc-hg.el b/lisp/vc/vc-hg.el
> index eac9a6f..db84a28 100644
> --- a/lisp/vc/vc-hg.el
> +++ b/lisp/vc/vc-hg.el
> @@ -142,9 +142,9 @@ If nil, use the value of `vc-diff-switches'.  If t, use no switches."
>  If nil, use the value of `vc-annotate-switches'.  If t, use no
>  switches."
>    :type '(choice (const :tag "Unspecified" nil)
> - (const :tag "None" t)
> - (string :tag "Argument String")
> - (repeat :tag "Argument List" :value ("") string))
> +                 (const :tag "None" t)
> +                 (string :tag "Argument String")
> +                 (repeat :tag "Argument List" :value ("") string))
>    :version "25.1"
>    :group 'vc-hg)
>

That contains a lot of spurious whitespace differences.

Andreas.

--
Andreas Schwab, [hidden email]
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."



Reply | Threaded
Open this post in threaded view
|

bug#37189: 25.4.1: vc-hg-ignore implementation is missing

Wolfgang Scherer
Am 05.01.20 um 09:58 schrieb Andreas Schwab:
> That contains a lot of spurious whitespace differences.

You are right, that is not intended. Sometimes muscle memory just
invokes my whitespace cleaner.

Corrected patch attached.


0001-Provide-vc-hg-ignore-to-make-vc-ignore-work-correctl.patch (3K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

bug#37189: 25.4.1: vc-hg-ignore implementation is missing

Dmitry Gutov
In reply to this post by Wolfgang Scherer
Hi Wolfgang,

Sorry for the wait.

On 05.01.2020 6:46, Wolfgang Scherer wrote:
>> On 29.08.2019 18:52, Wolfgang Scherer wrote:
>>> +  "Ignore FILE of DIRECTORY (default is `default-directory').
>>
>> IF this function needs a docstring at all (which is not obvious since it should be following vc-ignore), I think I'd rather this followed the latter's docstring. Where the clarification about the default is not in the first sentence.
>
> vc-hg-ignore needs a docstring, since it exhibits specific behavior for the backend (e.g. glob/regex syntax), which is not and cannot be covered by the rather generic vc-ignore docstring.

OK, I'm fine with that. But user-facing one is vc-ignore, so it can have
some generic language to that effect (saying something like "wildcard or
other syntax supported by the backend").

> Most of the description in  vc-ignore is not even applicable to the backend specific functions, e.g. there is no interactive functionality in the backend function and no backend is determined.

OK, true, but otherwise the description should be closer. And we also
have the description of this backend command in the header commentary of
vc.el.

> If a backend does not provide a vc-BACKEND-ignore function, vc-default-ignore is invoked. This function has a docstring of its own (which contains the clarification about the default in the first sentence). According to your argument, this function should also have no docstring of its own. However, I cannot see how both docstrings are equivalent.

Actually, I'd have more doubt in its necessity, yes.

This one really should be covered by the docstring of vc-ignore, as well
as the description of the 'ignore' VC command in the header commentary.
I mean, it can have a docstring, but it would be a plain copy of the
existing descriptions elsewhere.

> vc-hg-ignore is modeled after vc-default-ignore, the docstring was copied from vc-default-ignore and modified to fit the implementation. I have modified the docstring further to match other backend specific ignore functions.

Actually, let's talk about vc-default-ignore.

In a recent patch you submitted in debbugs#37217 you changed its
docstring to say "either relative to DIRECTORY or absolute" whereas it
originally said "either relative to the root directory of DIRECTORY or
absolute. That sounds like a change in documented behavior. And it could
be a problem if it were in a docstring users are actually likely to see.

In any case, I think the docstrings should really be in accord, and the
"more private" functions shouldn't have crucial information about
command's behavior that the users won't be able to see.

>> Also, I think saying "Ignore FILE under DIRECTORY" would be better, if you intend to add this particular semantics to relative names.
> All other ignore functions say "Ignore FILE under VCS". I have modified the docstring accordingly.
>>
>>> +Otherwise, FILE is either relative to DIRECTORY or absolute. FILE
>>> +is converted to a path relative to the project root of DIRECTORY.
>>
>> Isn't it a bit odd that vc-ignore's docstring doesn't specify this distinction, and yet we're trying to implement it in vc-hg-ignore?
>
> No. vc-ignore's semantic roots stem from SCCS/RCS/CVS/subversion with single directory ignore files. The ignore files for those VCS only contain file patterns.

vc-ignore is still the command the user calls, isn't it? Or are you
using some other command?

(This seems like the key question of this email at this point).

If yes, I also wonder what are the cases when the DIRECTORY argument is
important. I.e., when is it not the same as default-directory?

vc-ignore, when called interactively, sets this argument to nil (so it
is set to default-directory). vc-dir-ignore doesn't pass the second
argument either.

Importantly, when do we really expect FILE to be something other than
absolute, relative to the repository root, or a simple glob (e.g. *.c)
which is going to apply to files regardless of the directory? BTW, it
seems to me like f144c87f92b broke the last case. Or at least made it
work worse (calling vc-ignore with '*.c' somewhere deep inside the tree
would prepend its subdirectory to it).

There is some *valuable* code in this patch, but let's resolve this
relative-names confusion first.

 From what I see, when vc-ignore receives the FILE argument
interactively, it's either absolute (as a product of read-file-name when
the user choose a file with completion), or a free-form glob when the
user just went ahead and wrote one anyway.

So instead of the current always-relativizing logic, I think we should
choose what to do by calling file-name-absolute-p. And if it's not
absolute, just take the value as-is, because there is no telling if the
user meant "this file in the current dir" or "all files with that name".
If it is absolute, on the other hand, yes, make it relative to the
repository root.

>>> +                (concat pattern (and (file-directory-p file-path) "*"))))))
>>
>> I think it needs to asterisks for the glob to become recursive. At least according to https://stackoverflow.com/a/255094/615245.
> No, according to https://stackoverflow.com/a/255094/2127439, all of "bin/", "bin/*", "bin/**" are equivalent under glob syntax. I also tested this specifically and extensivlely.

Very well.