bug#41955: 28.0.50; Monorepos and project.el


Theodor Thornhill

Hi!

At work I've had the following issue. Assume we have some sort of
gigarepo, like this maybe:

gigarepo/
├── clients
│   ├── client1
│   ├── client2
│   ├── client3
│   └── client4
└── services
    ├── service1
    ├── service2
    ├── service3
    └── service4

The services are written in different languages (some in C#, some in
F#, some in Java), as are the clients: some are TypeScript projects,
some are JS, and so on.

Everything is rooted in a ".git" in "gigarepo/", and there are no
submodules or any fanciness.

As project.el works now, there are several issues arising. I'll just
note them down here, and probably split things up later, if that is ok.

* Lsp server/client
  In most of the projects, the lsp servers are indexing from what they
  consider root. Typically a tsconfig.json, elm.json etc. They get
  confused, eglot a bit more than lsp-mode (it has its own root finding
  algorithm). What happens is they look in root (gigarepo/), and it has
  no executable for lsp-server. One solution is then to install the
  server in root, but then it indexes the whole thing, and gets super
  slow, and indexes a lot of unrelated stuff (I'm not even working on
  clients 2-4).
 
* Buffer switching
  Let's say several of the clients use a module called AuthService.ts.
  If I'm working on several of these projects, I get a lot of
  identically named files, so switching is a bit hit and miss.
 
* Grepping
  This one was the worst for me, since grepping was very slow given the
  size of the project, and grepping loads of unrelated files returns a
  lot of noise.


What would be nice is to be able to get the benefits of the vc-dir
version of project.el, but not having to "git init" inside the child
projects. More specifically, to be able to choose "project context",
one as the closest project-root and one as the gigarepo project-root.

Also, if I've worked on both a service and a client, running git or
magit should probably be done from root rather than the subproject I am
currently in, to get the whole context.

Is there a way to do this? I realize this may be an odd situation, but
it came up nonetheless.

This report is getting long, and I think I'm already rambling a bit (and
forgetting stuff), so I'll leave it for now.

Theo

P.S.:
What I did to circumvent this (this is only an MVP, not at all optimized
for anything):

(defvar project-root-markers
  '("package.json"
    "tsconfig.json"
    "jsconfig.json"
    "elm.json"
    "*.sln")
  "Files or directories that indicate the root of a project.")

(defvar project-exclusion-list
  '("node_modules"
    "target"
    "build"
    "package-lock.json"
    "elm-stuff")
  "Things not to be included in project-find-file.")

(defun project-exclude-dirs ()
  "Add each entry of `project-exclusion-list' to `vc-directory-exclusion-list'."
  (dolist (dir project-exclusion-list)
    (add-to-list 'vc-directory-exclusion-list dir)))

(defun project-find-root (path)
  "Tail-recursive search in PATH for root markers."
  (let* ((this-dir (file-name-as-directory (file-truename path)))
         (parent-dir (expand-file-name (concat this-dir "../")))
         (system-root-dir (expand-file-name "/")))
    (cond
     ((project-root-p this-dir) (cons 'transient this-dir))
     ((equal system-root-dir this-dir) nil)
     (t (project-find-root parent-dir)))))

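(defun project-root-p (path)
  "Return non-nil if directory PATH contains any of `project-root-markers'."
  (let ((default-directory (file-name-as-directory path)))
    ;; `file-expand-wildcards' also handles glob markers such as "*.sln",
    ;; which `file-exists-p' would treat as a literal file name.
    (seq-some #'file-expand-wildcards project-root-markers)))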





Eli Zaretskii
> Date: Fri, 19 Jun 2020 20:42:26 +0000
> From: Theodor Thornhill <[hidden email]>
>
> gigarepo/
> ├── clients
> │   ├── client1
> │   ├── client2
> │   ├── client3
> │   └── client4
> └── services
>     ├── service1
>     ├── service2
>     ├── service3
>     └── service4
>
> The services are made in different languages, say, some in c#, f#, java -
> as are clients, some are Typescript projects, some are JS etc etc.
>
> Everything is rooted in a ".git" in "gigarepo/", and there are no
> submodules or any fanciness.
>
> As project.el works now, there are several issues arising. I'll just
> note them down here, and probably split things up later, if that is ok.
>
> * Lsp server/client
>   In most of the projects, the lsp servers are indexing from what they
>   consider root. Typically a tsconfig.json, elm.json etc. They get
>   confused, eglot a bit more than lsp-mode (it has its own root finding
>   algorithm). What happens is they look in root (gigarepo/), and it has
>   no executable for lsp-server. One solution is then to install the
>   server in root, but then it indexes the whole thing, and gets super
>   slow, and indexes a lot of unrelated stuff (I'm not even working on
>   client 2-4.)
>  
> * Buffer switching
>   Let's say several of the clients use a module called AuthService.ts.
>   If I'm working on several of these projects, I get a lot of
>   identically named files, so switching is a bit hit and miss.
>  
> * Grepping
>   This one was the worst for me, since grepping was very slow given the
>   size of the project, and grepping loads of unrelated files returns a
>   lot of noise.
>
>
> What would be nice is to be able to get the benefits of the vc-dir
> version of project.el, but not having to "git init" inside the child
> projects. More specifically, to be able to choose "project context",
> one as the closest project-root and one as the gigarepo project-root.

Would it help to have facilities of specifying the files in a project
by starting with an empty project, and then adding the files one by
one?  Also, to be able to say that all the files in a given directory
(optionally, only files that match some shell wildcard), recursively,
should be added to a project?

This would probably need a way of making the list of files/directories
in a project persistent between sessions, because currently we rely on
the filesystem or a VCS to record that.
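
As a rough illustration of the idea above (not an existing project.el
feature), a project could be represented as an explicit file list that
is saved between sessions. The sketch assumes a project.el recent enough
to provide the `project-root' generic (Emacs 28 or later). The project
type `my-filelist', the variable `my-filelist-save-file' and all
`my-filelist-*' functions are hypothetical names invented for this
example.

```elisp
(require 'cl-lib)
(require 'project)

;; Hypothetical: where the file list is persisted between sessions.
(defvar my-filelist-save-file
  (expand-file-name "my-filelist-project.eld" user-emacs-directory))

;; Assumed representation: (my-filelist ROOT . FILES).
(defun my-filelist-make (root &rest files)
  "Return a project rooted at ROOT that initially contains FILES."
  (cons 'my-filelist
        (cons (file-name-as-directory (expand-file-name root))
              (mapcar #'expand-file-name files))))

(defun my-filelist-add-directory (project dir &optional wildcard)
  "Add files under DIR, recursively, to PROJECT and return PROJECT.
If shell WILDCARD is given, only add files whose names match it."
  (let ((regexp (wildcard-to-regexp (or wildcard "*"))))
    (dolist (file (directory-files-recursively dir regexp))
      (cl-pushnew (expand-file-name file) (cddr project) :test #'equal))
    project))

;; Teach project.el about the new project type.
(cl-defmethod project-root ((project (head my-filelist)))
  (cadr project))

(cl-defmethod project-files ((project (head my-filelist)) &optional _dirs)
  (cddr project))

(defun my-filelist-save (project)
  "Write PROJECT to `my-filelist-save-file'."
  (let ((print-length nil) (print-level nil))
    (with-temp-file my-filelist-save-file
      (prin1 project (current-buffer)))))

(defun my-filelist-load ()
  "Read a project back from `my-filelist-save-file', or return nil."
  (when (file-readable-p my-filelist-save-file)
    (with-temp-buffer
      (insert-file-contents my-filelist-save-file)
      (read (current-buffer)))))
```

A function on `project-find-functions' could then return the loaded
project for buffers whose file is a member of the list. That part is
left out here.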




Theodor Thornhill

> Would it help to have facilities of specifying the files in a project
> by starting with an empty project, and then adding the files one by
> one? Also, to be able to say that all the files in a given directory
> (optionally, only files that match some shell wildcard), recursively,
> should be added to a project?

Yeah! I actually suggested something like that. However, this will not
fix the issue with programs that look for a specific root directory,
such as eglot and whatever-language-server.

> This would probably need a way of making the list of files/directories
> in a project persistent between sessions, because currently we rely on
> the filesystem or a VCS to record that.

Yeah, maybe something like that too.

Theo


Eli Zaretskii
> Date: Sat, 20 Jun 2020 07:48:53 +0000
> From: Theodor Thornhill <[hidden email]>
> Cc: [hidden email]
>
>  Would it help to have facilities of specifying the files in a project
>  by starting with an empty project, and then adding the files one by
>  one? Also, to be able to say that all the files in a given directory
>  (optionally, only files that match some shell wildcard), recursively,
>  should be added to a project?
>
> Yeah! I actually suggested something like that. However, this will not fix the issue with programs looking for a
> specific directory for root. Such as eglot and whatever-language-server.

They will need to learn to use the facilities we provide to define a
project in terms other than just the root directory.




Juri Linkov-2
In reply to this post by Theodor Thornhill
> What would be nice is to be able to get the benefits of the vc-dir
> version of project.el, but not having to "git init" inside the child
> projects. More specifically, to be able to choose "project context",
> one as the closest project-root and one as the gigarepo project-root.

Isn't this the same problem that we discussed recently in bug#41572:
defining a project's root using a special file .project or, more
generally, .dir-locals.el?




Dmitry Gutov
On 21.06.2020 02:29, Juri Linkov wrote:
> Isn't this the same problem that we discussed recently in bug#41572 to
> define a project's root using a special file .project or more generally
> .dir-locals.el.

Perhaps.

But as a monorepo with lots of files, this project (or a set of
projects) would be particularly susceptible to the problem of slow file
listing if the new backend doesn't reuse 'git ls-files' somehow or
implement some fast strategy of its own.
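
To make the point concrete: a backend rooted at a subdirectory can still
ask Git for the file list instead of walking the directory tree itself.
A minimal sketch, assuming `git' is on `exec-path'; the helper name
`my-git-ls-files' is hypothetical, and this is not project.el's own
implementation.

```elisp
(defun my-git-ls-files (dir)
  "Return absolute names of files under DIR that Git tracks or would track.
DIR must be inside a Git work tree."
  (with-temp-buffer
    (let ((default-directory (file-name-as-directory (expand-file-name dir))))
      ;; Run from DIR, `git ls-files' only lists files below DIR, with
      ;; names relative to DIR.
      ;; -z: NUL-separated, unquoted names; -c: tracked files;
      ;; -o --exclude-standard: untracked files that are not ignored.
      (process-file "git" nil t nil
                    "ls-files" "-z" "-c" "-o" "--exclude-standard")
      (mapcar #'expand-file-name
              (split-string (buffer-string) "\0" t)))))
```

Called on "gigarepo/clients/client1/", this would list only that
client's files while still honoring the repository's ignore rules.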




Dmitry Gutov
In reply to this post by Eli Zaretskii
On 20.06.2020 11:57, Eli Zaretskii wrote:
> They will need to learn to use the facilities we provide to define a
> project in terms other than just the root directory.

Eglot just cares about the return value of project-root.




Dmitry Gutov
In reply to this post by Theodor Thornhill
On 19.06.2020 23:42, Theodor Thornhill wrote:

> At work I've had the following issue. Assume we have some sort of
> gigarepo, like this maybe:
>
> gigarepo/
> ├── clients
> │   ├── client1
> │   ├── client2
> │   ├── client3
> │   └── client4
> └── services
>      ├── service1
>      ├── service2
>      ├── service3
>      └── service4
>
> The services are made in different languages, say, some in c#, f#, java -
> as are clients, some are Typescript projects, some are JS etc etc.
>
> Everything is rooted in a ".git" in "gigarepo/", and there are no
> submodules or any fanciness.
>
> As project.el works now, there are several issues arising. I'll just
> note them down here, and probably split things up later, if that is ok.

Thank you for the report and the description. Before moving on to more
complete solutions, here are some initial comments:

> * Lsp server/client
>    In most of the projects, the lsp servers are indexing from what they
>    consider root. Typically a tsconfig.json, elm.json etc. They get
>    confused, eglot a bit more than lsp-mode (it has its own root finding
>    algorithm). What happens is they look in root (gigarepo/), and it has
>    no executable for lsp-server. One solution is then to install the
>    server in root, but then it indexes the whole thing, and gets super
>    slow, and indexes a lot of unrelated stuff (I'm not even working on
>    client 2-4.)

I think Eglot is wrong in simply defaulting to project-root, without
looking for tsconfig.json, etc. It would be pretty easy to implement,
and would bring it closer to VS Code's behavior in this aspect (I think).

But that would require it to maintain a list of such files.
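
Such a list could be as small as an alist keyed by major mode. The
following is only a sketch of that idea, not Eglot's actual behavior:
`my-lsp-root-markers' and `my-lsp-root' are hypothetical names, and the
marker files chosen per mode are just examples.

```elisp
(require 'seq)

(defvar my-lsp-root-markers
  '((typescript-mode . ("tsconfig.json" "package.json"))
    (js-mode         . ("jsconfig.json" "package.json"))
    (elm-mode        . ("elm.json")))
  "Hypothetical alist mapping major modes to files that mark a server root.")

(defun my-lsp-root (dir mode)
  "Return the nearest directory at or above DIR holding a marker for MODE.
Return nil if MODE has no markers or no marker file is found."
  (let ((markers (alist-get mode my-lsp-root-markers)))
    (when markers
      (locate-dominating-file
       dir
       ;; The predicate is called with each candidate directory in turn.
       (lambda (d)
         (seq-some (lambda (marker)
                     (file-exists-p (expand-file-name marker d)))
                   markers))))))
```

A client could try this first and fall back to project-root when it
returns nil.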

> * Buffer switching
>    Let's say several of the clients use a module called AuthService.ts.
>    If I'm working on several of these projects, I get a lot of
>    identically named files, so switching is a bit hit and miss.

Uniquify should help, but yeah, it sounds like a pain point indeed.
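
For reference, a small configuration sketch: choosing a directory-first
naming style makes the owning client visible at the start of the buffer
name. The choice of the `forward' style is only a suggestion, and the
paths in the comment are made-up examples.

```elisp
(require 'uniquify)

;; With this style, buffers visiting clients/client1/AuthService.ts and
;; clients/client2/AuthService.ts are named client1/AuthService.ts and
;; client2/AuthService.ts instead of sharing one ambiguous base name.
(setq uniquify-buffer-name-style 'forward)
```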

> * Grepping
>    This one was the worst for me, since grepping was very slow given the
>    size of the project, and grepping loads of unrelated files returns a
>    lot of noise.

This is just a small step, but have you tried the patch here (with rg)?

https://lists.gnu.org/archive/html/emacs-devel/2020-06/msg00547.html

> What would be nice is to be able to get the benefits of the vc-dir
> version of project.el, but not having to "git init" inside the child
> projects. More specifically, to be able to choose "project context",
> one as the closest project-root and one as the gigarepo project-root.

Here's some code based on yours.

Also rough and unofficial, but should hopefully be faster.

(defvar project-root-markers
   '("package.json"
     "tsconfig.json"
     "jsconfig.json"
     "elm.json"
     "*.sln")
   "Files or directories that indicate the root of a project.")

;; Probably better in .dir-locals.el.
(setq project-vc-ignores
       '("node_modules"
         "target"
         "build"
         "package-lock.json"
         "elm-stuff"))

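(defun project-find-nomono (dir)
   "Return a VC project for the nearest marker directory at or above DIR.
Markers come from `project-root-markers'.  Return nil if none is found
or the directory is not under version control."
   (let ((root
          (locate-dominating-file
           dir
           (lambda (d)
             ;; Relative patterns are expanded against `default-directory'.
             (let ((default-directory d))
               (seq-some #'file-expand-wildcards
                         project-root-markers))))))
     (when (and root
                (ignore-errors
                  (vc-responsible-backend root)))
       (cons 'vc root))))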

(add-hook 'project-find-functions #'project-find-nomono)
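
The .dir-locals.el variant mentioned in the comment above might look
roughly like this. It is a sketch that assumes the file lives at the
repository root ("gigarepo/.dir-locals.el"), so that it applies to every
subproject below it; the ignore entries are the ones from the setq form.

```elisp
;; gigarepo/.dir-locals.el
;; This file is data read by Emacs, not code to evaluate.
((nil . ((project-vc-ignores . ("node_modules"
                                "target"
                                "build"
                                "package-lock.json"
                                "elm-stuff")))))
```

This keeps the ignore list with the repository instead of in a global
setq, so other checkouts are not affected.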





Felician Nemeth
In reply to this post by Theodor Thornhill
>> Would it help to have facilities of specifying the files in a project
>> by starting with an empty project, and then adding the files one by
>> one? Also, to be able to say that all the files in a given directory
>> (optionally, only files that match some shell wildcard), recursively,
>> should be added to a project?

I'm not completely sure, because I haven't really used EDE (the Emacs
Development Environment), but I think you can specify a project like
that in EDE.  And project.el supports EDE if I recall correctly.  So
maybe this complex "monorepo" use-case is already covered by existing
features of Emacs.




Dmitry Gutov
On 22.06.2020 20:17, Felician Nemeth wrote:
> I'm not completely sure, because I haven't really used EDE (the Emacs
> Development Environment), but I think you can specify a project like
> that in EDE.  And project.el supports EDE if I recall correctly.  So
> maybe this complex "monorepo" use-case is already covered by existing
> features of Emacs.

Interesting thought.

The current EDE integration is pretty basic (only project-root,
basically), but it can be extended fairly quickly, if an EDE project
contains relevant data.

But that doesn't seem to be the case: it seems to me that EDE uses the
same "one directory tree" model for the contents of the project. And the
:include-path and :system-include-path keys we see in the definition
examples are about code analysis (where to find external type and
function definitions), and not about describing the main bulk of the
files belonging to the project.