bug#38011: 27.0.50; [PATCH] WIP on allowing Gnus backends to return header data directly

classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|

bug#38011: 27.0.50; [PATCH] WIP on allowing Gnus backends to return header data directly

Eric Abrahamsen-2

Right now, Gnus backends return article header data by writing it in a
parseable format into the `nntp-server-buffer', and returning one of the
symbols 'nov or 'headers, indicating how the data should be parsed.

This isn't great because it requires backends to first munge their data
into a format that looks like the NNTP format, which is then parsed
again, which is an extra layer of data transformation. It also makes use
of the `nntp-server-buffer', which is something I'd like to work on
reducing because it causes problems with threading and introduces
potential encoding bugs.

This patch provides the possibility for backends to return their own
headers (ie a list of vectors), though it doesn't actually change any of
the backends to do that -- that will be another patch.

I have one question at this stage: the 'nov or 'headers value gets
stored into the `gnus-headers-retrieved-by' variable. That variable is
later checked in a couple of places like so:

      (when (and gnus-fetch-old-headers
                 (eq gnus-headers-retrieved-by 'nov))
        (if (eq gnus-fetch-old-headers 'invisible)
            (gnus-build-all-threads)
          (gnus-build-old-threads)))

If the variable is 'headers, the `gnus-build-*-threads' functions don't
get called at all.

What's the difference between 'nov and 'headers, and why can we build
threads in one case and not the other? If backends were to return their
own headers, what value should they return? I'll also note that the nnir
version of this function returns 'nov regardless of what the "real"
backend function returned -- why is that?

I would love to use some other, more direct, heuristic to decide about
building threads or not, and get rid of the
'nov/'headers/gnus-headers-retrieved-by stuff altogether, but I don't
yet know how to do that.

Eric


0001-WIP-on-allowing-Gnus-backends-to-return-headers-dire.patch (4K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

bug#38011: 27.0.50; [PATCH] WIP on allowing Gnus backends to return header data directly

Lars Ingebrigtsen
Eric Abrahamsen <[hidden email]> writes:

> This patch provides the possibility for backends to return their own
> headers (ie a list of vectors), though it doesn't actually change any of
> the backends to do that -- that will be another patch.

Great!

> I have one question at this stage: the 'nov or 'headers value gets
> stored into the `gnus-headers-retrieved-by' variable. That variable is
> later checked in a couple of places like so:
>
>       (when (and gnus-fetch-old-headers
> (eq gnus-headers-retrieved-by 'nov))
> (if (eq gnus-fetch-old-headers 'invisible)
>    (gnus-build-all-threads)
>  (gnus-build-old-threads)))
>
> If the variable is 'headers, the `gnus-build-*-threads' functions don't
> get called at all.
>
> What's the difference between 'nov and 'headers, and why can we build
> threads in one case and not the other? If backends were to return their
> own headers, what value should they return?

It's not about threading per se, but about displaying information about
already-read articles (or more precisely -- about articles not in the
set that was requested).  Threads are build no matter how the backend
delivers the data.

If nn-*retrieve-headers supports NOV fetching, then certain Gnus
variables about filling in threads with "old" articles is switched on,
because fetching extra NOV headers is fast (you just say "fetch 100-150"
instead of "fetch 100-110,120-130,150").  With the backends that fetch
head by head, this is slow, so it's not done.

It's basically a distinction between NNTP and almost all the other
backends.

--
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no



Reply | Threaded
Open this post in threaded view
|

bug#38011: 27.0.50; [PATCH] WIP on allowing Gnus backends to return header data directly

Eric Abrahamsen-2
Lars Ingebrigtsen <[hidden email]> writes:

> Eric Abrahamsen <[hidden email]> writes:
>
>> This patch provides the possibility for backends to return their own
>> headers (ie a list of vectors), though it doesn't actually change any of
>> the backends to do that -- that will be another patch.
>
> Great!
>
>> I have one question at this stage: the 'nov or 'headers value gets
>> stored into the `gnus-headers-retrieved-by' variable. That variable is
>> later checked in a couple of places like so:
>>
>>       (when (and gnus-fetch-old-headers
>> (eq gnus-headers-retrieved-by 'nov))
>> (if (eq gnus-fetch-old-headers 'invisible)
>>    (gnus-build-all-threads)
>>  (gnus-build-old-threads)))
>>
>> If the variable is 'headers, the `gnus-build-*-threads' functions don't
>> get called at all.
>>
>> What's the difference between 'nov and 'headers, and why can we build
>> threads in one case and not the other? If backends were to return their
>> own headers, what value should they return?
>
> It's not about threading per se, but about displaying information about
> already-read articles (or more precisely -- about articles not in the
> set that was requested).  Threads are build no matter how the backend
> delivers the data.
>
> If nn-*retrieve-headers supports NOV fetching, then certain Gnus
> variables about filling in threads with "old" articles is switched on,
> because fetching extra NOV headers is fast (you just say "fetch 100-150"
> instead of "fetch 100-110,120-130,150").  With the backends that fetch
> head by head, this is slow, so it's not done.
>
> It's basically a distinction between NNTP and almost all the other
> backends.

Okay, that makes sense. So it's basically a flag saying header retrieval
is cheap enough that we might as well pull in more old messages than we
otherwise would. For example, nnmaildir builds and stores nov data
(which drives people with large maildirs insane because it takes
enormous amounts of time and space), so it returns 'nov, but if we ever
got rid of that (or made it optional) it would return 'headers.

I'm going to assume that any backend capable of returning its own
headers is probably going to... return 'headers. At any rate, as I
implement this for various backends, I'll start with the ones that
return 'headers to begin with. At some point, though, I'd still like to
get rid of this flag and build the distinction into the backend
functions themselves.

Hmmm... nnmaildir needs some love.

Eric



Reply | Threaded
Open this post in threaded view
|

bug#38011: 27.0.50; [PATCH] WIP on allowing Gnus backends to return header data directly

Eric Abrahamsen-2
Eric Abrahamsen <[hidden email]> writes:

> Lars Ingebrigtsen <[hidden email]> writes:
>
>> Eric Abrahamsen <[hidden email]> writes:
>>
>>> This patch provides the possibility for backends to return their own
>>> headers (ie a list of vectors), though it doesn't actually change any of
>>> the backends to do that -- that will be another patch.
>>
>> Great!
>>
>>> I have one question at this stage: the 'nov or 'headers value gets
>>> stored into the `gnus-headers-retrieved-by' variable. That variable is
>>> later checked in a couple of places like so:
Okay, here's the patch as it stands -- I'll do some testing first. Andy,
I'm copying you in because this touches nnir.el, and nnselect.el will
need to be edited accordingly.


0001-WIP-on-allowing-Gnus-backends-to-return-headers-dire.patch (6K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

bug#38011: 27.0.50; [PATCH] WIP on allowing Gnus backends to return header data directly

Lars Ingebrigtsen
In reply to this post by Eric Abrahamsen-2
Eric Abrahamsen <[hidden email]> writes:

> Okay, that makes sense. So it's basically a flag saying header retrieval
> is cheap enough that we might as well pull in more old messages than we
> otherwise would.

Yup.

> Hmmm... nnmaildir needs some love.

Indeed.

--
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no



Reply | Threaded
Open this post in threaded view
|

bug#38011: 27.0.50; [PATCH] WIP on allowing Gnus backends to return header data directly

Eric Abrahamsen-2
I knew this seemed too simple...

Andy Cohen pointed out that `gnus-get-newsgroup-headers{,-xover}' were
doing more work (running hooks, and the alter function) that needed to
also be done if headers were returned directly. There's a bunch of other
overlap between those two functions (and between
`gnus-get-newsgroup-headers' and `nnheader-parse-naked-head') that Andy
is working on refactoring (I hope).

In the meantime there are several other places in the code that use
`gnus-retrieve-headers', and will need to be updated to handle the
possibility that headers have been returned directly. What I'd really
like to do, as much as possible, is to switch to `gnus-fetch-headers',
so we get our headers back and don't need to know how that happened.
This will also avoid ugliness like `nnvirtual-convert-headers'.

I think I can probably figure out how to do this in most spots.
gnus-agent and gnus-cache are particularly gnarly, though: what does
-braid-nov/-braid-heads actually do?

Eric



Reply | Threaded
Open this post in threaded view
|

bug#38011: 27.0.50; [PATCH] WIP on allowing Gnus backends to return header data directly

Lars Ingebrigtsen
Eric Abrahamsen <[hidden email]> writes:

> I think I can probably figure out how to do this in most spots.
> gnus-agent and gnus-cache are particularly gnarly, though: what does
> -braid-nov/-braid-heads actually do?

Take two sets of nov/head headers and combine them.  Basically sort |
uniq, but they're already sorted.

--
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no



Reply | Threaded
Open this post in threaded view
|

bug#38011: 27.0.50; [PATCH] WIP on allowing Gnus backends to return header data directly

Eric Abrahamsen-2

On 11/08/19 22:03 PM, Lars Ingebrigtsen wrote:
> Eric Abrahamsen <[hidden email]> writes:
>
>> I think I can probably figure out how to do this in most spots.
>> gnus-agent and gnus-cache are particularly gnarly, though: what does
>> -braid-nov/-braid-heads actually do?
>
> Take two sets of nov/head headers and combine them.  Basically sort |
> uniq, but they're already sorted.

Okay good -- that should be as easy (easier) to do with an actual list
of headers than with text in a buffer.



Reply | Threaded
Open this post in threaded view
|

bug#38011: 27.0.50; [PATCH] WIP on allowing Gnus backends to return header data directly

Lars Ingebrigtsen
Eric Abrahamsen <[hidden email]> writes:

> Okay good -- that should be as easy (easier) to do with an actual list
> of headers than with text in a buffer.

Yup; much easier.

--
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no