Re: [Emacs-diffs] master d826037 3/3: Remove the need for temacs.in

classic Classic list List threaded Threaded
22 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Re: [Emacs-diffs] master d826037 3/3: Remove the need for temacs.in

Daniel Colascione-5
> branch: master
> commit d82603747564063f908c9c877449c827a9808528
> Author: Paul Eggert <[hidden email]>
> Commit: Paul Eggert <[hidden email]>
>
>     Remove the need for temacs.in
>
>     Instead of building a file temacs.in used only to compute a
>     fingerprint, compute the fingerprint directly from the .o and
>     .a files that go into temacs.in.  This speeds up the build by
>     avoiding the need to link temacs twice, once with a dummy
>     fingerprint.

Please don't do this. Computing a fingerprint over temacs.in factors link
layout information into the fingerprint hash. Your approach doesn't. It's
possible to link Emacs in different ways from the same object files and
produce different binaries. I don't think a little build speedup is worth
the safety loss.


Reply | Threaded
Open this post in threaded view
|

Re: [Emacs-diffs] master d826037 3/3: Remove the need for temacs.in

Paul Eggert
On 4/10/19 11:53 AM, Daniel Colascione wrote:
> Computing a fingerprint over temacs.in factors link
> layout information into the fingerprint hash. Your approach doesn't. It's
> possible to link Emacs in different ways from the same object files and
> produce different binaries.

Computing a fingerprint over temacs.in also omitted layout information.
This was particularly true when building position-independent
executables. But even for the non-PIE case the fingerprint did not cover
dynamically-linked libraries.

In practice the omitted layout information didn't matter for temacs.in,
as it was not significant for what the fingerprint is used for.
Similarly, the information that the new approach omits from temacs is
not significant for the fingerprint's intended use, hence we haven't
lost anything significant by switching to the simpler-and-faster mechanism.


Reply | Threaded
Open this post in threaded view
|

Re: [Emacs-diffs] master d826037 3/3: Remove the need for temacs.in

Daniel Colascione-5
> On 4/10/19 11:53 AM, Daniel Colascione wrote:
>> Computing a fingerprint over temacs.in factors link
>> layout information into the fingerprint hash. Your approach doesn't.
>> It's
>> possible to link Emacs in different ways from the same object files and
>> produce different binaries.
>
> Computing a fingerprint over temacs.in also omitted layout information.

temacs.in's layout is identical to temacs because temacs.in and temacs
differ only in the contents of the fingerprint array. They have the same
symbols in the same order in the same section. The temacs.in mechanism
covers all current and *and unknown future* binary layout modifications.
It breaks only if linker symbol arrangement depends on randomness or on
the precise content of the fingerprint array, and both of these
possibilities are unlikely.

The right way to avoid the need for temacs.in is to teach the build system
how to find the fingerprint array in the temacs binary and overwrite it
*in place* with the hash of temacs. If you want to do that, great --- I
suspect some invocation of nm(1) could tell you the file offset of the
symbol. This approach would even work in the case of a linker that did
randomize symbol locations.

Your approach isn't the right one though. Please stop making unsafe
changes for the sake of insignificant speedups to local development.

> This was particularly true when building position-independent
> executables. But even for the non-PIE case the fingerprint did not cover
> dynamically-linked libraries.

PIE and shared libraries are irrelevant. The whole point of pdumper is to
be invariant across different PIE and DSO configurations. Neither PIE
relocation nor DSO load addresses affect the *internal* layout of the
Emacs binary image *at runtime*, which is what we really care about. With
your change, the fingerprint calculation misses important and relevant
information. It's unsafe. There's a better way to speed up the build.


Reply | Threaded
Open this post in threaded view
|

Re: [Emacs-diffs] master d826037 3/3: Remove the need for temacs.in

Paul Eggert
On 4/10/19 12:42 PM, Daniel Colascione wrote:
> PIE and shared libraries are irrelevant. The whole point of pdumper is to
> be invariant across different PIE and DSO configurations.

Yes, and that's part of the point. Because the pdumper doesn't care how
'write' is implemented so long as it's done correctly, the fingerprint
doesn't need to include a checksum of the implementation of 'write'.
There's a good chunk of the Emacs executable that is in the same
category as 'write' - that is, the chunk doesn't matter for the purposes
of the fingerprint. Whether we checksum the irrelevant chunk is a
pragmatic/efficiency issue; it's not needed for correctness.

With this in mind, checksumming the .o files ought to be enough to
generate a fingerprint good enough for the intended purpose of the
checksum. The checksum won't capture how 'write' is implemented, nor
will it capture detailed decisions about how the linker lays out Emacs's
low-level objects, but that's OK: the pdumper doesn't care about those
things so long as they're done correctly.

> The temacs.in mechanism
> covers all current and *and unknown future* binary layout modifications.
> It breaks only if linker symbol arrangement depends on randomness or on
> the precise content of the fingerprint array, and both of these
> possibilities are unlikely.

The mechanism could also "break" with whole-program optimization that
inlines the fingerprint array, or with other plausible future changes to
linkers as they get smarter. But none of this should matter, as any such
"breakage" should be irrelevant to the pdumper.


Reply | Threaded
Open this post in threaded view
|

Re: [Emacs-diffs] master d826037 3/3: Remove the need for temacs.in

Daniel Colascione-5
> On 4/10/19 12:42 PM, Daniel Colascione wrote:
>> PIE and shared libraries are irrelevant. The whole point of pdumper is
>> to
>> be invariant across different PIE and DSO configurations.
>
> Yes, and that's part of the point. Because the pdumper doesn't care how
> 'write' is implemented so long as it's done correctly, the fingerprint
> doesn't need to include a checksum of the implementation of 'write'.
> There's a good chunk of the Emacs executable that is in the same
> category as 'write' - that is, the chunk doesn't matter for the purposes
> of the fingerprint. Whether we checksum the irrelevant chunk is a
> pragmatic/efficiency issue; it's not needed for correctness.
>
> With this in mind, checksumming the .o files ought to be enough to
> generate a fingerprint good enough for the intended purpose of the
> checksum. The checksum won't capture how 'write' is implemented, nor
> will it capture detailed decisions about how the linker lays out Emacs's
> low-level objects, but that's OK: the pdumper doesn't care about those
> things so long as they're done correctly.

No: that's simply wrong. pdumper *does* care about the low-level layout of
objects within Emacs. We have dump-to-emacs relocations based on offsets
from a known symbol within Emacs. The linker deciding to lay out objects
within sections in a different order will break the dump. We don't care
how the linker lays out the object so long as it's the same in the binary
that makes a dump and the binary that loads the dump. Your change makes it
possible for incompatible Emacs binaries to have the same fingerprint.

>> The temacs.in mechanism
>> covers all current and *and unknown future* binary layout modifications.
>> It breaks only if linker symbol arrangement depends on randomness or on
>> the precise content of the fingerprint array, and both of these
>> possibilities are unlikely.
>
> The mechanism could also "break" with whole-program optimization that
> inlines the fingerprint array, or with other plausible future changes to
> linkers as they get smarter.

The optimization you've described doesn't matter: as long as changing the
*value* of the fingerprint array (not its length!) preserves Emacs *object
layout* change in Emacs, we're good.

> But none of this should matter, as any such
> "breakage" should be irrelevant to the pdumper.

It's quite relevant. Please either revert the temacs.in removal patch or
implement the inline fingerprint stamping I described. The way you've left
it is broken.


Reply | Threaded
Open this post in threaded view
|

Re: [Emacs-diffs] master d826037 3/3: Remove the need for temacs.in

Paul Eggert
>> checksumming the .o files ought to be enough to
>> generate a fingerprint good enough for the intended purpose of the
>> checksum.
>
> No: that's simply wrong. pdumper *does* care about the low-level layout of
> objects within Emacs. We have dump-to-emacs relocations based on offsets
> from a known symbol within Emacs.

Ah, sorry, I didn't know that.

> The linker deciding to lay out objects
> within sections in a different order will break the dump.

But why would the linker do that?  It sounds like you're worrying that, even if
we give the linker the same object files in the same order and ask it to link
Emacs again, then the linker might generate a different executable, one that is
incompatible with the previous one.  But if that's the case, the temacs.in
solution has the same problem so we're no worse off than before. And if it's not
the case, then what exactly is the failure scenario here? I'm not seeing any
failure scenarios for the current approach that can't also be failure scenarios
for the previous approach, even now that I know that pdumper cares about object
order within a section.

>> The mechanism could also "break" with whole-program optimization that
>> inlines the fingerprint array, or with other plausible future changes to
>> linkers as they get smarter.
>
> The optimization you've described doesn't matter: as long as changing the
> *value* of the fingerprint array (not its length!) preserves Emacs *object
> layout* change in Emacs, we're good.

The optimization I describe could migrate some or all of the fingerprint array's
contents into the code, with the amount of migration depending on the contents
of the fingerprint array, and with the migrated contents omitted from the
fingerprint array. I don't see how object layout (in the sense that you
describe) would be preserved in that scenario.

Reply | Threaded
Open this post in threaded view
|

Re: [Emacs-diffs] master d826037 3/3: Remove the need for temacs.in

Stefan Monnier
In reply to this post by Daniel Colascione-5
> Please don't do this. Computing a fingerprint over temacs.in factors link
> layout information into the fingerprint hash. Your approach doesn't. It's
> possible to link Emacs in different ways from the same object files and
> produce different binaries. I don't think a little build speedup is worth
> the safety loss.

BTW, regarding this fingerprint: following this discussion, I see that
we actually don't know for sure what is preserved between two different
runs of `ld` and more to the point, there is no guarantee.

So, how'bout we don't compute a fingerprint at all.

Instead we generate a UUID: this way we can have our cake (no need for
a separate temacs.in) and eat it too (even very subtle changes `ld` does
between two different runs will lead to different "fingerprints").


        Stefan


Reply | Threaded
Open this post in threaded view
|

Re: [Emacs-diffs] master d826037 3/3: Remove the need for temacs.in

Daniel Colascione-5
>> Please don't do this. Computing a fingerprint over temacs.in factors
>> link
>> layout information into the fingerprint hash. Your approach doesn't.
>> It's
>> possible to link Emacs in different ways from the same object files and
>> produce different binaries. I don't think a little build speedup is
>> worth
>> the safety loss.
>
> BTW, regarding this fingerprint: following this discussion, I see that
> we actually don't know for sure what is preserved between two different
> runs of `ld` and more to the point, there is no guarantee.
>
> So, how'bout we don't compute a fingerprint at all.
>
> Instead we generate a UUID: this way we can have our cake (no need for
> a separate temacs.in) and eat it too (even very subtle changes `ld` does
> between two different runs will lead to different "fingerprints").

I thought about just making it random --- but I like reproducible and
deterministic builds. We could use the build GUID that some linkers can
compute from the build inputs (a bit like how we compute the fingerprint),
but there's no portable way to get it at runtime, and I really want to get
out of the ELF- and PE-parsing business.


Reply | Threaded
Open this post in threaded view
|

Re: [Emacs-diffs] master d826037 3/3: Remove the need for temacs.in

Daniel Colascione-5
In reply to this post by Paul Eggert
>> The linker deciding to lay out objects
>> within sections in a different order will break the dump.
>
> But why would the linker do that?  It sounds like you're worrying that,
> even if
> we give the linker the same object files in the same order and ask it to
> link
> Emacs again, then the linker might generate a different executable, one
> that is
> incompatible with the previous one.

Who knows what linkers might do? Change is especially likely in an LTO
world  when different linker flags or the linker binary changes? I want to
future-proof the fingerprint mechanism by fingerprinting something as
close as possible to the actual binary we run.

> But if that's the case, the temacs.in
> solution has the same problem so we're no worse off than before. And if
> it's not
> the case, then what exactly is the failure scenario here? I'm not seeing
> any
> failure scenarios for the current approach that can't also be failure
> scenarios
> for the previous approach, even now that I know that pdumper cares about
> object
> order within a section.

It's likely that the linker to lay out objects in a section identically in
two different builds when the only difference between the builds is the
content of an array. If we're worried about the array being folded into
the code, we can make it volatile. If someone changes a linker flag that
changes object ordering, temacs.in will change. If the linker gets
upgraded and flips the order of two variables in .data, temacs.in will
change.

>>> The mechanism could also "break" with whole-program optimization that
>>> inlines the fingerprint array, or with other plausible future changes
>>> to
>>> linkers as they get smarter.
>>
>> The optimization you've described doesn't matter: as long as changing
>> the
>> *value* of the fingerprint array (not its length!) preserves Emacs
>> *object
>> layout* change in Emacs, we're good.
>
> The optimization I describe could migrate some or all of the fingerprint
> array's
> contents into the code, with the amount of migration depending on the
> contents
> of the fingerprint array, and with the migrated contents omitted from the
> fingerprint array. I don't see how object layout (in the sense that you
> describe) would be preserved in that scenario.

Even if the array is migrated into code, changes to linker configuration
will still change the temacs.in fingerprint.

The invariant here isn't that temacs.in has to be same as temacs, but
rather that if temacs_1 differs from temacs_2 in ways that pdumper cares
about, then temacs.in_1 must have hash different from temacs.in_2. My
objection is that the object-hashing approach can result in a situation in
which temacs_1 and temacs_2 differ in ways that pdumper cares about but
nevertheless have the same fingerprint.


Reply | Threaded
Open this post in threaded view
|

Re: [Emacs-diffs] master d826037 3/3: Remove the need for temacs.in

Stefan Monnier
In reply to this post by Daniel Colascione-5
> I thought about just making it random --- but I like reproducible and
> deterministic builds.

Oh, right.


        Stefan

Reply | Threaded
Open this post in threaded view
|

Re: [Emacs-diffs] master d826037 3/3: Remove the need for temacs.in

Paul Eggert
In reply to this post by Daniel Colascione-5
Daniel Colascione wrote:

>> what exactly is the failure scenario here? I'm not seeing any
>> failure scenarios for the current approach that can't also be failure
>> scenarios for the previous approach
>
> It's likely that the linker to lay out objects in a section identically in
> two different builds when the only difference between the builds is the
> content of an array.

Sure, but it's even more likely for linkers to lay out objects in a section
identically in two different builds when there is no difference whatsoever
between the inputs to the builds. So I'm still not seeing the failure scenario
for the current approach that wouldn't also be a failure scenario for the
previous (temacs.in) approach.

> If we're worried about the array being folded into
> the code, we can make it volatile.

That wouldn't be enough; we'd need the volatile accesses to memory under the
program's control being tricky enough (they aren't now) so that the compiler
couldn't optimize them away or reorder the array or whatever. Admittedly this is
getting a little theoretical (but then this particular point is pretty
theoretical anyway :-).

> If someone changes a linker flag that
> changes object ordering, temacs.in will change.

Right, but that can also affect temacs in the previous approach; that is,
temacs.in might be linked with different flags than temacs is. Or the linker
might be upgraded between the time that temacs.in is built, and the time that
temacs is built. So these failure scenarios apply to the previous approach too.

If we rely on a fingerprint we have to give up on the idea of an ironclad
guarantee that if the fingerprint matches, Emacs is compatible. We have to
settle for just a high-enough probability in practice.

We could document ways in which the low-probability events can occur (hash
collision, linker change that breaks reproducible builds, etc.). Or we could
change the pdumper so that it doesn't rely on a fingerprint: instead, Emacs
could record a complete description of what it's assuming in the dump file, and
check this description when it reads the dump back in. However, that'd be some
work and is almost surely overkill.

Reply | Threaded
Open this post in threaded view
|

Re: [Emacs-diffs] master d826037 3/3: Remove the need for temacs.in

Daniel Colascione-5
> Daniel Colascione wrote:
>
>>> what exactly is the failure scenario here? I'm not seeing any
>>> failure scenarios for the current approach that can't also be failure
>>> scenarios for the previous approach
>>
>> It's likely that the linker to lay out objects in a section identically
>> in
>> two different builds when the only difference between the builds is the
>> content of an array.
>
> Sure, but it's even more likely for linkers to lay out objects in a
> section
> identically in two different builds when there is no difference whatsoever
> between the inputs to the builds. So I'm still not seeing the failure
> scenario
> for the current approach that wouldn't also be a failure scenario for the
> previous (temacs.in) approach.

You're not capturing all the inputs into the build. What about the linker
itself? What about arguments? What about the environment? Consider a
difference in section alignment between two versions of a linker. Consider
LTO, in which the differences may be even more drastic, since in this
case, object files contain an IR and not even machine code. What are you
going to do, fold into the hash the linker itself and all the code and
data on which it depends?

>> If we're worried about the array being folded into
>> the code, we can make it volatile.
>
> That wouldn't be enough; we'd need the volatile accesses to memory under
> the
> program's control being tricky enough (they aren't now) so that the
> compiler
> couldn't optimize them away or reorder the array or whatever. Admittedly
> this is
> getting a little theoretical (but then this particular point is pretty
> theoretical anyway :-).

It is theoretical: I agree. Whether or not the array is folded into
program code doesn't matter.

>> If someone changes a linker flag that
>> changes object ordering, temacs.in will change.
>
> Right, but that can also affect temacs in the previous approach; that is,
> temacs.in might be linked with different flags than temacs is. Or the
> linker
> might be upgraded between the time that temacs.in is built, and the time
> that
> temacs is built. So these failure scenarios apply to the previous approach
> too.

No, the temacs.in approach is *not* broken. temacs.in doesn't have to be
identical to temacs in order for the scheme to work. The only requirement
is that if something in the build environment or in Emacs itself changes
incompatibly, then temacs.in changes. If the temacs.in scheme *were* to
break, it would have to be in such a way that 1) some change in Emacs or
the environment resulted in temacs.in_1 (before the change) and
temacs.in_2 (after the change) having the same hash, but temacs_1 (before
the change) and temacs_2 (after the change) being different in a way that
breaks pdumper. Can you think of such a scenario? I prefer the temacs.in
scheme because it's  agnostic to whatever it is that the linker might be
doing. It's future-proof.

> If we rely on a fingerprint we have to give up on the idea of an ironclad
> guarantee that if the fingerprint matches, Emacs is compatible. We have to
> settle for just a high-enough probability in practice.
>
> We could document ways in which the low-probability events can occur (hash
> collision, linker change that breaks reproducible builds, etc.). Or we
> could
> change the pdumper so that it doesn't rely on a fingerprint: instead,
> Emacs
> could record a complete description of what it's assuming in the dump
> file, and
> check this description when it reads the dump back in. However, that'd be
> some
> work and is almost surely overkill.


If you want to address the build time issue, just rewrite the fingerprint
in place. I very strongly suspect that a simple volatile declaration will
be sufficient to ensure that the fingerprint array is contiguous in the
binary. We could even locate the array in temacs.in via brute-force
search, substituting a well-known highly-unlikely byte sequence as the
dummy fingerprint.


Reply | Threaded
Open this post in threaded view
|

Re: [Emacs-diffs] master d826037 3/3: Remove the need for temacs.in

Eli Zaretskii
Paul, I think this discussion pointed out that there are no advantages
to your change, while it does have disadvantages.  So I think we
should revert that change, and go back to fingerprinting the
executable.

OK?

Reply | Threaded
Open this post in threaded view
|

Re: [Emacs-diffs] master d826037 3/3: Remove the need for temacs.in

Stefan Monnier
> Paul, I think this discussion pointed out that there are no advantages
> to your change, while it does have disadvantages.

Hmm... it does have the advantage of speeding up compilation by
eliminating one `ld` call (and one that's on the critical path and
doesn't benefit from multiple cores).  I find this non-negligible.


        Stefan


Reply | Threaded
Open this post in threaded view
|

Re: [Emacs-diffs] master d826037 3/3: Remove the need for temacs.in

Daniel Colascione-5
On 4/13/19 8:40 PM, Stefan Monnier wrote:
>> Paul, I think this discussion pointed out that there are no advantages
>> to your change, while it does have disadvantages.
>
> Hmm... it does have the advantage of speeding up compilation by
> eliminating one `ld` call (and one that's on the critical path and
> doesn't benefit from multiple cores).  I find this non-negligible.

Implementing the rewrite-in-place idea would make us both happy,
wouldn't it?

Reply | Threaded
Open this post in threaded view
|

Re: [Emacs-diffs] master d826037 3/3: Remove the need for temacs.in

Stefan Monnier
> Implementing the rewrite-in-place idea would make us both happy,
> wouldn't it?

Could be, yes.  I was thinking that if we use an initial fingerprint
value that's sufficiently unique, we could just look for this special
value and replace it without knowing "anything" about the executable
file format.

Another approach might be to generate the fingerprint on-the-fly at run
time (i.e. everytime a snapshot is dumped or loaded).  Not sure if that
can be made cheap enough easily enough, tho.


        Stefan

Reply | Threaded
Open this post in threaded view
|

Re: [Emacs-diffs] master d826037 3/3: Remove the need for temacs.in

Eli Zaretskii
In reply to this post by Stefan Monnier
> From: Stefan Monnier <[hidden email]>
> Date: Sat, 13 Apr 2019 23:40:52 -0400
>
> > Paul, I think this discussion pointed out that there are no advantages
> > to your change, while it does have disadvantages.
>
> Hmm... it does have the advantage of speeding up compilation by
> eliminating one `ld` call

By how much?  is it significant enough to countermand the
disadvantages?

Reply | Threaded
Open this post in threaded view
|

Re: [Emacs-diffs] master d826037 3/3: Remove the need for temacs.in

Stefan Monnier
>> > Paul, I think this discussion pointed out that there are no advantages
>> > to your change, while it does have disadvantages.
>> Hmm... it does have the advantage of speeding up compilation by
>> eliminating one `ld` call
> By how much?

I didn't bother to measure it, but I felt like it is a noticeable
difference in some of my use cases (because I sometimes noticed that
the make was stopped while linking temacs.in, and because it is also
often stopped while linking temacs).

> is it significant enough to countermand the disadvantages?

It all depends on your beliefs, not on technical issues, I think.

The fingerprint is not guaranteed foolproof when computed from the *.o
files and is not guaranteed foolproof when made from the temacs.in
file either.

Also the fingerprint is actually technically only needed to catch
misuses (when the user mistakenly uses a snapshot together with the wrong
emacs executable).

So it's a question of "how much work are we willing to do in order to
try and catch misuses".

Personally I appreciated the speed up, and found it to be worthwhile
compared to the slightly higher risk of not noticing a misuse.

But I'll let others choose which color we should choose for
this bikeshed.


        Stefan

Reply | Threaded
Open this post in threaded view
|

Re: [Emacs-diffs] master d826037 3/3: Remove the need for temacs.in

Daniel Colascione-5
On Apr 14, 2019 7:55 AM, Stefan Monnier <[hidden email]> wrote:

>> > Paul, I think this discussion pointed out that there are no advantages
>> > to your change, while it does have disadvantages.
>> Hmm... it does have the advantage of speeding up compilation by
>> eliminating one `ld` call
> By how much?

I didn't bother to measure it, but I felt like it is a noticeable
difference in some of my use cases (because I sometimes noticed that
the make was stopped while linking temacs.in, and because it is also
often stopped while linking temacs).

> is it significant enough to countermand the disadvantages?

It all depends on your beliefs, not on technical issues, I think.

The fingerprint is not guaranteed foolproof when computed from the *.o
files and is not guaranteed foolproof when made from the temacs.in
file either. 


Isn't it? The two approaches are not equally robust. The temacs.in approach is as close as you're going to get to foolproof and future proof. Can you point out an instance, exempting hash collision, in which it fails?
Reply | Threaded
Open this post in threaded view
|

Re: [Emacs-diffs] master d826037 3/3: Remove the need for temacs.in

Stefan Monnier
>  The fingerprint is not guaranteed foolproof when computed from the *.o
>  files and is not guaranteed foolproof when made from the temacs.in
>  file either.
>
> Isn't it?

I don't see anything in the semantics of `ld` which makes the guarantees
we'd need, no.  Maybe current `ld` does, in practice, of course.

> The two approaches are not equally robust.

No, indeed.

> The temacs.in approach is as close as you're going to get to foolproof
> and future proof.

Might be.  Actually backpatching the fingerprint into `temacs` might be
in some cases more robust (it doesn't assume that both runs of `ld`
produce the same result).

But my point is just that it's a question of degree.
Just like the compilation time is a question of degree.

Noone is wrong or right here, these are just personal preferences.
I use machines whose age spans between 16 and 5 years (my main work
laptop is more than 10 years old because it's more or less the most
recent 4:3 I could find) so I value speed ups in compilation time
probably more than others on this list.


        Stefan

12