bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?

classic Classic list List threaded Threaded
71 messages Options
1234
Reply | Threaded
Open this post in threaded view
|

bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?

Mattias Engdegård-2
Andrea, I see nothing directly wrong with your patch, but perhaps our messages went past one another since our lists of proposed pure functions differ.

> More useful would be the ability to constant-fold ash, expt, %, mod and abs for a subset of each respective domain. I can write a patch.

Here is that patch.


0001-Constant-fold-mod-ash-expt-and-abs-with-constant-int.patch (4K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?

Emacs - Bugs mailing list
Mattias Engdegård <[hidden email]> writes:

> More useful would be the ability to constant-fold ash, expt, %, mod and abs for a subset of each respective domain. I can write a patch.

Hi Mattias,

I'm not sure what would be more useful, I guess both are a good thing to
have.  Another reason why I'm interested is that I reuse these
definitions in the native compiler.

> Andrea, I see nothing directly wrong with your patch, but perhaps our messages went past one another since our lists of proposed pure functions differ.

yes that exactly what happen, thanks for looking at it.

I diffed our two lists and this is the results, functions included in
mine but not in your:

consp, hash-table-p, listp, nlistp, not, null, string-lessp, stringp,
symbolp.

functions included in your but not in mine:

characterp, integer-or-marker-p, keywordp, length, number-or-marker-p,
numberp, safe-length, sequencep.

I guess for the most part I can just include the one I've missed.  I'm
not only sure about the one operating on lists like `length' given the
list may be modified in the runtime (?).

I'll update the patch once this point is clarified.

Thanks

  Andrea



Reply | Threaded
Open this post in threaded view
|

bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?

Emacs - Bugs mailing list
Mattias Engdegård <[hidden email]> writes:

> 1 juli 2020 kl. 23.31 skrev Andrea Corallo <[hidden email]>:
>
>> Another reason why I'm interested is that I reuse these
>> definitions in the native compiler.
>
> In that case there are probably more functions you may want to consider for purity -- what about:
>
> < > <= >= = /=
> string< string= string-equal
> eq eql equal
> proper-list-p
> identity
> memq memql member
> assq assql assoc

Good point

>> I guess for the most part I can just include the one I've missed.
>
> By all means, but do not take my word for the correctness of my list -- think it through yourself. We mustn't err here.
>
>> I'm not only sure about the one operating on lists like `length' given the
>> list may be modified in the runtime (?).
>
> Not sure why this would be an obstacle, but I could have overlooked
> something important! Could you explain your thinking in greater
> detail, and if possible provide an example of code that you think
> might be miscompiled if 'length' or 'safe-length' were marked pure?

No, thinking about I believe you are correct.

I mixed in mind the fact that now the native compiler must handle
correctly in the CFG also pure functions taking mutable objects (given we
are adding them), but that is unrelated.  This is no problem for the
bytecompiler given the constant folding is done only locally.

So yes these are pure functions and so they should be marked.

> I still wonder if there is any reason to limit arithmetic constant
> folding to the portable fixnum range. Given that we don't evaluate
> fixnump or bignump at compile-time, what observable effects would
> constant-folding, say, (ash 1 32) have? Advice from deeper thinkers
> solicited!

I always thought the general idea is to respect the allocation side
effect we have creating a bignum.  Not sure if the class of example you
have in mind here fits this case.

Thanks

  Andrea



Reply | Threaded
Open this post in threaded view
|

bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?

Mattias Engdegård-2
2 juli 2020 kl. 12.59 skrev Andrea Corallo <[hidden email]>:

>> I still wonder if there is any reason to limit arithmetic constant
>> folding to the portable fixnum range. Given that we don't evaluate
>> fixnump or bignump at compile-time, what observable effects would
>> constant-folding, say, (ash 1 32) have? Advice from deeper thinkers
>> solicited!
>
> I always thought the general idea is to respect the allocation side
> effect we have creating a bignum.  Not sure if the class of example you
> have in mind here fits this case.

Number allocation isn't a semantically visible effect and we probably don't want to change that. As far as I can tell, only fixnump and bignump can discriminate fixnums from bignums. There may be functions that only accept fixnums as arguments and thus fail with a different error, but I don't think we constant-fold any of them, and they would be easy to fix if we did.

It may be preferable to defer generation of very big numbers to run-time, to avoid evaluation of (ash 1 1000) at compile-time, but  such a limit should, if implemented, be independent of the fixnum limit (and likely higher).




Reply | Threaded
Open this post in threaded view
|

bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?

Emacs - Bugs mailing list
Mattias Engdegård <[hidden email]> writes:

> 2 juli 2020 kl. 12.59 skrev Andrea Corallo <[hidden email]>:
>
>>> I still wonder if there is any reason to limit arithmetic constant
>>> folding to the portable fixnum range. Given that we don't evaluate
>>> fixnump or bignump at compile-time, what observable effects would
>>> constant-folding, say, (ash 1 32) have? Advice from deeper thinkers
>>> solicited!
>>
>> I always thought the general idea is to respect the allocation side
>> effect we have creating a bignum.  Not sure if the class of example you
>> have in mind here fits this case.
>
> Number allocation isn't a semantically visible effect and we probably
> don't want to change that.

Well is cons allocation a semantically visible effect then?  How is it
different?

I thought the reason why cons is not constant folded is to respect the
allocation side effect, at least that's what I convinced my-self of :)

  Andrea



Reply | Threaded
Open this post in threaded view
|

bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?

Mattias Engdegård-2
2 juli 2020 kl. 15.56 skrev Andrea Corallo <[hidden email]>:

> Well is cons allocation a semantically visible effect then?  How is it
> different?

Conses are mutable and thus each have their own identity. Numbers are immutable and have none; there is no defined way to distinguish two numbers that have the same value ('eq' does not give well-defined results). The compiler is free to, and does, deduplicate equal bignums. For instance, try

(disassemble (lambda () (list 18723645817263338474859 18723645817263338474859)))

and you will see that the resulting code will only contain one instance of the number.

> I thought the reason why cons is not constant folded is to respect the
> allocation side effect, at least that's what I convinced my-self of :)

Yes, in the sense that

 (defun f () (cons 'a 'b))

produces a fresh (a . b) each time (f) is called, because the returned values can be distinguished both explicitly by 'eq' and by mutating it and observing whether the change affects previously returned values or not. Neither works for numbers.




Reply | Threaded
Open this post in threaded view
|

bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?

Emacs - Bugs mailing list
Mattias Engdegård <[hidden email]> writes:

>> I thought the reason why cons is not constant folded is to respect the
>> allocation side effect, at least that's what I convinced my-self of :)
>
> Yes, in the sense that
>
>  (defun f () (cons 'a 'b))
>
> produces a fresh (a . b) each time (f) is called, because the returned
> values can be distinguished both explicitly by 'eq' and by mutating it
> and observing whether the change affects previously returned values or
> not. Neither works for numbers.

Understand makes perfectly sense.  Cons allocation is something
very visible in Lisp.

  Andrea



Reply | Threaded
Open this post in threaded view
|

bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?

Mattias Engdegård-2
In reply to this post by Mattias Engdegård-2
2 juli 2020 kl. 17.49 skrev Stefan Monnier <[hidden email]>:

> Better yet, there's still hope that we change things such that `eq`
> behaves like `eql` on bignums (and maybe also on floats).

Speaking of which, Andrea may be in a good position to provide us with performance data about such a change, since making 'eq' more expensive is likely to be more visible in native code (assuming the operation is open-coded) than in bytecode or interpreted lisp. On the other hand, perhaps his compiler thingamajig is able to eliminate some checks statically by type propagation?

Anyway, since we now have bignums and have standardised on IEEE 754 binary64 floats, is there a reason to keep byte-opt-portable-numberp?

If we want to make allowance for capricious x87 rounding, what about rewriting it to accept integral floats in the ±2^53 range, as well as any integer? This is what it might look like:


0001-Relax-portable-number-predicate-in-byte-compiler.patch (3K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?

Philipp Stephani
In reply to this post by Emacs - Bugs mailing list
Am Do., 2. Juli 2020 um 12:28 Uhr schrieb Mattias Engdegård <[hidden email]>:

>
> 1 juli 2020 kl. 23.31 skrev Andrea Corallo <[hidden email]>:
>
> > Another reason why I'm interested is that I reuse these
> > definitions in the native compiler.
>
> In that case there are probably more functions you may want to consider for purity -- what about:
>
> < > <= >= = /=
> string< string= string-equal
> eq eql equal
> proper-list-p
> identity
> memq memql member
> assq assql assoc

I don't think most of those are pure, as they have to "look into" an
object. For example, the result of "equal" does not only depend on the
argument objects, but also the objects they refer to.



Reply | Threaded
Open this post in threaded view
|

bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?

Stefan Monnier
In reply to this post by Mattias Engdegård-2
>> Better yet, there's still hope that we change things such that `eq`
>> behaves like `eql` on bignums (and maybe also on floats).
> Speaking of which, Andrea may be in a good position to provide us with
> performance data about such a change, since making 'eq' more expensive is
> likely to be more visible in native code (assuming the operation is
> open-coded) than in bytecode or interpreted lisp. On the other hand, perhaps
> his compiler thingamajig is able to eliminate some checks statically by
> type propagation?

Note that it can also be done without slowing down `eq` (by
hash-consing the bignums).


        Stefan




Reply | Threaded
Open this post in threaded view
|

bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?

Paul Eggert
On 7/2/20 12:38 PM, Stefan Monnier wrote:
> Note that it can also be done without slowing down `eq` (by
> hash-consing the bignums).

Yes, that's a better way to go. I wrote a patch to do that a while ago but never
got around to the laborious part, which was benchmarking.



Reply | Threaded
Open this post in threaded view
|

bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?

Paul Eggert
In reply to this post by Mattias Engdegård-2
On 7/2/20 11:01 AM, Mattias Engdegård wrote:
> If we want to make allowance for capricious x87 rounding, what about rewriting it to accept integral floats in the ±2^53 range, as well as any integer? This is what it might look like:

Another plausible option would be for Emacs to drop support for x87 rounding, in
the interest of portability. All Apple Intel machines have had SSE2, Microsoft
has been requiring SSE2 by default since MSVC 2012, and it's easy enough to do
the same with GCC and clang.

We'd be in good company as lots of other software packages have dropped support
for non-SSE2 machines, including Cygwin, Java, OpenOffice, PostgreSQL, Python,
QEMU, Thunderbird, VirtualBox; see <http://matejhorvat.si/en/unfiled/nosse2.htm>.



Reply | Threaded
Open this post in threaded view
|

bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?

Stefan Monnier
In reply to this post by Mattias Engdegård-2
> Anyway, since we now have bignums and have standardised on IEEE 754 binary64
> floats, is there a reason to keep byte-opt-portable-numberp?

Indeed, it seems like it might not be needed any more.

> If we want to make allowance for capricious x87 rounding, what about
> rewriting it to accept integral floats in the ±2^53 range, as well as any
> integer? This is what it might look like:

I must say I don't know what x87 rounding has to do with it.

I'd tend to assume that x87 rounding can virtually never be seen from
Elisp because it's hard to imagine how the C compiler will manage to
keep our Elisp floats long enough in the x87 stack to avoid rounding
back to 64bit floats between every Elisp-level operation.
Or are we worried about the double-rounding of x87?


        Stefan




Reply | Threaded
Open this post in threaded view
|

bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?

Mattias Engdegård-2
3 juli 2020 kl. 01.16 skrev Paul Eggert <[hidden email]>:

> As you can see, sometimes SSE2 is closer to the mathematically-correct answer
> and sometimes x87 is. In typical C math code, x87 is better; in Emacs I imagine
> the reverse is true (for the reasons you mentioned), though I have not attempted
> to measure this.

Thanks for the examples and these were indeed what I had in mind (there's also the effect from having a greater exponent range in the intermediate result); Monniaux [1] is a good reference.

In practice, the extra precision of x87 code is so unreliable and fickle (unless the 80-bit long double is used throughout) that it's almost never worth it. (Being much slower doesn't help either.)

Fortunately modern compilers generate SSE code by default, only passing return values on the x87 stack as per the x86 ABI (which causes no harm). This reduces an already tiny risk to nil. We could add an elaborate configure or run-time test and admonishments to the installation instructions but frankly we have better use of our time. I suggest we replace byte-opt--portable-numberp with numberp (or nothing at all, depending on where it occurs) and be done with it.

---
[1] https://hal.archives-ouvertes.fr/hal-00128124/file/floating-point-article.pdf




Reply | Threaded
Open this post in threaded view
|

bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?

Mattias Engdegård-2
In reply to this post by Philipp Stephani
2 juli 2020 kl. 21.09 skrev Philipp Stephani <[hidden email]>:

> I don't think most of those are pure, as they have to "look into" an
> object. For example, the result of "equal" does not only depend on the
> argument objects, but also the objects they refer to.

Unless I'm mistaken, they are pure enough for the purpose of constant folding, where the arguments are already known (constant) at compile-time; do come with a counter-example if you disagree.

Were you thinking about other uses of pure functions? Perhaps our notion of purity is underspecified.




Reply | Threaded
Open this post in threaded view
|

bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?

Paul Eggert
In reply to this post by Mattias Engdegård-2
On 7/3/20 1:32 AM, Mattias Engdegård wrote:

> In practice, the extra precision of x87 code is so unreliable and fickle (unless the 80-bit long double is used throughout) that it's almost never worth it.

Depends on whether you want reproducibility or accuracy. We prefer the former,
it seems.

> Fortunately modern compilers generate SSE code by default

No, GCC generates x87 code by default. You need to specify -mfpmath=sse to
convince it to not generate x87 code. (Or, when you build GCC, you need to pass
--with-mfpmath=sse to 'configure'; but I think this is uncommon, at least in the
GNU/Linux world.)

Having Emacs use --with-mfpmath=sse should improve performance a bit on x86. But
more important, it should make floating point more reproducible. If I get the
time I'll look into having 'configure' add it automatically.



Reply | Threaded
Open this post in threaded view
|

bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?

Mattias Engdegård-2
3 juli 2020 kl. 20.31 skrev Paul Eggert <[hidden email]>:

> No, GCC generates x87 code by default.

Sorry, my mistake -- I was testing various compiler options with godbolt and must have looked at LLVM instead.

> Having Emacs use --with-mfpmath=sse should improve performance a bit on x86. But
> more important, it should make floating point more reproducible. If I get the
> time I'll look into having 'configure' add it automatically.

Thank you, this would be a prerequisite for further improvements.




Reply | Threaded
Open this post in threaded view
|

bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?

Mattias Engdegård-2
In reply to this post by Mattias Engdegård-2
3 juli 2020 kl. 21.05 skrev Andrea Corallo <[hidden email]>:

> attached the updated version of the patch updating the pure function
> classification.

Thanks Andrea! Philipp Stephani raised the interesting question of (essentially) whether 'car' is pure. For the purposes of the current constant folding in the byte compiler the answer is yes, but perhaps you have wider ambitions in your work?

Clearly, (car X) cannot be moved past some operations with side-effects if X is aliased:

(let* ((x (list 'a))
       (y (car x)))
  (f x)
  y)

Here, (car x) cannot be sunk past the call to f despite x remaining unchanged (assuming lexical binding).
It would be useful to know more exactly what notion of purity you require.




Reply | Threaded
Open this post in threaded view
|

bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?

Stefan Monnier
In reply to this post by Mattias Engdegård-2
> Thanks -- patch attached. Some expressions will still not be constant-folded
> entirely; for example

LGTM,


        Stefan




Reply | Threaded
Open this post in threaded view
|

bug#42147: 28.0.50; pure vs side-effect-free, missing optimizations?

Paul Eggert
In reply to this post by Mattias Engdegård-2
On 7/3/20 11:47 AM, Mattias Engdegård wrote:
>> Having Emacs use --with-mfpmath=sse should improve performance a bit on x86. But
>> more important, it should make floating point more reproducible. If I get the
>> time I'll look into having 'configure' add it automatically.
> Thank you, this would be a prerequisite for further improvements.

Attached is a patch to do that. I looked at the GCC source code, and x86 is the
only platform where this sort of thing should be necessary. I'll mention this
patch on emacs-devel, to give people a heads-up before installing.

0001-Prefer-standard-rounding-on-x86.patch (2K) Download Attachment
1234