Using the GNU GMP Library for Bignums in Emacs

classic Classic list List threaded Threaded
280 messages Options
1234 ... 14
Reply | Threaded
Open this post in threaded view
|

Using the GNU GMP Library for Bignums in Emacs

Siraphob (Ben) Phipathananunth
Emacs Calc was written many years ago, and as a result its current
implementation implements bignums purely in Emacs Lisp. Its bignum
operations also use a lot of hacks (such as performing carry
operations). Arbitrary precision arithmetic could be faster if Emacs
had GNU GMP linked to it, with the relevant Emacs Lisp functions added
in C.

What is the consensus on linking the GNU GMP library to Emacs so that
packages such as Emacs Calc (and others) could benefit from using
native types (i.e. "mpz_t") rather than reinventing the wheel?

The benefits of linking GNU GMP are clear. It would make it easier to
extend the math capabilities of Emacs, while making it faster and less
error-prone. However, the downsides, if any exist, should be discussed
as well, to gauge whether pursuing such a task would be fruitful.


Reply | Threaded
Open this post in threaded view
|

Re: Using the GNU GMP Library for Bignums in Emacs

Eli Zaretskii
> From: "Siraphob (Ben) Phipathananunth" <[hidden email]>
> Date: Sat, 21 Apr 2018 21:15:08 +0700
>
> Emacs Calc was written many years ago, and as a result its current
> implementation implements bignums purely in Emacs Lisp. Its bignum
> operations also use a lot of hacks (such as performing carry
> operations). Arbitrary precision arithmetic could be faster if Emacs
> had GNU GMP linked to it, with the relevant Emacs Lisp functions added
> in C.
>
> What is the consensus on linking the GNU GMP library to Emacs so that
> packages such as Emacs Calc (and others) could benefit from using
> native types (i.e. "mpz_t") rather than reinventing the wheel?

I think the consensus is we want that, it's just a matter of someone
doing the job of making it happen.

The design should IMO be discussed up front, because we want not only
to be able to use bignums and arbitrary-precision floating-point
numbers in C, but also in Lisp.  How exactly to expose them to Lisp is
something we should talk about, I think.

Thanks.

Reply | Threaded
Open this post in threaded view
|

Re: Using the GNU GMP Library for Bignums in Emacs

Siraphob (Ben) Phipathananunth
One approach would be to implement a minimal subset of functions
exposed in Emacs Lisp that would allow one to recreate the following
functions that are defined in calc.el :

Function Name              Arguments
============================================
math-bignum                (a)
math-bignum-big            (a)
math-div10-bignum          (a)
math-scale-left-bignum     (a n)
math-scale-right-bignum    (a n)
math-add-bignum            (a b)
math-sub-bignum            (a b)
math-mul-bignum            (a b)
math-mul-bignum-digit      (a d c)
math-div-bignum            (a b)
math-div-bignum-digit      (a b)
math-div-bignum-big        (a b alen blen)
math-div-bignum-part       (a b blen)
math-div-bignum-try        (a b c guess)
math-format-bignum         (a)
math-format-bignum-decimal (a)
math-read-bignum           (s)

There are other functions scattered throughout the calc sources, but
this is the rough idea of the type of functions we would need. This
list would have to be shortened, as when we migrate bignums we would
no longer need the following functions (at least, could be more):

math-mul-bignum-digit      (a d c)
math-div-bignum-digit      (a b)

These functions exist because bignums are implemented as a list of
digits.

I think that keeping the function names the same would be good (for
some backwards compatibility). The /actual/ value of the bignum
(internally, in C) would be a tagged pointer to the mpz_t data type
stored somewhere so that it could be marked for GC (very important to
mark for GC).

With respect to floating points, though, things get a little hairy.

As quoted from the GMP Library
(https://gmplib.org/manual/Floating_002dpoint-Functions.html):

 > The exponent of each float has fixed precision, one machine word on
 > most systems. In the current implementation the exponent is a count of
 > limbs, so for example on a 32-bit system this means a range of roughly
 > 2^-68719476768 to 2^68719476736, or on a 64-bit system this will be
 > much greater. Note however that mpf_get_str can only return an
 > exponent which fits an mp_exp_t and currently mpf_set_str doesn’t
 > accept exponents bigger than a long.

Fortunately, it links to another project called MPFR
http://www.mpfr.org/sample.html which allows us to computing floating
points at arbitrary precision.

However, I'm concerned that adding two different libraries could lead
to issues regarding interoperability (for example, a floating point
number that needs to be converted to a bignum, and vice versa). If
anyone has used MPFR + GMP in the past, please chime in.

GC shouldn't be ignored as well, because as the name implies, these
data objects would be very large.

Thanks,

Siraphob (Ben) Phipathananunth

Eli Zaretskii wrote:

>> From: "Siraphob (Ben) Phipathananunth" <[hidden email]>
>> Date: Sat, 21 Apr 2018 21:15:08 +0700
>>
>> Emacs Calc was written many years ago, and as a result its current
>> implementation implements bignums purely in Emacs Lisp. Its bignum
>> operations also use a lot of hacks (such as performing carry
>> operations). Arbitrary precision arithmetic could be faster if Emacs
>> had GNU GMP linked to it, with the relevant Emacs Lisp functions added
>> in C.
>>
>> What is the consensus on linking the GNU GMP library to Emacs so that
>> packages such as Emacs Calc (and others) could benefit from using
>> native types (i.e. "mpz_t") rather than reinventing the wheel?
>
> I think the consensus is we want that, it's just a matter of someone
> doing the job of making it happen.
>
> The design should IMO be discussed up front, because we want not only
> to be able to use bignums and arbitrary-precision floating-point
> numbers in C, but also in Lisp.  How exactly to expose them to Lisp is
> something we should talk about, I think.
>
> Thanks.
>

Reply | Threaded
Open this post in threaded view
|

Re: Using the GNU GMP Library for Bignums in Emacs

Paul Eggert
Siraphob (Ben) Phipathananunth wrote:
> One approach would be to implement a minimal subset of functions
> exposed in Emacs Lisp that would allow one to recreate the following
> functions that are defined in calc.el :

Surely we should just add bignum support to existing functions +, -, etc.
Wouldn't that suffice for 'calc'? If not, why not? (Of course 'calc' would need
to be changed to exploit the bignums properly, no matter how we add bignums.)

> With respect to floating points, though, things get a little hairy.

This should be a separate task. Bignums alone are quite a large-enough project.
I'm not even sure we should do rationals.

> The /actual/ value of the bignum
> (internally, in C) would be a tagged pointer to the mpz_t data type
> stored somewhere so that it could be marked for GC

For bignums I would think that Emacs shouldn't use the mpz_t data type, as this
would complicate garbage collection. Emacs can use the mp_limb_t data type and
stick with the mpn_* functions.

Reply | Threaded
Open this post in threaded view
|

Re: Using the GNU GMP Library for Bignums in Emacs

Eli Zaretskii
> Cc: [hidden email]
> From: Paul Eggert <[hidden email]>
> Date: Sat, 21 Apr 2018 08:23:33 -0700
>
> Siraphob (Ben) Phipathananunth wrote:
> > One approach would be to implement a minimal subset of functions
> > exposed in Emacs Lisp that would allow one to recreate the following
> > functions that are defined in calc.el :
>
> Surely we should just add bignum support to existing functions +, -, etc.

Exactly my thoughts.

> > With respect to floating points, though, things get a little hairy.
>
> This should be a separate task. Bignums alone are quite a large-enough project.
> I'm not even sure we should do rationals.

Right.

Reply | Threaded
Open this post in threaded view
|

Re: Using the GNU GMP Library for Bignums in Emacs

Siraphob (Ben) Phipathananunth
In reply to this post by Paul Eggert
Paul Eggert wrote:
 > Surely we should just add bignum support to existing functions +, -,
 > etc. Wouldn't that suffice for 'calc'? If not, why not? (Of course
 > 'calc' would need to be changed to exploit the bignums properly, no
 > matter how we add bignums.)

Of course we could to do that. Hopefully there isn't existing
Emacs Lisp code that relies on unsafe arithmetic /anywhere/. If the
functions + - * / operate on bignums (instead of dedicated bignum
functions), would that mean we drop 32/64 bit integers entirely?

Eli Zaretskii wrote:
 > The design should IMO be discussed up front, because we want not only
 > to be able to use bignums and arbitrary-precision floating-point
 > numbers in C, but also in Lisp.

I thought that Eli was talking about how we should interface bignums
to Emacs Lisp; the + - * / operators are defined in C source
code. Bignums would decrease performance in areas where the usual
32/64 bit integers are sufficient, and lead to higher memory usage. It
would make much more sense to have separate math functions for 32/64
bit numbers and for bignums. In doing so, it should be obvious to the
Emacs Lisp programmer when to use what.

Paul Eggert wrote:
 > This should be a separate task. Bignums alone are quite a large-enough
 > project. I'm not even sure we should do rationals.

Rationals would be a part of Emacs Calc, once we have bignums it
should be trivial to reimplement rationals.

Reply | Threaded
Open this post in threaded view
|

Re: Using the GNU GMP Library for Bignums in Emacs

Eli Zaretskii
> Cc: [hidden email]
> From: "Siraphob (Ben) Phipathananunth" <[hidden email]>
> Date: Sat, 21 Apr 2018 22:40:05 +0700
>
> I thought that Eli was talking about how we should interface bignums
> to Emacs Lisp; the + - * / operators are defined in C source
> code.

I never thought we could consider not using + - * etc. with bignums,
so I didn't even raise that issue.

> Bignums would decrease performance in areas where the usual
> 32/64 bit integers are sufficient, and lead to higher memory usage. It
> would make much more sense to have separate math functions for 32/64
> bit numbers and for bignums. In doing so, it should be obvious to the
> Emacs Lisp programmer when to use what.

When the arguments are bignums, these operators should automatically
invoke the relevant GMP functions.  That is how we behave elsewhere in
Emacs; Emacs Lisp does support polymorphism on the level of primitive
functions.  E.g., that's how these operators support both Lisp
integers and Lisp floats (and markers, for that matter).

Reply | Threaded
Open this post in threaded view
|

Re: Using the GNU GMP Library for Bignums in Emacs

Paul Eggert
In reply to this post by Siraphob (Ben) Phipathananunth
Siraphob (Ben) Phipathananunth wrote:

> Hopefully there isn't existing
> Emacs Lisp code that relies on unsafe arithmetic /anywhere/.

I'm afraid your hope will be in vain....

And it depends on what we mean by "unsafe". Is it safe to assume that (eq (1+ 0)
1) returns t, for example? The Scheme standard says "no" but we might decide
that 'eq' should "work" for fixnums in Emacs Lisp. That sort of thing.

> If the functions + - * / operate on bignums (instead of dedicated bignum
> functions), would that mean we drop 32/64 bit integers entirely?

No, it'd mean we'd still have fixnums vs bignums internally, and most programs
wouldn't care whether an integer is represented via fixnum or bignum, as it'd be
an issue merely of efficiency. Some programs would probably still care though
(for efficiency reasons), and GNU Calc quite possibly would be one such program.

> It would make much more sense to have separate math functions for 32/64
> bit numbers and for bignums. In doing so, it should be obvious to the
> Emacs Lisp programmer when to use what.

That's not how other Lisps work, by and large. They have just one integer type
and one set of functions. The goal is to simplify the job of writing a program.

Reply | Threaded
Open this post in threaded view
|

Re: Using the GNU GMP Library for Bignums in Emacs

Markus Triska-4
In reply to this post by Siraphob (Ben) Phipathananunth
"Siraphob (Ben) Phipathananunth" <[hidden email]> writes:

> The benefits of linking GNU GMP are clear. It would make it easier to
> extend the math capabilities of Emacs, while making it faster and less
> error-prone. However, the downsides, if any exist, should be discussed
> as well, to gauge whether pursuing such a task would be fruitful.

Using GMP has a significant downside if you run out of memory: There’s
currently no defined way for the allocation functions to recover from an
error such as out of memory, they must *terminate* program execution.

Please see the following page for details:

   https://gmplib.org/manual/Custom-Allocation.html

Thus, situtations where you can currently throw an Emacs error that can
be handled in user code would instead terminate the Emacs process. This
can for example arise from malicious input that involves very large
integers as results (7^7^7^7 etc.), and of course also elsewhere.

Also, it may become hard to stop long GMP calculations, whereas you can
easily stop computations that are written in Elisp. Thus, long GMP
calculations may lead to denial of editing attacks.

All the best,
Markus


Reply | Threaded
Open this post in threaded view
|

Re: Using the GNU GMP Library for Bignums in Emacs

Eli Zaretskii
> From: Markus Triska <[hidden email]>
> Date: Sat, 21 Apr 2018 18:46:07 +0200
>
> Using GMP has a significant downside if you run out of memory: There’s
> currently no defined way for the allocation functions to recover from an
> error such as out of memory, they must *terminate* program execution.
>
> Please see the following page for details:
>
>    https://gmplib.org/manual/Custom-Allocation.html
>
> Thus, situtations where you can currently throw an Emacs error that can
> be handled in user code would instead terminate the Emacs process. This
> can for example arise from malicious input that involves very large
> integers as results (7^7^7^7 etc.), and of course also elsewhere.
>
> Also, it may become hard to stop long GMP calculations, whereas you can
> easily stop computations that are written in Elisp. Thus, long GMP
> calculations may lead to denial of editing attacks.

How are those dangers different from using any other external
library.  Like the JSON library, for excample, or libxml2?

Reply | Threaded
Open this post in threaded view
|

Re: Using the GNU GMP Library for Bignums in Emacs

Markus Triska-4
Eli Zaretskii <[hidden email]> writes:

> How are those dangers different from using any other external
> library.  Like the JSON library, for excample, or libxml2?

Please note that in particular the first issue I mentioned is quite
specific to GMP: It could be solved by providing a different API, or by
generalizing the existing API to let applications safely handle such
situations. Some of the libraries you mention may support this.

The second issue I mentioned is also quite specific to GMP, since short
arithmetic expressions may cause long running times for calculations. Of
course, this issue may in principle also arise in other libraries.

All the best,
Markus


Reply | Threaded
Open this post in threaded view
|

Re: Using the GNU GMP Library for Bignums in Emacs

Paul Eggert
In reply to this post by Markus Triska-4
Markus Triska wrote:

> Using GMP has a significant downside if you run out of memory: There’s
> currently no defined way for the allocation functions to recover from an
> error such as out of memory, they must *terminate* program execution.

This should not be a problem if we use GMP's mpn_* functions. They never
allocate memory; it is the user's responsibility to allocate it. This should fit
better with how Emacs does things internally.

> Also, it may become hard to stop long GMP calculations, whereas you can
> easily stop computations that are written in Elisp. Thus, long GMP
> calculations may lead to denial of editing attacks.

Yes, this could be a problem. Perhaps we'll need to extend GMP to fix it. On the
other hand, no doubt there are other, similar problems in GNU Emacs (just look
for calls to memcpy :-), and perhaps this problem could just be added to the list.

Reply | Threaded
Open this post in threaded view
|

Re: Using the GNU GMP Library for Bignums in Emacs

Richard Stallman
In reply to this post by Siraphob (Ben) Phipathananunth
[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > Of course we could to do that. Hopefully there isn't existing
  > Emacs Lisp code that relies on unsafe arithmetic /anywhere/. If the
  > functions + - * / operate on bignums (instead of dedicated bignum
  > functions), would that mean we drop 32/64 bit integers entirely?

To eliminate the current types for small integers would
require rewriting of much of the C code in Emacs.
It would be better to represent small integers as now,
and have a different structure for larger integers.

--
Dr Richard Stallman
President, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)
Skype: No way! See https://stallman.org/skype.html.


Reply | Threaded
Open this post in threaded view
|

Re: Using the GNU GMP Library for Bignums in Emacs

Richard Stallman
In reply to this post by Paul Eggert
[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > Yes, this could be a problem. Perhaps we'll need to extend GMP to fix it.

The only things that might interrupt Lisp code are signals.
If C-g is detected by a signal, it can interrupt almost anything in C,
including computation in a GMP function,

--
Dr Richard Stallman
President, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)
Skype: No way! See https://stallman.org/skype.html.


Reply | Threaded
Open this post in threaded view
|

Re: Using the GNU GMP Library for Bignums in Emacs

Eli Zaretskii
> From: Richard Stallman <[hidden email]>
> Date: Sat, 21 Apr 2018 18:42:52 -0400
> Cc: [hidden email], [hidden email]
>
> The only things that might interrupt Lisp code are signals.
> If C-g is detected by a signal, it can interrupt almost anything in C,
> including computation in a GMP function,

We've changed how signals are processed in Emacs several years ago.
Nowadays, a signal handler just sets a flag, and the Lisp interpreter
tests that flag "when appropriate" (normally, as part of maybe_quit).

This change was done to avoid non-trivial processing inside signal
handlers.

Reply | Threaded
Open this post in threaded view
|

Re: Using the GNU GMP Library for Bignums in Emacs

Daniel Colascione-5
In reply to this post by Richard Stallman
> [[[ To any NSA and FBI agents reading my email: please consider    ]]]
> [[[ whether defending the US Constitution against all enemies,     ]]]
> [[[ foreign or domestic, requires you to follow Snowden's example. ]]]
>
>   > Of course we could to do that. Hopefully there isn't existing
>   > Emacs Lisp code that relies on unsafe arithmetic /anywhere/. If the
>   > functions + - * / operate on bignums (instead of dedicated bignum
>   > functions), would that mean we drop 32/64 bit integers entirely?
>
> To eliminate the current types for small integers would
> require rewriting of much of the C code in Emacs.
> It would be better to represent small integers as now,
> and have a different structure for larger integers.
>

I'd love to see Emacs get transparent bigint support. Python semantics are
fine, as is using a normal int representation at the C level. Adding
transparent bigints as Lisp types doesn't require us to increase various
Emacs core limits right away.


Reply | Threaded
Open this post in threaded view
|

Re: Using the GNU GMP Library for Bignums in Emacs

Siraphob (Ben) Phipathananunth
In reply to this post by Richard Stallman
Richard Stallman wrote:
 > To eliminate the current types for small integers would
 > require rewriting of much of the C code in Emacs.
 > It would be better to represent small integers as now,
 > and have a different structure for larger integers.

 From what I understand, we would want to use fixnums by default in the
C code, and convert to bignums automatically (in lisp) when the number
exceeds the range of a fixnum, while retaining behavior as before,
using the regular math operations + - * / (and more) to interface this
to Lisp.

Paul Eggert wrote:
 > programs wouldn't care whether an integer is represented via fixnum or
 > bignum, as it'd be an issue merely of efficiency.

Would it not slow down computation to have to constantly convert
between the two types? (especially if the computation is switching
above/below the fixnum/bignum boundary). In such a case, a fix could
be to convert lisp numbers exceeding fixnum limits to bignums for the
rest of the number's life (until GC). This ensures memory usage is kept
low for fixnum computations.

Reply | Threaded
Open this post in threaded view
|

Re: Using the GNU GMP Library for Bignums in Emacs

Paul Eggert
Siraphob (Ben) Phipathananunth wrote:
> Would it not slow down computation to have to constantly convert
> between the two types?

A bit perhaps, but it shouldn't be that big a deal. Most integer computation
will likely be fixnum only, and the only slowdown there will be integer overflow
checking that we currently aren't doing. With decent hardware and compiler, I'd
guess this would cost us three machine instructions per Lisp arithmetic
operation, including the conditional branch that is typically not taken. Hardly
anybody will notice.
> In such a case, a fix could
> be to convert lisp numbers exceeding fixnum limits to bignums for the
> rest of the number's life (until GC). This ensures memory usage is kept
> low for fixnum computations.

Yes, the idea is to use bignums to represent numbers outside of fixnum range,
and to use fixnums to represent numbers inside fixnum range. Bignums require
garbage collection; fixnums do not.

Reply | Threaded
Open this post in threaded view
|

Re: Using the GNU GMP Library for Bignums in Emacs

Helmut Eller-2
In reply to this post by Paul Eggert
On Sat, Apr 21 2018, Paul Eggert wrote:

> Surely we should just add bignum support to existing functions +, -,
> etc.

I'm not so sure of that.  E.g. (format "%x" -1) => "3fffffffffffffff".
That doesn't make any sense on if -1 is a bignum.

Helmut


Reply | Threaded
Open this post in threaded view
|

Re: Using the GNU GMP Library for Bignums in Emacs

Philipp Stephani
In reply to this post by Daniel Colascione-5


<[hidden email]> schrieb am So., 22. Apr. 2018 um 04:49 Uhr:
> [[[ To any NSA and FBI agents reading my email: please consider    ]]]
> [[[ whether defending the US Constitution against all enemies,     ]]]
> [[[ foreign or domestic, requires you to follow Snowden's example. ]]]
>
>   > Of course we could to do that. Hopefully there isn't existing
>   > Emacs Lisp code that relies on unsafe arithmetic /anywhere/. If the
>   > functions + - * / operate on bignums (instead of dedicated bignum
>   > functions), would that mean we drop 32/64 bit integers entirely?
>
> To eliminate the current types for small integers would
> require rewriting of much of the C code in Emacs.
> It would be better to represent small integers as now,
> and have a different structure for larger integers.
>

I'd love to see Emacs get transparent bigint support. Python semantics are
fine, as is using a normal int representation at the C level. Adding
transparent bigints as Lisp types doesn't require us to increase various
Emacs core limits right away.



We need to be very careful with transparent bigint support though, as a naive design will break many invariants. For example, integers are currently documented to use modular arithmetic (https://www.gnu.org/software/emacs/manual/html_node/elisp/Integer-Type.htmlhttps://www.gnu.org/software/emacs/manual/html_node/elisp/Integer-Basics.html), and they are documented to be pure value types, i.e. same-valued integers are the same object (https://www.gnu.org/software/emacs/manual/html_node/elisp/Equality-Predicates.html). Especially the second property is widely used.
1234 ... 14