Getting the correct line/column numbers on byte compilation error/warning messages


Alan Mackenzie
Hello, Emacs.

The line/column numbers currently output by the byte compiler are not
rigorous - they sort of work most of the time.  When they don't work, we
get bug #24449, or bug #2681, or bug #8774, or bug #9109, or bug #22288,
or bug #24128.  Clearly a solution to this would be popular.

The reason for all these bugs is the mechanism in the byte compiler
where the reader stores the location of each occurrence of each symbol
in a list.  Each time the byte compiler encounters a symbol, the
earlier occurrences of that symbol are discarded from the pertinent
list, leaving the location of the next element as the location for the
error/warning message.  This is a "gross hack" which sometimes causes
the wrong line/column numbers to be output in error/warning messages.
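
[Illustration: a minimal sketch of the kind of bookkeeping described above, not the byte compiler's actual code.  The names `demo-symbol-positions' and `demo-next-position', and the offsets, are made up for the example.]

```elisp
;; Hypothetical stand-in for the reader's list of (SYMBOL . OFFSET)
;; pairs, kept in source order.  Not the compiler's real variable.
(defvar demo-symbol-positions
  (list (cons 'defun 1) (cons 'foo 7) (cons 'x 12)
        (cons 'bar 20) (cons 'x 25) (cons 'baz 40) (cons 'x 45)))

(defun demo-next-position (sym)
  "Return the offset of the first remaining occurrence of SYM.
The entry is dropped, so the next call yields the following occurrence."
  (let ((entry (assq sym demo-symbol-positions)))
    (when entry
      (setq demo-symbol-positions (delq entry demo-symbol-positions))
      (cdr entry))))
```

The first call to (demo-next-position 'x) yields 12 and the second yields 25.  That is only right if the compiler meets the occurrences of `x' in source order.  If, say, a macro expansion makes it look at the `x' from offset 45 first, the message still points at offset 12.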

I've been working on a replacement, and although I don't yet have
anything to show, I'd like to describe what I have.

I've modified lread.c so that when an enabling flag is set, it records
the source offset of each cons or vector in a hash table.  The key to
the table is the cons cell/vector, and the value is the offset (more or
less).  After each read operation, the "top structure" is recorded in
global variables, `read-outer-form' and `read-outer-loc'.  (The latter
is non-nil when the former refers to the cdr of a dotted pair, or an
element of a vector.)

As the compilation proceeds recursively, `read-outer-form/loc' are bound
to successively more enclosed forms.  Should an error or warning occur,
the line and column numbers are taken from these two variables via the
hash table.

This is all well and good in theory.  The implementation is problematic, largely
because currently, the compilation optimises by ripping the original
form apart and sticking it back together again in new conses.  This
breaks the mechanism for accessing location information from the hash
table, whose keys are the original cons cells.
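
[Illustration: a small sketch of why rebuilding a form loses its location, assuming the table is an `eq' hash table keyed on the cons cell itself.  `demo-lookup-after-rebuild' and the offset 120 are invented for the example.]

```elisp
(defun demo-lookup-after-rebuild ()
  "Return (ORIGINAL-LOOKUP REBUILT-LOOKUP) for an `eq' table of positions."
  (let* ((positions (make-hash-table :test 'eq)) ; keyed on cons identity
         (form (list 'if 'a 'b))                 ; stands in for a form from the reader
         (rebuilt (cons (car form) (cdr form)))) ; same contents, fresh cons cell
    (puthash form 120 positions)                 ; 120 is a made-up source offset
    (list (gethash form positions)
          (gethash rebuilt positions))))
```

Evaluating (demo-lookup-after-rebuild) gives (120 nil): the rebuilt form is `equal' to the original but not `eq' to it, so the lookup finds nothing.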

I've worked out some primitive functions which do what standard
functions do, but store their results in specified cons cells.  For
example:

    (defun bo-cons (kar kdr dest)
      "Return a cons of KAR and KDR.  This cons will be DEST, with its
    car and cdr overwritten."
      (setcar dest kar)
      (setcdr dest kdr)
      (setq bo-fixed t)
      dest)

(The prefix "bo-" is at the moment purely for convenience, and the
variable `bo-fixed' is used to indicate change to a structure, because
the comparison (eq newform form) would now always return t.)
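
[Illustration: a usage sketch for `bo-cons' as defined in the post, repeated here so the example stands alone.  The `defvar', the helper `demo-rewrite-in-place', the position table and the offset 57 are assumptions made for the example.]

```elisp
(defvar bo-fixed nil
  "Non-nil once a form has been changed in place.")

(defun bo-cons (kar kdr dest)
  "Return a cons of KAR and KDR.  This cons will be DEST, with its
car and cdr overwritten."
  (setcar dest kar)
  (setcdr dest kdr)
  (setq bo-fixed t)
  dest)

(defun demo-rewrite-in-place ()
  "Rewrite (+ 1 2) to (quote 3) in place.
Return (SAME-CELL-P POSITION NEW-FORM CHANGED-P)."
  (let* ((positions (make-hash-table :test 'eq))
         (form (list '+ 1 2))
         (bo-fixed nil))                ; dynamic binding, reset for this run
    (puthash form 57 positions)         ; 57 is a made-up source offset
    (let ((new (bo-cons 'quote (list 3) form)))
      (list (eq new form) (gethash new positions) new bo-fixed))))
```

Evaluating (demo-rewrite-in-place) gives (t 57 (quote 3) t).  The rewritten form is the same cons cell, so the position lookup still succeeds, and since `eq' can no longer detect the change, `bo-fixed' has to.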

I've amended many of the functions in byte-opt.el to use only
cons-preserving primitives, though, as yet, I haven't tested them much.

What do people think?

--
Alan Mackenzie (Nuremberg, Germany).

Re: Getting the correct line/column numbers on byte compilation error/warning messages

Clément Pit-Claudel
On 2017-07-16 15:44, Alan Mackenzie wrote:
> What do people think?

I think fixing that problem would be very, very nice.  As to your solution: which primitives did you actually have to redefine? It sounds like in most cases the new primitives would be very thin wrappers around setf, wouldn't they?

Thanks for working on this,
Clément.

Re: Getting the correct line/column numbers on byte compilation error/warning messages

Alan Mackenzie
Hello, Clément.

On Mon, Jul 17, 2017 at 12:27:57 +0200, Clément Pit-Claudel wrote:
> On 2017-07-16 15:44, Alan Mackenzie wrote:
> > What do people think?

> I think fixing that problem would be very, very nice.  As to your
> solution: which primitives did you actually have to redefine? It
> sounds like in most cases the new primitives would be very thin
> wrappers around setf, wouldn't they?

Well, it's still a work in progress, so I don't yet know the full set of
new primitives I'll need.  But so far, I've got:

    (defun bo-cons (kar kdr dest) ...)
    (defun bo-dup-hash (src) ...)
      ; Create a new cons with the same hash entry as SRC.
    (defun bo-reuse-cons (form kons) ...)
      ; Put the result of FORM into KONS.
    (defun bo-mapcar (fn list) ...)
      ; Result list uses the cons structure in LIST.
    (defun bo-list-2 (elt0 elt1 dest0 dest1) ...)
      ; Use DEST0 and DEST1 to hold the 2-element list.

also bo-list-1, bo-list-3, etc.,

...amongst a whole lot of half-baked ideas.  It's slowly beginning to
come together, though.  Funnily enough, I haven't used setf a single
time yet.  I've only a vague idea what it does.  I'll need to look it
up.

> Thanks for working on this,

We'll see what we can manage.

> Clément.

--
Alan Mackenzie (Nuremberg, Germany).

Re: Getting the correct line/column numbers on byte compilation error/warning messages

Stefan Monnier
In reply to this post by Alan Mackenzie
>     (defun bo-cons (kar kdr dest)
>       "Return a cons of KAR and KDR.  This cons will be DEST, with its
>     car and cdr overwritten."
>       (setcar dest kar)
>       (setcdr dest kdr)
>       (setq bo-fixed t)
>       dest)

Macro expansion uses macroexp--cons for a similar purpose.


        Stefan "who'd actually prefer a «real» solution rather than
                another hack, but beggars can't be choosers"


Re: Getting the correct line/column numbers on byte compilation error/warning messages

Alan Mackenzie
Hello, Stefan.

On Mon, Jul 17, 2017 at 15:42:00 -0400, Stefan Monnier wrote:

>         Stefan "who'd actually prefer a «real» solution rather than
>                 another hack, but beggars can't be choosers"

I've been thinking about this for nearly a year, and I can't conceive of
a "real" solution which doesn't involve redesigning the byte compiler
from scratch (or even one which does).  Lisp source code doesn't seem to
lend itself to keeping track of source positions as the forms are
manipulated, taken apart, and put together again.

Can you conceive of some better scheme by which accurate line/column
numbers can be output by error messages, and which doesn't involve
rewriting the byte compiler?

What I do anticipate about my hack is, assuming I ever get it working,
it will produce the correct source code positions all the time, not just
most of the time.

--
Alan Mackenzie (Nuremberg, Germany).

Re: Getting the correct line/column numbers on byte compilation error/warning messages

Stefan Monnier
>> Stefan "who'd actually prefer a «real» solution rather than
>>         another hack, but beggars can't be choosers"
> I've been thinking about this for nearly a year, and I can't conceive of
> a "real" solution which doesn't involve redesigning the byte compiler
> from scratch (or even one which does).

AFAIK redesign is not needed (in the sense that the code structure
doesn't need to be changed), but a lot of code would need to be
touched, yes.

> Lisp source code doesn't seem to lend itself to keeping track of
> source positions as the forms are manipulated, taken apart, and put
> together again.

All compilers for all languages do that "take-apart-and-put-together"
over and over again.  The only difference is the Lisp macros which are
defined to operate on a form of the code which doesn't include
position info.

> Can you conceive of some better scheme by which accurate line/column
> numbers can be output by error messages, and which doesn't involve
> rewriting the byte compiler?

The «real» solution, AFAIC, is to use a representation of the code where
position-info is added to each node:
- provide a new `read` function which provides something similar to
  `edebug-read` (and then change Edebug to use that function).
- change macroexp to accept and return such a representation.
- change byte-opt to accept and return such a representation.
- change cconv to accept and return such a representation.
- change bytecomp.el to accept such a representation.

The changes to macroexp, byte-opt, cconv, and bytecomp aren't
complicated and will be fairly repetitive, but they'll touch most of
the code.
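
[Illustration: one possible shape for such a representation, as a rough sketch only.  The struct `demo-node', its constructor `demo-node-create' and the pass `demo-fold' are hypothetical and appear nowhere in Emacs; a node holds either an atom or a list of child nodes, plus the position it was read from.]

```elisp
(require 'cl-lib)

;; Hypothetical node type: a form plus the source position it came from.
(cl-defstruct (demo-node (:constructor demo-node-create (form pos)))
  form pos)

(defun demo-fold (node)
  "Fold (+ INT INT) inside NODE, keeping the position of the folded form."
  (let ((form (demo-node-form node))
        (pos (demo-node-pos node)))
    (if (not (consp form))
        node                            ; atoms pass through unchanged
      (let* ((kids (mapcar #'demo-fold form))
             (vals (mapcar #'demo-node-form kids)))
        (if (and (eq (car vals) '+)
                 (= (length vals) 3)
                 (integerp (nth 1 vals))
                 (integerp (nth 2 vals)))
            ;; The result is a new node, but it inherits POS explicitly.
            (demo-node-create (+ (nth 1 vals) (nth 2 vals)) pos)
          (demo-node-create kids pos))))))
```

Here the pass builds fresh structure freely, and the position survives because it is copied from node to node instead of depending on cons identity.  Every pass has to be written against the node type, which is the repetitive part.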

> What I do anticipate about my hack is, assuming I ever get it working,
> it will produce the correct source code positions all the time, not just
> most of the time.

It might work well in practice, yes.  Tho using setc[ad]r like you
suggest is risky since macros can return code which is a DAG rather than
a tree (it's fairly unusual but not unheard of), so if you setc[ad]r on
one side it might lead to nasty surprises on the other.
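
[Illustration: a minimal sketch of the hazard, assuming a macro that returns the same cons in two places.  `demo-shared-structure' is made up for the example.]

```elisp
(defun demo-shared-structure ()
  "Mutate one branch of a form with shared structure; return the other branch."
  (let* ((shared (list 'message "hi"))
         ;; A macro might expand to `(if c ,body ,body), which puts the
         ;; very same cons in both branches.
         (form (list 'if 'c shared shared)))
    (setcar (nth 2 form) 'ignore)       ; rewrite only the THEN branch in place
    (nth 3 form)))                      ; the ELSE branch has changed as well
```

Evaluating (demo-shared-structure) gives (ignore "hi"), although only the first branch was touched.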


        Stefan
