bug#43597: 28.0.50; arc-mode.el fails to display a ZIP file

Stephen Berman
I have a ZIP file that Emacs fails to visit in archive-mode: it signals
a "File mode specification error", specifically args-out-of-range.  The
reason is that the function archive-l-e in arc-mode.el, which is
supposed to return a buffer position in this case, returns a number
which exceeds the buffer size.  And the reason for this is that
archive-l-e builds its return value by iteratively calling (+ (ash
result 8) (aref str (- len i))), where `str' is a string extracted from
the end of the ZIP file, which in this case is "\377\377\377\377", so
that the return value increases from 255 (the decimal value of octal
377) to 4294967295 after four iterations, which far exceeds the file
(and hence buffer) size.

Perhaps "\377\377\377\377" is an invalid string at the end of a ZIP file
(I checked other ZIP files I have, which I can visit in archive-mode
with no problem, and they have different strings at the end, consisting
of bytes with smaller decimal values, so that the return value of
archive-l-e does not exceed the file size).  Yet when I call `unzip -l'
on the file in the shell, the contents are displayed, and I also had no
problem unpacking the file with unzip (and AFAICT the content is
undamaged).  So at least unzip can deal with this file.

In case it helps, here is the final byte sequence in the file (I've
replaced the control characters and raw bytes by ASCII representations):

PK^E^F\377\377\377\377%^@%^@\377\377\377\377\377\377\377\377^@^@
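
For readers following along, the trailer above can be decoded as a
22-byte end-of-central-directory record.  This is a minimal Python
sketch, not code from arc-mode.el.  It assumes the ASCII notation maps
to bytes as ^E=0x05, ^F=0x06, ^@=0x00, '%'=0x25 and \377=0xFF, and the
field names are illustrative labels for the layout in APPNOTE.TXT.

```python
import struct

# The trailing bytes quoted above, under the assumed byte mapping.
eocd = (b"PK\x05\x06" + b"\xff\xff\xff\xff" + b"%\x00%\x00"
        + b"\xff" * 8 + b"\x00\x00")

# Little-endian layout: signature, two disk numbers, two entry counts,
# central directory size, central directory offset, comment length.
(signature, disk_no, cd_start_disk, entries_on_disk, entries_total,
 cd_size, cd_offset, comment_len) = struct.unpack("<4sHHHHIIH", eocd)

print(entries_total)   # 37
print(hex(cd_size))    # 0xffffffff
print(hex(cd_offset))  # 0xffffffff
```

The 4-byte field that starts 16 bytes after the signature is the
central directory offset, and here every bit of it is set.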


In GNU Emacs 28.0.50 (build 25, x86_64-pc-linux-gnu, GTK+ Version 3.24.17, cairo version 1.17.3)
 of 2020-09-24 built on strobe-jhalfs
Repository revision: 89dd8cd215148da4c6dffc15dc6c35df5122247b
Repository branch: master
Windowing system distributor 'The X.Org Foundation', version 11.0.12008000
System Description: Linux From Scratch SVN-20200401

Configured using:
 'configure 'CFLAGS=-Og -g3' PKG_CONFIG_PATH=/opt/qt5/lib/pkgconfig'

Configured features:
XPM JPEG TIFF GIF PNG RSVG CAIRO SOUND DBUS GSETTINGS GLIB NOTIFY
INOTIFY ACL GNUTLS LIBXML2 FREETYPE HARFBUZZ ZLIB TOOLKIT_SCROLL_BARS
GTK3 X11 XDBE XIM MODULES THREADS LIBSYSTEMD PDUMPER LCMS2

Important settings:
  value of $LANG: en_US.UTF-8
  locale-coding-system: utf-8-unix



Stephen Berman
On Thu, 24 Sep 2020 22:25:23 +0200 Andreas Schwab <[hidden email]> wrote:

> On Sep 24 2020, Stephen Berman wrote:
>
>> In case it helps, here is the final byte sequence in the file (I've
>> replaced the control characters and raw bytes by ASCII representations):
>>
>> PK^E^F\377\377\377\377%^@%^@\377\377\377\377\377\377\377\377^@^@
>
> That means the archive is in ZIP64 format.  See
> <https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT>.  The real
> values are contained in the zip64 end of central directory record, which
> has "PK\6\6" as the signature.

Thanks.  But then it seems that arc-mode.el doesn't handle this format
correctly: archive-zip-summarize has this code:

  (let ((p (archive-l-e (+ (point) 16) 4))
        files)
    (when (= p -1)
      ;; If the offset of end-of-central-directory is -1, this is a
      ;; Zip64 extended ZIP file format, and we need to glean the info
      ;; from Zip64 records instead.

But the first value returned by archive-l-e for this file is 255, not
-1, and on each iteration the value increases, so the ZIP64 code in the
body of the when-clause is never executed.
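
As a rough illustration of the Zip64 lookup described in the quoted
reply, here is a Python sketch.  It is not the arc-mode.el code.  The
function name `zip64_central_directory` is hypothetical, the record
layouts follow APPNOTE.TXT, and the sketch assumes a single-disk archive
with no data prepended to it, so that stored offsets are plain file
offsets.

```python
import struct

def zip64_central_directory(data):
    """Return (entry_count, cd_size, cd_offset) from the Zip64 records."""
    # Zip64 EOCD locator: signature PK\6\7, disk number (4 bytes),
    # offset of the Zip64 EOCD record (8 bytes), total disks (4 bytes).
    loc = data.rfind(b"PK\x06\x07")
    if loc < 0:
        raise ValueError("no Zip64 end-of-central-directory locator")
    _sig, _disk, eocd64_offset, _disks = struct.unpack_from("<4sIQI", data, loc)
    # Zip64 EOCD record: signature PK\6\6, record size (8), version made
    # by (2), version needed (2), two disk numbers (4 each), entries on
    # this disk (8), total entries (8), directory size (8), offset (8).
    fields = struct.unpack_from("<4sQHHIIQQQQ", data, eocd64_offset)
    if fields[0] != b"PK\x06\x06":
        raise ValueError("bad Zip64 end-of-central-directory signature")
    return fields[7], fields[8], fields[9]

# Synthetic data for demonstration only: 400 filler bytes, then a Zip64
# EOCD record, then the locator pointing back at that record.
prefix = b"\x00" * 400
record = struct.pack("<4sQHHIIQQQQ", b"PK\x06\x06", 44, 45, 45, 0, 0, 2, 2, 100, 300)
locator = struct.pack("<4sIQI", b"PK\x06\x07", 0, len(prefix), 1)
print(zip64_central_directory(prefix + record + locator))  # (2, 100, 300)
```

Searching backward for the locator signature can in principle match
unrelated bytes, so a more careful reader would validate the result.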

Steve Berman



Stephen Berman
On Thu, 24 Sep 2020 22:40:15 +0200 Stephen Berman <[hidden email]> wrote:

> On Thu, 24 Sep 2020 22:25:23 +0200 Andreas Schwab <[hidden email]> wrote:
>
>> On Sep 24 2020, Stephen Berman wrote:
>>
>>> In case it helps, here is the final byte sequence in the file (I've
>>> replaced the control characters and raw bytes by ASCII representations):
>>>
>>> PK^E^F\377\377\377\377%^@%^@\377\377\377\377\377\377\377\377^@^@
>>
>> That means the archive is in ZIP64 format.  See
>> <https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT>.  The real
>> values are contained in the zip64 end of central directory record, which
>> has "PK\6\6" as the signature.
>
> Thanks.  But then it seems that arc-mode.el doesn't handle this format
> correctly: archive-zip-summarize has this code:
>
>   (let ((p (archive-l-e (+ (point) 16) 4))
>         files)
>     (when (= p -1)
>       ;; If the offset of end-of-central-directory is -1, this is a
>       ;; Zip64 extended ZIP file format, and we need to glean the info
>       ;; from Zip64 records instead.
>
> But the first value returned by archive-l-e for this file is 255, not
> -1, and on each iteration the value increases, so the ZIP64 code in the
> body of the when-clause is never executed.

I've confirmed that changing the when-clause condition to (= p
4294967295) (that's octal 37777777777, hex ffffffff) makes the body get
executed and the contents of the problematic ZIP file are correctly
displayed in archive-mode.  Is that the correct fix?

Steve Berman



Stephen Berman
On Thu, 24 Sep 2020 23:58:51 +0200 Stephen Berman <[hidden email]> wrote:

> On Thu, 24 Sep 2020 22:40:15 +0200 Stephen Berman <[hidden email]> wrote:
>
>> On Thu, 24 Sep 2020 22:25:23 +0200 Andreas Schwab <[hidden email]> wrote:
>>
>>> On Sep 24 2020, Stephen Berman wrote:
>>>
>>>> In case it helps, here is the final byte sequence in the file (I've
>>>> replaced the control characters and raw bytes by ASCII representations):
>>>>
>>>> PK^E^F\377\377\377\377%^@%^@\377\377\377\377\377\377\377\377^@^@
>>>
>>> That means the archive is in ZIP64 format.  See
>>> <https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT>.  The real
>>> values are contained in the zip64 end of central directory record, which
>>> has "PK\6\6" as the signature.
>>
>> Thanks.  But then it seems that arc-mode.el doesn't handle this format
>> correctly: archive-zip-summarize has this code:
>>
>>   (let ((p (archive-l-e (+ (point) 16) 4))
>>         files)
>>     (when (= p -1)
>>       ;; If the offset of end-of-central-directory is -1, this is a
>>       ;; Zip64 extended ZIP file format, and we need to glean the info
>>       ;; from Zip64 records instead.
>>
>> But the first value returned by archive-l-e for this file is 255, not
>> -1, and on each iteration the value increases, so the ZIP64 code in the
>> body of the when-clause is never executed.
>
> I've confirmed that changing the when-clause condition to (= p
> 4294967295) (that's octal 37777777777, hex ffffffff) makes the body get
> executed and the contents of the problematic ZIP file are correctly
> displayed in archive-mode.  Is that the correct fix?

No, it's not, at least not without additional changes.  I looked at the
display in archive-mode too quickly before; looking again, I see that
each file in the archive is shown as having size 4294967295 and the
total size shown is correspondingly wrong.

Steve Berman



Stephen Berman
In reply to this post by Stephen Berman
On Fri, 25 Sep 2020 09:01:02 +0300 Eli Zaretskii <[hidden email]> wrote:

>> From: Stephen Berman <[hidden email]>
>> Date: Thu, 24 Sep 2020 22:40:15 +0200
>> Cc: [hidden email]
>>
>>   (let ((p (archive-l-e (+ (point) 16) 4))
>>         files)
>>     (when (= p -1)
>>       ;; If the offset of end-of-central-directory is -1, this is a
>>       ;; Zip64 extended ZIP file format, and we need to glean the info
>>       ;; from Zip64 records instead.
>>
>> But the first value returned by archive-l-e for this file is 255, not
>> -1, and on each iteration the value increases, so the ZIP64 code in the
>> body of the when-clause is never executed.
>
> This code worked when support for ZIP64 was added.
>
> Could it be that the problem is in archive-l-e?  Something related to
> bignums, perhaps?

According to NEWS bignum support was added in Emacs 27, so I just made a
fresh build from the Emacs 26 branch and get the same failure.

>                    Why does it return 255 instead of -1 in this case?

Here's what happens (in Emacs 26 as well as 27 and master, though the
code changed slightly due, I assume, to the introduction of bignums):
archive-l-e loops through the string "\377\377\377\377", each time
adding the decimal value of the byte to the value after shifting left by
8 bits.  Since octal 377 is decimal 255, the result of each iteration is
the following: 255, 65535, 16777215, 4294967295.  The last value is what
is returned to archive-zip-summarize.  Is it supposed to return -1 due
to overflowing or something?  In any case, the same thing happens in
Emacs 26, 27 and master, so it seems whatever changed the return value
happened earlier.
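
The arithmetic above can be reproduced outside Emacs.  The following
Python sketch mirrors the described loop (shift left by 8 bits, add the
next byte, working from the last byte to the first).  It is an
illustration only: `little_endian_steps` is a hypothetical name, not a
function in arc-mode.el.

```python
import struct

def little_endian_steps(data):
    """Return the running value after each byte, last byte first."""
    result = 0
    steps = []
    for byte in reversed(data):
        result = (result << 8) + byte
        steps.append(result)
    return steps

raw = b"\xff\xff\xff\xff"
print(little_endian_steps(raw))     # [255, 65535, 16777215, 4294967295]
print(struct.unpack("<I", raw)[0])  # 4294967295 (unsigned 32-bit)
print(struct.unpack("<i", raw)[0])  # -1 (signed 32-bit)
```

The same four bytes are -1 only when read as a signed 32-bit integer.
With unbounded or wider integers the accumulated value is 4294967295, so
a comparison against -1 cannot succeed.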

Steve Berman



Eli Zaretskii
> From: Stephen Berman <[hidden email]>
> Cc: [hidden email],  [hidden email]
> Date: Fri, 25 Sep 2020 09:13:42 +0200
>
> On Fri, 25 Sep 2020 09:01:02 +0300 Eli Zaretskii <[hidden email]> wrote:
>
> >> From: Stephen Berman <[hidden email]>
> >> Date: Thu, 24 Sep 2020 22:40:15 +0200
> >> Cc: [hidden email]
> >>
> >>   (let ((p (archive-l-e (+ (point) 16) 4))
> >>         files)
> >>     (when (= p -1)
> >>       ;; If the offset of end-of-central-directory is -1, this is a
> >>       ;; Zip64 extended ZIP file format, and we need to glean the info
> >>       ;; from Zip64 records instead.
> >>
> > This code worked when support for ZIP64 was added.
> >
> > Could it be that the problem is in archive-l-e?  Something related to
> > bignums, perhaps?
>
> According to NEWS bignum support was added in Emacs 27, so I just made a
> fresh build from the Emacs 26 branch and get the same failure.

Where can I find a zip file which exhibits this problem?



Eli Zaretskii
> Date: Fri, 25 Sep 2020 12:37:31 +0300
> From: Eli Zaretskii <[hidden email]>
> Cc: [hidden email]
>
> Where can I find a zip file which exhibits this problem?

Never mind, I found such a file on my system, the same one I used when
I added the zip64 support.  The code indeed bit-rotted since then, I'm
looking into this.



Stephen Berman
On Fri, 25 Sep 2020 12:53:25 +0300 Eli Zaretskii <[hidden email]> wrote:

>> Date: Fri, 25 Sep 2020 12:37:31 +0300
>> From: Eli Zaretskii <[hidden email]>
>> Cc: [hidden email]
>>
>> Where can I find a zip file which exhibits this problem?
>
> Never mind, I found such a file on my system, the same one I used when
> I added the zip64 support.  The code indeed bit-rotted since then, I'm
> looking into this.

Thanks.

Steve Berman



Stephen Berman
In reply to this post by Eli Zaretskii
On Fri, 25 Sep 2020 14:00:33 +0300 Eli Zaretskii <[hidden email]> wrote:

>> Date: Fri, 25 Sep 2020 12:53:25 +0300
>> From: Eli Zaretskii <[hidden email]>
>> Cc: [hidden email]
>>
>> > Date: Fri, 25 Sep 2020 12:37:31 +0300
>> > From: Eli Zaretskii <[hidden email]>
>> > Cc: [hidden email]
>> >
>> > Where can I find a zip file which exhibits this problem?
>>
>> Never mind, I found such a file on my system, the same one I used when
>> I added the zip64 support.  The code indeed bit-rotted since then, I'm
>> looking into this.
>
> With the file I have, replacing -1 by #xffffffff does the job.  So I
> guess I will need the file you used after all.  It could be either the
> file's contents, or maybe there are issues related to whether Emacs is
> a 32-bit (my case) or a 64-bit (your case) program.

I can't readily make that specific file available to you (I would have
to get permission to share it), but fortunately the server I downloaded
it from automatically creates such ZIP files, so I just uploaded two
small test files to the server and downloaded them as a ZIP file, and it
shows exactly the same behavior (trying to visit it in archive-mode
fails with -1 but not with #xffffffff, but with the latter the size of
each of the files in the archive is given as #xffffffff), so it should
serve for debugging.  I've attached that ZIP file.

Steve Berman


Attachment: test.zip (886 bytes)

Eli Zaretskii
> From: Stephen Berman <[hidden email]>
> Cc: [hidden email]
> Date: Fri, 25 Sep 2020 14:05:08 +0200
>
> > With the file I have, replacing -1 by #xffffffff does the job.  So I
> > guess I will need the file you used after all.  It could be either the
> > file's contents, or maybe there are issues related to whether Emacs is
> > a 32-bit (my case) or a 64-bit (your case) program.
>
> I can't readily make that specific file available to you (I would have
> to get permission to share it), but fortunately the server I downloaded
> it from automatically creates such ZIP files, so I just uploaded two
> small test files to the server and downloaded them as a ZIP file, and it
> shows exactly the same behavior (trying to visit it in archive-mode
> fails with -1 but not with #xffffffff, but with the latter the size of
> each of the files in the archive is given as #xffffffff), so it should
> serve for debugging.  I've attached that ZIP file.

Thanks, with this file I do see the problem.



Eli Zaretskii
> Date: Fri, 25 Sep 2020 15:38:56 +0300
> From: Eli Zaretskii <[hidden email]>
> Cc: [hidden email]
>
> Thanks, with this file I do see the problem.

Oops, I spoke too soon, using the wrong version of Emacs to test this.

The change -1 => #xffffffff fixes the problem with this file as well.
So I guess the remaining issues rear their ugly heads in a 64-bit
Emacs, and I need to debug there...



Stephen Berman
On Fri, 25 Sep 2020 15:43:48 +0300 Eli Zaretskii <[hidden email]> wrote:

>> Date: Fri, 25 Sep 2020 15:38:56 +0300
>> From: Eli Zaretskii <[hidden email]>
>> Cc: [hidden email]
>>
>> Thanks, with this file I do see the problem.
>
> Oops, I spoke too soon, using the wrong version of Emacs to test this.
>
> The change -1 => #xffffffff fixes the problem with this file as well.
> So I guess the remaining issues rear their ugly heads in a 64-bit
> Emacs, and I need to debug there...

I'd be happy to try debugging here, if you can give me specific
instructions.

Steve Berman



Eli Zaretskii
In reply to this post by Eli Zaretskii
> Date: Fri, 25 Sep 2020 15:43:48 +0300
> From: Eli Zaretskii <[hidden email]>
> Cc: [hidden email]
>
> > Date: Fri, 25 Sep 2020 15:38:56 +0300
> > From: Eli Zaretskii <[hidden email]>
> > Cc: [hidden email]
> >
> > Thanks, with this file I do see the problem.
>
> Oops, I spoke too soon, using the wrong version of Emacs to test this.
>
> The change -1 => #xffffffff fixes the problem with this file as well.
> So I guess the remaining issues rear their ugly heads in a 64-bit
> Emacs, and I need to debug there...

Wrong again, sigh...  This has nothing to do with 64-bit builds.

Anyway, here's the proposed patch, please test:

diff --git a/lisp/arc-mode.el b/lisp/arc-mode.el
index d6e85bf..c09f78e 100644
--- a/lisp/arc-mode.el
+++ b/lisp/arc-mode.el
@@ -1799,10 +1799,10 @@ archive-zip-summarize
         files
         visual
         emacs-int-has-32bits)
-    (when (= p -1)
-      ;; If the offset of end-of-central-directory is -1, this is a
-      ;; Zip64 extended ZIP file format, and we need to glean the info
-      ;; from Zip64 records instead.
+    (when (or (= p #xffffffff) (= p -1))
+      ;; If the offset of end-of-central-directory is 0xFFFFFFFF, this
+      ;; is a Zip64 extended ZIP file format, and we need to glean the
+      ;; info from Zip64 records instead.
       ;;
       ;; First, find the Zip64 end-of-central-directory locator.
       (search-backward "PK\006\007")
@@ -1828,6 +1828,15 @@ archive-zip-summarize
              (efnname (let ((str (buffer-substring (+ p 46) (+ p 46 fnlen))))
                         (decode-coding-string
                          str archive-file-name-coding-system)))
+             (ucsize  (if (and (or (= ucsize #xffffffff) (= ucsize -1))
+                               (> exlen 0))
+                          ;; APPNOTE.TXT, para 4.5.3: the Extra Field
+                          ;; begins with 2 bytes of signature
+                          ;; (\000\001), followed by 2 bytes that give
+                          ;; the size of the extra block, followed by
+                          ;; an 8-byte uncompressed size.
+                          (archive-l-e (+ p 46 fnlen 4) 8)
+                        ucsize))
              (isdir   (and (= ucsize 0)
                            (string= (file-name-nondirectory efnname) "")))
              (mode    (cond ((memq creator '(2 3)) ; Unix
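
For comparison, the size lookup in the second hunk can be sketched in
Python.  This is not a translation of the patch.  The patch reads the
8-byte size at a fixed position, 4 bytes into the extra field, whereas
this sketch walks the extra field block by block looking for header ID
0x0001.  That scanning loop and the name `zip64_uncompressed_size` are
assumptions made for illustration.

```python
import struct

ZIP64_EXTRA_ID = 0x0001  # header ID of the Zip64 extended information block

def zip64_uncompressed_size(extra, ucsize):
    """Return the real uncompressed size of a central directory entry."""
    if ucsize != 0xFFFFFFFF:
        return ucsize  # the 32-bit field already holds the size
    pos = 0
    while pos + 4 <= len(extra):
        header_id, block_len = struct.unpack_from("<HH", extra, pos)
        if (header_id == ZIP64_EXTRA_ID and block_len >= 8
                and pos + 12 <= len(extra)):
            # When the 32-bit size is the sentinel, the 8-byte
            # uncompressed size is the first value in the block.
            return struct.unpack_from("<Q", extra, pos + 4)[0]
        pos += 4 + block_len
    return ucsize  # no usable Zip64 block; leave the value unchanged

# Synthetic extra field: uncompressed size 1234, compressed size 567.
extra = struct.pack("<HHQQ", ZIP64_EXTRA_ID, 16, 1234, 567)
print(zip64_uncompressed_size(extra, 0xFFFFFFFF))  # 1234
print(zip64_uncompressed_size(extra, 42))          # 42
```

Scanning costs a few more lines but does not depend on the Zip64 block
being the first one in the extra field.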



Stephen Berman
On Fri, 25 Sep 2020 16:33:21 +0300 Eli Zaretskii <[hidden email]> wrote:

>> Date: Fri, 25 Sep 2020 15:43:48 +0300
>> From: Eli Zaretskii <[hidden email]>
>> Cc: [hidden email]
>>
>> > Date: Fri, 25 Sep 2020 15:38:56 +0300
>> > From: Eli Zaretskii <[hidden email]>
>> > Cc: [hidden email]
>> >
>> > Thanks, with this file I do see the problem.
>>
>> Oops, I spoke too soon, using the wrong version of Emacs to test this.
>>
>> The change -1 => #xffffffff fixes the problem with this file as well.
>> So I guess the remaining issues rear their ugly heads in a 64-bit
>> Emacs, and I need to debug there...
>
> Wrong again, sigh...  This has nothing to do with 64-bit builds.
>
> Anyway, here's the proposed patch, please test:

It works for me (at first I applied it to master and the patch partly
failed to apply, then I looked at the line numbers and realized it was
for Emacs 27).  Thanks.

Steve Berman



Eli Zaretskii
> From: Stephen Berman <[hidden email]>
> Cc: [hidden email]
> Date: Fri, 25 Sep 2020 15:58:36 +0200
>
> > Anyway, here's the proposed patch, please test:
>
> It works for me (at first I applied it to master and the patch partly
> failed to apply, then I looked at the line numbers and realized it was
> for Emacs 27).  Thanks.

Thanks, installed on the emacs-27 branch, and closing the bug.