bug#26656: unidata-check needs updating for special casing

Glenn Morris-3
Package: emacs
Version: 26.0.50
Severity: minor

admin/unidata/unidata-gen.el includes a unidata-check function, to be
used like this:

;; (let ((unidata-dir "/path/to/admin/unidata"))
;;   (unidata-setup-list "unidata.txt")
;;   (unidata-check))

It works in emacs-25, but has numerous failures related to the special
casing rules in master. Presumably it needs updating for this somewhat
recent addition. Thanks.



Michal Nazarewicz-4
2017-04-25 13:12 GMT-04:00 Glenn Morris <[hidden email]>:

> admin/unidata/unidata-gen.el includes a unidata-check function, to be
> used like this:
>
> ;; (let ((unidata-dir "/path/to/admin/unidata"))
> ;;   (unidata-setup-list "unidata.txt")
> ;;   (unidata-check))
>
> It works in emacs-25, but has numerous failures related to the special
> casing rules in master. Presumably it needs updating for this somewhat
> recent addition. Thanks.

I’m currently travelling but will take a stab at it mid-May.



Michal Nazarewicz-4
In reply to this post by Glenn Morris-3
On Tue, Apr 25 2017, Glenn Morris wrote:

> Package: emacs
> Version: 26.0.50
> Severity: minor
>
> admin/unidata/unidata-gen.el includes a unidata-check function, to be
> used like this:
>
> ;; (let ((unidata-dir "/path/to/admin/unidata"))
> ;;   (unidata-setup-list "unidata.txt")
> ;;   (unidata-check))
>
> It works in emacs-25, but has numerous failures related to the special
> casing rules in master. Presumably it needs updating for this somewhat
> recent addition. Thanks.

I’m rather conflicted about the best way to fix this.  As currently
written, unidata-check makes sense only for properties taken from the
UnicodeData.txt file.

The function compares the values in the unidata.txt file with the values
produced by the various generator functions.  However, SpecialCasing.txt
is read directly by the unidata-gen-table-special-casing--do-load
function, so there is no other file that unidata-check could read and
compare the generated values against.
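
To illustrate the kind of comparison described above, here is a minimal
sketch.  It is not the code in unidata-gen.el: the helper name
my-general-category-matches-p is hypothetical, and it assumes a record in
the semicolon-separated UnicodeData.txt layout, where field 0 is the code
point in hex and field 2 is the general category.  It looks the same
character up through get-char-code-property and checks that both sources
agree.

```elisp
;; Hypothetical helper for illustration; not part of unidata-gen.el.
(defun my-general-category-matches-p (line)
  "Return non-nil if LINE's general category agrees with Emacs's table.
LINE is one semicolon-separated record in UnicodeData.txt layout."
  (let* ((fields (split-string line ";"))
         ;; Field 0: code point as a hexadecimal string.
         (char (string-to-number (nth 0 fields) 16))
         ;; Field 2: general category, e.g. "Lu"; compare as a symbol.
         (from-file (intern (nth 2 fields)))
         ;; Value stored in the generated char-table.
         (from-table (get-char-code-property char 'general-category)))
    (eq from-file from-table)))

;; Example call with the record for U+0041.
(my-general-category-matches-p
 "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;")
```

A comparison like this needs a data file to read the expected value from,
which is what the special casing properties lack on the unidata.txt side.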

Should I just skip the check for special casing properties?  With the
following:

diff --git a/admin/unidata/unidata-gen.el b/admin/unidata/unidata-gen.el
index 64e2babd4b9..f99004a4f7e 100644
--- a/admin/unidata/unidata-gen.el
+++ b/admin/unidata/unidata-gen.el
@@ -1353,7 +1353,8 @@ unidata-check
             (alist (and (functionp index)
                         (funcall index)))
             (check #x400))
-       (dolist (e unidata-list)
+       (dolist (e (unless (eq generator 'unidata-gen-table-special-casing)
+                     unidata-list))
          (let* ((char (car e))
                 (val1
                  (if alist (nth 1 (assoc char alist))

the function reports no errors.

--
Best regards
ミハウ “𝓶𝓲𝓷𝓪86” ナザレヴイツ
«If at first you don’t succeed, give up skydiving»



Glenn Morris-3
Michal Nazarewicz wrote:

> Should I just skip the check for special casing properties.

Thanks, that sounds fine. If the check is actually doing anything useful,
it would be good to convert it to an ert test under test/, and
add something similar for SpecialCasing, if that makes sense.



Michal Nazarewicz-4
> Michal Nazarewicz wrote:
>> Should I just skip the check for special casing properties.

On Mon, Jun 19 2017, Glenn Morris wrote:
> Thanks, that sounds fine.

All right then.  Unless there are some objections I’ll get the following
submitted in a few days:

---- >8 ----------------------------------------------------------------
From 467660531e2aa6f68ef431a1dcc52823a6967d42 Mon Sep 17 00:00:00 2001
From: Michal Nazarewicz <[hidden email]>
Date: Mon, 19 Jun 2017 21:34:25 +0200
Subject: [PATCH] unidata: don’t check special casing in unidata-check (bug#26656)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* admin/unidata/unidata-gen.el (unidata-check): Do not test special
casing mapping of characters since that mapping is not constructed from
the unidata.txt file.
Also, check for integer decoder and cons char earlier so that less
unnecessary processing is performed.
---
 admin/unidata/unidata-gen.el | 89 ++++++++++++++++++++++----------------------
 1 file changed, 45 insertions(+), 44 deletions(-)

diff --git a/admin/unidata/unidata-gen.el b/admin/unidata/unidata-gen.el
index 64e2babd4b9..42695f9bebb 100644
--- a/admin/unidata/unidata-gen.el
+++ b/admin/unidata/unidata-gen.el
@@ -1346,50 +1346,51 @@ unidata-check
              (generator (unidata-prop-generator proplist))
              (default-value (unidata-prop-default proplist))
              (val-list (unidata-prop-val-list proplist))
-             (table (progn
-                      (message "Generating %S table..." prop)
-                      (funcall generator prop index default-value val-list)))
-             (decoder (char-table-extra-slot table 1))
-             (alist (and (functionp index)
-                         (funcall index)))
-             (check #x400))
-        (dolist (e unidata-list)
-          (let* ((char (car e))
-                 (val1
-                  (if alist (nth 1 (assoc char alist))
-                    (nth index e)))
-                 val2)
-            (if (and (stringp val1) (= (length val1) 0))
-                (setq val1 nil))
-            (unless (or (consp char)
-                        (integerp decoder))
-              (setq val2
-                    (cond ((functionp decoder)
-                           (funcall decoder char (aref table char) table))
-                          (t ; must be nil
-                           (aref table char))))
-              (if val1
-                  (cond ((eq generator 'unidata-gen-table-symbol)
-                         (setq val1 (intern val1)))
-                        ((eq generator 'unidata-gen-table-integer)
-                         (setq val1 (string-to-number val1)))
-                        ((eq generator 'unidata-gen-table-character)
-                         (setq val1 (string-to-number val1 16)))
-                        ((eq generator 'unidata-gen-table-decomposition)
-                         (setq val1 (unidata-split-decomposition val1))))
-                (cond ((eq prop 'decomposition)
-                       (setq val1 (list char)))
-                      ((eq prop 'bracket-type)
-                       (setq val1 'n))))
-              (when (>= char check)
-                (message "%S %04X" prop check)
-                (setq check (+ check #x400)))
-              (or (equal val1 val2)
-                  ;; <control> characters get a 'name' property of nil
-                  (and (eq prop 'name) (string= val1 "<control>") (null val2))
-                  (insert (format "> %04X %S\n< %04X %S\n"
-                                  char val1 char val2)))
-              (sit-for 0))))))))
+             (check #x400)
+             table decoder alist)
+        (unless (eq generator 'unidata-gen-table-special-casing)
+          (setq table (progn
+                        (message "Generating %S table..." prop)
+                        (funcall generator prop index default-value val-list))
+                decoder (char-table-extra-slot table 1))
+          (unless (integerp decoder)
+            (setq alist (and (functionp index) (funcall index)))
+            (dolist (e unidata-list)
+              (let ((char (car e)) val1 val2)
+                (unless (consp char)
+                  (setq val1 (if alist
+                                 (nth 1 (assoc char alist))
+                               (nth index e)))
+                  (and (stringp val1)
+                       (= (length val1) 0)
+                       (setq val1 nil))
+                  (if val1
+                      (cond ((eq generator 'unidata-gen-table-symbol)
+                             (setq val1 (intern val1)))
+                            ((eq generator 'unidata-gen-table-integer)
+                             (setq val1 (string-to-number val1)))
+                            ((eq generator 'unidata-gen-table-character)
+                             (setq val1 (string-to-number val1 16)))
+                            ((eq generator 'unidata-gen-table-decomposition)
+                             (setq val1 (unidata-split-decomposition val1))))
+                    (cond ((eq prop 'decomposition)
+                           (setq val1 (list char)))
+                          ((eq prop 'bracket-type)
+                           (setq val1 'n))))
+                  (setq val2 (aref table char))
+                  (when decoder
+                    (setq val2 (funcall decoder char val2 table)))
+                  (when (>= char check)
+                    (message "%S %04X" prop check)
+                    (setq check (+ check #x400)))
+                  (or (equal val1 val2)
+                      ;; <control> characters get a 'name' property of nil
+                      (and (eq prop 'name)
+                           (string= val1 "<control>")
+                           (null val2))
+                      (insert (format "> %04X %S\n< %04X %S\n"
+                                      char val1 char val2)))
+                  (sit-for 0))))))))))
 
 ;; The entry functions.  They generate files described in the header
 ;; comment of this file.
---- >8 ----------------------------------------------------------------

Most of the changes are whitespace, so the difference is actually
smaller than it looks.  It adds the (eq generator
'unidata-gen-table-special-casing) condition and moves the (integerp
decoder) and (consp char) tests earlier.

> If the check is actually doing anything useful, it would be good to
> convert it to an ert test under test/, and add something similar for
> SpecialCasing, if that makes sense.

Honestly I hoped you would know what the intention of the check is.

The special casing properties have some sanity tests in
test/src/casefiddle-tests.el (namely the
casefiddle-tests-char-properties test case).

I feel like the test would be better written with just a few values for
every property rather than writing a complicated function testing all
the data.
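
As a rough sketch of what such a test might look like (the test name
unidata-sample-property-values is hypothetical and the handful of
characters is picked arbitrarily), something along these lines would
spot-check a few well-known values through get-char-code-property rather
than walking the whole data file:

```elisp
(require 'ert)

;; Hypothetical test name; the sampled characters are arbitrary.
(ert-deftest unidata-sample-property-values ()
  "Spot-check a few well-known Unicode property values."
  ;; general-category values are returned as symbols.
  (should (eq (get-char-code-property ?A 'general-category) 'Lu))
  (should (eq (get-char-code-property ?a 'general-category) 'Ll))
  ;; Simple case mappings are returned as characters.
  (should (eq (get-char-code-property ?A 'lowercase) ?a))
  (should (eq (get-char-code-property ?a 'uppercase) ?A))
  ;; Names are returned as strings.
  (should (equal (get-char-code-property ?A 'name)
                 "LATIN CAPITAL LETTER A"))
  ;; Numeric properties are returned as numbers.
  (should (eql (get-char-code-property ?5 'decimal-digit-value) 5)))
```

It can be run interactively with M-x ert, or in batch mode with
ert-run-tests-batch-and-exit.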

--
Best regards
ミハウ “𝓶𝓲𝓷𝓪86” ナザレヴイツ
«If at first you don’t succeed, give up skydiving»



Eli Zaretskii
> From: Michal Nazarewicz <[hidden email]>
> References: <[hidden email]> <[hidden email]>
> <[hidden email]>
> Date: Mon, 19 Jun 2017 22:30:54 +0200
> Cc: [hidden email]
>
> > Michal Nazarewicz wrote:
> >> Should I just skip the check for special casing properties.
>
> On Mon, Jun 19 2017, Glenn Morris wrote:
> > Thanks, that sounds fine.
>
> All right then.  Unless there are some objections I’ll get the following
> submitted in a few days:

Please add a comment there explaining why this is skipped.

Thanks.



Michal Nazarewicz-4
On Tue, Jun 20 2017, Eli Zaretskii wrote:

>> From: Michal Nazarewicz <[hidden email]>
>> References: <[hidden email]> <[hidden email]>
>> <[hidden email]>
>> Date: Mon, 19 Jun 2017 22:30:54 +0200
>> Cc: [hidden email]
>>
>> > Michal Nazarewicz wrote:
>> >> Should I just skip the check for special casing properties.
>>
>> On Mon, Jun 19 2017, Glenn Morris wrote:
>> > Thanks, that sounds fine.
>>
>> All right then.  Unless there are some objections I’ll get the following
>> submitted in a few days:
>
> Please add a comment there explaining why this is skipped.

Will something like:

+        ;; Special casing properties are not read from unidata.txt so
+        ;; there’s nothing for us to check here.

do?

--
Best regards
ミハウ “𝓶𝓲𝓷𝓪86” ナザレヴイツ
«If at first you don’t succeed, give up skydiving»



Eli Zaretskii
> From: Michal Nazarewicz <[hidden email]>
> Cc: [hidden email], [hidden email]
> Date: Wed, 21 Jun 2017 14:36:21 +0200
>
> > Please add a comment there explaining why this is skipped.
>
> Will something like:
>
> +        ;; Special casing properties are not read from unidata.txt so
> +        ;; there’s nothing for us to check here.
>
> do?

Yes, but I must say I liked your original explanation in

  http://lists.gnu.org/archive/html/bug-gnu-emacs/2017-06/msg00610.html

better ;-)



Michal Nazarewicz-4
On Wed, Jun 21 2017, Eli Zaretskii wrote:

>> From: Michal Nazarewicz <[hidden email]>
>> Cc: [hidden email], [hidden email]
>> Date: Wed, 21 Jun 2017 14:36:21 +0200
>>
>> > Please add a comment there explaining why this is skipped.
>>
>> Will something like:
>>
>> +        ;; Special casing properties are not read from unidata.txt so
>> +        ;; there’s nothing for us to check here.
>>
>> do?
>
> Yes, but I must say I liked your original explanation in
>
>   http://lists.gnu.org/archive/html/bug-gnu-emacs/2017-06/msg00610.html
>
> better ;-)

Pushed.

--
Best regards
ミハウ “𝓶𝓲𝓷𝓪86” ナザレヴイツ
«If at first you don’t succeed, give up skydiving»