bug#35721: 27.0.50; Strange Arabic shaping behavior

Basil L. Contovounesios
I see the following on the master, harfbuzz, and emacs-26 branches
(precise versions follow my signature), but I'm not sure how much of
this is expected or due to e.g. my font.

0. emacs -Q
1. C-x 8 RET 0634 RET

The "tail" of the sheen is truncated by the fringe:



2. C-a C-u C-x =

--8<---------------cut here---------------start------------->8---
             position: 146 of 146 (99%), column: 0
            character: ش‎ (displayed as ش‎) (codepoint 1588, #o3064, #x634)
              charset: unicode (Unicode (ISO10646))
code point in charset: 0x0634
               script: arabic
               syntax: w which means: word
             category: .:Base, R:Right-to-left (strong), b:Arabic
             to input: type "C-x 8 RET 634" or "C-x 8 RET ARABIC LETTER SHEEN"
          buffer code: #xD8 #xB4
            file code: #xD8 #xB4 (encoded by coding system utf-8-unix)
              display: by this font (glyph code)
    xft:-PfEd-DejaVu Sans Mono-normal-normal-normal-*-19-*-*-*-m-0-iso10646-1 (#x46A)

Character code properties: customize what to show
  name: ARABIC LETTER SHEEN
  general-category: Lo (Letter, Other)
  decomposition: (1588) ('ش')

There are text properties here:
  fontified            nil
--8<---------------cut here---------------end--------------->8---

3. SPC

The "tail" of the sheen becomes visible, but falls outside of the box
cursor:



4. C-x 8 RET 0643 RET

The kaf is correctly shaped in its initial form:



5. C-SPC

The kaf changes to its isolated form:



6. C-g C-a C-k
7. C-u C-\ arabic RET
8. a ; RET

The sheen is correctly shaped in its initial form and the kaf is
truncated by the fringe:



9. a ; RET

The first sheen unexpectedly changes to its isolated form:



I occasionally see this happen even without typing anything, as if by a
timer, but I'm not sure how to reproduce it.  I think, without being
100% certain, that it's only happened while using the 'arabic' input
method.

10. a

The first sheen reverts to its initial form:



Any insights?  Thanks,

--
Basil

In GNU Emacs 27.0.50 (build 40, x86_64-pc-linux-gnu, X toolkit, Xaw3d scroll bars)
 of 2019-05-13 built on thunk
Repository revision: a1e5cce99b75c1bd50995b7b4d81423b1296fa60
Repository branch: master
Windowing system distributor 'The X.Org Foundation', version 11.0.12003000
System Description: Debian GNU/Linux buster/sid

Configured using:
 'configure 'CC=ccache gcc' 'CFLAGS=-O2 -march=native' --config-cache
 --prefix=/home/blc/.local --with-mailutils --with-x-toolkit=lucid
 --with-modules --with-file-notification=yes --with-x'

Configured features:
XAW3D XPM JPEG TIFF GIF PNG RSVG IMAGEMAGICK SOUND GPM DBUS GSETTINGS
GLIB NOTIFY INOTIFY ACL LIBSELINUX GNUTLS LIBXML2 FREETYPE M17N_FLT
LIBOTF XFT ZLIB TOOLKIT_SCROLL_BARS LUCID X11 XDBE XIM MODULES THREADS
LIBSYSTEMD JSON PDUMPER LCMS2 GMP

Important settings:
  value of $LANG: en_IE.UTF-8
  locale-coding-system: utf-8-unix


In GNU Emacs 27.0.50 (build 2, x86_64-pc-linux-gnu, X toolkit, Xaw3d scroll bars)
 of 2019-05-13 built on thunk
Repository revision: 5d7dafacf4afc888511649f6fc24c28210cd0dfc
Repository branch: harfbuzz
Windowing system distributor 'The X.Org Foundation', version 11.0.12003000
System Description: Debian GNU/Linux buster/sid

Configured using:
 'configure 'CC=ccache gcc' 'CFLAGS=-O0 -g3 -ggdb -gdwarf-4'
 --config-cache --prefix=/home/blc/.local --program-suffix=-harfbuzz
 --enable-checking=yes,glyphs --enable-check-lisp-object-type
 --with-mailutils --with-x-toolkit=lucid --with-modules
 --with-file-notification=yes --with-x'

Configured features:
XAW3D XPM JPEG TIFF GIF PNG RSVG IMAGEMAGICK SOUND GPM DBUS GSETTINGS
GLIB NOTIFY INOTIFY ACL LIBSELINUX GNUTLS LIBXML2 FREETYPE HARFBUZZ
M17N_FLT LIBOTF XFT ZLIB TOOLKIT_SCROLL_BARS LUCID X11 XDBE XIM MODULES
THREADS LIBSYSTEMD JSON PDUMPER LCMS2 GMP


In GNU Emacs 26.2.50 (build 2, x86_64-pc-linux-gnu, X toolkit, Xaw3d scroll bars)
 of 2019-04-30 built on thunk
Repository revision: c26d452ae15a74f0eeec53ba529eebaa95eb5489
Windowing system distributor 'The X.Org Foundation', version 11.0.12003000
System Description: Debian GNU/Linux buster/sid

Configured using:
 'configure 'CC=ccache gcc' 'CFLAGS=-O0 -g3 -ggdb -gdwarf-4'
 --config-cache --prefix=/home/blc/.local --program-suffix=26
 --enable-checking=yes,glyphs --enable-check-lisp-object-type
 --with-mailutils --with-x-toolkit=lucid --with-modules
 --with-file-notification=yes --with-x'

Configured features:
XAW3D XPM JPEG TIFF GIF PNG RSVG IMAGEMAGICK SOUND GPM DBUS GSETTINGS
GLIB NOTIFY ACL LIBSELINUX GNUTLS LIBXML2 FREETYPE M17N_FLT LIBOTF XFT
ZLIB TOOLKIT_SCROLL_BARS LUCID X11 XDBE XIM MODULES THREADS LIBSYSTEMD
LCMS2

Eli Zaretskii
> From: "Basil L. Contovounesios" <[hidden email]>
> Date: Mon, 13 May 2019 23:09:06 +0100
>
> I see the following on the master, harfbuzz, and emacs-26 branches
> (precise versions follow my signature), but I'm not sure how much of
> this is expected or due to e.g. my font.
>
> 0. emacs -Q
> 1. C-x 8 RET 0634 RET
>
> The "tail" of the sheen is truncated by the fringe:

After looking at the code and thinking about this, I think this is a
feature (as strange as it may sound); see below for why I think so.
And yes, it definitely depends on the font, in this case DejaVu Sans
Mono.  I don't see this with any other fixed-pitch font I have.

I'm not 100% sure I'm right here, so I CC Behdad and Kenichi in the
hope that they will comment on this.  Behdad, I see a very similar
issue with hb-view, when it renders this character using DejaVu Sans
Mono, so it isn't just an Emacs issue (and seeing what hb-view
produces actually made me think my opinion is correct about this).

> 3. SPC
>
> The "tail" of the sheen becomes visible, but falls outside of the box
> cursor:

Yes, this particular font's glyph for sheen has a negative left
bearing, which AFAIU means it extends beyond the box dimensions to the
left.
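
To make the bearing arithmetic concrete, here is a minimal Python sketch.
The metric values are made up for illustration and are not read from
DejaVu Sans Mono; `ink_extent` and `clipped_columns` are hypothetical
helpers, not Emacs or HarfBuzz functions.

```python
def ink_extent(origin_x, left_bearing, ink_width):
    """Return (left, right) x coordinates of the glyph's ink.

    The ink starts left_bearing pixels to the right of the pen origin,
    so a negative bearing puts it to the left of the origin.
    """
    left = origin_x + left_bearing
    return left, left + ink_width


def clipped_columns(origin_x, left_bearing, ink_width, window_left=0):
    """Number of ink pixel columns that fall left of the window edge."""
    left, right = ink_extent(origin_x, left_bearing, ink_width)
    return max(0, min(right, window_left) - left)


# Assumed metrics: 11-pixel cell, ink starting 3 pixels left of the origin.
CELL, BEARING, INK = 11, -3, 14
print(clipped_columns(0, BEARING, INK))     # glyph in the leftmost cell
print(clipped_columns(CELL, BEARING, INK))  # glyph one cell to the right
```

Under these assumed numbers the glyph in the leftmost cell loses three
pixel columns, while the same glyph one cell to the right loses none but
still starts outside its own cell, which would match a tail drawn outside
the box cursor.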

> 4. C-x 8 RET 0643 RET
>
> The kaf is correctly shaped in its initial form:
>
> 5. C-SPC
>
> The kaf changes to its isolated form:

This is a different problem, related to how we redraw portions of the
buffer inside the region (more generally, those which have colors
different from the default face).

The problem is that we only pass to the shaping engine stretches of
text that have the same face.  The basic reason for that is that a
different face can use a different font, and we can only handle
character composition for characters supported by the same font.
Another fundamental reason is that the display engine processes text
in chunks that have the same face.  So when the active region, or some
other Emacs feature, paints portions of text in some non-default face,
we redraw the display, and pass to the shaping engine only the portion
that has that different face.  If that portion is a single character,
you will see that it loses its correct shape and is rendered in its
isolated form.  And if the colors change between two characters that
need to be shaped together, the shaping will break.
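
The effect can be imitated with a toy model in Python. This is only a
sketch under simplifying assumptions: it knows just two dual-joining
letters, picks forms purely from neighbours inside the chunk it is given,
and `shape` and `shape_by_face` are hypothetical names, not the functions
the display engine or any shaping engine actually uses.

```python
# Toy model: both letters are dual-joining, so the form depends only on
# whether a joining neighbour exists on each side within the shaped chunk.
DUAL_JOINING = {"\u0634", "\u0643"}  # sheen, kaf


def shape(chunk):
    """Return the contextual form chosen for each character in chunk."""
    forms = []
    for i, ch in enumerate(chunk):
        if ch not in DUAL_JOINING:
            forms.append("none")
            continue
        joins_prev = i > 0 and chunk[i - 1] in DUAL_JOINING
        joins_next = i + 1 < len(chunk) and chunk[i + 1] in DUAL_JOINING
        if joins_prev and joins_next:
            forms.append("medial")
        elif joins_prev:
            forms.append("final")
        elif joins_next:
            forms.append("initial")
        else:
            forms.append("isolated")
    return forms


def shape_by_face(text, faces):
    """Shape each maximal same-face stretch of text separately."""
    forms, start = [], 0
    for i in range(1, len(text) + 1):
        if i == len(text) or faces[i] != faces[start]:
            forms += shape(text[start:i])
            start = i
    return forms


text = "\u0643\u0634"  # kaf followed by sheen
print(shape_by_face(text, ["default", "default"]))  # ['initial', 'final']
print(shape_by_face(text, ["region", "default"]))   # ['isolated', 'isolated']
```

With one face the kaf gets its initial form; once the face changes
between the two letters, each is shaped alone and falls back to its
isolated form.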

You can easily see this effect if you display HELLO, and then
shift-select portions of the Arabic greeting (or any other script that
is a heavy user of character compositions).

To fix this, we need some mechanism that will pass larger chunks of
text to the shaper in these cases, which will need some changes in how
the display engine iterates through buffer/string text when it
prepares them for display: we currently stop at every change of face.

Patches to fix this are most welcome.

> I occasionally see this happen even without typing anything, as if by a
> timer, but I'm not sure how to reproduce it.  I think, without being
> 100% certain, that it's only happened while using the 'arabic' input
> method.

Maybe, but given my description above, I'm not surprised, because it's
enough that Emacs decides, for some reason, to redraw just that one
character.

Now to the original problem.  Let me turn the tables and ask you: what
did you expect to happen instead?  This is a fixed-pitch font, so how
can Emacs display a character that extends to the left from its box,
at the left-most window coordinate?  It has no choice but to consider
its extension to be off-screen, as if the window was hscrolled.

The "normal" case for this character is to be part of R2L text, which
begins at the right window margin, and flows to the left.  In that
case, the extension will overlap the character cell of the next (in
the logical order) character.

So I think we have no bug here, we behave as expected.  If not, I'm
sure Behdad and Kenichi will correct me.



Eli Zaretskii
Adding Khaled.

The original message with images can be viewed here:

  https://debbugs.gnu.org/cgi/bugreport.cgi?bug=35721#5




Behdad Esfahbod-2
Hi Eli,

Pretty much what you said: if the font is not mono-spaced, there's nothing we can do.  Also, if you don't pass neighboring context text to HarfBuzz, again, there's nothing we can do.

Eli Zaretskii
> From: Behdad Esfahbod <[hidden email]>
> Date: Wed, 15 May 2019 16:02:19 -0700
> Cc: [hidden email], Kenichi Handa <[hidden email]>,
> Khaled Hosny <[hidden email]>, [hidden email]
>
> Pretty much what you said: if the font is not mono-spaced, there's nothing we can do.

Thanks.  Let me be sure I understand what you are saying.  If I invoke
hb-view like this:

  hb-view -u 0x0634 -O png -o sheen.png DejaVuSansMono.ttf

then the result is the following PNG image:



Do I understand you correctly that this is the expected result with
that font?  Because what Emacs displays, even without HarfBuzz as its
shaping engine, looks exactly like that: the leftmost part of the
letter is off-screen.

> Also, if you don't pass neighboring context text to HarfBuzz, again,
> nothing we can do.

I believe this is about the other part: displaying text which is
partially selected, when selection is shown as a different background
color.  You are saying that to do its job, a shaping engine needs to
see the entire text, not just the part which has the same colors.
Right?

Thanks again for your comments.

Khaled Hosny
On Thu, May 16, 2019 at 04:28:10PM +0300, Eli Zaretskii wrote:

> > From: Behdad Esfahbod <[hidden email]>
> >
> > Also, if you don't pass neighboring context text to HarfBuzz, again,
> > nothing we can do.
>
> I believe this is about the other part: displaying text which is
> partially selected, when selection is shown as a different background
> color.  You are saying that to do its job, a shaping engine needs to
> see the entire text, not just the part which has the same colors.
> Right?

There are two kinds of formatting changes: those that involve a different
font and those that don’t.

For changes that involve a different font (using a new typeface, making
text bold or italic etc.) each part of the text on the sides of the
change has to be shaped separately, but if enough context is given then
HarfBuzz can at least do basic Arabic shaping (selecting the right form
for the context).

Changes that do not involve a different font, like color, underline,
background etc., should not break the shaping at all and the text should
be shaped together. The formatting information should be kept along with
the text and be applied after shaping, i.e. if characters N to N+5 are
red, then after shaping, glyphs belonging to characters N to N+5 should
be drawn red.
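
A rough Python sketch of that idea follows. The glyph list is a
hand-written stand-in for shaper output, not something produced by
HarfBuzz, and the glyph names, the `Glyph` type, and `color_glyphs` are
made up for illustration. The one thing it relies on is that the shaper
reports, for every output glyph, the index of the source character it
came from (HarfBuzz calls this the cluster value).

```python
from typing import NamedTuple


class Glyph(NamedTuple):
    name: str      # illustrative glyph name, not taken from a real font
    cluster: int   # index of the first source character of this glyph


def color_glyphs(glyphs, char_colors):
    """Pair every shaped glyph with the color of its source character."""
    return [(g.name, char_colors[g.cluster]) for g in glyphs]


# Hand-written stand-in for the result of shaping kaf + sheen together.
shaped = [Glyph("kaf.init", 0), Glyph("sheen.fina", 1)]
# Character 0 lies inside the region, character 1 does not.
colors = ["region", "default"]
print(color_glyphs(shaped, colors))
```

Both letters keep their joined forms because the text was shaped as one
unit; only the drawing color differs per glyph. A ligature spanning
characters with different colors would need extra handling that this
sketch does not attempt.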

Regards,
Khaled



Eli Zaretskii
> Date: Thu, 16 May 2019 15:45:08 +0200
> From: Khaled Hosny <[hidden email]>
> Cc: Behdad Esfahbod <[hidden email]>, [hidden email], [hidden email],
> [hidden email]
>
> > I believe this is about the other part: displaying text which is
> > partially selected, when selection is shown as a different background
> > color.  You are saying that to do its job, a shaping engine needs to
> > see the entire text, not just the part which has the same colors.
> > Right?
>
> There are two kinds of formatting changes: those that involve a different
> font and those that don’t.
>
> For changes that involve a different font (using a new typeface, making
> text bold or italic etc.) each part of the text on the sides of the
> change has to be shaped separately, but if enough context is given then
> HarfBuzz can at least do basic Arabic shaping (selecting the right form
> for the context).
>
> Changes that do not involve a different font, like color, underline,
> background etc., should not break the shaping at all and the text should
> be shaped together. The formatting information should be kept along with
> the text and be applied after shaping, i.e. if characters N to N+5 are
> red, then after shaping, glyphs belonging to characters N to N+5 should
> be drawn red.

Right, thanks.

The main point of this bug report is about the other part, though: how
sheen should be displayed when using DejaVu Sans Mono.



Khaled Hosny
On Thu, May 16, 2019 at 05:08:47PM +0300, Eli Zaretskii wrote:
> The main point of this bug report is about the other part, though: how
> sheen should be displayed when using DejaVu Sans Mono.

I don’t think there is a right answer here. Glyphs can have negative right
or left side bearings, so if the graphics context has no room on either
side then I think it is expected that the glyph would be truncated. This
is not Arabic specific (e.g. italic f often has a negative right side
bearing, in variable width fonts at least).



Eli Zaretskii
tags 35721 notabug
thanks

> Date: Thu, 16 May 2019 16:20:00 +0200
> From: Khaled Hosny <[hidden email]>
> Cc: [hidden email], [hidden email], [hidden email],
> [hidden email]
>
> On Thu, May 16, 2019 at 05:08:47PM +0300, Eli Zaretskii wrote:
> > The main point of this bug report is about the other part, though: how
> > sheen should be displayed when using DejaVu Sans Mono.
>
> I don’t think there is a right answer here. Glyphs can have negative right
> or left side bearings, so if the graphics context has no room on either
> side then I think it is expected that the glyph would be truncated. This
> is not Arabic specific (e.g. italic f often has a negative right side
> bearing, in variable width fonts at least).

OK, thanks.

Basil, is it okay with you to close this bug, given these comments?



Basil L. Contovounesios
Eli Zaretskii <[hidden email]> writes:

>> From: "Basil L. Contovounesios" <[hidden email]>
>> Date: Mon, 13 May 2019 23:09:06 +0100
>>
>> I see the following on the master, harfbuzz, and emacs-26 branches
>> (precise versions follow my signature), but I'm not sure how much of
>> this is expected or due to e.g. my font.
>>
>> 0. emacs -Q
>> 1. C-x 8 RET 0634 RET
>>
>> The "tail" of the sheen is truncated by the fringe:
>
> After looking at the code and thinking about this, I think this is a
> feature (as strange as it may sound); see below for why I think so.
> And yes, it definitely depends on the font, in this case DejaVu Sans
> Mono.  I don't see this with any other fixed-pitch font I have.
>
> I'm not 100% sure I'm right here, so I CC Behdad and Kenichi in the
> hope that they will comment on this.  Behdad, I see a very similar
> issue with hb-view, when it renders this character using DejaVu Sans
> Mono, so it isn't just an Emacs issue (and seeing what hb-view
> produces actually made me think my opinion is correct about this).
>
>> 3. SPC
>>
>> The "tail" of the sheen becomes visible, but falls outside of the box
>> cursor:
>
> Yes, this particular font's glyph for sheen has a negative left
> bearing, which AFAIU means it extends beyond the box dimensions to the
> left.

OK.

>> 4. C-x 8 RET 0643 RET
>>
>> The kaf is correctly shaped in its initial form:
>>
>> 5. C-SPC
>>
>> The kaf changes to its isolated form:
>
> This is a different problem, related to how we redraw portions of the
> buffer inside the region (more generally, those which have colors
> different from the default face).
>
> The problem is that we only pass to the shaping engine stretches of
> text that have the same face.  The basic reason for that is that a
> different face can use a different font, and we can only handle
> character composition for characters supported by the same font.
> Another fundamental reason is that the display engine processes text
> in chunks that have the same face.  So when the active region, or some
> other Emacs feature, paints portions of text in some non-default face,
> we redraw the display, and pass to the shaping engine only the portion
> that has that different face.  If that portion is a single character,
> you will see that it loses its correct shape and is rendered in its
> isolated form.  And if the colors change between two characters that
> need to be shaped together, the shaping will break.
>
> You can easily see this effect if you display HELLO, and then
> shift-select portions of the Arabic greeting (or any other script that
> is a heavy user of character compositions).
>
> To fix this, we need some mechanism that will pass larger chunks of
> text to the shaper in these cases, which will need some changes in how
> the display engine iterates through buffer/string text when it
> prepares them for display: we currently stop at every change of face.

Makes sense, thanks for explaining.

> Patches to fix this are most welcome.

I don't think I need to tell you not to hold your breath.
Maybe one day...

>> I occasionally see this happen even without typing anything, as if by a
>> timer, but I'm not sure how to reproduce it.  I think, without being
>> 100% certain, that it's only happened while using the 'arabic' input
>> method.
>
> Maybe, but given my description above, I'm not surprised, because it's
> enough that Emacs decides, for some reason, to redraw just that one
> character.

Your description above explains the case where the mark is activated in
the middle of a composition, but I don't think it explains why inserting
characters further down the buffer would affect compositions on previous
lines, as in steps 9 and 10 in the OP, where no face change is involved.

> Now to the original problem.  Let me turn the tables and ask you: what
> did you expect to happen instead?  This is a fixed-pitch font, so how
> can Emacs display a character that extends to the left from its box,
> at the left-most window coordinate?  It has no choice but to consider
> its extension to be off-screen, as if the window was hscrolled.
>
> The "normal" case for this character is to be part of R2L text, which
> begins at the right window margin, and flows to the left.  In that
> case, the extension will overlap the character cell of the next (in
> the logical order) character.

Right.  I only noticed the truncation while writing up the rest of the
report, but it seemed relevant enough to mention.  Following your
explanation, the current truncating behaviour seems reasonable to me.

> So I think we have no bug here, we behave as expected.  If not, I'm
> sure Behdad and Kenichi will correct me.

Thanks,

--
Basil



Basil L. Contovounesios
Eli Zaretskii <[hidden email]> writes:

> tags 35721 notabug
> thanks
>
>> Date: Thu, 16 May 2019 16:20:00 +0200
>> From: Khaled Hosny <[hidden email]>
>> Cc: [hidden email], [hidden email], [hidden email],
>> [hidden email]
>>
>> On Thu, May 16, 2019 at 05:08:47PM +0300, Eli Zaretskii wrote:
>> > The main point of this bug report is about the other part, though: how
>> > sheen should be displayed when using DejaVu Sans Mono.
>>
>> I don’t think there is a right answer here. Glyphs can have negative right
>> or left side bearings, so if the graphics context has no room on either
>> side then I think it is expected that the glyph would be truncated. This
>> is not Arabic specific (e.g. italic f often has a negative right side
>> bearing, in variable width fonts at least).
>
> OK, thanks.
>
> Basil, is it okay with you to close this bug, given these comments?

Thanks to everyone for your explanations.  I now understand why the
truncation happens, and why mark activation is expected to affect
character composition.

The issue that prompted this report, however, is the alternating
toggling of certain character compositions while typing in a separate
part of the buffer (steps 9 and 10 in the OP), where no face change is
involved.  I'm sorry for not making this clearer from the outset.

If you think this is unsurprising behaviour given the display engine's
current implementation, I don't mind if you close this issue.
Otherwise, perhaps either this issue can be retitled, or I can submit a
new issue focussing only on that part of the OP.

Sorry for the confusion, and thanks again.

--
Basil



Eli Zaretskii
> From: "Basil L. Contovounesios" <[hidden email]>
> Cc: Khaled Hosny <[hidden email]>,  <[hidden email]>,  <[hidden email]>,  <[hidden email]>
> Date: Thu, 16 May 2019 21:54:10 +0100
>
> The issue that prompted this report, however, is the alternating
> toggling of certain character compositions while typing in a separate
> part of the buffer (steps 9 and 10 in the OP), where no face change is
> involved.  I'm sorry for not making this clearer from the outset.
>
> If you think this is unsurprising behaviour given the display engine's
> current implementation, I don't mind if you close this issue.

I'm quite sure it is the result of how the display of complex scripts
is implemented.  However, ...

> Otherwise, perhaps either this issue can be retitled, or I can submit a
> new issue focussing only on that part of the OP.

... I think it would be best to file a new bug report, which is _only_
about the above-mentioned alternations in display of composed
characters, with specific examples only for that issue.  Then these
examples could be analyzed, and either (1) we find some bug that will
then be fixed, or (2) we conclude that these are artifacts of the
current implementation of the display engine, and make this a wishlist
bug report.

Thanks.



Basil L. Contovounesios
close 35721
quit

Eli Zaretskii <[hidden email]> writes:

>> From: "Basil L. Contovounesios" <[hidden email]>
>> Cc: Khaled Hosny <[hidden email]>,  <[hidden email]>,  <[hidden email]>,  <[hidden email]>
>> Date: Thu, 16 May 2019 21:54:10 +0100
>>
>> The issue that prompted this report, however, is the alternating
>> toggling of certain character compositions while typing in a separate
>> part of the buffer (steps 9 and 10 in the OP), where no face change is
>> involved.  I'm sorry for not making this clearer from the outset.
>>
>> If you think this is unsurprising behaviour given the display engine's
>> current implementation, I don't mind if you close this issue.
>
> I'm quite sure it is the result of how the display of complex scripts
> is implemented.  However, ...
>
>> Otherwise, perhaps either this issue can be retitled, or I can submit a
>> new issue focussing only on that part of the OP.
>
> ... I think it would be best to file a new bug report, which is _only_
> about the above-mentioned alternations in display of composed
> characters, with specific examples only for that issue.  Then these
> examples could be analyzed, and either (1) we find some bug that will
> then be fixed, or (2) we conclude that these are artifacts of the
> current implementation of the display engine, and make this a wishlist
> bug report.

Filed as bug#35811[1], so I'm closing this report.

[1]: https://debbugs.gnu.org/35811

Thanks,

--
Basil