bug#41005: problem with rendering Persian text in Emacs 27

hossein valizadeh
Hello,

I have been using GNU Emacs for a few years now, and I recently decided
to build Emacs 27.0.91 from the emacs-27 branch.  I use Arch GNU/Linux,
along with the i3 window manager.

I have noticed a new problem with rendering Persian text in Emacs 27,
which to my knowledge was not present in earlier Emacs versions.  If you
see the screenshots I have attached, some text seems to get garbled at
random.  The same words sometimes appear correctly and sometimes do not.
I recall John Wiegley mentioning in his EmacsConf 2019 talk that Emacs
has recently started using the HarfBuzz text shaping engine.  I wonder
if that's relevant and possibly the cause of this issue?

Thanks in advance for looking into this.  I really hope the issue can be
resolved, as I use Emacs and Org extensively for writing my documents,
and this bug effectively renders Emacs unusable for me.  I'll be happy
to try and provide any further information needed to debug this.

Regards,
Hossein Valizadeh

Attachments: scr-1.png (83K), latest-screenshot.png (540K)

bug#41005: problem with rendering Persian text in Emacs 27

Eli Zaretskii
> From: hossein valizadeh <[hidden email]>
> Date: Fri, 1 May 2020 22:32:31 +0430
>
> I have noticed a new problem with rendering Persian text in Emacs 27,
> which to my knowledge was not present in earlier Emacs versions.  If you
> see the screenshots I have attached, some text seems to get garbled at
> random.  The same words sometimes appear correctly and sometimes do not.
> I recall John Wiegley mentioning in his EmacsConf 2019 talk that Emacs
> has recently started using the HarfBuzz text shaping engine.  I wonder
> if that's relevant and possibly the cause of this issue?

I don't know.  Was your Emacs built with HarfBuzz?  You didn't attach
the data collected by "M-x report-emacs-bug", so I cannot know.

I suggest starting by attaching a sample file that exhibits the
problem, as text, not as an image, and then showing what it looked
like in previous versions of Emacs and what it looks like in Emacs
27.0.91 on your system.  Also, please tell which font was in use when
the problematic display was produced.  Then we can look into this
issue and see what could have caused it.

Thanks.




bug#41005: problem with rendering Persian text in Emacs 27

Eli Zaretskii
[Please keep the bug address on the CC list; use "Reply All".]

> From: hossein valizadeh <[hidden email]>
> Date: Wed, 3 Jun 2020 09:45:09 +0430
>
> The Emacs I'm using is compiled with HarfBuzz. I also attached a video file containing information about this
> problem for a better understanding.
>
> I checked my config file, line by line in different modes and in combination with different commands.
> I think it gets garbled in one of the following 2 situations:
>
> 1. the auto-fill-mode is on
> 2. column-number-mode is on (or while I'm using the %c modifier in a custom mode line)
>
> # These two seem to be related, as auto-fill-mode calculates the number of characters in each line too.
> # I tried these in a raw, unconfigured Emacs; the results were the same and characters got garbled.
> # The font in use makes no difference. (In raw Emacs I used the "DejaVu Sans Mono" font.)
> # Scaling the buffer may help to show the word correctly, but other words are still garbled.
> # It mostly happens when I'm trying to add something to an existing line (and sometimes it just happens
> normally).
> # I have deactivated these two options in my config file and Emacs has worked properly since then.
>
> *** I noticed another bug:
>
> When I change the input language to Persian (or any other language) with the set-input-method command and
> scale the text with "C-x +", then start typing, the first character in the buffer ignores the input method and an
> English character appears in the buffer; but it doesn't happen for the next characters.

I asked you to please provide a test file with text that you think
looks incorrect on display, and images that show how it looks in
Emacs 27 and in Emacs 26:

>  I suggest starting by attaching a sample file that exhibits the
>  problem, as text, not as an image, and then showing what it looked
>  like in previous versions of Emacs and what it looks like in Emacs
>  27.0.91 on your system.  Also, please tell which font was in use when
>  the problematic display was produced.  Then we can look into this
>  issue and see what could have caused it.

Please provide that stuff; it is important for making progress with
this bug report.  TIA.




bug#41005: problem with rendering Persian text in Emacs 27

Nicholas Drozd
In reply to this post by hossein valizadeh
Not certain, but this sounds like it could be related to bug#37683
[1]. That report was closed because it does not seem to be an EWW
problem, but the issue still shows up for me.

To recap, the problem there is that Arabic script letters, which are
typically joined together in a cursive style, show up separated and
unconnected, if and only if text scaling is at zero. If the text is
made larger or smaller, the text becomes properly joined. This leads
me to suspect that it is not a Harfbuzz issue, because the text is
displayed correctly under some conditions but not under others.

[1] https://lists.gnu.org/archive/html/bug-gnu-emacs/2019-10/msg00921.html




bug#41005: problem with rendering Persian text in Emacs 27

Eli Zaretskii
> From: Nicholas Drozd <[hidden email]>
> Date: Wed, 3 Jun 2020 12:24:30 -0500
>
> Not certain, but this sounds like it could be related to bug#37683
> [1]. That report was closed because it does not seem to be an EWW
> problem, but the issue still shows up for me.
>
> To recap, the problem there is that Arabic script letters, which are
> typically joined together in a cursive style, show up separated and
> unconnected, if and only if text scaling is at zero. If the text is
> made larger or smaller, the text becomes properly joined. This leads
> me to suspect that it is not a Harfbuzz issue, because the text is
> displayed correctly under some conditions but not under others.

Does it happen to you with any font that supports Arabic, or just with
some?

And if you change the scale, then change it back to zero, does the
problem happen again, or does it only happen when you look at some
text for the first time?




bug#41005: problem with rendering Persian text in Emacs 27

hossein valizadeh
Hello,

1. This happens with every font that supports Arabic.
2. When I scale back to zero, the problem happens again.

I attached my emacs-bug-info.



Attachment: Emacs-bug-info.txt (24K)

bug#41005: problem with rendering Persian text in Emacs 27

Eli Zaretskii
On June 4, 2020 7:01:30 AM GMT+03:00, Eli Zaretskii <[hidden email]> wrote:

> On June 4, 2020 6:01:21 AM GMT+03:00, hossein valizadeh
> <[hidden email]> wrote:
> > Hello,
> >
> > 1. This happens with every font that supports Arabic.
> > 2. When I scale back to zero, the problem happens again.
> >
> > I attached my emacs-bug-info.
>
>
> Thanks.
>
> What is your version of HarfBuzz?
> And what happens if you unset XMODIFIERS, i.e. disable the ibus input
> method framework?

Also, please go to the problematic place in the text and type "C-u C-x =", then post everything that Emacs shows in the *Help* buffer as a result.  Please do this both at text scale zero, when shaping is incorrect, and at non-zero scale, and post the contents of *Help* in both cases.

Screenshots of both displays, as well as the text of the buffer used for these experiments, will also help, as mentioned earlier.

And finally, please try this with the version on the master branch, where a few fixes were installed lately.




bug#41005: problem with rendering Persian text in Emacs 27

hossein valizadeh
The HarfBuzz version is 2.6.7.
Setting XMODIFIERS='' makes no difference.
The text used in these screenshots is https://time.ir in EWW mode (the *Help* buffer content is displayed in the screenshots).

I will try the master branch and report the result.

Attachments: zero-scale.png (217K), none-zero-scale.png (236K)

bug#41005: problem with rendering Persian text in Emacs 27

Pip Cet
hossein valizadeh <[hidden email]> writes:

>  And finally, please try this with the version on the master branch, where a few fixes were
>  installed lately.

I can reproduce the very similar issue described (Farsi Wikipedia entry
for Emacs), on current master. I believe I've figured it out, but I can
also debug further if required.

What happens is that font-shape-gstring is called with direction == L2R,
mis-shapes the text, then caches that version in the composition gstring
cache. The cache doesn't distinguish directions, and it's never cleared,
so this "infects" other buffers, but only if they're opened afterwards,
and only for the same words.

shr appears to force bidi-display-reordering off. Removing that let
binding from shr-insert-document avoids the problem.

We should consider adding direction to our gstrings, on master. While
we're there, let's also add script, language, and harfbuzz features to
the gstrings so we know we've captured the precise harfbuzz parameters?
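
As an illustration of the caching behaviour described above, here is a minimal, self-contained C sketch. It does not use Emacs's real composition cache: `shape_cache`, `lookup_or_shape`, and `key_includes_dir` are hypothetical names made up for this example, and "shaping" is reduced to remembering which direction was requested first. With a text-only key, a later R2L request for the same word gets the stale L2R result; with the direction in the key, it does not.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; none of these names exist in Emacs.  */
enum direction { DIR_L2R, DIR_R2L };

struct cache_entry
{
  const char *text;     /* the composed text, used as the lookup key */
  enum direction dir;   /* direction the text was shaped with */
};

#define CACHE_SIZE 16

struct shape_cache
{
  struct cache_entry entries[CACHE_SIZE];
  size_t count;
  int key_includes_dir; /* 0: key is text only; 1: key is text + direction */
};

/* Return the direction of the shaping that a request for (TEXT, DIR)
   ends up using.  On a cache miss, "shape" with DIR and store the result.  */
enum direction
lookup_or_shape (struct shape_cache *cache, const char *text,
                 enum direction dir)
{
  for (size_t i = 0; i < cache->count; i++)
    {
      const struct cache_entry *e = &cache->entries[i];
      if (strcmp (e->text, text) == 0
          && (!cache->key_includes_dir || e->dir == dir))
        return e->dir;  /* cache hit: reuse whatever was stored earlier */
    }

  assert (cache->count < CACHE_SIZE);
  cache->entries[cache->count].text = text;
  cache->entries[cache->count].dir = dir;
  cache->count++;
  return dir;
}
```

For example, after `lookup_or_shape (&cache, "word", DIR_L2R)`, a second call with `DIR_R2L` returns `DIR_L2R` when `key_includes_dir` is 0 and `DIR_R2L` when it is 1.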




bug#41005: problem with rendering Persian text in Emacs 27

Pip Cet
Eli Zaretskii <[hidden email]> writes:

>> From: Pip Cet <[hidden email]>
>> Cc: Eli Zaretskii <[hidden email]>,  [hidden email],
>>   [hidden email]
>> Date: Thu, 04 Jun 2020 08:28:24 +0000
>>
>> What happens is that font-shape-gstring is called with direction == L2R,
>> mis-shapes the text, then caches that version in the composition gstring
>> cache. The cache doesn't distinguish directions, and it's never cleared,
>> so this "infects" other buffers, but only if they're opened afterwards,
>> and only for the same words.
>>
>> shr appears to force bidi-display-reordering off. Removing that let
>> binding from shr-insert-document avoids the problem.

>> We should consider adding direction to our gstrings, on master. While
>> we're there, let's also add script, language, and harfbuzz features to
>> the gstrings so we know we've captured the precise harfbuzz parameters?
>
> On emacs-27, I can fix this by a simpler band-aid below.

> On master, I think we should indeed add direction to the cached
> gstrings, as there could be other much more subtle situations where
> looking at bidi-display-reordering is not enough, and we could then
> still cache a composition with the wrong direction.  Patches welcome.

Sure, such subtle situations exist.

> As for script and language, for now adding them would just waste
> memory, since we don't yet have any meaningful support for
> buffer-local, let alone paragraph-local, scripts and languages.  When
> we added HarfBuzz support, I considered adding some functionality to
> determine language and/or script from the codepoints, but decided that
> it made little sense, since HarfBuzz already does exactly that in
> hb_buffer_guess_segment_properties.  So I think we should add to the
> FIXME in hbfont.c the fact that when we do support those internally in
> Emacs, we should add these attributes to cached gstrings, but not yet
> actually add them for now.

We're going to have to change the lgstring structure, though, right?

Could we maybe get away with just doing so once? My suggestion would be
to add a single field to gstrings which would currently be L2R or R2L
but might become an alist or something when we get around to adding
those features?

> Here's the patch I propose for emacs-27:

Let's try that.




bug#41005: problem with rendering Persian text in Emacs 27

hossein valizadeh
I tried the patch.
The eww problem is solved, but the problem is still there when I enable auto-fill-mode or column-number-mode.

Please look at this video file for a better understanding:
http://s13.picofile.com/d/8399189550/7eeb413f-0df7-4da6-9db1-1632c9fc749f/out.mkv
https://filebin.net/mzmjm74lp7wsxr8e
https://gofile.io/d/H8xk26


For example, if you type in the following sentence:
این نام است که می‌ماند


Then go back a few characters in the same line and type words randomly. You will see that the letters in some words are displayed separately. I type a few words at random, after the word این and before the word نام :‌

این فراموشی را به همه اینکه فرمت مراتب افتتاح گرامی گرایش سراسیمه نام است که می‌ماند

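This line should look like this:
http://s12.picofile.com/file/8399190550/correct.png

But if either auto-fill-mode or column-number-mode is enabled, it will be displayed this way:
http://s13.picofile.com/file/8399190584/malformed.png
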



bug#41005: problem with rendering Persian text in Emacs 27

Eli Zaretskii
> From: hossein valizadeh <[hidden email]>
> Date: Fri, 5 Jun 2020 09:16:53 +0430
> Cc: Eli Zaretskii <[hidden email]>, [hidden email],
> Nicholas Drozd <[hidden email]>
>
> I tried the patch.

Please tell which patch was that, and in what Emacs version you tried
it.  (Please understand that you are generally talking to people some
of whom don't read Arabic or Persian, so the more details you supply
the less misunderstanding and confusion will follow, and the faster
this problem will be solved.)

> The eww problem is solved, but the problem is still there when I enable auto-fill-mode or
> column-number-mode.
>
> Please look at this video file for a better understanding:
> http://s13.picofile.com/d/8399189550/7eeb413f-0df7-4da6-9db1-1632c9fc749f/out.mkv
> https://filebin.net/mzmjm74lp7wsxr8e
> https://gofile.io/d/H8xk26

I cannot play this on my system, I see a bunch of ads (or what looks
like ads), and the name of a .mkv file.

> For example, if you type in the following sentence:
> این نام است که می‌ماند
>
> Then go back a few characters in the same line and type words randomly. You will see that the letters in
> some words are displayed separately. I type a few words at random, after the word این and before the word
> نام :‌
>
> این فراموشی را به همه اینکه فرمت مراتب افتتاح گرامی گرایش سراسیمه نام است که می‌ماند
>
> This line should look like this:
>
> http://s12.picofile.com/file/8399190550/correct.png
>
> But if either auto-fill-mode or column-number-mode is enabled, it will be displayed this way:
>
> http://s13.picofile.com/file/8399190584/malformed.png

Please show a full recipe for reproducing this, starting from
"emacs -Q", and describing every step to reproduce the result.  Please
tell the codepoint of each character you type to reproduce the problem
and each Emacs command.  I'd also greatly appreciate it if you
specifically pointed out which parts of the display to look at and what
differences to pay attention to.  This is needed to fully understand
the problem and analyze its root cause(s), given that not all of us
can read the Arabic script.

The master branch has recently got a few improvements in this area, so
please use that version of the code if you can.  And in any case,
please always state in what version you see which problem.

Thanks.




bug#41005: problem with rendering Persian text in Emacs 27

Basil L. Contovounesios
Eli Zaretskii <[hidden email]> writes:

>> From: hossein valizadeh <[hidden email]>
>> Date: Fri, 5 Jun 2020 09:16:53 +0430
>> Cc: Eli Zaretskii <[hidden email]>, [hidden email],
>> Nicholas Drozd <[hidden email]>
>>
>> The eww problem is solved, but the problem is still there when I enable auto-fill-mode or
>> column-number-mode.
>>
>> Please look at this video file for a better understanding:
>> http://s13.picofile.com/d/8399189550/7eeb413f-0df7-4da6-9db1-1632c9fc749f/out.mkv
>> https://filebin.net/mzmjm74lp7wsxr8e
>> https://gofile.io/d/H8xk26
>
> I cannot play this on my system, I see a bunch of ads (or what looks
> like ads), and the name of a .mkv file.

On the first page, you need to click twice on the red button with the
link icon, which according to my beginner-level reading of Perso-Arabic
script says "[something] link download".

That said, the easiest way for me to download the file was from the
third gofile link.  (All three links point to the same file AFAICT.)

If you're not able to play .mkv on Windows I think VLC supports it OOTB
(but I haven't used Windows in almost a decade so YMMV).

The large screen resolution of the screencast makes it a bit hard for me
to tell what's going on on my 14" laptop, though.

HTH,

--
Basil




bug#41005: problem with rendering Persian text in Emacs 27

Eli Zaretskii
In reply to this post by Pip Cet
> From: Pip Cet <[hidden email]>
> Cc: [hidden email],  [hidden email],  [hidden email]
> Date: Fri, 05 Jun 2020 08:41:19 +0000
>
> >> We're going to have to change the lgstring structure, though, right?
> >
> > I think so.  We should probably add one more element to the vector in
> > LGSTRING_HEADER, because the header is the hash key of the composition
> > cache.
>
> Do we have to maintain compatibility? If so, I suggest we change
>
> [FONT-OBJECT CHAR ...]
>
> to
>
> [FONT-OBJECT [CHAR ...] DIRECTION], and use ARRAYP (AREF (..., 2)) to
> decide whether the new format is in effect.  I actually thought about
> suggesting the format be [FONT-OBJECT STRING DIRECTION], but that would
> make debugging harder when pretty-printing the string in a failed
> composition re-attempts that composition.
>
> But of course it would be easier not to maintain compatibility, and then
> we could order the elements any way we choose.

We don't have to be backward-compatible here, I think, as the
structure of the header is an internal implementation detail.  So
something like [FONT-OBJECT DIRECTION CHAR ...] is also a possibility.

We could also put DIRECTION elsewhere and just modify the code that
passes the hash key to the hash function.

> > (I think we should prefer vectors to lists in
> > this case, because consing them is slightly faster, and the number of
> > elements is known in advance and fixed.)
>
> No argument there, though harfbuzz features, if we ever add them,
> probably should be added as a list inside the vector. I'm talking about
> things like "kern=0" passed to hb_feature_from_string, then to hb_shape,
> to disable kerning.

Maybe.  Something to consider when we actually support those features.




bug#41005: problem with rendering Persian text in Emacs 27

hossein valizadeh
In reply to this post by Eli Zaretskii
> Please tell which patch was that, and in what Emacs version you tried
> it.  (Please understand that you are generally talking to people some
> of whom don't read Arabic or Persian, so the more details you supply
> the less misunderstanding and confusion will follow, and the faster
> this problem will be solved.)

I'm sorry, my English is so bad, at least in writing. That's why it's
hard for me to give a full explanation.
My Emacs version:  27.0.91
I applied this patch:

diff --git a/src/hbfont.c b/src/hbfont.c
index 576c5fe..4b3f64e 100644
--- a/src/hbfont.c
+++ b/src/hbfont.c
@@ -26,6 +26,7 @@
 #include "composite.h"
 #include "font.h"
 #include "dispextern.h"
+#include "buffer.h"

 #ifdef HAVE_NTGUI

@@ -438,7 +439,11 @@ hbfont_shape (Lisp_Object lgstring, Lisp_Object direction)

   /* If the caller didn't provide a meaningful DIRECTION, let HarfBuzz
      guess it. */
-  if (!NILP (direction))
+  if (!NILP (direction)
+      /* If they bind bidi-display-reordering to nil, the DIRECTION
+        they provide is meaningless, and we should let HarfBuzz guess
+        the real direction.  */
+      && !NILP (BVAR (current_buffer, bidi_display_reordering)))
     {
       hb_direction_t dir = HB_DIRECTION_LTR;
       if (EQ (direction, QL2R))

--------------------------------------------------------------------------------
> I cannot play this on my system, I see a bunch of ads (or what looks
> like ads), and the name of a .mkv file.

Video file re-uploaded (.mp4 and .ogg):
https://srv-file9.gofile.io/download/cz7P41/out.mp4
https://srv-file4.gofile.io/download/Mwv8k4/out.ogg
--------------------------------------------------------------------------------

This patch solved the problems of eww, newsticker (and possibly some
other major modes) in displaying Persian/Arabic words. However, if
either auto-fill-mode or column-number-mode is enabled, the same
problem remains in files that use Persian or Arabic characters,
especially when you go back a few characters in a line and add
something to that line.

--------------------------------------------------------------------------------
> Please tell the codepoint of each character you type to reproduce
> the problem and each Emacs command.

For example, if you type in the following sentence:
این نام است که می‌ماند

codepoint:
\u0627\u06cc\u0646 \u0646\u0627\u0645 \u0627\u0633\u062a \u06a9\u0647 \u0645\u06cc\u200c\u0645\u0627\u0646\u062f


Then go back a few characters in the same line and type words randomly. You will see that the letters in some words are displayed separately. I type a few words at random, after the word این and before the word نام :‌

این فراموشی را به همه اینکه فرمت مراتب افتتاح گرامی گرایش سراسیمه نام است که می‌ماند

codepoint:
\u0627\u06cc\u0646 \u0641\u0631\u0627\u0645\u0648\u0634\u06cc \u0631\u0627 \u0628\u0647 \u0647\u0645\u0647 \u0627\u06cc\u0646\u06a9\u0647 \u0641\u0631\u0645\u062a \u0645\u0631\u0627\u062a\u0628 \u0627\u0641\u062a\u062a\u0627\u062d \u06af\u0631\u0627\u0645\u06cc \u06af\u0631\u0627\u06cc\u0634 \u0633\u0631\u0627\u0633\u06cc\u0645\u0647 \u0646\u0627\u0645 \u0627\u0633\u062a \u06a9\u0647 \u0645\u06cc\u200c\u0645\u0627\u0646\u062f

This line should look like this:
http://s12.picofile.com/file/8399190550/correct.png

But if either auto-fill-mode or column-number-mode is enabled, it will be displayed this way:
http://s13.picofile.com/file/8399190584/malformed.png

--------------------------------------------------------------------------------
> Please show a full recipe for reproducing this, starting from
> "emacs -Q", and describing every step to reproduce the result.

All steps are displayed in the video file.

> I'd also greatly appreciate it if you specifically pointed out which
> parts of the display to look at and what differences to pay attention
> to. This is needed to fully understand the problem and analyze its
> root cause(s), given that not all of us can read the Arabic script.

The letters that should normally be connected will be displayed separately:
http://s12.picofile.com/file/8399222618/latest_screenshot.png
--------------------------------------------------------------------------------

bug#41005: problem with rendering Persian text in Emacs 27

Eli Zaretskii
> From: hossein valizadeh <[hidden email]>
> Date: Fri, 5 Jun 2020 17:02:55 +0430
> Cc: [hidden email], [hidden email],
> Nicholas Drozd <[hidden email]>
>
> I'm sorry, my English is so bad, at least in writing. That's why it's
> hard for me to give a full explanation.

Your English is entirely adequate, there's nothing for you to be
ashamed of.

Thanks for the details, I think they point to the code identified by
Pip Cet, which is called from current-column and similar APIs.  A fix
will probably be available soon.




bug#41005: problem with rendering Persian text in Emacs 27

Pip Cet
Eli Zaretskii <[hidden email]> writes:

>> From: hossein valizadeh <[hidden email]>
>> Date: Fri, 5 Jun 2020 17:02:55 +0430
>> Cc: [hidden email], [hidden email],
>> Nicholas Drozd <[hidden email]>
>>
>> I'm sorry, my English is so bad, at least in writing. That's why it's
>> hard for me to give a full explanation.
>
> Your English is entirely adequate, there's nothing for you to be
> ashamed of.
>
> Thanks for the details, I think they point to the code identified by
> Pip Cet, which is called from current-column and similar APIs.  A fix
> will probably be available soon.

I think the attached patch is a fairly minimal fix; it's against master
and applies to emacs-27, but I haven't tested it there.

Given these two bugs, I wonder whether it wouldn't be more reasonable
always to let HarfBuzz guess the direction, at least for Emacs-27:
scripts which change direction, if they are supported by HarfBuzz, won't
work anyway.

Or am I missing something?


From 18d0e15ac298f40951ddeeec56e9d87c01f51798 Mon Sep 17 00:00:00 2001
From: Pip Cet <[hidden email]>
Date: Fri, 5 Jun 2020 12:54:01 +0000
Subject: [PATCH] Test patch for bug#41005

---
 src/composite.c | 4 +++-
 src/indent.c    | 4 ++--
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/src/composite.c b/src/composite.c
index 2c589e4f3a..13421a80da 100644
--- a/src/composite.c
+++ b/src/composite.c
@@ -1213,7 +1213,9 @@ composition_reseat_it (struct composition_it *cmp_it, ptrdiff_t charpos,
  continue;
       if (charpos < endpos)
  {
-  if ((bidi_level & 1) == 0)
+  if (bidi_level < 0)
+    direction = Qnil;
+  else if ((bidi_level & 1) == 0)
     direction = QL2R;
   else
     direction = QR2L;
diff --git a/src/indent.c b/src/indent.c
index c0b4c13b2c..581323b91e 100644
--- a/src/indent.c
+++ b/src/indent.c
@@ -596,7 +596,7 @@ scan_for_column (ptrdiff_t *endpos, EMACS_INT *goalcol, ptrdiff_t *prevcol)
       if (cmp_it.id >= 0
   || (scan == cmp_it.stop_pos
       && composition_reseat_it (&cmp_it, scan, scan_byte, end,
- w, NEUTRAL_DIR, NULL, Qnil)))
+ w, -1, NULL, Qnil)))
  composition_update_it (&cmp_it, scan, scan_byte, Qnil);
       if (cmp_it.id >= 0)
  {
@@ -1504,7 +1504,7 @@ compute_motion (ptrdiff_t from, ptrdiff_t frombyte, EMACS_INT fromvpos,
   if (cmp_it.id >= 0
       || (pos == cmp_it.stop_pos
   && composition_reseat_it (&cmp_it, pos, pos_byte, to, win,
-    NEUTRAL_DIR, NULL, Qnil)))
+    -1, NULL, Qnil)))
     composition_update_it (&cmp_it, pos, pos_byte, Qnil);
   if (cmp_it.id >= 0)
     {
--
2.27.0.rc0
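
The direction selection in the patch above can be read as a three-way mapping from bidi level to shaping direction. The sketch below restates it as a standalone C function under assumed names: `direction_from_bidi_level` and the `DIR_*` constants are hypothetical stand-ins for the inline code and for Qnil, QL2R and QR2L; they are not Emacs identifiers.

```c
#include <assert.h>

/* Hypothetical stand-ins for Qnil, QL2R and QR2L.  */
enum direction { DIR_UNKNOWN, DIR_L2R, DIR_R2L };

/* Map a bidi embedding level to a shaping direction.
   LEVEL is -1 when the caller has no resolved level (as the patched
   callers in indent.c pass), otherwise a non-negative level.  */
enum direction
direction_from_bidi_level (int level)
{
  assert (level >= -1);

  if (level < 0)
    return DIR_UNKNOWN;   /* no information: let the shaper guess */
  if ((level & 1) == 0)
    return DIR_L2R;       /* even levels are left-to-right */
  return DIR_R2L;         /* odd levels are right-to-left */
}
```

The point of the extra case is that callers which never ran the bidi reordering no longer claim a definite L2R direction that would then be cached.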


bug#41005: problem with rendering Persian text in Emacs 27

Eli Zaretskii
> From: Pip Cet <[hidden email]>
> Cc: hossein valizadeh <[hidden email]>,  [hidden email],
>   [hidden email]
> Date: Fri, 05 Jun 2020 13:05:43 +0000
>
> I think the attached patch is a fairly minimal fix; it's against master,
> applies to emacs-27 but I haven't tested it there.

Thanks, it LGTM.

I think we should put this on emacs-27, because this is a regression
caused by Emacs 27's support for HarfBuzz as the default shaping
engine.  The other shapers didn't want us to provide the direction,
they determined it internally.  We added the DIRECTION argument as
part of integrating HarfBuzz.

We could do better than your patch by actually computing the resolved
bidi level there, which would require start_display followed by
move_it_to, in which case we probably won't need to call
composition_reseat_it by hand at all, and could just pick up the
result produced by move_it_to.  Or maybe we should just use
Fvertical_motion instead (which does all that internally).  But these
ideas are for the master branch, not for emacs-27.

> Given these two bugs, I wonder whether it wouldn't be more reasonable
> always to let HarfBuzz guess the direction, at least for Emacs-27:
> scripts which change direction, if they are supported by HarfBuzz, won't
> work anyway.

Please explain "scripts that change direction" and "won't work
anyway"; I don't think I understand that part.

The reason we don't let HarfBuzz guess in all cases is because the
resolved bidi level, when we have it, is a more accurate indication of
the required direction.  For example, if you have RTL characters
inside the LRO..PDF embedding, it would be wrong to let the shaper
guess, because it could (and usually will) guess wrongly that the
direction is R2L.  It is true that these are rare and unusual use
cases, but they do exist, and Emacs does want to support them,
including with scripts that must use the shaping engine.
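
A rough sketch of why guessing can go wrong inside an override, written as standalone C. Everything here is a simplification invented for illustration: `guess_direction` is a "first strong letter wins" heuristic over a few hard-coded ranges, not HarfBuzz's actual algorithm, and `override_direction_at` tracks only LRO/RLO/PDF nesting rather than implementing the full Unicode Bidirectional Algorithm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum direction { DIR_UNKNOWN, DIR_L2R, DIR_R2L };

#define LRO 0x202D      /* LEFT-TO-RIGHT OVERRIDE */
#define RLO 0x202E      /* RIGHT-TO-LEFT OVERRIDE */
#define PDF 0x202C      /* POP DIRECTIONAL FORMATTING */
#define MAX_DEPTH 16    /* arbitrary nesting limit for this sketch */

/* Very rough test: treat the Hebrew and Arabic blocks as right-to-left.  */
static int
is_rtl_letter (uint32_t c)
{
  return (c >= 0x0590 && c <= 0x05FF) || (c >= 0x0600 && c <= 0x06FF);
}

/* Guess a direction from the characters alone: the first ASCII or
   Hebrew/Arabic letter decides.  */
enum direction
guess_direction (const uint32_t *text, size_t len)
{
  for (size_t i = 0; i < len; i++)
    {
      if (is_rtl_letter (text[i]))
        return DIR_R2L;
      if ((text[i] >= 'A' && text[i] <= 'Z')
          || (text[i] >= 'a' && text[i] <= 'z'))
        return DIR_L2R;
    }
  return DIR_UNKNOWN;
}

/* Direction forced on position POS by the overrides that precede it,
   or DIR_UNKNOWN when no override is active there.  */
enum direction
override_direction_at (const uint32_t *text, size_t len, size_t pos)
{
  enum direction stack[MAX_DEPTH];
  size_t depth = 0;

  assert (pos < len);
  for (size_t i = 0; i < pos; i++)
    {
      if (text[i] == LRO || text[i] == RLO)
        {
          assert (depth < MAX_DEPTH);
          stack[depth++] = text[i] == LRO ? DIR_L2R : DIR_R2L;
        }
      else if (text[i] == PDF && depth > 0)
        depth--;
    }
  return depth > 0 ? stack[depth - 1] : DIR_UNKNOWN;
}
```

For Arabic letters wrapped in LRO..PDF, `guess_direction` reports R2L while `override_direction_at` reports L2R, which is the kind of mismatch the resolved bidi level is meant to avoid.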




bug#41005: problem with rendering Persian text in Emacs 27

hossein valizadeh
In reply to this post by Pip Cet
I tested Pip Cet's patch on Emacs 27.0.91 and everything was fine.
In both cases (auto-fill-mode & column-number-mode) there was no
confusion in the letters, and all the words were displayed correctly.

Thanks.

Attachment: 0001-Test-patch-for-bug-41005.patch (2K)

bug#41005: problem with rendering Persian text in Emacs 27

Eli Zaretskii
> From: hossein valizadeh <[hidden email]>
> Date: Fri, 5 Jun 2020 18:53:40 +0430
> Cc: Eli Zaretskii <[hidden email]>, [hidden email],
> Nicholas Drozd <[hidden email]>
>
> I tested Pip Cet's patch on Emacs 27.0.91 and everything was fine.
> In both cases (auto-fill-mode & column-number-mode) there was no
> confusion in the letters, and all the words were displayed correctly.

Great, thanks for testing it.


