bug#42099: Emacs -nw Turkish Layout Problem

Yigit Emre Sahinoglu

When running Emacs in a terminal with the -nw option (with or without -Q),
some characters do not work properly; it is as if the keyboard layout were
half broken. Characters such as ö, ç and ü work, but ş (U+015F), Ş (U+015E),
ğ (U+011F) and Ğ (U+011E) do not. Typing ş inserts _ and typing Ş inserts ^.
i works, but typing İ inserts 0. Changing the coding system to cp1254
(Windows Turkish) has no effect. The problem was first reported at
https://www.reddit.com/r/emacs/comments/hgvcth/cant_type_the_characters_%C5%9F_or_%C4%9F_in_emacs_nw_on/


In GNU Emacs 27.0.91 (build 1, x86_64-w64-mingw32)
 of 2020-04-20 built on CIRROCUMULUS
Repository revision: c36c5a3dedbb2e0349be1b6c3b7567ea7b594f1c
Repository branch: HEAD
Windowing system distributor 'Microsoft Corp.', version 10.0.19042
System Description: Microsoft Windows 10 Pro (v10.0.2009.19042.330)

Recent messages:
For information about GNU Emacs and the GNU system, type C-h C-a.

Configured using:
 'configure --without-dbus --host=x86_64-w64-mingw32
 --without-compress-install 'CFLAGS=-O2 -static''

Configured features:
XPM JPEG TIFF GIF PNG RSVG SOUND NOTIFY W32NOTIFY ACL GNUTLS LIBXML2
HARFBUZZ ZLIB TOOLKIT_SCROLL_BARS MODULES THREADS PDUMPER LCMS2 GMP

Important settings:
  value of $LANG: ENU
  locale-coding-system: cp1252

Major mode: Lisp Interaction

Minor modes in effect:
  tooltip-mode: t
  global-eldoc-mode: t
  eldoc-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t
  transient-mark-mode: t

Load-path shadows:
None found.

Features:
(shadow sort mail-extr emacsbug message rmc puny dired dired-loaddefs
format-spec rfc822 mml easymenu mml-sec password-cache epa derived epg
epg-config gnus-util rmail rmail-loaddefs text-property-search time-date
subr-x seq byte-opt gv bytecomp byte-compile cconv mm-decode mm-bodies
mm-encode mail-parse rfc2231 mailabbrev gmm-utils mailheader cl-loaddefs
cl-lib sendmail rfc2047 rfc2045 ietf-drums mm-util mail-prsvr mail-utils
term/w32console tooltip eldoc electric uniquify ediff-hook vc-hooks
lisp-float-type mwheel dos-w32 ls-lisp disp-table term/w32-win w32-win
w32-vars term/common-win tool-bar dnd fontset image regexp-opt fringe
tabulated-list replace newcomment text-mode elisp-mode lisp-mode
prog-mode register page tab-bar menu-bar rfn-eshadow isearch timer
select scroll-bar mouse jit-lock font-lock syntax facemenu font-core
term/tty-colors frame minibuffer cl-generic cham georgian utf-8-lang
misc-lang vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms
cp51932 hebrew greek romanian slovak czech european ethiopic indian
cyrillic chinese composite charscript charprop case-table epa-hook
jka-cmpr-hook help simple abbrev obarray cl-preloaded nadvice loaddefs
button faces cus-face macroexp files text-properties overlay sha1 md5
base64 format env code-pages mule custom widget hashtable-print-readable
backquote threads w32notify w32 lcms2 multi-tty make-network-process
emacs)

Memory information:
((conses 16 43500 9407)
 (symbols 48 6001 1)
 (strings 32 15207 1533)
 (string-bytes 1 504344)
 (vectors 16 6689)
 (vector-slots 8 78176 6458)
 (floats 8 18 309)
 (intervals 56 191 0)
 (buffers 1000 11))


Yiğit Emre Şahinoğlu

Eli Zaretskii
> From: Yigit Emre Sahinoglu <[hidden email]>
> Date: Sun, 28 Jun 2020 04:23:59 +0300
>
> When running Emacs in a terminal with the -nw option (with or without -Q),
> some characters do not work properly; it is as if the keyboard layout were
> half broken. Characters such as ö, ç and ü work, but ş (U+015F), Ş (U+015E),
> ğ (U+011F) and Ğ (U+011E) do not. Typing ş inserts _ and typing Ş inserts ^.
> i works, but typing İ inserts 0. Changing the coding system to cp1254
> (Windows Turkish) has no effect. The problem was first reported at
> https://www.reddit.com/r/emacs/comments/hgvcth/cant_type_the_characters_%C5%9F_or_%C4%9F_in_emacs_nw_on/

What are the console input and output codepages on that system?  If you
are not sure, use the functions w32-get-console-codepage and
w32-get-console-output-codepage to find out.

Also, what is the font that the Command Prompt window is using on that
system, and does it support those characters?  What happens if you
configure the Command Prompt window to use the Lucida Console font,
for example?

And finally, please go to the place where you insert one of the
problematic characters and type "C-x =" -- does Emacs show in the echo
area the correct codepoint?

From the Reddit discussion, I understand that you are using some
non-default console program to display text-mode frames.  If so,
please make sure to try all of the above from the default Command
Prompt window that runs the default Windows shell cmd.exe.




Eli Zaretskii
[Please keep the bug address on the CC line.]

> From: Yigit Emre Sahinoglu <[hidden email]>
> Date: Sun, 28 Jun 2020 18:21:17 +0300
>
> w32-get-console-codepage,  w32-get-console-output-codepage  = 437 (#o665, #x1b5) (for GUI and "-Q
> -nw")

This is part of the problem, I think: codepage 437 doesn't support
Turkish characters.
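
The coverage gap can be checked outside Emacs. Below is a minimal Python sketch, assuming that Python's bundled cp437, cp857 and cp1254 codecs match the Windows codepages with the same numbers; the helper name `encodable` is made up for this illustration.

```python
# Turkish letters mentioned in this report (plain "i" is omitted: it is ASCII).
TURKISH = "öÖçÇüÜşŞğĞİ"


def encodable(chars, codec):
    """Return the characters of `chars` that `codec` is able to encode."""
    supported = []
    for ch in chars:
        try:
            ch.encode(codec)
        except UnicodeEncodeError:
            continue  # this codepage has no slot for the character
        supported.append(ch)
    return "".join(supported)


for codec in ("cp437", "cp857", "cp1254"):
    ok = encodable(TURKISH, codec)
    missing = "".join(ch for ch in TURKISH if ch not in ok)
    print(f"{codec}: supported={ok} missing={missing or '-'}")
```

Under that assumption, cp437 covers ö, Ö, ç, Ç, ü and Ü but none of ş, Ş, ğ, Ğ or İ, which lines up with the split between working and broken characters in the report.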

> The Command Prompt fonts and encodings are fine on the system side. As far as I know, the encodings are
> fine for Emacs too (the GUI doesn't have this problem, and the problem persists with -Q).

The GUI display is completely different from the -nw display in this
aspect, so the fact that GUI frames display correctly doesn't mean
anything regarding the text-mode display issues.

> I inserted all the Turkish characters into a text file. After opening that file with "emacs -Q -nw" I get this:
>
> Original:        ö Ö ç Ç ş      Ş      i İ      ğ      Ğ      ü Ü
> Emacs response:  ö Ö ç Ç \u015F \u015E i \u0130 \u011F \u011E ü Ü

This is clearly a display problem: Emacs uses the \u Unicode escapes
because it knows that codepage 437 cannot display those characters.
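
The same effect can be imitated in a few lines of Python. This is only an analogy under stated assumptions: it uses Python's "backslashreplace" error handler, not whatever Emacs does internally, and the function name `show_on_console` is hypothetical. Python writes the escapes in lowercase hex, whereas the Emacs output quoted above uses uppercase.

```python
def show_on_console(text, codepage):
    # Encode for a console limited to `codepage`; anything the codepage
    # cannot represent becomes a backslash escape instead of raising.
    raw = text.encode(codepage, errors="backslashreplace")
    # Decode again to see what such a console could actually show.
    return raw.decode(codepage)


line = "ö Ö ç Ç ş Ş i İ ğ Ğ ü Ü"
print(show_on_console(line, "cp437"))  # the five uncovered letters come out as escapes
print(show_on_console(line, "cp857"))  # every letter survives unchanged
```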

This is not the problem reported by the OP in Reddit.  There, the
problem seems to be one of _typing_ Turkish characters: he types a
character, but instead of inserting it Emacs invokes the 'undo'
command.  I think that problem is due to some strange interaction
between the non-default console emulator he uses and the Turkish
letters.




Eli Zaretskii
> From: Yigit Emre Sahinoglu <[hidden email]>
> Date: Sun, 28 Jun 2020 20:54:45 +0300
>
> For some reason I cannot edit cc from gmail (Either gui changed or i
> cannot see it. Sorry for the trouble.).

Use "Reply All".

> I like helping people on Reddit when I can. When I tried to solve his
> problem, I ran into the same problem myself. I tried every terminal
> available on Windows (native or emulated), but the problem persists.
> He uses Win 7, I use Win 10. He uses Emacs 25.2.1 and 25.3.1,
> I use 27.0.91. He uses cmder (an emulator) and cmd.exe; I use
> Powershell 7,6 , cmd.exe with and without Windows Terminal (emulator).

Are you saying that you see the same problem as the OP?  Because the
problem you described earlier was different.

If you start "emacs -Q -nw" in a cmd.exe window, then type ş, what
happens?  And what does "C-h l" report after that?

Also, what happens if you type "C-x 8 RET 15f RET", do you see the ş
character on display, or do you see something else?

> I don't know how Emacs works internally. But if Emacs reads some kind
> of keyboard layout table, as QMK Firmware does, that table is somehow
> messed up when using "-nw" (or `X resources` on Windows somehow send the
> correct signals to Emacs, so the keyboard layout is fine in the GUI).
> If you can tell me where the keyboard layout files are (if Emacs reads
> them from files), I can look at them.

There are no keyboard layout files on Windows.  Does it help to
evaluate this:

  M-: (w32-set-console-codepage 857) RET

If that doesn't help, try this instead:

  M-: (w32-set-console-codepage 1254) RET
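
Why the choice of codepage matters can be seen at the byte level: the same byte stands for different characters under different codepages. The sketch below assumes that Python's cp437, cp857 and cp1254 codecs mirror the Windows codepages; the helper `reinterpret` is a made-up name, and nothing here claims to reproduce what the Windows console or Emacs actually does.

```python
def reinterpret(ch, source, target="cp437"):
    """Encode `ch` in codepage `source`, then read that byte as `target`."""
    byte = ch.encode(source)
    return byte, byte.decode(target)


for ch in "şŞğĞİ":
    for source in ("cp857", "cp1254"):
        byte, as_437 = reinterpret(ch, source)
        print(f"{ch} U+{ord(ch):04X}: {source} byte 0x{byte[0]:02X}, read as cp437: {as_437}")
```

For example, ş is byte 0x9F in cp857, and that byte is ƒ when interpreted as cp437, so a mismatch between the codepage in use and the codepage assumed yields wrong characters rather than an error.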




Yigit Emre Sahinoglu
I downloaded Thunderbird just to fix the Cc problem. ^_^

> Are you saying that you see the same problem as the OP?  Because the
> problem you described earlier was different.
Yes, I have the same problem. I described the problem as I experienced
it, but he is the one who found the bug (I like crediting sources).
Sorry for the misunderstanding.

All the testing I reported in my previous messages was done under
"emacs -Q -nw", but I will answer again point by point.

> If you start "emacs -Q -nw" in a cmd.exe window, then type ş, what
> happens?
"ş" turns into "_".


> And what does "C-h l" report after that?
_               ;; self-insert-command
C-h k        ;; view-lossage


> Also, what happens if you type "C-x 8 RET 15f RET", do you see the ş
> character on display, or do you see something else?
\u015F


> There are no keyboard layout files on Windows.  Does it help to
> evaluate this:
>
>    M-: (w32-set-console-codepage 857) RET
>
> If that doesn't help, try this instead:
>
>    M-: (w32-set-console-codepage 1254) RET
Still the same. No change.






Yigit Emre Sahinoglu
And Thunderbird causes formatting issues... Sorry about that. I disabled
one of my browser extensions and can now see the "Reply All" button in
Gmail. It won't happen again. =(

Best regards,
Yiğit Emre Şahinoğlu





Eli Zaretskii
> Cc: [hidden email]
> From: Yigit Emre Sahinoglu <[hidden email]>
> Date: Sun, 28 Jun 2020 21:39:34 +0300
>
> > If you start "emacs -Q -nw" in a cmd.exe window, then type ş, what
> > happens?
> "ş" turns into "_".
>
>
> > And what does "C-h l" report after that?
> _               ;; self-insert-command
> C-h k        ;; view-lossage
>
>
> > Also, what happens if you type "C-x 8 RET 15f RET", do you see the ş
> > character on display, or do you see something else?
> \u015F

So there are two problems: both keyboard input and display of Turkish
characters are broken.  Is this still on a system where
w32-get-console-codepage returns 437?

Does anything change regarding typing Turkish characters if you set
w32-use-fallback-wm-chars-method to a non-nil value?




Yigit Emre Sahinoglu
w32-get-console-codepage = 437 (#o665, #x1b5). The problem still exists
as I described (ö, ç, ü work, but ş, İ (types 0) and ğ are broken).

"M-: (setq w32-use-fallback-wm-chars-method t)" changes nothing; the
problem still exists. (I checked with describe-variable that the value
was set correctly.)





Eli Zaretskii
> From: Yigit Emre Sahinoglu <[hidden email]>
> Date: Sun, 28 Jun 2020 22:46:39 +0300
> Cc: [hidden email]
>
> w32-get-console-codepage = 437 (#o665, #x1b5). Problem still exists as
> I described (ö, ç, ü works but ş, İ (types 0), ğ are broken.)

OK.  So I think the problem might be in the setup of the Windows
system.  If you type "chcp RET" in the Command Prompt window, does it
also show codepage 437?  And if you type the problematic characters
into the Command Prompt window (outside of Emacs) at the cmd.exe
prompt, do you also see those characters displayed incorrectly?




Yigit Emre Sahinoglu
chcp returns "Active code page: 437".

All the characters are displayed correctly in cmd.exe. I'm sure the
problem is specific to Emacs on Windows (maybe mingw does something
finicky). WSL Emacs works perfectly fine with "emacs -Q -nw", even from
cmd.exe. See the attachments.



Attachments: cmd_XAWZQ3ayEB.png (34K), cmd_3LO9oNHOnO.png (34K)

Eli Zaretskii
> Cc: [hidden email]
> From: Yigit Emre Sahinoglu <[hidden email]>
> Date: Mon, 29 Jun 2020 17:42:13 +0300
>
> chcp returns "Active code page: 437"
>
> All chars displayed correctly on cmd.exe. It's something about Emacs
> Windows (maybe mingw do something finicky), I'm sure of that.

As long as the codepage reported to Emacs is 437, you will be able
neither to see nor to input Turkish characters.  The question is why
this codepage is being returned, when the system evidently uses a
different encoding...

This is Windows 10, right?  Do you by any chance have the UTF-8 support
feature enabled?

You could also try this, once inside Emacs:

  C-x RET t cp857 RET
  C-x RET k cp857 RET

Does that help?




Yigit Emre Sahinoglu
>   C-x RET t cp857 RET
>   C-x RET k cp857 RET

That does not help. The problem still exists.

> This is Windows 10, right?  Do you per chance have the UTF-8 support
> feature enabled?

I use Windows 10. I could not find a UTF-8 support feature (I looked under `Windows features on or off` in the Control Panel), but I'm pretty sure that UTF-8 support exists. The only problematic software I'm aware of is `emacs -nw`.


Yiğit Emre Şahinoğlu



Eli Zaretskii
> From: Yigit Emre Sahinoglu <[hidden email]>
> Date: Mon, 29 Jun 2020 20:09:57 +0300
> Cc: [hidden email]
>
> I forgot to say this, when I enter `chcp 857` and then enter emacs, the problem still exists even though the
> system returns chcp 857.

Thanks.

I think the next step is for someone to run Emacs under GDB and see
what kind of character codes Windows feeds us on the Turkish Windows
systems.  The relevant function is key_event in w32inevt.c.  I guess
there's some basic misunderstanding of what happens there on these
Windows systems.
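
One pattern worth keeping in mind while inspecting those character codes: each substitution reported in this thread equals the low 8 bits of the typed character's Unicode code point. The Python sketch below only checks that arithmetic; it is an observation about the reported symptoms, not a diagnosis confirmed anywhere in this thread, and `low_byte_char` is a name made up for the illustration.

```python
# Substitutions reported in this thread: typed character -> what Emacs inserted.
REPORTED = {"ş": "_", "Ş": "^", "İ": "0"}


def low_byte_char(ch):
    """Return the character whose code is the low 8 bits of `ch`'s code point."""
    return chr(ord(ch) & 0xFF)


for typed, seen in REPORTED.items():
    low = ord(typed) & 0xFF
    print(f"{typed} U+{ord(typed):04X}: low byte 0x{low:02X} = "
          f"{low_byte_char(typed)!r}, reported {seen!r}")

# ğ and Ğ reduce to control codes rather than printable characters.
for typed in "ğĞ":
    print(f"{typed} U+{ord(typed):04X}: low byte 0x{ord(typed) & 0xFF:02X}")
```

ğ (U+011F) reduces to 0x1F, the control character C-_, which Emacs binds to undo by default; that may be related to the 'undo' behavior described earlier in this thread. ö, ç and ü all lie below U+0100, so taking the low byte leaves them unchanged.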




Yigit Emre Sahinoglu

"Ah, shit here we go again..." CJ

I'm not very good with debuggers, but I'm going to try it and send whatever information I can provide.
