29.0.50; Tamil text not shaped in modeline

  • Done
Details
3 participants
  • Eli Zaretskii
  • Jai Vetrivelan
  • Visuwesh
Owner
unassigned
Submitted by
Visuwesh
Severity
normal
Visuwesh wrote on 2 Feb 2022 17:55
(address . bug-gnu-emacs@gnu.org)
87h79h438r.fsf@gmail.com
Tamil text is not shaped properly in the modeline. Please view the
attached file "Noto_Serif_modeline.jpg" to see what I mean. I set
"tamil" script to use "Noto Serif Tamil" in the default fontset like so,

(set-fontset-font "fontset-default" 'tamil '("Noto Serif Tamil" . "iso10646-1"))

Changing the font to "Lohit Tamil" [1] does not fix the shaping issue so
Noto Serif Tamil is not the problem (See file
"Lohit_Tamil_modeline.png").

As shown in "Noto_Serif_Expected.png", the text is shaped as I would
expect, in the minibuffer.

To see if this was specific to Tamil, I tried creating a buffer named
"????" and "???????" ("mera" and "namaskaar C-b DEL" in devanagari-itrans)
but they are shaped better (but not the same as in-buffer text). I also
set "devanagari" script to use "Noto Serif Devanagari" in the default
fontset.

I also see the problem in a GTK3 build of Emacs 27.2 (sorry, I do not
have the time to check Emacs 28 or Emacs 29 GTK3, PGTK build right now).

Attachment: Devanagari.png
In GNU Emacs 29.0.50 (build 1, x86_64-pc-linux-gnu, X toolkit, cairo version 1.16.0, Xaw scroll bars)
Repository revision: 58bb9eb4005599155a8fce8d5c5beb531a72c534
Repository branch: master
Windowing system distributor 'The X.Org Foundation', version 11.0.12013000
System Description: NixOS 21.11 (Porcupine)

Configured using:
'configure
--prefix=/nix/store/0m0yw7b3zly74ljs3qmkblb780xg03id-emacs-git-20220130.0
--disable-build-details --with-modules --with-x-toolkit=lucid
--with-xft --with-cairo --with-native-compilation'

Configured features:
CAIRO DBUS FREETYPE GIF GLIB GMP GNUTLS GSETTINGS HARFBUZZ JPEG JSON
LIBOTF LIBSELINUX LIBSYSTEMD LIBXML2 M17N_FLT MODULES NATIVE_COMP NOTIFY
INOTIFY PDUMPER PNG RSVG SECCOMP SOUND SQLITE3 THREADS TIFF
TOOLKIT_SCROLL_BARS X11 XDBE XIM XPM LUCID ZLIB

Important settings:
value of $EMACSLOADPATH:
value of $EMACSNATIVELOADPATH: /nix/store/07cbjwzil3jfhyifj15h60k7yvixzqxs-emacs-packages-deps/share/emacs/native-lisp::
value of $LANG: en_GB.UTF-8
locale-coding-system: utf-8-unix

Major mode: Group

Minor modes in effect:
gpm-mouse-mode: t
shell-dirtrack-mode: t
recentf-mode: t
gnus-undo-mode: t
eros-mode: t
pdf-occur-global-minor-mode: t
minibuffer-depth-indicate-mode: t
repeat-mode: t
display-time-mode: t
display-battery-mode: t
straight-use-package-mode: t
straight-package-neutering-mode: t
tooltip-mode: t
global-eldoc-mode: t
show-paren-mode: t
electric-indent-mode: t
mouse-wheel-mode: t
tab-bar-mode: t
file-name-shadow-mode: t
global-font-lock-mode: t
font-lock-mode: t
auto-composition-mode: linux
auto-encryption-mode: t
auto-compression-mode: t
buffer-read-only: t
indent-tabs-mode: t
transient-mark-mode: t

Load-path shadows:
/home/viz/.nix-profile/share/emacs/site-lisp/site-start hides /nix/store/07cbjwzil3jfhyifj15h60k7yvixzqxs-emacs-packages-deps/share/emacs/site-lisp/site-start
/home/viz/lib/emacs/straight/build/map/map hides /nix/store/0m0yw7b3zly74ljs3qmkblb780xg03id-emacs-git-20220130.0/share/emacs/29.0.50/lisp/emacs-lisp/map
/home/viz/lib/emacs/straight/build/let-alist/let-alist hides /nix/store/0m0yw7b3zly74ljs3qmkblb780xg03id-emacs-git-20220130.0/share/emacs/29.0.50/lisp/emacs-lisp/let-alist

Features:
(shadow emacsbug ind-util thai-util thai-word lao-util enriched
find-dired shell-command+ subword-mode-expansions cap-words superword
subword nix-mode ffap smie nix-repl nix-shell nix-store magit-section
dash nix-instantiate nix-shebang nix-format nix follow dabbrev cal-iso
wdired t-mouse term/linux org-capture doct chemtable timezone nnfolder
gnus-draft tabify man hippie-exp siege-mode comp comp-cstr log-edit
pcvs-util add-log vc url-http url-gw url-cache url-auth sendmail lacarte
icomplete ecomplete writegood-mode cal-islam holidays hol-loaddefs
mule-util cal-move calc-math calcalg2 calccomp calc-arith calc-alg
calc-aent calc-ext calc-misc calc-menu calc calc-loaddefs rect calc-macs
dired-aux pdf-sync pdf-outline pdf-links pdf-history org-agenda flyspell
ispell org-pdftools pdf-annot facemenu org-noter org-refile goto-addr
org-indent org-element avl-tree generator the-org-mode-expansions ob-C
cc-mode-expansions cc-mode cc-fonts cc-guess cc-menus cc-cmds cc-styles
cc-align cc-engine cc-vars cc-defs ob-shell shell ob-racket async
ob-async tempo ol-eww eww xdg url-queue mm-url ol-rmail ol-mhe ol-irc
ol-info ol-gnus nnselect gnus-search eieio-opt speedbar ezimage dframe
ol-docview doc-view ol-bibtex ol-bbdb ol-w3m ol-doi org-link-doi org ob
ob-tangle ob-ref ob-lob ob-table ob-exp org-macro org-footnote org-src
ob-comint org-pcomplete pcomplete org-list org-faces org-entities
org-version ob-emacs-lisp ob-core ob-eval org-table oc-basic bibtex ol
org-keys oc org-compat org-macs org-loaddefs cl-print debug backtrace
expand-region text-mode-expansions er-basic-expansions
expand-region-core expand-region-custom pulse xref view descr-text
time-stamp misearch multi-isearch shortdoc help-fns radix-tree
face-remap reveal noutline outline recentf tree-widget executable vc-git
vc-dispatcher smerge-mode diff diff-mode gnus-fun flow-fill shr-color
color mm-archive sort gnus-cite mail-extr textsec uni-scripts
idna-mapping ucs-normalize uni-confusable textsec-check gnus-async
gnus-bcklg qp gnus-ml gnutls network-stream nsm nndraft nnmh nnmaildir
nnagent nnml nnnil gnus-agent gnus-srvr gnus-score score-mode nnvirtual
gnus-msg gnus-art mm-uu mml2015 mm-view mml-smime smime dig nntp
gnus-cache gnus-sum shr pixel-fill kinsoku svg dom gnus-group gnus-undo
gnus-start gnus-dbus gnus-cloud nnimap nnmail mail-source utf7 netrc
nnoo parse-time iso8601 gnus-spec gnus-int gnus-range message yank-media
rmc puny rfc822 mml mml-sec epa epg rfc6068 epg-config mm-decode
mm-bodies mm-encode mailabbrev gmm-utils mailheader gnus-win gnus
nnheader gnus-util mail-utils range server paredit edmacro kmacro eros
time-date checkdoc lisp-mnt mail-parse rfc2231 rfc2047 rfc2045 mm-util
ietf-drums mail-prsvr flymake-proc flymake project warnings thingatpt
hl-todo writegood-mode-autoloads wordel-autoloads sokoban-autoloads
ement-autoloads ts-autoloads s-autoloads map-autoloads plz-autoloads
nov-autoloads esxml-autoloads kv-autoloads transmission-autoloads
lua-mode-autoloads nix-mode-autoloads magit-section-autoloads
dash-autoloads racket-mode-autoloads eros-autoloads
flymake-shellcheck-autoloads avy avy-autoloads siege-mode-autoloads
paredit-autoloads puni-autoloads expand-region-autoloads
filladapt-autoloads compose quail scroll-other-window
org-pdftools-autoloads org-noter-autoloads finder-inf
math-delimiters-autoloads doct-autoloads ob-async-autoloads
async-autoloads emacs-ob-racket-autoloads valign-autoloads
org-starless-autoloads cdlatex-autoloads auctex-autoloads tex-site
easy-mmode pdf-occur ibuf-ext ibuffer ibuffer-loaddefs tablist advice
tablist-filter semantic/wisent/comp semantic/wisent
semantic/wisent/wisent semantic/util-modes semantic/util semantic
semantic/tag semantic/lex semantic/fw mode-local find-func cedet
pdf-isearch let-alist pdf-misc imenu pdf-tools package browse-url url
url-proxy url-privacy url-expand url-methods url-history url-cookie
url-domsuf url-util mailcap url-handlers url-parse auth-source eieio
eieio-core eieio-loaddefs json map url-vars compile comint ansi-color
ring cus-edit wid-edit pdf-view password-cache bookmark
text-property-search pp jka-compr pdf-cache pdf-info tq pdf-util
pdf-macs image-mode dired-x dired dired-loaddefs exif
pdf-tools-autoloads let-alist-autoloads tablist-autoloads derived
mb-depth cus-load repeat visual-fill-autoloads olivetti-autoloads
hl-todo-autoloads time format-spec battery dbus filenotify xml
disp-table lacarte-autoloads shell-command-plus-autoloads rx icalendar
diary-lib diary-loaddefs cal-menu calendar cal-loaddefs
chemtable-autoloads molar-mass-autoloads pcase straight-autoloads info
cl-seq cl-extra help-mode straight cl-macs cl-loaddefs cl-lib
vz-nh-theme seq gv subr-x byte-opt bytecomp byte-compile cconv
iso-transl tooltip eldoc paren electric uniquify ediff-hook vc-hooks
lisp-float-type elisp-mode mwheel term/x-win x-win term/common-win x-dnd
tool-bar dnd fontset image regexp-opt fringe tabulated-list replace
newcomment text-mode lisp-mode prog-mode register page tab-bar menu-bar
rfn-eshadow isearch easymenu timer select scroll-bar mouse jit-lock
font-lock syntax font-core term/tty-colors frame minibuffer cl-generic
cham georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao
korean japanese eucjp-ms cp51932 hebrew greek romanian slovak czech
european ethiopic indian cyrillic chinese composite emoji-zwj charscript
charprop case-table epa-hook jka-cmpr-hook help simple abbrev obarray
cl-preloaded nadvice button loaddefs faces cus-face macroexp files
window text-properties overlay sha1 md5 base64 format env code-pages
mule custom widget keymap hashtable-print-readable backquote threads
dbusbind inotify dynamic-setting system-font-setting font-render-setting
cairo x-toolkit x multi-tty make-network-process native-compile emacs)

Memory information:
((conses 16 2101434 313772)
(symbols 48 56342 74)
(strings 32 371367 47970)
(string-bytes 1 159337297)
(vectors 16 144937)
(vector-slots 8 3674344 378355)
(floats 8 35189 1747)
(intervals 56 182734 6086)
(buffers 992 90))
Eli Zaretskii wrote on 2 Feb 2022 19:54
(name . Visuwesh)(address . visuweshm@gmail.com)(address . 53729@debbugs.gnu.org)
83r18l5cc6.fsf@gnu.org
> From: Visuwesh <visuweshm@gmail.com>
> Date: Wed, 02 Feb 2022 22:25:48 +0530
>
> Tamil text is not shaped properly in the modeline. Please view the
> attached file "Noto_Serif_modeline.jpg" to see what I mean. I set
> "tamil" script to use "Noto Serif Tamil" in the default fontset like so,
>
> (set-fontset-font "fontset-default" 'tamil '("Noto Serif Tamil" . "iso10646-1"))
>
> Changing the font to "Lohit Tamil" [1] does not fix the shaping issue so
> Noto Serif Tamil is not the problem (See file
> "Lohit_Tamil_modeline.png").
>
> As shown in "Noto_Serif_Expected.png", the text is shaped as I would
> expect, in the minibuffer.

On the mode line, the buffer name is shown in bold, so perhaps there's a
problem with the bold variant of the Noto Serif Tamil font? If you
type the same text in a buffer, but give it the 'bold' face, do you
see the same problem with buffer text as on the mode line?
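
A minimal sketch of one way to run this check: the helper below puts a sample string into a scratch buffer with the `bold` face applied, so its rendering can be compared with the mode line. The function name `my/insert-bold-sample` and the buffer name `*bold-sample*` are made up for illustration and are not part of Emacs.

```elisp
(defun my/insert-bold-sample (text)
  "Insert TEXT with the `bold' face into a scratch buffer; return the buffer.
Hypothetical helper, for illustration only."
  (let ((buf (get-buffer-create "*bold-sample*")))
    (with-current-buffer buf
      (erase-buffer)
      ;; `propertize' attaches the face to the string before insertion.
      ;; The buffer stays in `fundamental-mode', so font-lock does not
      ;; override the face.
      (insert (propertize text 'face 'bold)))
    buf))
```

Evaluating `(pop-to-buffer (my/insert-bold-sample NAME))`, with NAME being the Tamil buffer name, should display the text in bold for a side-by-side comparison.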
Visuwesh wrote on 3 Feb 2022 02:45
(name . Eli Zaretskii)(address . eliz@gnu.org)(address . 53729@debbugs.gnu.org)
87tudg7kzz.fsf@gmail.com
[ Resending message since it didn't appear in the archives. ]
Eli Zaretskii <eliz@gnu.org> writes:

Hello, Eli

>> From: Visuwesh <visuweshm@gmail.com>
>> Date: Wed, 02 Feb 2022 22:25:48 +0530
>>
>> Tamil text is not shaped properly in the modeline. Please view the
>> attached file "Noto_Serif_modeline.jpg" to see what I mean. I set
>> "tamil" script to use "Noto Serif Tamil" in the default fontset like so,
>>
>> (set-fontset-font "fontset-default" 'tamil '("Noto Serif Tamil"
>> . "iso10646-1"))
>>
>> Changing the font to "Lohit Tamil" [1] does not fix the shaping issue so
>> Noto Serif Tamil is not the problem (See file
>> "Lohit_Tamil_modeline.png").
>>
>> As shown in "Noto_Serif_Expected.png", the text is shaped as I would
>> expect, in the minibuffer.
>
> on the mode line, the buffer is shown in bold, so perhaps there's a
> problem with the bold variant of the Noto Serif Tamil font? If you
> type the same text in a buffer, but give it the 'bold' face, do you
> see the same problem with buffer text as on the mode line?

No, when I give it the bold face and insert it in a buffer, the text is
shaped properly. Then, I tried setting mode-line-format to
"???????????????" and '(:propertize "???????????????" face bold) and in both
cases, the text is shaped properly. I'm not sure where the problem is
anymore. I attached screenshots of the same as well.
Attachment: Bold_text.png
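
The mode-line experiment described in this message could be reproduced roughly as follows. This is a sketch under assumptions: `my/mode-line-test` is a hypothetical helper name, and it shows the plain and the bold variant side by side rather than one after the other.

```elisp
(defun my/mode-line-test (text)
  "Show TEXT twice in the current buffer's mode line: plain, then bold.
Hypothetical helper, for illustration only."
  ;; A mode-line construct may be a list whose elements are strings or
  ;; (:propertize ELT PROPS...) forms.
  (setq-local mode-line-format
              (list text "  " `(:propertize ,text face bold)))
  ;; Ask redisplay to redraw the mode line with the new format.
  (force-mode-line-update))
```

Running `(my/mode-line-test NAME)` in a throwaway buffer changes only that buffer's mode line; killing the buffer discards the setting.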
Eli Zaretskii wrote on 3 Feb 2022 08:39
(name . Visuwesh)(address . visuweshm@gmail.com)(address . 53729@debbugs.gnu.org)
83bkzo5rgs.fsf@gnu.org
> From: Visuwesh <visuweshm@gmail.com>
> Cc: 53729@debbugs.gnu.org
> Date: Thu, 03 Feb 2022 07:15:01 +0530
>
> > on the mode line, the buffer is shown in bold, so perhaps there's a
> > problem with the bold variant of the Noto Serif Tamil font? If you
> > type the same text in a buffer, but give it the 'bold' face, do you
> > see the same problem with buffer text as on the mode line?
>
> No, when I give it the bold face and insert it in a buffer, the text is
> shaped properly. Then, I tried setting mode-line-format to
> "???????????????" and '(:propertize "???????????????" face bold) and in both
> cases, the text is shaped properly. I'm not sure where the problem is
> anymore. I attached screenshots of the same as well.

Strange. I guess the only way of investigating this is to step with
GDB into the code which renders the mode line, and see which font
specifically is being used there?

Btw, do I understand correctly that the problem you see is the
incorrect location of the dot-like diacriticals above the letters? Or
is the problem something else? (I don't read the Tamil script.)
Visuwesh wrote on 3 Feb 2022 09:07
(name . Eli Zaretskii)(address . eliz@gnu.org)(address . 53729@debbugs.gnu.org)
877dacpe2k.fsf@gmail.com
[???????, ???????? 03 2022] Eli Zaretskii wrote:

Hello, Eli

>> From: Visuwesh <visuweshm@gmail.com>
>> Cc: 53729@debbugs.gnu.org
>> Date: Thu, 03 Feb 2022 07:15:01 +0530
>>
>> > on the mode line, the buffer is shown in bold, so perhaps there's a
>> > problem with the bold variant of the Noto Serif Tamil font? If you
>> > type the same text in a buffer, but give it the 'bold' face, do you
>> > see the same problem with buffer text as on the mode line?
>>
>> No, when I give it the bold face and insert it in a buffer, the text is
>> shaped properly. Then, I tried setting mode-line-format to
>> "???????????????" and '(:propertize "???????????????" face bold) and in both
>> cases, the text is shaped properly. I'm not sure where the problem is
>> anymore. I attached screenshots of the same as well.
>
> Strange. I guess the only way of investigating this is to step with
> GDB into the code which renders the mode line, and see which font
> specifically is being used there?
>

I suppose so but I'm confident that "Noto Serif Tamil" is the font used
in the modeline. The only other Tamil font I have installed is "Noto
Sans Tamil" and I can easily make out the difference between the two.
Font selection does not seem to be the problem, at least.

In either case, I think I can only get to this in two weeks. Is the
information in etc/DEBUG all I need (apart from the breakpoint, which
will be provided)?

> Btw, do I understand correctly that the problem you see is the
> incorrect location of the dot-like diacriticals above the letters? Or
> is the problem something else? (I don't read the Tamil script.)

You're right. In the OP, even simple combinations like ? + ? are not
rendered right: the dot should be on top of ? but in the buffer name, it
is next to it. However, Emacs seems to have no problem shaping ? + ?.
The grossest of all is ? + ? where the combined letter should be ? plus
some kind of arc that surrounds the letter i.e., ?? (hopefully Emacs
renders this fine on your end, if not, I guess I could write it down on
paper and send a picture).

I really hope the above explains the problems with shaping.
Eli Zaretskii wrote on 3 Feb 2022 10:23
(name . Visuwesh)(address . visuweshm@gmail.com)(address . 53729@debbugs.gnu.org)
83zgn8482y.fsf@gnu.org
> From: Visuwesh <visuweshm@gmail.com>
> Cc: 53729@debbugs.gnu.org
> Date: Thu, 03 Feb 2022 13:37:09 +0530
>
> > Strange. I guess the only way of investigating this is to step with
> > GDB into the code which renders the mode line, and see which font
> > specifically is being used there?
> >
>
> I suppose so but I'm confident that "Noto Serif Tamil" is the font used
> in the modeline. The only other Tamil font I have installed is "Noto
> Sans Tamil" and I can easily make out the difference between the two.
> Font selection does not seem to be the problem, at least.

It could be that Emacs selects some variant of Noto Serif Tamil (some
weight or maybe width) which causes this.

Btw, can you try this with other fonts and see if any of them displays
the buffer name correctly in the mode line?

> In either case, I think I can only get to this in two weeks. And is the
> information in etc/DEBUG all I need (except the breakpoint which will be
> provided?)?

It should get you started, yes. There are special commands defined in
src/.gdbinit that will help showing Lisp objects, and feel free to ask
for guidance if you aren't sure how to proceed or have any questions.

> You're right. In the OP, even simple combinations like ? + ? is not
> rendered right: the dot should be on top of ? but in the buffer name, it
> is next to it. However, Emacs seems to have no problem shaping ? + ?.
> The grossest of all is ? + ? where the combined letter should be ? plus
> some kind of arc that surrounds the letter i.e., ?? (hopefully Emacs
> renders this fine on your end, if not, I guess I could write it down on
> paper and send a picture).

OK, thanks.

One more question: which version of HarfBuzz do you have installed
there?
Visuwesh wrote on 3 Feb 2022 11:05
(name . Eli Zaretskii)(address . eliz@gnu.org)(address . 53729@debbugs.gnu.org)
8735l0p8nr.fsf@gmail.com
[???????, ???????? 03 2022] Eli Zaretskii wrote:

>> I suppose so but I'm confident that "Noto Serif Tamil" is the font used
>> in the modeline. The only other Tamil font I have installed is "Noto
>> Sans Tamil" and I can easily make out the difference between the two.
>> Font selection does not seem to be the problem, at least.
>
> It could be that Emacs selects some variant of Noto Serif Tamil (some
> weight or maybe width) which causes this.

Right. I know I cannot trust my eyes but it seems to be the bold font
but I will step through in gdb.

> Btw, can you try this with other fonts and see if any of them displays
> the buffer name correctly in the mode line?
>

I tried "Lohit Tamil" but the text is not shaped properly when I use it
too, and the (incorrect) shaping is different from what I observe when I
use "Noto Serif Tamil". I will try other fonts over the weekend and
report back.

>> In either case, I think I can only get to this in two weeks. And is the
>> information in etc/DEBUG all I need (except the breakpoint which will be
>> provided?)?
>
> It should get you started, yes. There are special commands defined in
> src/.gdbinit that will help showing Lisp objects, and feel free to ask
> for guidance if you aren't sure how to proceed or have any questions.
>

Thanks.

>> You're right. In the OP, even simple combinations like ? + ? is not
>> rendered right: the dot should be on top of ? but in the buffer name, it
>> is next to it. However, Emacs seems to have no problem shaping ? + ?.
>> The grossest of all is ? + ? where the combined letter should be ? plus
>> some kind of arc that surrounds the letter i.e., ?? (hopefully Emacs
>> renders this fine on your end, if not, I guess I could write it down on
>> paper and send a picture).
>
> OK, thanks.
>
> One more question: which version of HarfBuzz do you have installed
> there?

HarfBuzz 3.0.0, and if it matters, I have: Cairo 1.16.0, Pango 1.48.10.
Visuwesh wrote on 10 Feb 2022 14:09
(name . Eli Zaretskii)(address . eliz@gnu.org)(address . 53729@debbugs.gnu.org)
877da2amw9.fsf@gmail.com
[???????, ???????? 03 2022] Visuwesh wrote:

> [???????, ???????? 03 2022] Eli Zaretskii wrote:
>
>>> I suppose so but I'm confident that "Noto Serif Tamil" is the font used
>>> in the modeline. The only other Tamil font I have installed is "Noto
>>> Sans Tamil" and I can easily make out the difference between the two.
>>> Font selection does not seem to be the problem, at least.
>>
>> It could be that Emacs selects some variant of Noto Serif Tamil (some
>> weight or maybe width) which causes this.
>
> Right. I know I cannot trust my eyes but it seems to be the bold font
> but I will step through in gdb.
>
>> Btw, can you try this with other fonts and see if any of them displays
>> the buffer name correctly in the mode line?
>>
>
> I tried "Lohit Tamil" but the text is not shaped properly when I use it
> too, and the (incorrect) shaping is different from what I observe when I
> use "Noto Serif Tamil". I will try other fonts over the weekend and
> report back.
>

Things were a little busier than I thought: I finally sat down and
tested other fonts, and the situation has gotten even more confusing!

I tried all the Tamil fonts in Google fonts and to ease the checking
process, I wrote the following Elisp snippet:

(with-current-buffer (get-buffer-create "???????????????.pdf")
  (switch-to-buffer-other-window "???????????????.pdf")
  (let ((fonts '("Arima Madurai"
                 "Baloo Thambi 2"
                 "Catamaran"
                 "Coiny"
                 "Hind Madurai"
                 "Kavivanar"
                 "Meera Inimai"
                 "Mukta Malar"
                 "Oi"
                 "Pavanam"
                 "Noto Sans Tamil"
                 "Noto Serif Tamil"
                 "Lohit Tamil"))
        (i 0)
        (die nil))
    (while (not die)
      (erase-buffer)
      (insert "???????????????")
      (set-fontset-font "fontset-default" 'tamil (cons (nth i fonts) "iso10646-1"))
      (pcase (read-char-choice
              (format "%s ([n]ext, [p]rev, [q]uit): " (nth i fonts))
              '(?n ?q ?p))
        (?q (setq die t))
        (?n (setq i (mod (1+ i) (length fonts))))
        (?p (setq i (mod (1- i) (length fonts))))))))

and the buffer name in the modeline has the right shaping! This is the
case for _every_ font I tried: including Noto Serif Tamil. But if I
open a file named "???????????????.pdf", the shaping is as in the OP.

Here's a screenshot of the Emacs frame (this is in emacs -Q):
>>> In either case, I think I can only get to this in two weeks. And is the
>>> information in etc/DEBUG all I need (except the breakpoint which will be
>>> provided?)?
>>
>> It should get you started, yes. There are special commands defined in
>> src/.gdbinit that will help showing Lisp objects, and feel free to ask
>> for guidance if you aren't sure how to proceed or have any questions.
>>

Given the above strangeness, can you please instruct me to use gdb to
stab Emacs? Thanks.
Eli Zaretskii wrote on 13 Feb 2022 14:53
(name . Visuwesh)(address . visuweshm@gmail.com)(address . 53729@debbugs.gnu.org)
8335kmrhy6.fsf@gnu.org
> From: Visuwesh <visuweshm@gmail.com>
> Cc: 53729@debbugs.gnu.org
> Date: Thu, 10 Feb 2022 18:39:42 +0530
>
> I tried all the Tamil fonts in Google fonts and to ease the checking
> process, I wrote the following Elisp snippet:
>
> (with-current-buffer (get-buffer-create "???????????????.pdf")
> (switch-to-buffer-other-window "???????????????.pdf")
> (let ((fonts '("Arima Madurai"
> "Baloo Thambi 2"
> "Catamaran"
> "Coiny"
> "Hind Madurai"
> "Kavivanar"
> "Meera Inimai"
> "Mukta Malar"
> "Oi"
> "Pavanam"
> "Noto Sans Tamil"
> "Noto Serif Tamil"
> "Lohit Tamil"))
> (i 0)
> (die nil))
> (while (not die)
> (erase-buffer)
> (insert "???????????????")
> (set-fontset-font "fontset-default" 'tamil (cons (nth i fonts) "iso10646-1"))
> (pcase (read-char-choice
> (format "%s ([n]ext, [p]rev, [q]uit): " (nth i fonts))
> '(?n ?q ?p))
> (?q (setq die t))
> (?n (setq i (mod (1+ i) (length fonts))))
> (?p (setq i (mod (1- i) (length fonts))))))))
>
> and the buffer name in the modeline has the right shaping! This is the
> case for _every_ font I tried: including Noto Serif Tamil. But if I
> open a file named "???????????????.pdf", the shaping is as in the OP.

What happens if you turn off auto-composition mode before repeating
the above experiment? Do you see any difference in the buffer name
produced by you and the buffer name produced by Emacs when visiting
that file?
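
Besides comparing the two names by eye, they can also be compared as strings. The following is a sketch only: `my/compare-names` is a hypothetical helper, and the NFC check is an added assumption about one way two names could look alike yet differ (precomposed versus decomposed characters), not something established in this report.

```elisp
(require 'ucs-normalize)

(defun my/compare-names (a b)
  "Return a plist describing how the strings A and B relate.
Hypothetical helper, for illustration only."
  (list
   ;; Exact, character-by-character equality.
   :identical (string= a b)
   ;; Equality after Unicode NFC normalization of both strings.
   :same-after-nfc (string= (ucs-normalize-NFC-string a)
                            (ucs-normalize-NFC-string b))
   ;; Number of characters in each string.
   :lengths (cons (length a) (length b))))
```

For example, `(my/compare-names (buffer-name BUF1) (buffer-name BUF2))` could be evaluated with the hand-made buffer and the file-visiting buffer as BUF1 and BUF2.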
Visuwesh wrote on 13 Feb 2022 15:56
(name . Eli Zaretskii)(address . eliz@gnu.org)(address . 53729@debbugs.gnu.org)
87leyeizn2.fsf@gmail.com
[??????, ???????? 13 2022] Eli Zaretskii wrote:

>> From: Visuwesh <visuweshm@gmail.com>
>> Cc: 53729@debbugs.gnu.org
>> Date: Thu, 10 Feb 2022 18:39:42 +0530
>>
>> I tried all the Tamil fonts in Google fonts and to ease the checking
>> process, I wrote the following Elisp snippet:
>>
>> [...]
>>
>> and the buffer name in the modeline has the right shaping! This is the
>> case for _every_ font I tried: including Noto Serif Tamil. But if I
>> open a file named "???????????????.pdf", the shaping is as in the OP.
>
> What happens if you turn off auto-composition mode before repeating
> the above experiment? Do you see any difference in the buffer name
> produced by you and the buffer name produced by Emacs when visiting
> that file?

If I turn off global-auto-composition-mode and do the above, none of the
text is shaped i.e., the buffer name produced by me and the buffer name
produced by Emacs both are not shaped. Turning it on again and visiting
the file does not produce the right shaping either (but the buffer
created by me does).

I did this out of curiosity: in dired, I typed C M-n when over the file
and added ".1" to the end of new file name, and when I visit this file,
the buffer name is shaped properly. [ The new filename is
???????????????.pdf.1 ]

I'm not sure if this has to do with the filename since when I yank the
file name from dired and create an empty file (M-x
dired-create-empty-file) by that name in another directory and visit it,
Emacs shapes the buffer name properly.

In either case, here's the filename as yanked from dired:
???????????????.pdf
If this doesn't do it, then I guess I can send you the file off-list.
Eli Zaretskii wrote on 13 Feb 2022 17:44
(name . Visuwesh)(address . visuweshm@gmail.com)(address . 53729@debbugs.gnu.org)
83wnhypvh6.fsf@gnu.org
> From: Visuwesh <visuweshm@gmail.com>
> Cc: 53729@debbugs.gnu.org
> Date: Sun, 13 Feb 2022 20:26:17 +0530
>
> >> and the buffer name in the modeline has the right shaping! This is the
> >> case for _every_ font I tried: including Noto Serif Tamil. But if I
> >> open a file named "???????????????.pdf", the shaping is as in the OP.
> >
> > What happens if you turn off auto-composition mode before repeating
> > the above experiment? Do you see any difference in the buffer name
> > produced by you and the buffer name produced by Emacs when visiting
> > that file?
>
> If I turn off global-auto-composition-mode and do the above, none of the
> text is shaped i.e., the buffer name produced by me and the buffer name
> produced by Emacs both are not shaped.

Of course they aren't shaped: turning off auto-composition-mode
disables the shaping. I'm asking whether both the buffer name
produced by you and the buffer name produced by visiting that file
look identical on the mode line, or do they somehow differ? If they
do differ, what is the difference?

> I did this out of curiosity: in dired, I typed C M-n when over the file
> and added ".1" to the end of new file name, and when I visit this file,
> the buffer name is shaped properly. [ The new filename is
> ???????????????.pdf.1 ]

So you are saying that changing the file-name extension affects the
shaping on the mode line?

> I'm not sure if this has to do with the filename since when I yank the
> file name from dired and create an empty file (M-x
> dired-create-empty-file) by that name in another directory and visit it,
> Emacs shapes the buffer name properly.

Very strange. Is that the only file name with such problems? It
sounds like maybe its file name has more than meets the eye (which is
one reason why I asked you to disable auto-composition-mode).
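
To look for characters that do not show up visually, each character of a name can be listed with its code point and Unicode name. A minimal sketch, assuming the hypothetical helper name `my/describe-string-chars`:

```elisp
(defun my/describe-string-chars (string)
  "Return a list of (CODEPOINT . NAME) pairs, one per character of STRING.
Hypothetical helper, for illustration only."
  (mapcar (lambda (ch)
            (cons (format "U+%04X" ch)
                  ;; The `name' property is the character's Unicode name.
                  (get-char-code-property ch 'name)))
          string))
```

Evaluating `(my/describe-string-chars (file-name-nondirectory (buffer-file-name)))` in the affected buffer would make any invisible or unexpected character (a joiner, a format control, and so on) stand out by name.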
Visuwesh wrote on 14 Feb 2022 04:01
(name . Eli Zaretskii)(address . eliz@gnu.org)(address . 53729@debbugs.gnu.org)
87fsomi225.fsf@gmail.com
[??????, ???????? 13 2022] Eli Zaretskii wrote:

Hi Eli,

>> From: Visuwesh <visuweshm@gmail.com>
>> Cc: 53729@debbugs.gnu.org
>> Date: Sun, 13 Feb 2022 20:26:17 +0530
>>
>> >> and the buffer name in the modeline has the right shaping! This is the
>> >> case for _every_ font I tried: including Noto Serif Tamil. But if I
>> >> open a file named "???????????????.pdf", the shaping is as in the OP.
>> >
>> > What happens if you turn off auto-composition mode before repeating
>> > the above experiment? Do you see any difference in the buffer name
>> > produced by you and the buffer name produced by Emacs when visiting
>> > that file?
>>
>> If I turn off global-auto-composition-mode and do the above, none of the
>> text is shaped i.e., the buffer name produced by me and the buffer name
>> produced by Emacs both are not shaped.
>
> Of course they aren't shaped: turning off auto-composition-mode
> disables the shaping. I'm asking whether both the buffer name
> produced by you and the buffer name produced by visiting that file
> look identical on the mode line, or do they somehow differ? If they
> do differ, what is the difference?
>

I misunderstood what you meant, sorry. They look the same.

>> I did this out of curiosity: in dired, I typed C M-n when over the file
>> and added ".1" to the end of new file name, and when I visit this file,
>> the buffer name is shaped properly. [ The new filename is
>> ???????????????.pdf.1 ]
>
> So you are saying that changing the file-name extension affects the
> shaping on the mode line?
>

Yes, that seems to be the case. I tried changing the extension to
"jpeg", and the shaping was incorrect. If I completely remove the
extension, the text is shaped properly.

>> I'm not sure if this has to do with the filename since when I yank the
>> file name from dired and create an empty file (M-x
>> dired-create-empty-file) by that name in another directory and visit it,
>> Emacs shapes the buffer name properly.
>
> Very strange. Is that the only file name with such problems? It
> sounds like maybe its file name has more than meets the eye (which is
> one reason why I asked you to disable auto-composition-mode).

I'm not sure if that's the case. If I rename that file in Emacs to
"???????.pdf", the text is not shaped properly again.

Also, I found out that non-empty files (with Tamil names) don't have
their buffer name shaped properly. If I create a new empty file in
dired, the buffer name is shaped properly but that is not the case if I
rename an existing, non-empty file.
Eli Zaretskii wrote on 14 Feb 2022 15:05
(name . Visuwesh)(address . visuweshm@gmail.com)(address . 53729@debbugs.gnu.org)
83ilthpmpq.fsf@gnu.org
> From: Visuwesh <visuweshm@gmail.com>
> Cc: 53729@debbugs.gnu.org
> Date: Mon, 14 Feb 2022 08:31:38 +0530
>
> >> If I turn off global-auto-composition-mode and do the above, none of the
> >> text is shaped i.e., the buffer name produced by me and the buffer name
> >> produced by Emacs both are not shaped.
> >
> > Of course they aren't shaped: turning off auto-composition-mode
> > disables the shaping. I'm asking whether both the buffer name
> > produced by you and the buffer name produced by visiting that file
> > look identical on the mode line, or do they somehow differ? If they
> > do differ, what is the difference?
> >
>
> I misunderstood what you meant, sorry. They look the same.

And what does Emacs display if you evaluate the below in the buffer
whose file name is displayed on the mode line improperly:

(append (file-name-nondirectory (buffer-file-name)) nil)

This should produce the list of character codes that constitute the
file name; I want to see that there are no strange characters in the
file name.
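
As a rough illustration of why this check helps (the sample strings below are made up, not the file name from this report): `append` with a final `nil` argument turns a string into a list of its character codes, so an invisible character such as U+200D ZERO WIDTH JOINER shows up as an extra number even when two names look identical on screen.

```elisp
;; Hypothetical sample strings, not taken from this bug report.
(let ((plain  "ab.pdf")
      (hidden (concat "ab" (string #x200D) ".pdf"))) ; U+200D is invisible
  (list (append plain nil)    ; => (97 98 46 112 100 102)
        (append hidden nil))) ; => (97 98 8205 46 112 100 102)
```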

> >> I did this out of curiosity: in dired, I typed C M-n when over the file
> >> and added ".1" to the end of new file name, and when I visit this file,
> >> the buffer name is shaped properly. [ The new filename is
> >> ???????????????.pdf.1 ]
> >
> > So you are saying that changing the file-name extension affects the
> > shaping on the mode line?
>
> Yes, that seems to be the case. I tried changing the extension to
> "jpeg", and the shaping was incorrect. If I completely remove the
> extension, the text is shaped properly.

Looks like the rendering of the file name is affected by the non-Tamil
text that follows it?

If you display the mode-line text as a string, does it display
correctly? Here's how to do that: evaluate:

(let ((str (format-mode-line mode-line-format)))
  (remove-list-of-text-properties
   0 (length str)
   '(help-echo face mouse-face local-map display keymap)
   str)
  str)

> I'm not sure if that's the case. If I rename that file in Emacs to
> "???????.pdf", the text is not shaped properly again.
>
> Also, I found out that non-empty files (with Tamil names) don't have
> their buffer name shaped properly. If I create a new empty file in
> dired, the buffer name is shaped properly but that is not the case if I
> rename an existing, non-empty file.

Again, it sounds like whatever else is on the mode line somehow affects
the rendering of the Tamil file name.

But I cannot reproduce this on my system, so it is hard to tell what
is going on here.
Visuwesh wrote on 15 Feb 2022 02:47
(name . Eli Zaretskii)(address . eliz@gnu.org)(address . 53729@debbugs.gnu.org)
878rucaokf.fsf@gmail.com
[Monday, February 14 2022] Eli Zaretskii wrote:

>> From: Visuwesh <visuweshm@gmail.com>
>> Cc: 53729@debbugs.gnu.org
>> Date: Mon, 14 Feb 2022 08:31:38 +0530
>>
>> >> If I turn off global-auto-composition-mode and do the above, none of the
>> >> text is shaped i.e., the buffer name produced by me and the buffer name
>> >> produced by Emacs both are not shaped.
>> >
>> > Of course they aren't shaped: turning off auto-composition-mode
>> > disables the shaping. I'm asking whether both the buffer name
>> > produced by you and the buffer name produced by visiting that file
>> > look identical on the mode line, or do they somehow differ? If they
>> > do differ, what is the difference?
>> >
>>
>> I misunderstood what you meant, sorry. They look the same.
>
> And what does Emacs display if you evaluate the below in the buffer
> whose file name is displayed on the mode line improperly:
>
> (append (file-name-nondirectory (buffer-file-name)) nil)
>
> This should produce the list of character codes that constitute the
> file name; I want to see that there's no strange characters in the
> file name.

(2949 2965 3021 2985 3007 2970 3021 2970 3007 2993 2965 3009 2965 2995 3021 46 112 100 102)
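
For readers following along, the list can be turned back into a string and inspected. The sketch below is an illustration only and was not run in this thread; the variable and function names are made up. It rebuilds the name with `apply` and `string`, then uses `char-script-table` to check that every character before the extension belongs to the Tamil script, i.e. that nothing unexpected is hiding in the name.

```elisp
(require 'seq)

;; Character codes exactly as reported above.
(defvar my-reported-codes
  '(2949 2965 3021 2985 3007 2970 3021 2970 3007 2993
    2965 3009 2965 2995 3021 46 112 100 102))

(defun my-codes-to-name (codes)
  "Rebuild a file name string from the character codes in CODES."
  (apply #'string codes))

(defun my-all-tamil-p (str)
  "Return non-nil if every character of STR is in the Tamil script."
  (seq-every-p (lambda (ch) (eq (aref char-script-table ch) 'tamil))
               str))

(let ((name (my-codes-to-name my-reported-codes)))
  (list (length name)                ; total number of characters
        (file-name-extension name)   ; the ASCII extension
        (my-all-tamil-p (file-name-sans-extension name))))
```

The first 15 codes all fall in the Tamil block (U+0B80 to U+0BFF) and the last four are the ASCII characters of ".pdf", so the file name itself looks unremarkable.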

>> >> I did this out of curiosity: in dired, I typed C M-n when over the file
>> >> and added ".1" to the end of new file name, and when I visit this file,
>> >> the buffer name is shaped properly. [ The new filename is
>> >> ???????????????.pdf.1 ]
>> >
>> > So you are saying that changing the file-name extension affects the
>> > shaping on the mode line?
>>
>> Yes, that seems to be the case. I tried changing the extension to
>> "jpeg", and the shaping was incorrect. If I completely remove the
>> extension, the text is shaped properly.
>
> Looks like the rendering of the file name is affected by the non-Tamil
> text that follows it?
>

And it looks like it is not just any non-Tamil text that affects it but
only image like extensions? For example, if the extension is .txt, .sh,
.c, .el, .svg, .djvu, then the buffer name is shaped properly. But if I
use the extensions .tiff, .png, .jpeg, .pdf, .jpg, then the buffer name
is not shaped. The file I renamed was a bash script.

> If you display the mode-line text as a string, does it display
> correctly? Here's how to do that: evaluate:
>
> (let ((str (format-mode-line mode-line-format)))
> (remove-list-of-text-properties 0 (length str) '(help-echo face mouse-face local-map display keymap) str) str)
>

It is displayed correctly.

>> I'm not sure if that's the case. If I rename that file in Emacs to
>> "???????.pdf", the text is not shaped properly again.
>>
>> Also, I found out that non-empty files (with Tamil names) don't have
>> their buffer name shaped properly. If I create a new empty file in
>> dired, the buffer name is shaped properly but that is not the case if I
>> rename an existing, non-empty file.
>
> Again sounds like what's else on the mode line somehow affects the
> rendering of the Tamil file name.
>
> But I cannot reproduce this on my system, so it is hard to tell what
> is going on here.

That is really unfortunate.
Eli Zaretskii wrote on 15 Feb 2022 15:26
(name . Visuwesh)(address . visuweshm@gmail.com)(address . 53729@debbugs.gnu.org)
83y22c2olc.fsf@gnu.org
> From: Visuwesh <visuweshm@gmail.com>
> Cc: 53729@debbugs.gnu.org
> Date: Tue, 15 Feb 2022 07:17:12 +0530
>
> > Looks like the rendering of the file name is affected by the non-Tamil
> > text that follows it?
> >
>
> And it looks like it is not just any non-Tamil text that affects it but
> only image like extensions? For example, if the extension is .txt, .sh,
> .c, .el, .svg, .djvu, then the buffer name is shaped properly. But if I
> use the extensions .tiff, .png, .jpeg, .pdf, .jpg, then the buffer name
> is not shaped. The file I renamed was a bash script.

The display code which renders this doesn't know anything about the
meaning of the extensions, it only cares about the characters it needs
to display. So if image-file extensions tend to cause this, it's
because of the characters that are part of the extensions, or maybe
because having an image file visited by the buffer affects the mode
line in some other way that causes this issue.

> > But I cannot reproduce this on my system, so it is hard to tell what
> > is going on here.
>
> That is really unfortunate.

Maybe someone else can reproduce this?
Visuwesh wrote on 16 Feb 2022 15:06
(name . Eli Zaretskii)(address . eliz@gnu.org)(address . 53729@debbugs.gnu.org)
87y22a9a9j.fsf@gmail.com
[Tuesday, February 15 2022] Eli Zaretskii wrote:

>> From: Visuwesh <visuweshm@gmail.com>
>> Cc: 53729@debbugs.gnu.org
>> Date: Tue, 15 Feb 2022 07:17:12 +0530
>>
>> > Looks like the rendering of the file name is affected by the non-Tamil
>> > text that follows it?
>> >
>>
>> And it looks like it is not just any non-Tamil text that affects it but
>> only image like extensions? For example, if the extension is .txt, .sh,
>> .c, .el, .svg, .djvu, then the buffer name is shaped properly. But if I
>> use the extensions .tiff, .png, .jpeg, .pdf, .jpg, then the buffer name
>> is not shaped. The file I renamed was a bash script.
>
> The display code which renders this doesn't know anything about the
> meaning of the extensions, it only cares about the characters it needs
> to display. So if image-file extensions tend to cause this, it's
> because of the characters that are part of the extensions, or maybe
> because having an image file visited by the buffer affects the mode
> line in some other way that causes this issue.
>
>> > But I cannot reproduce this on my system, so it is hard to tell what
>> > is going on here.
>>
>> That is really unfortunate.
>
> Maybe someone else can reproduce this?

I will try reproducing it in latest master by the end of this week, and
maybe Emacs 28 pretest on Windows (compiling master on Windows is
unfortunately not feasible for me), and report back. I will also see if
I can ask someone else to reproduce this in Linux.
Eli Zaretskii wrote on 16 Feb 2022 15:07
(name . Visuwesh)(address . visuweshm@gmail.com)(address . 53729@debbugs.gnu.org)
8335ki3nwi.fsf@gnu.org
> From: Visuwesh <visuweshm@gmail.com>
> Cc: 53729@debbugs.gnu.org
> Date: Wed, 16 Feb 2022 19:36:00 +0530
>
> >> > But I cannot reproduce this on my system, so it is hard to tell what
> >> > is going on here.
> >>
> >> That is really unfortunate.
> >
> > Maybe someone else can reproduce this?
>
> I will try reproducing it in latest master by the end of this week, and
> maybe Emacs 28 pretest on Windows (compiling master on Windows is
> unfortunately not feasible for me), and report back. I will also see if
> I can ask someone else to reproduce this in Linux.

Thanks.
Visuwesh wrote on 18 Feb 2022 13:03
(name . Eli Zaretskii)(address . eliz@gnu.org)(address . 53729@debbugs.gnu.org)
87bkz4pekm.fsf@gmail.com
[Wednesday, February 16 2022] Visuwesh wrote:

> [Tuesday, February 15 2022] Eli Zaretskii wrote:
>
>>> From: Visuwesh <visuweshm@gmail.com>
>>> Cc: 53729@debbugs.gnu.org
>>> Date: Tue, 15 Feb 2022 07:17:12 +0530
>>>
>>> > Looks like the rendering of the file name is affected by the non-Tamil
>>> > text that follows it?
>>> >
>>>
>>> And it looks like it is not just any non-Tamil text that affects it but
>>> only image like extensions? For example, if the extension is .txt, .sh,
>>> .c, .el, .svg, .djvu, then the buffer name is shaped properly. But if I
>>> use the extensions .tiff, .png, .jpeg, .pdf, .jpg, then the buffer name
>>> is not shaped. The file I renamed was a bash script.
>>
>> The display code which renders this doesn't know anything about the
>> meaning of the extensions, it only cares about the characters it needs
>> to display. So if image-file extensions tend to cause this, it's
>> because of the characters that are part of the extensions, or maybe
>> because having an image file visited by the buffer affects the mode
>> line in some other way that causes this issue.
>>
>>> > But I cannot reproduce this on my system, so it is hard to tell what
>>> > is going on here.
>>>
>>> That is really unfortunate.
>>
>> Maybe someone else can reproduce this?
>
> I will try reproducing it in latest master by the end of this week, and
> maybe Emacs 28 pretest on Windows (compiling master on Windows is
> unfortunately not feasible for me), and report back. I will also see if
> I can ask someone else to reproduce this in Linux.

I have not tried reproducing it in other systems but I did try latest
master, and the text is still not shaped. However, I just noticed that
I missed this warning message that was written to stdout (which is
mostly why I did not notice it) when I close Emacs:

Warning: Missing charsets in String to FontSet conversion

I tried quickly grepping for "Missing charsets" and "Warning: " in Emacs
git repo but I don't see this particular warning in the results.

[ I realise that a lot of time could have been saved if I was more
attentive, sorry about that. ]
Eli Zaretskii wrote on 18 Feb 2022 13:59
(name . Visuwesh)(address . visuweshm@gmail.com)(address . 53729@debbugs.gnu.org)
831r001gab.fsf@gnu.org
> From: Visuwesh <visuweshm@gmail.com>
> Cc: 53729@debbugs.gnu.org
> Date: Fri, 18 Feb 2022 17:33:05 +0530
>
> I have not tried reproducing it in other systems but I did try latest
> master, and the text is still not shaped. However, I just noticed that
> I missed this warning message that was written to stdout (which is
> mostly why I did not notice it) when I close Emacs:
>
> Warning: Missing charsets in String to FontSet conversion
>
> I tried quickly grepping for "Missing charsets" and "Warning: " in Emacs
> git repo but I don't see this particular warning in the results.

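This is not an Emacs warning. Searching for it on the Internet brings
these pages, please see if anything there is relevant to your setup:

https://www.ibm.com/support/pages/using-gui-results-missing-charsets-string-fontset-conversion-warning
https://access.redhat.com/solutions/409033
https://superuser.com/questions/1531413/getting-warning-missing-charsets-in-string-to-fontset-conversion
https://gromnitsky.blogspot.com/2021/05/missing-charsets-in-string-to-fontset.html
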
After skimming those, it seems to be related to the X font setup, so
maybe it is relevant to the display problems you see. Not sure.
Visuwesh wrote on 18 Feb 2022 14:20
(name . Eli Zaretskii)(address . eliz@gnu.org)(address . 53729@debbugs.gnu.org)
8735kgpaz4.fsf@gmail.com
[Friday, February 18 2022] Eli Zaretskii wrote:

>> From: Visuwesh <visuweshm@gmail.com>
>> Cc: 53729@debbugs.gnu.org
>> Date: Fri, 18 Feb 2022 17:33:05 +0530
>>
>> I have not tried reproducing it in other systems but I did try latest
>> master, and the text is still not shaped. However, I just noticed that
>> I missed this warning message that was written to stdout (which is
>> mostly why I did not notice it) when I close Emacs:
>>
>> Warning: Missing charsets in String to FontSet conversion
>>
>> I tried quickly grepping for "Missing charsets" and "Warning: " in Emacs
>> git repo but I don't see this particular warning in the results.
>
> This is not an Emacs warning.

Ah, I assumed it was from Emacs since it contained "fontset" and
"charsets".

> Searching for it on the Internet brings these pages, please see if
> anything there is relevant to your setup:
>
> https://www.ibm.com/support/pages/using-gui-results-missing-charsets-string-fontset-conversion-warning
> https://access.redhat.com/solutions/409033
> https://superuser.com/questions/1531413/getting-warning-missing-charsets-in-string-to-fontset-conversion
> https://gromnitsky.blogspot.com/2021/05/missing-charsets-in-string-to-fontset.html
>
> After skimming those, it seems to be related to the X font setup, so
> maybe it is relevant to the display problems you see. Not sure.

Hmm, I tried setting LC_CTYPE=C which removed the warning but the
incorrect text shaping still persists... I will try reproducing in
other systems and report back.
Jai Vetrivelan wrote on 19 Feb 2022 05:20
(name . Eli Zaretskii)(address . eliz@gnu.org)
87pmnjebd1.fsf@gmail.com
Hello,

On 2022-02-15, 16:26 +0200, Eli Zaretskii <eliz@gnu.org> wrote:

>> > But I cannot reproduce this on my system, so it is hard to tell what
>> > is going on here.
>>
>> That is really unfortunate.
>
> Maybe someone else can reproduce this?

I tried reproducing the above issue and can confirm it.

The text is weirdly shaped when opening a binary file with a file
extension. There are no messages in stdout/stderr, however.

Buffer 1: Opening an image file without an extension in its file name.
Buffer 2: An ASCII text file.
Buffer 3: (generate-new-buffer "??????? ??????????")
Buffer 4: The same image as in buffer 1 but with a file extension in
the file name.

It looks as if the problem is with the file extension and not the file
itself, I might be wrong:
--
Jai Vetrivelan

In GNU Emacs 28.0.60 (build 1, x86_64-pc-linux-gnu, GTK+ Version 3.24.24, cairo version 1.16.0)
Windowing system distributor 'System Description: Guix System

Configured using:
'configure
CONFIG_SHELL=/gnu/store/pwcp239kjf7lnj5i4lkdzcfcxwcfyk72-bash-minimal-5.0.16/bin/bash
SHELL=/gnu/store/pwcp239kjf7lnj5i4lkdzcfcxwcfyk72-bash-minimal-5.0.16/bin/bash
--prefix=/gnu/store/q9siyz7hlr5vwi9lk04rywh9l8pqw5az-emacs-pgtk-native-comp-28.0.60-215.336a549
--enable-fast-install --with-native-compilation --with-pgtk
--with-xwidgets --with-modules --with-cairo --disable-build-details'

Configured features:
ACL CAIRO DBUS FREETYPE GIF GLIB GMP GNUTLS GPM GSETTINGS HARFBUZZ JPEG
JSON LIBOTF LIBSELINUX LIBXML2 MODULES NATIVE_COMP NOTIFY INOTIFY
PDUMPER PGTK PNG RSVG SECCOMP SOUND THREADS TIFF TOOLKIT_SCROLL_BARS XIM
XWIDGETS GTK3 ZLIB

Important settings:
value of $EMACSLOADPATH: /home/anon/.guix-profile/share/emacs/site-lisp:/gnu/store/q9siyz7hlr5vwi9lk04rywh9l8pqw5az-emacs-pgtk-native-comp-28.0.60-215.336a549/share/emacs/28.0.60/lisp
value of $LANG: en_US.utf8
locale-coding-system: utf-8-unix

Major mode: Lisp Interaction

Minor modes in effect:
tooltip-mode: t
global-eldoc-mode: t
eldoc-mode: t
show-paren-mode: t
electric-indent-mode: t
mouse-wheel-mode: t
tool-bar-mode: t
menu-bar-mode: t
file-name-shadow-mode: t
global-font-lock-mode: t
font-lock-mode: t
blink-cursor-mode: t
auto-composition-mode: t
auto-encryption-mode: t
auto-compression-mode: t
line-number-mode: t
indent-tabs-mode: t
transient-mark-mode: t

Load-path shadows:
/gnu/store/0vhklp6c0npzf5xv8jx0iw857sglm77i-emacs-transient-0.3.7/share/emacs/site-lisp/transient-0.3.7/transient hides /gnu/store/q9siyz7hlr5vwi9lk04rywh9l8pqw5az-emacs-pgtk-native-comp-28.0.60-215.336a549/share/emacs/28.0.60/lisp/transient

Features:
(shadow sort mail-extr emacsbug message rmc puny dired dired-loaddefs
rfc822 mml mml-sec epa derived epg rfc6068 epg-config gnus-util rmail
rmail-loaddefs auth-source cl-seq eieio eieio-core cl-macs
eieio-loaddefs password-cache json map text-property-search time-date
subr-x seq byte-opt gv bytecomp byte-compile cconv mm-decode mm-bodies
mm-encode mail-parse rfc2231 mailabbrev gmm-utils mailheader cl-loaddefs
cl-lib sendmail rfc2047 rfc2045 ietf-drums mm-util mail-prsvr mail-utils
iso-transl tooltip eldoc paren electric uniquify ediff-hook vc-hooks
lisp-float-type elisp-mode mwheel term/pgtk-win pgtk-win term/common-win
tool-bar dnd fontset image regexp-opt fringe tabulated-list replace
newcomment text-mode lisp-mode prog-mode register page tab-bar menu-bar
rfn-eshadow isearch easymenu timer select scroll-bar mouse jit-lock
font-lock syntax font-core term/tty-colors frame minibuffer cl-generic
cham georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao
korean japanese eucjp-ms cp51932 hebrew greek romanian slovak czech
european ethiopic indian cyrillic chinese composite emoji-zwj charscript
charprop case-table epa-hook jka-cmpr-hook help simple abbrev obarray
cl-preloaded nadvice button loaddefs faces cus-face macroexp files
window text-properties overlay sha1 md5 base64 format env code-pages
mule custom widget hashtable-print-readable backquote threads
xwidget-internal dbusbind inotify dynamic-setting system-font-setting
font-render-setting cairo move-toolbar gtk x-toolkit pgtk multi-tty
make-network-process native-compile emacs)

Memory information:
((conses 16 73225 6095)
(symbols 48 6645 0)
(strings 32 19871 2480)
(string-bytes 1 721258)
(vectors 16 14961)
(vector-slots 8 306690 9732)
(floats 8 22 34)
(intervals 56 712 0)
(buffers 992 11))
Visuwesh wrote on 19 Feb 2022 05:51
(name . Jai Vetrivelan)(address . jaivetrivelan@gmail.com)
87ley71msm.fsf@gmail.com
[Saturday, February 19 2022] Jai Vetrivelan wrote:

> Hello,
>
> On 2022-02-15, 16:26 +0200, Eli Zaretskii <eliz@gnu.org> wrote:
>
>>> > But I cannot reproduce this on my system, so it is hard to tell what
>>> > is going on here.
>>>
>>> That is really unfortunate.
>>
>> Maybe someone else can reproduce this?
>
> I tried reproducing the above issue and can confirm it.
>
> The text is weirdly shaped when opening a binary file with file
> extension. There are no messages in stdout/stderr, however.
>
>
>
> Buffer 1: Opening a image file without extension in its file name.
> Buffer 2: A ASCII text file.
> Buffer 3: (generate-new-buffer "??????? ??????????")
> Buffer 4: The same image as in buffer one but with file extension in
> the file name.
>
> It looks as if the problem is with the file extension and not the file
> itself, I might be wrong:

Thank you, Jai, for reproducing; I was finding it hard to find someone
who can reproduce it. Can you also see if Telugu and Kannada text are
shaped properly? Here, I don't see them shaped properly.

[ I cannot read either of the languages, so I'm just comparing the
shaping in HELLO file and the modeline but I can get someone who can
read Telugu, and Kannada. ]

Also, can you report your harfbuzz version? I'd like to make sure if my
harfbuzz version (3.0.0) is the problem before I spend time updating my
system, thanks.
Jai Vetrivelan wrote on 19 Feb 2022 06:22
(name . Visuwesh)(address . visuweshm@gmail.com)
87a6ene8hg.fsf@gmail.com
Hello,

On 2022-02-19, 10:21 +0530, Visuwesh <visuweshm@gmail.com> wrote:

> Can you also see if Telugu and Kannada text are shaped properly?
> Here, I don't see them shaped properly.

I am NOT familiar with either language, so I am not sure if they
display correctly.

> Also, can you report your harfbuzz version?

3.0.0
--
Jai Vetrivelan
Jai Vetrivelan wrote on 19 Feb 2022 06:31
(name . Visuwesh)(address . visuweshm@gmail.com)
8735kfe82y.fsf@gmail.com
Emacs seems to be built with harfbuzz version 2.8.2 and not 3.
--
Jai Vetrivelan
Eli Zaretskii wrote on 19 Feb 2022 09:23
(name . Jai Vetrivelan)(address . jaivetrivelan@gmail.com)
83o833z2m3.fsf@gnu.org
> From: Jai Vetrivelan <jaivetrivelan@gmail.com>
> Cc: Visuwesh <visuweshm@gmail.com>, 53729@debbugs.gnu.org
> Date: Sat, 19 Feb 2022 09:50:10 +0530
>
> I tried reproducing the above issue and can confirm it.
>
> The text is weirdly shaped when opening a binary file with file
> extension. There are no messages in stdout/stderr, however.

Thanks. I think I see the problem. If I'm right, file-name
extensions are not related to this.

Please try this much simpler reproducer:

emacs -Q
C-x b ??????? ?????????? RET

At this point you should see a buffer whose name is "??????? ??????????"
and whose mode line displays the buffer name correctly.

M-: (set-buffer-multibyte nil) RET

Now the buffer's name should display incorrectly on the mode line.

IOW, the problem happens when the buffer is unibyte, probably because
we disable text shaping (a.k.a. "character compositions") in that
case.
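
The recipe can also be sketched non-interactively. This is an illustration only: the helper name is hypothetical, and the Tamil buffer name is an arbitrary one built from code points in the Tamil block, not the name used in the recipe above. It shows that `set-buffer-multibyte` changes only how the buffer's text is represented, while the buffer name itself remains a multibyte string.

```elisp
(defun my-unibyte-buffer-state (name)
  "Create buffer NAME, make it unibyte, and report its state.
Return (MULTIBYTE-BUFFER-P MULTIBYTE-NAME-P), then kill the buffer."
  (with-current-buffer (get-buffer-create name)
    (set-buffer-multibyte nil)
    (prog1 (list enable-multibyte-characters
                 (multibyte-string-p (buffer-name)))
      (kill-buffer))))

;; Hypothetical Tamil name (U+0BA4 U+0BAE U+0BBF U+0BB4 U+0BCD).
(my-unibyte-buffer-state (string 2980 2990 3007 2996 3021))
;; => (nil t)
```

Evaluating `enable-multibyte-characters` with `M-:` in an affected buffer is a quick way to see whether that buffer is unibyte.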
Eli Zaretskii wrote on 19 Feb 2022 10:22
(address . 53729@debbugs.gnu.org)
83fsofyzw2.fsf@gnu.org
> Date: Sat, 19 Feb 2022 10:23:32 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: visuweshm@gmail.com, 53729@debbugs.gnu.org
>
> Please try this much simpler reproducer:
>
> emacs -Q
> C-x b ??????? ?????????? RET
>
> At this point you should see a buffer whose name is "??????? ??????????"
> and whose mode line displays the buffer name correctly.
>
> M-: (set-buffer-multibyte nil) RET
>
> Now the buffer's name should display incorrectly on the mode line.
>
> IOW, the problem happens when the buffer is unibyte, probably because
> we disable text shaping (a.k.a. "character compositions") in that
> case.

I hope I've now fixed this on the master branch, please test.
Visuwesh wrote on 19 Feb 2022 11:27
(name . Eli Zaretskii)(address . eliz@gnu.org)
87zgmnw3r1.fsf@gmail.com
[Saturday, February 19 2022] Eli Zaretskii wrote:

>> Date: Sat, 19 Feb 2022 10:23:32 +0200
>> From: Eli Zaretskii <eliz@gnu.org>
>> Cc: visuweshm@gmail.com, 53729@debbugs.gnu.org
>>
>> Please try this much simpler reproducer:
>>
>> emacs -Q
>> C-x b ??????? ?????????? RET
>>
>> At this point you should see a buffer whose name is "??????? ??????????"
>> and whose mode line displays the buffer name correctly.
>>
>> M-: (set-buffer-multibyte nil) RET
>>
>> Now the buffer's name should display incorrectly on the mode line.
>>
>> IOW, the problem happens when the buffer is unibyte, probably because
>> we disable text shaping (a.k.a. "character compositions") in that
>> case.
>
> I hope I've now fixed this on the master branch, please test.

Can confirm that it is fixed. Thanks a lot, Eli!
Eli Zaretskii wrote on 19 Feb 2022 13:38
(name . Visuwesh)(address . visuweshm@gmail.com)
837d9ryqsi.fsf@gnu.org
> From: Visuwesh <visuweshm@gmail.com>
> Cc: jaivetrivelan@gmail.com, 53729@debbugs.gnu.org
> Date: Sat, 19 Feb 2022 15:57:14 +0530
>
> [Saturday, February 19 2022] Eli Zaretskii wrote:
>
> >> IOW, the problem happens when the buffer is unibyte, probably because
> >> we disable text shaping (a.k.a. "character compositions") in that
> >> case.
> >
> > I hope I've now fixed this on the master branch, please test.
>
> Can confirm that it is fixed. Thanks a lot, Eli!

Thanks, closing.
Closed