mumi does not correctly display (some?) non-ascii characters

Status: Open
Participants: Felix Lechner, noe, Tomas Volf
Owner: unassigned
Submitted by: Tomas Volf
Severity: normal

Tomas Volf wrote on 25 Feb 14:04 +0100
(address . bug-mumi@gnu.org)
Zds6yhPkZ0Id6SAT@ws
Hi,

when I compare mumi page[0] with debbugs page[1], the from field displays "???"
in mumi, but "???" in debbugs.

Have a nice day,
Tomas Volf


--
There are only two hard things in Computer Science:
cache invalidation, naming things and off-by-one errors.


Felix Lechner wrote on 15 May 01:12 +0200
[PATCH] Convert HTML to UTF-8 ourselves. (Closes: #69381)
(address . 69381@patchwise.org)
20240514231249.18303-1-felix.lechner@lease-up.com
This fixes a host of encoding issues in Mumi, including the diff
problems that are not mentioned in the bug. An example is here:


Returning the body as a procedure may one day be more efficient, but it
currently does not work. Based on comments in the Guile source code, the
procedure style may one day enable more advanced response formats. It is
unclear to the author why the procedure does not work. There may be a
complex interaction involving the response headers.

A preview of this code is live at patchwise.org.

The solution of this bug may depend on the patch in Bug#70907. This
patch furthermore depends on the patch in Bug#70906, but the solution
of the bug may not.
---
mumi/web/render.scm | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/mumi/web/render.scm b/mumi/web/render.scm
index 316ca4c..9b16f8d 100644
--- a/mumi/web/render.scm
+++ b/mumi/web/render.scm
@@ -28,6 +28,7 @@
   #:use-module ((ice-9 textual-ports)
                 #:select (get-string-all put-string))
   #:use-module (ice-9 match)
+  #:use-module (rnrs bytevectors)
   #:use-module (web http)
   #:use-module (web request)
   #:use-module (web response)
@@ -104,13 +105,13 @@
 (define* (render-html sxml #:key (extra-headers '()))
   (values (append extra-headers
                   '((content-type . (text/html (charset . "utf-8")))))
-          (lambda (port)
-            (sxml->html sxml port))))
+          (string->utf8
+           (sxml->html-string sxml))))
 (define (render-json json)
   (values '((content-type . (application/json (charset . "utf-8"))))
-          (lambda (port)
-            (scm->json json port))))
+          (string->utf8
+           (scm->json-string json))))
 (define (not-found uri)
   (values (build-response #:code 404)
--
2.41.0
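
The difference between the two response styles in the diff above can be illustrated outside Guile. The following is a minimal Python sketch, not mumi or Guile code: `render_html_lazy` and `render_html_eager` are hypothetical names, and the ISO-8859-1 port with "?" substitution is an assumption taken from the diagnosis given later in this thread.

```python
import io
from typing import Callable, Dict, Tuple

HEADERS: Dict[str, str] = {"content-type": "text/html; charset=utf-8"}


def render_html_lazy(html: str) -> Tuple[Dict[str, str], Callable[[io.TextIOBase], None]]:
    """Procedure style: the server calls the writer later, on a port whose
    encoding the server chose, so that encoding decides the bytes sent."""
    def writer(port: io.TextIOBase) -> None:
        port.write(html)
    return HEADERS, writer


def render_html_eager(html: str) -> Tuple[Dict[str, str], bytes]:
    """Patched style: encode to UTF-8 up front and hand over raw bytes."""
    return HEADERS, html.encode("utf-8")


page = "<p>Tomáš</p>"

# Assumed server behaviour: a text port set to ISO-8859-1 that substitutes
# "?" for anything it cannot represent.
raw = io.BytesIO()
port = io.TextIOWrapper(raw, encoding="latin-1", errors="replace", newline="")
_, write_body = render_html_lazy(page)
write_body(port)
port.flush()
lazy_bytes = raw.getvalue()

_, eager_bytes = render_html_eager(page)

print(lazy_bytes)   # "š" has been replaced by "?"
print(eager_bytes)  # valid UTF-8, matching the declared charset
```

Under these assumptions, only the eager variant produces bytes that agree with the `charset=utf-8` header, because the port encoding never gets a chance to interfere.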
Felix Lechner wrote on 15 May 01:15 +0200
(no subject)
(address . control@patchwise.org)
87a5ksvvcc.fsf@lease-up.com
block 69381 by 70906 70907
tags 69381 + patch
thanks
[PATCH] web: Use string to avoid losing unicode characters.
(address . 69381@debbugs.gnu.org)(name . Noé Lopez)(address . noelopez@free.fr)
20241102000730.3330-1-noe@xn--no-cja.eu
From: Noé Lopez <noelopez@free.fr>

I don’t really understand why the Unicode characters were lost in the
first place; maybe it is something in the sanitize-response of (fibers
web server)? Specifically, strings and procedures don’t take the same
path there.

* mumi/web/render.scm (render-html): Return string instead of procedure.
---
mumi/web/render.scm | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/mumi/web/render.scm b/mumi/web/render.scm
index 168f3bc..c28a26f 100644
--- a/mumi/web/render.scm
+++ b/mumi/web/render.scm
@@ -105,8 +105,9 @@
 (define* (render-html sxml #:key (extra-headers '()))
   (values (append extra-headers
                   '((content-type . (text/html (charset . "utf-8")))))
-          (lambda (port)
-            (sxml->html sxml port))))
+          (call-with-output-string
+            (lambda (port)
+              (sxml->html sxml port)))))
 (define (render-json json)
   (values '((content-type . (application/json)))
--
2.46.0
Noé Lopez wrote on 2 Nov 01:14 +0100
(address . 69381@debbugs.gnu.org)
87ikt6pk4o.fsf@xn--no-cja.eu
Hi,

Wanted to send this patch separately but had this issue selected in mumi
so it sent it here, oops.

I recognize this solution is not optimal (a hack even), but it should be
heavily considered as the issue is rampant among international users.

I suspect the actual issue lies in fibers, as said in the commit message,
and I’ll try to fix it there, but this patch is still important in the
meantime.

Good night,
Noé
Noé Lopez wrote on 2 Nov 03:23 +0100
(address . 69381@debbugs.gnu.org)
87froape5f.fsf@xn--no-cja.eu
Small update,

I’ve investigated the issue in fibers and I now blame the Guile web
library for it. Apparently it sets the port to ISO-8859-1
encoding each time you call read-request, but it acts like « yeah don’t
worry just use utf-8 for your body » in the docs.

That’s fine UNLESS you use chunked transfers (omitting content-length in
fibers), in which case it just decides to blow up :///// (it assumes one
character = one byte)
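
The "one character = one byte" assumption can be made concrete with a small Python sketch. This is not the Guile or fibers code; `chunk_header` is a hypothetical helper, and the sample string is chosen only for illustration. HTTP/1.1 chunked encoding prefixes each chunk with its size in bytes, written in hexadecimal, so counting characters instead under-reports any chunk containing multi-byte UTF-8 sequences.

```python
def chunk_header(payload_length: int) -> bytes:
    """Return the size line that precedes an HTTP/1.1 chunk (hex length + CRLF)."""
    return f"{payload_length:x}\r\n".encode("ascii")


body = "Tomáš"
encoded = body.encode("utf-8")

chars = len(body)      # number of characters
octets = len(encoded)  # number of bytes; "á" and "š" take two bytes each

wrong_header = chunk_header(chars)   # sized by characters (the faulty assumption)
right_header = chunk_header(octets)  # sized by bytes, as chunked encoding requires

print(chars, octets)
print(wrong_header, right_header)
```

A receiver trusting the character-based header would stop reading two bytes early here and then misparse the rest of the stream.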

In the end I’m pretty sure any of this could have been avoided by just
not replacing every character with question marks. Had it kept the
invalid bytes intact they would have translated back with no issue.
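
The contrast between substituting question marks and keeping the bytes intact can be sketched in Python. This only illustrates the general encoding behaviour, under the assumption that ISO-8859-1 maps every byte to exactly one character; it is not the Guile web library's code path.

```python
name = "Tomáš"
utf8_bytes = name.encode("utf-8")

# Lossy: characters with no ISO-8859-1 equivalent become "?" and the
# original can no longer be recovered from the output.
substituted = name.encode("latin-1", errors="replace")

# Lossless: misread the UTF-8 bytes as ISO-8859-1 (one character per byte),
# then write those characters back as ISO-8859-1 and decode as UTF-8.
misread = utf8_bytes.decode("latin-1")
recovered = misread.encode("latin-1").decode("utf-8")

print(substituted)
print(recovered == name)
```

Because every byte value has an ISO-8859-1 character, the misread string round-trips exactly, whereas the substituted bytes have already lost "š".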