Bug 1112361 - 'snapper list' shows invalid chars in fr_FR.UTF8
Status: RESOLVED FIXED
Classification: openSUSE
Product: openSUSE Tumbleweed
Component: Basesystem
Version: Current
Hardware: Other Other
Priority: P3 - Medium
Severity: Normal
Assigned To: Richard Biener
Reported: 2018-10-18 12:13 UTC by Arvin Schnell
Modified: 2022-03-22 13:23 UTC (History)
Found By: Development
Description Arvin Schnell 2018-10-18 12:13:45 UTC
The output of 'snapper list' can include invalid chars in fr_FR.UTF8:

# LANG=fr_FR.UTF8 snapper ls
  # | Type   | Pre # | Date                            | Utilisateur | Espace utilisé | Nettoyer | Description           | Données utilisateur
----+--------+-------+---------------------------------+-------------+----------------+----------+-----------------------+--------------------
[...]
 6* | single |       | jeu. 18 oct. 2018 11:07:13 CEST | root        |   1�020,00 Kio |          |                       |
Comment 1 Arvin Schnell 2018-10-18 12:19:53 UTC
I can reproduce the problem with a simple C++ program:

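#include <iostream>
#include <locale>

using namespace std;

int
main()
{
    // Use the locale from the environment (LANG, LC_ALL, ...).
    locale::global(locale(""));
    cout.imbue(locale());
    cout << 1000 << endl;
}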

The thousands separator is not displayed correctly when the locale
is fr_FR.UTF8.

LANG=fr_FR.UTF8 ./test | hexdump -C
00000000  31 e2 30 30 30 0a                                 |1.000.|
00000006

AFAIS e2 alone is not valid UTF-8.

Also see bug #1079630 and #1079855.
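
To make the "e2 alone is not valid UTF-8" observation checkable, here is a minimal sketch of a UTF-8 well-formedness test. The function name is_valid_utf8 is made up for this illustration and is not part of snapper, glibc or libstdc++. It only encodes the general UTF-8 rules: 0xE2 is the lead byte of a three-byte sequence, so it must be followed by two continuation bytes in the range 0x80..0xBF.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper for illustration only.
// Returns true if `s` is well-formed UTF-8: correct lead/continuation
// structure, no overlong forms, no surrogates, nothing above U+10FFFF.
bool is_valid_utf8(const std::string& s)
{
    std::size_t i = 0;
    while (i < s.size()) {
        unsigned char b = static_cast<unsigned char>(s[i]);
        std::size_t extra;   // continuation bytes expected after the lead byte
        unsigned long cp;    // code point being assembled
        unsigned long min;   // smallest code point allowed for this length
        if (b < 0x80)                { extra = 0; cp = b;        min = 0; }
        else if ((b & 0xE0) == 0xC0) { extra = 1; cp = b & 0x1F; min = 0x80; }
        else if ((b & 0xF0) == 0xE0) { extra = 2; cp = b & 0x0F; min = 0x800; }
        else if ((b & 0xF8) == 0xF0) { extra = 3; cp = b & 0x07; min = 0x10000; }
        else return false;   // stray continuation byte or invalid lead byte

        if (i + extra >= s.size())
            return false;    // sequence is cut off at the end of the string

        for (std::size_t k = 1; k <= extra; ++k) {
            unsigned char c = static_cast<unsigned char>(s[i + k]);
            if ((c & 0xC0) != 0x80)
                return false;    // expected a continuation byte (10xxxxxx)
            cp = (cp << 6) | (c & 0x3F);
        }

        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;    // overlong form, out of range, or surrogate

        i += extra + 1;
    }
    return true;
}
```

Fed the six bytes from the hexdump above (31 e2 30 30 30 0a), the check fails at the byte after e2, because 0x30 is not a continuation byte.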
Comment 2 Arvin Schnell 2018-10-18 12:27:53 UTC
Richard, what is your opinion here?
Comment 3 Richard Biener 2018-10-18 12:41:12 UTC
It prints

> LANG=fr_FR.UTF8 ./a.out 
1 000

for me on Leap 42.3.  And the same binary prints garbage on Tumbleweed.

glibc issue?  Is the thousands separator a multibyte character?

I ask because the C++ standard says thousands_sep is a single character.
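
The single-character constraint shows up directly in the facet interface: std::numpunct<char>::thousands_sep() returns one char. As a sketch of what that allows, the following installs a custom facet that groups digits in threes with a plain ASCII space. The names ascii_space_numpunct and format_grouped are hypothetical and exist only for this example. This is one possible application-side workaround under that assumption, not the change that was made for this bug.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet for illustration: the separator must fit in a single
// char, so a plain ASCII space is used instead of a multibyte UTF-8 space.
class ascii_space_numpunct : public std::numpunct<char>
{
protected:
    char do_thousands_sep() const override { return ' '; }
    // "\3" means: group digits in threes, repeating.
    std::string do_grouping() const override { return "\3"; }
};

// Formats `value` using the classic "C" locale plus the facet above.
// The locale takes ownership of the facet, so no delete is needed.
std::string format_grouped(long value)
{
    std::ostringstream os;
    os.imbue(std::locale(std::locale::classic(), new ascii_space_numpunct));
    os << value;
    return os.str();
}
```

With this facet, format_grouped(1000) yields "1 000" regardless of which system locales are installed, since it starts from the classic locale rather than from locale("").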
Comment 4 Arvin Schnell 2018-10-18 12:47:44 UTC
According to bug #1079855, the thousands separator changed from U+00A0
No-Break Space to U+202F Narrow No-Break Space (for Czech, but AFAIK this
was changed for many locales).
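
For reference, a small sketch of how the two code points encode in UTF-8. The helpers utf8_encode and first_byte are made-up names for this illustration. U+00A0 becomes two bytes (c2 a0) and U+202F becomes three (e2 80 af). Keeping only the first byte of the latter gives e2, which matches the hexdump in comment 1. That only the first byte is kept is an assumption drawn from that output, not something confirmed in this report.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: encode one Unicode scalar value as UTF-8.
std::string utf8_encode(char32_t cp)
{
    // Precondition: a valid scalar value (in range and not a surrogate).
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));

    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Hypothetical helper: what remains if a multibyte separator is squeezed
// into a single char by keeping only its first byte.
char first_byte(const std::string& utf8)
{
    return utf8.empty() ? '\0' : utf8[0];
}
```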
Comment 5 Richard Biener 2018-10-18 12:55:44 UTC
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87642
Comment 6 Swamp Workflow Management 2018-10-25 15:20:41 UTC
This is an autogenerated message for OBS integration:
This bug (1112361) was mentioned in
https://build.opensuse.org/request/show/644675 Factory / gcc8
Comment 7 Swamp Workflow Management 2018-10-31 11:20:06 UTC
This is an autogenerated message for OBS integration:
This bug (1112361) was mentioned in
https://build.opensuse.org/request/show/645704 Factory / gcc8
Comment 8 Richard Biener 2022-03-22 13:23:25 UTC
Fixed.