[FM Discuss] UTF-8 stated but not used

Tuukka Hastrup Tuukka.Hastrup at iki.fi
Thu Jan 20 01:32:57 PST 2011


Edward Cherlin wrote:
> Our pages claim to be in Unicode UTF-8, but some do not display as
> such. For example,

Fortunately, this problem seems to appear only on the chapter indexes;
the chapter content is fine.

> http://translate.flossmanuals.net/bin/view/ActivitiesGuideSugar_es/WebHome
> �QU� ES SUGAR?
> (Not in an encoding I recognize)
> ¿QUÉ ES SUGAR?

Something weird has happened here: the content *is* valid UTF-8, but each
non-ASCII character has been replaced with the Unicode character U+FFFD
"Replacement Character", so the original letters are gone. Maybe by cut
and paste, or by Python code such as
str_in_latin1.decode("ascii", "replace").
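
To illustrate the second guess, here is a minimal sketch of how that kind of
decode call would produce exactly this symptom. It is written for Python 3
(where the call is bytes.decode rather than Python 2's str.decode), and the
variable names are made up for the example; I have not seen the code that
actually generates the index pages.

```python
# Illustrative only: variable names are hypothetical, not from the site's code.
original = "¿QUÉ ES SUGAR?"

# Assume the heading is held as Latin-1 bytes at some point in the pipeline.
latin1_bytes = original.encode("latin-1")

# Decoding as ASCII with the "replace" error handler turns every
# non-ASCII byte into U+FFFD instead of raising UnicodeDecodeError.
damaged = latin1_bytes.decode("ascii", "replace")

# The damaged text still encodes to valid UTF-8, so the page is
# "in UTF-8" even though the original letters have been lost.
served = damaged.encode("utf-8")

print(damaged == "\ufffdQU\ufffd ES SUGAR?")
print(served.count(b"\xef\xbf\xbd"))  # UTF-8 bytes of U+FFFD
```

If this is what happened, the fix has to be made where the text is decoded,
since the lost characters cannot be recovered from the output.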


