Aug 17, 2011 3:48 AM

Joined: Apr 2011
Posts: 7
I noticed that sometimes the XML response contains invalid unicode.

For example this call:

returns as one of the titles: "Onegai My Melody Kirara�".
The actual title should be: "Onegai My Melody Sukkiri♪".

Someone once commented that this is probably because the wrong PHP function was used (htmlentities() instead of htmlspecialchars()).

For a more detailed description see:
Modified by StackedCrooked, Aug 17, 2011 3:54 AM
Aug 17, 2011 3:51 PM

Joined: May 2008
Posts: 4070
Yeah, the encoding problems need to be addressed. Switching to htmlspecialchars() would certainly be less disruptive to the unicode portions of the data, and use of htmlentities() is the likely culprit. The XML feed is served as UTF-8, so there is no reason to convert characters other than < > ' " & ... and possibly invalid unicode (control characters and such, which could just be stripped with iconv() or mb_convert_encoding(), or even a regular expression).
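
A rough sketch of what that could look like on the server side, purely as an illustration. The helper name xml_escape_utf8() is made up and is not taken from the API's actual code. It assumes PHP 7+ (for the type hints; the ENT_SUBSTITUTE and ENT_XML1 flags need 5.4+) and that the input is meant to be UTF-8. It uses a regular expression for the control characters rather than iconv() or mb_convert_encoding().

```php
<?php
// Hypothetical helper, not the API's real code: escape a UTF-8 string
// for use inside an XML text node or attribute value.
function xml_escape_utf8(string $text): string
{
    // Strip the C0 control characters that XML 1.0 forbids.
    // Tab (0x09), LF (0x0A) and CR (0x0D) are legal and left alone.
    // The pattern works on bytes (no /u flag), which is safe because
    // these byte values never occur inside a UTF-8 multi-byte sequence.
    $text = preg_replace('/[\x00-\x08\x0B\x0C\x0E-\x1F]/', '', $text);

    // Escape only < > & " ' and leave every other character untouched.
    // ENT_SUBSTITUTE swaps malformed UTF-8 byte sequences for U+FFFD
    // instead of returning an empty string.
    return htmlspecialchars($text, ENT_QUOTES | ENT_SUBSTITUTE | ENT_XML1, 'UTF-8');
}

// The note character survives; only the markup characters are escaped.
echo xml_escape_utf8("Onegai My Melody Sukkiri♪ <&>"), "\n";
// prints: Onegai My Melody Sukkiri♪ &lt;&amp;&gt;
```

The main point is that the only entities this emits are the five that XML itself predefines, so a client parsing the feed never runs into an HTML-only entity it doesn't know.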

see also: