[BUG] Encoding errors in the API response
Aug 17, 2011 3:48 AM
Joined: Apr 2011
I noticed that the XML response sometimes contains invalid Unicode.
For example this call:
returns as one of the titles: "Onegai My Melody Kiraraâ��".
The actual title should be: "Onegai My Melody Sukkiri♪".
Someone on stackoverflow.com commented that this is probably because the wrong PHP function was used (htmlentities() instead of htmlspecialchars()).
For a more detailed description see: http://stackoverflow.com/questions/7070111/handling-unicode-in-the-http-response-xml
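To illustrate why a byte-wise entity conversion would mangle the title, here is a rough sketch in Python (not the site's actual code). It assumes the server treated the UTF-8 bytes as a single-byte charset such as ISO-8859-1, which I believe was the default for htmlentities() in PHP versions of that time. The variable names are made up for the example.

```python
# The musical note at the end of the real title, written as an escape (U+266A).
note = "\u266a"

# In UTF-8 this single character is three bytes.
raw = note.encode("utf-8")

# Assumption: the escaping function reads each byte as one Latin-1 character.
as_latin1 = raw.decode("latin-1")

# The first byte, 0xE2, becomes U+00E2 (a-circumflex); the remaining two bytes
# no longer form valid UTF-8 on their own once the first one has been replaced.
first_char = as_latin1[0]
print(raw, len(as_latin1), first_char == "\u00e2")
```

That would match the garbled title: an a-circumflex followed by two bytes that the client cannot decode.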
Modified by StackedCrooked, Aug 17, 2011 3:54 AM
Aug 17, 2011 3:51 PM
Joined: May 2008
Yeah, the encoding problems need to be addressed. Switching to htmlspecialchars() would certainly be less disruptive to the Unicode portions of the data, and use of htmlentities() is the likely culprit. The XML feed is served as UTF-8, so there is no reason to convert characters other than < > ' " & and possibly invalid Unicode (control characters and such, which could just be stripped with iconv() or mb_convert_encoding(), or even a regular expression).
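A minimal sketch of that approach, written in Python only to show the idea (the real fix would be in the site's PHP, which I have not seen): strip the characters XML 1.0 does not allow with a regular expression, then escape just the five markup characters and leave everything else as UTF-8. The function name xml_escape_title is hypothetical.

```python
import re
from xml.sax.saxutils import escape

# Characters that XML 1.0 forbids: C0 controls other than tab, LF and CR,
# lone surrogates, and the non-characters U+FFFE / U+FFFF.
_INVALID_XML_CHARS = re.compile(
    r"[\x00-\x08\x0b\x0c\x0e-\x1f\ud800-\udfff\ufffe\uffff]"
)


def xml_escape_title(text: str) -> str:
    """Hypothetical helper: drop invalid characters, escape only < > & ' "."""
    cleaned = _INVALID_XML_CHARS.sub("", text)
    # escape() handles &, < and > first; the dict adds the two quote characters.
    return escape(cleaned, {'"': "&quot;", "'": "&apos;"})


title = "Onegai My Melody Sukkiri\u266a"
print(xml_escape_title(title) == title)          # non-ASCII text is left alone
print(xml_escape_title('Tom & "Jerry" <3\x00'))  # markup escaped, NUL removed
```

The point is that the note character passes through unchanged, because nothing outside the five markup characters is converted to an entity.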
see also: http://myanimelist.net/forum/?topicid=289175