Jul 20, 2009 10:10 PM
#1
Overlord

Offline
Nov 2004
5752
Discuss search API here.
Jul 20, 2009 10:30 PM
#2

Offline
Jul 2007
3743
Maybe add the score as well like in the normal search. Maybe also genres? (guess you'd need to do those changes we discussed first)
Jul 20, 2009 11:56 PM
#3

Offline
Jan 2009
63
Good job, very neat!

I noticed that you separate synonyms with [;], and that is not very XML-like. To present a list of items in XML, a very common convention is:

<synonyms>
<synonym>the first</synonym>
<synonym>the second</synonym>
</synonyms>

This is a lot easier to parse too.
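
To make the comparison concrete, here is a minimal Python sketch that reads synonyms from both layouts. The <entry> wrapper and the sample titles are assumptions for illustration only, not the actual API response.

```python
import xml.etree.ElementTree as ET

# Hypothetical entry using the current ';'-delimited form.
delimited = "<entry><synonyms>the first; the second</synonyms></entry>"

# Hypothetical entry using the nested form suggested above.
nested = (
    "<entry><synonyms>"
    "<synonym>the first</synonym>"
    "<synonym>the second</synonym>"
    "</synonyms></entry>"
)

def synonyms_from_delimited(xml_text):
    """Split the text of <synonyms> on ';' and trim whitespace."""
    text = ET.fromstring(xml_text).findtext("synonyms", default="")
    return [part.strip() for part in text.split(";") if part.strip()]

def synonyms_from_nested(xml_text):
    """Collect the text of every <synonym> child element."""
    root = ET.fromstring(xml_text)
    return [node.text or "" for node in root.iterfind("synonyms/synonym")]

print(synonyms_from_delimited(delimited))
print(synonyms_from_nested(nested))
```

With the nested form the client never has to know which delimiter the server uses, so a delimiter change cannot break it.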

As Kotori suggested score will be useful, for example to order the results by score :)
pigoz Jul 21, 2009 12:09 AM
Jul 21, 2009 12:16 AM
#4

Offline
Jul 2007
3743
; is much faster to parse than child nodes though, I like it
Jul 21, 2009 1:59 AM
#5

Offline
Jan 2009
63
Raw string processing may be faster, but parsing an XML list is not so slow that it needs gimmicky solutions to improve performance. With a serial parser like SAX in particular, there is no substantial performance difference.

In my humble opinion, following standards, conventions, and best practices is good unless they are unreasonable.
Jul 21, 2009 7:46 AM
#6

Offline
Jul 2007
3743
This is just a special case though for myanimelist syns, none of the names contain a ;. It used to be a "," before, which was a problem since some Anime names contain that. Never heard of SAX; I made my own parser for malu since DOM was crap and very slow. Now it takes ~10-20 ms for parsing a list with ~1k entries. It will still be fast with child nodes, but it's a different case for those parsing with php etc.
Kotori Jul 21, 2009 7:53 AM
Jul 21, 2009 11:10 AM
#7

Offline
Jan 2009
63
SAX is very fast and has a very low memory footprint. It is good for parsing an XML file and doing something with the data on the fly (for example, displaying it or storing it in your local data structure).

DOM is designed to make it easy to edit the XML document, so it loads the whole document into memory and builds a tree data structure from it, which of course is overkill if you do not need to edit the original tree.

A regex parser would be faster if you compile the regex, but it is not worth the effort imho.
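
For anyone who has not used SAX, here is a rough Python sketch of the streaming style described above. The sample document and the <title> element name are assumptions; the handler just collects titles as the parser passes over them, without ever building a tree.

```python
import xml.sax

class TitleHandler(xml.sax.ContentHandler):
    """Collects the text of every <title> element while the parser streams."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self._buffer = []

    def characters(self, content):
        # May be called several times for one text node, so accumulate.
        if self._in_title:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._buffer))
            self._in_title = False

# Hypothetical sample document; the element names are assumptions.
sample = (
    b"<anime>"
    b"<entry><title>First</title></entry>"
    b"<entry><title>Second</title></entry>"
    b"</anime>"
)

handler = TitleHandler()
xml.sax.parseString(sample, handler)
print(handler.titles)
```

Memory use stays roughly constant no matter how many entries the response contains, because only the current title is buffered.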
Jul 21, 2009 12:05 PM
#8
Overlord

Offline
Nov 2004
5752
Added in <score>

Debating on the <synonyms> thing. I don't care either way really. Whatever is easiest for you guys.
Jul 23, 2009 3:15 AM
#9
Offline
Jan 2007
5
+1 for <synonyms>. It would be more XML-like and easier to parse. If there are any changes to the delimiters, our apps will continue running without changing anything.
Jul 23, 2009 9:29 AM
Offline
Apr 2009
100
The problem with synonyms is that the database is full of junk, including mixed delimiters. As the server has no way to cleanly split that into individual titles it can't transmit those to clients, in any format ^^;

The problem with parsing is that the database is full of junk, including invalid characters. As the server directly dumps that into XML all the fast parsers choke and die. Piping through W3C tidy first works fine.
Wile Jul 23, 2009 9:44 AM
Aug 3, 2009 1:18 AM

Offline
Jan 2009
63
The search API is bugged, it returns text with special characters encoded like HTML (stuff like &amp;quot;). This is wrong, it should return special characters encoded as declared by the XML header (in our case UTF-8).

This actually makes the Cocoa XML parser fail unless I clean the XML first. :|
pigoz Aug 3, 2009 2:22 AM
Aug 3, 2009 3:16 AM
Offline
Apr 2009
100
pigoz said:
This is wrong, it should return special characters encoded as declared by the XML header

No. There are a few entities XML insists on. The only nuisance is that MAL has them in CDATA as well.

(do more research before using bold ;)

This actually makes the Cocoa XML parser fail unless I clean the XML first.

I doubt it. Probably fails because of all the newlines and tabs outside of tags.

But you should clean anyway. Luckily what I mentioned in my last post is already built in as NSXMLDocumentTidyXML.
Aug 3, 2009 3:49 AM

Offline
Jan 2009
63
By cleaning I actually mean I'm using NSXMLDocumentTidyXML.
Aug 3, 2009 7:49 PM

Offline
Jul 2007
3743
pigoz said:
The search API is bugged, it returns text with special characters encoded like HTML (stuff like &amp;quot;). This is wrong, it should return special characters encoded as declared by the XML header (in our case UTF-8).

This actually makes the Cocoa XML parser fail unless I clean the XML first. :|


Yep, I noticed when the description failed to load. A few modifications to my custom parser fixed it.

@ Xinil: if the synonyms delimiter is changed to child nodes, please let us know :)
Aug 4, 2009 6:28 AM

Offline
Jan 2009
63
Actually, I read that &amp; and similar entities are correct in XML. I apologize for the bold and for being rude. Anyway, there is still some strange stuff going on in the synopsis field.

For example, in http://myanimelist.net/api/anime/search.xml?q=full+metal there is stuff like: &amp;quot;&lt;br /&gt;

where it should be &quot;&lt;br /&gt;
pigoz Aug 4, 2009 8:39 AM
Aug 5, 2009 8:56 AM

Offline
Jan 2009
63
Something is wrong here too:
http://myanimelist.net/api/anime/search.xml?q=pokemon

Pokémon encoded as Pok&Atilde;&copy;
Aug 16, 2009 5:21 PM
Offline
Aug 2008
1
This keeps getting better. I was wondering if it's possible to add anime characters/voice actors to the search list. I was listening to music on Rhapsody, and it displays random info about the band artists. So I thought, wouldn't it be cool to watch an anime and then get to see info on the characters? Just an idea.

<anime>
<title>...</title>
<...etc other data>

<Characters>
<name>...</name>
<alias>...</alias>
<voice>...</voice>
<story> ...</story>
<pic>...</pic>
</Characters>

<Characters>
<name>...</name>
<alias>...</alias>
<voice>...</voice>
<story> ...</story>
<pic>...</pic>
</Characters>

</anime>

Arigato for the nice work Xinil.
Aug 31, 2009 6:51 PM

Offline
Sep 2007
7
It does look like the encoding is borked for Japanese characters. For example: http://myanimelist.net/api/anime/search.xml?q=naruto returns badly encoded Japanese characters for anime ID 936.

Does anyone have a workaround for this?

Also, it'd be nice, but not 100% necessary, if the content type returned from the server is "text/xml" instead of "text/html".

Otherwise, the search API is working fine, thanks for that Xinil.
Sep 18, 2009 2:12 AM
Offline
Apr 2008
48
Indeed, this issue is bugging me as well.
The standard XML parsers cry when I try to parse the response. The XML declaration says it's UTF-8, but obviously it's not. I'm building a small Java app, so any hints on how to (temporarily) work around this issue would be nice.

I wouldn't mind helping out on this project, although my time is limited (full-time job as an IT professional during the day).

So, just send me a PM.

Thanks in advance.
Oct 24, 2009 1:16 PM

Offline
Jul 2008
1
Regarding the invalid characters in the XML, the PHP extension Tidy worked out for me.
Although it just converts the obstructing characters into HTML Unicode, it returns valid XML - enough for me anyway.

It utilizes libtidy from the HTML Tidy Library Project (http://tidy.sourceforge.net/), there is also a Java version (http://sourceforge.net/projects/jtidy) linked.
Apr 5, 2010 9:57 AM
Offline
Apr 2009
100
Unicode in titles that works fine on the site is returned as garbage.

http://myanimelist.net/api/anime/search.xml?q=maria+holic

Looks like UTF-8 interpreted byte for byte as latin1 and then entity-encoded. MAL is a treasure trove of this kind of bug ;)
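
If that diagnosis is right, the damage can be reversed on the client. Below is a minimal Python sketch, assuming the text really is UTF-8 that was read as latin1 and then entity-encoded; the function name is made up for illustration. It is meant for individual field values (a title, a synopsis), not the whole XML document, since it also decodes &amp; and &lt;.

```python
import html

def repair_mojibake(text):
    """Undo 'UTF-8 bytes read as Latin-1, then entity-encoded'.

    1. Decode the entities back into Latin-1 characters.
    2. Turn those characters back into the original bytes.
    3. Decode the bytes as UTF-8, which is what they were all along.
    """
    as_latin1_chars = html.unescape(text)
    raw_bytes = as_latin1_chars.encode("latin-1")
    return raw_bytes.decode("utf-8")

# The Pokémon example reported earlier in this thread.
print(repair_mojibake("Pok&Atilde;&copy;mon"))
```

The function raises UnicodeEncodeError or UnicodeDecodeError on text that was never damaged in this way, so in practice it should be wrapped in a try/except that falls back to the original string.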
Apr 14, 2010 10:05 PM

Offline
Aug 2007
171
Good thing I checked this group... Explains why I wasted the last 2 hours of my life.

XML does NOT ALLOW bare ampersands or entities it doesn't define. The search API returns results with such ampersands due to special characters. This is the cause of a lot of trouble for some people using it, I bet. As soon as you strip the ampersands (I do it with regular expressions), everything is fine.

However, this also means you don't get to make use of any special characters using this method, unless you add them back in when you are done parsing your XML.
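
A possible alternative to stripping, sketched in Python: rewrite the HTML-only named entities as numeric character references, which any XML parser accepts, and leave the five entities XML predefines alone. The function name and sample fragment are assumptions for illustration, and bare ampersands without a trailing semicolon are not handled here.

```python
import re
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

# The five entities every XML parser understands without a DTD.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")

def html_entities_to_numeric(xml_text):
    """Rewrite HTML-only named entities as numeric character references."""
    def replace(match):
        name = match.group(1)
        if name in XML_PREDEFINED:
            return match.group(0)       # already valid XML, keep as-is
        if name in name2codepoint:
            return "&#%d;" % name2codepoint[name]
        return "&amp;" + name + ";"     # unknown name: escape the ampersand
    return ENTITY_RE.sub(replace, xml_text)

# Hypothetical fragment; <title> is an assumed element name.
broken = "<title>Tom &amp; Jerry &Aring;</title>"
fixed = html_entities_to_numeric(broken)
print(fixed)
print(ET.fromstring(fixed).text)
```

This keeps the special characters in the parsed result, so nothing has to be added back after parsing.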
brass2themax Apr 16, 2010 3:45 PM
Sep 21, 2010 10:52 AM
Offline
Nov 2007
1
Any updates on the Search API, Xinil? The idea of having synonyms and such in child tags is a very good one and very much in the spirit of XML.

The fact that the returned XML needs to be run through a tidy program is slightly annoying for me, given that I'm trying to do this on Android, where outside packages can't be imported as easily. Is there an easy way for you to convert ALL ampersands to &amp; before returning the XML? My app is currently breaking on things like &Aring; .

I've also noticed that the API search and the web site search show slightly different results.

Other than these points, good job on the API.
Oct 10, 2010 8:25 AM
Offline
Mar 2009
7
For everybody who has problems with HTML entities in XML (yes, these are illegal, period).
Just force the encoding to "latin1" and use DTD for XHTML-transitional-latin1 to parse HTML entities into valid chars, then recode into UTF-8

The bigger problem is incompatibility between API search and Web site search, for example:

psorcerer Oct 10, 2010 8:40 AM
Apr 28, 2012 6:43 PM

Offline
Feb 2009
83
Sorry for the necro-bump, but I think I have a fix for the encoding issue. It's quite simple and I've tested it and I have proof it works.

psorcerer said:
For everybody who has problems with HTML entities in XML (yes, these are illegal, period).
Just force the encoding to "latin1" and use DTD for XHTML-transitional-latin1 to parse HTML entities into valid chars, then recode into UTF-8

That is only a band-aid solution to a larger problem. While it works, it doesn't kill the bug.

Now, I've talked with a friend about this problem and we've come to a conclusion about what's causing it and what will fix it.

Anyway, we believe that the API is calling the PHP function "htmlentities($string)" on the data before it is placed into the XML. PHP's documentation says that it defaults to UTF-8 encoding, but in practice it falls back to ISO-8859-1 (latin1). That produces stuff like this:
&aring;&deg;�&aring;&yen;&sup3;&auml;&ordm;&ordm;&aring;&frac12;&cent;
as we all know.

Now the fix for this is actually quite simple. It only involves changing the htmlentities call to add 2 more parameters to it. Transforming it from:
htmlentities($string)
to
htmlentities($string, ENT_COMPAT, "UTF-8")

Let's compare that one change and what it will output:
Currently:
htmlentities($string) --> &aring;&deg;�&aring;&yen;&sup3;&auml;&ordm;&ordm;&aring;&frac12;&cent;
Proposed Change:
htmlentities($string, ENT_COMPAT, "UTF-8") --> 少女人形

If you don't believe that, here is a link to show you that the proposed change works, including the script source as extra proof. Make sure your browser encoding is set to UTF-8; if it isn't, change it and restart the browser. You may also need to set a Unicode font in case the text still doesn't show up right.

That change, and handling the response stream as a UTF-8 stream, will fix all the response problems anyone should have. After that, it's just replacing the remaining HTML entities like &#39;, &quot; and the like in your program.

Edit: I was looking through the PHP doc again and it mentioned htmlspecialchars() working essentially the same way. You'll be able to use this fix I mentioned there as well since the first three parameters of htmlspecialchars() and htmlentities() are the same.
Diablofan Apr 29, 2012 10:46 AM


Jul 4, 2012 9:39 AM
Offline
Sep 2010
2
Hi guys. I am trying to create a MAL app and it's my first time using the unofficial API. Here is my code:

var get_db = function (url, form_data, op) {
    // Request search results as XML and dump the response into #result.
    $.get(
        'http://mal-api.com/anime/search',
        {format: 'xml', q: 'haruhi'},
        function (data) {
            $('#result').text(data);
        }
    );
};

If I remove the format and use the default, it gives me an error:

Origin http://localhost is not allowed by Access-Control-Allow-Origin

If I specify a format, it doesn't show the error, but I can't verify whether there's data in it. I may be doing this wrong, so I would like to ask for some guidance on this.

Oh, I am using jQuery, in case you didn't know.

Thanks.
Jul 20, 2012 9:53 PM

Offline
Feb 2009
83
Solidad said:
Hi guys. I am trying to create a MAL app and it's my first time using the unofficial API. Here is my code:

var get_db = function (url, form_data, op) {
    // Request search results as XML and dump the response into #result.
    $.get(
        'http://mal-api.com/anime/search',
        {format: 'xml', q: 'haruhi'},
        function (data) {
            $('#result').text(data);
        }
    );
};

If I remove the format and use the default, it gives me an error:

Origin http://localhost is not allowed by Access-Control-Allow-Origin

If I specify a format, it doesn't show the error, but I can't verify whether there's data in it. I may be doing this wrong, so I would like to ask for some guidance on this.

Oh, I am using jQuery, in case you didn't know.

Thanks.
You'll want to talk to the person in charge of the unofficial MAL API (club link: http://myanimelist.net/clubs.php?cid=14973 ), as that is the API you're calling from the URL in your post there.

If you want to use the official API, you can read the documentation for it here: http://myanimelist.net/modules.php?go=api


Mar 23, 2013 7:49 AM
Offline
Sep 2011
27
While writing a Plex Media Server agent, I noticed that you can only search by name. It would be nice to search by ID too, so you can get just a single result.
Jan 3, 2014 3:05 AM
Offline
Oct 2013
1
When you log in to the MAL website and open any anime page, you see external links far down the page, but
the MAL search API doesn't provide external links.
Is there any way to get external links like "official site, animedb and wikipedia" from MAL?
Sorry for my bad English.
Mar 24, 2014 4:43 AM

Offline
Nov 2007
725
When the search method fails, it returns 204 No Content status code and an empty message-body as expected. However, the same response includes either "Content-Length: 10" or "Transfer-Encoding: chunked" header. This behavior is questionable at best.
Jun 2, 2014 7:00 AM
Offline
Mar 2014
3
http://myanimelist.net/api/anime/search.xml?q=naruto

It seems the img tag isn't being closed in the XML file. Could you fix this?
Jun 8, 2014 5:46 PM
Offline
Oct 2012
2
When I search "Space Dandy" on the page I get the expected results, but over the api I only get results when I search "Space☆Dandy".

EDIT: The broken unicode is that star symbol here: http://myanimelist.net/anime/20057/Space%E2%98%86Dandy

No results: http://myanimelist.net/api/anime/search.xml?q=Space+Dandy
No results: http://myanimelist.net/api/anime/search.xml?q=Space%20dandy
Ok result: http://myanimelist.net/api/anime/search.xml?q=Space☆Dandy
More than I want, but ok: http://myanimelist.net/api/anime/search.xml?q=Space

Correct result on html page: http://myanimelist.net/anime.php?q=space%20dandy

Unrelated:
The api uses '+' as a space character (which is non-standard URL-encoding) while the html page uses "%20". This should be explicitly mentioned on the documentation, or rather unified.

Different problem:
Like erengy, I would like a unified error response.
Sometimes I get "No results" or something with 'Incapsula incident'.
For example when I punch this url into curl on terminal, with correct credentials and userAgent Incapsula: curl ...auth-here... http://myanimelist.net/api/anime/search.xml?q=Space dandy
Okay: curl ...auth-here... http://myanimelist.net/api/anime/search.xml?q=Space+dandy

Still hoping for updates for the API.
Nehmu Jun 8, 2014 5:51 PM
Aug 27, 2015 10:39 PM

Offline
Mar 2012
158
Yeah, I know this is old, but I do want to correct something here:

Nehmu said:

The api uses '+' as a space character (which is non-standard URL-encoding) while the html page uses "%20". This should be explicitly mentioned on the documentation, or rather unified.


That's not fully correct. For spaces in paths you must use %20. For items in the query string, you can use + to represent a space. This is usually when sending data as "application/x-www-form-urlencoded". You may use %20, but it's not required. Because of the special use of the +, you must percent-escape it when using it literally. It's old usage, but it is valid.
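
To illustrate the point, here is a small Python sketch using the standard urllib.parse module; the query parameter name q matches the URLs posted earlier in this thread, and the sample values are just for demonstration.

```python
from urllib.parse import parse_qs, quote, quote_plus, urlencode

title = "Space Dandy"

# Path-style escaping: a space becomes %20.
path_style = quote(title)

# Form-style (application/x-www-form-urlencoded): a space becomes '+'.
form_style = quote_plus(title)
query = urlencode({"q": title})

print(path_style)   # Space%20Dandy
print(form_style)   # Space+Dandy
print(query)        # q=Space+Dandy

# In a query string both spellings decode to the same value.
print(parse_qs("q=Space+Dandy"))
print(parse_qs("q=Space%20Dandy"))

# A literal '+' has to be percent-escaped so it is not read as a space.
literal_plus = quote_plus("C++")
print(literal_plus)  # C%2B%2B
```

A server that decodes the query string as form data treats q=Space+Dandy and q=Space%20Dandy identically, which is why both spellings are valid there.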

More on URL encoding is over at Wikipedia
Developer, sysadmin, and anime addict.
Have an Android smartphone? Try Atarashii!