Apr 19, 2013 11:37 PM
Offline
Joined: Apr 2013
Posts: 2
Fishi said:
Yeap I did that and it exported it perfectly! Now the problem is trying to get it to import haha ^^....

Edit: Now waiting on anidb to finish importing

Edit 2: Yea nothing was added. What did you guys check when you imported to anidb?
Yeah, it failed to import anything for me either. I think it's expecting an anidb id, which isn't in the exported file. There's a bunch of other fields that also aren't included, but that seems like the most important one.
 
Apr 20, 2013 8:07 AM
DB Administrator
Nyaa~☆

Offline
Joined: Sep 2008
Posts: 17907
Baka_Kaito said:
Last problem that I really have no idea what to do about is when I run the script, I get this:
Traceback (most recent call last):
File "C:\ap.py", line 9, in <module>
pageNumber = int (html.find('li','next').findPrevious('li').next.contents[0])
AttributeError: 'NoneType' object has no attribute 'findPrevious'

The problem is probably that your list consists of only 1 page and then the HTML code is a bit different and the script can't find the total number of pages.

Replace this line:

pageNumber = int (html.find('li','next').findPrevious('li').next.contents[0])

with this:

pageNumber = int (html.find('li','next').findPrevious('li').contents[0])

and see if it helps.
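If that still errors out, a more defensive version of that line (just a sketch, using the same BeautifulSoup html object the script already builds) would fall back to a single page when the pagination block is missing:

# Minimal sketch: fall back to one page when there is no 'next' pagination
# element at all (e.g. lists that fit on a single page).
nextLink = html.find('li', 'next')
if nextLink is None:
    pageNumber = 1
else:
    pageNumber = int(nextLink.findPrevious('li').contents[0])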

@Fishi
@Pellanor
See my reply in post #26.
Modified by Luna, Apr 20, 2013 10:29 AM
 
Apr 20, 2013 9:19 AM

Offline
Joined: Apr 2012
Posts: 1080
Luna_ said:
Baka_Kaito said:
Last problem that I really have no idea what to do about is when I run the script, I get this:
Traceback (most recent call last):
File "C:\ap.py", line 9, in <module>
pageNumber = int (html.find('li','next').findPrevious('li').next.contents[0])
AttributeError: 'NoneType' object has no attribute 'findPrevious'

The problem is probably that your list consists of only 1 page and then the HTML code is a bit different and the script can't find the total number of pages.

Replace this line:

pageNumber = int (html.find('li','next').findPrevious('li').next.contents[0])

with this:

pageNumber = int (html.find('li','next').findPrevious('li').contents[0])

and see if it helps.


That is definitely not it. I have 16 pages XD [http://www.anime-planet.com/users/BakaKaito/anime] I did replace the line just to see and I got this:

pageNumber = int (html.find('li','next').findPrevious('li').contents[0])
AttributeError: 'NoneType' object has no attribute 'findPrevious'

Screenshot: http://qs.lc/shfe
Modified by BakaKaito, Apr 20, 2013 9:29 AM
Hey you, come tell me what you think about My Creations(11/7/14)! ^-^
 
Apr 20, 2013 9:33 AM
DB Administrator
Nyaa~☆

Offline
Joined: Sep 2008
Posts: 17907
I tested it myself with your username and didn't get an error, so I have no idea what the problem is... I have an older version of BeautifulSoup; maybe some of the functions are a bit different and not working the same.
Anyway, I just uploaded the xml I extracted here, so you can just use this for the next step :)

Edit: The screenshot is strange... did you remove this line?

username = raw_input("Enter your username: ")

Because usually it should show you "Enter your username:" and I don't see it in your screenshot.
 
Apr 20, 2013 10:00 AM

Offline
Joined: Apr 2012
Posts: 1080
Luna_ said:
I tested it myself with your username and didn't get an error, so I have no idea what the problem is... I have an older version of BeautifulSoup; maybe some of the functions are a bit different and not working the same.
Anyway, I just uploaded the xml I extracted here, so you can just use this for the next step :)

Edit: The screenshot is strange... did you remove this line?

username = raw_input("Enter your username: ")

Because usually it should show you "Enter your username:" and I don't see it in your screenshot.


I feel like a complete idiot.... I replaced that with my username in the script. I had no idea that that wasn't what it meant. It finally worked! I changed it back to what it should have been (lol) Thanks for your help and for Uploading my list as well!
Modified by BakaKaito, Apr 20, 2013 10:22 AM
Hey you, come tell me what you think about My Creations(11/7/14)! ^-^
 
Apr 20, 2013 10:28 AM
DB Administrator
Nyaa~☆

Offline
Joined: Sep 2008
Posts: 17907
Baka_Kaito said:
I feel like a complete idiot.... I replaced that with my username in the script. I had no idea that that wasn't what it meant. It finally worked! I changed it back to what it should have been (lol) Thanks for your help and for Uploading my list as well!

Ah, I see :) Well, the script doesn't require the user to modify anything in the code (except updating the BeautifulSoup package name now).
You can just execute the script, it asks you to input your name, and the function raw_input() will then read this user input (just explaining for others that might have this problem in the future).
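For anyone else who runs into this, a minimal illustration of what that line does (Python 2.x, as in the script); nothing in it needs to be edited:

# Python 2.x: raw_input() prints the prompt and returns whatever the user
# types as a string. Leave the line as-is and type your username when asked.
username = raw_input("Enter your username: ")
print "Fetching the anime-planet list for: " + username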


And now I understand the other two people's problem in this thread.
@Fishi
@Pellanor
It is not possible to import this anime-planet XML file directly. The AniDB import requires the AniDB IDs because it uses these IDs to map the series to the MAL IDs.

After extracting the anime-planet data and getting the XML file, you have to import it to anidb.net first and then export the AniDB XML file, see this post:
ErgoSis said:
I have found the only way to do this is by using the python script to generate MAL-like XML, but since the internal myanimelist.net importer doesn't seem to accept incomplete <anime> blocks, I imported this to anidb.net first, then exported it as xml and then imported it as an AniDB list here. Kinda freaky way to do it.. but it worked for me.
Modified by Luna, Apr 20, 2013 10:33 AM
 
Apr 20, 2013 10:52 AM

Offline
Joined: Apr 2012
Posts: 1080
Pellanor said:
Fishi said:
Yeap I did that and it exported it perfectly! Now the problem is trying to get it to import haha ^^....

Edit: Now waiting on anidb to finish importing

Edit 2: Yea nothing was added. What did you guys check when you imported to anidb?
Yeah, it failed to import anything for me either. I think it's expecting an anidb id, which isn't in the exported file. There's a bunch of other fields that also aren't included, but that seems like the most important one.


I have run into the same problem.


And now I understand the other two people's problem in this thread.
@Fishi
@Pellanor
It is not possible to import this anime-planet XML file directly. The AniDB import requires the AniDB IDs because it uses these IDs to map the series to the MAL IDs.

After extracting the anime-planet data and getting the XML file, you have to import it to anidb.net first and then export the AniDB XML file

That is what I tried and got the same issue as Fishi and Pellanor. Have you tried this method yourself with success?
Hey you, come tell me what you think about My Creations(11/7/14)! ^-^
 
Apr 20, 2013 10:58 AM
DB Administrator
Nyaa~☆

Offline
Joined: Sep 2008
Posts: 17907
Baka_Kaito said:
That is what I tried and got the same issue as Fishi and Pellanor. Have you tried this method yourself with success?

I did it long, long ago successfully but I think it should still work. You need an anidb.net account; then go to "My List" and choose "mylist import" in the menu on the right side. There you can upload the anime-planet xml file. After that you can choose "mylist export" and select the template "xml". It can take a while to import/export the data (some minutes, some hours, maybe some days, because they need to check it or whatever).
This exported xml file can then be imported here.

Edit: I just saw that I never imported an A-P file to anidb.net myself (just exported the anidb xml), but I've heard from other people that it worked.

Edit2: Seems like people had problems in this thread with importing to anidb.net. Maybe something was changed and it doesn't work anymore. Maybe you can still try it and see if it works or not, what error message you will get etc. If there is one, maybe you can find more information about this in the anidb.net forum.
Modified by Luna, Apr 20, 2013 11:13 AM
 
Apr 20, 2013 11:17 AM

Offline
Joined: Apr 2012
Posts: 1080
Luna_ said:
Baka_Kaito said:
That is what I tried and got the same issue as Fishi and Pellanor. Have you tried this method yourself with success?

I did it long, long ago successfully but I think it should still work. You need an anidb.net account; then go to "My List" and choose "mylist import" in the menu on the right side. There you can upload the anime-planet xml file. After that you can choose "mylist export" and select the template "xml". It can take a while to import/export the data (some minutes, some hours, maybe some days, because they need to check it or whatever).
This exported xml file can then be imported here.

Edit: I just saw that I never imported an A-P file to anidb.net myself (just exported the anidb xml), but I've heard from other people that it worked.

Edit2: Seems like people had problems in this thread with importing to anidb.net. Maybe something was changed and it doesn't work anymore. Maybe you can still try it and see if it works or not, what error message you will get etc. If there is one, maybe you can find more information about this in the anidb.net forum.

I'm trying something out. I'll also check AniDB too.
Thanks for the help!
Hey you, come tell me what you think about My Creations(11/7/14)! ^-^
 
Apr 20, 2013 12:21 PM

Offline
Joined: Apr 2012
Posts: 1080
Well guys, I'm sorry to say that I believe it is impossible to import without having the proper anime ID #, i.e.:

<series_animedb_id>11759</series_animedb_id>

I successfully found a way to make the import work; however, with that # lacking, nothing is actually updated.

Screenshot: [http://qs.lc/grd5]
Hey you, come tell me what you think about My Creations(11/7/14)! ^-^
 
Apr 20, 2013 12:31 PM
DB Administrator
Nyaa~☆

Offline
Joined: Sep 2008
Posts: 17907
Yes, you either need the MAL ID or the AniDB ID to import titles. It's impossible without one; that's why the other person said to use anidb.net to generate an xml file with AniDB IDs.

I wanted to test something that might help, but I don't have time for it right now. So if you can wait 12-24 hours I might be able to find a solution so you don't have to add everything manually. But I can't promise anything.
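For reference, a rough sketch (in Python) of what an importable <anime> block would need to contain, built around that ID. Apart from series_animedb_id, the tag names below are assumptions based on a standard MAL export file, so check a real export before relying on them:

# Rough sketch only: build one MAL-style <anime> block around the MAL ID.
# Apart from series_animedb_id, the tag names here are assumptions about
# what a standard MAL export contains.
def anime_entry(mal_id, watched_episodes, score, status):
    return ("  <anime>\n"
            "    <series_animedb_id>%d</series_animedb_id>\n"
            "    <my_watched_episodes>%d</my_watched_episodes>\n"
            "    <my_score>%d</my_score>\n"
            "    <my_status>%s</my_status>\n"
            "    <update_on_import>1</update_on_import>\n"
            "  </anime>") % (mal_id, watched_episodes, score, status)

print "<myanimelist>"
print anime_entry(11759, 25, 8, "Completed")
print "</myanimelist>"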
 
Apr 20, 2013 1:08 PM

Offline
Joined: Apr 2012
Posts: 1080
Luna_ said:
Yes, you either need the MAL ID or the AniDB ID to import titles. It's impossible without one; that's why the other person said to use anidb.net to generate an xml file with AniDB IDs.

I wanted to test something that might help, but I don't have time for it right now. So if you can wait 12-24 hours I might be able to find a solution so you don't have to add everything manually. But I can't promise anything.

My import to AniDB is the problem. It just won't add any anime after importing the list. I even rewrote the AP scrape XML into a MAL XML and it still won't work.

I look forward to your results. Even if they bear no fruit, at least you have an idea. I am at a loss. Thanks for all your help!
Hey you, come tell me what you think about My Creations(11/7/14)! ^-^
 
Apr 20, 2013 1:57 PM
Offline
Joined: Jul 2010
Posts: 3
Meh, I cannot get my list to export:

This script will export your anime-planet.com anime list and saves it to anime-planet.xml
daftphunk
/Library/Python/2.7/site-packages/bs4/builder/_htmlparser.py:149: RuntimeWarning: Python's built-in HTMLParser cannot parse the given document. This is not a bug in Beautiful Soup. The best solution is to install an external parser (lxml or html5lib), and use Beautiful Soup with that parser. See http://www.crummy.com/software/BeautifulSoup/bs4/doc/#installing-a-parser for help.
"Python's built-in HTMLParser cannot parse the given document. This is not a bug in Beautiful Soup. The best solution is to install an external parser (lxml or html5lib), and use Beautiful Soup with that parser. See http://www.crummy.com/software/BeautifulSoup/bs4/doc/#installing-a-parser for help."))
Traceback (most recent call last):
File "/Volumes/Macintosh HD/Users/Justin/Downloads/anime-planet.com_xml_anime_exporter.py", line 14, in <module>
html = BeautifulSoup(html)
File "/Library/Python/2.7/site-packages/bs4/__init__.py", line 172, in __init__
self._feed()
File "/Library/Python/2.7/site-packages/bs4/__init__.py", line 185, in _feed
self.builder.feed(self.markup)
File "/Library/Python/2.7/site-packages/bs4/builder/_htmlparser.py", line 150, in feed
raise e
HTMLParser.HTMLParseError: bad end tag: u"</scr'+'ipt>", at line 281, column 181
 
Apr 20, 2013 3:57 PM

Offline
Joined: Apr 2012
Posts: 1080
As the link provided in the error states, you need to install a third-party parser. There are a few suggested in that linked page. However, given my lack of expertise, that may or may not solve your problem. Anyways, given how much help I received, I feel the need to pay it forward.

This is about all I can do for you. Here is the AP scrape for your username =]

I also suggest still trying on your own. You may learn quite a lot like I did.
Hey you, come tell me what you think about My Creations(11/7/14)! ^-^
 
Apr 20, 2013 5:59 PM
Offline
Joined: Jul 2010
Posts: 3
Baka_Kaito said:
As the link provided in the error states, you need to install a third-party parser. There are a few suggested in that linked page. However, given my lack of expertise, that may or may not solve your problem. Anyways, given how much help I received, I feel the need to pay it forward.

This is about all I can do for you. Here is the AP scrape for your username =]

I also suggest still trying on your own. You may learn quite a lot like I did.

Woo, thanks a lot, really appreciate it! Yeah, I followed the link's instructions to try and install another parser but did not have any luck with it. I kept getting invalid syntaxes and whatnot; but I will definitely try again later
 
Apr 20, 2013 6:12 PM

Offline
Joined: Apr 2012
Posts: 1080
If anyone is interested, as an experiment in trying to import to AniDB successfully, I have edited the script to include all the fields of a standard MAL export list, including the user info section (both to be partially edited by the user). Sadly the experiment failed; however, this does create a more "genuine" list, and I myself prefer the minor details. This is also the updated script for bs4.

It can be found Here
Modified by BakaKaito, Apr 21, 2013 8:46 AM
Hey you, come tell me what you think about My Creations(11/7/14)! ^-^
 
Apr 20, 2013 6:14 PM

Offline
Joined: Apr 2012
Posts: 1080
daftphunk said:
Baka_Kaito said:
As the link provided in the error states, you need to install a third-party parser. There are a few suggested in that linked page. However, given my lack of expertise, that may or may not solve your problem. Anyways, given how much help I received, I feel the need to pay it forward.

This is about all I can do for you. Here is the AP scrape for your username =]

I also suggest still trying on your own. You may learn quite a lot like I did.

Woo, thanks a lot, really appreciate it! Yeah, I followed the link's instructions to try and install another parser but did not have any luck with it. I kept getting invalid syntaxes and whatnot; but I will definitely try again later


No problem =] and good luck!
Hey you, come tell me what you think about My Creations(11/7/14)! ^-^
 
Apr 22, 2013 9:48 AM

Offline
Joined: Apr 2012
Posts: 1080
Given the lack of a solution to importing properly, I found the MAL API, which can be used to search for MAL ID#s. I wonder if this could be used in a script to scrape the ID#s and update the list automatically. I have used it to find the ID#s to manually update 1/3 of my list so far. Not sure if it is faster or slower than adding through MAL though. Just putting it out here for anyone who may be able to do something with it.
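As a rough illustration of the idea (Python 2.7, using the unofficial mal-api.com search endpoint that comes up later in this thread; the exact JSON field names are assumptions):

import json
import urllib
import urllib2

# Rough sketch (Python 2.7): look up a title on the unofficial mal-api.com
# search endpoint and print the candidate IDs. The "id"/"title" field names
# are assumptions about the JSON it returns.
def search_mal(title):
    query = urllib.quote(title.encode('utf-8'))
    response = urllib2.urlopen('http://mal-api.com/anime/search?q=' + query).read()
    return json.loads(response)

for entry in search_mal('Toradora!'):
    print entry.get('id'), entry.get('title')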
Hey you, come tell me what you think about My Creations(11/7/14)! ^-^
 
Apr 22, 2013 6:46 PM

Offline
Joined: Apr 2011
Posts: 111
Thanks Kaito! I added some lines to the script that take the MAL ID and add it to the XML file. When the script works, it works. When the script runs into errors, it crashes and burns like no other... Overall, it adds a good amount of anime to your list BUT you will still have to look out for errors, for instance if you get:

"HTTP Error 502: Bad Gateway"
OR
"'ascii' codec can't encode character u'xe2' in position 29: ordinal not in range(128)"
This basically means the anime it was looking up in the API caused the script to crash and burn... SO to fix that you will have to remove that anime from your AP list... TO DO SO, open up the XML in your favorite text editor, Notepad++, scroll down to the last entry and look at the anime title; that'll give you an idea of where in your AP list to look and which anime name to delete that blew the script up... Take note: you will have to look at anime names with special characters... For instance, for me, these are the anime that caused the code to blow up. SO remove these from your AP list FIRST

C³
Ichigo 100%
Ikoku Meiro no Croisée
Yumeiro Pâtissière

You kind of get an idea of which anime to remove in your AP list.. C³ because of the ³, Ichigo 100% because of the %, Ikoku Meiro no Croisée because of the é, Yumeiro Pâtissière because of the â and è
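As an alternative to deleting those entries, percent-encoding the title before it goes into the query URL would probably avoid the ascii codec error. A sketch, assuming the script builds its query string from the raw scraped title:

import urllib

# Sketch: encode the (possibly unicode) title to UTF-8 and percent-encode it,
# so characters like é or ³ don't trigger 'ascii' codec errors when building
# the request URL. animeName stands for the title scraped from anime-planet.
def make_query_name(animeName):
    if isinstance(animeName, unicode):
        animeName = animeName.encode('utf-8')
    return urllib.quote(animeName)

print make_query_name(u'Ikoku Meiro no Croisée')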

Anyways this script takes a while to gather everything so watch anime or listen to music. Check back every now and then on the script in case something blows up...
I've imported it to anidb and it works nicely; haven't tried it directly on MAL though.

My fail noob mod script can be found here : http://pastebin.com/DbEP4rTL

OUT OF DATE

Edit: I'll replace this code with another, more robust method as soon as possible. Curse AP and their maintenance... You can use the above code but it'll cause headaches ^^...

EDIT2: See post #45
Modified by Fishi, Apr 25, 2013 10:07 AM
 
Apr 22, 2013 6:54 PM

Offline
Joined: Apr 2011
Posts: 111
This script will print out two things: anime-planet.xml and anime_list.txt. You can import anime-planet.xml to anidb, whereas anime_list.txt contains the names of the anime that you will need to add to your MAL manually... No idea how to make it so the query from the API recognizes the anime name 100%... Anyways have fun... I have a long list of anime that I need to manually add...
Modified by Fishi, Apr 22, 2013 11:07 PM
 
Apr 23, 2013 8:47 AM

Offline
Joined: Apr 2012
Posts: 1080
Fishi said:
Thanks Kaito! I added some lines to the script that take the MAL ID and add it to the XML file. When the script works, it works. When the script runs into errors, it crashes and burns like no other... Overall, it adds a good amount of anime to your list BUT you will still have to look out for errors, for instance if you get:

"HTTP Error 502: Bad Gateway"
OR
"'ascii' codec can't encode character u'xe2' in position 29: ordinal not in range(128)"
This basically means the anime it was looking up in the API caused the script to crash and burn... SO to fix that you will have to remove that anime from your AP list... TO DO SO, open up the XML in your favorite text editor, Notepad++, scroll down to the last entry and look at the anime title; that'll give you an idea of where in your AP list to look and which anime name to delete that blew the script up... Take note: you will have to look at anime names with special characters... For instance, for me, these are the anime that caused the code to blow up. SO remove these from your AP list FIRST

C³
Ichigo 100%
Ikoku Meiro no Croisée
Yumeiro Pâtissière

You kind of get an idea of which anime to remove in your AP list.. C³ because of the ³, Ichigo 100% because of the %, Ikoku Meiro no Croisée because of the é, Yumeiro Pâtissière because of the â and è

Anyways this script takes a while to gather everything so watch anime or listen to music. Check back every now and then on the script in case something blows up...
I've imported it to anidb and it works nicely; haven't tried it directly on MAL though.

My fail noob mod script can be found here http://pastebin.com/DbEP4rTL

Fishi you are the man! This will definitely help with the 500+ series I have left to add!!

Edit: It's too bad that AP is down at the moment though xD
Hey you, come tell me what you think about My Creations(11/7/14)! ^-^
 
Apr 23, 2013 9:27 AM
DB Administrator
Nyaa~☆

Offline
Joined: Sep 2008
Posts: 17907
This was my intention, to make a script that uses the API to get the IDs; I'm just extremely busy at the moment, so I couldn't do anything.

@Fishi: I'm not sure if that works properly; it looks like you always take the first ID that you get, but sometimes the list contains several IDs. Also, there might be problems if the titles are different on MAL and AP. Before uploading the created XML file you should still check it: while you can import many titles at once, if you need to remove wrongly added titles, you will have to do this manually, one by one.
 
Apr 23, 2013 10:40 AM

Offline
Joined: Apr 2011
Posts: 111
@Luna: Yes, that's exactly the problem: when there's a different title, the script will not be able to find it in the MAL API Kaito cited. I'm not sure how I'd be able to change the query to look that up though... at best I could make a test to see if the query returns { } and, if it does, shorten or somehow change the query... such as removing some unneeded characters like the !, (, ) etc... so far it seems like it'll be able to find some of the anime in the API.

But it'll still run into problems with something like

Break Blade Movie 1: Kakusei no Toki
The API will be able to find Kakusei no Toki, but it won't be able to find Break Blade Movie 1. There are also problems with seasonal stuff like Digimon season 1, Pokemon etc... I'll deal with that later though...

As for the multiple IDs, I didn't know that. How does the import work with multiple IDs? Do I make another tag for the different IDs? Or just paste all the IDs into one tag with like a comma or semi-colon? Actually... I could always make a new anime tag for it haha...

Thanks for the comments Luna, they gave me some ideas. The code will not be optimized, but I think I can make it work ^^...
Modified by Fishi, Apr 23, 2013 11:00 AM
 
Apr 23, 2013 1:31 PM

Offline
Joined: Apr 2012
Posts: 1080
@Fishi
@Luna_

Does AP also have the Japanese titles for their anime? The site is down so I figured I'd ask. The reason I ask is, if they do, would it be possible to rewrite/edit the script to not just scrape the list data but to actually open each anime title and copy the Japanese title to use for searching? I would assume that would give better, more accurate results as opposed to the English translations. I could be wrong, and I can't check until the site is back. Just another idea being thrown out there. Hopefully, if it is possible, it won't be more work than it's worth.
Hey you, come tell me what you think about My Creations(11/7/14)! ^-^
 
Apr 23, 2013 7:20 PM

Offline
Joined: Apr 2011
Posts: 111
Good news! AP is back up and after de-bugging the code a bit it seems like it's working! Just.. it's taking a while to grab all the ids xD... And it seems like anime with special characters in the name are causing no errors -looks at C³-

Currently at L , but it seems like this will be alright...

Here is the new, more robust version that will take ALL the IDs, as Luna pointed out, and will not break when the shit hits the fan! (hopefully...)


-disclaimer- Make a copy of your MAL/anidb list just in case the script didn't do something correctly... i.e. export your CURRENT anime list before you import using the XML this script provided!

EDIT: Teehee... I forgot to add an ending tag </myanimelist> when it was finished :P FIXED
As for the results: out of about 800 anime series, only 66 could not be found using the API, so~ I guess that is good ^^b

EDIT2: Just imported it to anidb and it added some wrong anime... And looking at the job description it seems like I grabbed the wrong ID as well.. OR the MAL id didn't associate with the correct anidb id... Would love to know if there is a quick way to delete your anime list :P

EDIT3: I see the problem... The MAL API search doesn't search for the exact item; it does a broad search of the term... for example, if you look at the API for To Heart http://mal-api.com/anime/search?q=to%20heart and do a To Heart search on mal http://myanimelist.net/anime.php?q=to%20heart you get all the anime that the scraper will pull and add to the list... ... ... ... ... ... Yeah :P
Here's the job detail in Anidb that shows the problem...


Feel free to edit the code: http://pastebin.com/Ap67mrXM

Idea to fix it: I suppose you could run a search within the API results itself (queryTitle) for 'To Heart' but... the problem will be: what if the exact name doesn't appear in the results... In this case To Heart definitely exists in the results! BUT what if in another example it doesn't... :/

Another one where it doesn't work is No. 6, but this can be solved by changing the regex to: queryName = re.sub('[^A-Za-z0-9]+', '%20', animeName) which will give it a more unique name for the search... But it'll still cause a problem for a search like Saiyuki Reload: it'll bring up http://myanimelist.net/anime.php?q=saiyuki%20reload which brings up Saiyuki Reload, YAY~, but it'll also bring up Saiyuki Reload Gunlock, Saiyuki Reload Urasai, and Saiyuki Reload: Burial....

I can see why they added the requirement that you need the unique ID now :/

Mod Edit: Please use spoiler tags for large images.
Modified by Luna, Apr 24, 2013 12:12 PM
 
Apr 23, 2013 9:34 PM
Offline
Joined: Jul 2011
Posts: 4
Well, if you do a search with the MAL API you get a json object. Searching within that object for the anime with the exact title should be trivial.

EDIT1: working on it
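Something along these lines, as a sketch (case-insensitive exact match over the parsed results; the "title" and "id" field names are assumptions about the API output):

# Sketch of the exact-match idea: given the parsed JSON search results, keep
# only the entry whose title matches the anime-planet name exactly, ignoring
# case. Field names ("title", "id") are assumptions about the API output.
def pick_exact_match(results, anime_name):
    wanted = anime_name.strip().lower()
    for entry in results:
        if entry.get('title', '').strip().lower() == wanted:
            return entry.get('id')
    return None  # no exact match; leave it for manual adding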
Modified by AtomskRenewal, Apr 23, 2013 10:12 PM
 
Apr 23, 2013 10:34 PM

Offline
Joined: Apr 2011
Posts: 111
I'm not too familiar with using python xD... And thanks for working on it xD
 
Apr 23, 2013 10:50 PM
Offline
Joined: Jul 2011
Posts: 4
I think it kinda works, but if I import it directly to MAL I get this error: "There was an error parsing your import file, please fix it before trying again"

Should I try importing it to anidb first?
EDIT: invalid xml file at anidb ._. something else is wrong, but I believe it gets most of the ids right! :P

forget about these errors, I'm an idiot.
Import queued at aniDB, MAL says I didn't set the update_on_import, but I did.


My code: http://pastebin.com/jDqTF25v

Oh, I did it in python 3.3, not 2.7
Modified by AtomskRenewal, Apr 23, 2013 11:12 PM
 
Apr 23, 2013 11:26 PM

Offline
Joined: Apr 2011
Posts: 111
Ahh, haha, that makes more sense now. Thanks ^^/
 
Apr 23, 2013 11:34 PM
Offline
Joined: Jul 2011
Posts: 4
I did it.

http://pastebin.com/ZRRNXsER

With this you can import directly to MAL, instructions at the end of the comments block.


Needs a bit of fine tuning (Watched != Completed, etc).
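The status part is probably just a lookup table; a sketch, where the left-hand labels are assumptions about what the anime-planet export actually contains:

# Sketch: map anime-planet status labels onto MAL status names. The AP labels
# on the left are assumptions; adjust them to whatever strings appear in your
# scraped list.
STATUS_MAP = {
    'watched':       'Completed',
    'watching':      'Watching',
    'want to watch': 'Plan to Watch',
    'stalled':       'On-Hold',
    'dropped':       'Dropped',
}

def to_mal_status(ap_status):
    return STATUS_MAP.get(ap_status.strip().lower(), 'Plan to Watch')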


It's funny, I got here searching how to export from anime planet lol
Modified by AtomskRenewal, Apr 23, 2013 11:41 PM
 
Apr 23, 2013 11:44 PM

Offline
Joined: Apr 2011
Posts: 111
Thank god you found this thread xD

Edit: I'm getting this error; not sure what to make of it.

Traceback (most recent call last):
File "B:\Users\Joe Nguyen\Programs\APscrap.py", line 41, in <module>
search=json.loads(queryTitle.decode('utf8'))
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe3 in position 1122: invalid continuation byte
Modified by Fishi, Apr 24, 2013 12:04 AM
 
Apr 24, 2013 12:11 AM
Offline
Joined: Jul 2011
Posts: 4
Did you edit anything on the script?
Are you using python 3.3?


Last version I do today http://pastebin.com/hxeRwBpy

Now the status is almost always right. Specials and ovas are likely to cause trouble. For example:
Hellsing Ultimate is imported as Hellsing :/
Azumanga Daioh: The Very Short Movie is called Azumanga Daioh - The Very Short Movie in anime planet.

It imported ~240 of 360 for me.
Modified by AtomskRenewal, Apr 24, 2013 12:36 AM
 
Apr 24, 2013 11:54 AM

Offline
Joined: Apr 2011
Posts: 111
Looks like Seikimatsu Occult Gakuin caused the error. I'm guessing it's because of this funky character �.... anyways here's the code that'll work with python 2.7 http://pastebin.com/Ap67mrXM

Edit: Got it imported to MAL very nice 714/887 added ^^/ Looks like everything works almost nicely!
Modified by Fishi, Apr 24, 2013 1:06 PM
 
Apr 24, 2013 12:50 PM
DB Administrator
Nyaa~☆

Offline
Joined: Sep 2008
Posts: 17907
Baka_Kaito said:
Does AP also have the Japanese titles for their anime? The site is down so I figured I'd ask. The reason I ask is, if they do, would it be possible to rewrite/edit the script to not just scrape the list data but to actually open each anime title and copy the Japanese title to use for searching? I would assume that would give better, more accurate results as opposed to the English translations. I could be wrong, and I can't check until the site is back. Just another idea being thrown out there. Hopefully, if it is possible, it won't be more work than it's worth.

Doesn't look like AP has the Japanese titles. My idea was to use the year, type, and episode numbers (information that is listed in your AP list) to find the correct MAL entry. The type and episode numbers might be a bit different for some titles, but maybe you can figure out a way to deal with these cases.
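That heuristic could sit on top of the title search; a sketch, where the candidate field names ("episodes", "type", "id") are assumptions about what the API returns:

# Sketch of the disambiguation idea: among the API's candidate entries, prefer
# the one whose episode count and type match what the AP list says. The keys
# used here ("episodes", "type", "id") are assumptions about the API output.
def pick_by_details(candidates, ap_episodes, ap_type):
    for entry in candidates:
        if entry.get('episodes') == ap_episodes and \
           entry.get('type', '').lower() == ap_type.lower():
            return entry.get('id')
    return None  # no confident match; leave it for manual review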

Also, I strongly recommend not modifying the existing script further. Instead, make a new script that reads the "exported" AP XML file as input. This way, you won't have to scrape your AP list again and again, which is much slower than reading an XML file and, more importantly, not good for their servers. Actually, the script can be seen as violating their terms of use, so you shouldn't make it worse by running it again and again.
MAL's terms of use are quite similar, "You agree not to use or launch any automated system [...] that accesses the Service in a manner that sends more request messages to the Company servers than a human can reasonably produce in the same period of time by using a conventional on-line web browser [...]"
I'm not quite sure how this applies to the API, especially since you're using the unofficial API (the official one is here). For large lists it'd probably be good to add a small delay between server accesses, just in case.
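A delay like that is only a couple of lines around whatever lookup function the script uses; a sketch, where lookup stands in for the actual request:

import time

# Sketch: throttle requests so the script behaves more like a human browsing.
# lookup is whatever function actually hits the API or the site.
def polite_lookup(titles, lookup, delay_seconds=2.0):
    results = {}
    for title in titles:
        results[title] = lookup(title)
        time.sleep(delay_seconds)  # small pause between server accesses
    return results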
 
Apr 25, 2013 5:09 AM

Offline
Joined: Apr 2012
Posts: 1080
@AtomskRenewal
@Fishi

You guys really are awesome. Thanks for all the help and results you guys delivered!
I'll be sure to spread the word to all the others searching for a way to escape AP's deadly clutches!

Edit: @Fishi
I used your 2.7 version and it didn't make an anime_list.txt but only added 530/709.
Was that the function you removed to have it work with 2.7?

Edit 2:
Well my list is officially done!
Thanks again to everyone for all the help!!
Modified by BakaKaito, Apr 25, 2013 8:15 AM
Hey you, come tell me what you think about My Creations(11/7/14)! ^-^
 
Apr 25, 2013 10:02 AM

Offline
Joined: Apr 2011
Posts: 111
Yea it was removed/not implemented. And what?! Your signature got updated? D:
 
Apr 25, 2013 12:21 PM

Offline
Joined: Apr 2012
Posts: 1080
Fishi said:
Yea it was removed/not implemented. And what?! Your signature got updated? D:



Okay, I was wondering what happened there. If yours isn't updating, I would just make a new one.
Modified by BakaKaito, May 2, 2013 12:41 PM
Hey you, come tell me what you think about My Creations(11/7/14)! ^-^
 
May 25, 2013 1:29 PM
Offline
Joined: Sep 2010
Posts: 1
Okay, noob question: how do I install BeautifulSoup? I'm sorry, but this took me over 4 hours and I still haven't figured it out.
Someone give me clear noob steps please... I'll try to figure out the rest myself or ask again if I bump into something; for now I can't even start.
 
May 29, 2013 11:06 AM
Offline
Joined: Nov 2012
Posts: 2
Why is there no "The God of High School" manga to add to my list?
 
May 29, 2013 11:48 AM
Offline
Joined: Dec 2011
Posts: 271
sampleslayer said:
Why is there no "The God of High School" manga to add to my list?


If I remember right, it does not meet the database guidelines.
 
Jun 10, 2013 4:38 PM
Offline
Joined: Sep 2009
Posts: 1
Fishi said:
Looks like Seikimatsu Occult Gakuin caused the error. I'm guessing it's because of this funky character �.... anyways here's the code that'll work with python 2.7 http://pastebin.com/Ap67mrXM

Edit: Got it imported to MAL very nice 714/887 added ^^/ Looks like everything works almost nicely!


https://dl.dropboxusercontent.com/u/34143874/ap%20list%20code%20error.png

Getting this error when I try it. Using Windows 7, Python 2.7 and BS4. Any clue what's causing it?
 
Jun 11, 2013 10:13 AM
Offline
Joined: Jun 2013
Posts: 1
Thanks for this Fishi, and of course ErgoSis. Except for the ones with special characters, it successfully imported my friend's Anime-Planet list (I was helping him out).

Is there any way we can see the amount of time spent, like on Anime-Planet, after importing to MAL? :c
 
Sep 22, 2013 5:40 PM
Offline
Joined: Jun 2009
Posts: 1
Anime-Planet appears to be doing user-agent blocking, so I modified the script to fake the user-agent. Try this if you get a 403 Forbidden error.

http://pastebin.com/x0gHs2Bb
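The relevant change is just sending a browser-like User-Agent with each request; a sketch in urllib2 terms (the UA string is only an example, not necessarily the one the pastebin uses):

import urllib2

# Sketch: fetch an anime-planet page with a browser-like User-Agent so the
# request isn't rejected with 403 Forbidden. The UA string is just an example.
def fetch(url):
    req = urllib2.Request(url)
    req.add_header('User-Agent',
                   'Mozilla/5.0 (Windows NT 6.1; rv:24.0) Gecko/20100101 Firefox/24.0')
    return urllib2.urlopen(req).read()

page = fetch('http://www.anime-planet.com/users/BakaKaito/anime')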


Modified by EricJ2190, Sep 22, 2013 6:10 PM
 
Oct 12, 2013 12:18 PM
Offline
Joined: Nov 2011
Posts: 1
EricJ2190 said:
Anime-Planet appears to be doing user-agent blocking, so I modified the script to fake the user-agent. Try this if you get a 403 Forbidden error.

http://pastebin.com/x0gHs2Bb


I did get the 403 error, though after using your script I got a 502 error.

Traceback (most recent call last):
File "C:\Users\Sebas\Downloads\beautifulsoup4-4.1.0\beautifulsoup4-4.1.0\xml.p
y", line 38, in <module>
queryTitle = urllib2.urlopen(queryURL + queryName).read()
File "C:\Python27\lib\urllib2.py", line 127, in urlopen
return _opener.open(url, data, timeout)
File "C:\Python27\lib\urllib2.py", line 410, in open
response = meth(req, response)
File "C:\Python27\lib\urllib2.py", line 523, in http_response
'http', request, response, code, msg, hdrs)
File "C:\Python27\lib\urllib2.py", line 448, in error
return self._call_chain(*args)
File "C:\Python27\lib\urllib2.py", line 382, in _call_chain
result = func(*args)
File "C:\Python27\lib\urllib2.py", line 531, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 502: Bad Gateway
 
Oct 25, 2013 1:33 PM
Offline
Joined: Oct 2013
Posts: 1
Sebaku said:
I did get the 403 error, though after using your script I got a 502 error.

Same error for me. :| Anyone got a fix for this? AP nick is LancerFIN, if someone can upload my .xml. Thanks
 
Nov 14, 2013 1:37 AM
Offline
Joined: Nov 2013
Posts: 5
I also get this error. I tried going to the URL in the browser and it says the site is too busy. I read that there have been issues with DDoS lately, so that might be the cause.

I thought it might also be something to do with the unofficial API (http://mal-api.com/docs/#read_search_anime), so I tried using the official API (http://myanimelist.net/modules.php?go=api), which requires you to authenticate, but I can't seem to authenticate.
 
Nov 19, 2013 2:22 AM

Offline
Joined: Jun 2009
Posts: 29
True, the unofficial API seems to be down.

To authenticate you can simply provide HTTP header:
titlereq = urllib2.Request("http://myanimelist.net/api/anime/search.xml")
titlereq.add_header("Authorization", "Basic %s" % base64.encodestring("%s:%s" % (malusername, malpassword)).replace("n", ""))
titlereq.add_header("User-Agent", "Mozilla/5.0 (Windows NT 6.2; Win64; x64;) Gecko/20100101 Firefox/20.0")
queryTitle = urllib2.urlopen(titlereq, urllib.urlencode({ "q" : animeName })).read()

But if a method to detect the MAL anime ID has been found, then one doesn't need to write the XML at all, because it's then possible to add a bunch of series from the same script via the native API:
params = {'id' : animeID, 'data' : xmlData}
url = 'http://myanimelist.net/api/animelist/add/'+animeID+'.xml'
urllib2.urlopen(urllib2.Request(url, urllib.urlencode(params)))
A story has no beginning or end; arbitrarily one chooses that moment of experience from which to look back or from which to look ahead.
 
Nov 19, 2013 4:55 AM
Offline
Joined: Nov 2013
Posts: 5
Ah, it seems that I needed a user-agent header.

Though now I'm getting a JSON error:

Traceback (most recent call last):
File "ap_scrapper.py", line 42, in <module>
search=json.loads(queryTitle,"utf8")
File "C:\Code\Python27\lib\json\__init__.py", line 351, in loads
return cls(encoding=encoding, **kw).decode(s)
File "C:\Code\Python27\lib\json\decoder.py", line 365, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "C:\Code\Python27\lib\json\decoder.py", line 383, in raw_decode
raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded

Here's the script i used for reference http://pastebin.com/rUm0DmG2

I'm not experienced with Python, so I'm not really sure what's wrong.
 
Nov 19, 2013 5:40 PM

Offline
Joined: Jun 2009
Posts: 29
That's because the native MAL API doesn't support JSON output. It uses XML with some unknown encoding (Python often breaks on it if it's decoded as UTF-8).
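So the json.loads call has to become XML parsing; a sketch with ElementTree, assuming the search response is an <anime> root with <entry> children carrying <id> and <title> (the element names are an assumption):

import xml.etree.ElementTree as ET

# Sketch: parse the official API's XML search response instead of JSON.
# Bad bytes are dropped before parsing, since the feed's declared encoding
# is unreliable. Element names (<entry>, <id>, <title>) are assumptions.
def parse_search_xml(raw_bytes):
    cleaned = raw_bytes.decode('utf-8', 'ignore').encode('utf-8')
    root = ET.fromstring(cleaned)
    return [(entry.findtext('id'), entry.findtext('title'))
            for entry in root.findall('entry')]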

Updated the scripts a bit:
You will need Python 3.3.3 to run them.
Modified by ErgoSis, Dec 4, 2013 6:33 AM
A story has no beginning or end; arbitrarily one chooses that moment of experience from which to look back or from which to look ahead.
 
Nov 19, 2013 10:33 PM
Offline
Joined: Nov 2013
Posts: 5
Excellent, thanks heaps for that!

I was able to generate an xml of about 80% of my list.

Just importing my list to anidb now before I can import it to MAL.

Thanks again!

EDIT: I imported the list into AniDB fine, but I can't seem to import it into MAL. I exported the AniDB list as an xml as per the instructions on the MAL importer page.

Internal Server Error

The server encountered an internal error or misconfiguration and was unable to complete your request.

Please contact the server administrator, root@localhost and inform them of the time the error occurred, and anything you might have done that may have caused the error.

More information about this error may be available in the server error log.
Modified by elementalest, Nov 20, 2013 3:15 AM
 