
Block GPTBot on parts of the site that robots.txt doesn't currently block

Feb 15, 9:46 PM
GPTBot should be blocked on any user-submitted content: forum boards (forum posts in particular), blog entries, and news comments, which are just another place for forum comments. The news articles themselves are fine to crawl and scrape, in my opinion. Reviews should possibly be blocked too, since users might not want ChatGPT hijacking their reviews, though some might not mind, which is why I'm unsure about that one.
https://arstechnica.com/information-technology/2023/08/openai-details-how-to-keep-chatgpt-from-gobbling-up-website-data/

I know MAL already has a robots.txt, but it only blocks certain components of the site generically. I think GPTBot is a different case and may be worth blocking separately, since it's not just a search engine crawler but a web scraper that places content in databases.
http://myanimelist.net/robots.txt
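
To make the idea concrete, here is a rough sketch of what a GPTBot-specific rule group could look like and how it would behave, checked with Python's standard urllib.robotparser. The paths (/forum/, /blog/, /reviews/, /includes/) and the example URLs are placeholders I made up for illustration, not MAL's actual URL layout or its current robots.txt; only the "GPTBot" user-agent token comes from the article linked above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the paths are illustrative assumptions,
# not MAL's real rules or URL structure.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /forum/
Disallow: /blog/
Disallow: /reviews/

User-agent: *
Disallow: /includes/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text directly, without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(agent: str, url: str, robots_txt: str = SAMPLE_ROBOTS_TXT) -> bool:
    """Return True if `agent` may fetch `url` under the given rules."""
    return build_parser(robots_txt).can_fetch(agent, url)


if __name__ == "__main__":
    checks = [
        ("GPTBot", "https://myanimelist.net/forum/some-topic"),
        ("GPTBot", "https://myanimelist.net/news/some-article"),
        ("Googlebot", "https://myanimelist.net/forum/some-topic"),
    ]
    for agent, url in checks:
        verdict = "allowed" if is_allowed(agent, url) else "blocked"
        print(f"{agent:10s} {url} -> {verdict}")
```

With rules like these, GPTBot would be kept out of the user-content sections while news articles stay crawlable, and ordinary search engine crawlers would still only be subject to the generic rules. Keep in mind robots.txt is only a request that well-behaved crawlers choose to honor.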
traed, Feb 15, 9:50 PM
