
Block GPTBot on parts of the site that robots.txt doesn't currently block

Feb 15, 9:46 PM
#1

GPTBot should be blocked from any user-submitted content: forum boards (forum posts in particular), blog entries, and news comments, which are just another place where forum comments appear. The news articles themselves are fine to crawl and scrape, in my opinion. Reviews could possibly be blocked too, since users might not want ChatGPT hijacking their reviews, though some might not mind, which is why I'm unsure about that one.
https://arstechnica.com/information-technology/2023/08/openai-details-how-to-keep-chatgpt-from-gobbling-up-website-data/

I know MAL already has a robots.txt, but it only generically blocks certain components of the site. I think GPTBot is a different case and there's good reason to block it specifically, since it's not just a search engine crawler but a web scraper that places content in databases.
http://myanimelist.net/robots.txt
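
To make the idea concrete, here is a rough sketch of per-bot rules and a quick way to check them with Python's standard `urllib.robotparser`. `GPTBot` is the user-agent token OpenAI documents for its crawler. The paths (`/forum/`, `/blog/`, `/reviews/`, `/news/`) and the test URLs are placeholders I made up for illustration, not MAL's actual URL layout, and the rules are not taken from the existing robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: every path here is an illustrative guess,
# not MAL's real URL structure or its current robots.txt.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /forum/
Disallow: /blog/
Disallow: /reviews/

User-agent: *
Disallow:
"""


def build_parser(text):
    """Parse robots.txt text held in memory (no network request)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Placeholder URLs used only to exercise the rules above.
forum_url = "https://myanimelist.net/forum/topic/1"
news_url = "https://myanimelist.net/news/1"

gptbot_forum = parser.can_fetch("GPTBot", forum_url)
gptbot_news = parser.can_fetch("GPTBot", news_url)
other_forum = parser.can_fetch("Googlebot", forum_url)

print("GPTBot may fetch forum page:", gptbot_forum)
print("GPTBot may fetch news page:", gptbot_news)
print("Googlebot may fetch forum page:", other_forum)
```

With rules like these, only GPTBot is kept out of the user-content sections, while news articles and other crawlers are unaffected. News comments would need their own rule if they live under a separate path.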
traed, Feb 15, 9:50 PM
