 new crawler bots overloading the site :-( 
Posted on: Tue Sep 12, 2023 1:21 pm

A while after this issue (viewtopic.php?f=18&t=1475), I'm finding that "bing" is _still_ near the top among the bots crawling the site [thumbdn]

There is this "crawl delay" parameter I can use to lower the rate, but only up to a max of 20 seconds (it is now set at 10).
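For illustration, here is a minimal sketch of how such a delay could be checked, assuming the "crawl delay" parameter refers to the non-standard `Crawl-delay` directive in robots.txt (the post does not say where the setting lives). The robots.txt content and the helper name `crawl_delay_for` are made up for the example; only the 10-second value comes from the post.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; only the 10-second value is from the post.
ROBOTS_TXT = """\
User-agent: bingbot
Crawl-delay: 10

User-agent: *
Disallow:
"""


def crawl_delay_for(robots_txt, user_agent):
    """Return the Crawl-delay (in seconds) that applies to user_agent, or None."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


if __name__ == "__main__":
    print("bingbot:", crawl_delay_for(ROBOTS_TXT, "bingbot"))
    print("Googlebot:", crawl_delay_for(ROBOTS_TXT, "Googlebot"))
```

Not every crawler honors `Crawl-delay`, so a check like this only shows what the file asks for, not what a bot will actually do.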

A newcomer is the "GPTBot" crawler (from OpenAI, of ChatGPT fame), with about 74000 visits in a week [mad]

The current top 10, approximate number of visits in a week:

1. GPTBot, 74000
2. Ahrefs, 56000
3. Bing, 48000
4. dotbot (Moz), 29000
5. Yandex, 18000
6. comscore, Bytespider (TikTok), 5000
7. Grapeshotcrawler (Oracle), 4000
8. Googlebot, Amazon bot, 3000
9. Yahoo Japan, 2500
10. peer39, 2000
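A tally like the one above can be approximated from the web server's access log. The sketch below is not how the numbers in this post were produced (the post doesn't say); it assumes a "combined"-style log where the User-Agent is the last quoted field, and the marker list, function name and sample lines are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical substrings to look for in the User-Agent field.
BOT_MARKERS = ["GPTBot", "AhrefsBot", "bingbot", "DotBot", "YandexBot", "Bytespider", "Googlebot"]

# Assumption: the User-Agent is the last double-quoted field on each line.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')


def count_bot_visits(log_lines):
    """Count log lines per bot marker; lines matching no marker are ignored."""
    counts = Counter()
    for line in log_lines:
        match = UA_PATTERN.search(line)
        if not match:
            continue
        user_agent = match.group(1).lower()
        for marker in BOT_MARKERS:
            if marker.lower() in user_agent:
                counts[marker] += 1
                break
    return counts


# Made-up sample lines, using documentation-only IP addresses.
SAMPLE_LOG = [
    '192.0.2.1 - - [12/Sep/2023:13:21:00 +0000] "GET /forum/ HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)"',
    '192.0.2.2 - - [12/Sep/2023:13:21:02 +0000] "GET / HTTP/1.1" 200 1024 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"',
    '192.0.2.1 - - [12/Sep/2023:13:21:05 +0000] "GET /puzzle HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)"',
    '192.0.2.3 - - [12/Sep/2023:13:21:09 +0000] "GET / HTTP/1.1" 200 1024 "-" "Mozilla/5.0 (X11; Linux x86_64) Firefox/117.0"',
]

if __name__ == "__main__":
    for bot, visits in count_bot_visits(SAMPLE_LOG).most_common():
        print(bot, visits)
```

Matching on the User-Agent string trusts whatever the client claims to be, so counts like these are only approximate.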

I'll most likely block some of these to reduce server load.

edit: I blocked a number of them (GPTbot, Ahrefs, dotbot, Yandex, comscore, Bytespider, Grapeshot), so this should help [smile]
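The post doesn't say how the block was done. One common option is a `Disallow: /` group per bot in robots.txt, which only helps with crawlers that respect it. Below is a rough sketch of that approach. The user-agent tokens are my guesses based on the names in the list (check each operator's documentation for the real token), and `build_robots_txt` and `is_allowed` are hypothetical helpers.

```python
from urllib.robotparser import RobotFileParser

# Assumed user-agent tokens; the real ones may differ per crawler.
BLOCKED = ["GPTBot", "AhrefsBot", "dotbot", "Yandex", "comscore", "Bytespider", "GrapeshotCrawler"]


def build_robots_txt(blocked_agents):
    """Build a robots.txt that disallows everything for the listed agents only."""
    groups = [f"User-agent: {agent}\nDisallow: /\n" for agent in blocked_agents]
    # Everyone else stays allowed (an empty Disallow means "no restriction").
    groups.append("User-agent: *\nDisallow:\n")
    return "\n".join(groups)


def is_allowed(robots_txt, user_agent, url="/"):
    """Check whether robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    robots_txt = build_robots_txt(BLOCKED)
    print(robots_txt)
    for agent in ["GPTBot", "Yandex", "bingbot", "Googlebot"]:
        print(agent, "allowed" if is_allowed(robots_txt, agent) else "blocked")
```

Bots that ignore robots.txt would need to be refused at the web server or firewall level instead.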

With Yandex blocked, the site will at some point drop out of the Yandex search results, so it will be harder to find from Russia.

edit 2: the actual Google crawl rate is closer to 25000 (judging from the stats in their "search console").
You can manually set a lower crawl rate there (although the page with that setting is hard to find).
The lower rate automatically expires in 3 months (?!) [huh]

All forum contents © Patrick Min, and by the post authors.

Forum software phpBB © 2000, 2002, 2005, 2007 phpBB Group.
Designed by STSoftware.