List of the most active web crawlers and their User-Agent strings

Allowing web crawlers to scan your site is a necessity if you want your web pages to appear in Google, Bing or other search results.

At the same time, excessive traffic from non-human visitors can be costly in terms of bandwidth and website stability, and can even lead to outages. To help you understand the web crawlers, bots and spiders visiting your site, we have published our latest list of the top 20 bots, including their User-Agent strings.

Detect all web crawlers and spiders

A web bot, or robot, is a piece of software that runs automated tasks which can be completed much more quickly or cheaply than if a human carried them out manually. Web crawlers, which automatically scan online content, are so deeply ingrained in the online world that you may be unaware of how much web traffic these “machines” generate. Google Analytics doesn't report on bots and crawlers by default, so you may not be able to see the full share of non-human traffic.

DeviceAtlas is a device detection API and a repository of web-enabled device profiles that works by parsing User-Agent strings. Clients visiting a website normally send a User-Agent (UA) string, and that includes non-human visitors, so DeviceAtlas can give you a full report on the amount of bot and crawler traffic to your website.
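Many crawlers announce themselves with a `(compatible; Name/version; URL)` clause inside the User-Agent string. As a minimal sketch of what parsing such a string can look like, the Python snippet below pulls the name and version out of that clause. The regular expression and the `crawler_token` helper are illustrative assumptions, not the DeviceAtlas API or its detection method, and a pattern this simple will miss bots that use other formats (for example `curl/7.35.0`).

```python
import re

# Matches the "(compatible; Name/version; ...)" convention used by many crawlers.
# Illustrative assumption only: this is not how DeviceAtlas parses User-Agents.
COMPATIBLE_RE = re.compile(r"\(compatible;\s*([^;/)]+?)(?:/v?([\w.]+))?\s*[;)]")


def crawler_token(user_agent):
    """Return (name, version) from a 'compatible' clause, or None if absent.

    The version is None when the clause carries a name without a version.
    """
    match = COMPATIBLE_RE.search(user_agent)
    if match is None:
        return None
    return match.group(1).strip(), match.group(2)


if __name__ == "__main__":
    samples = [
        "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
        "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)",
        "curl/7.35.0",
    ]
    for ua in samples:
        print(crawler_token(ua), "<-", ua)
```

Because anyone can send any User-Agent string, a result from a helper like this is a hint about the visitor rather than proof of its identity.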

To compile the list of the most active web crawlers, we analyzed traffic to thousands of DeviceAtlas-powered websites between January and March 2016 (Q1 2016). The following list of common web crawlers and robots is intended for reference and comparison only. Your server logs may show a different picture of non-human traffic depending on your audience profile, geographical location and other factors.

To help you identify these bots in your server logs, we also included User-Agent strings.
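As a rough sketch of how that identification might be done, the Python snippet below checks one access-log line against substrings taken from the User-Agent strings in this article. It assumes the widely used "combined" log layout, where the User-Agent is the last double-quoted field on each line. The `bot_in_log_line` helper and the token list are hypothetical names introduced for illustration, and plain substring matching is a simplification that can be fooled by spoofed User-Agents.

```python
import re

# Substrings drawn from the User-Agent strings listed in this article.
# More specific tokens come first so they win over more general ones.
KNOWN_BOT_TOKENS = [
    "AdsBot-Google", "MJ12bot", "Googlebot", "bingbot", "BingPreview",
    "SimplePie", "Yahoo! Slurp", "SiteLockSpider", "okhttp", "curl",
    "ips-agent", "BLEXBot", "YandexBot", "ScoutJet",
]

# Assumption: "combined" log format, where the User-Agent is the
# last double-quoted field on the line.
LAST_QUOTED_FIELD = re.compile(r'"([^"]*)"\s*$')


def bot_in_log_line(line):
    """Return the first known bot token found in the line's User-Agent, else None."""
    match = LAST_QUOTED_FIELD.search(line)
    if match is None:
        return None
    user_agent = match.group(1).lower()
    for token in KNOWN_BOT_TOKENS:
        if token.lower() in user_agent:
            return token
    return None


if __name__ == "__main__":
    # Made-up log line for illustration; the IP is from a documentation range.
    sample = ('203.0.113.7 - - [12/Feb/2016:10:01:22 +0000] "GET / HTTP/1.1" '
              '200 5120 "-" "Mozilla/5.0 (compatible; bingbot/2.0; '
              '+http://www.bing.com/bingbot.htm)"')
    print(bot_in_log_line(sample))
```

The comparison is case-insensitive because the same crawler family can appear with different capitalization, as `bingbot` and `BingPreview` do in the list.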

The most active bot was a search engine – but not Google

We identified all web crawlers and bots that appeared in our User-Agent-based statistics. The Majestic-12 bot (MJ12bot) was the most active, generating more traffic than any other bot, including those of Google and Bing. Majestic is a community-driven project that aims to build a search engine based on a distributed web crawler, which is the Majestic-12 bot that appears in our stats.

The following table lists the top 20 bots generating web traffic. Each entry gives the bot's traffic share, name, purpose and type, followed on the next line by its User-Agent string.

68.5% | MJ12bot | Search engine | Desktop bot
Mozilla/5.0 (compatible; MJ12bot/v1.4.5; http://www.majestic12.co.uk/bot.php?+)
16.8% | Googlebot | Search engine | Desktop bot
Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
3.2% | Googlebot | Search engine | Mobile bot
Mozilla/5.0 (iPhone; CPU iPhone OS 8_3 like Mac OS X) AppleWebKit/600.1.4 (KHTML, like Gecko) Version/8.0 Mobile/12F70 Safari/600.1.4 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
2.4% | Bingbot | Search engine | Desktop bot
Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)
1.4% | SimplePie | RSS | Desktop bot
SimplePie/1.3.1 (Feed Parser; http://simplepie.org; Allow like Gecko) Build/20121030175911
0.7% | Bingbot | Search engine | Mobile bot
Mozilla/5.0 (iPhone; CPU iPhone OS 7_0 like Mac OS X) AppleWebKit/537.51.1 (KHTML, like Gecko) Version/7.0 Mobile/11A465 Safari/9537.53 BingPreview/1.0b
0.6% | Yahoo! Slurp | Search engine | Desktop bot
Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)
0.4% | Bingbot | Search engine | Mobile bot
Mozilla/5.0 (iPhone; CPU iPhone OS 7_0 like Mac OS X) AppleWebKit/537.51.1 (KHTML, like Gecko) Version/7.0 Mobile/11A465 Safari/9537.53 (compatible; bingbot/2.0; http://www.bing.com/bingbot.htm)
0.4% | Googlebot Mobile | Search engine | Mobile bot
SAMSUNG-SGH-E250/1.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 UP.Browser/6.2.3.3.c.1.101 (GUI) MMP/2.0 (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)
0.4% | Googlebot Mobile | Search engine | Mobile bot
DoCoMo/2.0 N905i(c100;TB;W24H16) (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)
0.3% | Bingbot | Search engine | Mobile bot
Mozilla/5.0 (iPhone; CPU iPhone OS 7_0 like Mac OS X) AppleWebKit/537.51.1 (KHTML, like Gecko) Version/7.0 Mobile/11A465 Safari/9537.53 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)
0.3% | AdsBot Google Mobile | Search engine | Mobile bot
Mozilla/5.0 (iPhone; CPU iPhone OS 9_1 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Version/9.0 Mobile/13B143 Safari/601.1 (compatible; AdsBot-Google-Mobile; +http://www.google.com/mobile/adsbot.html)
0.2% | SiteLockSpider | Security | Desktop bot
SiteLockSpider [en] (WinNT; I ;Nav)
0.2% | OkHttp | Various purposes | Desktop bot
okhttp/2.5.0
0.2% | Curl | Various purposes | Desktop bot
curl/7.35.0
0.1% | Ips Agent | Market research | Mobile bot
Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:14.0; ips-agent) Gecko/20100101 Firefox/14.0.1
0.1% | Googlebot | Search engine | Desktop bot
Googlebot-Image/1.0
0.1% | BLEXBot | Market research | Desktop bot
Mozilla/5.0 (compatible; BLEXBot/1.0; +http://webmeup-crawler.com/)
0.1% | Yandex Bot | Search engine | Desktop bot
Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)
0.1% | ScoutJet | Search engine | Desktop bot
Mozilla/5.0 (compatible; ScoutJet; +http://www.scoutjet.com/)
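For comparison with your own logs, shares like the percentages above could be tallied from a sequence of per-request bot labels. The Python sketch below is one way to do that. It is an assumption about the arithmetic only, since the article does not describe how the published figures were computed, and `bot_shares` is a hypothetical helper name.

```python
from collections import Counter


def bot_shares(bot_names):
    """Return {bot: percentage of all labelled bot requests}, most active first.

    bot_names holds one label per request, e.g. ["MJ12bot", "Googlebot", ...].
    Percentages are rounded to one decimal place, as in the list above.
    """
    counts = Counter(bot_names)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: round(100 * n / total, 1) for name, n in counts.most_common()}


if __name__ == "__main__":
    # Toy input for illustration, not real traffic data.
    labels = ["MJ12bot"] * 3 + ["Googlebot"]
    for name, share in bot_shares(labels).items():
        print(f"{share}% | {name}")
```

Requests without a recognized bot label should be filtered out before calling the function, otherwise the result is a share of all traffic rather than of bot traffic.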

Start parsing User-Agent strings

DeviceAtlas is a high-speed solution for parsing User-Agent strings. Some of the largest companies in the online space use it to:

  • Optimize website content for mobile, tablet, and other devices
  • Reduce website loading time and minimize page weight
  • Target ads and analyze web traffic

Get started with User-Agent parsing by testing a locally-installed version of DeviceAtlas at no cost.
