Here are all the bots, indexers, and scanners seen in the first 4 hours of today on just one web server:
13639 Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html) << Way too high for less than 4 hours of logging
656 Mozilla/5.0 (compatible; DotBot/1.1; http://www.dotnetdotcom.org/, crawler@dotnetdotcom.org) << Way too high for less than 4 hours of logging
305 Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)
270 Baiduspider+(+http://www.baidu.com/search/spider.htm)
236 Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)
205 Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)
156 Mozilla/5.0 (compatible; Purebot/1.1; +http://www.puritysearch.net/)
101 Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)
86 StackRambler/2.0 (MSIE incompatible)
85 The Incutio XML-RPC PHP Library -- WordPress/3.1
79 Mozilla/5.0 (en-us) AppleWebKit/525.13 (KHTML, like Gecko; Google Web Preview) Version/3.1 Safari/525.13
55 Mozilla/5.0 (compatible; Ezooms/1.0; ezooms.bot@gmail.com)
40 Sosospider+(+http://help.soso.com/webspider.htm)
38 Mozilla/4.0 (compatible; Powermarks/3.5; Windows 95/98/2000/NT)
36 Mozilla/4.7 (compatible; OffByOne; Windows 2000) Webster Pro V3.4
34 Googlebot-Image/1.0
33 Mozilla/4.0 (compatible; MSIE 5.0; Windows NT; DigExt)
30 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; XMPP Tiscali Communicator v.10.0.2; .NET CLR 2.0.50727)
28 Mozilla/5.0 (compatible; discobot/1.1; +http://discoveryengine.com/discobot.html
24 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; MRA 4.6 (build 01425))
24 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; KKman2.0)
23 WordPress/3.1; http://thegeekoftheworld.com
23 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; TheFreeDictionary.com; .NET CLR 1.1.4322; .NET CLR 1.0.3705; .NET CLR 2.0.50727)
23 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; InfoPath.1)
22 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Deepnet Explorer 1.5.0; .NET CLR 1.0.3705)
20 Mozilla/5.0 (compatible; MJ12bot/v1.3.3; http://www.majestic12.co.uk/bot.php?+)
18 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322) Babya Discoverer 8.0:
17 Mozilla/3.0 (compatible; WebCapture 2.0; Auto; Windows)
16 Mozilla/4.0 (compatible; MSIE 4.01; Digital AlphaServer 1000A 4/233; Windows NT; Powered By 64-Bit Alpha Processor) << Cool
16 HuaweiSymantecSpider/1.0+DSE-support@huaweisymantec.com+(compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR ; http://www.huaweisymantec.com/en/IRL/spider)
15 Mozilla/5.0 (compatible; Exabot/3.0; +http://www.exabot.com/go/robot)
15 Mozilla/5.0 (compatible; BlogScope/1.0; +http://www.blogscope.net/; U of Toronto)
14 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; KTXN)
13 Mozilla/5.0 (compatible; YandexImages/3.0; +http://yandex.com/bots)
12 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; Crazy Browser 2.0.0 Beta 1; .NET CLR 1.0.3705; .NET CLR 1.1.4322)
12 Mozilla/4.0 (compatible; MSIE 5.01; Windows 95; MSIECrawler)
11 Zoundry Raven (www.zoundry.com); zpypatch.xmlrpclib.py/1.0.1
11 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1 + FairShare-http://fairshare.cc)
11 MLBot (www.metadatalabs.com/mlbot)
10 SAMSUNG-SGH-E250/1.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 UP.Browser/6.2.3.3.c.1.101 (GUI) MMP/2.0 (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)
9 DoCoMo/2.0 N905i(c100;TB;W24H16) (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)
8 Snapbot/1.0 (Snap Shots, +http://www.snap.com)
8 Rome Client (http://tinyurl.com/64t5n) Ver: 0.9
8 Mozilla/4.0 (compatible; Win32; WinHttp.WinHttpRequest.5)
7 PostRank/2.0 (postrank.com; 1 subscribers)
6 xpymep.exe
6 Mozilla/5.0 (compatible; YandexBot/3.0; MirrorDetector; +http://yandex.com/bots)
6 KSCrawler/Nutch-1.0 (http://www.kindsight.net/en/kscrawler; crawler@kindsight.net)
6 BlogPulseLive (support@blogpulse.com)
5 Mozilla/5.0 (compatible; YandexMedia/3.0; +http://yandex.com/bots)
5 Mozilla/5.0 (compatible; SiteBot/0.1; +http://www.sitebot.org/robot/)
5 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR ; http://www.talktalk.co.uk/products/virus-alerts/)
4 Yeti/1.0 (NHN Corp.; http://help.naver.com/robots/)
4 WordPress/3.0; http://myirctools.info
4 Twitterbot/0.1 << Twitter has a bot? ok…
4 Mozilla/5.0 (compatible; Twingly Recon; twingly.com)
4 Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.2.13) Gecko/20101203 YFF35 Firefox/3.6.13 ( .NET CLR 3.5.30729; .NET4.0C) SearchToolbar/1.2
4 Mozilla/5.0 (Windows; U; Windows NT 5.1; en; rv:1.9.0.13) Gecko/2009073022 Firefox/3.5.2 (.NET CLR 3.5.30729) SurveyBot/2.3 (DomainTools)
4 Java/1.6.0_16
4 Jakarta Commons-HttpClient/3.1
3 ping.blo.gs/2.0
3 Y!J-BRO/YFSJ crawler (compatible; Mozilla 4.0; MSIE 5.5; http://help.yahoo.co.jp/help/jp/search/indexing/indexing-15.html; YahooFeedSeekerJp/2.0)
3 SocialSearcher/0.1 Mozilla/5.0
3 R6_FeedFetcher(www.radian6.com/crawler)
3 NSV Player/0.0 (ultravox/2.0)
3 Mozilla/5.0 (compatible; MSIE 6.0b; Windows NT 5.0) Gecko/2009011913 Firefox/3.0.6 TweetmemeBot
3 Mozilla/5.0 (compatible; Birubot/1.0) Gecko/2009032608 Firefox/3.0.8
3 Moreoverbot/5.1 (+http://w.moreover.com; webmaster@moreover.com) Mozilla/5.0
3 Jakarta Commons-HttpClient/3.0
2 wikiwix-bot-3.0
2 webcollage/1.135a
2 ping.wordblog.de/ping/1.0
2 msnbot-media/1.1 (+http://search.msn.com/msnbot.htm)
2 msnbot/2.0b (+http://search.msn.com/msnbot.htm)._
2 ichiro/5.0 (http://help.goo.ne.jp/door/crawler.html)
2 ichiro/4.0 (http://help.goo.ne.jp/door/crawler.html)
2 Voyager/1.0
2 SeznamBot/3.0-beta (+http://fulltext.sblog.cz/), I
2 SaladSpoon/ShopSalad 1.0 (Search Engine crawler for ShopSalad.com; http://shopsalad.com/en/partners.html; crawler AT shopsalad.com)
2 R6_CommentReader(www.radian6.com/crawler)
2 PycURL/7.18.2
2 Mozilla/5.0 (compatible; YodaoBot/1.0; http://www.yodao.com/help/webmaster/spider/; )
2 Mozilla/5.0 (compatible; Windows NT 6.0) Gecko/20090624 Firefox/3.5 NjuiceBot
2 Mozilla/5.0 (compatible; Najdi.si/3.1)
2 Mozilla/5.0 (compatible; Exabot-Images/3.0; +http://www.exabot.com/go/robot)
2 Mozilla/5.0 (compatible; Butterfly/1.0; +http://labs.topsy.com/butterfly/) Gecko/2009032608 Firefox/3.0.8
2 Mozilla/5.0 (compatible; Bender; http://sites.google.com/site/bendercrawler)
2 Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.7) Gecko/20091221 Firefox/3.5.7 OneRiot/1.0 (http://www.oneriot.com)
2 Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) Speedy Spider (http://www.entireweb.com/about/search_tech/speedy_spider/)
2 Mozilla/4.0 (compatible; Netcraft Web Server Survey)
2 Morfeus Fucking Scanner
2 Lynx/2.8.8dev.5 libwww-FM/2.14 SSL-MM/1.4.1 GNUTLS/2.8.6
2 HuaweiSymantecSpider/1.0+DSE-support@huaweisymantec.com+(compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR ; http://www.huaweisymantec.com/cn/IRL/spider)
2 Asynchronous WinHTTP Demo/1.0
1 anonymous
1 Yahoo! Slurp China
1 Wget/1.9+cvs-stable (Red Hat modified)
1 Wget/1.12 (linux-gnu)
1 Twingly Recon
1 Tsinghua AI Lab Robot 2.0
1 SocialSearcher/0.1 Mozilla/5.0
1 Mozilla 5.0 (compatible; Feedsky crawler /1.0; http://www.feedsky.com)
1 Mozilla/5.0 (en-us) AppleWebKit/525.13 (KHTML, like Gecko; Google Wireless Transcoder) Version/3.1 Safari/525.13QA
1 Mozilla/5.0 (compatible; suggybot v0.01a, http://blog.suggy.com/was-ist-suggy/suggy-webcrawler/)
1 Mozilla/5.0 (compatible; Evrinid Iudex 1.0.0; +http://www.evri.com/evrinid)
1 Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)
1 Mozilla/4.0 (compatible; ICS)
1 Mozilla/3.0 (compatible; Indy Library)
1 Java/1.6.0_14
1 Baiduspider+(+http://www.baidu.jp/spider/)
1 Aynchronous WinHTTP Demo/1.0
As you can see, Googlebot and DotBot (from dotnetdotcom.org) are connecting a lot and requesting a lot of data.
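For anyone wanting to build the same kind of tally, here is a rough PHP sketch. It assumes the access log is in the "combined" format, where the user agent is the last double-quoted field on each line. The function name and the log path in the usage comment are made up for illustration.

```php
<?php
// Tally User-Agent strings from access-log lines.
// Assumption: "combined" log format; the post does not say which format was used.

/**
 * Hypothetical helper: returns [user agent => hit count], highest count first.
 */
function tally_user_agents(iterable $lines): array
{
    $counts = [];
    foreach ($lines as $line) {
        // In combined format the user agent is the last double-quoted field.
        if (preg_match('/"([^"]*)"\s*$/', $line, $m)) {
            $agent = $m[1];
            $counts[$agent] = ($counts[$agent] ?? 0) + 1;
        }
    }
    arsort($counts); // sort by count, descending, keeping the keys
    return $counts;
}

// Usage (the path is a placeholder, not the real server's path):
// foreach (tally_user_agents(new SplFileObject('/var/log/apache2/access.log')) as $ua => $n) {
//     echo "$n $ua\n";
// }
```

Lines that do not end in a quoted field are skipped, and user agents containing escaped quotes would need a stricter parser.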
Because of how the backend is set up, network URLs like irc.irc-coolness.tld would break the database, so to fix this issue we are hashing the &server= value.
Later on I will update the whole system and give the front end some much-needed love.
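As a rough idea of what hashing the server value could look like, here is a minimal PHP sketch. The post does not say which hash function is used or how the result is stored, so sha1 and the function name are assumptions for illustration only.

```php
<?php
// Assumption: sha1 is used purely as an example hash; the real backend
// may use something else entirely.

/**
 * Hypothetical helper: turn a network hostname into a fixed-length key
 * made of 40 lowercase hex characters.
 */
function server_key(string $server): string
{
    // Normalise first so "IRC.Example.TLD " and "irc.example.tld" match.
    $normalized = strtolower(trim($server));
    return sha1($normalized);
}

$key = server_key('irc.irc-coolness.tld');
// $key contains only [0-9a-f], so the dots and dashes in the hostname
// never reach the database identifier.
```

The point of the design is that whatever characters appear in the hostname, the stored key always has the same length and alphabet.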
URLs like:
http://myirctools.info/api/rss/?server=irc.serverhere.tld&channel=channelhere
become:
http://api.myirctools.info/rss/irc.serverhere.tld/channelhere

And this:
http://myirctools.info/channellist/rss/?topiccolor=FFFFFF&bgcolor=000000
becomes:
http://api.myirctools.info/channellist/irc.serverhere.tld/FFFFFF:000000

More coming soon.
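If you have old RSS links saved somewhere, a conversion along these lines should work. This is only a sketch of the RSS mapping shown above; the function name is hypothetical and the real API may handle edge cases differently.

```php
<?php
// Sketch of the old-to-new RSS URL mapping. Only the two URL shapes come
// from the post; the function itself is an illustration.

function new_rss_url(string $oldUrl): ?string
{
    $query = parse_url($oldUrl, PHP_URL_QUERY);
    if (!is_string($query)) {
        return null; // no query string, nothing to convert
    }
    parse_str($query, $params);
    $server  = $params['server'] ?? '';
    $channel = $params['channel'] ?? '';
    if (!is_string($server) || !is_string($channel) || $server === '' || $channel === '') {
        return null; // both parameters are required
    }
    return 'http://api.myirctools.info/rss/'
        . rawurlencode($server) . '/'
        . rawurlencode($channel);
}

echo new_rss_url('http://myirctools.info/api/rss/?server=irc.serverhere.tld&channel=channelhere'), "\n";
// prints http://api.myirctools.info/rss/irc.serverhere.tld/channelhere
```

The channel list URL is left out here because the new form includes a server name that the old URL does not carry.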
If you have a WordPress website and would like to visit a page over SSL without the links on that page pointing back to non-SSL URLs, try the following.
*Note that the following ties into the core of WordPress, meaning there are no bulky plugins doing this.
Look inside your wp-config.php file and locate the following:
define('WP_HOME', 'http://websitename.tld'); // blog url
define('WP_SITEURL', 'http://websitename.tld'); // site url
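Remove it and add the following:
// HTTPS may be missing entirely, or set to "off" on some servers (e.g. IIS)
if (!empty($_SERVER['HTTPS']) && $_SERVER['HTTPS'] !== 'off') {
    $connecttype = "https";
} else {
    $connecttype = "http";
}
define('WP_HOME', $connecttype.'://websitename.tld'); // blog url
define('WP_SITEURL', $connecttype.'://websitename.tld'); // site url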
There you go: all of your plugins and pages will stay on SSL if you enter your site over SSL at any time.
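If you want to check how the HTTPS test behaves without a web server, the check can be pulled into a small function that takes a $_SERVER-style array. This is a hypothetical helper of my own naming, not something WordPress provides.

```php
<?php
// Hypothetical helper, not part of WordPress: decides the URL scheme from
// a $_SERVER-style array so the logic can be tried out on its own.

function request_scheme(array $server): string
{
    // Some servers (e.g. IIS) set HTTPS to "off" for plain HTTP requests,
    // and on others the key is absent entirely.
    $https = (string) ($server['HTTPS'] ?? '');
    return ($https !== '' && strtolower($https) !== 'off') ? 'https' : 'http';
}

// In wp-config.php this would be called as request_scheme($_SERVER).
```

Note that a site behind a reverse proxy or load balancer that terminates SSL may never see the HTTPS value at all, so this check alone would report http there.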
Also try adding:
define('FORCE_SSL_ADMIN', true);
if you would like the admin area to always be served over SSL.