How to Detect Bot Traffic On Your Website

No matter how big your website is, you’re almost guaranteed to receive bot traffic at some point. These bots are often up to a range of different things on your site, from indexing web pages to scraping your content. With so many different bots out there, how can you detect bot traffic on your website? And should you be concerned?

To help you see if bots are affecting your website and its performance, we’re taking a look at how you can detect bot traffic.

If you’ve noticed an increase in page load times, a higher bounce rate or lower average time on page, then you could have bots visiting your site. While not all robots should be blocked, there are plenty out there that could be doing malicious things to your site.

Before we dive into how to detect these bots on your site, let’s take a look at the various types of bots out there.

The Different Types Of Bots


To explain bots simply, they can be defined as software applications written to perform specific, repetitive tasks. This could be anything from checking the prices on a website every minute to posting a new comment every hour. The whole aim of a bot is to take a repetitive task and automate it. The type of task a bot carries out helps define whether it is good or bad. As you’ve probably realised, not all online bots are bad, and some of them are actually essential to keeping the internet running.

If you read our last post on What Percentage of Internet Users are Robots?, then you should already know about the various different types of robots. To give you a quick summary, internet robots can be split into two groups: good bots and bad bots.

Good bots do things such as index your website in search engines, monitor your website’s health, and fetch RSS feeds.

Bad bots, on the other hand, do things such as scrape your links and content, post spam messages, or attempt to disrupt your site.

With so many new bots being made and launched almost every day, keeping track of bots can be a tricky task. But how exactly can you tell if you’ve been visited by a good or bad bot?

Detecting Bot Traffic


When it comes to detecting bot traffic, there are actually several ways you can check to see who’s visiting your site. Some methods are easy and will give you a quick overview of whether you’re being swarmed by bots, while other methods take a lot longer because you have to analyse the data yourself. Here are some of the most effective ways to detect bot traffic on your site:

The first way to check your website for bots is to check your Google Analytics stats for any inconsistencies. By paying attention to the number of page views, average session duration and referrers, you can quickly work out if bots are visiting you, and how frequently.

One of the most obvious things you’ll notice when being visited by bots is a sharp increase in the number of page views. If a robot is crawling your entire website, it will load up countless pages in a very short space of time. If your average pages per visit is 3 and you suddenly see a single visitor hit all 50 pages of your site, then it’s probably a bot.
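If you prefer to look at raw server logs rather than the Analytics interface, a rough script can surface this kind of spike. The sketch below is a minimal example, assuming a standard combined-format access log at logs/access.log and a made-up threshold of 50 pages per hour; both are assumptions you’d adjust for your own site.

    import re
    from collections import Counter

    LOG_PATH = "logs/access.log"      # assumed location of your access log
    PAGES_PER_HOUR_THRESHOLD = 50     # made-up threshold; tune for your traffic

    # Matches the IP, timestamp, method and path of a combined-format log entry
    line_re = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)')

    hits = Counter()
    with open(LOG_PATH) as log:
        for line in log:
            match = line_re.match(line)
            if not match:
                continue
            ip, timestamp, method, path = match.groups()
            # Count page requests only, ignoring static assets
            if method == "GET" and not path.endswith((".css", ".js", ".png", ".jpg")):
                hour = timestamp[:14]   # e.g. "10/Oct/2023:13" - bucket hits per hour
                hits[(ip, hour)] += 1

    for (ip, hour), count in hits.most_common():
        if count > PAGES_PER_HOUR_THRESHOLD:
            print(f"{ip} requested {count} pages during {hour} - possibly a bot")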

Other metrics you should be checking are the average session duration and bounce rate. If you notice your average session duration decreasing and your bounce rate increasing, then this is a sign you’re being visited by bots.

Since bots are incredibly quick, they usually only take a few seconds to crawl your site and get the information they need. Compared to a typical user, a bot’s session duration is likely to be a lot lower. Once the bot has finished crawling the page, it will leave and move on to the next site. This has a big effect on your bounce rate. By leaving without visiting another page, the bot’s visit is classed by Google as a bounce, even though it was never a real user! Over time, this can wear down your Google metrics. Paying attention to changes in these metrics can give you a heads-up that your website is experiencing significant bot traffic.
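To make the effect on your metrics concrete, here’s a small, purely hypothetical calculation showing how a burst of two-second, single-page bot visits drags the averages down. Every number is made up for illustration.

    # Hypothetical illustration of how bot visits skew bounce rate and
    # average session duration. All numbers below are made up.
    human_sessions = [(180, 4)] * 100   # (seconds on site, pages viewed) for real users
    bot_sessions = [(2, 1)] * 50        # bots: ~2 seconds, one page, then gone

    def report(sessions, label):
        bounces = sum(1 for _, pages in sessions if pages == 1)
        bounce_rate = 100 * bounces / len(sessions)
        avg_duration = sum(seconds for seconds, _ in sessions) / len(sessions)
        print(f"{label}: bounce rate {bounce_rate:.0f}%, average duration {avg_duration:.0f}s")

    report(human_sessions, "Humans only")               # 0% bounce rate, 180s average
    report(human_sessions + bot_sessions, "With bots")  # ~33% bounce rate, ~121s average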

Another way to detect bot traffic on your website is to pay attention to the speed of your site. If you’re experiencing a massive influx of bots, then you’ll probably notice your site loading more slowly. One bot might not make much difference, but several bots hitting you at the same time can start to strain your server. There’s even a chance that malicious bots are deliberately attempting to overrun your server and take it offline. Known as a DDoS attack, these attacks can have devastating effects on businesses, especially when the website is their primary channel for doing business and receiving orders.

If you’ve checked your analytics and noticed some unusual metrics or slow page load times, then you could be under attack by bots. Luckily there are several ways you can ban them from your site forever.

How to Stop Bots Visiting Your Site


When it comes to blocking bots from visiting your website, you have several options. The first option is to create a robots.txt file and specify which robots you don’t want entering your website. However, this only works for well-behaved bots. Some bots completely ignore this file and visit your website regardless of whether you’ve told them not to. In order to block the naughty bots, you’ll want to use another form of protection that shields your website. Let’s take a look at creating a robots.txt file first.

Block Bots With Robots.txt


For those who haven’t heard of a robots.txt file, it’s easy to understand. A robots.txt file basically tells robots what they can and can’t visit. If a robots.txt file doesn’t exist, then any robot will be able to visit your website and crawl your pages. But if you do have a robots.txt file, then most robots will check it first to see what they can and can’t crawl.

The primary reason for having a robots.txt is to tell search bots which pages you do, and don’t, want indexed. Since Google’s Googlebot always looks for a robots.txt before it does any crawling, this tells the bot which pages it can access. If you have a private section of your website that you don’t want crawled by Google, then you simply disallow that directory within the robots.txt, as shown in the example below.
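Here’s roughly what that looks like for a hypothetical /private/ directory (the directory name is just an example):

    User-agent: *
    Disallow: /private/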

If you want to take it one step further and block ALL robots from accessing your site, then you can set up a robots.txt like the one below. This tells every bot that accesses your site to go away. It might sound like a good thing, but be careful: if Googlebot can’t crawl your website, then you’ll disappear from Google!
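    User-agent: *
    Disallow: /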

To define which bots can and can’t access your website, you need to specify them by their user agent. Luckily, there’s an excellent guide by KeyCDN on how to create a robots.txt file and block robots. To find the user agents of the bad or annoying bots, you’ll have to do some further research into who’s visiting you. A useful trick is to copy entries from other websites’ robots.txt files until you have a nice list of user agents you want to block.
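As a rough illustration, a robots.txt that turns away a couple of commonly blocked crawlers by user agent while leaving everyone else alone might look like this (AhrefsBot and MJ12bot are only examples here; whether you block a given bot depends on your own research):

    User-agent: AhrefsBot
    Disallow: /

    User-agent: MJ12bot
    Disallow: /

    User-agent: *
    Disallow: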

However, it’s important to remember that not all bots will pay attention to this file. Some bots will still visit your website regardless of whether you allow them or not. If you want to block them, then you’ll need to use a different type of protection.

Blocking Bots With Anti-DDoS Protection


To block the bad bots from accessing your site, you’ll need some sort of DDoS protection service. This service basically shields your website like a firewall and checks every incoming request. If the IP address and user agent match those of a known bad bot, then the request will be blocked. If the request is from a genuine user with no previous malicious activity, then they’ll be allowed in.
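To illustrate the idea (this is a simplified sketch, not how any particular service actually implements it), the core check might look something like the snippet below. The blocked IPs and user-agent keywords are made-up example values.

    # Simplified sketch of the kind of check a protection service performs.
    # The blocklists and example requests below are made up for illustration.
    BLOCKED_IPS = {"203.0.113.7", "198.51.100.23"}           # example "known bad" IPs
    BLOCKED_AGENT_KEYWORDS = ("scrapy", "curl", "badbot")    # example user-agent fragments

    def should_block(ip: str, user_agent: str) -> bool:
        """Return True if the request matches a known bad IP or user agent."""
        if ip in BLOCKED_IPS:
            return True
        agent = user_agent.lower()
        return any(keyword in agent for keyword in BLOCKED_AGENT_KEYWORDS)

    print(should_block("203.0.113.7", "Mozilla/5.0"))                        # True - blocked IP
    print(should_block("192.0.2.10", "Scrapy/2.11 (+https://scrapy.org)"))   # True - blocked user agent
    print(should_block("192.0.2.10", "Mozilla/5.0 (Windows NT 10.0)"))       # False - allowed through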

One of the most popular free DDoS protection services is from CloudFlare.com. Their free package gives your site protection from a lot of annoying bots. In addition, if you combine it with the robots.txt file mentioned earlier, you’ll be blocking over 90% of the bad bots out there.

By blocking most of the malicious bots from visiting your website, you should notice faster load times, lower bandwidth costs, and better analytic metrics.

Protect Your Ads

Protecting your site from malicious robots is one thing, but protecting your ads is another challenge altogether.

Unfortunately, not every website owner knows how to protect their website from robots. This means that if you’re running a Google display network campaign on AdWords, then your adverts could be at risk of click fraud.

Just think about it: all those bad bots out there are scraping millions of sites a day. Eventually, they’ll end up scraping a website that has your ads displayed on it. Since they crawl every clickable link on a web page, it’s highly likely they’ll end up clicking your ads. Not only does this cost you money, but it also ruins your statistics: it makes you think people are interested in your advert and lures you into believing you’re getting genuine clicks.

Luckily, click fraud can be prevented with the right protection tools. To see how much you can save on your AdWords campaigns, try out our free 30-day trial below.