If you are a publisher, no doubt you have invalid traffic on your site. The definition of invalid traffic (IVT) covers a wide range of traffic, both legitimate and fraudulent. It can encompass non-human traffic like search engine spiders and bots, malicious programs like malware, or even human activity that produces fraudulent traffic like false clicks on advertisements.
The Media Rating Council (MRC) breaks IVT down into two main categories: General Invalid Traffic and Sophisticated Invalid Traffic. Simple definitions of the two are as follows:
General IVT: Known sources of invalid traffic like data center traffic and spiders or crawlers.
Sophisticated IVT: Bots or spiders masquerading as legitimate users, hidden ads, hijacked devices, hijacked sessions, hijacked ad tags, hijacked creatives, adware, malware, manipulation of measurements and much, much more.
Detecting and filtering IVT is one challenge; eliminating it, or preventing it from entering the site in the first place, requires a different set of methods.
Detection and filtering have been mainstreamed by programs such as Moat, comScore AdEffx, IAS, DoubleVerify, and others. These programs are good at identifying IVT and filtering it out of ad metrics, which helps advertisers understand the human traffic that has actually seen or clicked on their ads. With the human traffic identified, advertisers need only pay for real traffic.
For publishers, however, detecting IVT is only the first step toward eliminating it from a website. eHealthcare Solutions has been monitoring IVT on our vast healthcare network since early 2015, and we've seen an increase in IVT over the past three years. That increase could reflect a genuine rise in bad traffic, or detection technology that has simply become better at tracking and finding it. Never have we found one of our publishers purposefully inflating traffic; more often, the publisher simply doesn't know how to eliminate these bad actors from their site, because doing so can be a complex, technical undertaking.
Nevertheless, it is important that publishers take steps to understand how to stop or block this bad traffic. Last year, I wrote an article that contains important first steps to securing a website: Publishers, Have You Been Hacked?.
However, more recently, we discovered that blocking certain IP addresses within the .htaccess file is the next step to eliminating IVT. This is a more sophisticated method, but extremely important to the security and protection of your website. Here are the steps for blocking IP addresses:
NOTE: Don’t try this at home, kids. This process should not be entered into lightly and should be handled by an IT professional.
- Obtain a list of bad IP addresses from your IVT filtration program (Moat or others).
- If you cannot obtain this, you may find them through your log files. Here is an article that explains how to find the “User Agents” that are hitting the site most often: 3 Steps To Find And Block Bad Bots. These are going to be the offenders you wish to block.
- Once the bad IP addresses have been identified, you will want to block them from accessing your site. If your site is on an Apache web server, you can block the offending IP addresses in your .htaccess file. Here is an article that explains this process: How to Block Unwanted Bots from Your Website with .htaccess.
- Here are a few published bad-bot lists that you may also want to put to use:
Blocking robots on your web page – the list of 1800 bad bots
Blocking Bad Bots and Scrapers with .htaccess
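Taken together, the steps above amount to: mine the access log for the heaviest-hitting IPs and user agents, then turn the confirmed offenders into .htaccess rules. Here is a minimal Python sketch of that workflow; the log format (Apache "combined"), the request threshold, and the function names are illustrative assumptions, not part of any of the tools or articles mentioned above.

```python
import re
from collections import Counter

# A minimal sketch, assuming the Apache "combined" log format.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<request>[^"]*)" '
    r'\d{3} \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def top_offenders(log_lines, threshold=1000):
    """Count requests per (IP, user agent) pair and return the
    pairs at or above the threshold, heaviest hitters first."""
    hits = Counter()
    for line in log_lines:
        m = LOG_LINE.match(line)
        if m:
            hits[(m.group("ip"), m.group("agent"))] += 1
    return [(ip, agent, n) for (ip, agent), n in hits.most_common()
            if n >= threshold]

def htaccess_rules(bad_ips):
    """Emit an Apache 2.4 <RequireAll> block that admits everyone
    except the listed IP addresses."""
    lines = ["<RequireAll>", "    Require all granted"]
    lines += [f"    Require not ip {ip}" for ip in sorted(bad_ips)]
    lines.append("</RequireAll>")
    return "\n".join(lines)
```

The generated block uses the Apache 2.4 `Require not ip` syntax; on Apache 2.2 the equivalent would be `Order`/`Deny from` directives. As noted above, any rules produced this way should be reviewed by an IT professional before being deployed.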
If you are a publisher within the EHS network and you are concerned about your IVT, please reach out to your Manager of Partner Relations.
Good luck and godspeed!