The pay-per-click (PPC) system is an enormous source of income for Google, but also for many other websites. Companies that choose to promote their website with pay per click need to watch the amounts for which they are billed. The number of fraud cases is growing, and they take many forms.

Some websites or affiliates pay people in remote places like Botswana to make fraudulent clicks on an ad in order to inflate their customers' bills. Since 2006, click fraud has not been limited to such methods: malware such as click bots can now be used for the same goal. These small pieces of code can spread like viruses across many computers in order to generate clicks from different IP addresses. The most sophisticated scams involve malware that keeps a low profile and generates only a few clicks per computer in order to avoid detection. These bots are generally controlled remotely by a person who restricts the clicks to ads that can generate a real profit.

Of course Google is not behind such scams, even though PPC users often wonder about the amounts for which they are billed. For example, you may want to avoid overpaying when a visitor (say, a competitor) clicks many times on the same ad from the same IP address in order to exhaust your budget. Google makes its own effort to reduce the number of click fraud cases, but the company acts as a judge while also making a profit from PPC. An extra tool is therefore useful if you wish to see in detail how Google bills your clicks.

A log analyzer like Expert Data Miner has many features that make it unique. This software lets you detect many cases of click fraud, and also understand in detail the results for any referrer. EDM will check for duplicate hits, verify whether a visitor's IP address is an anonymous proxy, check whether the visitor's browser fetched the images associated with your target page (normally bots do not), look for patterns in cookies and perform a statistical analysis by country. Getting all this information is important. You can try to get a refund from Google for click fraud that has already occurred, but reacting quickly while a dangerous pattern is underway is certainly wiser. From your Google account you can use two filters to block clicks from competitors or dishonest Internet users. The filter on IP ranges works well for visitors who have a static IP address; the filter on locations (such as cities) can be used as a last resort if you target a broad market and your competitor is in a small city. In addition, you can stop your ads from appearing on some websites in the Content Network (or Display Network), or lower your bid for the worst performers.

Let's take the report dealing with Google Syndication, or with Google AdWords, the Display Network and the Search Network. In this report, EDM gives you the list of all the referring websites and the number of visitors each one sent to your site. If you right-click on one of those referrers, you get the list of all the visitors and their unique IP or cookie. In the following screen a popup (the click trail) appears when one of the lines is chosen:

  Google AdWords and the Pay Per Click

When someone clicks on one of the lines in this popup he gets the details for this visitor:

Requested pages by a visitor

Here this visitor started by requesting the page download.asp; then he clicked on a link inside it and fetched the root page at 00:11:03 ("/" is the root page). If you store your visitors' cookies in your log files, the software lets you follow their behavior during a subsequent visit (the Next Visit button at the bottom of the window, near the Close button). Since this visitor didn't come back, the button is grayed out. This makes it easy to detect suspicious behavior associated with click fraud. If you resolve the DNS you can sometimes get the name of your visitor's server (in some cases it can be a competitor). You can also sort your visitors by IP or by cookie if you click on the header of the relevant column in the popup.

It is also possible to get the same kind of information, in another report, for external websites that charge you for an ad (not Google AdWords here). In fact this information is available by keywords, landing pages, referrers, etc.

It can be somewhat more difficult to detect a click bot. If such malware has spread, you'll get many clicks from a variety of IPs, and some click bots will keep a low profile and perform only a few clicks per day to avoid detection. However, these robots have some features that distinguish them from human visitors, and the periodicity of the clicks is one of them. The probability of seeing a human being click on your ad every 900 seconds, 4 times in a row, is low. Bots also have another characteristic: unless they hijack a real browser, they do not support JavaScript, and therefore generally do not carry the Google Analytics cookies. They are not true browsers. If you configure your server to store the cookies that you assign to your visitors at the end of each line in your log files (something easy to do for Google Analytics cookies), it is possible to detect abnormal behavior indirectly for a set of visitors.
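The periodicity test described above can be sketched in a few lines of Python. This is only an illustration, not the way EDM does it: the function name looks_periodic, the thresholds and the sample timestamps are all assumptions made for the example. Clicks from one IP or cookie are treated as suspicious when the gaps between them are almost identical.

```python
from statistics import pstdev

def looks_periodic(click_times, min_clicks=4, tolerance_s=2.0):
    """Return True when consecutive clicks are separated by near-identical gaps.

    click_times: click timestamps in seconds for ONE visitor (IP or cookie).
    min_clicks and tolerance_s are illustrative thresholds, not EDM settings.
    """
    if len(click_times) < min_clicks:
        return False
    times = sorted(click_times)
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    # A tiny spread between the gaps means the clicks arrive like clockwork.
    return pstdev(gaps) <= tolerance_s

# Made-up data: seconds since midnight for two visitors.
bot_like = [0, 900, 1800, 2701]      # roughly one click every 900 seconds
human_like = [0, 45, 2000, 2100]     # irregular gaps
print(looks_periodic(bot_like), looks_periodic(human_like))
```

A real bot may add random jitter to its schedule, so a test like this catches only the crudest cases and should be combined with the cookie and image checks.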

Indeed, Expert Data Miner lets you apply filters based on part of a page name, a whole page name, or part of the referrer. It is thus possible to separate those who find your site through PPC from those who find it through an organic search (a search that doesn't cost you a penny). What should raise a concern is an abnormal fraction of visitors whose browsers refuse cookies. In many reports you can build a column from scratch, and one such column gives the number of visitors whose browser accepted your cookies. This column is available as a percentage, but if you right-click you get a historical chart for any of your pages:

detecting click bots

The blue curve gives the percentage of visitors who request the page "/" (the root page) and whose browser accepts cookies. This percentage varies from 31.43% to 64.19% in this example. In fact, for human visitors this percentage should be between 80% and 90%.

You can also get details about click fraud from the report on the Content Network. If you press the F6 key, the software will scan for strange patterns and give you a description of such cases (including the IP addresses of the visitors). EDM will group your visitors by IP or by cookie, or provide useful data about the top countries. If 1.4% of your visitors come from a country but account for 18% of the PPC clicks, this is quite abnormal. If a competitor clicks on your ads 10 times a day and repeats the same operation the day after (when his provider assigns him a new dynamically allocated IP address), you can often tell from the cookie.
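The country comparison above comes down to comparing two shares. The following Python sketch is an assumption about how such a check could be written, not EDM's actual algorithm; the function name, the ratio threshold of 5 and the sample counts are invented, with the counts chosen to reproduce the 1.4% versus 18% case. Both input dictionaries are assumed to be non-empty.

```python
def overrepresented_countries(visitors_by_country, ppc_clicks_by_country,
                              ratio_threshold=5.0):
    """Flag countries whose share of PPC clicks far exceeds their share of visitors.

    Returns {country: (visitor_share, click_share)} for flagged countries.
    ratio_threshold is an illustrative value, not an EDM setting.
    """
    total_visitors = sum(visitors_by_country.values())
    total_clicks = sum(ppc_clicks_by_country.values())
    flagged = {}
    for country, clicks in ppc_clicks_by_country.items():
        visitor_share = visitors_by_country.get(country, 0) / total_visitors
        click_share = clicks / total_clicks
        # Paid clicks from a country with no recorded visitors are flagged too.
        if visitor_share == 0 or click_share / visitor_share >= ratio_threshold:
            flagged[country] = (visitor_share, click_share)
    return flagged

# Made-up counts: country "B" has 1.4% of the visitors but 18% of the paid clicks.
visitors = {"A": 986, "B": 14}
paid_clicks = {"A": 82, "B": 18}
result = overrepresented_countries(visitors, paid_clicks)
print(result)
```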

Most fraud cases come from unscrupulous webmasters who use AdSense (the Content or Display Network); some of them click on your ads to boost their profits. In a growing number of cases the task is delegated to third-world countries, especially if your bids per click are high. There are very few cases of fraud from paid searches on keywords (Google), except from some competitors. Still, if you have a doubt, and your competitor is not using cookies and doesn't have a static IP address, you can get his ISP name and the IP range of that provider. Especially when it's a small ISP, you can apply a filter in EDM to see what's going on. The report on Search Phrases displays by default all search queries (organic or PPC clicks) from Google and some other search engines. However, when you press F6 on that report only the paid hits are displayed and analyzed.

search phrases

The report on Search Phrases. Pressing F6 gives you several extra reports regarding the paid clicks

Two other useful reports are also available when you launch the task 'SCAN FOR CLICK FRAUD'. The first scans for IP addresses related to anonymous proxies. The second mainly targets potential click bots. Most bots will not load the images associated with a target page; EDM can find the maximum number of images that at least one visitor fetched with that page, and give you the IP address, the cookie and the time associated with any visitor who did not request the images or resource files (or requested only a small fraction of them).

detecting referrers who cheat with PPC

An HTML report from EDM dealing with the Content Network. The original referring websites names were masked.

Referrers from the Display Network and their performance

One of the most important reports concerns the Content Network and the performance of each referring website. One column gives the percentage of proxies among the visitors sent by each referrer; a low percentage such as 2% is quite normal, since a small fraction of Internet users go through a proxy for other reasons, not necessarily to change their IP in order to click on your ads. In the above picture the referrers have been masked to preserve their anonymity.
In this report you also get, for each referrer, the percentage of visitors whose browser doesn't request all the images embedded in your target page, and the percentage of those who perform more than one paid click during a visit. This helps you spot unprofitable websites in the Content Network. For example, if an HTML page contains 6 images, most click bots will not try to fetch those bigger files; there is no advantage in doing so. Also, even if a human visitor disables his cookies or changes his IP to cover his tracks, his browser's cache will still hold the images that are frequently requested from a website. So, for some visitors at least, your log file will contain the requests for the target page but not the usual hits for the associated images.
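The image check can be illustrated with a short Python sketch. It is a simplified assumption of the idea, not EDM's implementation: the function name, the list of image suffixes and the 50% threshold are invented for the example, and for simplicity every image a visitor requested is counted, not only those embedded in the target page. Visitors are compared with the largest number of images that any visitor fetched.

```python
from collections import defaultdict

IMAGE_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png")  # illustrative list

def visitors_missing_images(hits, target_page, min_fraction=0.5):
    """Return visitors who requested target_page but few or none of the images.

    hits: iterable of (visitor_id, requested_path) pairs taken from a log file.
    min_fraction is an illustrative threshold, not an EDM setting.
    """
    images = defaultdict(set)
    landed = set()
    for visitor, path in hits:
        if path == target_page:
            landed.add(visitor)
        elif path.lower().endswith(IMAGE_SUFFIXES):
            images[visitor].add(path)
    # Largest number of distinct images fetched by any visitor of the page.
    most = max((len(images[v]) for v in landed), default=0)
    if most == 0:
        return []
    return sorted(v for v in landed if len(images[v]) / most < min_fraction)

# Made-up log extract: the second visitor never asks for the images.
sample_hits = [
    ("10.0.0.1", "/download.asp"), ("10.0.0.1", "/a.gif"), ("10.0.0.1", "/b.gif"),
    ("10.0.0.2", "/download.asp"),
    ("10.0.0.3", "/download.asp"), ("10.0.0.3", "/a.gif"), ("10.0.0.3", "/b.gif"),
]
suspects = visitors_missing_images(sample_hits, "/download.asp")
print(suspects)
```

As noted above, a returning human visitor with cached images produces the same pattern as a bot, so a flagged visitor is a lead to investigate rather than proof of fraud.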

Even when click fraud is not involved, it is very important to assess the quality of the clicks. EDM lets you insert a custom column in most reports, including the report on the Display Network. It gives you a chart of the time spent on your site by the visitors sent by a specific referrer.

the time spent on each page of the website

Obviously this referring website is sending bored users who are unlikely to purchase anything. If you implement the JavaScript that comes with the software, you will see a striking difference between referring sites: in some cases the distribution is centered around 8 minutes rather than 30 seconds, even if both types of visitors request almost the same number of pages. In the first case they do not merely request 2 pages, they read them.

EDM can also process visitors who come through PPC from Yahoo, Bing or Facebook. Another approach is used to spot anomalies: a statistical comparison is performed between the browsers of the visitors who come through PPC and those who come through an organic search. If a browser version is overrepresented in the PPC group, one possibility is that a user who erased his cookies and got a dynamically allocated IP is coming back again and again; another is that a click bot is being used. The presence of proxies, or a restricted geographic area associated with such IPs, can give further clues.
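The text does not say which statistical comparison EDM performs. As one possible illustration, the Python sketch below applies a two-proportion z-test to each browser version; the choice of test, the function name and the sample data are all assumptions. A large positive score means the browser is more common among PPC visitors than among organic visitors, and both input lists are assumed to be non-empty.

```python
from collections import Counter
from math import sqrt

def browser_z_scores(ppc_browsers, organic_browsers):
    """Two-proportion z-score per browser: PPC share versus organic share.

    Each argument is a list with one browser label per visitor.
    """
    ppc, organic = Counter(ppc_browsers), Counter(organic_browsers)
    n_ppc, n_org = len(ppc_browsers), len(organic_browsers)
    scores = {}
    for browser in ppc:
        share_ppc = ppc[browser] / n_ppc
        share_org = organic[browser] / n_org
        pooled = (ppc[browser] + organic[browser]) / (n_ppc + n_org)
        std_err = sqrt(pooled * (1 - pooled) * (1 / n_ppc + 1 / n_org))
        scores[browser] = (share_ppc - share_org) / std_err if std_err > 0 else 0.0
    return scores

# Made-up samples: one browser version dominates the paid clicks.
ppc_sample = ["Firefox 2.0"] * 60 + ["IE 7"] * 40
organic_sample = ["Firefox 2.0"] * 30 + ["IE 7"] * 70
scores = browser_z_scores(ppc_sample, organic_sample)
print(scores)
```

A score above roughly 3 is a common rule of thumb for a difference that is unlikely to be due to chance, but the threshold is a judgment call.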

Getting the ROI in a Pay Per Click campaign

You may also wish to optimize your investment by dropping the keywords that do not generate enough profit, or to know the percentage of visitors who purchase something. In Web Analytics this is called the conversion rate. Unlike many other log analyzers, EDM doesn't limit itself to the conversion rate for the current session; it can also check whether a referred visitor purchased something several days later. Let's take the report 'pages accessed from a search engine':

Landing pages from a search engine

If you click on the button with a hammer and a screwdriver, you get the configuration screen for the current report:

Defining a column to avoid click fraud

If you click on the button Define Action you get the following page:

conversion rate with pay per click

The fields were filled to tell the software that:

1) Requesting the page /buy.asp ("/" being the root page of the site) is considered a purchase. In fact it would be more accurate to select a page that is displayed once a visitor has paid for something (Thank you for purchasing with us!).

2) The result will be displayed as a percentage.

In short, you ask the software to build a new column that will display the percentage of visitors who requested the page buy.asp during their session, regardless of their landing page. In other reports this column could be available for search keywords, referring websites, etc.

But you don't just want to know whether these visitors purchased something during their first visit; you also want to know whether they bought something some days or weeks later. Since you store these visitors' cookies in your log files, you just need to go to the Scope box, choose 'Multiple Sessions' rather than 'Current Session' from the combo box, and click on the Scope Properties button.

conversion rates using cookies

Let's say that you wish to find the visitors who found your website with a search engine between November 11, 2007 and November 16 of the same year. You want to isolate those who purchased your products at most 6 days after their first visit. Since you are using PPC and not organic search, your invoice is determined either by the number of clicks (even if the same visitor often requests the same page) or by the number of visitors who performed at least one click. From the second combo box you can define the cost in two ways. You can have the requested URL carry a variable 'cost' (with an amount such as 1.76, in dollars or euros) and tell the software to use that variable each time it appears in your log file. You can also assign a fixed cost to each referring website. Revenues are calculated the same way: you can assign a fixed amount each time buy.asp is called (say, your average sale), or modify your pages so that a variable 'price' appears when someone calls that page, as in /buy.asp?price=43.44. This variable will be ignored by your application but stored in your log files.
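To show how an amount carried in the URL can be recovered from a log line, here is a minimal Python sketch using the standard urllib.parse module. The helper name amount_from_url is an assumption, and the URL with a 'cost' variable is a hypothetical example; only /buy.asp?price=43.44 comes from the text above.

```python
from urllib.parse import urlsplit, parse_qs

def amount_from_url(url, variable):
    """Return the numeric value of `variable` in the URL's query string, or None."""
    query = parse_qs(urlsplit(url).query)
    values = query.get(variable)
    if not values:
        return None
    try:
        return float(values[0])
    except ValueError:
        # The variable is present but does not hold a number.
        return None

# The revenue example from the text.
print(amount_from_url("/buy.asp?price=43.44", "price"))
# Hypothetical landing URL carrying the cost of the click.
print(amount_from_url("/download.asp?cost=1.76", "cost"))
```

Since the application ignores these variables, they only need to appear in the request line that the server writes to the log.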

You can build several columns like this in several reports and adjust parameters such as the initial date, the time allowed to achieve a conversion, etc.

Once you re-analyze your log files you get the following results:

conversion rates over several weeks

9.32% of those who found the page /download.asp of this website with a search engine between the 11th and the 16th of November requested the page buy.asp within the following 6 days. Those who entered your website directly on /buy.asp obviously requested that page in 100% of cases. In case of doubt, you can always right-click on a line to get the list of the visitors, their IPs, etc.
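A conversion rate over multiple sessions can be sketched as follows in Python. This is an illustrative assumption of the logic, not EDM's code: the function name, the record layout (cookie, timestamp, page) and the sample hits are invented, and the sketch ignores the search-engine referrer filter for brevity.

```python
from datetime import datetime, timedelta

def delayed_conversion_rate(hits, landing_page, goal_page, start, end, max_delay):
    """Share of visitors who landed in [start, end] and reached goal_page in time.

    hits: list of (cookie, timestamp, page) tuples taken from the log files.
    """
    first_visit = {}
    for cookie, when, page in sorted(hits, key=lambda hit: hit[1]):
        if page == landing_page and start <= when <= end:
            first_visit.setdefault(cookie, when)  # keep the earliest landing
    converted = set()
    for cookie, when, page in hits:
        landed = first_visit.get(cookie)
        if page == goal_page and landed is not None and landed <= when <= landed + max_delay:
            converted.add(cookie)
    return len(converted) / len(first_visit) if first_visit else 0.0

# Made-up hits: only cookie c1 buys within 6 days of its first visit.
sample = [
    ("c1", datetime(2007, 11, 11, 10, 0), "/download.asp"),
    ("c1", datetime(2007, 11, 14, 9, 0), "/buy.asp"),
    ("c2", datetime(2007, 11, 12, 8, 0), "/download.asp"),
    ("c2", datetime(2007, 11, 25, 8, 0), "/buy.asp"),
    ("c3", datetime(2007, 11, 16, 8, 0), "/download.asp"),
    ("c4", datetime(2007, 11, 13, 0, 0), "/download.asp"),
]
rate = delayed_conversion_rate(
    sample, "/download.asp", "/buy.asp",
    start=datetime(2007, 11, 11), end=datetime(2007, 11, 16, 23, 59, 59),
    max_delay=timedelta(days=6),
)
print(rate)
```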

If you press the F9 key, it's no longer the percentage that is displayed but the net revenue:

Is the pay per click profitable?

Note that such results can be obtained in other reports where the first column displays referring websites, search phrases or Google AdWords. When you press F9 again you get the ROI, then the cost, before falling back to the percentage of visitors who request the target page.
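The figures that F9 cycles through are related by simple arithmetic. The Python sketch below uses the common definition of ROI (net revenue divided by cost); the text does not say exactly how EDM computes it, so treat the formula, the function name and the sample numbers as assumptions.

```python
def campaign_figures(clicks, cost_per_click, sales, revenue_per_sale):
    """Return cost, net revenue and ROI for a campaign billed per click."""
    cost = clicks * cost_per_click
    revenue = sales * revenue_per_sale
    net_revenue = revenue - cost
    # ROI is undefined when nothing was spent.
    roi = net_revenue / cost if cost else None
    return {"cost": cost, "net_revenue": net_revenue, "roi": roi}

# Made-up campaign: 100 paid clicks at 1.76 each, 9 sales averaging 43.44.
figures = campaign_figures(clicks=100, cost_per_click=1.76,
                           sales=9, revenue_per_sale=43.44)
print(figures)
```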

The demo version of Expert Data Miner doesn't allow you to get conversion rates based on cookies; this feature is available only in the Enterprise version.

