Behavioural targeting helps online advertising – our study confirms
July 5, 2011 by Radek Maciaszek
How can behavioural targeting help online advertising?
I often asked myself this question while studying towards an MSc in Cognitive and Decision Sciences. The course covered a broad range of topics, from computer science, AI and neuroscience to psychology and philosophy. I was always tempted to see if I could use some of this research in our work.
You may ask what cognitive science has in common with online advertising. The answer is quite a lot, as some of our customers have already noticed. It can help us to better understand the interests of online visitors and their decision processes, and can ultimately help website owners to serve more relevant web content and advertising. I decided to explore this question in more detail in my MSc thesis.
This project was motivated by a paper published by scientists from Microsoft Research (Yan et al., 2009) who found that behavioural targeting in search advertising could yield up to a 670% increase in the overall CTR (click-through ratio). We performed a systematic study of the clickstream logs of a commercial ad network and found that the overall CTR could be increased by as much as 909%.
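To make the percentages above concrete, here is a minimal sketch of how a relative CTR increase can be computed. The function names and the click and impression counts are hypothetical and chosen only to keep the arithmetic simple; they are not figures from the study.

```python
def ctr(clicks: int, impressions: int) -> float:
    """Click-through ratio: clicks divided by impressions."""
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    return clicks / impressions


def relative_increase(baseline: float, targeted: float) -> float:
    """Relative CTR increase, expressed as a percentage of the baseline."""
    if baseline <= 0:
        raise ValueError("baseline CTR must be positive")
    return (targeted - baseline) / baseline * 100.0


# Hypothetical counts (assumptions, not data from the thesis).
baseline_ctr = ctr(clicks=10, impressions=10_000)  # untargeted delivery
targeted_ctr = ctr(clicks=10, impressions=1_000)   # delivery to one user cluster
lift = relative_increase(baseline_ctr, targeted_ctr)
print(f"Relative CTR increase: {lift:.0f}%")
```

With these made-up numbers the targeted CTR is ten times the baseline, which corresponds to an increase of roughly 900%.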
This study considered the impact of behavioural targeting techniques on online display advertising. Specifically, we investigated whether simulating delivery of traffic to chosen clusters of users would increase the overall CTR of all ads. We examined the data using different evaluation metrics, such as user similarity, precision, recall and F-measure, and then used a t-test to confirm the statistical significance of the results. The experimental design was implemented with the help of scalable data mining libraries, which made it possible to analyse the large body of data.
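Some of the metrics listed above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code from the thesis: it assumes precision and recall are measured per ad against the set of users who clicked that ad, and it assumes a paired t-test over per-ad CTRs. The function names and sample values are hypothetical.

```python
import math
from statistics import mean, stdev


def precision_recall_f(clickers: set, segment: set) -> tuple:
    """Precision, recall and F-measure of one user segment for one ad.

    clickers: users who clicked the ad; segment: users in the chosen cluster.
    """
    hits = len(clickers & segment)
    precision = hits / len(segment) if segment else 0.0
    recall = hits / len(clickers) if clickers else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


def paired_t_statistic(before: list, after: list) -> float:
    """t statistic of a paired t-test on per-ad values (assumed test design)."""
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(after, before)]
    spread = stdev(diffs)
    if spread == 0:
        raise ValueError("differences have zero variance")
    return mean(diffs) / (spread / math.sqrt(len(diffs)))


# Hypothetical users and per-ad CTRs, for illustration only.
p, r, f = precision_recall_f({"u1", "u2", "u3", "u4"}, {"u1", "u2", "u5"})
t = paired_t_statistic([0.001, 0.002, 0.0015, 0.001],
                       [0.004, 0.005, 0.003, 0.006])
print(f"precision={p:.2f} recall={r:.2f} F={f:.2f} t={t:.2f}")
```

The t statistic would then be compared against the t distribution with n - 1 degrees of freedom to obtain a p-value, for example with a statistics library.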
You may download the paper here (the source code is included): MSc Thesis – How much behavioural targeting can help online advertising
I used many data technologies in this project, including Hadoop, Hive and the Amazon cloud. All programming was done in Java and Python. The biggest challenge was the sheer amount of data that needed to be clustered, but thanks to Apache Mahout the project was finished in less than a month.
Abstract
Online advertising has exploded during the past few years; the current UK market (as of the middle of 2010) is valued at more than £3.5 billion. Such advertising grew dramatically, by about 2200%, during the 2000s. Behavioural targeting (BT) is widely regarded as one of the most effective techniques in optimizing online advertising. However, despite the impressive numbers involved in this industry, only a few academic studies have been performed on real world click-stream data (e.g. Yan, Liu, Wang, Zhang, Jiang & Chen 2009; Ratnaparkhi 2010; Chen, Pavlov, & Canny 2009). This may be linked to the extreme demands on system resources caused by the huge amount of advertising data available.
Yan et al. (2009) confirmed that BT could significantly increase the effectiveness of one specific type of online advertising (so-called search advertising). In this work we investigate whether techniques linked to BT may be beneficial to online display advertising. Using data from a major commercial ad network, we show that a simple BT technique (such as user clustering) could improve click-through ratio by more than 900%.
Furthermore, from a software engineering perspective, we provide support for using distributed open source technologies to tackle the complex analysis of advertising data.
References
- Yan, J., Liu, N., Wang, G., Zhang, W., Jiang, Y., & Chen, Z. (2009). How much can Behavioural Targeting Help Online Advertising? In Proceedings of the 18th International Conference on World Wide Web. Madrid, Spain.
- Ratnaparkhi, A. (2010). Finding predictive search queries for behavioral targeting. In ADKDD’10, The 4th International Workshop on Data Mining and Audience Intelligence for Advertising.
- Chen, Y., Pavlov, D., & Canny, J.F. (2009). Large-scale behavioral targeting. In KDD ’09: Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, 209-218.