
Section: New Results

Measurement and Detection of Web Tracking

Detecting Web Trackers via Analyzing Invisible Pixels

The Web has become an essential part of our lives: billions of people use Web applications on a daily basis and, while doing so, leave digital traces on millions of websites. Such traces allow advertising companies, as well as data brokers, to continuously profit from collecting a vast amount of data associated with users.

Web tracking has been extensively studied over the last decade. To detect tracking, most research studies and user tools rely on consumer protection lists. EasyList [23] and EasyPrivacy [24] (EL&EP) are the most popular publicly maintained blacklists of known advertising and tracking domains, used by the popular browser extensions AdBlock Plus [20] and uBlockOrigin [28]. Disconnect [22] is another very popular list for detecting domains known for tracking, used in the Disconnect browser extension [21] and in the integrated tracking protection of the Firefox browser [25]. Relying on EL&EP or Disconnect has become the de facto approach to detecting third-party tracking requests in the privacy and measurement community. However, it is well known that these lists detect only known tracking and ad-related requests, and a tracker can easily avoid detection by registering a new domain or changing the parameters of the request.
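The limitation of list-based detection can be illustrated with a minimal Python sketch. It assumes a simplified list containing only domain names (real lists such as EasyList use a richer rule syntax that is not modelled here); the domains and the function name `is_blocked` are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical list entries, for illustration only.
BLOCKED_DOMAINS = {"tracker.example", "ads.example"}


def is_blocked(request_url, blocked=BLOCKED_DOMAINS):
    """Return True if the request host, or any parent domain, is on the list."""
    host = (urlsplit(request_url).hostname or "").lower()
    labels = host.split(".")
    # Check "a.b.c", then "b.c", then "c".
    return any(".".join(labels[i:]) in blocked for i in range(len(labels)))


# A listed domain (including its subdomains) is detected ...
known = is_blocked("https://cdn.tracker.example/p.gif?uid=1")
# ... but the same request served from a newly registered domain is not.
evaded = is_blocked("https://tracker-new.example/p.gif?uid=1")
```

Under these assumptions, the second request goes undetected until the list maintainers add the new domain.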

In this work, we propose a new technique to detect trackers, based on the analysis of invisible pixels, by which we mean 1x1 pixel images or images without content. Trackers routinely use these images to send information or third-party cookies back to their servers: the simplest way to do so is to create a URL containing useful information and to dynamically add an image HTML tag to a webpage. Since invisible pixels do not provide any useful functionality, we consider them perfect suspects for tracking.
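The definition above can be sketched in Python as follows. This is a simplified illustration, not the classifier used in the study: the `ImageResponse` record, its fields, and the reading of "images without content" as a zero-byte response body are all assumptions, as are the example URL and parameter names.

```python
from dataclasses import dataclass
from urllib.parse import urlencode


@dataclass
class ImageResponse:
    """Hypothetical record of an image response observed during a crawl."""
    url: str
    width: int
    height: int
    body_size: int  # response body size in bytes


def is_invisible_pixel(img):
    """1x1 image, or an image with no content (assumed here: empty body)."""
    return (img.width == 1 and img.height == 1) or img.body_size == 0


def build_pixel_url(base, info):
    """Encode information into the query string of an image URL."""
    return base + "?" + urlencode(info)


# Hypothetical tracker endpoint and parameters.
url = build_pixel_url("https://tracker.example/p.gif",
                      {"uid": "abc123", "page": "/home"})
pixel = ImageResponse(url=url, width=1, height=1, body_size=43)
banner = ImageResponse(url="https://site.example/logo.png",
                       width=300, height=250, body_size=18000)
flags = [is_invisible_pixel(pixel), is_invisible_pixel(banner)]
```

In a browser, a script would typically attach such a URL to a dynamically created image element; the sketch only covers URL construction and classification.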

By using an Inria cluster and setting up a distributed crawler, we have collected a dataset of invisible pixels from 829,349 webpages. By analyzing this dataset, we observed that invisible pixels are widely used: more than 83% of pages incorporate at least one invisible pixel.

Overall, we made the following key contributions:

This working paper [19] is currently under submission at an international conference.

A Survey on Browser Fingerprinting

This year, we have conducted a survey on the research performed in the domain of browser fingerprinting, while providing an accessible entry point to newcomers in the field. We explain how this technique works and where it stems from. We analyze the related work in detail to understand the composition of modern fingerprints and see how this technique is currently used online. We systematize existing defense solutions into different categories and detail the current challenges yet to overcome.

A browser fingerprint is a set of information related to a user's device, from the hardware to the operating system to the browser and its configuration. Browser fingerprinting refers to the process of collecting information through a web browser to build a fingerprint of a device. Via a script running inside a browser, a server can collect a wide variety of information from HTTP headers and from public interfaces called Application Programming Interfaces (APIs). An API is an interface that provides an entry point to specific objects and functions. While some APIs, such as those for the microphone or the camera, require permission to be accessed, most are freely accessible from any JavaScript script, which makes information collection trivial. Contrary to other identification techniques such as cookies, which rely on a unique identifier (ID) stored directly inside the browser, browser fingerprinting is completely stateless. It does not leave any trace, as it does not require storing information inside the browser.
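The stateless nature of the technique can be illustrated with a short server-side sketch in Python. It assumes the attributes have already been collected from HTTP headers and JavaScript APIs; the attribute names and values are hypothetical, and summarizing them with a SHA-256 hash is one possible choice rather than a method described in the survey.

```python
import hashlib
import json


def fingerprint(attributes):
    """Derive an identifier from collected attributes alone (nothing is stored)."""
    # Serialize with sorted keys so that attribute order does not matter.
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical attribute values, for illustration only.
visit_1 = {
    "userAgent": "ExampleBrowser/1.0",
    "language": "fr-FR",
    "timezoneOffset": -60,
    "screen": [1920, 1080],
}
# The same device on a later visit, attributes reported in another order.
visit_2 = {
    "screen": [1920, 1080],
    "timezoneOffset": -60,
    "language": "fr-FR",
    "userAgent": "ExampleBrowser/1.0",
}
same_device = fingerprint(visit_1) == fingerprint(visit_2)
```

Because the identifier is recomputed from the device's characteristics on every visit, clearing cookies does not change it; it changes only when one of the collected attributes does.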

The goal of this work is twofold: first, to provide an accessible entry point for newcomers by systematizing existing work, and second, to form the foundations for future research in the domain by eliciting the current challenges yet to overcome. We accomplish these goals with the following contributions:

This work has been submitted for publication to an international journal.

Measuring Uniqueness of Browser Extensions and Web Logins

The Web browser is the tool people use to navigate the Web, and the privacy research community has studied various forms of browser fingerprinting. Researchers have shown that a user's browser has a number of inherent “physical” characteristics that can be used to uniquely identify her browser and hence to track it across the Web. Fingerprinting users' devices is analogous to studying people's physical biometric traits, where only physical characteristics are considered.

As previous demonstrations of user uniqueness based on behavior suggest, behavioral characteristics, such as browser settings and the way people use their browsers, can also help to uniquely identify Web users. For example, a user installs the web browser extensions she prefers, such as AdBlock, LastPass, or Ghostery, to enrich her Web experience. Also, while browsing the Web, she logs into her preferred social networks, such as Gmail, Facebook or LinkedIn. In this work, we study users' uniqueness based on their behavior and preferences on the Web: we analyze how unique Web users are based on their browser extensions and logins.

In this work, we performed the first large-scale study of user uniqueness based on browser extensions and Web logins, collected from more than 16,000 users who visited our website. Our experimental website identifies installed Google Chrome extensions via Web Accessible Resources, and detects websites where the user is logged in by methods that rely on URL redirection and CSP violation reports. Our website is able to detect the presence of 13K Chrome extensions (the number of detected extensions varied monthly between 12,164 and 13,931), covering approximately 28% of all free Chrome extensions (the list of detected extensions and websites is available on our website). We also detect whether the user is connected to one or more of 60 different websites. Our main contributions are:

Furthermore, we show that browser extensions and web logins can be exploited to fingerprint and track users by checking only a limited number of extensions and web logins. We have applied an advanced fingerprinting algorithm [30] that carefully selects a limited number of extensions and logins. For example, we show that 54.86% of users are unique based on all 16,743 detectable extensions. However, by testing 485 carefully chosen extensions we can identify more than 53.96% of users. Moreover, detecting 485 extensions takes only 625 ms.
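The trade-off between the number of tested extensions and the share of unique users can be sketched in Python. The selection step below is a plain greedy heuristic used as a stand-in for illustration; it is not the algorithm of [30], and the toy data and function names are hypothetical.

```python
from collections import Counter


def unique_fraction(users, tested=None):
    """Share of users whose set of detected extensions is shared with no one.

    users:  list of sets of extension IDs, one set per user.
    tested: extensions actually probed (None means all of them).
    """
    if tested is None:
        signatures = [frozenset(u) for u in users]
    else:
        signatures = [frozenset(u) & tested for u in users]
    counts = Counter(signatures)
    return sum(1 for s in signatures if counts[s] == 1) / len(signatures)


def greedy_select(users, budget):
    """Pick up to `budget` extensions, each maximizing the unique share so far."""
    candidates = set().union(*users)
    chosen = set()
    for _ in range(budget):
        remaining = sorted(candidates - chosen)  # sorted for reproducible ties
        if not remaining:
            break
        best = max(remaining,
                   key=lambda e: unique_fraction(users, chosen | {e}))
        chosen.add(best)
    return chosen


# Toy data: five users and three detectable extensions.
users = [{"a", "b"}, {"a"}, {"b"}, {"c"}, set()]
all_unique = unique_fraction(users)
subset = greedy_select(users, budget=1)
subset_unique = unique_fraction(users, subset)
```

On real data, the interest of such a selection is that probing a small, well-chosen subset can identify almost as many users as probing every detectable extension, at a much lower cost in time.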

Finally, we give suggestions to end users, as well as to website owners and browser vendors, on how to protect users from fingerprinting based on extensions and logins.

This paper has been published at the WPES international workshop, affiliated with ACM CCS 2018 [14].