Tracking Prevention in WebKit
WebKit has implemented tracking prevention technologies, spanning from 2003 with Safari 1.0 until today. Most of them are on by default. This document describes shipping behavior including Intelligent Tracking Prevention (ITP).
You can learn more about why we prevent cross-site tracking and how we handle the inherent tradeoffs by reading our Tracking Prevention Policy.
Terminology
Let’s define what we mean by a few things first.
- A registrable domain is a website’s eTLD+1 or effective top-level domain plus one label. Effective top-level domains are defined in the Public Suffix List.
- Website or site. A website is a registrable domain including all of its subdomains. Others define site to also include the scheme, which makes http://news.example and https://news.example two different sites. For the purposes of this document, we consider http and https to be the same site, since cookies can (still) span schemes.
- Cross-site. The user can be navigated across different websites or a website can load subresources from a different website. These are referred to as cross-site navigations and cross-site loads. When it comes to tracking, cross-site means tracking across different websites.
- First and third-party. If news.example is shown in the URL bar and it loads a subresource from adtech.example, then news.example is first-party and adtech.example is third-party. Note that different parties have to be different websites. sub.news.example is considered first-party when loaded under news.example because they are considered to be the same site.
- Third-party cookies. There is no special kind of cookie that constitutes a third-party cookie. Instead, it’s about content having access to its cookies when it’s loaded from a third-party. Let’s say your browser is loading an image from the third-party adtech.example on a webpage from the first-party news.example. If your browser allows it, a third-party request to adtech.example may include cookies and the subsequent response from adtech.example may set new cookies. Both those capacities – sending existing cookies and accepting new cookies for third-party content – are what’s referred to as third-party cookies.
- User interaction is a user click, tap, or keyboard entry on a website. Some refer to it as a user gesture. Scrolling is not considered user interaction.
- Partitioning is a technology to allow third-parties to use storage and stateful web features, but have those isolated per first-party website. Let’s say adtech.example is a third-party under both news.example and blog.example and that adtech.example uses LocalStorage. With partitioned LocalStorage, adtech.example will get unique storage instances under news.example and blog.example which removes the possibility to do cross-site tracking through LocalStorage.
- Ephemeral. When we say ephemeral storage, we mean the storage does not persist to disk and goes away with the application, for instance when the user quits the browser or reboots their device.
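The definitions of registrable domain, site, and first versus third-party can be sketched in a few lines. This is a minimal illustration, not WebKit's implementation: `PUBLIC_SUFFIXES` is a hypothetical, hard-coded stand-in for the real Public Suffix List, and the function names are assumptions made for this example.

```python
# Hypothetical, tiny stand-in for the Public Suffix List (assumption for this sketch).
PUBLIC_SUFFIXES = {"example", "com", "co.uk"}


def registrable_domain(host: str) -> str:
    """Return the eTLD+1 of host: the longest known public suffix plus one label."""
    labels = host.lower().rstrip(".").split(".")
    # Try candidate suffixes from longest to shortest.
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in PUBLIC_SUFFIXES:
            if i == 0:
                raise ValueError(f"{host} is itself a public suffix")
            return ".".join(labels[i - 1:])
    raise ValueError(f"no known public suffix for {host}")


def is_third_party(top_frame_host: str, resource_host: str) -> bool:
    """A load is third-party when the two hosts belong to different websites."""
    return registrable_domain(top_frame_host) != registrable_domain(resource_host)
```

With these definitions, sub.news.example loaded under news.example is first-party, while adtech.example loaded under news.example is third-party. The scheme is deliberately ignored, matching this document's definition of site.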
The Default Cookie Policy
The default cookie policy for WebKit on Apple’s iOS, macOS, iPadOS, tvOS, and watchOS is to disallow a third-party to set new cookies unless it already has cookies. This means that to be able to use cookies at all as third-party, the domain first has to become first-party and set its initial cookie(s) there. This default cookie policy has been in effect since Safari 1.0 and is still in effect today as part of the “Prevent cross-site tracking” setting.
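As a rough sketch, the default policy reduces to a single decision. The function and parameter names are assumptions for illustration, and the sketch models only this default policy, not ITP's full third-party cookie blocking described later in this document.

```python
def may_set_new_cookie(is_third_party: bool, has_existing_cookies: bool) -> bool:
    """Default policy sketch: a third party may set cookies only if it already has some."""
    if not is_third_party:
        # First-party content can always set its initial cookies.
        return True
    return has_existing_cookies
```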
Private Browsing Mode
Private Browsing Mode in Safari is built on an ephemeral session, which ensures that cookies and other stateful things are not persisted and go away when the user closes the tab, quits the browser, or reboots their device. It uses a new ephemeral session for each tab the user opens to isolate tabs from each other.
Partitioned Third-Party Storage
Third-party LocalStorage and IndexedDB are partitioned per first-party website and also made ephemeral.
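One way to picture partitioning is a store keyed by the pair of first-party and third-party site, held only in memory. The class below is a toy model under that assumption; its name and methods are hypothetical and do not correspond to any WebKit API.

```python
from collections import defaultdict
from typing import Dict, Optional, Tuple


class PartitionedStorage:
    """Toy in-memory store with one partition per (first-party, third-party) pair."""

    def __init__(self) -> None:
        # Held in memory only, so it is ephemeral: it goes away with the process.
        self._partitions: Dict[Tuple[str, str], Dict[str, str]] = defaultdict(dict)

    def set_item(self, first_party: str, third_party: str, key: str, value: str) -> None:
        self._partitions[(first_party, third_party)][key] = value

    def get_item(self, first_party: str, third_party: str, key: str) -> Optional[str]:
        return self._partitions[(first_party, third_party)].get(key)
```

A value that adtech.example stores under news.example is invisible to adtech.example under blog.example, which is what removes the cross-site tracking vector.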
Partitioned Service Workers
Third-party Service Workers are partitioned and their cache and IndexedDB are partitioned too.
Partitioned Third-Party HTTP Cache
HTTP cache entries for third-party content are partitioned per first-party website.
Anti Fingerprinting
Fingerprinting involves measuring the uniqueness of static device configuration (e.g. built-in hardware), dynamic device or browser configuration (e.g. user settings or installed peripherals), and user browsing data (e.g. checking which sites the user is logged in to, so-called login fingerprinting).
As we implement new web features, we look for fingerprinting vulnerabilities and opportunities to improve user privacy. We aim to collaborate with other implementers through the web standards process to advocate for users, and ensure that the specifications allow for, or preferably require, the protections we have added.
Here are examples of such behavior changes that have already shipped:
- Require a user permission for websites to access the Device Orientation/Motion APIs on mobile devices, because the physical nature of motion sensors may allow for device fingerprinting.
- Prevent fingerprinting of attached cameras and microphones through the Web Real-Time Communication API (WebRTC).
- Changed font availability to web content to only include web fonts and fonts that come with the operating system, but not locally user-installed fonts. Web fonts and the common set of web-safe fonts, as well as other OS-bundled fonts, are still available.
- Altered the user agent string to not change with minor software updates. The string only changes with the marketing version of the platform and the browser.
Our next line of defense is to remove existing fingerprinting vectors where possible. Over the last few years, we’ve made these changes:
- Removed the Do Not Track flag, which ironically was used as a fingerprinting vector, adding uniqueness to the users who had enabled it.
- Removed support for any plug-ins on macOS. Other desktop ports may differ. (Plug-ins were never supported on iOS.)
Finally, if we find that features and web APIs increase fingerprintability and offer no safe way to protect our users, we will not implement them until we or others have found a good way to reduce that fingerprintability. We continue to have open discussions with other browser makers through the web standards process, many of whom share these concerns. Here are some examples of features we have decided to not yet implement due to fingerprinting, security, and other concerns, and where we do not yet see a path to resolving those concerns:
- Web Bluetooth
- Web MIDI API
- Magnetometer API
- Web NFC API
- Device Memory API
- Network Information API
- Battery Status API
- Web Bluetooth Scanning
- Ambient Light Sensor
- HDCP Policy Check extension for EME
- Proximity Sensor
- WebHID
- Serial API
- Web USB
- Geolocation Sensor (background geolocation)
- User Idle Detection
Intelligent Tracking Prevention (ITP)
Full Third-Party Cookie Blocking
ITP by default blocks all third-party cookies. There are no exceptions to this blocking. Third-party cookie access can only be granted through the Storage Access API and the temporary compatibility fix for popups.
Cookie Blocking Latch Mode
Once a request is blocked from using cookies, all redirects of that request are also blocked from using cookies.
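The latch can be modeled as a flag that, once set, stays set for the rest of the redirect chain. This is a sketch only: `is_blocked` is a hypothetical predicate standing in for whatever decides that a given hop is blocked from using cookies.

```python
from typing import Callable, List


def cookie_decisions(redirect_chain: List[str], is_blocked: Callable[[str], bool]) -> List[bool]:
    """Return, per hop, whether that hop may use cookies (True) or not (False)."""
    latched = False
    decisions = []
    for url in redirect_chain:
        # Once any hop is blocked, every later hop in the chain is blocked too.
        if latched or is_blocked(url):
            latched = True
        decisions.append(not latched)
    return decisions
```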
Downgraded Third-Party Referrers
All third-party referrers are downgraded to their origins by default. This applies to both HTTP referrer headers and document.referrer. For example, if the full referrer is https://www.social.example/feed?clickID=123456, it will show up as https://www.social.example/.
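The downgrade amounts to keeping the scheme and host and dropping the path, query, and fragment. A minimal sketch using Python's standard `urllib.parse`, where the function name is an assumption for illustration:

```python
from urllib.parse import urlsplit


def downgrade_referrer(referrer: str) -> str:
    """Reduce a full referrer URL to its origin followed by a trailing slash."""
    parts = urlsplit(referrer)
    # Keep only scheme and network location; path, query, and fragment are dropped.
    return f"{parts.scheme}://{parts.netloc}/"
```

Applied to the example above, the click ID in the query string never reaches the third-party.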
Blocked Third-Party HSTS
HSTS, or HTTP Strict Transport Security, can only be set by the first-party website and only for the current host/domain and the website’s registrable domain. Further, HSTS is not applied to third-party requests that don’t carry cookies and since all third-party cookies are blocked by default, so is third-party HSTS.
Classification as Having Cross-Site Tracking Capabilities
Beyond across-the-board blocking of third-party cookies and downgrades of third-party referrers, ITP collects statistics on resource loads and matches them against known patterns of cross-site tracking. If a registrable domain matches at least one such pattern, it is classified as having cross-site tracking capabilities.
One such pattern is showing up as a third-party resource under several first-party websites. A machine learning model decides when these three numbers lead to classification of domain.example:
- The number of unique websites domain.example has been seen as third-party subresource under.
- The number of unique websites domain.example has been seen as third-party iframe under.
- The number of unique websites domain.example has been seen doing cross-site redirects under.
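The three numbers can be thought of as the sizes of three sets of first-party websites kept per domain. The sketch below shows only that bookkeeping. It does not attempt to reproduce the machine learning model, and the class and field names are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Set, Tuple


@dataclass
class DomainStatistics:
    """Per-domain sets of first-party websites under which the domain was observed."""
    subresource_under: Set[str] = field(default_factory=set)
    iframe_under: Set[str] = field(default_factory=set)
    redirect_under: Set[str] = field(default_factory=set)

    def counts(self) -> Tuple[int, int, int]:
        # Sets deduplicate, so each count is a number of unique websites.
        return (
            len(self.subresource_under),
            len(self.iframe_under),
            len(self.redirect_under),
        )
```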
Another pattern that is detected as the capability to track cross-site is top frame redirects, often referred to as bounce tracking. ITP counts the number of unique such redirects that domain.example does, and classifies based on that number. ITP will count it as a bounce even if the redirect is delayed by landing on a webpage and triggering a navigation a couple of seconds later.
The third pattern that is detected is called tracker collusion. If domain.example gets classified as having cross-site tracking capabilities, a check is made to see which other domains have previously redirected to domain.example and all of them get classified too. Then the process repeats recursively through the graph of redirects.
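The recursive step is a traversal of the redirect graph in the reverse direction, starting from the newly classified domain. Here is a minimal sketch, assuming the observed redirects are available as a mapping from each source domain to the set of domains it has redirected to. That data shape and the function name are assumptions, not WebKit's internal representation.

```python
from collections import deque
from typing import Dict, Set


def propagate_classification(seed: str, redirected_to: Dict[str, Set[str]]) -> Set[str]:
    """Return seed plus every domain that redirected to it, directly or transitively."""
    # Invert the graph so we can look up who redirected to a given domain.
    redirected_from: Dict[str, Set[str]] = {}
    for source, targets in redirected_to.items():
        for target in targets:
            redirected_from.setdefault(target, set()).add(source)

    classified = {seed}
    queue = deque([seed])
    while queue:
        domain = queue.popleft()
        for source in redirected_from.get(domain, ()):
            if source not in classified:
                classified.add(source)
                queue.append(source)
    return classified
```

The `classified` set doubles as the visited set, so cycles in the redirect graph do not cause the traversal to loop.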
Action Taken Against Classified Domains
All website data is deleted for classified domains which have not received user interaction as first-party or been granted storage access as third-party through the Storage Access API (see below) in the last 30 days of browser use. Such website data deletion happens at an interval so as to not cause too much disk I/O.
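The deletion rule in this section can be written as a predicate over a domain's state. In this sketch, the record type, its field names, and the exact boundary (strictly more than 30 days) are assumptions. The source only says "in the last 30 days of browser use".

```python
from dataclasses import dataclass
from typing import Optional

WINDOW_DAYS = 30  # days of browser use, per the rule above


@dataclass
class DomainState:
    classified: bool
    # Days of browser use since the last first-party user interaction or
    # storage access grant, whichever is more recent. None means never.
    days_since_interaction_or_grant: Optional[int]


def should_delete_website_data(state: DomainState) -> bool:
    """True if the domain is classified and has had no qualifying activity in the window."""
    if not state.classified:
        return False
    last = state.days_since_interaction_or_grant
    return last is None or last > WINDOW_DAYS
```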
Classified domains which have received user interaction as first-party or been granted storage access, but are found to engage in bounce tracking (top frame redirects) may have their cookies rewritten to SameSite=strict.
Verified Partitioned Cache
When a partitioned cache entry is created for a domain that’s classified by ITP as having cross-site tracking capabilities, the entry gets flagged for verification. After seven days, if there’s a cache hit for such a flagged entry, WebKit will act as if it has never seen this resource and load it again. The new response is then compared to the cached response and if they match in the ways we care about for privacy reasons, the verification flag is cleared and the cache entry is from that point considered legitimate. However, if the new response does not match the cache entry, the old entry is discarded, and a new one is created with the verification flag set, and the verification process starts over.
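The verification cycle can be sketched as a small state machine. Everything named here is hypothetical: `fetch` stands in for loading the resource again, and `matches` stands in for the privacy-relevant comparison, which this document does not specify. Treating "after seven days" as an age of at least seven days is also an assumption.

```python
from dataclasses import dataclass
from typing import Callable

VERIFY_AFTER_DAYS = 7


@dataclass
class CacheEntry:
    body: bytes
    created_day: int
    needs_verification: bool  # set when the domain is classified by ITP


def on_cache_hit(
    entry: CacheEntry,
    today: int,
    fetch: Callable[[], bytes],
    matches: Callable[[bytes, bytes], bool],
) -> CacheEntry:
    """Return the entry to keep after a cache hit, verifying it first if it is due."""
    if not entry.needs_verification or today - entry.created_day < VERIFY_AFTER_DAYS:
        return entry  # nothing to verify yet
    fresh = fetch()  # act as if the resource had never been seen
    if matches(entry.body, fresh):
        entry.needs_verification = False  # entry is now considered legitimate
        return entry
    # Mismatch: discard the old entry and start verification over with a new one.
    return CacheEntry(body=fresh, created_day=today, needs_verification=True)
```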
Detection of Cross-Site Tracking Via Link Decoration
Some trackers add so-called “click IDs” as URL parameters in links and pick them up through JavaScript on the link destination website. Then they store the click IDs in one of the storage forms available. That way they can establish a user identity across websites. This is called cross-site tracking via link decoration.
ITP detects such link decoration and caps the expiry of cookies created in JavaScript on the landing webpage to 24 hours.
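The cap itself is a clamp on the requested expiry. The sketch below assumes detection has already happened and is passed in as a boolean. How ITP detects link decoration is not modeled, and the function name is an assumption for illustration.

```python
from datetime import datetime, timedelta

LINK_DECORATION_CAP = timedelta(hours=24)


def cap_script_cookie_expiry(
    now: datetime, requested_expiry: datetime, link_decoration_detected: bool
) -> datetime:
    """Clamp the expiry of a cookie created in JavaScript on the landing webpage."""
    if not link_decoration_detected:
        return requested_expiry
    # Never extend a shorter expiry; only shorten a longer one.
    return min(requested_expiry, now + LINK_DECORATION_CAP)
```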
7-Day Cap on All Script-Writeable Storage
Trackers executing script in the first-party context often make use of first-party storage to save and recall cross-site tracking information. Therefore, ITP deletes all cookies created in JavaScript and all other script-writeable storage after 7 days of no user interaction with the website. The latter storage forms are:
- IndexedDB
- LocalStorage
- Media keys
- SessionStorage
- Service Worker registrations and cache
CNAME and Third-Party IP Address Cloaking Defense
ITP detects third-party CNAME cloaking and third-party IP address cloaking requests and caps the expiry of any cookies set in the HTTP response to 7 days.
Third-party CNAME cloaking is defined as a first-party subresource that resolves through a CNAME that differs from the first-party domain and differs from the top frame host’s CNAME, if one exists.
This table explains the seven possible scenarios (1p means first-party, 3p means third-party):
1p host, e.g. www.blog.example | 1p subdomain other than the 1p host, e.g. track.blog.example | Capped cookie expiry? |
---|---|---|
No cloaking | No cloaking | No cap |
No cloaking | other.blog.example (1p cloaking) | No cap |
No cloaking | tracker.example (3p cloaking) | 7-day cap |
abc123.edge.example (cloaking) | No cloaking | No cap |
abc123.edge.example (cloaking) | abc123.edge.example (matching cloaking) | No cap |
abc123.edge.example (cloaking) | other.blog.example (1p cloaking) | No cap |
abc123.edge.example (cloaking) | tracker.example (3p cloaking) | 7-day cap |
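The table can be reproduced with a short predicate. This is a sketch under stated assumptions: `site_of` approximates the registrable domain by taking the last two labels, which only works for the .example hosts used here, and comparing CNAMEs by site rather than by exact host is an assumption the source does not confirm. A CNAME of `None` means no cloaking.

```python
from typing import Optional


def site_of(host: str) -> str:
    """Crude registrable-domain stand-in: last two labels (assumption for this sketch)."""
    return ".".join(host.lower().split(".")[-2:])


def is_third_party_cname_cloaking(
    first_party_host: str,
    top_frame_cname: Optional[str],
    subresource_cname: Optional[str],
) -> bool:
    """True when a first-party subresource resolves through a third-party CNAME."""
    if subresource_cname is None:
        return False  # no cloaking at all
    if site_of(subresource_cname) == site_of(first_party_host):
        return False  # first-party cloaking
    if top_frame_cname is not None and site_of(subresource_cname) == site_of(top_frame_cname):
        return False  # matches the top frame host's own CNAME
    return True  # third-party cloaking: the 7-day cap applies
```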
Home Screen Web Application Domain Exempt From ITP
The first-party domain of home screen web applications is exempt from ITP’s 7-day cap on all script-writeable storage, i.e. ITP always skips that domain in its website data removal algorithm. In addition, the website data of home screen web applications is kept isolated from Safari and thus will not be affected by ITP’s classification of tracking behavior in Safari.