The Topics API #111

Closed
annevk opened this issue Dec 20, 2022 · 2 comments
Labels
concerns: internationalization This proposal doesn't sufficiently account for different languages or locales concerns: interoperability This proposal creates interop risk, e.g. due to vagueness concerns: privacy This proposal may cause privacy risk if implemented from: Google Proposed, edited, or co-edited by Google. position: oppose topic: privacy venue: PATCG Private Advertising Technology Community Group

Comments

@annevk
Contributor

annevk commented Dec 20, 2022

Request for position on an emerging web specification

Information about the spec

Design reviews and vendor positions

Anything else we need to know

Feedback was previously requested and given on webkit-dev, but colleagues and I wanted to reproduce it here due to formatting concerns and it generally being useful to have more of our positions here. This also gives others a chance to weigh in if they missed it the first time around.

@annevk annevk added topic: privacy venue: none from: Google Proposed, edited, or co-edited by Google. concerns: internationalization This proposal doesn't sufficiently account for different languages or locales concerns: interoperability This proposal creates interop risk, e.g. due to vagueness concerns: privacy This proposal may cause privacy risk if implemented labels Dec 20, 2022
@annevk
Contributor Author

annevk commented Dec 20, 2022

Our analysis of the proposal assumes full per-top-level-site partitioning and that no high-entropy device fingerprinting vector, such as IP address, is available cross-site. It’s important that any pre-existing privacy deficiencies on the web not be used as excuses for privacy deficiencies in new specs and proposals.

We do not think Topics API is a good addition to the web platform. Here’s why:

  • Cross-site data. We don’t think cross-site data about the user’s browsing behavior should be exposed in APIs. We’ve been working for ten years in the opposite direction, partitioning data per-top-level-site.
  • Cross-site sharing default. We don’t think cross-site data sharing should be on by default as a web platform feature. Users must have agency over expressing their personal interests to websites and third parties. A browser exposing this data by default is not acting as a user agent. Further, using the user’s browsing history as the basis of determining interests undermines users’ trust in the browser as their agent.
  • Cross-site targeting by default. We don’t think cross-site targeting of ads should be on by default as a web platform feature. Put another way, we don’t think cross-site targeting of ads should be the default experience on the web.
  • Safe to roam. The web should be safe to roam and the user agent should be working in that direction. Exposing cross-site data by default to facilitate personalized ad targeting would make the web less safe to roam. Users would always have to think twice about which sites they visit and how that can be used to manipulate or target them.
  • Enrichment of user profiles. Websites which already know a lot about a user can learn more through cross-site data APIs like Topics API. Prime examples of such sites are the user’s search engine or social networking sites. Worse, topics connected to the user’s browsing will evolve over time, allowing continuous enrichment of the user profile as an ongoing privacy exposure. An example: The user was interested in honeymoons, then baby clothing, then lawyers.
  • Sensitive topics. What counts as sensitive information differs between, for instance, cultures, religions, ages, communities, and individuals. It is therefore not just hard but also foolish to think that browser vendors can come up with a safe set of personalized topics to expose to ad networks.
  • Topic bias. We understand that the current set of topics is not the one intended to be used in production. However, the set shows a concerning affluent western lifestyle bias and we worry that the eventual standardized taxonomy will contain such biases too. A prime example in the current taxonomy is “World Music” as a term for all non-Western music.
  • Hidden patterns. We believe that technologies like machine learning will be able to glean personal data and patterns out of something like Topics API that go far beyond whatever “safe” set of topics browser vendors define.
  • Advantages established players. The Topics API will only provide cross-site topic data to callers who called the API in the past for this particular user and on a site about that topic. This benefits entities that have scripts or frames embedded on many sites, e.g., already prevalent ad trackers, or owners of embeds with an ostensibly non-ad-related purpose such as social or video. And it perpetuates the incentive for more embedding solely for the purpose of cross-site data usage and not for any clear user benefit, thus needlessly hurting performance and battery life.
  • Who will classify sites? The open questions at the end of the explainer suggest that a taxonomy should be produced, and that it should become an industry standard. A sample taxonomy is available. But a taxonomy (at least as presented) is just a list of categories. Who decides which sites or pages are in which category? Is this a globally maintained list? Would it be a Google-provided service that requires Google’s permission to access? Would each browser do it separately (and perhaps differently)? Would sites self-label? Perhaps the intent is that the industry standard taxonomy would bucket sites or pages in the categories, but if so that’s not clear from the explainer, and if not, it seems like a major problem left unaddressed.

As such I suggest we label this "position: oppose" on January 6, giving additional time due to the holidays.
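
For readers unfamiliar with the per-caller filtering mentioned under "Advantages established players", here is a minimal sketch of that one rule. It is a simplified model, not the actual Topics API surface: the `Visit` type, the `topicsVisibleTo` function, the example domains, and the topic labels are all hypothetical, and it ignores the other details of the proposal.

```typescript
// Hypothetical model for illustration only; none of these names come from the proposal.
type Visit = {
  site: string;             // top-level site the user visited
  topic: string;            // topic assigned to that site (assumed labels)
  callersPresent: string[]; // third parties that called the API on that site
};

// A caller only learns topics from sites where that same caller was
// embedded and called the API while the user was visiting.
function topicsVisibleTo(caller: string, history: Visit[]): string[] {
  const seen: string[] = [];
  for (const visit of history) {
    const wasPresent = visit.callersPresent.indexOf(caller) !== -1;
    if (wasPresent && seen.indexOf(visit.topic) === -1) {
      seen.push(visit.topic);
    }
  }
  return seen.sort();
}

// Example browsing history with one widely embedded caller and one small caller.
const browsingHistory: Visit[] = [
  { site: "travel.example", topic: "Honeymoons", callersPresent: ["big-adtech.example"] },
  { site: "baby.example", topic: "Baby clothing", callersPresent: ["big-adtech.example", "small-adtech.example"] },
  { site: "law.example", topic: "Lawyers", callersPresent: ["big-adtech.example"] },
];

const bigView = topicsVisibleTo("big-adtech.example", browsingHistory);
const smallView = topicsVisibleTo("small-adtech.example", browsingHistory);
console.log(bigView, smallView);
```

Under these assumptions, the caller embedded on all three sites sees every topic, while the caller embedded on one site sees only that site's topic, which is the asymmetry the bullet describes.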

@hober
Member

hober commented Mar 23, 2023

Closing as we've identified our position.
