I’ve been giving some thought to proposed “Do Not Track” legislation. The proposals, currently being considered by the FTC and the legislature, seek to protect user privacy by giving us an enforceable way to tell online services not to track us. The adopted approach would establish some way for users to communicate their preference not to be tracked, and oblige service providers to honour that instruction.
Although the name evokes the FTC’s Do Not Call list, the appropriate implementation would be somewhat different. It is difficult or impossible to implement a list like Do Not Call, since there is no fundamental, persistent online identifier like a phone number. The best candidate – the IP address – changes frequently, and is often shared between several users. There have been various suggestions, but a commonly accepted approach is the x-do-not-track HTTP header. Without going into too much detail: when a browser accesses a website, it sends certain headers, letting the site know what language it wants, what sort of encoding to use, and so on. x-do-not-track would just be another optional header that some browsers send, indicating a binding request not to be tracked.
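To make the mechanism concrete, here is a minimal sketch of how a site might honour the header. It is an illustration only: the function names are hypothetical, and the assumption that a value of "1" signals the opt-out is mine, since none of the proposals discussed here fixes a value.

```python
def requests_no_tracking(headers):
    """Return True if the request carries the opt-out header.

    `headers` is a plain dict of header name -> value (a stand-in for
    whatever request object a real web framework would provide).
    """
    # HTTP header names are case-insensitive, so normalise before lookup.
    normalised = {name.lower(): value.strip() for name, value in headers.items()}
    # Assumption: the value "1" means "do not track me".
    return normalised.get("x-do-not-track") == "1"


def handle_request(headers, analytics_log):
    """Serve the same content to everyone; only record trackable visitors."""
    if not requests_no_tracking(headers):
        # Hypothetical tracking step: remember something about the visitor.
        analytics_log.append(headers.get("User-Agent", "unknown"))
    return "article body"
```

Note that the content returned is identical in both branches; the only difference is whether anything is written to the log.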
This is actually a pretty robust approach to the problem, though a few concerns remain unanswered. Other commentators, like Harlan Yu at Freedom to Tinker and Arvind Narayanan at 33bits, have suggested that this would result in a two-tiered web. That is, some services would refuse to provide users with content unless they disable x-do-not-track. I don’t find this to be the most compelling of possible concerns, since it can be solved legislatively with a provision like:
It shall be an offence under this act to refuse service on the basis of the instruction not to track. Any service, or part thereof, which can be provided to an untracked user, and is provided to trackable users, must be provided to an untracked user on the same terms as it is provided to trackable users.
I see a greater issue in the provision itself, that is: Do Not Track. My concern is reminiscent of Justice Black’s famous statement that “‘no law’ means no law”. If there are some users whom one cannot track, then one cannot keep any meaningful record of their use of the service. That means no accurate count of how many users access a service, nor even an estimate of what fraction of users request not to be tracked. For non-interactive content sites, this presents something of a concern. The New York Times, for instance, does not need to track users in order to show them articles. How, then, should the Grey Lady bill its advertisers, since it can certainly no longer use the number of impressions?
The above paragraph does make one slight assumption. Although it may not be possible to determine what fraction of a site’s visitors are untrackable through automated means, it is still possible to get this information in other ways – such as by asking nicely. It only takes one daring social scientist or market research firm to survey users in order to produce reliable data about various demographics’ use of x-do-not-track. Then it just takes a little statistical analysis for a service to estimate the size of its untrackable audience on the basis of its tracked population.
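The “little statistical analysis” could be as simple as the following sketch. It assumes that the survey is representative, and that within each demographic group a site’s visitors opt out at the surveyed rate; the function name and the example groups are hypothetical. If a fraction r of a group opts out, the tracked visitors are the remaining (1 - r), so the untracked count is roughly tracked × r / (1 - r).

```python
def estimate_untracked(tracked_by_group, opt_out_rate_by_group):
    """Estimate untracked visitors per demographic group.

    tracked_by_group: group name -> visitors the site was allowed to count
    opt_out_rate_by_group: group name -> surveyed fraction sending the header
    """
    estimates = {}
    for group, tracked in tracked_by_group.items():
        rate = opt_out_rate_by_group[group]
        if not 0 <= rate < 1:
            raise ValueError(f"opt-out rate for {group!r} must be in [0, 1)")
        # tracked = total * (1 - rate), so untracked = total * rate
        #         = tracked * rate / (1 - rate)
        estimates[group] = tracked * rate / (1 - rate)
    return estimates


# Hypothetical figures: 750 tracked visitors in a group where a survey
# found that a quarter of people send the header.
print(estimate_untracked({"18-34": 750}, {"18-34": 0.25}))
```

With those made-up numbers, 750 tracked visitors at a 25% opt-out rate imply about 250 untracked ones, for an audience of roughly 1,000. The estimate is only as good as the survey behind it, and it says nothing about any individual user.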
This actually has the potential to be good news for such services. If users can now visit these sites confident in their anonymity, they are less likely to block the number one source of tracking: advertising. Surely a world where the privacy-conscious see and click on ads is better than one where they habitually disable them altogether?