Apple Versus Governments, Apple’s Legitimate Privacy Claims, Privacy and Paranoia

Good morning,

Today’s Daily Update is a follow-up to yesterday’s Weekly Article, “Apple’s Mistake”, which came out a bit later in the day; read that first if you missed it.

On to the update:

Apple Versus Governments

Apple published an FAQ in response to the ongoing pushback to its “Expanded Protections for Children”; it’s worth a read, but there are two questions that I wanted to focus on as follow-up.

Could governments force Apple to add non-CSAM images to the hash list?

Apple will refuse any such demands. Apple’s CSAM detection capability is built solely to detect known CSAM images stored in iCloud Photos that have been identified by experts at NCMEC and other child safety groups. We have faced demands to build and deploy government-mandated changes that degrade the privacy of users before, and have steadfastly refused those demands. We will continue to refuse them in the future. Let us be clear, this technology is limited to detecting CSAM stored in iCloud and we will not accede to any government’s request to expand it. Furthermore, Apple conducts human review before making a report to NCMEC. In a case where the system flags photos that do not match known CSAM images, the account would not be disabled and no report would be filed to NCMEC.

I absolutely believe Apple as far as the United States is concerned. First off, yes, Apple has absolutely demonstrated a willingness to stand up to government demands, and second, the 4th Amendment expressly forbids the government from mandating searches and seizures without a warrant. And, to be honest, I probably believe Apple as far as China is concerned as well: after all, the government already has access to Chinese iCloud accounts, so how much is to be gained by mandating that Apple’s CSAM-scanning mechanism introduce new hashes of interest to the Chinese government? That noted, it’s worth pointing out that Apple is making a promise here it can’t necessarily keep: the right of Apple users to not be searched is ultimately a function of the laws in the country in which they reside; to put it another way, the fact that Chinese iCloud data is accessible by the government is ultimately a function of users’ rights or lack thereof in China, not Apple’s decision-making.

What is far more interesting, and, as I noted yesterday, likely played a role in Apple’s decision to build this feature, is Europe. Both the UK and EU are well on their way to mandating scanning for CSAM, and while Apple previously changed the iCloud terms of service to give itself permission to scan for content on the server side, the low number of reports suggests that its scanning was quite limited (perhaps just to email?). Apple likely felt it had to expand its efforts, and it was better to act preemptively and in a way that it clearly believed was better.

Apple’s Legitimate Privacy Claims

This is where the second FAQ question I want to highlight comes in:

Why is Apple doing this now?

One of the significant challenges in this space is protecting children while also preserving the privacy of users. With this new technology, Apple will learn about known CSAM photos being stored in iCloud Photos where the account is storing a collection of known CSAM. Apple will not learn anything about other data stored solely on device.

Existing techniques as implemented by other companies scan all user photos stored in the cloud. This creates privacy risk for all users. CSAM detection in iCloud Photos provides significant privacy benefits over those techniques by preventing Apple from learning about photos unless they both match to known CSAM images and are included in an iCloud Photos account that includes a collection of known CSAM.

This is what I was referring to as Apple’s “particular vision of privacy”; that vision of privacy has two central components:

  1. First party data collection is much more private than data that is shared between third parties.
  2. Not collecting any data is the most private of all.

I have spent a lot of time discussing point 1 in the context of App Tracking Transparency (ATT); the implication of Apple’s approach is that collecting data about and targeting ads at users is fine as long as that data is all first party, which works to the benefit of companies like Amazon, Google (in part) and Apple itself. Facebook, on the other hand, and the multitude of smaller publishers and merchants that rely on its platform, are harmed, which is why Facebook is working to move merchants in particular onto its platform so that more data is first-party and thus allowed under Apple’s approach.

There are certainly arguments to be had about the validity of point 1; is the means by which a large corporation gets your data a meaningful differentiator, or does it just so happen to be the case that Apple’s definition works to its own advantage (and those of its closest partners)?

Point 2, on the other hand, seems much more straightforward: certainly it is better to not collect any data at all, right? Without question! Apple, justifiably, talks proudly of the fact it collects less data than other tech companies, even when folks (including yours truly) fretted that the company’s machine learning efforts would struggle for lack of data. In the case of photos specifically Apple often highlights the fact that all of its photo analysis is done on device, not in the cloud, as evidence of both its commitment to privacy and the capabilities of its systems-on-a-chip.

Apple’s argument is that its CSAM scanning is superior for broadly similar reasons: cloud scanning means looking at every photo, and requires in-depth investigation of every potential positive match; on-device scanning, on the other hand, means that Apple only ever sees suspicious photos, and even then only if the number of suspicious photos passes some sort of threshold. In other words, Apple is collecting much less data.
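To make the threshold idea concrete, here is a minimal sketch in Python. It is an illustration under loud assumptions, not Apple’s implementation: the hash values, the threshold of 2, and the names `Account`, `upload`, and `visible_to_server` are all hypothetical, and plain set membership stands in for the cryptography that would actually keep below-threshold matches unreadable.

```python
from dataclasses import dataclass, field

# Hypothetical values for illustration only; not Apple's real parameters.
KNOWN_HASHES = {"a1b2", "c3d4", "e5f6"}
THRESHOLD = 2


@dataclass
class Account:
    # Match records accumulated as photos are uploaded.
    vouchers: list = field(default_factory=list)


def upload(account: Account, photo_hash: str) -> None:
    """Device side: record a voucher only when the hash is in the known set.

    Photos that do not match contribute nothing the server can inspect.
    """
    if photo_hash in KNOWN_HASHES:
        account.vouchers.append(photo_hash)


def visible_to_server(account: Account) -> list:
    """Server side: matches become visible only at or above the threshold."""
    if len(account.vouchers) >= THRESHOLD:
        return list(account.vouchers)
    return []
```

In this toy model an account with a single match exposes nothing; only once a second match arrives does the server see anything, and even then it sees only the matching items, not the rest of the library.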

In fact, if you think about an iPhone’s long-existent photo analysis capability, it’s easy to see how both the iMessage nudity warnings and the decision to scan for CSAM feel like obvious next steps. The former is likely using the same sort of routines used to organize your photo library; the latter is using a perceptual hash solution from NCMEC to match photos against a pre-existing set (to be very clear, the processes are very different technically). In both cases, though, your phone is just doing a bit of work that you didn’t expressly authorize.
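For readers unfamiliar with the term, a perceptual hash is designed so that visually similar images produce similar hashes, unlike a cryptographic hash where any change scrambles the output. The sketch below uses a simple “average hash” as a stand-in; that choice is my assumption for illustration and is not the algorithm Apple or NCMEC uses. The function names and the tiny 2x2 “images” are hypothetical.

```python
def average_hash(pixels):
    """Return one bit per pixel: 1 if brighter than the image's mean, else 0.

    `pixels` is a 2D list of grayscale values (0-255).
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if p > mean else 0 for p in flat)


def hamming(a, b):
    """Count the positions where two equal-length hashes differ."""
    return sum(x != y for x, y in zip(a, b))


original = [[10, 200], [220, 30]]
brightened = [[20, 210], [230, 40]]  # same picture, uniformly brighter
different = [[200, 10], [30, 220]]   # light and dark regions swapped

# A small Hamming distance means "probably the same picture".
same_distance = hamming(average_hash(original), average_hash(brightened))
other_distance = hamming(average_hash(original), average_hash(different))
```

Matching against a pre-existing set then amounts to checking whether a photo’s hash is within some small distance of any hash on the list, which is a very different job from the classification routines that organize a photo library.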

There is, though, a line that was crossed: the outcome of your phone’s photo analysis is a better user experience when it comes to managing your photos; the outcome of iMessage scanning is a warning and potentially a message sent to your parents; the outcome of CSAM scanning is the suspension of your account and law enforcement at your door. Both of these are obviously good things in many respects, but what I think is provoking so much pushback is the sense that the phone is, at least from a certain perspective, acting against the interests of its owner who has no say in the matter (again, even if this is a good thing societally speaking).

In other words, what users care about is not just privacy but also control, at least in terms of their device.

Privacy and Paranoia

There is one other interesting angle to this controversy; recall that earlier this year Google faced widespread backlash of its own with its introduction of Federated Learning of Cohorts as an alternative to cookies. The entire idea of FLoC is that instead of collecting user data in the cloud the user’s devices would build profiles locally that wouldn’t be shared with anyone; from the GitHub Readme (emphasis mine):

A FLoC cohort is a short name that is shared by a large number (thousands) of people, derived by the browser from its user’s browsing history. The browser updates the cohort over time as its user traverses the web…

The browser uses machine learning algorithms to develop a cohort based on the sites that an individual visits. The algorithms might be based on the URLs of the visited sites, on the content of those pages, or other factors. The central idea is that these input features to the algorithm, including the web history, are kept local on the browser and are not uploaded elsewhere — the browser only exposes the generated cohort. The browser ensures that cohorts are well distributed, so that each represents thousands of people. The browser may further leverage other anonymization methods, such as differential privacy. The number of cohorts should be small, to reinforce that they cannot carry detailed information — short cohort names (“43A7”) can help make that clear.
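As a rough illustration of what “the browser only exposes the generated cohort” could look like, here is a minimal Python sketch. The use of SimHash over visited domains is an assumption on my part, since the excerpt above does not specify an algorithm; the function names are hypothetical, and the sketch does nothing to guarantee that each cohort contains thousands of people.

```python
import hashlib


def simhash(domains, bits=16):
    """Fold a set of domains into a single `bits`-wide fingerprint.

    Each domain votes +1 or -1 on every bit position; the sign of the
    total decides the output bit, so similar histories tend to collide.
    """
    counts = [0] * bits
    for domain in domains:
        digest = hashlib.sha256(domain.encode("utf-8")).digest()
        value = int.from_bytes(digest[:8], "big")
        for i in range(bits):
            counts[i] += 1 if (value >> i) & 1 else -1
    return sum(1 << i for i, c in enumerate(counts) if c > 0)


def cohort_name(history):
    """Return a short cohort label; the browsing history itself stays local."""
    return format(simhash(set(history), bits=16), "04X")
```

Calling `cohort_name(["news.example", "shop.example"])` returns a four-character hex label in the style of the README’s “43A7”; a website would see only that label, never the list of domains that produced it.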

I am not, to be clear, saying that FLoC, a technology developed to support online advertising, is in the same moral universe as a technology developed to find and report the most horrendous content there is; moreover, many of the objections to FLoC were because it violated privacy in novel ways. What is worth pointing out, though, is that this year we have now seen two proposals from tech companies to replace cloud-based tracking/scanning with on-device alternatives, and both generated huge amounts of pushback. Sure, some people don’t like the fact that websites track you when you visit, or scan all of your content, but that seems much less objectionable than your own device selling you out!

Again, FLoC had its own set of privacy problems, while the biggest problem with Apple’s latest announcement is arguably that they fumbled the PR roll-out. Still, I do wonder about the potential long-term shift to device-based advertising; there has always been a gap between the protestations of privacy advocates and regulators about the alleged harms of tracking and general public sentiment, which mostly tends towards apathy. What if what actually bothered people all along was not 3rd-party sites and services tracking them (after all, who trusts them?) but paranoia that their own devices are spying on them (think about all of those people worried that their phones are listening to them)? Both Apple and Google are headed in that direction, for privacy-preserving reasons no less, which is to say both may be on a path to much more of a backlash than they expected.

This Daily Update will be available as a podcast later today. To receive it in your podcast player, visit Stratechery.

The Daily Update is intended for a single recipient, but occasional forwarding is totally fine! If you would like to order multiple subscriptions for your team with a group discount (minimum 5), please contact me directly.

Thanks for being a supporter, and have a great day!