ProPublica, which is typically known for its excellent journalism, published a particularly terrible piece earlier this week that fundamentally miscast how encryption works and how Facebook, through WhatsApp, works to keep communications secure. The article, “How Facebook Undermines Privacy Protections for Its 2 Billion WhatsApp Users,” focuses on two so-called problems.
The So-Called Privacy Problems with WhatsApp
First, the authors explain that WhatsApp has a system whereby recipients of messages can report content they have received to WhatsApp on the basis that it is abusive or otherwise violates WhatsApp’s Terms of Service. The article frames this reporting process as undermining privacy because secured messages are not kept solely between the sender(s) and recipient(s) of the communications but can be sent to other parties, such as WhatsApp. In effect, a recipient’s ability to voluntarily forward messages they have received to WhatsApp is cast as breaking the privacy promises that WhatsApp has made.
Second, the authors note that WhatsApp collects a large volume of metadata as people use the application. Using lawful processes, government agencies have compelled WhatsApp to disclose metadata on some of its users in order to pursue investigations and secure convictions against individuals. The case that the article focuses on involves a government employee who leaked confidential banking information to BuzzFeed, which subsequently reported on it.
Assessing the Problems
In the case of forwarding messages for abuse reporting purposes, encryption is not broken and the feature is not new. These kinds of processes offer a mechanism that lets individuals identify and report problematic content themselves. Such content can include child grooming, the communication of illicit or inappropriate messages or audio-visual content, or other abusive information.
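To make the point concrete, here is a minimal sketch of why recipient-initiated reporting leaves the encryption intact. It is a toy model and not WhatsApp’s actual protocol: the `Server` class, its method names, and the XOR one-time pad standing in for real end-to-end encryption are all assumptions made purely for illustration.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Toy one-time-pad cipher (a stand-in for real end-to-end encryption)."""
    if len(key) < len(data):
        raise ValueError("key must be at least as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


class Server:
    """Hypothetical relay that only ever holds what it is handed."""

    def __init__(self) -> None:
        self.relayed: list[bytes] = []  # ciphertexts seen in transit
        self.reports: list[str] = []    # plaintexts recipients chose to submit

    def relay(self, ciphertext: bytes) -> bytes:
        self.relayed.append(ciphertext)
        return ciphertext

    def receive_report(self, plaintext: str) -> None:
        self.reports.append(plaintext)


message = "an abusive message"
# The key is known only to the sender and the recipient, never to the server.
shared_key = secrets.token_bytes(len(message.encode()))

server = Server()
# In transit, the server handles ciphertext only.
ciphertext = server.relay(xor_bytes(message.encode(), shared_key))

# The recipient decrypts on their own device...
received = xor_bytes(ciphertext, shared_key).decode()

# ...and may then voluntarily forward the plaintext as an abuse report.
server.receive_report(received)
```

In this sketch the server learns the content only because an endpoint that already held the plaintext chose to share it, much as a recipient could forward a screenshot.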
What we do learn, however, is that the ‘reactive’ and ‘proactive’ methods of detecting abuse need to be fixed. In the case of the former, only about 1,000 people are responsible for receiving and reviewing the reported content after it has first been filtered by an AI:
Seated at computers in pods organized by work assignments, these hourly workers use special Facebook software to sift through streams of private messages, images and videos that have been reported by WhatsApp users as improper and then screened by the company’s artificial intelligence systems. These contractors pass judgment on whatever flashes on their screen — claims of everything from fraud or spam to child porn and potential terrorist plotting — typically in less than a minute.
Further, the employees often rely on machine learning-based translations of content, which makes it challenging to assess what is, in fact, being communicated in abusive messages. As reported,
… using Facebook’s language-translation tool, which reviewers said could be so inaccurate that it sometimes labeled messages in Arabic as being in Spanish. The tool also offered little guidance on local slang, political context or sexual innuendo. “In the three years I’ve been there,” one moderator said, “it’s always been horrible.”
There are also proactive modes of watching for abusive content using AI-based systems. As noted in the article,
Artificial intelligence initiates a second set of queues — so-called proactive ones — by scanning unencrypted data that WhatsApp collects about its users and comparing it against suspicious account information and messaging patterns (a new account rapidly sending out a high volume of chats is evidence of spam), as well as terms and images that have previously been deemed abusive. The unencrypted data available for scrutiny is extensive. It includes the names and profile images of a user’s WhatsApp groups as well as their phone number, profile photo, status message, phone battery level, language and time zone, unique mobile phone ID and IP address, wireless signal strength and phone operating system, as a list of their electronic devices, any related Facebook and Instagram accounts, the last time they used the app and any previous history of violations.
Unfortunately, the AI often makes mistakes. This led one interviewed content reviewer to state that, “[t]here were a lot of innocent photos on there that were not allowed to be on there … It might have been a photo of a child taking a bath, and there was nothing wrong with it.” Often, “the artificial intelligence is not that intelligent.”
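The spam example in the quoted passage (a new account rapidly sending out a high volume of chats) amounts to a rule applied to metadata alone. The sketch below shows how such a rule might look and why it can misfire. The thresholds, field names, and the `looks_like_spam` function are hypothetical and are not drawn from WhatsApp’s systems.

```python
from dataclasses import dataclass

# Hypothetical thresholds, chosen only for illustration.
NEW_ACCOUNT_MAX_AGE_HOURS = 24
HIGH_VOLUME_CHATS_PER_HOUR = 100


@dataclass
class AccountMetadata:
    account_age_hours: float
    chats_sent_last_hour: int


def looks_like_spam(meta: AccountMetadata) -> bool:
    """Flag new accounts that send many chats; message content is never read."""
    is_new = meta.account_age_hours <= NEW_ACCOUNT_MAX_AGE_HOURS
    is_high_volume = meta.chats_sent_last_hour >= HIGH_VOLUME_CHATS_PER_HOUR
    return is_new and is_high_volume


spammer = AccountMetadata(account_age_hours=2, chats_sent_last_hour=500)
# A legitimate new user inviting many people to an event trips the same rule.
event_organizer = AccountMetadata(account_age_hours=5, chats_sent_last_hour=150)
long_time_user = AccountMetadata(account_age_hours=10_000, chats_sent_last_hour=150)

flags = [looks_like_spam(m) for m in (spammer, event_organizer, long_time_user)]
```

Here `flags` evaluates to `[True, True, False]`: the rule cannot tell the spammer from the event organizer, which is the kind of mistake that still requires human review.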
The vast collection of metadata has long been a reported concern with WhatsApp and, in fact, is one of the reasons why many individuals advocate for the use of Signal instead. The reporting in the ProPublica article helpfully summarizes the vast amount of metadata that is collected, but that collection, in and of itself, is not evidence that Facebook or WhatsApp have transformed the application into one that inappropriately intrudes into persons’ privacy.
ProPublica Sets Back Reasonable Encryption Policy Debates
By suggesting that what WhatsApp has implemented is somehow wrong, the article makes it more challenging for other companies to deploy similar reporting features without fearing that their decision will be reported on as ‘undermining privacy’. While there may be a valid policy discussion to be had (is a reporting process the correct way of dealing with abusive content and messages?), the authors didn’t go there. Nor did they seriously investigate whether additional resources should be devoted to analyzing reported content, or talk with artificial intelligence experts or machine-based translation experts about whether Facebook’s efforts to automate the reporting process are adequate, appropriate, or flawed from the start. All of those would be very interesting, valid, and important contributions to the broader discussion about integrating trust and safety features into encrypted messaging applications. But those are not things that the authors chose to delve into.
The authors could also have discussed the broader importance (and challenges) of building out messaging systems that can deliberately conceal metadata, and the benefits and drawbacks of such systems. While the authors do discuss how metadata can be used to crack down on individuals in government who leak data, as well as to assist in criminal investigations and prosecutions, little is said about what kinds of metadata are most important to conceal and the tradeoffs in doing so. Again, there are some who think that all or most metadata should be concealed, and others who hold opposite views: there is room for a reasonable policy debate to be had and reported on.
Unfortunately, instead of actually taking up and reporting on the very valid policy discussions that are at the edges of their article, the authors chose to be bombastic and asserted that WhatsApp was undermining the privacy protections that individuals thought they had when using the application. It’s bad reporting, insofar as it distorts the facts, and is particularly disappointing given that ProPublica has shown it has the chops to do good investigative work that is well sourced and nuanced in its outputs. This article, however, absolutely failed to make the cut.