Alibaba’s Jack Ma Urges China to Use Data to Combat Crime

Bloomberg reporting on Alibaba’s Jack Ma:

In his speech, Ma stuck mainly to the issue of crime prevention. In Alibaba’s hometown of Hangzhou alone, the number of surveillance cameras may already surpass that of New York’s, Ma said. Humans can’t handle the sheer amount of data amassed, which is where artificial intelligence comes in, he added.

“The future legal and security system cannot be separated from the internet and big data,” Ma said.

In North America, we’re trialling automated bail systems, where the amount set and the likelihood of receiving bail are predicated on big data algorithms. While it’s important to look abroad and see what foreign countries are doing, we mustn’t forget what is being done here in the process.


This Mathematician Says Big Data Punishes Poor People

This Mathematician Says Big Data Punishes Poor People:

O’Neil sees plenty of parallels between the usage of Big Data today and the predatory lending practices of the subprime crisis. In both cases, the effects are hard to track, even for insiders. Like the dark financial arts employed in the run up to the 2008 financial crisis, the Big Data algorithms that sort us into piles of “worthy” and “unworthy” are mostly opaque and unregulated, not to mention generated (and used) by large multinational firms with huge lobbying power to keep it that way. “The discriminatory and even predatory way in which algorithms are being used in everything from our school system to the criminal justice system is really a silent financial crisis,” says O’Neil.

The effects are just as pernicious. Using her deep technical understanding of modeling, she shows how the algorithms used to, say, rank teacher performance are based on exactly the sort of shallow and volatile type of data sets that informed those faulty mortgage models in the run up to 2008. Her work makes particularly disturbing points about how being on the wrong side of an algorithmic decision can snowball in incredibly destructive ways—a young black man, for example, who lives in an area targeted by crime fighting algorithms that add more police to his neighborhood because of higher violent crime rates will necessarily be more likely to be targeted for any petty violation, which adds to a digital profile that could subsequently limit his credit, his job prospects, and so on. Yet neighborhoods more likely to commit white collar crime aren’t targeted in this way.

In higher education, the use of algorithmic models that rank colleges has led to an educational arms race where schools offer more and more merit rather than need based aid to students who’ll make their numbers (thus rankings) look better. At the same time, for-profit universities can troll for data on economically or socially vulnerable would be students and find their “pain points,” as a recruiting manual for one for-profit university, Vatterott, describes it, in any number of online questionnaires or surveys they may have unwittingly filled out. The schools can then use this info to funnel ads to welfare mothers, recently divorced and out of work people, those who’ve been incarcerated or even those who’ve suffered injury or a death in the family.

The usage of Big Data to inform all aspects of our lives, with and without our knowledge, matters not just because it dictates the life chances that are presented or denied to us. It also matters because the artificial intelligence systems that are being developed and deployed are learning from the data that is collected. And those AI systems, themselves, can be biased and inaccessible to third-party audit.

Corporations are increasingly the substitutes for core state institutions. And as they collect and analyze data in bulk, and hide away their methods of presenting data on behalf of states (or in lieu of past state institutions), the public is left vulnerable not just to corporate malice but also to corporate disinterest. Worse, this is a kind of disinterest that is difficult to challenge in the absence of laws compelling corporate transparency.


Policy – Privacy Paranoia: Is Your Smartphone Spying On You?

Policy – Privacy Paranoia: Is Your Smartphone Spying On You?:

Privacy alarmism is one act in a bigger spectacle. In alarmists’ minds, something could go terribly wrong, and although it never has nor is it likely to happen, we should change the world and impose new political and bureaucratic order to prepare for it. Privacy concerns in general are fertile breeders of this pattern, and have already inflicted on us useless and expensive laws like HIPAA and FERPA. Now, privacy alarmism has set its sights on the biggest prize: the shrinking of Big Data.

While I’m glad that the author has apparently never suffered an issue linked to a privacy infringement, the same cannot be said for an enormous percentage of the world’s population. Mass intrusion, with and without consent, into communications privacy is a prominent issue internationally because of how private and public bodies alike exploit information that is collected.

We are functionally experimenting on the entire population when collecting and applying math to enormous datasets. It is possible to say that there has been no harm, ever, to date. But doing so depends on ignoring the lived reality of many of the persons impacted by big data and digital technology.


We agree that Cloud Computing, the Internet of Things, and Big Data analytics are all trends that may yield remarkable new correlations, insights, and benefits for society at large. While we have no intention of standing in the way of progress, it is essential that privacy practitioners participate in these efforts to shape trends in a way that is truly constructive, enabling both privacy and Big Data analytics to develop, in tandem.

There is a growing understanding that innovation and competitiveness must be approached from a “design-thinking” perspective — namely, viewing the world to overcome constraints in a way that is holistic, interdisciplinary, integrative, creative and innovative. Privacy must also be approached from the same design-thinking perspective. Privacy and data protection should be incorporated into networked data systems and technologies by default, and become integral to organizational priorities, project objectives, design processes, and planning operations. Ideally, privacy and data protection should be embedded into every standard, protocol, and data practice that touches our lives. This will require skilled privacy engineers, computer scientists, software designers and common methodologies that are now being developed, hopefully to usher in an era of Big Privacy.

We must be careful not to naively trust data users, or unnecessarily expose individuals to new harms, unintended consequences, power imbalances and data paternalism. A “trust me” model will simply not suffice. Trust but verify — embed privacy as the default, thereby growing trust and enabling confirmation of trusted practices.

I’m generally sympathetic to the arguments made in this article, though I have a series of concerns that are (I hope) largely the result of the authors trying to write an inoffensive article that could be acted on by large organizations. To begin, while I understand that Commissioner Cavoukian has built her reputation on working with partners rather than radically opposing corporations’ behaviours, I’m left asking: what constitutes ‘progress’ for herself and her German counterpart, Dr. Dix?

Specifically, Commissioners Cavoukian and Dix assert that they have no intention to stand in the way of progress and (generally) that a more privacy-protective approach means we can enjoy progress and privacy at the same time. But how do the Commissioners ‘spot’ progress? How do they know what to oppose and not oppose? When must, and mustn’t, they stand in the way of a corporation’s practices?

The question of defining progress is tightly linked with my other concern from this quoted part of their article. Specifically, the Commissioners acknowledge that a ‘positive-sum’ approach to privacy and progress requires “skilled privacy engineers, computer scientists, software designers and common methodologies that are now being developed, hopefully to usher in an era of Big Privacy.” That these groups are important is true. But where are the non-engineers, non-software designers, and (presumably) non-lawyers? Social scientists and arts and humanities scholars and graduates can also contribute to sensitizing organizations’ understandings of privacy, of user interests, and the history of certain decisions.

Privacy isn’t something that is only understandable by lawyers or engineers. And, really, it would be better understood and protected if there were more people involved in the discussion. Potential contributors to the debates shouldn’t be excluded simply because they contest or demand definitions of ‘progress’ or come from a non-lawyerly or computer-development background. Rather, they should be welcomed as expanding the debate outside of the contemporary echo chamber of the usually-counted disciplinary actors.


Shaping ideas is, of course, easier said than done. Bombarding people with ads only works to a degree. No one likes being told what to think. We grow resistant to methods of persuasion that we see through—just think of ads of yesteryear, and of how corny they feel. They worked in their day, but we’re alert to them now. Besides, blanket coverage isn’t easy to achieve in today’s fragmented media landscape. How many channels can one company advertise on? And we now fast-forward through television commercials, anyway. Even if it were possible to catch us through mass media, messages that work for one person often fail to convince others.

Big-data surveillance is dangerous exactly because it provides solutions to these problems. Individually tailored, subtle messages are less likely to produce a cynical reaction. Especially so if the data collection that makes these messages possible is unseen. That’s why it’s not only the NSA that goes to great lengths to keep its surveillance hidden. Most Internet firms also try to monitor us surreptitiously. Their user agreements, which we all must “sign” before using their services, are full of small-font legalese. We roll our eyes and hand over our rights with a click. Likewise, political campaigns do not let citizens know what data they have on them, nor how they use that data. Commercial databases sometimes allow you to access your own records. But they make it difficult, and since you don’t have much right to control what they do with your data, it’s often pointless.

This is why the state-of-the-art method for shaping ideas is not to coerce overtly but to seduce covertly, from a foundation of knowledge. These methods don’t produce a crude ad—they create an environment that nudges you imperceptibly. Last year, an article in Adweek noted that women feel less attractive on Mondays, and that this might be the best time to advertise make-up to them. “Women also listed feeling lonely, fat and depressed as sources of beauty vulnerability,” the article added. So why stop with Mondays? Big data analytics can identify exactly which women feel lonely or fat or depressed. Why not focus on them? And why stop at using known “beauty vulnerabilities”? It’s only a short jump from identifying vulnerabilities to figuring out how to create them. The actual selling of the make-up may be the tip of the iceberg.


We can draw a distinction here between Big Data—the stuff of numbers that thrives on correlations—and Big Narrative—a story-driven, anthropological approach that seeks to explain why things are the way they are. Big Data is cheap where Big Narrative is expensive. Big Data is clean where Big Narrative is messy. Big Data is actionable where Big Narrative is paralyzing.

The promise of Big Data is that it allows us to avoid the pitfalls of Big Narrative. But this is also its greatest cost. With an extremely emotional issue such as terrorism, it’s easy to believe that Big Data can do wonders. But once we move to more pedestrian issues, it becomes obvious that the supertool it’s made out to be is a rather feeble instrument that tackles problems quite unimaginatively and unambitiously. Worse, it prevents us from having many important public debates.

As Band-Aids go, Big Data is excellent. But Band-Aids are useless when the patient needs surgery. In that case, trying to use a Band-Aid may result in amputation. This, at least, is the hunch I drew from Big Data.


Dr. Pentland, an academic adviser to the World Economic Forum’s initiatives on Big Data and personal data, agrees that limitations on data collection still make sense, as long as they are flexible and not a “sledgehammer that risks damaging the public good.”

He is leading a group at the M.I.T. Media Lab that is at the forefront of a number of personal data and privacy programs and real-world experiments. He espouses what he calls “a new deal on data” with three basic tenets: you have the right to possess your data, to control how it is used, and to destroy or distribute it as you see fit.

Personal data, Dr. Pentland says, is like modern money — digital packets that move around the planet, traveling rapidly but needing to be controlled. “You give it to a bank, but there’s only so many things the bank can do with it,” he says.

His M.I.T. group is developing tools for controlling, storing and auditing flows of personal data. Its data store is an open-source version, called openPDS. In theory, this kind of technology would undermine the role of data brokers and, perhaps, mitigate privacy risks. In the search for a deep fat fryer, for example, an audit trail should detect unauthorized use.

So, I don’t really get how Pentland’s system is going to work any better than the Platform for Privacy Preferences (P3P) work that was done a decade ago. Spoiler alert: P3P failed. Hard. And it was intended to enhance users’ privacy online (by letting them establish controls on how their personal information was accessed and used) whilst simultaneously giving industry something to point to, in order to avoid federal regulation.

There is a prevalent strain of liberalism that assumes that individuals, when empowered, are best suited to control the dissemination of their personal information. However, it assumes that knowledge, time, and resourcing are equal amongst all parties. This clearly isn’t the case, nor is it the case that individuals are going to be able to learn when advertisers and data miners don’t respect privacy settings. In effect: control does not necessarily equal knowledge, nor does it necessarily equal capacity to act given individuals’ often limited fiscal, educational, temporal, or other resources.

Big data: the greater good or invasion of privacy?

Chatterjee has a good, quick article on the significance of ‘big data.’ Note the experts’ warning that, as a result of massive data aggregation, almost all individuals will have secret or sensitive information about themselves stored, traded, or used in the course of companies’ daily activities. This information isn’t necessarily about anything illegal, but legality is not the sole benchmark for whether humans want others to know things about them: embarrassing, shameful, or similar information that may not break the law could be financially, personally, or emotionally damaging should it be provided to third parties.

Also, take note of Ohm’s warning that we should slow down and think about what is happening with regard to massive data aggregation and mining; we shouldn’t just commit ourselves to pushing the ‘privacy envelope.’ Rushing headlong into, and accepting, novel technical structures that invisibly affect billions, with little clear accountability for corporate data mining practices, is a recipe for constructing future harms.

Why I’m quitting Facebook

I left Facebook a long time ago, before many of the current realities of that ecosystem. Rushkoff didn’t leave for the same reasons I did (which stemmed from philosophical conceptions of temporality, time, and privacy) but his reasons echo those I keep hearing from undergrads. It isn’t just that Facebook isn’t ‘cool’; they’re spending less time on the site because the company is increasingly seen as manipulative and secretive, and as portraying users in ways antithetical to how the users perceive themselves.

What is perhaps most concerning is what will happen to all the data the company has amassed if/when it implodes like MySpace did. What if, in five or seven years, Facebook effectively closes shop: who will get the mass of data that the company has collected, and how will they subsequently disseminate or manipulate it? It’s this broader concern about long-term use of incredibly intimate data that leaves me most leery of corporate-hosted social media platforms, and it’s an issue that I really don’t think people appreciate. But, then, I guess not a lot of people really remember the dot com crash…


The Big Threats to Internet Security

Dan Goodin has a good piece on one of Bruce Schneier’s recent talks. From the top of the article:

Unlike the security risks posed by criminals, the threat from government regulation and data hoarders such as Apple and Google are more insidious because they threaten to alter the fabric of the Internet itself. They’re also different from traditional Internet threats because the perpetrators are shielded in a cloak of legitimacy. As a result, many people don’t recognize that their personal information or fortunes are more susceptible to these new forces than they ever were to the Russian Business Network or other Internet gangsters.

The notion that government (largely composed of security novices), large corporations, and a feudal security environment (where we trust Apple, Google, etc. instead of having a generalizable good surveillance footprint) are key threats to security is not terribly new. This said, Bruce (as always) does a terrific job of explaining the issues in technically accurate ways that are simultaneously accessible to the layperson. Read the article; it’s well worth your time and will quickly demonstrate some of the ‘big’ threats to online security, privacy, and liberty.