Yes, the Apache Foundation Should Dump Accumulo

This post follows on from my previous one, which has the background and links. In brief, the Apache Foundation is hosting the Accumulo project. Accumulo is software created by the NSA and handed to Apache, and it is at the heart of the NSA’s surveillance technology stack. Now that we know about the use of the technology, Apache has the opportunity to distance itself from the NSA surveillance scandal, and should do so.

How should we think about the role of Apache in the NSA surveillance scandal? Perhaps a good place to look is the work of respected open internet advocates like the OpenNet Initiative. So let’s do that.

A couple of years ago Helmi Noman and Jillian York of the OpenNet Initiative published a bulletin called West Censoring East: The Use of Western Technologies by Middle East Censors, 2010-2011. The bulletin documented network filtering of the internet by national governments, and “the use of American- and Canadian-made software for the purpose of government-level filtering in the Middle East and North Africa”. The goal of the report was to inform a “genuine discussion of the ethics and practice of providing national censorship technology and services”. Just to be clear, and for what little it is worth, the report seems admirable to me. The ethical stances it takes were reiterated by Rebecca MacKinnon when she wrote about it last year in her influential book “The Consent of the Networked”. What’s interesting now is to read the report, note the ethical stances it takes on Western companies providing services that support authoritarian actions by national governments, and apply those lessons to Apache and the NSA. The parallels are, I hope, obvious.

The report concludes that “Western companies are playing a role in the national politics of many countries around the world. By making their software available to the regimes, they are potentially taking sides against citizens and activists who are prevented from accessing and disseminating content thanks in part to filtering software.” The authors complain that “companies appear to have done little to curb the use of their tools–if not offering them outright for that purpose–for government-level censorship. These companies seem not to have adopted policies and procedures to safeguard freedom of expression in the event that states rather than parents and schools use their tools, as their products are being openly used by several state-run ISPs to limit what citizens can and cannot access online.” The final sentence states that “Such companies must recognize the role their tools play in the international landscape and set forth policies that protect Internet users’ right to free expression–or at least put them on record about the role that they play.”

The technologies that the companies are providing are general purpose technologies: almost everyone would agree that internet filtering technologies have valid uses by parents and schools, for example. It’s not the technology itself that is offensive, at least to anyone who is not happy with the idea of kindergarten kids stumbling across violent pornographic images. It’s the relationship between the companies and their customers: the companies are providing a service, knowing the use to which it is going to be put. The report expects companies to think about the use of their tools and to take action to prevent them being used in ways that curb freedoms. It expects companies to limit the use of their tools.

The role of Apache as the host of the NSA-initiated Accumulo project is directly parallel to the role of Western companies providing filtering software that is used by authoritarian regimes to curb freedom of speech. So, in the light of the OpenNet report, how would the continued hosting of Accumulo look?

  • Is Apache providing a service to the NSA? Yes it is. Some people have been telling me that it’s not, or that it is but it’s unimportant. Both of which seem positively bizarre to me. The NSA took a deliberate decision, after developing Accumulo, that the best way forward was to open source it and look to a private vendor (sqrrl) to continue to provide a distribution that matches their needs. Apache is instrumental in carrying forward that plan.
  • The NSA could get their software some other way. This is irrelevant. The OpenNet Report does not let McAfee off the hook because Symantec provides a similar service, and we should not let Apache off the hook either.
  • The Accumulo software is general purpose: does that matter? No, it doesn’t. First, it’s not that general purpose. It’s not like lightbulbs: it is data collection and data analysis software in the middle of a controversy over data collection and data analysis, and it is general purpose only for anyone who has a data centre, a few petabytes of data to process, and a need for detailed access controls over who can see that data. That’s not very general. Second, Apache now knows the uses to which the software is being put, just like the companies providing software to the governments of the Middle East knew how their software was being used once OpenNet reported on it.
  • Why go after Apache, when they are one of the good guys? Because their declared mandate and their broad membership make it more likely that they will take a stand. It’s not “going after Apache”, it’s getting Apache to do the right thing. It won’t stop the NSA but it limits the breadth of collaboration. I don’t particularly think of Apache as “one of the good guys” because the whole good guys/bad guys way of thinking seems to lead naturally to double standards, but I’m not out to get them, I just hope they do the right thing now they see how their efforts are being used.

Especially for people outside the USA, putting pressure on an international organization seems a useful way to go. If anyone is interested in taking this up, maybe we can put together a petition at least. Please contact me in the comments if you are interested.

Created: 2013-06-15 Sat 14:44

Emacs 24.3.1 (Org mode 8.0.3)

Should the Apache Foundation delist Accumulo?

The Apache Foundation hosts the Apache Accumulo project, which is a data storage and retrieval system for big data created by the NSA in 2008 and submitted to Apache in 2011. Derrick Harris at GigaOm describes Accumulo as “The technological linchpin to everything the NSA is doing from a data-analysis perspective”; it is probably part of the BoundlessInformant open source stack (see this presentation [PDF]) that stores and analyzes the Verizon FISA data.

The Apache Foundation “provides support for the Apache community of open-source software projects, which provide software products for the public good.” It looks to me like Accumulo is outside that mandate.

The Apache Foundation may, because of its membership, be more open to pressure than other organizations involved in the NSA’s big data effort. Are there grounds for a campaign to pressure Apache into removing Accumulo from its list of projects?

There may also be questions about more general-purpose projects that complement Accumulo, like Apache Hadoop, Apache Zookeeper, and Apache Thrift, but these were not designed so specifically for the NSA’s data handling needs as Accumulo.

Meanwhile, of course, stopwatching.us.

Update June 15: Follow-up post, yes it should.

Open Wide: Me at The New Inquiry

I am thrilled to have an essay at The New Inquiry, a great publication that usually features really provocative writing from people who are half my age and twice as well-read.

The essay is called Open Wide and is about the difficult relationship between commons and private capital, particularly digital commons. It was inspired by David Harvey’s 2012 book of essays Rebel Cities; Harvey doesn’t talk about digital commons at all, but he has a lot of enlightening things to say about urban commons. While he sees commons as “spaces of hope for the construction of… a vibrant anti-commodification politics”, Harvey is far more hard-headed than most about the challenges that face commons-based production and about the effects that private capital can have on commons. Much of what he had to say has clear implications for the world of digital production, where leading thinkers have systematically ignored the issues Harvey raises.

The New Inquiry publishes challenging and difficult pieces, and this is not a particularly easy read: an attempt to be theoretical and literary at the same time. I owe a particular debt to Rob Horning, whose editing made a huge difference: any bite and focus that it has is due largely to Rob’s engagement. Faults and errors are, of course, my own.

Free Software and Surveillance

There is much that is moving and challenging in Jacob Appelbaum’s 29C3 keynote from December 2012, about the surveillance state, and Appelbaum has earned the right to be listened to from his work on the Tor Project. But…

 

At several places Appelbaum asserts that creating free software is a way of acting against institutions such as the NSA, and a way of building a better world. So at 12’01”: “It is possible to make a living making free software for freedom, instead of closed source proprietary malware for cops”, and at 40’50”: “everyone that’s worked on free software and open source software… these are things we should try to focus on… When we build free and open source software… we are enabling people to be free in ways that they were not. Literally, people who write free software are granting liberties.”

The picture of hackers versus spooks, positioning free and open source software as an alternative to the surveillance technologies of the NSA, just doesn’t hold up. Appelbaum must know that the NSA has a long history of engagement with open-source software, so “closed source proprietary malware for cops” mischaracterizes the technology of surveillance. The NSA Boundless Informant data-mining tool proclaims that it “leverages FOSS technology”, specifically the Hadoop File System, MapReduce (perhaps built on the Apache Accumulo project, which was created by the NSA and contributed to the Apache Foundation), and CloudBase.

These are just the most recent examples: the NSA holds Open Source Industry Days, like this one last year; it developed the SELinux mechanisms for supporting access control security policies, which have been integrated into the mainline Linux kernel since version 2.6. There’s a good chance that the NSA’s huge new data center in Bluffdale, Utah, which Appelbaum describes at the beginning of his talk, is running open source SELinux software on every computer.

And beyond the NSA, the new set of “Big Data” technologies associated with data acquisition and analysis has strong open source roots. These are the file systems, data storage systems and data-processing systems built to manage data sets that span so many disks that routine failure of servers is expected, and tolerance for such failure must be built into the system. While much of the initiative came from proprietary systems built at the web giants (Google File System, Google Bigtable, Amazon Dynamo), the open source implementations of Hadoop File System, Hadoop MapReduce, and databases such as MongoDB and Cassandra are becoming the industry norm. Surveillance is as much an open source phenomenon as a closed source one.

These are all well-known facts, but maybe they need to be restated. Let me be clear: I’m not reversing Appelbaum’s claim. There is a great deal of closed source software around the surveillance and internet control landscape as well. And (full disclosure), my salary is paid by (but I do not speak for) a company that mainly makes its money off closed source software, so I’m not claiming a moral high ground here. But to trumpet free and open source software as an alternative to the surveillance systems it has helped to build is nothing but wishful thinking.

—-

Date: 2013-06-10 Mon.

FutureEverything: Notes Against Openness

I’m really looking forward to being part of FutureEverything in Manchester next week, where I’ll be a panellist at Open Data Manchester on Tuesday and at Policies and Politics of Open Data on Thursday. Each event starts with five-minute lead-ins from the panel members. Some of the panellists are real experts who know more than I do about open data, but “in for a penny, in for a pound”: so on Tuesday I’ll use my five minutes to argue against standards (and especially universal standards), and on Thursday I’ll argue that openness is an idea that has outlived its usefulness.

Here are notes for Thursday’s opening remarks, which will be familiar to regular readers. I think I’ll have to cut them down a bit for time.


We all know that the ideas and actions around “Open Government Data” have created a very wide umbrella that covers many different agendas. It covers civil liberties campaigners, civic activists, startups, politicians from across the political spectrum, and major international corporations. And we all know that those agendas and groups are a bit uncomfortable being in such close proximity. But like “freedom”, “openness” is something that everyone can agree on, and it’s served to paper over the cracks between these disparate interests.

Unfortunately, it looks to me increasingly as if the language of transparency, the language of non-commercial civic engagement, and the romantic language of rebellion are being used to provide an exciting and appealing facade for an agenda that has nothing to do with transparency, nothing to do with civic participation, and a lot to do with traditional power politics and profit making.

It’s time to get out from under the umbrella and to acknowledge that we are in different camps with different goals. And to do that we need to get rid of the idea that “openness” is an unalloyed virtue.

Here are two examples of how openness is being misused.

The first is about openness and transparency, and it’s from Canada where I live and of which I am a citizen. The Government of Canada has an active open data program. It’s a member of the Open Government Partnership, now chaired by Francis Maude; if you look in Capgemini’s recent white paper on The Open Data Economy you’ll see Canada together with the UK, the USA, France, and Australia as one of the government trendsetters. Last October Jonathan Rosenberg of Google posted an article on the company web site titled “The Future is Open”, in which he wrote:

Claims to governmental transparency are one thing – moves like the one Canada made recently, with its formal Open Government Declaration, are another. The document recognises that open is an active state, not a passive one – it’s not just that data should be free to citizens whenever possible, but that an active ‘culture of engagement’ should be the goal of such measures.

So three cheers for open government Canada? Of course, that’s only one side of the story. Here’s a list of other events in Canada around openness and transparency.

  • Library and Archives Canada, which is the equivalent of the British Library, has seen its acquisition and lending programs cut back. Its historical item spending has been cut from $385K (’08-’09) to $12K (’12-’13) as its overall budget has been cut from $173M to $108M. (Toronto Star, March 10, 2013)
  • The Government is “muzzling its scientists” according to the BBC. A protocol introduced in 2008 requires that “all interview requests for scientists employed by the government must first be cleared by officials. A decision as to whether to allow the interview can take several days, which can prevent government scientists commenting on breaking news stories. Sources say that requests are often refused and when interviews are granted, government media relations officials can and do ask for written questions to be submitted in advance and elect to sit in on the interview.”
  • Cuts to Statistics Canada: in response to yet another wave of cuts, a group of concerned academics recently wrote that “For many of us, it started with the census. In a controversial move, our government switched from a mandatory to a voluntary census in the summer of 2010. The former Statistics Canada chief, the media and the research community reacted with shock and largely opposed the change to no avail … We have now halted the collection and analysis of our most informative longitudinal information on our labour force, on the workplace, on health and health care, and on child well-being. Add to this our universal census of the population. How might Canada expect to meet the policy challenges of the future when we no longer have the ability to understand where we are today?” (University of Manitoba)
  • The move to packaging legislation in so-called “Omnibus bills” that cover many different initiatives in a single, perhaps several hundred page, package has severely curtailed public debate over new initiatives and major legislative changes.

If there’s a message here, it’s just that openness cannot be measured in bytes. And if someone is measuring it in bytes, then you have to wonder what the motives are. So the Capgemini report (above) looks at the Open Data Economy simply by comparing the open data portals that each nation has produced. This is datawashing.

A brief second story. If you look at what kind of new economic possibilities are being promoted by open data, Capgemini highlights Zillow, a Real Estate Advertising network based in California, which uses open tax data, county records, and home-for-sale listings. If there is one industry that has proved able to use the language of openness and disruption to great effect, it’s the Silicon Valley venture capital industry. But whereas when Linus Torvalds started Linux “openness” was a tool for individuals to build something to compete with large enterprises, now “openness” is a tool for large enterprises with a lot of funding to hammer smaller non-profit groups. We hear the language of openness and disruption everywhere from education, where Coursera and Udacity can go to Davos and paint themselves as radicals, to Uber and AirBnB, whose millionaires claim to be part of a “sharing economy” disrupting nightmare overlords like the Bed & Breakfast industry or the taxi cartels. We are seeing the emergence of a winner-take-all economy in which small organizations and small businesses are severely handicapped against those with capital behind them. All in the name of openness.

If we see civic participation as an end in itself, which I do, then we need to treat civic computing like a cultural activity. That means we need to build some barriers to protect civic-scale groups from large companies who have advantages of scale, and who can deliver “efficiency” but not participation. Tony Ageh of the BBC, speaking at this conference, describes a vision of public domain data as a “commons” but I think he gets it wrong. A commons is not a free-for-all, where anyone can come and take anything they want. A commons suggests a group of people who all have an interest in maintaining and cultivating a shared resource, and that suggests limits to access from outside. There is room for a number of models of providing mixed access to data, from non-commercial licenses, to closed partnerships between cities and citizen groups, to non-standard formats for sharing that reflect the quirks of individual cities and groups. Each of these seems to break the idea of “openness” in one way or another, but we should be prepared to do so. Openness in and of itself is not enough to hold together a worthwhile coalition and it’s time to get over it.

Evgeny Morozov’s “To Save Everything, Click Here”

Everybody loves Jane Jacobs.

I love Jane Jacobs. “Austrian” economists with whom I disagree, like Alex Tabarrok, love Jane Jacobs. You probably love Jane Jacobs. Steven Johnson says he loves Jane Jacobs in his recent book Future Perfect, but so does Evgeny Morozov at the beginning of To Save Everything, Click Here, and Morozov is arguing against Johnson. Someone has to be getting Jane Jacobs wrong. Much of this essay is an attempt to see why Morozov gets Jacobs right, while Johnson and others are missing something important.

~ ~ ~

From 2005 to 2007, Evgeny Morozov tells us, he thought that digital technology might be a way to rid the world of autocratic regimes. His disillusionment was channelled into his influential first book, The Net Delusion, a full-on attack on “the sheer callousness and utopianism” of the “Internet Freedom” project (p 354).

This time around, Morozov’s target is much broader, but still centred in the world of digital technologies, and particularly the Internet. He takes aim at the ideologies that have grown up around the Internet, and their many manifestations.

Chapter 7 is typical of the book. Here is a collection of people who record and track their everyday lives online, and then analyze and quantify their existence, from toothbrushing to reading to fecal contents. These “datasexuals” now have a social movement, of a sort, which they call the “Quantified Self” movement. It would be easy to dismiss the Quantified Selfers as harmless eccentrics if they did not have a significant presence among the opinion shapers and leading lights of Silicon Valley, and if the mindset they embody was not clearly present, if in moderated form, in the wider digital world, and if the assumptions and goals were not oozing out over the rest of us. From quantifying oneself in a private context it is a short step to the presentation of self through these numbers, and the use of them as a basis for optimization and refinement. So Morozov cites Reid Hoffman, founder of LinkedIn, who says that self tracking is a way to “acknowledge that you have bugs, that there’s new development to do on yourself” (237) so that we can algorithmically measure, tweak, and refine ourselves and our self-presentation to the world.

From here it is just one more short step to the buying and selling of our personal data: to insurers in return for lower premiums, to advertisers in return for better deals. Our personal data becomes a new “asset class” and executives respond by “trying to shift the focus [of debate] from purely privacy to what we call property rights” (235). New social pressures emerge as the digitizers follow their path of bits, algorithms and markets (career counsellors now routinely recommend that building a strong presence on LinkedIn is a route to a better job), and we can replace debates about privacy with reassurances about personal choice. “Privacy is mostly an illusion, but you’ll have as much of it as you want to pay for” says Kevin Kelly (236). New companies emerge to optimize our self-presentation on the web (reputation.com), new norms emerge as “If you’re going out with someone, and they don’t have a Facebook profile, you should be suspicious” (Slate’s Farhad Manjoo, quoted on p. 239). Why would you not share your real-time blood alcohol levels with your employer if you don’t have anything to hide? (240).

The impact of the digital on our lives is such that, while the social consequences of self-tracking seem immense, they are just one thread among many of the digital revolution. In separate chapters, Morozov investigates new developments in policing, arts and culture, politics, government, social engineering, civic life, health, the workplace, and the increasingly designed, architected environments in which we live. There is no aspect of life that isn’t ready to be tweaked, nudged, hacked and filtered into optimal performance.

How to respond to such a flood of changes? One is tempted to define oneself by an attitude to digital technologies themselves: to be unequivocally pro- or anti-technology. But to reject or to accept technology wholesale has no future: wholesale rejection entails rejection, not just of integrated circuits, but of the people connected by them: shaping the use of technology lies not in the realm of individual choice, but of social choice. Wholesale acceptance seems fatalistic – abandoning the possibility of having any say in the forces shaping the societies in which we live.

Morozov undertakes two projects, one successfully and one less so. The first is to provide a framework in which to think about the new inventions that are being sold to us, and the patterns of thought behind them. Morozov identifies a twin-tracked ideology behind the inventions and inventiveness of the digital world. One track is “Internet-centrism” – the practice of “taking a model of how the Internet works and applying it to other endeavours”. Writers have imbued the Internet with “a way of working”; it has a “grain” to which we must adapt; it has a culture, a “way it is meant to be used”, and it comes with a mythology in which iTunes and Wikipedia become models to think about the future of politics, and Zynga is a model for civic engagement (15). The second track is “solutionism”: the recasting of social situations as problems with definite solutions; processes to be optimized (23).

Morozov does a fine job of articulating Internet-centrism and solutionism as two facets of a single Silicon Valley ideology, whose followers include the Valley’s software industry leaders, venture capitalists, conferences and “thought leaders”, as an evolution of the “Cyberselfish” ideology identified a decade ago by Paulina Borsook. The common assumptions, shared biases, and individualistic predilections give a cohesiveness and homogeneity to the new ideas and inventions, actively constructing and shaping the digital environment from which they claim to draw their inspiration. The insistence on “disrupting” our social and environmental lives; the idea that the solutions inspired by and enabled by the Internet mark a clean break from historical patterns, a never-before-seen opportunity – these mean that the only lessons to learn from history are those of previous technological disruptions. The view of society as an institution-free network of autonomous individuals practicing free exchange makes the social sciences, with the exception of economics, irrelevant. What’s left is engineering, neuroscience, an understanding of incentives (in the narrowly utilitarian sense): just right for those whose intellectual predispositions are to algorithms, design, and data structures. Morozov argues that these orthodoxies have had “a corrosive effect on public discourse and on reform projects” (16) and it’s difficult to argue otherwise.

Morozov’s approach to unpicking the hidden assumptions of solutionism, and the unpalatable consequences of its application, is impressive but less successful. In order to avoid a blanket technopessimism he makes two moves. The first is to adopt a broadly social constructionist approach to the world of digital technologies. The Internet does not shape us, it is shaped by the society in which it is growing. He is with Raymond Williams, against Marshall McLuhan. His stance here is blunt: he refuses to see “the Internet” as an agent of change, for good or bad. “The Internet” is not a cause; it does not explain things, it is the thing that needs to be explained. Chapter 2 is titled The Internet Tells Us Nothing (Because It Doesn’t Actually Exist).

The second, more surprising move, is to adopt a critique that was first described in a pejorative sense by Albert Hirschman. “In his influential book The Rhetoric of Reaction, Hirschman argued that all progressive reforms usually attract conservative criticisms that build on one of the following three themes: perversity (whereby the proposed intervention only worsens the problem at hand), futility (whereby the intervention yields no results whatsoever), and jeopardy (whereby the intervention threatens to undermine some previous, hard-earned accomplishment)” (6). Morozov does not see himself as a conservative, but instead places himself in the tradition of other thinkers who have stood against programs of organized efficiency; “Jane Jacobs’s attacks on the arrogance of urban planning, Michael Oakeshott’s rebellion against rationalists in all walks of human existence, Hans Jonas’s impatience with the cold comfort of cybernetics; and, more recently, James Scott’s concern with how states have forced what he calls ‘legibility’ on their subjects” (7). The list is an interesting one because, as I mentioned at the beginning, it features the same cast of characters that the solutionists (those whom Morozov opposes so implacably) routinely invoke as their own inspirations.

The Hirschman framework provides Morozov with a recipe for how to think about the many solutionist initiatives he tackles, and many of the passages in the book have a similar structure. Let’s return to self-tracking for a moment. Morozov’s first line of critique is Hirschman’s “jeopardy”: he invokes the ‘technostructuralists’ to ask not just what individual choices self-tracking offers, but to ask how it changes the environment we inhabit. A decision not to share becomes a tacit acknowledgement that you have something to hide. The danger is that “if you are well and well-off, self-monitoring will only make things better for you. If you are none of these things, the personal prospectus could make your life much more difficult, with higher insurance premiums, fewer discounts, and limited employment prospects” (240). It erodes privacy, the ability to make a clean start, and erodes risk-taking behaviour given the consequences of failure. A second line of critique is to ask what, as our quantifiable aspects become the focus of attention, is missing in the quantified portrait that emerges: what intangible aspects of ourselves become invisible. Do these numbers, he asks, miss meaning? Where do ethics and aesthetics go to in a world of numbers? Morozov surveys the centuries-old debates over the virtues and perils of quantification. Here the critique stumbles, as Morozov rolls out thinker after thinker in a parade of reasons to doubt the benefits of quantification. From Nietzsche to Nussbaum, from nutritionism (the quantification of food) to water-metering and the evolution of clothes-washing norms, to the benefits of friction and dissonance in our everyday lives, there is no doubt he covers an impressive amount of ground, but the argument is scattershot; disjoint. The end result is an erudite and widely-sourced list of the ways in which technologies may lead to bad outcomes – but it is still a list, and it lacks the force of a strong central thesis behind it.

The other chapters follow a similar pattern: the perversity, futility, and jeopardy of solutionist agendas show a breadth of investigation that should shame many of his more populist opponents, and provide valuable contexts in which to think about technological programmes. In particular, his insistence on seeking out historical precedents for today’s arguments is a welcome change from the language of “rupture” that many solutionists prefer.

If there is a unified point of view behind the critique, it can be traced back to the “anti-solutionists” with whom Morozov identifies. Like Morozov and like Steven Johnson, I’m a big admirer of Jane Jacobs’s Death and Life of Great American Cities, and James Scott’s Seeing Like a State: which makes me wonder how they can end up in such different camps. The fault, you will not be surprised to hear, belongs with the solutionists.

One of the remarkable insights of computer scientists (and social scientists and natural scientists in the computer age) is an understanding of how great complexity and diversity can be generated by populations of simple agents following simple rules. Just as schools of fish and flocks of starlings create sweeping artistic displays by pursuing simple individual rules, so the rich tapestry of city life emerges from simple everyday interactions. The ideas of network theorists lend themselves to talk of self-organization, non-hierarchical structures, and informational cascades. Computer scientists take ideas such as the “Game of Life”, the stunning images of fractal shapes, and the rich behaviour of networks to illustrate how complexity arises from simplicity. From spin-glasses in magnets to the sorting and emergence of patterns revealed by Schelling and his intellectual descendants, simple “micromotives” give rise to surprising and intricate patterns of “macrobehaviour”. Such agent-based thinking seems at first to mesh perfectly with Jacobs’s closely observed studies of city life. She famously focused her piercing, analytical eye on the details of everyday life in large cities, and used her observations to challenge and then triumph over the grand visions and arrogance of top-down city planners. It’s the bottom-up nature of her approach that inspires: the planners are trying to impose patterns on populations from above but they miss the relationship between the large and the small. It is tempting, then, to take the descriptions of Jacobs’s cities and encode them in algorithms: agent-based simulations of the effects of block size on pedestrian traffic patterns seem almost mandated, so obvious a next step do they seem from Jacobs’s chapter on the topic.

Yet this step, I increasingly believe, is a mistake. Solutionism is ultimately central planning by another name. The arrogance of the urban planner reappears as the arrogance of the agent-based modeller and the Internet entrepreneur: the plan is still monolithic, but now takes the shape of a network. As Steven Johnson says, when his “peer progressives” see a social problem, they design a peer network to solve it. But what has happened to the citizens in this network? They have been reduced to dumb followers of simple rules. The richness and complexity – all the interest, in fact – lies in the structure of the network. If the outcome isn’t what you want, well, tweak the incentives, adjust the topology of the network, provide an additional option for the nodes (sorry, people) to choose from. For all their talk of bottom-up, decentralized thinking, the Internet-centric solutionists end up with an impoverished perspective of individual behaviour.

Just because complex and rich behaviours can arise from simple rules doesn’t mean that people are simple beings. Any theory that applies to murmurations of starlings or spin-glasses of magnetic ions as well as to cities of humans is, almost by definition, missing the distinctive features of human societies. Complexity can arise from simplicity at the small scale, but macro-level complexity also arises from micro-level complexity. The subtle and ill-understood nature of our own needs and interactions will defeat the best efforts of solutionist planning, just as it has defeated those of central planning and of free markets.

In his final chapters, Morozov appeals to this particularist view of the world, in which each node of a network is different from others, and in which general solutions don’t exist. To discard the importance of the details of our daily interactions, as the solutionists inevitably do, is to inevitably provoke unexpected responses, unintentional side effects, and unanticipated breakdowns of the solutionist schemes. When Brian Chesky of AirBnB complains that there are 30,000 different cities in which he wants to operate, and that it’s just not practical to negotiate with each one, he is not designing a bottom-up solution, he is imposing a top-down network. He is demanding that cities become “legible” in James Scott’s terminology, to his overarching (and simplistic) algorithms.

To reach for an alternative vision, Morozov looks to artists who have engaged in “adversarial design” to illustrate the importance of acknowledging micro-level complexity. But to look to the artificiality of the arts is second-best here; there is enough variation and richness of detail in the normal everyday world to illustrate the importance of variation and local knowledge and unanticipated interactions.

But despite these minor complaints, “Click Here” is an admirable and significant achievement. It identifies, and mounts an intellectually adventurous assault on, what is becoming an increasingly obvious problem: the appropriation of democratic and “bottom-up” visions by those who seek to impose their own top-down networks on the rest of us, and who reduce us to simplistic nodes in the process. This is a valuable book: now if only someone could make a TED Talk of it.


Notes on Identity, Institutions, and Uprisings

Introduction

Finishing up what I said I’d finish a couple of months ago, this is a shorter version of a paper on “Identity, Institutions, and Uprisings” with less mathematics, no references (see the link above) and more opinionating. Also, a longer version of what I’m going to say at Theorizing the Web 2013 in a few days.

There is a theoretical side to the “Facebook Revolution” debate about the role of digital technologies in the 2011 “Arab Spring” uprisings, and it boils down to two ways of looking at things: the micro and the macro. On the one hand, we have the rational choice, agent-based approach and on the other we have more traditional sociological approaches based on larger-scale social structures.

If you look at some of the key characteristics of the uprisings, it looks like a win for the micro side.

Theories and North African uprisings

Event                               Micro  Macro
Sudden uprising (cascade)           Y      N
Lack of strong opposition movement  Y      N
Network technologies                Y      N
Score                               3      0

The single most dramatic thing about the “Arab Spring” uprisings was their unexpected suddenness. They fit the “information cascade” models developed by Timur Kuran, Susanne Lohmann and others to describe the equally dramatic and sudden 1989 uprisings in Eastern Europe. Nothing on the “macro” side matches the elegant explanation of sudden, discontinuous change given by the micro-theorists.

Related to this suddenness is the lack of a strong opposition movement before the uprising. It’s not that there was no opposition, but there was nothing of the strength to indicate a coming crisis. The cascade theories have no need, or even place, for organizations or movements: these are population dynamics models, with no structure bigger than the networked individual. Meanwhile, the leading sociological theories deal with movements, organizations, and resource mobilization. Score two to the micro-theorists.

Finally, we have the role of digital technologies, which segues naturally to network models of society. Talk of information technologies leads equally naturally to a focus on information diffusion across networks, in which increased connectivity lowers barriers to collaboration, discussion, and information sharing. And the macro-theorists again seem to have little on their side to cope with these kinds of ideas.

It looks like a shut-out win for the micro-theorists; the language of networks and information replaces the language of social movements and repertoires of performance, and with that comes the inevitable idea that with a new kind of theory we are seeing a new kind of uprising, in which self-organizing networks based on digital technologies take centre stage…

But you will have realized by now that this is a setup for me to argue that there’s another way of looking at these events, so let’s get to it.

The key success of micro-level theory is the explanation of cascades, which is a natural consequence of any model that has multiple equilibria. Just because of that success, we don’t need to go whole hog and take on board the ideas of information-driven and network-sustained change. I want to argue that we can take the concepts that sociological research has shown to be important, and move them into the realm of rational choice models. And when we do, we not only get population dynamics and cascades, but we also get explanations for several other aspects of dissent and uprisings that networks and information-based theories can’t deliver.

Is there a downside? Of course there is. Behind the scenes, it’s often the case that rational choice theorists like long equations while sociologists love long words. Rational choice theorists see the sociologists’ concepts as fuzzy, while the sociologists see the incentives of rational choice models as simplistic. What I have to offer demands both long equations and long words, and is open to criticism for being simultaneously simplistic and fuzzy. Ah well.

Facebook as a “free space”

Let’s start with a question. Zeynep Tufekci is a sociologist who was in Egypt right after the January 2011 uprisings, interviewing participants. Here’s what she says:

When I was in post-Mubarak Cairo, my hosts kept pointing in amazement to various street corners where fierce political discussions were being held and often whispered, before remembering they could now speak up and adjusting their voice, “You never saw this. Nobody ever discussed politics openly, ever.” Then they would pause and add, “Well, except online, of course. We all discussed politics online.”

So the question is that final sentence. Why is it that, prior to the revolution, people could discuss politics online but not elsewhere? What made “online” a venue where those discussions could take place? It’s not just ease of communication, because if you want to communicate you can stand on a busy street corner – as people were doing when Tufekci visited.

The key thing is that communication online was, for some reason, safe, while communication on the street was not. It’s not just that communication among like-minded people was possible, but that the “online” spaces were a venue where such communication did not have the same consequences. Somehow, the speech was hidden from those in power. It was a trusted environment.

Now while the logic of networks is a good way to explain easy communication, it doesn’t lend itself to discussions of trust. Fortunately sociologists have long been aware of the importance of these “free spaces” in which dissenting voices can communicate. Here are Francesca Polletta and James Jasper in a 2001 paper:

Concepts of “submerged networks”, “halfway houses”, “free spaces”, “havens”, “sequestered social sites”, and “abeyance structures” describe institutions removed from the physical and ideological control of those in power, for example the black church before the civil rights movement and literary circles in communist Eastern Europe. Such institutions… represent a “free space” in which people can develop counterhegemonic ideas and oppositional identities.

So these notions of “free spaces” have been around for some time and surely fit something about the way that online political discussion worked in Egypt. Free spaces are institutions (in a broad sense of the term) that are not outlawed, but which appeal to outsiders of society rather than to those who identify with the powers-that-be. They manage to be transparent to their members while being opaque to officialdom.

More generally, following Charles Tilly and Sidney Tarrow, we can think of institutions in authoritarian states as being of three kinds.

Types of institution in authoritarian states

Institution  High Status  Low Status
Prescribed   Y            Y/N
Tolerated    N            Y
Forbidden    N            N

  • Prescribed institutions are the mainstream and establishment institutions of society. They may include the education system, organizations like the army, and also things like national celebrations. Some of these institutions include people of all levels of status, while some are restricted to high-status individuals and families.
  • Tolerated institutions are legal, but their membership is limited to lower-status individuals. In some countries these would include religious institutions associated with minority groups, perhaps some artistic and cultural institutions, and workplace organizations in countries where they can exist outside official control. These are the venues that, according to Polletta and Jasper, can provide spaces for dissent. Which institutions are tolerated and which are forbidden varies widely from country to country: North Korea has far fewer “tolerated” institutions than Poland had in 1980.
  • Forbidden institutions are those that are not permitted in authoritarian societies. Opposition political parties, independent unions, that sort of thing.

But how do these institutions become “removed from the physical and ideological control of those in power”? The answer lies in what Polletta & Jasper call “collective identity”. Tolerated institutions – whether subcultures, groups, or whatever – build up their own practices to establish autonomy.

Collective identity is “an individual’s cognitive, moral, and emotional connection with a broader community, category, practice, or institution.” It gets expressed in “cultural materials—names, narratives, symbols, verbal styles, rituals, clothing, and so on.” And these expressions provide boundary-setting rituals and institutions that separate challengers from those in power, and so can strengthen internal solidarity.

Examples of “free spaces” in authoritarian societies abound. In his book Exit-Voice Dynamics and the Collapse of East Germany, Steven Pfaff highlights the importance of some very narrow institutions that he calls “Niche society”. These are “pockets of private life, around home, car and allotment” where people could voice their disenchantment and cynicism. A broader form of dissent took place in institutions of youth culture: despite party efforts to establish bands and music venues for German youth, many sought out more alternative forms of music, and clashes took place between fans and police at concerts. Music events are not, at least publicly, political events and so while the events might not be forbidden, you would not find party supporters taking part. Finally, Pfaff notes that “Dissent could only take place in gaps in the system of social control that dissidents could exploit. In the GDR this principally meant the churches.” Again, churches are an example of an institution that was legal, but which naturally separated society’s outsiders from those in power.

Connecting Identity to Rational Choice?

So now we seem to have two separate sets of ideas. On the one hand we have a theory of uprisings that makes no use of sociological concepts. On the other hand, to explain pre-uprising dissent we need to look at sociological ideas such as institutions and identity. Obviously there is a bridge that must be built if we are to connect these seemingly separate theoretical islands. Can the gap be bridged? Well yes it can, thanks to the “identity economics” work of George Akerlof and Rachel Kranton, who argue that identity provides a key motivation in many social situations. They take the concept of identity seriously, and simplify it to fit it into a tractable micro-level model. Identity, they say, has three parts to it.

  • First is a set of social categories: for us, those categories are “government supporter” or “opponent”.
  • Next, each of these identities has a set of attributes associated with it. These vary from society to society. Economic status is one, religious or ethnic or gender identities are others.
  • And finally, each identity has a set of norms of behavior: in this case we simplify the options to “conform” to society’s expectations or “dissent”.

Individuals then have two choices to make. First, they need to adopt an identity: Government (G) or Opposition (O)? Next, if they are oppositional they need to decide whether to engage in active dissent or to conform to official expectations. If we arrange the population according to status, then those with lower status choose O (for them, O has the higher utility), and people with higher status choose G. Here is a graph that shows a case where the switchover appears at the mid-point.

Figure: Utility and identity in an authoritarian society

In some times and places, no one gives a hoot which identity you adopt, while at other times and places it can be a matter of life or death. I’ll call this scaling of the difference between O and G the identity polarization of society, and we’ll be needing this concept a lot.
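The identity choice can be sketched in a few lines of Python. This is only an illustration under assumptions of my own choosing, not the model from the paper: status is a number between 0 and 1, the two utilities are straight lines that cross at a `switchover` point, and `polarization` simply scales the gap between them. The function names are hypothetical.

```python
def utilities(status, polarization, switchover=0.5):
    """Return (U_O, U_G) for an individual with status in [0, 1].

    Assumption: utilities are linear in status and symmetric around
    the switchover point; polarization scales the gap between them.
    """
    gap = polarization * (switchover - status)
    return gap / 2, -gap / 2


def choose_identity(status, polarization, switchover=0.5):
    """Pick the identity with the higher utility (ties go to G)."""
    u_o, u_g = utilities(status, polarization, switchover)
    return "O" if u_o > u_g else "G"


if __name__ == "__main__":
    for status in (0.1, 0.4, 0.6, 0.9):
        print(status, choose_identity(status, polarization=1.0))
```

In this sketch polarization does not change which identity anyone picks; it changes how much the choice matters, which is the sense in which the term is used above.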

Identity is one of the two things we need to explain “free spaces” but before we go to the other, let’s take a short detour. One of the key successes of information-driven rational choice models was the fact that they yield cascades. Can our identity-driven model also give cascades? Funny you should ask…

Identity Cascades

Here is a cascade.

Figure: A cascade. The yellow line and the yellow dot are equilibria of the model.

If you want to know the gory details, including what the “hegemony” label on the x axis means, you have to go and read the paper. But see that there are two equilibria here. One is a stable authoritarian state, with zero activity (the yellow line at the right) and a high government hegemony. The other is a state in crisis, with a high level of dissent (the yellow dot where the lines cross). And a small change in society can lead to a sudden, discontinuous switch from one to the other: a cascade.

To generate multiple equilibria you need some form of externality: some way in which one person’s actions influence those around them. This model generates cascades by asserting that active dissent increases the identity polarization of society: the more active dissent there is, the more it matters which side you are on. It’s not so much an information cascade as an identity cascade.
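To see how this kind of externality produces a discontinuous jump, here is a toy version in Python. It is a sketch under assumptions that are mine rather than the paper's: status is spread uniformly between 0 and 1, an opposition-identified person dissents when the identity payoff `polarization * (switchover - status)` exceeds a repression `cost`, and polarization rises linearly with the share of people dissenting. All names and parameter values are hypothetical.

```python
def dissent_share(d, cost, base_polarization=0.5, feedback=5.0, switchover=0.5):
    """Share of the population that dissents, given current dissent d.

    Assumption: polarization grows linearly with dissent, and people
    with status below (switchover - cost / polarization) dissent.
    """
    polarization = base_polarization + feedback * d
    return max(0.0, switchover - cost / polarization)


def equilibrium(cost, d0=0.0, steps=200):
    """Iterate the best-response map from d0 until it settles."""
    d = d0
    for _ in range(steps):
        d = dissent_share(d, cost)
    return d


if __name__ == "__main__":
    for cost in (0.30, 0.26, 0.24, 0.20):
        print(f"cost={cost:.2f}  dissent={equilibrium(cost):.3f}")
```

With these numbers a society starting from no dissent stays quiet while the cost is above 0.25, and a slight drop below that level tips it to an equilibrium in which roughly four people in ten dissent. The feedback from dissent to polarization is what makes the change sudden rather than gradual.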

Although this is a rational choice model, it does not invoke networks, and information is not central to the argument. In most cascade models the cascade is generated by two things:

  • active dissent reveals information, about the state of the society or about the beliefs of other people. This is the “preference falsification” argument.
  • there is safety in numbers: the more people protesting, the safer it is to protest.

I’ve criticized these ideas here, but is there any evidence to suggest that identity does get polarized as a result of dissent? Anecdotally, there is. Here is a Marxist radical speaking about Paris, 1968:

I was completely surprised by 1968… I had an idea of the revolutionary process and it was nothing like this. I saw students building barricades, but these were people who knew nothing of revolution. They were not even political. There was no organisation, no planning.

In the lead-up to 1968, French students were not revolutionaries who had falsified their true preferences in order to conform to society’s expectations. What happened was that during the riots, identity (status quo or radical?) became a central issue, and individuals had to decide “which side are you on?”, and many students switched their identities from mildly status-quo to enthusiastic barricade-builder.

A switch in identity happens when people are pulled along by those around them. As Dennis Chong (1991) writes of the US Civil Rights movement: “friendship and familial, religious, and professional relationships create an array of ongoing exchanges, obligations and expectations that individual.”

In his book on the fall of the GDR, Steven Pfaff repeatedly invokes the “preference falsification” model, but he often steps outside it too. In fact, my biased reading of it is that he sometimes resorts to the preference falsification model because of a lack of alternatives, not because the evidence pulls him that way. But when he writes that “By 1989 official socialist ideology, along with its clear articulation of the nature of injustice, had become a threat to the system it was meant to legitimate” he is talking about a crisis of identity. The crisis served to “focus diffuse grievances”, uniting “a host of disparate concerns into ‘moral anger’”. This is the crystallization of identities into the two polar choices: “Which side are you on?”

The “identity cascade” model also makes a closer connection between dynamics and the efforts of protesters. I’ll return to this later, but one of the things that protesters do in uprisings is lay claim to the symbols of national identity. Whether it’s Gandhi’s Salt March or GDR protesters choosing the 40th anniversary of the founding of the country, struggles over the meaning of identity become central at times of crisis. If information revelation were all that was needed, there would be no role for the displays of “worthiness, unity, numbers and commitment” that characterize political protest. An identity-driven approach makes this link clear, within a rational choice framework.

(Another nice thing is that within the identity cascade model there is a natural categorization of the kind of events that can precipitate a cascade. A shock to the norms associated with opposition, a change in socio-economic conditions that places more people into the “outsider” category, or a change in state policy (perestroika) all emerge as triggers for cascades. See the paper for more.)

Free Spaces and Screening

With that diversion over, let’s return to the topic of free spaces. How do we get from the language of Polletta and Jasper to the world of rational choice? There is a natural correspondence in the concept of screening: a mechanism that imposes differential costs for two different groups, so that (in a “separating” equilibrium) one group finds it worthwhile to pay the cost, while the other does not. Here, the identity-driven cost of being a member of a “tolerated” institution screens out those with the status quo (G) identity.

Just as Akerlof and Kranton simplified identity so that it could be squeezed into a rational choice picture, so we have to simplify the idea of an institution. Henceforth, then, an institution I is characterized by three things:

  • Status (x): This is the natural membership of the institution. We can say that the identity of the institution is the optimal identity of an individual with status x.
  • Breadth (δ): Individuals with status in [x – δ, x + δ] are members of I. The “niche society” institutions of the GDR have a very narrow breadth, while events such as national celebrations include all of society.
  • Membership discrimination (m): Some institutions do not discriminate between the two identities, but some do. A discriminating institution demands a cost of membership for individuals whose identity differs from the identity of the institution.
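The three characteristics above map naturally onto a small data structure. The following Python sketch is illustrative only: the `switchover` default, the `benefit` argument, and the all-or-nothing membership cost are assumptions of mine, and the paper's treatment is more detailed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Institution:
    status: float          # x: the status at the centre of its membership
    breadth: float         # δ: members have status in [x - δ, x + δ]
    discrimination: float  # m: cost charged to the "wrong" identity

    def identity(self, switchover=0.5):
        """Identity of a person whose status equals the institution's."""
        return "O" if self.status < switchover else "G"

    def admits(self, person_status):
        """True if the person's status falls within the breadth."""
        return abs(person_status - self.status) <= self.breadth

    def membership_cost(self, person_identity, switchover=0.5):
        """Zero for matching identities, m for everyone else."""
        if person_identity == self.identity(switchover):
            return 0.0
        return self.discrimination


def screens_out(institution, benefit, person_identity):
    """Screening: membership is not worth it if the cost exceeds the benefit.

    `benefit` is a hypothetical stand-in for whatever a person gains
    from taking part in the institution.
    """
    return institution.membership_cost(person_identity) > benefit


if __name__ == "__main__":
    church = Institution(status=0.2, breadth=0.15, discrimination=1.0)
    print(church.identity(), screens_out(church, 0.5, "G"))
```

In a separating equilibrium of this kind, people with the G identity stay away from a discriminating low-status institution because the cost outweighs the benefit, while people with the O identity take part at no cost.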

With this idea, you can build a model in which there is a range of institutions that even a strong state will not monitor, because the cost of monitoring is greater than the benefit in terms of dissent that is quieted. These institutions provide the free space for dissent to persist even under conditions of strong government.

Here are some screening institutions.

Figure: Screening institutions.

The screening institutions are those inside the lozenge shapes. Along the x axis is the status, so all these institutions are “tolerated” in that they are entirely within the “outsider” low-status zone. The broader the reach of the institution (that’s the “δ” in the graph) the less scope there is for these institutions. Finally, and it’s beyond what I can explain in this part, there is a limit to the size of the “public sphere” that also limits the available institutions.

So what this shows is that the economic concept of screening brings to the identity-driven rational choice model the idea of free spaces, well established within the sociological literature. To go back to the beginning of this essay, the existence of such spaces is something that the network models, with their focus on costs of communication, don’t seem well equipped to describe. So now we have a single theory that covers both uprisings and pre-revolutionary dissent, instead of two (one micro, one macro). We can now see that the “free spaces” of online dissent are similar to, and exist for the same reasons as, other free spaces that have existed in the past. Even in Egypt, the role of the Ultras football fans can fit within this model, the football stadium terraces providing a “tolerated” institution within which dissent could be expressed. The model also argues that the key facet of online spaces is not their technological nature, but the fact that they were adopted by, and associated with, the broadly anti-establishment demographic of urban youth. Navigating the discussion spaces of the online world is easy if you have friends who are taking part: not so easy if you are a government official trying to pose as a disenfranchised youth. The technology of social media is epiphenomenal. In broad strokes, this is an argument I made some time ago here: it’s only taken two years for me to work it out properly.

Institutions and Challenges

The final case to look at is when a social movement challenges a weak government. The goal is to put the government in a “dictator’s dilemma”. The idea that clamping down on dissent has the possibility of drawing attention to it, and perhaps fanning the flames, is an old one. Here is a recent statement:

[S]ometimes repression inspires more mobilization; and sometimes it effectively quashes movements or pushes them underground. Sometimes repressive forces are successful in characterizing protesters as legitimate targets of repression, and other times they deligitimize the State and increase the legitimacy of the social movements.
– Cristina Flesher Fominaya and Lesley Wood

Or, going back a little further:

Censorship makes every banned text, bad or good, into an extraordinary text.
– Karl Marx

The contrasting fortunes of the GDR protests and the Tiananmen Square protests in 1989 are the best-known example of this dual possibility.

When they believe the time is right, social movements may actively seek to provoke a crisis (contrary to the “safety in numbers” cost minimization that the information cascade theorists tend to favour). Famously, here is Gandhi:

The function of a civil resistance is to provoke response and we will continue to provoke until they respond or change the law. – M. K. Gandhi

We can bring this idea of provocation into a micro model if we bring in a unitary social movement and invoke an interdependency between identity polarization and government coercion. Again, the mathematics is in the paper.

The question we ask is “if you were an organized opposition, what institution would you target, so that a clampdown would cause polarization?” The idea is that clampdown on a mainstream institution would be more likely to polarize society, by disturbing even the government’s own supporters, than clamping down on an “outsider” institution. Again, the opposition has to make a payment to appropriate a mainstream institution, because of membership selectivity. They have to pass with the identity of a status quo supporter. They need to appeal to mainstream sensibilities and to establish legitimacy. Under the right circumstances, an opposition will pay the cost of provocation, because they anticipate that a government response will weaken, not strengthen, the government’s level of control. Here is a figure showing a set of institutions that can be used by an opposition to provoke a crisis.

Figure: Institutions that may provoke a crisis

The institutions that may provoke a crisis are those within the central closed shape, bounded clockwise by light blue (on top), green, red, and purple. Some of these institutions are “tolerated” institutions to the left of the x* line, with oppositional identities; for these institutions there is no membership cost to be paid by the opposition. Others are “prescribed” institutions that have a mainstream identity. The opposition must pay the price of appropriating these institutions: participating in them in such a way as to provoke the government.

An example of this behaviour comes again from the GDR uprisings of 1989, as described by Steven Pfaff. The opposition chose the celebrations of the GDR’s fortieth anniversary – a mainstream institution – in which to provoke a response. The government did respond, but “its brutal attacks on peaceful protesters during the fortieth anniversary … probably activated what might have otherwise remained despairing, but inert, citizens.”

The opposition made explicit attempts to portray themselves as mainstream Germans, adopting the simple slogan of “Wir sind das Volk” (“We are the people”).

“Wir sind das Volk” [was] a thin claim, but an uncomplicated “us versus them” message, a claim to political identity that could bridge lines of class, education, neighborhood, and so on. – Steven Pfaff

In previous times, other uprisings have explicitly chosen mainstream or sometimes tolerated institutions as a means of provocation. Gandhi’s use of the Salt March, the Chinese students’ use of the death of Hu Yaobang and Tiananmen Square, the Egyptian protesters’ appropriation of National Police Day and Tahrir Square all follow this pattern.

There are claims that digital technologies at times of crisis can act in this manner. Ethan Zuckerman has popularized the idea as a “Cute Cat” theory: that mainstream institutions provide a venue for dissent that cannot be shut down without polarizing society. The theory here provides at best limited and conditional support for the idea. Digital technologies were not used as a mechanism of provocation, but played a supporting role. The “cute cat” idea has credence only if the government is not able to silence dissent in a more selective manner than shutting down the entire internet or phone service within the country.

My favourite example is the French “Banquet Campaign” of 1847–48. Republicans were campaigning for universal male suffrage against an intransigent government that had banned political meetings. Faced with the problem of organizing an opposition in such an environment, they organized banquets. On the 18th of July in Mâcon, Burgundy, five hundred tables were set up for three thousand guests with stands for three thousand more, ostensibly as a celebration of local literary star Alphonse de Lamartine. Lamartine was not just a literary star though; he was also a well-known republican, and the authorities knew that the banquet was a cover for political agitation. But the authorities judged that interfering with the banquets would inflame the situation rather than succeed in suppressing the protest, and so let the banquet proceed. With the success of the Mâcon banquet, the “Campagne des banquets” was launched, and banquets were held around the country. This is the high wire act that governments and opposition walk at times of crisis – when to push ahead, when to hold back, and what tactics may be effective – and is the kind of dance that social movement studies have helped to elucidate. The campaign continued until February of the next year, when the government decided it had no choice but to escalate. The banquets were outlawed, a hastily organized protest brought people into the streets of Paris on February 22, a confrontation between the Municipal Guard and the marchers spilled over into riots, everything got out of hand, and the King fled Paris. Within a few weeks governments were toppled in Milan, Venice, Naples, Palermo, Vienna, Prague, Budapest, Krakow, and Berlin. I like to think that the graph above captures a little of that drama.

Conclusions

What I’ve tried to do here is follow the Akerlof & Kranton example of taking the rich sociological concept of identity seriously, and use it to construct a rational choice model of uprisings that complements, rather than competes with, sociological models. I’ve added some dynamics to the approach, and brought in a modelling of institutions to build on the notion of collective identity as a motivating force for protest.

The results are that the theory recovers the key facet of other rational choice models of uprisings, which is cascades, but with a different interpretation. Here it is “identity cascades” rather than “information cascades” that drive the sudden change. Beyond cascades, the theory shows how screening provides a mechanism for the existence of “free space” institutions in which dissent can be sustained, even in authoritarian regimes. Finally, it shows how an organized opposition may appropriate mainstream institutions with the explicit intent of provoking a crisis, putting the government in a “dictator’s dilemma” in which neither responding nor failing to respond is a good option.
