Pirate’s Dilemma Review Remixed

My better half had some suggestions to improve this essay, so I’m taking a second pass at it here. I liked the original myself, and I don’t do this kind of writing often, so I might as well get it right. The style and content are much the same; it just gets going more quickly.

The lives of many idealistic left-wing youth become enmeshed in compromise as they get older and stoke the fires of capitalism during the day while trying to throw a little water on those same fires at night well I understand how this happens because like many middle-aged people I wrestle with these contradictions myself and admire tremendously those who have stuck to their principles even at a real cost to their careers and personal lives which is why the few people who really piss me off and whom I actively scorn and who get my blood boiling are those like The Pirate’s Dilemma author Matt Mason who dons the mantle of rebellion and anti-corporate politics while consulting for Disney and Pepsi and P&G and who babbles about the benefits of sharing because "it's not all about the money any more" while giving presentations to the people who brought you McDonald's "I'm Loving It" campaign and who praises vitamin water company Glacéau for "keeping it real" in its advertising campaigns with 50 Cent before telling us it was sold to Coca Cola for $4.1 billion and who praises Procter & Gamble for its viral video campaign and who is entranced by the way that Nike's Air Force One sneaker owes its success to the remix and who places himself on the romantic anti-establishment side of the battle between graffiti and advertising in "a turf war that has raged for centuries between the establishment and a secretive, loose-knit network that doesn't like the top-down, one-way flow of information in public spaces" only to approvingly quote advertising agency Droga5 on creating "a dialogue between advertising and graffiti" which really means using graffiti for commercial ends and making a buck and if that's not selling out to the man then what the fuck is really? 
because the punk spirit Mason loves so much and claims to identify with has nothing to do with business models or change agents or entrepreneurial spirit or brand-building no the spirit he writes of was defiantly and nihilistically anti-corporate and Matt Mason lives in a corporate world however much he'd like to think otherwise so when he claims that pirates are those who are "pushing back against authority, decentralizing monopolies, and promoting the rule of the people: the very nature of democracy itself" well I see what he means but when he goes on to claim that the anti-authoritarian ideals of youth culture are becoming a new more extreme, invigorated, and equitable strain of the free market–the decentralized future of capitalism well I just want to shake him by the neck and shout at him that you're obviously not stupid Matt Mason so why don't you do what you know you should do and FOLLOW THE FUCKING MONEY before making pronouncements about the benefits of sharing when it's still the case that money is not shared I mean if I share and you get the money then I'm not being altruistic I'm just being a sucker and you're not promoting community you're exploiting the good intentions of those who are spending their time and talent on your venture so if you want to impress me with the subversive role of DVD bootlegging DON’T JUST DON’T quote billionaire Mark Cuban and Disney co-chair Anne Sweeney and billionaire Steve Jobs at me because if they have found a way to co-exist with piracy it doesn't mean that they and their companies stand for a more democratic and equitable form of capitalism it just means they've found ways of using or living with piracy in a way that promotes their own interests over those of their rivals it's meet-the-new-boss-same-as-the-old-boss-but-with-edgier-clothing time so as you can tell I end up not taking him seriously which is unfortunate because he has many entertaining stories of hip-hop bands and pirate radio stations and punk culture
although I don’t know whether to trust them because where the book overlaps with things that I do know something about he is often ludicrously wrong like when he repeatedly refers to Linux as a company or when he damns record companies for figuring out it's more profitable to control the distribution system than it is to nurture artists while completely failing to notice that big chunks of the Web 2.0 world he loves so much work on exactly the same model by owning the spot where the money is which is the platform that gives control of the distribution system or when he gives us a canned history of Wikipedia which is derived from one interview with Jimmy Wales so it's no surprise that it gets several key facts wrong or when he identifies Steve Jobs with openness and sharing and claims that the notoriously secretive and proprietary Apple won the music wars because it truly understood sharing when the fact is that Apple wants to share the music they don't own but wants to keep the technology they do own all to themselves you can ask Palm about that who can't sync their own phones with iTunes or you can ask the developers who have left the iPhone App Store over Apple's arbitrary and opaque approval process so after reading this book Matt Mason takes a place for me with people from an earlier generation like Kevin Kelly who claims to be a maverick while working for Condé Nast or Chris Anderson who claims to be on the side of small and scrappy businesses against big companies while promoting Amazon at $40,000 an appearance or Stewart Brand and John Perry Barlow who strive to combine activities like consulting for senior management at large corporations with statements like "I'm an anti-company man" if you can believe it I mean DO YOU HAVE ANY SELF-AWARENESS AT ALL do you have any sense of modesty and this matters because these people have been successful in leading young idealistic people with good intentions up the garden path in the belief that they are taking part in
something progressive and politically anti-establishment but which ends up just feeding money into the pockets of Silicon Valley venture capitalists and the lucky guys who get to sell their startups to Google for a nice billion or so as if that's a triumph of the little guy give me a break.

Price of a Bargain: Review in Literary Review of Canada

My review of The Price of a Bargain: The Quest for Cheap and the Death of Globalization, by journalist Gordon Laird, is now out in the Literary Review of Canada. LRC makes some of its essays available online, but not all. Mine is not available, but some of the other contributions to the November 2009 issue are here.

I strongly recommend the lead essay by Charles Wright, Too Much Health Care. Janice Gross Stein is always worth reading whether you agree with her or not, even when she’s writing outside her usual field, and her take on the financial crisis (Between Euphoria and Fear) is a useful overview of a lot of points of view. Personally I’m not that interested in Pierre Trudeau, but apparently a lot of other people are, and Paul Wells of Maclean’s reviews the latest biography by John English in a piece called We’re Still Watching. And there’s a lot more. I feel pretty good about being in that company.

One of the nice things about LRC is that you get over 2000 words, so you can write an essay, not just a summary of the book’s contents. Here are some excerpts from the first few paragraphs to give a sense of where it goes:

The global supply chain digs shale from the hostile terrain of Northern Alberta, refines it, ships it half way round the world and back again, and in the process turns it into thousands of distinct consumer items, from dollar store plastic sharks to laptop computers.

Some see this continual transformation of the world’s raw materials into things that consumers can use as a Hayekian cornucopia…Others see something more sinister at work…Both pictures have some truth. Nothing can be this massive without having multiple faces, but Gordon Laird definitely leans towards the “sinister” camp. While he does note that “the supply chain has brought us many gifts”, he is more concerned that it has done so with “a global legacy of unresolved problems”: environmental horrors in the backwaters of rural Asia, unregulated emissions of shipping fleets in the world’s oceans, conflict and oppression of the labour force in China, global warming.

But, argues Laird, the era of the global supply chain is almost over. We have gone on our spending binge, and now we must face the hangover. As the environmental consequences of lax standards come home to roost, as sources of cheap energy, cheap credit, and even cheap labour threaten to dry up, we are nearing the death of globalization. “The golden age of affordable consumerism was short. We will very likely never shop this hard again”. … “Our bargain-addicted consumer economy is dangerously leveraged on a series of innovations and inventions not built to last. Specifically, the fundamentals of growth – cheap credit, offshore labour, affordable energy, and transport – will be depleted or become unavailable during the twenty-first century”.  He is not alone: ex-CIBC economist Jeff Rubin and John Ralston Saul both broadly share his opinion about the future of global trade, which is that there will be less of it, and that it will leave many difficult problems in its wake.

Gordon Laird … starts by browsing with us through piles of cheap consumer goods at Las Vegas trade shows for discount stores, then takes us to Shenzhen in China where many of the goods are made, shows us around the port of Los Angeles where a never-ending stream of standardized containers flows off the ships and onto trucks, and flies us back to the very beginning of the chain, to the muskeg of northern Alberta where the largest industrial project on the planet is run with Wal-Mart-like precision to extract oil from tar sands. He also ventures to the Mexico-US border to track the migration of workers driven by the shifting labour patterns of globalization. He supplements this field work with wide reading (his list of sources is 25 pages long and contains about 400 items) and a good selection of interviews. The Price of a Bargain is a valuable contribution to the continuing debate on global trade, its impact, and its future.

But while Laird’s willingness to travel and read is obvious, the book would be better if he had sat longer, alone and quiet, to distil the complexity he presents into a coherent picture. As it is, his portrayal is confusing and sometimes contradictory. The book is strong on data, weak on synthesis.

Netflix Prize: Was The Napoleon Dynamite Problem Solved?

I just gave a talk at work on “Recommender Systems and the Netflix Prize”, and included the two major popular articles written about the prize in its final year or so. One was in Wired Magazine and one was in the New York Times, and each focused on one outstanding problem that the competitors faced. Wired looked at the quirkiness of users as they rate movies, and the NYT focused on the difficulty of predicting ratings for a handful of divisive movies.

Now that the contest is over we can answer the question, “were those problems solved?”

Let’s start with the Wired article. Entitled “This Psychologist Might Outsmart the Math Brains Competing for the Netflix Prize” [link] it interviewed Gavin Potter, aka “Just a guy in a garage”. Here’s the hook:

The computer scientists and statisticians at the top of the leaderboard have developed elaborate and carefully tuned algorithms for representing movie watchers by lists of numbers, from which their tastes in movies can be estimated by a formula. Which is fine, in Gavin Potter’s view — except people aren’t lists of numbers and don’t watch movies as if they were.

Potter is focusing on effects like the Kahneman-Tversky anchoring effect:

If a customer watches three movies in a row that merit four stars — say, the Star Wars trilogy — and then sees one that’s a bit better — say, Blade Runner — they’ll likely give the last movie five stars. But if they started the week with one-star stinkers like the Star Wars prequels, Blade Runner might get only a 4 or even a 3. Anchoring suggests that rating systems need to take account of inertia — a user who has recently given a lot of above-average ratings is likely to continue to do so. Potter finds precisely this phenomenon in the Netflix data; and by being aware of it, he’s able to account for its biasing effects and thus more accurately pin down users’ true tastes.

Well, Potter didn’t win, but did these kinds of ideas help when it came to the winning submission? The answer is yes. The winning teams worked these kinds of patterns, which are independent of particular user-movie combinations, into their models under the bland name of “baseline predictors”.

Any model has to predict the ratings that individual users will give to particular movies. A very simple baseline predictor could take the average of all the ratings for that movie, take the average of all ratings given by the user in question, and split the difference. So if the movie has an average rating of 3.45 and the user has given the movies they have watched an average rating of 2.55, then the model would predict a rating of 3. This includes some minimal level of user-quirkiness (are they a high or low rater?) and some level of information about the movie (is it rated highly or not?), yet it has nothing to say about how this particular user is expected to react to this particular movie.
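As a rough sketch of that split-the-difference baseline, here is a few lines of Python. The function names and the toy ratings are mine, made up for illustration; nothing here comes from the Netflix data or from any team's code, and it assumes the user and the movie have each been rated at least once.

```python
from collections import defaultdict


def split_the_difference(movie_avg, user_avg):
    """Midpoint of the movie's average rating and the user's average rating."""
    return (movie_avg + user_avg) / 2.0


def build_baseline(ratings):
    """ratings: iterable of (user_id, movie_id, stars) tuples (hypothetical layout).

    Returns a predict(user_id, movie_id) function. It assumes both the user
    and the movie appear somewhere in the ratings.
    """
    by_movie = defaultdict(list)
    by_user = defaultdict(list)
    for user_id, movie_id, stars in ratings:
        by_movie[movie_id].append(stars)
        by_user[user_id].append(stars)

    def predict(user_id, movie_id):
        movie_avg = sum(by_movie[movie_id]) / len(by_movie[movie_id])
        user_avg = sum(by_user[user_id]) / len(by_user[user_id])
        return split_the_difference(movie_avg, user_avg)

    return predict


# The worked example from the text: movie average 3.45, user average 2.55.
example = split_the_difference(3.45, 2.55)

# Invented toy ratings, just to exercise build_baseline.
toy_ratings = [("ann", "m1", 4), ("ann", "m2", 2), ("bob", "m1", 3), ("bob", "m2", 5)]
toy_predict = build_baseline(toy_ratings)
```

Note that `toy_predict("ann", "m1")` uses only Ann's overall average and the movie's overall average. Nothing in it knows whether Ann in particular would like that movie, which is the limitation described above.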

In their winning submission, the BellKor team [PDF link] list their major improvements in the final year of the competition, and the first item they give is improved baseline predictors. In particular,

Much of the temporal variability in the data is included within the baseline predictors, through two major temporal effects. The first addresses the fact that an item’s popularity may change over time. For example, movies can go in and out of popularity as triggered by external events such as the appearance of an actor in a new movie. This is manifested in our models by treating the item bias as a function of time. The second major temporal effect allows users to change their baseline ratings over time. For example, a user who tended to rate an average movie “4 stars”, may now rate such a movie “3 stars”. This may reflect several factors including a natural drift in a user’s rating scale, the fact that ratings are given in the context of other ratings that were given recently and also the fact that the identity of the rater within a household can change over time…

It was brought to our attention by our colleagues at the Pragmatic Theory team (PT) that the number of ratings a user gave on a specific day explains a significant portion of the variability of the data during that day.

A model with these variations, and no specific user-movie considerations (i.e., one that is useless for presenting a list of recommendations to a user) actually ends up being significantly more accurate than Netflix’s own Cinematch algorithm was at the beginning of the competition.
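To make the idea of time-dependent baselines a bit more concrete, here is a toy sketch in Python. It is not the BellKor method: their biases are learned parameters inside a much larger model, while this version just averages residuals inside fixed time bins. The bin width, the tuple layout and all the names are assumptions of mine.

```python
from collections import defaultdict


def _mean(values):
    return sum(values) / len(values)


def fit_temporal_baseline(ratings, bin_days=70):
    """ratings: list of (user_id, movie_id, day, stars), day being an integer day number.

    Fits: prediction = global mean + movie bias in this time bin
                       + user bias in this time bin.
    bin_days is an arbitrary illustrative choice.
    """
    mu = _mean([stars for _, _, _, stars in ratings])

    # Movie bias per time bin: how far the movie sits from the global mean.
    movie_resid = defaultdict(list)
    for user, movie, day, stars in ratings:
        movie_resid[(movie, day // bin_days)].append(stars - mu)
    movie_bias = {key: _mean(vals) for key, vals in movie_resid.items()}

    # User bias per time bin: what is left after removing the movie effect.
    user_resid = defaultdict(list)
    for user, movie, day, stars in ratings:
        t = day // bin_days
        user_resid[(user, t)].append(stars - mu - movie_bias[(movie, t)])
    user_bias = {key: _mean(vals) for key, vals in user_resid.items()}

    def predict(user, movie, day):
        t = day // bin_days
        # Unseen (movie, bin) or (user, bin) pairs fall back to a bias of zero.
        return mu + movie_bias.get((movie, t), 0.0) + user_bias.get((user, t), 0.0)

    return predict


# Invented data: the same user rates the same movie 4 stars early on, 2 stars later.
drift_predict = fit_temporal_baseline([("u", "m", 0, 4), ("u", "m", 100, 2)])
```

Even in this crude form, the prediction for a given user and movie can differ from one period to the next, which a single fixed average per user or per movie cannot do.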

So score one for the winners – they solved the user-quirkiness problem.

The second article was in the New York Times and was called “If You Liked This, You’re Sure to Love That” [link]. Its focus was not the quirkiness of users, but unpredictable movies. Author Clive Thompson interviewed Len Bertoni, a leading contestant:

The more Bertoni improved upon Netflix, the harder it became to move his number forward. This wasn’t just his problem, though; the other competitors say that their progress is stalling, too, as they edge toward 10 percent. Why?

Bertoni says it’s partly because of “Napoleon Dynamite,” an indie comedy from 2004 that achieved cult status and went on to become extremely popular on Netflix. It is, Bertoni and others have discovered, maddeningly hard to determine how much people will like it. When Bertoni runs his algorithms on regular hits like “Lethal Weapon” or “Miss Congeniality” and tries to predict how any given Netflix user will rate them, he’s usually within eight-tenths of a star. But with films like “Napoleon Dynamite,” he’s off by an average of 1.2 stars…

And while “Napoleon Dynamite” is the worst culprit, it isn’t the only troublemaker. A small subset of other titles have caused almost as much bedevilment among the Netflix Prize competitors. When Bertoni showed me a list of his 25 most-difficult-to-predict movies, I noticed they were all similar in some way to “Napoleon Dynamite” — culturally or politically polarizing and hard to classify, including “I Heart Huckabees,” “Lost in Translation,” “Fahrenheit 9/11,” “The Life Aquatic With Steve Zissou,” “Kill Bill: Volume 1” and “Sideways.”

So this is the question that gently haunts the Netflix competition, as well as the recommendation engines used by other online stores like Amazon and iTunes. Just how predictable is human taste, anyway? And if we can’t understand our own preferences, can computers really be any better at it?

Napoleon Dynamite is a problem not because it’s the most difficult movie to predict (it isn’t) but because it’s difficult to predict and it was rated by a lot of people. A movie that is difficult to predict but which is rated by only a handful of users will contribute very little to the total error in a model.
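The arithmetic behind that is simple: the overall RMSE is built from the sum of squared errors over all ratings, so a movie's contribution is its number of ratings times its squared per-movie RMSE. A small Python sketch, with counts and errors that are invented purely for illustration (they are not figures from the contest):

```python
def squared_error_share(movies):
    """movies: dict of name -> (rating_count, per_movie_rmse).

    Returns each movie's share of the total squared error,
    where a movie's squared error is rating_count * rmse ** 2.
    """
    sse = {name: count * rmse ** 2 for name, (count, rmse) in movies.items()}
    total = sum(sse.values())
    return {name: value / total for name, value in sse.items()}


# Hypothetical movies: made-up counts and errors.
shares = squared_error_share({
    "popular_and_hard": (100_000, 1.2),
    "obscure_and_hard": (50, 1.5),
    "popular_and_easy": (100_000, 0.8),
})
```

In this made-up example the obscure movie has the worst per-movie error of the three, yet it accounts for well under a tenth of a percent of the total squared error, while the popular, hard-to-predict movie accounts for most of it.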

Well, now the competition is done, the complete data set and the predictions of the winning submission are available for download [link], so download them I did, loaded them into a SQL Anywhere database, and graphed the results. Here is a plot of the remaining error for each movie against the total number of ratings for that movie, for all 17,770 movies. Napoleon Dynamite is the red dot.

[Figure: grand prize error vs. rating count]

With 116,362 ratings, Napoleon Dynamite has a higher error than any other movie rated more than 50,000 times. Its RMSE is 1.1934: just as bad as it was for Len Bertoni when the original article was written.
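I did the per-movie calculation in SQL, but the same grouping can be sketched in a few lines of Python. The row layout (movie id, actual rating, predicted rating) is an assumption for illustration, not the format of the downloadable files, and the sample rows are invented.

```python
import math
from collections import defaultdict


def per_movie_rmse(rows):
    """rows: iterable of (movie_id, actual_stars, predicted_stars).

    Returns a dict of movie_id -> (rating_count, rmse), i.e. the two
    quantities plotted against each other: count on one axis, error on the other.
    """
    squared_error = defaultdict(float)
    count = defaultdict(int)
    for movie_id, actual, predicted in rows:
        squared_error[movie_id] += (actual - predicted) ** 2
        count[movie_id] += 1
    return {
        movie_id: (count[movie_id], math.sqrt(squared_error[movie_id] / count[movie_id]))
        for movie_id in count
    }


# Invented rows: movie 1 is off by one star on both ratings, movie 2 is predicted exactly.
stats = per_movie_rmse([(1, 4, 3.0), (1, 2, 3.0), (2, 5, 5.0)])
```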

So there you go: user quirkiness was resolved, at least to the extent that was needed to win the prize, while quirky movies remained stubbornly unpredictable till the end.

Why I Am Such an Infrequent Blogger

“My assumption, always, is that everyone knows everything I know AND MORE. Rephrase. Everyone who is interested in the kinds of thing that interest me knows everything I know AND MORE. If they're not interested they don't know but don't want to. So there's no point in mentioning things that strike me as interesting, unless a) these are events in the last, say, 5 minutes (so those disposed to be interested might not be au fait) or b) I'm up for proselytizing (those not disposed to be interested might be with enough encouragement).”

See? Helen DeWitt even knows more than I do about why I am such an infrequent blogger.