The Wikileaks Data Dump
Peter Klein
I’ve been fascinated by the reaction to the Wikileaks release of 90,000+ classified documents related to the Afghanistan war. US and British (and Pakistani) authorities are predictably outraged, while critics of the war are encouraged that the disclosures could help turn the tide, as the Pentagon Papers did nearly four decades ago. What interests me most, however, is the massive size of the Wikileaks archive. As the Guardian’s Roy Greenslade remarked, this is “data journalism.” Wikileaks doesn’t analyze, synthesize, attempt to corroborate, seek alternative points of view, write up the inverted-pyramid lead, or do the other things respectable journalists are supposed to do; it just dumps the data and lets others sort it out.
Some find this approach distasteful. A Pakistani official said, “these reports betray a lack of understanding of the complexities of the nations involved.” Well, sure. They’re raw data, nothing more. But isn’t sharing data, and not just analysis, a quintessential New Economy phenomenon? Don’t we have search and analysis tools, data-mining algorithms, page rankings, and other means to sift through the huge piles of stuff that constitute the long tail? Shouldn’t expert commentary and analysis be replicable? Many journals now mandate data sharing. The American Economic Review’s policy, for example, reads: “It is the policy of the American Economic Review to publish papers only if the data used in the analysis are clearly and precisely documented and are readily available to any researcher for purposes of replication. Authors of accepted papers that contain empirical work, simulations, or experimental work must provide to the Review, prior to publication, the data, programs, and other details of the computations sufficient to permit replication. These will be posted on the AER Web site.” Why should foreign-affairs reporting be different?