Aaron Swartz Stole Nothing, But You Wouldn't Know That From the Coverage.

The arrest of Aaron Swartz is clearly uncalled for: he's guilty of nothing more than downloading some files without permission, and we were pleased (though not surprised) to see QuestionCopyright.org board member James Jacobs quoted saying what needs to be said: "Aaron's prosecution undermines academic inquiry and democratic principles. It's incredible that the government would try to lock someone up for allegedly looking up articles at a library."

But beyond the disturbing fact of the arrest itself is a persistent problem in coverage of the case.  Venue after venue refers to Swartz being arrested for "theft" or "stealing", even though he didn't steal anything.  The bias isn't particularly subtle:

Aaron Swartz, the 24-year-old who went from helping to create Reddit to embracing a different brand of progressive activism, has been indicted on federal charges of breaking into MIT and stealing more than four million articles from an online database.

That's from Talking Points Memo, a political news site (boldface ours).  But often the bias starts in the headline:

Internet Activist Charged in M.I.T. Data Theft

That's from the New York Times.  Or this...

Reddit co-founder accused of stealing 4.8m JSTOR documents from MIT

...from The Guardian, or this...

Reddit's Aaron Swartz Charged with Data Theft

...from The Atlantic, or this...

Reddit Co-Founder Charged With Theft: He allegedly stole more than 4 million documents from M.I.T.

...from AdWeek, or this...

Reddit-connected activist indicted in MIT theft

...at MSNBC, or this...

Reddit Co-founder Indicted for Stealing 4 Million Documents from M.I.T. and JSTOR

...from TheBlaze, or...

...well, what's depressing is how much longer we could continue the list.  The "theft" paradigm is still the primary frame journalists use to analyze stories like this.  Even venues not particularly hostile to Aaron's cause unconsciously adopt the copying-is-stealing frame when reporting on such cases.  For example, here's a revealing passage from the Huffington Post:

... A spokeswoman for JSTOR said Tuesday that Swartz had agreed to return all the articles so the company can ensure they aren't distributed.


"We don't own any of this content. We really have to [be] responsible stewards of it," said spokeswoman Heidi McGregor. "We worked hard to find out what was going on. We worked hard to get the data back." ...

"Return all the articles"?  "Get the data back"?  As if it were somehow... missing?  It's right there on JSTOR's servers.  It never left.  JSTOR's computers made copies of it and sent those copies to Aaron's computers over a network.

Copying Is Not Theft

Let's review what the indictment actually alleges:

The claim is that Aaron Swartz downloaded more articles from an online service than he was authorized to, and that he used some networking tricks to keep the service from noticing the volume and blocking the downloads.

That's it.  That's all there is to it.

He didn't "hack" into JSTOR's servers, in the sense of triggering lurking software vulnerabilities that could do actual damage to the servers.  He certainly didn't "steal" anything -- in fact, there is no evidence that he even redistributed the articles to others (not that there should be anything wrong with it if he had, and of course it still wouldn't be stealing even then).  He apparently just used the data for research into the funding behind authors of journal articles.

But to read coverage of the case, you'd think he had broken into someone's house and taken their pets.

I'd love some ideas on how we can get journalists and commentators to cover these cases accurately.  Would 4.8 million letters to the editor do it?

In the meantime, please sign the petition supporting Aaron, lest we all be in his position before long.