Lesson 1: Harnessing the power of statistics to predict the past
Nate Silver is an influential progressive columnist. He writes about the confluence of statistics and politics. He became a rising star during the last election, and now he writes for the New York Times. Influential guy, totally worth paying attention to.
Anyhow, yesterday he wrote this [TW] column about how Julian Assange was probably set up by the man. Not in the sense that the man made him rape those two women, but in the sense that Silver thinks the man is paying two women to pretend to be raped, what with [TW] all the fun that entails. Silver thinks this is likely the case because he knows some statistics. I also know some statistics.
I'm not an expert on statistics. I've taken three graduate level courses in frequentist statistics (more on that later). I've got a Ph.D. in Ecology (technically Zoology). I've taught ecology (hint: it's mostly statistics +/- lichens and shit). I've also taught college statistics (it also is mostly statistics). Silver studied economics and the statistics of baseball. And that's not me taking a swipe at him-- the statistics of baseball are complicated and meaningful.
Anyhow, one of the nice things about my training is that even though I don't work for the New York Times*, I've got a good sense of what I don't know much about. Things like Bayesian statistics.
So there are basically two statistical posses. There are the frequentists, who are essentially your grandmother's statisticians. As the cool kids say, these are the “unmarked” statisticians. You know, they do "normal" statistics, basically assuming that if you run an experiment enough times, you'll get the right answer, plus or minus some level of variation.
Then there are the Bayesians. This one guy I knew was a Bayesian. I shared an office with him once. Anyhow, based on that, I'm going to tell you that Silver does an okay job of describing what Bayesian statistics are. As I understand it, Bayesians basically pay a lot of attention to how gaining new information changes your understanding of the statistical validity of a hypothesis.
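For the curious, here's a minimal sketch of what "new information changes your understanding" looks like in practice. It's a toy, not anything from Silver's column: the coin, the 75% bias, the 50/50 starting belief and the function names are all assumptions I made up for illustration. The hypothesis is "this coin is rigged to land heads 75% of the time," the alternative is "this coin is fair," and each flip nudges the belief one way or the other.

```python
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Return P(hypothesis | evidence) using Bayes' rule."""
    numerator = prior * p_evidence_if_true
    denominator = numerator + (1.0 - prior) * p_evidence_if_false
    if denominator == 0:
        raise ValueError("evidence is impossible under both hypotheses")
    return numerator / denominator


def update_on_flips(prior, flips, p_heads_if_rigged=0.75, p_heads_if_fair=0.5):
    """Update belief that the coin is rigged after each flip ('H' or 'T').

    The 0.75 and 0.5 defaults are made-up numbers for this toy example.
    Returns the list of beliefs after each flip.
    """
    beliefs = []
    belief = prior
    for flip in flips:
        if flip == "H":
            belief = bayes_update(belief, p_heads_if_rigged, p_heads_if_fair)
        elif flip == "T":
            belief = bayes_update(belief, 1 - p_heads_if_rigged, 1 - p_heads_if_fair)
        else:
            raise ValueError(f"unexpected flip: {flip!r}")
        beliefs.append(belief)
    return beliefs


if __name__ == "__main__":
    # Start undecided (hypothetical prior of 0.5), then watch four flips.
    for flip, belief in zip("HHTH", update_on_flips(0.5, "HHTH")):
        print(f"saw {flip}: P(rigged) = {belief:.3f}")
```

The thing to notice is how much the answer leans on the numbers you feed in. Change the prior or the likelihoods and you get a different posterior, which is why the choice of inputs matters at least as much as the arithmetic.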
Anyhow, if you really care about statistics, you're reading the wrong post(s). You should just check out the appropriate Wikipedia entries.
Interestingly enough, Wikipedia points out that fiducial inference also exists, but it's largely the sort of thing assholes use in a desperate attempt to make their Ph.D. theses* relevant, so I'll ignore it completely.
Besides, this post isn't actually about statistics at all.
Anyhow, Silver brings the power of statistics to bear on two important issues:
1) What are you, train-riding lady?
2) Did the nice (but potentially “creepy”) man rape those lying women?
I'm going to have to say the answer to question one is a hearty WTF? “Japanese, Caucasian or Mixed Ethnicity?” Aren't there Japanese of mixed ethnicity? And besides, I know “what the fuck am I” is one of my all-time favorite questions to field from strangers. Occupational hazard, I suppose.
In any case, I need Silver to be more specific, and also to stop staring at that poor lady.
Silver was reading about Assange recently. I was just reading an essay by Richard Lewontin and Richard Levins about the search for life on Mars. It's a small world. That's about as much of a non-sequitur as the one Silver's got going on with his train lady. Anyhow, Lewontin and Levins have this great line in there:
“Science is necessary because things are different, but that science is only possible because things are the same.”
Aside from those authors' more immediate point about NASA not knowing what it's doing (back in the day, at least :eyeroll:), I think the quote is a pretty nice summation of why scientific inference can be of limited utility, or more to the point, why it's difficult. It's certainly possible (and even worthwhile) to do science, but you need to think long and hard about the assumptions you're making if you're going to have any hope of making any headway.
So, the reason Silver wants to know whether Julian Assange raped those women stems from the dilemma that not all rape allegations are the same.
Okay, let's back up. The reason Silver wants to know whether Julian Assange raped those women probably stems from concern about the importance of WikiLeaks. Or maybe the many issues surrounding the widespread prevalence of rape. Or even interest in the well-being of the women in question. It's probably one, hopefully two of those.
In any case, every rape accusation is unique. We couldn't possibly treat all rape accusations as equivalent, otherwise [TW] nobody would ever get convicted of rape. Not even the vanishingly small number that do now. So we have to investigate each accusation on its own. And sure, there totally are statistics we could use to see the degree to which the Assange cases fit various statistical patterns from all rape cases. Indeed, in order to do statistics we need to assume that the Assange cases are like every other rape case.
There are a few problems here:
1. As Silver admits, the Assange cases aren't necessarily typical. Michael Moore doesn't typically [TW] bail alleged rapists out of jail. This could be taken as evidence of Assange's innocence, but it could just as easily be taken as evidence that one can't compare the way the Swedish government has handled Assange to the way it has handled other rape suspects.
2. In order to do Silver's faux statistical analysis, you have to assume that courts always convict rapists (and likewise, always acquit innocent defendants). What Silver is really doing is evaluating (er... speculating, well, concern-trolling about) the likelihood that Assange will be convicted, which is most certainly not the same as analyzing the likelihood that he raped one or both of the women in question. Not the same thing at all.
3. In reality, these are two events that have already happened. Either Julian Assange raped one or both of these women, or he didn't. No amount of statistics is going to help us figure out what happened. One thing that might help would be testimony. For example, the testimony of the women. The women who have given the police detailed descriptions of being raped by Assange.
So none of this has anything to do with statistics, let alone Bayesian statistics. Still, both Silver and I got to waste people's time being pretentious. I think he might have even gotten paid* to do so.
In closing, let's look at the ultimate line of Silver's column:
“In a world of limited information, the political motivation behind the charges might be the most important clue we have in evaluating their merit.”
WTFOMGJUSTNO. Silver's got his variables all asunder here. When political motivation exists, people pay attention to rape charges. When it's just some dude, nobody really cares. Well, victims, survivors and women might care about the charges, but people who matter typically don't.
Besides, there's a difference between limited information and limited willingness to listen to women. I suspect the relationship between those two isn't what Silver thinks it is.
*If anyone's actually at the Times, I can get you my CV. You guys hire whoever, right? Sorry, it's whomever, right? :cough: You guys hire whomever, right? :curtsy:
via: Commenter Allison at Sady's. It's also not a coincidence that a lot of my links come from the Tiger Beatdown post in question.