Archive for March, 2013

Understanding Bayesian Inference

March 24, 2013

I use Bayesian inference frequently, and it comes up all the time when we try to interpret the results of tests and measurements that we hear about. However, it’s incredibly difficult to develop an intuitive understanding of Bayes’ theorem. (Daniel Kahneman argues in “Thinking, Fast and Slow” that this may never be intuitive.) Unfortunately, most resources you’ll readily find (I’m looking at you, Wikipedia) present an explanation that is rather opaque. Eliezer S. Yudkowsky presents a good description [via Less Wrong] of how Bayesian inference works and provides nice examples to work through. His explanation is really quite good, but it still doesn’t resolve the fundamental problem with any kind of statistical inference, namely that the user must really think about the problem. This is ultimately the step that seems to prevent people from making correct inferences. I use the word “think” here to mean that you have to work your brain, put aside other thoughts and focus. It’s easier to ignore the prior and take a frequentist approach. I wrote about this in an earlier post, where I discussed how to interpret the DIBELS literacy test and how it might be misinterpreted.

This xkcd cartoon sums up the situation quite well:

WordPress vs. Blogger: why I chose WordPress

March 19, 2013

I spent too much time trying to get Blogger to render a decent-looking equation, and what seemed to work once would fail after some seemingly cosmetic update. So I chose WordPress. I can do equations with no grief and no funny JavaScript business.

My old blog still exists, but I won’t be updating it any more.

Does it do $latex \LaTeX$?

March 17, 2013

Does it do \LaTeX?

\int_{-\infty}^{+\infty} \frac{1}{x} \, dx

Yes it does, and this is so much better than Blogger. I’ve been experimenting with a blog there and I’ve never been able to get it to do \LaTeX.

What is the base rate number of "Wackos in America"?

March 15, 2013

I heard an NPR story this morning in which a local sheriff was interviewed and mentioned that he has to follow up on all death threats because “there are a lot of wackos in America.”

I immediately began to wonder: what exactly is the number of wackos in America? Does anyone really know? This kind of comment fits a convenient narrative about wackos, but it lacks the precision and nuance that are crucial for making informed decisions about how much risk we really face. I don’t know what a wacko is, nor do I actually know anything about the people who make death threats. Are they really serious? Has anyone ever studied this? For example, at the CPAC meeting there is a session titled “Should we shoot all the consultants now?” These might be threatening words, but does anyone really think this is a threat?

For reference:
The CDC reports that there were 16,259 homicides in the US in 2010, for a rate of 5.3 per 100k. They also report that there were 120k unintentional injury deaths (accidents), for a rate of 39.1 per 100k. If we assume that every homicide is caused by a wacko, then the risk of death by wacko is still about seven times lower than the risk of accidental death.
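As a quick sanity check on those figures, here is a minimal Python sketch that compares the two rates and backs out the population each one implies. The variable names are mine, and the accident count uses the rounded 120k quoted above, so treat the implied populations as approximate.

```python
# Figures quoted above (CDC, 2010). The accident count is the rounded "120k".
homicides = 16_259
homicide_rate = 5.3        # deaths per 100k people
accident_deaths = 120_000  # rounded figure from the text
accident_rate = 39.1       # deaths per 100k people

# How much more likely is an accidental death than a homicide?
rate_ratio = accident_rate / homicide_rate

# Population implied by each (count, rate) pair; the two should roughly agree.
implied_pop_homicide = homicides / homicide_rate * 100_000
implied_pop_accident = accident_deaths / accident_rate * 100_000

print(f"accident rate / homicide rate = {rate_ratio:.1f}")
print(f"implied population (homicide figures): {implied_pop_homicide / 1e6:.0f} million")
print(f"implied population (accident figures): {implied_pop_accident / 1e6:.0f} million")
```

Both pairs of numbers imply a population of roughly 307 million, so the two rates are at least consistent with each other.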

Spectral Analysis

March 6, 2013

The article “Spectrum and spectral density estimation by the Discrete Fourier transform (DFT), including a comprehensive list of window functions and some new flat-top windows” by Heinzel et al. provides an excellent resource for understanding power spectrum estimation from real data. They include a lot of good examples to help develop a solid understanding.

This article is one of the few places where I’ve found all of the important pieces needed for DFT-based spectral analysis in one place. There are a lot of good resources out there but they usually do not touch on all of the details needed to generate a high-quality analysis. 
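To give a flavor of the kind of detail that matters, here is a minimal pure-Python sketch of a windowed, single-sided power spectrum estimate. It is my own illustration under common normalization conventions (divide by the squared sum of the window, fold in the negative frequencies), not code taken from the article. The function name `hann_power_spectrum` and the test signal are hypothetical.

```python
import cmath
import math


def hann_power_spectrum(samples, sample_rate):
    """Single-sided power spectrum (mean-square units) of a real signal.

    Returns (freqs, power, enbw). Dividing power by enbw gives a power
    spectral density. The normalization is an assumed, common convention.
    """
    n = len(samples)
    # Periodic Hann window (denominator n, not n - 1).
    window = [0.5 * (1.0 - math.cos(2.0 * math.pi * i / n)) for i in range(n)]
    s1 = sum(window)                 # normalizes narrowband peak heights
    s2 = sum(w * w for w in window)  # normalizes noise-like power
    windowed = [w * x for w, x in zip(window, samples)]

    freqs, power = [], []
    for k in range(n // 2 + 1):
        # Naive O(n^2) DFT; fine for a sketch, use an FFT for real data.
        x_k = sum(v * cmath.exp(-2j * math.pi * k * i / n)
                  for i, v in enumerate(windowed))
        # Double every bin except DC and Nyquist to fold in negative frequencies.
        fold = 1.0 if k == 0 or (n % 2 == 0 and k == n // 2) else 2.0
        freqs.append(k * sample_rate / n)
        power.append(fold * abs(x_k) ** 2 / s1 ** 2)

    enbw = sample_rate * s2 / s1 ** 2  # equivalent noise bandwidth in Hz
    return freqs, power, enbw


# Hypothetical test signal: amplitude-2 sine at 8 Hz, sampled at 64 Hz for 1 s.
fs, n_samples, amplitude = 64.0, 64, 2.0
signal = [amplitude * math.sin(2.0 * math.pi * 8.0 * i / fs) for i in range(n_samples)]
freqs, power, enbw = hann_power_spectrum(signal, fs)
peak_bin = max(range(len(power)), key=power.__getitem__)
print(f"peak at {freqs[peak_bin]} Hz, power {power[peak_bin]:.3f}, ENBW {enbw:.2f} Hz")
```

With the tone sitting exactly on a bin, the peak reads the mean-square value of the sine (amplitude squared over two), and the equivalent noise bandwidth of the Hann window comes out to 1.5 bins. Getting those two normalizations right is exactly the sort of thing that is easy to miss.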

DIBELS and False Alarm Analysis

March 3, 2013

DIBELS is a test of language for small children learning to read. It was designed at the University of Oregon, Center for Teaching and Learning. It is a correlative test designed to assess the probability that a student will pass a third grade reading test. In this sense it is simply a detector like any other and can be analyzed as such. As with any other detector, we need to characterize its false alarm behavior to understand how it operates.

The test operates like most any other detector. The child takes a test; if the score exceeds a high threshold, called the benchmark, the child is classified as low risk. If the score falls short of the benchmark but exceeds a lower threshold, the child is classified as medium risk, and below that as high risk.
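The two-threshold rule just described can be sketched in a few lines of Python. This is only an illustration: the function name, the threshold values in the usage lines, and the handling of scores that land exactly on a threshold are my assumptions, not anything taken from the DIBELS reports.

```python
def classify_risk(score, benchmark, cut_point):
    """Classify a score with two thresholds, where cut_point < benchmark.

    Scores exactly on a threshold are treated as not exceeding it
    (an assumption made for this sketch).
    """
    if cut_point >= benchmark:
        raise ValueError("cut_point must be below benchmark")
    if score > benchmark:
        return "low risk"
    if score > cut_point:
        return "medium risk"
    return "high risk"


# Hypothetical thresholds, chosen only to exercise the three branches.
for score in (50, 30, 10):
    print(score, classify_risk(score, benchmark=40, cut_point=25))
```

Moving either threshold trades missed detections against false alarms, which is what the ROC curves discussed next describe.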

Technical reports for the DIBELS test are posted here. An analysis and discussion of how the thresholds are chosen is found in this particular report, 2012 – 2013 DIBELS Next Benchmark Goals: Technical Supplement [pdf]. This report publishes the ROC curves for the DIBELS tests at various times from early kindergarten up to third grade. In particular, it also discusses how they anticipate changing the receiver operating point to increase the probability of detection. The ROC curves are all fairly similar, so I’ll use figure one as illustrative. The ROC curves in this report use “Sensitivity” along the y-axis, which is equivalent to the probability of detection Pd. The curves use “1-Specificity” along the x-axis, which is equivalent to the probability of false alarm Pfa. In figure one on the left, the benchmark case, they show two operating points. The new “Recommended Goal” lists Pd=0.9, Pfa=0.57, so at this operating point 90% of students who are truly at risk will be identified, while a student who is not at risk has a 57% chance of scoring below the benchmark and being flagged anyway. I.e. 57% of students who are in no need of help will be flagged for extra help and classified as at risk by their teachers and schools.
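For readers who don’t have the detection vocabulary at their fingertips, here is a small Python sketch of how Pd and Pfa fall out of the four possible outcomes. The counts are hypothetical, picked only so that they reproduce the Pd=0.9, Pfa=0.57 operating point. They are not data from the report.

```python
def detector_rates(true_pos, false_neg, false_pos, true_neg):
    """Return (Pd, Pfa) from confusion-matrix counts.

    Pd  = sensitivity     = P(flagged | truly at risk)
    Pfa = 1 - specificity = P(flagged | not at risk)
    """
    p_detect = true_pos / (true_pos + false_neg)
    p_false_alarm = false_pos / (false_pos + true_neg)
    return p_detect, p_false_alarm


# Hypothetical cohort: 100 students truly at risk, 100 students not at risk.
p_d, p_fa = detector_rates(true_pos=90, false_neg=10, false_pos=57, true_neg=43)
print(f"Pd = {p_d:.2f}, Pfa = {p_fa:.2f}")
```

Note that both rates are computed within a group (at risk, or not at risk), so neither one says anything about how large the groups are relative to each other.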

It’s important to remember that Pd and Pfa are conditional probabilities, so to interpret the full situation we need to know the base rate for a given scenario. It can be difficult to estimate the base rate, so as a proxy we can use reported statistics for students who pass Ohio’s third grade reading test. These were reported in the Dayton Daily News in an article titled “New reading requirements could cost schools millions”. The article appeared in the 11 Feb 2013 print edition along with a table detailing how many students from various districts pass the third grade reading exam. (This table only appears in the print edition.) In the Oakwood school district only 0.6% of students failed to pass the exam, while in Dayton City 36.2% failed.

Using these failure rates as a proxy for the base rate, we can use Bayes’ theorem to estimate the probability that a flagged Oakwood student truly needs help. It comes out to 0.9%, so the other 99.1% of the time the student flagged as at risk is not truly at risk, i.e. they are a false alarm. A similar calculation for Dayton City schools gives a 47.3% probability that a flagged student is truly at risk. Even for an urban school district like Dayton City, the probability that any given student who is classified as at risk by DIBELS is actually at risk is still less than 50%.
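The calculation behind those two numbers is short enough to write out. Here is a minimal Python sketch, assuming (as above) that each district’s failure rate on the third grade exam stands in for the base rate of students who are truly at risk. The function and variable names are mine.

```python
def prob_truly_at_risk(base_rate, p_detect=0.9, p_false_alarm=0.57):
    """P(truly at risk | flagged) from Bayes' theorem.

    base_rate is P(truly at risk); the defaults are the "Recommended Goal"
    operating point quoted above.
    """
    true_alarms = p_detect * base_rate
    false_alarms = p_false_alarm * (1.0 - base_rate)
    return true_alarms / (true_alarms + false_alarms)


# Failure rates from the Dayton Daily News table, used as proxy base rates.
districts = {"Oakwood": 0.006, "Dayton City": 0.362}
posteriors = {name: prob_truly_at_risk(rate) for name, rate in districts.items()}

for name, p in posteriors.items():
    print(f"{name}: {100 * p:.1f}% of flagged students are truly at risk")
```

The denominator is the total fraction of students who get flagged. In Oakwood it is dominated almost entirely by the false alarm term, which is why the answer is so small even though Pd is 0.9.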

Large false alarm rates lead to large problems. First, the people who use the output of the detector will eventually become desensitized to the output and ignore it. This is what drives the requirements of many detection algorithms. Teachers and educators are busy with many other important tasks and will not have the time to sort out true from false detections. Second, even if we take the large number of detections seriously, we will be putting enormous resources into what’s called intervention even though it’s not necessary. Even in a school district with infinite funds, there is still an opportunity cost in terms of educational time. I.e. these children could be taught math, science, music, art or anything else that would benefit them, whereas the extra reading instruction is wasted on most of them. Finally, at these levels of “detection” we are no longer talking about intervention, we are simply talking about teaching, and it is worth considering whether hiring more regular teachers and lowering class size would be more beneficial and more efficient than hiring (presumably higher-salary) specialized intervention teachers.