I’ve read quite a few presentations on Bayesian data analysis, but I always seem to fall into the crack between the first, one-step problem (the usual being how likely you are to have a disease), and more advanced problems that involve conjugate priors and quite a few other concepts. So, when I was recently reading Bayesian Data Analysis[1], I decided to tackle Exercise 13 from Chapter 2 using only Bayes rule updates and simulation. I think it’s been illuminating, so decided to write it up here, using my favorite tool **R**.

Exercise 13 involves annual airline accidents from 1976 to 1985, modeled as a Poisson distribution. The data is :

`accidents <- c(24, 25, 31, 31, 22, 21, 26, 20, 16, 22)`

A little playing around with graphs and **R**‘s `dpois`

gives me the impression that is probably around 24, but I’ll make three priors (for ):

`r <- seq (10, 45, 0.2)`

theta1 <- dnorm (r, 15, 5)

theta1 <- theta1 / sum (theta1)

theta2 <- dnorm (r, 35, 6)

theta2 <- theta2 / sum (theta2)

theta3 <- 20 - abs (r - mean (r))

theta3 <- theta3 / sum (theta3)

Where theta1 is probably low, theta2 is probably high, and theta3 is not even parametric. (More on this later.) I normalized them to create proper priors so that they graph well together, but that wasn’t necessary. Remember, I’m doing all of my calculations over the discretized range, . Plotting the theta’s together:

So let’s run the numbers (accidents) through a repeated set of Bayes Rule updates to get a posterior distribution for based on the prior distribution, theta1:

Continue reading →