R Users Will Now Inevitably Become Bayesians

There are several reasons why not everyone uses Bayesian methods for regression modeling. One reason is that Bayesian modeling requires more thought: you need pesky things like priors, and you can’t assume the answers are valid just because a procedure runs without throwing an error. A second reason is that MCMC sampling, the bedrock of practical Bayesian modeling, can be slow compared to closed-form or MLE procedures. A third reason is that existing Bayesian solutions have either been highly specialized (and thus inflexible) or have required knowing how to use a generalized tool like BUGS, JAGS, or Stan. This third barrier has recently been shattered in the R world by not one but two packages: brms and rstanarm. Interestingly, both of these packages are elegant front ends to Stan, via rstan and shinystan.

This article describes brms and rstanarm, how they help you, and how they differ.
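
To give a feel for what these front ends look like, here is a minimal sketch rather than anything taken from the article itself. It fits the same simple regression with each package. The built-in mtcars data, the formula mpg ~ wt, the normal(0, 10) slope prior, and the seed are all assumptions chosen purely for illustration, and each fit runs only if its package is installed.

```r
# Illustrative sketch only: the data, formula, prior, and seed are assumptions,
# not taken from the article.
model_formula <- mpg ~ wt  # fuel economy vs. weight, from the built-in mtcars data

# rstanarm: mirrors glm() and ships precompiled Stan models
if (requireNamespace("rstanarm", quietly = TRUE)) {
  fit_rstanarm <- rstanarm::stan_glm(
    model_formula,
    data   = mtcars,
    family = gaussian(),
    prior  = rstanarm::normal(0, 10),  # prior on the slope
    seed   = 123
  )
  print(fit_rstanarm)
}

# brms: generates Stan code from the formula and compiles it,
# so it needs a working C++ toolchain
if (requireNamespace("brms", quietly = TRUE)) {
  fit_brms <- brms::brm(
    model_formula,
    data   = mtcars,
    family = gaussian(),
    prior  = brms::set_prior("normal(0, 10)", class = "b"),  # prior on the slope
    seed   = 123
  )
  print(fit_brms)
}
```

Both calls use R’s familiar formula interface, so neither requires writing Stan code by hand.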

Map Projections

The Earth is round, and maps are flat. That’s a problem for map makers. And a source of endless entertainment for geeks.

World Map

Carlos A. Furuti has an excellent website with many projections and clear explanations of the tradeoffs of each. The main projection page has links to all types, including two of my favorites: Other Interesting Projections, and Projections on 3D Polyhedra. Enjoy!

In R, the packages maps and mapproj are your entrée to this world. I created the above map (a Mollweide projection, which is a useful favorite), with:

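library(maps)     # world map outlines
library(mapproj)  # projection support used by map()

# Draw the filled world map in a Mollweide projection
map("world", projection = "mollweide", regions = "", wrap = TRUE, fill = TRUE, col = "green")
# Overlay an unlabeled longitude/latitude grid
map.grid(labels = FALSE, nx = 36, ny = 18)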

Real Data Science pt1: Review of Numbersense

So far, when I’ve written on Data Science topics I’ve written about the fun part: the statistical analysis, graphs, conclusions, insights, etc. For this next series of postings, I’m going to concentrate more on what we can call Real Data Science®: the less glamorous side of the job, where you have to beat your data and software into submission, where you don’t have access to the tools or data you need, and so on. In other words, where you spend the vast majority of your time as a Data Scientist.

I’ll start the series with a review of Kaiser Fung’s Numbersense, published in 2013. It’s not mainly about Real Data Science, but I’ll start with it because it’s a great book that illustrates several common data pitfalls. In the epilogue, Kaiser shares one of his own Real Data Science stories, and I found myself nodding my head and saying, “Yup, that’s how I spent several days in the last couple of weeks!”

Book recommendation: Longitudinal Structural Equation Modeling

Longitudinal Structural Equation Modeling, Todd D. Little, Guilford Press 2013.

Let me start by saying that this is one of the best textbooks I’ve ever read. It is written as if the author were our mentor, and I really get the feeling that he’s sharing his wisdom with us rather than trying to be pedagogically correct. The book is full of insights on how he thinks about building and applying SEMs, and the lessons he’s learned the hard way.

Children’s science fiction and fantasy: it’s not just for children anymore!

My wife and I have started listening to audiobooks during car trips or in the evening, and we’ve discovered that there are some absolute gems in the children’s literature section of the library. Yep, kids’ stories aren’t just for kids anymore. (And I wish I’d had stories like this when I was growing up!) In particular, we’ve been listening to audio CDs, and several of them have superb voice acting that really enhances the story. These stories show incredible imagination, and in this posting I’d like to highly recommend two series: The Larklight Trilogy and the Bartimaeus Sequence, especially the audio CDs.
