As you probably know, I’m a big fan of R’s brms package, available from CRAN. In case you haven’t heard of it, brms is an R package by Paul-Christian Buerkner that implements Bayesian regression of all types using an extension of R’s formula specification that will be familiar to users of lmer. Under the hood, it translates the formula into Stan code, Stan translates that into C++, and your system’s C++ compiler compiles the result, which is then run.
brms is impressive in its own right. Just as impressive are the pace at which it keeps adding capabilities and the breadth of Buerkner’s vision for it. I last posted something way back on version 0.8, when brms gained the ability to do non-linear regression, but now we’re up to version 1.1, with 1.2 around the corner. What’s been added since 0.8, you may ask? Here are a few highlights:
Just a quick posting following up on the brms/rstanarm posting. In brms 0.8, they’ve added non-linear regression. Non-linear regression is fraught with peril, and when venturing into that realm you have to worry about many more issues than with linear regression. It’s not unusual to hit roadblocks that prevent you from getting answers. (Read the Wikipedia links Non-linear regression and Non-linear least squares to get an idea.)
There are several reasons why everyone isn’t using Bayesian methods for regression modeling. One reason is that Bayesian modeling requires more thought: you need pesky things like priors, and you can’t assume that the answers are valid just because a procedure runs without throwing an error. A second reason is that MCMC sampling, the bedrock of practical Bayesian modeling, can be slow compared to closed-form or MLE procedures. A third reason is that existing Bayesian solutions have either been highly specialized (and thus inflexible), or have required knowing how to use a generalized tool like BUGS, JAGS, or Stan. This third reason has recently been shattered in the R world by not one but two packages: brms and rstanarm. Interestingly, both of these packages are elegant front ends to Stan.
This article describes brms and rstanarm, how they help you, and how they differ.
I just read about a website, accidental aRt, that shows how artistic R graphics can look when things go bad. Wonderful!
When I first heard of SSA (Singular Spectrum Analysis) and EMD (Empirical Mode Decomposition), I thought surely I’d found a couple of magical methods for decomposing a time series into component parts (trend, various seasonalities, various cycles, noise). And joy of joys, it turns out that each of these methods is implemented in an R package.
In this posting, I’m going to document some of my explorations of the two methods, to hopefully paint a more realistic picture of what the packages and the methods can actually do. (At least in the hands of a non-expert such as myself.)
In a previous series of postings, I described a model that I developed to predict monthly electricity usage and expenditure for a condo association. I based my model on the average monthly temperature at a nearby NOAA weather station at Ronald Reagan Airport (DCA), because the results are reasonable and more importantly because I can actually obtain forecasts from NOAA up to a year out.
The small complication is that the NOAA forecasts cover three-month periods rather than single months: JFM (Jan-Feb-Mar), FMA (Feb-Mar-Apr), MAM (Mar-Apr-May), etc. So, in this posting, we’ll briefly describe how to turn a series of these overlapping three-month forecasts into a series of monthly approximations.
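To give a feel for the problem, here is a minimal sketch in Python. It is not necessarily the approach the posting takes. It rests on two assumptions of my own: each three-month figure is the plain, unweighted average of its three monthly values, and the first two monthly values are already known (say, from observed temperatures). Every later month then follows by subtraction. The function name `monthly_from_overlapping` and all the numbers are hypothetical.

```python
def monthly_from_overlapping(three_month_means, first_two_months):
    """Recover monthly values from overlapping three-month means.

    Assumption (not from the original post): window k is the unweighted
    mean of months k, k+1 and k+2, and the first two months are known.
    Differences in month length are ignored.
    """
    months = [float(v) for v in first_two_months]
    if len(months) != 2:
        raise ValueError("need exactly two known starting months")
    for k, mean in enumerate(three_month_means):
        # mean = (x[k] + x[k+1] + x[k+2]) / 3, solved for x[k+2]
        months.append(3.0 * mean - months[k] - months[k + 1])
    return months


# Made-up monthly temperatures, used only to build a consistent example.
true_monthly = [36.0, 39.0, 47.0, 57.0, 66.0, 75.0]
windows = [sum(true_monthly[k:k + 3]) / 3.0 for k in range(len(true_monthly) - 2)]

recovered = monthly_from_overlapping(windows, true_monthly[:2])
print([round(v, 2) for v in recovered])
```

One caveat with this recursion: any rounding or error in a three-month figure is carried forward into every later month, so it is best treated as a rough approximation.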
I’m always intrigued by techniques that have cool names: Support Vector Machines, State Space Models, Spectral Clustering, and an old favorite, Hidden Markov Models (HMMs). While going through some of my notes, I stumbled onto a fun experiment with HMMs where you feed a bunch of English text into a two-state HMM and it will (tend to) discover which letters are vowels.
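For the curious, the experiment can be sketched in a few dozen lines of plain Python. This is my own minimal illustration of Baum-Welch (EM) training for a discrete HMM, not the code from my notes. The function name `baum_welch`, the toy sentence, and the starting values are all assumptions. A sentence this short is only enough to show the mechanics. You would want far more text before expecting a clean vowel/consonant split.

```python
import math
import random


def baum_welch(obs, n_states, n_symbols, n_iter=30, seed=0):
    """Fit a discrete HMM by EM; returns (pi, A, B, log_likelihoods)."""
    rng = random.Random(seed)

    def rand_dist(n):
        # Random starting distribution with no zeros, to break symmetry.
        w = [1.0 + rng.random() for _ in range(n)]
        total = sum(w)
        return [x / total for x in w]

    N, T = n_states, len(obs)
    pi = rand_dist(N)
    A = [rand_dist(N) for _ in range(N)]          # transition probabilities
    B = [rand_dist(n_symbols) for _ in range(N)]  # emission probabilities
    history = []

    for _ in range(n_iter):
        # Scaled forward pass: each alpha[t] sums to 1, scale[t] is its normaliser.
        alpha, scale = [], []
        for t in range(T):
            if t == 0:
                row = [pi[i] * B[i][obs[0]] for i in range(N)]
            else:
                row = [B[j][obs[t]] * sum(alpha[t - 1][i] * A[i][j] for i in range(N))
                       for j in range(N)]
            c = sum(row)
            alpha.append([x / c for x in row])
            scale.append(c)
        history.append(sum(math.log(c) for c in scale))  # log-likelihood

        # Scaled backward pass, reusing the forward normalisers.
        beta = [[1.0] * N for _ in range(T)]
        for t in range(T - 2, -1, -1):
            for i in range(N):
                beta[t][i] = sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                                 for j in range(N)) / scale[t + 1]

        # E-step: state occupancies (gamma) and summed transition counts (xi).
        gamma = [[alpha[t][i] * beta[t][i] for i in range(N)] for t in range(T)]
        xi_sum = [[0.0] * N for _ in range(N)]
        for t in range(T - 1):
            for i in range(N):
                for j in range(N):
                    xi_sum[i][j] += (alpha[t][i] * A[i][j] * B[j][obs[t + 1]]
                                     * beta[t + 1][j] / scale[t + 1])

        # M-step: re-estimate pi, A and B from the expected counts.
        pi = gamma[0][:]
        for i in range(N):
            out = sum(xi_sum[i])
            A[i] = [x / out for x in xi_sum[i]]
            occ = sum(gamma[t][i] for t in range(T))
            B[i] = [0.0] * n_symbols
            for t in range(T):
                B[i][obs[t]] += gamma[t][i] / occ
    return pi, A, B, history


text = "the quick brown fox jumps over the lazy dog and then runs away"
alphabet = sorted(set(text))
obs = [alphabet.index(ch) for ch in text]
pi, A, B, history = baum_welch(obs, n_states=2, n_symbols=len(alphabet))

# For each hidden state, list the characters it emits most readily.
for state in range(2):
    top = sorted(alphabet, key=lambda ch: -B[state][alphabet.index(ch)])[:6]
    print(state, top)
```

The recorded log-likelihoods in `history` should never decrease from one iteration to the next, which is a handy sanity check on any EM implementation.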