Stata for R users pt 3

In Part 1 of this series, I listed a number of Stata strengths that appealed to me as a long-time R user. In Part 2, I gave thoughts and tips for R users who are new to Stata. While writing Part 2, I realized that I had left several important strengths out of Part 1. This part adds them, and also expands on one of the new Stata 13 features, forecast, which goes well beyond the usual predict.

[Figure: Dollars Forecast]

More Stata Strengths

Generalized Structural Equation Models

In addition to the nice SEM features I mentioned previously, Stata 13 adds generalized SEMs, which use GLMs instead of just linear models and also allow multi-level models. You can predict from a SEM, calculate various margins, and even use time-series lags and differences where appropriate.

Multi-level Mixed-effects

Multi-level mixed-effects models are supported for linear, logistic, ordered logistic, probit, ordered probit, Poisson, and GLM responses. For all types, you can specify the variance-covariance structure of the random effects separately for each random-effects equation. For linear models, you can also specify the structure of the residual errors within the lowest-level groups (including AR and MA structures of a chosen order). Very thorough.
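To make the covariance options a little more concrete, here is a minimal sketch using mixed. It assumes Stata 13 or later and internet access so that webuse can fetch the pig example dataset (variables id, week, weight); the choice of dataset and of an AR(1) residual structure is mine, purely for illustration.

```stata
* Sketch only: assumes Stata 13+ and that webuse can download the pig dataset
webuse pig, clear

* Random intercept and random slope on week for each pig,
* with an unstructured covariance between the two random effects
mixed weight week || id: week, covariance(unstructured)

* Random intercept only, with AR(1) residual errors within each pig,
* ordered by the integer time variable week
mixed weight week || id:, residuals(ar 1, t(week))
```

The covariance() option belongs to a single random-effects equation, so in a model with more levels each equation can get its own structure.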


Constraints

Stata has a unified constraint language and a constraint list (constraint dir). Most estimation commands accept constraints (I count 70 in the help that do), covering everything from OLS regression to GLM, ARFIMA, GARCH, state-space, and stochastic frontier models.
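As a small illustration of the constraint language, the sketch below defines one constraint and uses it in a constrained linear regression. It assumes the auto dataset that ships with Stata; the particular constraint (equal coefficients on price and weight) makes no substantive sense and is only there to show the syntax.

```stata
* Sketch only: uses the auto dataset shipped with Stata
sysuse auto, clear

* Constraint 1: the coefficients on price and weight must be equal
constraint 1 price = weight

* Constrained linear regression that applies constraint 1
cnsreg mpg price weight, constraints(1)

* List the constraints currently defined
constraint dir
```

The same numbered constraint can then be passed to other estimation commands through their constraints() option.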


Stata now has Project files, which can include code (do files), graphs, datasets, and other files you might use. You can now write plugins for Stata in Java: there’s an API for accessing, creating, or deleting anything you might need in Stata from within your Java code. There is an fp: prefix for regressions with fractional polynomial regressors. There are commands for reading and writing Excel spreadsheets, even down to the cell level.
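Two of these are easy to try out. The sketch below uses the auto dataset that ships with Stata; the file name auto_subset.xlsx and the sheet name cars are made up for the example, and the cell range is chosen arbitrarily.

```stata
* Sketch only: file name and sheet name are hypothetical
sysuse auto, clear

* Fractional polynomial in weight: fp searches over powers of <weight>
fp <weight>: regress mpg <weight> foreign

* Write three variables to a spreadsheet, with variable names in row 1
export excel make mpg weight using "auto_subset.xlsx", sheet("cars") firstrow(variables) replace

* Read back only cells A1:C11 (header row plus the first 10 cars)
import excel using "auto_subset.xlsx", sheet("cars") cellrange(A1:C11) firstrow clear
```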

Economists’ Delight

I want to emphasize how deep Stata is for economists (though I’m not an economist myself). For example, if you want to do IV regression with panel data, you have at least two choices: xtivreg, which assumes that a subset of the explanatory variables is correlated with the error, and xthtaylor, which assumes that a subset of the explanatory variables is correlated with the individual-level effects but NOT with the error. Oh, and you can use the Amemiya–MaCurdy estimator with xthtaylor, too. (No, I don’t know what that is.) There are a huge number of commands like these aimed at economists.
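For the curious, here is roughly what the two commands look like side by side. This is a sketch patterned on the kind of examples in the Stata manuals, not a recommendation: it assumes internet access so that webuse can fetch the nlswork and psidextract datasets, and the choice of which variables to treat as endogenous is simply carried along for illustration.

```stata
* Sketch only: assumes webuse can download both example datasets

* xtivreg: tenure is treated as correlated with the error and is
* instrumented by union and south, using the fixed-effects estimator
webuse nlswork, clear
xtset idcode year
xtivreg ln_wage age not_smsa (tenure = union south), fe

* xthtaylor: the variables in endog() are treated as correlated with the
* individual-level effects, but not with the idiosyncratic error
webuse psidextract, clear
xtset id t
xthtaylor lwage occ south smsa ind exp exp2 wks ms union fem blk ed, endog(exp exp2 wks ms union ed)

* Same model with the Amemiya-MaCurdy estimator (needs a balanced panel)
xthtaylor lwage occ south smsa ind exp exp2 wks ms union fem blk ed, endog(exp exp2 wks ms union ed) amacurdy
```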


Forecast

Stata 13 has a forecast command that can forecast from systems of equations, where the equations can be deterministic (identities) or stochastic (estimated by Stata’s estimation commands). Forecasts that include lagged variables can be static (lags use actual values) or dynamic (lags use previously forecast values), and forecasts can include alternate scenarios.
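To give a feel for the workflow, here is a sketch based on Klein's small macroeconomic model, which I believe is the example the Stata documentation uses for forecast. It assumes Stata 13 or later and internet access so that webuse can fetch the klein2 dataset (annual data, already tsset). The model name kleinmodel and the prefixes dyn_ and sta_ are arbitrary names I picked.

```stata
* Sketch only: assumes Stata 13+ and that webuse can download klein2
webuse klein2, clear

* Stochastic equations: estimate by three-stage least squares and store
reg3 (c p L.p w) (i p L.p L.k) (wp y L.y yr), endog(w p y) exog(t wg g)
estimates store klein

* Create the forecast model and add the estimated equations
forecast create kleinmodel
forecast estimates klein

* Deterministic equations (accounting identities)
forecast identity y = c + i + g
forecast identity p = y - t - wp
forecast identity k = L.k + i
forecast identity w = wg + wp

* Variables the model takes as given
forecast exogenous wg
forecast exogenous g
forecast exogenous t
forecast exogenous yr

* Dynamic forecast (the default): lags use previously forecast values
forecast solve, prefix(dyn_) begin(1921)

* Static forecast: lags use the actual values in the data
forecast solve, prefix(sta_) begin(1921) static
```

Each solve writes its results to new variables carrying the given prefix (dyn_c, sta_c, and so on), so the two sets of forecasts can be compared or graphed against the actual series.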

You could obviously program up something similar, or very carefully manage a do file, commenting and uncommenting things as appropriate, but forecast manages the forecasting process so well that it’s changed how I look at a problem. In the last part of this series, Part 4, I’ll take the Condo Association Electricity Usage problem that I addressed in R and revisit it in Stata, using forecast and stepping up the complexity of my forecast. The graph at the top of this posting is the result.

