In Part 1 of this series, I listed a bunch of Stata strengths that appealed to me as a long-time R user. In Part 2, I gave thoughts and tips for R users who are new to Stata. As I was writing Part 2, I realized that I had left several important strengths out of Part 1. I want to add them here, and also expand on one of the new Stata 13 features, forecast (something that goes way beyond the usual predict).
More Stata Strengths
Generalized Structural Equation Models
In addition to the nice SEM features I mentioned previously, Stata 13 features SEMs that use GLMs instead of just linear models, and these generalized SEMs also allow multi-level models. You can predict from a SEM, calculate various margins, and even use time series lags/diffs if appropriate.
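To make that concrete, here is a minimal sketch of a one-equation generalized SEM with a logit link, followed by a prediction and a marginal effect. The dataset (Stata's bundled auto data) and the choice of variables are my own illustration, and the exact predict and margins options accepted after gsem may differ between Stata versions, so treat this as a starting point rather than a recipe.

```stata
* Illustrative only: the auto data and variable choices are assumptions
sysuse auto, clear

* A one-equation generalized SEM: Bernoulli family with a logit link
gsem (foreign <- mpg weight, family(bernoulli) link(logit))

* Predicted mean of the outcome (here, the probability that foreign == 1)
predict phat, mu

* Average marginal effect of mpg on the predicted outcome
margins, dydx(mpg)
```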
Multi-level Mixed-effects
Multi-level mixed-effects models support linear, logistic, ordered logistic, probit, ordered probit, Poisson, and GLM specifications. For all types, you can specify the variance-covariance structure of the random effects separately for each random-effects equation, and for linear models you can also specify the structure of the residual errors in the lowest-level groups (including AR# and MA#). Very thorough.
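As a rough sketch of what those two kinds of structure look like for a linear model, the example below fits one model with an unstructured covariance for the random effects and another with AR(1) residuals within group. I am assuming the pig growth dataset from the Stata manuals (variables weight, week, id); the particular model is only for illustration.

```stata
* Illustrative only: the pig dataset and this model are assumptions
webuse pig, clear

* Random intercept and random slope on week for each pig,
* with an unstructured covariance between the two random effects
mixed weight week || id: week, covariance(unstructured)

* Random intercept only, with AR(1) residual errors within pig,
* ordered by week
mixed weight week || id:, residuals(ar 1, t(week))
```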
Constraints
Stata has a unified constraint language and constraint list (constraint dir). Most estimation procedures allow constraints (I count 70 of them in the help that do), including everything from OLS regression to GLM, ARFIMA, GARCH, State Space, and Stochastic Frontier models.
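Here is a small sketch of how the constraint list works in practice: define a constraint once, list it, and pass it to more than one estimator. The dataset and the constraint itself (forcing two coefficients to be equal) are assumptions chosen purely for illustration, not a sensible model.

```stata
* Illustrative only: the auto data and the constraint are assumptions
sysuse auto, clear

* Define constraint 1: the coefficients on price and weight must be equal
constraint 1 price = weight

* Show the constraints currently defined
constraint dir

* Constrained linear regression
cnsreg mpg price weight, constraints(1)

* The same constraint, reused in a different estimation command
logit foreign mpg price weight, constraints(1)

* Remove the constraint when finished
constraint drop 1
```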
Other
Stata now has Project files, which can include code (do files), graphs, datasets, and other files you might use. You can now write plugins for Stata in Java: there’s an API for accessing, creating, or deleting anything you might need in Stata from within your Java code. There is an fp: prefix for regressions with fractional polynomial regressors. There are commands for reading and writing Excel spreadsheets, even down to the cell level.
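Two of these are easy to show in a few lines. The sketch below uses the fp: prefix to search for a fractional polynomial in one regressor, then writes a few variables to an Excel file and reads back a specific cell range. The dataset, the file name auto_demo.xlsx, and the cell range are all assumptions for illustration. Writing individual cells is handled by a separate command (putexcel), whose syntax I have left out because it differs between Stata versions.

```stata
* Illustrative only: dataset, file name, and cell range are assumptions
sysuse auto, clear

* Fractional polynomial in weight: <weight> marks the term to transform
fp <weight>: regress mpg <weight> foreign

* Write three variables to a spreadsheet, with variable names in row 1
export excel make mpg weight using "auto_demo.xlsx", firstrow(variables) replace

* Read back only the header row plus the first ten data rows
import excel using "auto_demo.xlsx", sheet("Sheet1") cellrange(A1:C11) firstrow clear
list in 1/5
```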
Economists’ Delight
I want to emphasize how deep Stata is for economists. (Though I’m not an economist.) For example, if you want to do IV regression with panel data, you have at least two choices: xtivreg (assumes that a subset of the explanatory variables is correlated with the error) versus xthtaylor (assumes that a subset of the explanatory variables is correlated with individual-level effects and NOT with the error). Oh, and you can use the Amemiya–MaCurdy estimator with xthtaylor, too. (No, I don’t know what that is.) There are a huge number of functions for economists.
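For a feel of how the two commands differ in use, here is a sketch based on my recollection of the example datasets in the Stata manuals (nlswork and psidextract). The dataset names, variable names, and model specifications are assumptions on my part, so check them against the manual entries before relying on them.

```stata
* Illustrative only: datasets and specifications are recalled from the
* manual examples and should be treated as assumptions

* xtivreg: tenure is treated as correlated with the error and is
* instrumented by union and south; fixed-effects estimator
webuse nlswork, clear
xtset idcode year
xtivreg ln_wage age c.age#c.age not_smsa (tenure = union south), fe

* xthtaylor: the endog() variables are treated as correlated with the
* individual-level effects, but not with the idiosyncratic error
webuse psidextract, clear
xtset id t
xthtaylor lwage occ south smsa ind exp exp2 wks ms union fem blk ed, ///
    endog(exp exp2 wks ms union ed)

* Same model using the Amemiya-MaCurdy estimator
xthtaylor lwage occ south smsa ind exp exp2 wks ms union fem blk ed, ///
    endog(exp exp2 wks ms union ed) amacurdy
```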
Forecast
Stata 13 has a forecast command that can forecast based on systems of equations, where the equations can be deterministic or stochastic (estimated by Stata’s estimation commands). Forecasts that include lagged variables can be static (use actual values) or dynamic (use forecast values), and forecasts can include alternate scenarios.
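The workflow is easier to see in a toy example. The sketch below simulates a small series, estimates one stochastic equation, adds one deterministic identity, and then solves the model three ways: dynamically, statically, and under an alternate scenario for the exogenous variable. Everything here (the simulated data, the variable names, the model name demo, the prefixes) is made up for illustration and is not the model I use in Part 4.

```stata
* Illustrative only: simulated data and all names are assumptions
clear
set obs 40
set seed 12345
generate t = _n
tsset t

* An exogenous driver and an AR(1)-style outcome built recursively
generate x = rnormal()
generate y = 1
replace y = 0.5 + 0.7*y[_n-1] + 0.3*x + rnormal(0, 0.1) in 2/l
generate total = y + x

* Stochastic equation: estimate on the first 30 periods and store it
regress y L.y x in 1/30
estimates store yeq

* Build the forecast model: one estimated equation, one identity
forecast create demo, replace
forecast estimates yeq
forecast identity total = y + x
forecast exogenous x

* Dynamic forecast: lagged y uses earlier forecast values
forecast solve, prefix(dyn_) begin(31)

* Static forecast: lagged y uses actual values
forecast solve, prefix(sta_) begin(31) static

* Alternate scenario: shift the exogenous variable and solve again
replace x = x + 1 in 31/l
forecast solve, prefix(alt_) begin(31)

list t y dyn_y sta_y alt_y in 31/40
```

Each solve writes its results to new variables with the given prefix, so the baseline and scenario forecasts sit side by side in the dataset for comparison or graphing.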
You could obviously program up something similar, or very carefully manage a do file, commenting and uncommenting things as appropriate, but forecast manages the forecasting process so well that it’s changed how I look at a problem. In the last part of this series, Part 4, I’ll take the Condo Association Electricity Usage problem that I addressed in R and revisit it in Stata, using forecast and stepping up the complexity of my forecast. The graph at the top of this posting is the result.