List of figures
Part I General information
1 Introduction
1.1 What is this book about?
1.2 Which models are considered?
1.3 Whom is this book for?
1.4 How is the book organized?
1.5 The SPost software
1.5.1 Updating Stata
1.5.2 Installing SPost13
Uninstalling SPost9
Installing SPost13 using search
Installing SPost13 using net install
1.5.3 Uninstalling SPost13
1.6 Sample do-files and datasets
1.6.1 Installing the spost13_do package
1.6.2 Using spex to load data and run examples
1.7 Getting help with SPost
1.7.1 What if an SPost command does not work?
1.7.2 Getting help from the authors
What we need to help you
1.8 Where can I learn more about the models?
2 Introduction to Stata
2.1 The Stata interface
2.2 Abbreviations
2.3 Getting help
2.3.1 Online help
2.3.2 PDF manuals
2.3.3 Error messages
2.3.4 Asking for help
2.3.5 Other resources
2.4 The working directory
2.5 Stata file types
2.6 Saving output to log files
2.7 Using and saving datasets
2.7.1 Data in Stata format
2.7.2 Data in other formats
2.7.3 Entering data by hand
2.8 Size limitations on datasets
2.9 Do-files
2.9.1 Adding comments
2.9.2 Long lines
2.9.3 Stopping a do-file while it is running
2.9.4 Creating do-files
2.9.5 Recommended structure for do-files
2.10 Using Stata for serious data analysis
2.11 Syntax of Stata commands
2.11.1 Commands
2.11.2 Variable lists
2.11.3 if and in qualifiers
2.11.4 Options
2.12 Managing data
2.12.1 Looking at your data
2.12.2 Getting information about variables
2.12.3 Missing values
2.12.4 Selecting observations
2.12.5 Selecting variables
2.13 Creating new variables
2.13.1 The generate command
2.13.2 The replace command
2.13.3 The recode command
2.14 Labeling variables and values
2.14.1 Variable labels
2.14.2 Value labels
2.14.3 The notes command
2.15 Global and local macros
2.16 Loops using foreach and forvalues
2.17 Graphics
2.17.1 The graph command
2.18 A brief tutorial
2.19 A do-file template
2.20 Conclusion
3 Estimation, testing, and fit
3.1 Estimation
3.1.1 Stata’s output for ML estimation
3.1.2 ML and sample size
3.1.3 Problems in obtaining ML estimates
3.1.4 Syntax of estimation commands
3.1.5 Variable lists
Using factor-variable notation in the variable list
Specifying interaction and polynomials
More on factor-variable notation
3.1.6 Specifying the estimation sample
Missing data
Information about missing values
Postestimation commands and the estimation sample
3.1.7 Weights and survey data
Complex survey designs
3.1.8 Options for regression models
3.1.9 Robust standard errors
3.1.10 Reading the estimation output
3.1.11 Storing estimation results
(Advanced) Saving estimates to a file
3.1.12 Reformatting output with estimates table
3.2 Testing
3.2.1 One-tailed and two-tailed tests
3.2.2 Wald and likelihood-ratio tests
3.2.3 Wald tests with test and testparm
3.2.4 LR tests with lrtest
Avoiding invalid LR tests
3.3 Measures of fit
3.3.1 Syntax of fitstat
3.3.2 Methods and formulas used by fitstat
3.3.3 Example of fitstat
3.4 estat postestimation commands
3.5 Conclusion
4 Methods of interpretation
4.1 Comparing linear and nonlinear models
4.2 Approaches to interpretation
4.2.1 Method of interpretation based on predictions
4.2.2 Method of interpretation using parameters
4.2.3 Stata and SPost commands for interpretation
4.3 Predictions for each observation
4.4 Predictions at specified values
4.4.1 Why use the m* commands instead of margins?
4.4.2 Using margins for predictions
Predictions using interaction and polynomial terms
Making multiple predictions
Predictions for groups defined by levels of categorical variables
4.4.3 (Advanced) Nondefault predictions using margins
The predict() option
The expression() option
4.4.4 Tables of predictions using mtable
mtable with categorical and count outcomes
(Advanced) Combining and formatting tables using mtable
4.5 Marginal effects: Changes in predictions
4.5.1 Marginal effects using margins
4.5.2 Marginal effects using mtable
4.5.3 Posting predictions and using mlincom
4.5.4 Marginal effects using mchange
4.6 Plotting predictions
4.6.1 Plotting predictions with marginsplot
4.6.2 Plotting predictions using mgen
4.7 Interpretation of parameters
4.7.1 The listcoef command
4.7.2 Standardized coefficients
4.7.3 Factor and percentage change coefficients
4.8 Next steps
Part II Models for specific kinds of outcomes
5 Models for binary outcomes: Estimation, testing, and fit
5.1 The statistical model
5.1.1 A latent-variable model
5.1.2 A nonlinear probability model
5.2 Estimation using logit and probit commands
5.2.1 Example of logit model
5.2.2 Comparing logit and probit
5.2.3 (Advanced) Observations predicted perfectly
5.3 Hypothesis testing
5.3.1 Testing individual coefficients
5.3.2 Testing multiple coefficients
5.3.3 Comparing LR and Wald tests
5.4 Predicted probabilities, residuals, and influential observations
5.4.1 Predicted probabilities using predict
5.4.2 Residuals and influential observations using predict
5.4.3 Least likely observations
5.5 Measures of fit
5.5.1 Information criteria
5.5.2 Pseudo-R²'s
5.5.3 (Advanced) Hosmer–Lemeshow statistic
5.6 Other commands for binary outcomes
5.7 Conclusion
6 Models for binary outcomes: Interpretation
6.1 Interpretation using regression coefficients
6.1.1 Interpretation using odds ratios
6.1.2 (Advanced) Interpretation using y*
6.2 Marginal effects: Changes in probabilities
6.2.1 Linked variables
6.2.2 Summary measures of change
MEMs and MERs
AMEs
Standard errors of marginal effects
6.2.3 Should you use the AME, the MEM, or the MER?
6.2.4 Examples of marginal effects
AMEs for continuous variables
AMEs for factor variables
Summary table of AMEs
Marginal effects for subgroups
MEMs and MERs
Marginal effects with powers and interactions
6.2.5 The distribution of marginal effects
6.2.6 (Advanced) Algorithm for computing the distribution of effects
6.3 Ideal types
6.3.1 Using local means with ideal types
6.3.2 Comparing ideal types with statistical tests
6.3.3 (Advanced) Using macros to test differences between ideal types
6.3.4 Marginal effects for ideal types
6.4 Tables of predicted probabilities
6.5 Second differences comparing marginal effects
6.6 Graphing predicted probabilities
6.6.1 Using marginsplot
6.6.2 Using mgen with the graph command
6.6.3 Graphing multiple predictions
6.6.4 Overlapping confidence intervals
6.6.5 Adding power terms and plotting predictions
6.6.6 (Advanced) Graphs with local means
6.7 Conclusion
7 Models for ordinal outcomes
7.1 The statistical model
7.1.1 A latent-variable model
7.1.2 A nonlinear probability model
7.2 Estimation using ologit and oprobit
7.2.1 Example of ordinal logit model
7.2.2 Predicting perfectly
7.3 Hypothesis testing
7.3.1 Testing individual coefficients
7.3.2 Testing multiple coefficients
7.4 Measures of fit using fitstat
7.5 (Advanced) Converting to a different parameterization
7.6 The parallel regression assumption
7.6.1 Testing the parallel regression assumption using oparallel
7.6.2 Testing the parallel regression assumption using brant
7.6.3 Caveat regarding the parallel regression assumption
7.7 Overview of interpretation
7.8 Interpreting transformed coefficients
7.8.1 Marginal change in y*
7.8.2 Odds ratios
7.9 Interpretations based on predicted probabilities
7.10 Predicted probabilities with predict
7.11 Marginal effects
7.11.1 Plotting marginal effects
7.11.2 Marginal effects for a quick overview
7.12 Predicted probabilities for ideal types
7.12.1 (Advanced) Testing differences between ideal types
7.13 Tables of predicted probabilities
7.14 Plotting predicted probabilities
7.15 Probability plots and marginal effects
7.16 Less common models for ordinal outcomes
7.16.1 The stereotype logistic model
7.16.2 The generalized ordered logit model
7.16.3 (Advanced) Predictions without using factor-variable notation
7.16.4 The sequential logit model
7.17 Conclusion
8 Models for nominal outcomes
8.1 The multinomial logit model
8.1.1 Formal statement of the model
8.2 Estimation using the mlogit command
Weights and complex samples
Options
8.2.1 Examples of MNLM
8.2.2 Selecting different base outcomes
8.2.3 Predicting perfectly
8.3 Hypothesis testing
8.3.1 mlogtest for tests of the MNLM
8.3.2 Testing the effects of the independent variables
8.3.3 Tests for combining alternatives
8.4 Independence of irrelevant alternatives
8.4.1 Hausman–McFadden test of IIA
8.4.2 Small–Hsiao test of IIA
8.5 Measures of fit
8.6 Overview of interpretation
8.7 Predicted probabilities with predict
8.8 Marginal effects
8.8.1 (Advanced) The distribution of marginal effects
8.9 Tables of predicted probabilities
8.9.1 (Advanced) Testing second differences
8.9.2 (Advanced) Predictions using local means and subsamples
8.10 Graphing predicted probabilities
8.11 Odds ratios
8.11.1 Listing odds ratios with listcoef
8.11.2 Plotting odds ratios
8.12 (Advanced) Additional models for nominal outcomes
8.12.1 Stereotype logistic regression
8.12.2 Conditional logit model
8.12.3 Multinomial probit model with IIA
8.12.4 Alternative-specific multinomial probit
8.12.5 Rank-ordered logit model
8.13 Conclusion
9 Models for count outcomes
9.1 The Poisson distribution
9.1.1 Fitting the Poisson distribution with the poisson command
9.1.2 Comparing observed and predicted counts with mgen
9.2 The Poisson regression model
9.2.1 Estimation using poisson
Example of the PRM
9.2.2 Factor and percentage changes in E(y | x)
Example of factor and percentage change
9.2.3 Marginal effects on E(y | x)
Examples of marginal effects
9.2.4 Interpretation using predicted probabilities
Predicted probabilities using mtable and mchange
Treating a count independent variable as a factor variable
Predicted probabilities using mgen
9.2.5 Comparing observed and predicted counts to evaluate model specification
9.2.6 (Advanced) Exposure time
9.3 The negative binomial regression model
9.3.1 Estimation using nbreg
NB1 and NB2 variance functions
9.3.2 Example of NBRM
9.3.3 Testing for overdispersion
9.3.4 Comparing the PRM and NBRM using estimates table
9.3.5 Robust standard errors
9.3.6 Interpretation using E(y | x)
9.3.7 Interpretation using predicted probabilities
9.4 Models for truncated counts
9.4.1 Estimation using tpoisson and tnbreg
Example of zero-truncated model
9.4.2 Interpretation using E(y | x)
9.4.3 Predictions in the estimation sample
9.4.4 Interpretation using predicted rates and probabilities
9.5 (Advanced) The hurdle regression model
9.5.1 Fitting the hurdle model
9.5.2 Predictions in the sample
9.5.3 Predictions at user-specified values
9.5.4 Warning regarding sample specification
9.6 Zero-inflated count models
9.6.1 Estimation using zinb and zip
9.6.2 Example of zero-inflated models
9.6.3 Interpretation of coefficients
9.6.4 Interpretation of predicted probabilities
Predicted probabilities with mtable
Plotting predicted probabilities with mgen
9.7 Comparisons among count models
9.7.1 Comparing mean probabilities
9.7.2 Tests to compare count models
9.7.3 Using countfit to compare count models
9.8 Conclusion