In accounting and finance (and indeed in other research areas too), the dependent variable is sometimes categorical. For instance, research on auditor opinion, efficiency, or the determinants of board size often uses a binary dependent variable. With a panel data set, however, we may worry that the categorical dependent variable shows little or no variance. The question then is: “Can we run a panel logistic regression?”
In short, my answer is yes. There is, admittedly, a possibility of estimation bias, which shows up in the variance and standard errors after estimation. So, how can we run a panel logistic regression?
In Stata, we normally use xtreg for a static panel regression. The assumption is that N is of similar size to T (Pedroni, 2008). For logistic regression on a panel data set, we simply replace xtreg with xtlogit.
The syntax is:
Random effect model
xtlogit DV IV [if] [in] [weight], re re_options
Fixed effect model
xtlogit DV IV [if] [in] [weight], fe fe_options
The re_options and fe_options can be:
noconstant –> suppress the constant term, e.g. to avoid the dummy-variable trap (random-effects model only; the fixed-effects model has no constant)
offset(varname) –> include varname in the model with its coefficient constrained to 1
constraints(constraints) –> apply the specified linear constraints
collinear –> keep collinear variables
asis –> retain perfect predictor variables
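As an illustration of the syntax above, here is a minimal sketch that fits both models. It assumes Stata's shipped example data set (loaded with webuse union) and its variable names, which stand in for your own DV and IVs; they are not taken from the studies discussed here.

```stata
* Load an example panel data set (an assumption, used only for illustration)
webuse union, clear

* Declare the panel structure: idcode identifies the unit, year the period
xtset idcode year

* Random-effects logit: union is the binary DV, the rest are the IVs
xtlogit union age grade not_smsa south, re

* Fixed-effects (conditional) logit on the same specification
xtlogit union age grade not_smsa south, fe
```

Note that the fixed-effects model can only use units whose dependent variable changes at least once over time, so Stata drops the units with an all-0 or all-1 outcome. This is where a lack of variance in the dependent variable becomes visible.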
Robust standard errors are requested in the same way as with xtreg, through the vce() option:
vce(robust) or vce(cluster clustvar) (depending on whether you are concerned about heteroskedasticity or about correlation of the errors within each panel unit)
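A short sketch of how these options might be attached to xtlogit follows. It again assumes the webuse union example data, and the number of bootstrap replications and the seed are arbitrary choices. Which vce types are accepted differs between the re and fe models and across Stata versions, so it is worth checking help xtlogit before relying on this.

```stata
* Example data and panel declaration (assumptions, used only for illustration)
webuse union, clear
xtset idcode year

* Random-effects logit with standard errors clustered on the panel identifier
xtlogit union age grade not_smsa south, re vce(cluster idcode)

* Fixed-effects logit with bootstrap standard errors (50 replications, fixed seed)
xtlogit union age grade not_smsa south, fe vce(bootstrap, reps(50) seed(12345))
```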
How can we correct the estimation bias for a categorical dependent variable in a panel data set?
- The common method is MLE logistic regression (Hsieh et al, 1985; King and Zeng, 2001). This method is the easier one to use for estimating corrected coefficients.
- The alternative method is the weighted exogenous sampling MLE logistic regression of Manski (1999), especially for big sample sizes.
- The other method is the conditional MLE of King and Zeng (2001).
How large is the difference between estimates with and without the correction?
Many papers on logistic regression show that the corrected coefficients are much better, especially when the sample size is not that big. However, several research papers also show that MLE logistic regression and weighted exogenous sampling MLE regression give much the same results.