A Comparison of Quasi-Poisson and Zero Inflated Negative Binomial Regression Models for Over-dispersion Count Data

Authors

  • Navapun Chuea-am, Faculty of Science, Kasetsart University
  • Boonorm Chomtee
  • Apinya Hirunwong

Abstract

This study compared the suitability of the Quasi-Poisson (QP) and zero-inflated negative binomial (ZINB) regression models when the dependent variable is count data whose variance is greater than its mean. For the real data, the dependent variable was the number of people injured in each accident, and three sample sizes were considered: small (n = 17), medium (n = 32) and large (n = 56). The probability of a zero event was 0.25 for the small sample size and 0.50 for the medium and large sample sizes. The real data set had three independent variables. For the simulated data, the dependent variable followed a zero-inflated negative binomial distribution with a dispersion parameter of 1.25, 1.50 or 1.75, a probability of a zero event of 0.25 or 0.50, and a mean of 1.4, 2 or 3. Three independent variables were generated from Bernoulli distributions with success probabilities of 0.3, 0.5 and 0.8. The simulated sample sizes were small (n = 15, 20), medium (n = 30, 35) and large (n = 45, 50). The criteria for model suitability were root mean square error (RMSE) and absolute average error (AAE); a smaller value of either indicates a better model. Based on RMSE and AAE, the Quasi-Poisson regression model was more appropriate than the zero-inflated negative binomial regression model in almost all cases studied, for both the real and the simulated data.

Keywords: count data, over-dispersion, Quasi-Poisson regression model, zero-inflated negative binomial regression model
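The simulation design and the two comparison criteria described in the abstract can be illustrated with a minimal sketch. It is not the authors' code, and several points are assumptions: the function names `simulate_zinb`, `rmse` and `aae` are hypothetical, the dispersion parameter is taken to be `alpha` in the variance form mu + alpha * mu^2, AAE is taken to be the mean absolute difference between observed and predicted counts, and the counts are drawn with a constant mean because the abstract does not state how the covariates enter the model.

```python
import numpy as np


def simulate_zinb(n, mu, alpha, omega, rng):
    """Draw n counts from a zero-inflated negative binomial distribution.

    Assumed parameterization (not confirmed by the abstract): the count
    component has mean mu and variance mu + alpha * mu**2, and omega is
    the probability of a structural (extra) zero.
    """
    size = 1.0 / alpha          # negative binomial shape parameter
    p = size / (size + mu)      # success probability that yields mean mu
    counts = rng.negative_binomial(size, p, size=n)
    structural_zero = rng.random(n) < omega
    return np.where(structural_zero, 0, counts)


def rmse(y, y_hat):
    """Root mean square error between observed and predicted counts."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))


def aae(y, y_hat):
    """Absolute average error, assumed here to be the mean absolute error."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return float(np.mean(np.abs(y - y_hat)))


rng = np.random.default_rng(0)
n, mu, alpha, omega = 50, 2.0, 1.5, 0.25   # one cell of the simulation grid

# Three Bernoulli covariates with success probabilities 0.3, 0.5 and 0.8.
# They are generated only to mirror the design; y does not depend on them here.
X = rng.binomial(1, [0.3, 0.5, 0.8], size=(n, 3))
y = simulate_zinb(n, mu, alpha, omega, rng)

# Illustrative predictor: the marginal mean of a ZINB variable, (1 - omega) * mu.
y_hat = np.full(n, (1.0 - omega) * mu)
print("RMSE:", rmse(y, y_hat))
print("AAE: ", aae(y, y_hat))
```

In a full comparison, `y_hat` would be replaced by the fitted values of each regression model, and the model with the smaller RMSE or AAE would be preferred. The coefficient estimates of a Quasi-Poisson fit coincide with those of an ordinary Poisson regression, so its fitted values (and hence its RMSE and AAE) match the Poisson ones; only the standard errors are rescaled.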

Published

2018-06-12