Efficiency Comparison of Regression Coefficient Estimation Methods for Multiple Linear Regression Model when Data Contain Outliers in Dependent Variable

Authors

  • Kritaporn Thitacharee
  • Juthaphorn Sinsomboonthong
  • Thidaporn Supapakorn

Abstract

The purpose of this research was to compare the efficiency of five regression coefficient estimation methods for the multiple linear regression model when the data contain mild outliers in the dependent variable. The five methods were the ordinary least squares method, the least trimmed squares method, the M method using the Andrews and Welsch weight functions, and the GM method using the Huber weight function. The criterion for the efficiency comparison was the estimated mean square error (EMSE). The data were generated by the Monte Carlo simulation technique for 78 situations, with 1,000 repetitions for each situation. The results were as follows. In the case of no outliers in the dependent variable, the ordinary least squares method was the most efficient. In the case of outliers in the dependent variable with normally distributed random errors, the M method using the Welsch weight function provided the most efficient estimator when the sample size was 10, 20, or 30, and the M method using the Andrews weight function provided the most efficient estimator when the sample size was 50, 100, or 150. However, when the random errors followed a t distribution with 1 degree of freedom, the GM method using the Huber weight function tended to be the most efficient estimator in all situations. Moreover, as the degrees of freedom increased, the M method using the Welsch weight function was likely to be the most efficient estimator when the sample size was not greater than 30, whereas the M method using the Andrews weight function tended to be the most efficient estimator when the sample size was greater than 30.

Keywords: multiple linear regression model, outliers, regression coefficient, ordinary least squares method, mean square error
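As a rough illustration of the comparison described in the abstract, the sketch below (in Python with NumPy; the paper does not specify an implementation language) estimates the EMSE for two of the five methods, ordinary least squares and an M estimator fitted by iteratively reweighted least squares with the Welsch weight function. The contamination rate, outlier shift, tuning constant, sample size, and true coefficient values are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def ols(X, y):
    # Ordinary least squares via the pseudo-inverse for numerical stability.
    return np.linalg.pinv(X) @ y

def m_welsch(X, y, c=2.9846, tol=1e-8, max_iter=100):
    # M estimation by iteratively reweighted least squares with the Welsch
    # weight function w(u) = exp(-(u/c)^2); residuals are scaled by a
    # MAD-based estimate of sigma.  c = 2.9846 is the usual tuning constant
    # for ~95% efficiency under normal errors (an assumption here).
    beta = ols(X, y)
    for _ in range(max_iter):
        r = y - X @ beta
        s = np.median(np.abs(r - np.median(r))) / 0.6745
        if s == 0:
            s = 1.0
        w = np.exp(-((r / s) / c) ** 2)
        beta_new = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new
        beta = beta_new
    return beta

def emse(estimator, beta_true, n=30, reps=1000, outlier_rate=0.1):
    # One common definition of the estimated mean square error: the average
    # squared distance between the estimate and the true coefficient vector
    # over Monte Carlo repetitions.
    p = len(beta_true) - 1
    total = 0.0
    for _ in range(reps):
        X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
        y = X @ beta_true + rng.normal(size=n)
        # Contaminate a fraction of the responses with mild outliers
        # (a fixed upward shift); rate and shift are illustrative choices.
        idx = rng.choice(n, size=int(outlier_rate * n), replace=False)
        y[idx] += 10.0
        b = estimator(X, y)
        total += np.sum((b - beta_true) ** 2)
    return total / reps

beta_true = np.array([1.0, 2.0, -1.5])
print("EMSE, OLS      :", emse(ols, beta_true))
print("EMSE, M-Welsch :", emse(m_welsch, beta_true))
```

Under this kind of contamination the M estimator is expected to show a lower EMSE than ordinary least squares, consistent with the direction of the results reported above; the least trimmed squares, Andrews, and Huber-based GM methods would be compared in the same way.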



Published

2018-06-25