OLS, LASSO, and PLS on Data Containing Multicollinearity
Abstract
Correlation between predictor variables (multicollinearity) is a problem in regression analysis.
Several methods address this problem, and each has its own complexity. This research aims to
compare the performance of OLS, LASSO, and PLS on data with correlated predictor variables.
OLS builds a model by minimizing the residual sum of squares. LASSO minimizes the residual sum
of squares subject to the sum of the absolute values of the coefficients being less than a constant,
and PLS combines principal component analysis with multiple linear regression. Based on analyses of
simulated and real data using R, the results show that for data with serious multicollinearity (high
correlation between predictor variables), LASSO tends to have a lower average bias than PLS when
predicting the response variable. OLS has the largest variance of the Mean Square Error of
Prediction (MSEP); that is, it is the least consistent of the three methods in estimating the MSEP.
PLS yields a smaller MSEP than LASSO.
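
For reference, the OLS and LASSO criteria described in words above can be written as follows. This is a sketch in standard notation rather than notation taken from this paper: the symbols y_i (response), x_ij (predictors), beta_0 and beta_j (coefficients), n (observations), p (predictors), and the bound t are all assumed here, and the constraint is written with the conventional "less than or equal to".

```latex
% OLS: minimize the residual sum of squares (notation assumed, not from the source)
\hat{\beta}^{\mathrm{OLS}}
  = \arg\min_{\beta_0,\,\beta}
    \sum_{i=1}^{n} \Bigl( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\,\beta_j \Bigr)^{2}

% LASSO: the same criterion, with the sum of absolute coefficients bounded by a constant t
\hat{\beta}^{\mathrm{LASSO}}
  = \arg\min_{\beta_0,\,\beta}
    \sum_{i=1}^{n} \Bigl( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\,\beta_j \Bigr)^{2}
  \quad \text{subject to} \quad
  \sum_{j=1}^{p} \lvert \beta_j \rvert \le t
```

When t is large enough that the constraint is not binding, the LASSO solution coincides with the OLS solution; smaller values of t shrink the coefficients and can set some of them exactly to zero.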