Reducing Sampling Ratios and Increasing Number of Estimates Improve Bagging in Sparse Regression


Bagging, a powerful ensemble method from machine learning, can improve the performance of unstable predictors in difficult practical settings. Although Bagging is best known for its application to classification problems, here we demonstrate that employing Bagging in sparse regression improves performance over the baseline method (ℓ1 minimization). The original Bagging method uses a bootstrap sampling ratio of 1, so that the size L of each bootstrap sample equals the total number of data points m; we generalize the bootstrap sampling ratio to explore the optimal sampling ratios for various cases. The performance limits associated with different choices of bootstrap sampling ratio L/m and number of estimates K are analyzed theoretically. Simulation results show that a lower L/m ratio (0.6 - 0.9) leads to better performance than the conventional choice (L/m = 1), especially in challenging cases with a low number of measurements. With the reduced sampling ratio, SNR improves over the original Bagging method by up to 24% and over the base algorithm, ℓ1 minimization, by up to 367%. With a properly chosen sampling ratio, a reasonably small number of estimates (K = 30) gives satisfactory results, although increasing K is found to always improve or at least maintain performance.
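The procedure summarized above can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the ℓ1 minimization step is stood in for by a Lasso problem solved with plain ISTA, the K estimates are combined by simple averaging, and the function names (ista_lasso, bagged_sparse_regression), the regularization weight lam, and the problem sizes are all hypothetical.

```python
import numpy as np


def ista_lasso(A, y, lam=0.05, n_iter=300):
    """Approximate argmin_x 0.5*||A x - y||_2^2 + lam*||x||_1 with ISTA.

    Assumption: a stand-in for the l1 minimization base algorithm.
    """
    step = 1.0 / (np.linalg.norm(A, 2) ** 2)  # 1 / Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = x - step * (A.T @ (A @ x - y))  # gradient step on the quadratic term
        x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft thresholding
    return x


def bagged_sparse_regression(A, y, ratio=0.8, K=30, lam=0.05, seed=0):
    """Combine K estimates, each fit on L = round(ratio * m) bootstrapped rows."""
    rng = np.random.default_rng(seed)
    m = A.shape[0]
    L = max(1, int(round(ratio * m)))  # bootstrap sample size from the ratio L/m
    estimates = []
    for _ in range(K):
        idx = rng.integers(0, m, size=L)  # draw L row indices with replacement
        estimates.append(ista_lasso(A[idx], y[idx], lam=lam))
    return np.mean(estimates, axis=0)  # assumption: plain average of the K estimates


# Hypothetical synthetic problem: m = 40 measurements, n = 100 unknowns, 5 nonzeros.
rng = np.random.default_rng(1)
m, n, s = 40, 100, 5
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[rng.choice(n, size=s, replace=False)] = rng.standard_normal(s)
y = A @ x_true + 0.01 * rng.standard_normal(m)

x_bagged = bagged_sparse_regression(A, y, ratio=0.8, K=30)  # reduced ratio L/m = 0.8
x_conventional = bagged_sparse_regression(A, y, ratio=1.0, K=30)  # conventional L/m = 1
```

Setting ratio=1.0 corresponds to the conventional Bagging choice L = m, so the two calls differ only in the sampling ratio. Which setting recovers x_true better on a given problem depends on the measurement level, the noise, and lam, so this sketch makes no claim about reproducing the reported gains.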

2019 53rd Annual Conference on Information Sciences and Systems (CISS)