Riemann-Lebesgue Forest for Regression
Tian Qin, Wei-Min Huang
Abstract
We propose a novel ensemble method called Riemann-Lebesgue Forest (RLF) for regression. The core idea in RLF is to mimic the way a measurable function is approximated by partitioning its range into a few intervals. With this idea in mind, we develop a new tree learner named Riemann-Lebesgue Tree (RLT), which has a chance to perform a Lebesgue-type cut, i.e., splitting a node on the response Y at certain non-terminal nodes. We show that the optimal Lebesgue-type cut yields a larger variance reduction in the response Y than the ordinary CART cut (Breiman, 1984), which is an analogue of a Riemann partition. This property benefits the ensemble step of RLF. We also establish the asymptotic normality of RLF under different parameter settings. Two one-dimensional examples are provided to illustrate the flexibility of RLF. The competitive performance of RLF against the original random forest (Breiman, 2001) is demonstrated by experiments on simulated data and real-world datasets.
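To make the contrast between the two cut types concrete, the following is a minimal sketch (not the paper's implementation) comparing a CART-style "Riemann" cut, which thresholds a feature x, with a "Lebesgue"-style cut, which thresholds the response y directly. Function names and the toy data are illustrative assumptions; both cuts are scored by the same weighted variance-reduction criterion.

```python
import numpy as np

def variance_reduction(y, mask):
    """Drop in variance of y from splitting samples into two groups by mask."""
    n = len(y)
    y_left, y_right = y[mask], y[~mask]
    if len(y_left) == 0 or len(y_right) == 0:
        return 0.0
    return (np.var(y)
            - (len(y_left) / n) * np.var(y_left)
            - (len(y_right) / n) * np.var(y_right))

def best_riemann_cut(x, y):
    """CART-style cut: search thresholds on the feature x."""
    best_vr, best_t = 0.0, None
    for t in np.unique(x)[:-1]:          # candidate thresholds on x
        vr = variance_reduction(y, x <= t)
        if vr > best_vr:
            best_vr, best_t = vr, t
    return best_vr, best_t

def best_lebesgue_cut(y):
    """Lebesgue-style cut: search thresholds on the response y itself."""
    best_vr, best_t = 0.0, None
    for t in np.unique(y)[:-1]:          # candidate thresholds on y
        vr = variance_reduction(y, y <= t)
        if vr > best_vr:
            best_vr, best_t = vr, t
    return best_vr, best_t

# Toy node: y alternates between low and high values as x increases,
# so no single x-threshold separates the two response levels cleanly.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([1.0, 5.0, 2.0, 6.0, 1.5, 5.5])

riemann_vr, _ = best_riemann_cut(x, y)
lebesgue_vr, _ = best_lebesgue_cut(y)
print(riemann_vr, lebesgue_vr)
```

Because the optimal two-group partition of a one-dimensional sample (in the variance-reduction sense) is contiguous in y, the best Lebesgue cut achieves at least as much variance reduction as any feature-based cut on this node, matching the variance-reduction claim in the abstract.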