technical paper
MAPTree: Beating “Optimal” Decision Trees with Bayesian Decision Trees
keywords:
interpretable ml
decision tree learning
bayesian learning
Decision trees remain one of the most popular machine learning models today due to their out-of-the-box performance and interpretability. In this work, we present MAPTree, a Bayesian approach to decision tree induction via maximum a posteriori (MAP) inference of a posterior distribution over trees. We first demonstrate a connection between maximum a posteriori inference of decision trees and AND/OR search. Using this connection, we propose an AND/OR search algorithm which is able to recover the MAP tree. Lastly, we demonstrate the empirical performance of the MAP tree both on synthetic data and in real world settings. On 16 real world datasets, MAPTree either outperforms baselines or demonstrates comparable performance but with much smaller trees. On synthetic tree-generated data, MAPTree also demonstrates greater robustness to noise and better generalization than existing approaches. Finally, MAPTree recovers the MAP tree orders of magnitude faster than existing sampling-based approaches and, in contrast with those algorithms, is able to provide a certificate of optimality.