HyperJump: Accelerating HyperBand via Risk Modelling
Pedro Mendes, Maria Casimiro, Paolo Romano and David Garlan.
In Proceedings of the 37th AAAI Conference on Artificial Intelligence, 2023.
Abstract
In the literature on hyper-parameter tuning, a number of recent solutions rely on low-fidelity observations (e.g., training with sub-sampled datasets) in order to efficiently identify promising configurations to be then tested via high-fidelity observations (e.g., using the full dataset). Among these, HyperBand is arguably one of the most popular solutions, due to its efficiency and theoretically provable robustness. In this work, we introduce HyperJump, a new approach that builds on HyperBand’s robust search strategy and complements it with novel model-based risk analysis techniques that accelerate the search by skipping the evaluation of low-risk configurations, i.e., configurations that are likely to be eventually discarded by HyperBand. We evaluate HyperJump on a suite of hyper-parameter optimization problems and show that it provides speed-ups of over one order of magnitude, both in sequential and parallel deployments, on a variety of deep-learning, kernel-based learning, and neural architecture search problems when compared to HyperBand and to several state-of-the-art optimizers.
Keywords: Machine Learning, Self-adaptation.
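The sketch below is a minimal, illustrative Python example (not the authors' implementation) of the idea summarized in the abstract: successive halving, the inner loop of HyperBand, extended with a risk-based check that skips evaluating configurations predicted to be discarded anyway. The objective function, the rank-based skip heuristic, and all constants are assumptions made purely for illustration.

import random

def evaluate(lr, budget):
    # Stand-in low-fidelity evaluation: a noisy loss whose noise shrinks as the
    # training budget grows; the (made-up) true optimum is lr = 0.1.
    return (lr - 0.1) ** 2 + random.gauss(0, 0.05 / budget)

def successive_halving_with_skips(learning_rates, min_budget=1, eta=3,
                                  rounds=3, safety_margin=2):
    # Successive halving (HyperBand's inner loop) with a toy risk check: at each
    # rung, only configurations whose rank at the previous budget leaves them a
    # realistic chance of surviving are re-evaluated at the larger budget; the
    # rest are treated as safe to skip and keep their lower-budget loss.
    budget = min_budget
    survivors = list(learning_rates)
    last_loss = {}
    for rung in range(rounds):
        keep = max(1, len(survivors) // eta)
        ranked = sorted(survivors, key=lambda c: last_loss.get(c, float("inf")))
        contenders = set(ranked[: keep * safety_margin])
        for lr in survivors:
            if rung > 0 and lr not in contenders:
                continue  # predicted to be discarded anyway: skip this evaluation
            last_loss[lr] = evaluate(lr, budget)
        survivors = sorted(survivors, key=lambda c: last_loss[c])[:keep]
        budget *= eta
    return survivors[0]

if __name__ == "__main__":
    random.seed(0)
    candidates = [random.uniform(0.001, 1.0) for _ in range(27)]
    print("best learning rate:", successive_halving_with_skips(candidates))

In HyperJump itself the skip decision is driven by a model-based risk analysis over the expected loss of mis-ranking configurations; the rank-based heuristic above is only a stand-in for that component.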
@InProceedings{AAAI23::HyperJump,
  AUTHOR    = {Mendes, Pedro and Casimiro, Maria and Romano, Paolo and Garlan, David},
  TITLE     = {HyperJump: Accelerating HyperBand via Risk Modelling},
  YEAR      = {2023},
  BOOKTITLE = {Proceedings of the 37th AAAI Conference on Artificial Intelligence},
  PDF       = {http://acme.able.cs.cmu.edu/pubs/uploads/pdf/hyperjump_AAAI23_CR.pdf},
  ABSTRACT  = {In the literature on hyper-parameter tuning, a number of recent solutions rely on low-fidelity observations (e.g., training with sub-sampled datasets) in order to efficiently identify promising configurations to be then tested via high-fidelity observations (e.g., using the full dataset). Among these, HyperBand is arguably one of the most popular solutions, due to its efficiency and theoretically provable robustness. In this work, we introduce HyperJump, a new approach that builds on HyperBand’s robust search strategy and complements it with novel model-based risk analysis techniques that accelerate the search by skipping the evaluation of low-risk configurations, i.e., configurations that are likely to be eventually discarded by HyperBand. We evaluate HyperJump on a suite of hyper-parameter optimization problems and show that it provides speed-ups of over one order of magnitude, both in sequential and parallel deployments, on a variety of deep-learning, kernel-based learning, and neural architecture search problems when compared to HyperBand and to several state-of-the-art optimizers.},
  KEYWORDS  = {Machine Learning, Self-adaptation}
}