Regularizers for Online Mirror Descent
This is a reference post on useful regularizers for Online Mirror Descent (OMD). It is by no means an exhaustive list; I have only included regularizers that I have come across in papers.
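As a quick reminder, OMD with a regularizer \(R\) over a convex set \(\mathcal{K}\) can be written (in one of several equivalent forms) as

\[
x_{t+1} = \arg\min_{x \in \mathcal{K}} \; \eta \langle g_t, x \rangle + D_R(x, x_t),
\qquad
D_R(x, y) = R(x) - R(y) - \langle \nabla R(y), x - y \rangle,
\]

where \(g_t\) is the loss (sub)gradient at round \(t\), \(\eta\) is the step size, and \(D_R\) is the Bregman divergence induced by \(R\).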
- Squared Euclidean Norm: OMD with this regularizer recovers Online Gradient Descent.
- Negative Entropy: On the probability simplex, this regularizer gives the Hedge algorithm (see the sketch after the list).
- Log barrier: Useful in deriving adaptive bandit bounds. First introduced in [1].
- Tsallis Entropy: Defined for \(q\in (0,1)\). When \(q=1/2\), OMD with this regularizer matches the \(\sqrt{TK}\) regret lower bound for the multi-armed bandit problem, and is also known to achieve best-of-both-worlds bounds [2].
- Hyperbolic Entropy: Introduced in [3]. This regularizer interpolates between OGD and Hedge as the parameter \(\beta\) varies.
- Negative von Neumann Entropy: When used on the spectrahedron, this regularizer gives the Matrix Multiplicative Weights algorithm [4].
- Log Determinant: Used in Bandit PCA [5].
- Matrix Tsallis Entropy: The matrix analogue of the Tsallis entropy, again with \(q\in (0,1)\). Used in [6] for an online-optimization proof of the Batson-Spielman-Srivastava spectral sparsification result.
- Soft Exploration: Used in [7] to resolve several open problems. Introduced the idea of “soft” exploration, as opposed to “forced” exploration of the form \((1-\gamma)x_t + \frac{\gamma}{n}\mathbf{1}\).
- \(\ell^n_p\) balls: Also used in [7], for \(p \in [1,2]\).
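For quick reference, standard forms of a few of the regularizers above (conventions differ across papers by signs and constant factors, so treat these as indicative rather than canonical):

\[
R(x) = \tfrac{1}{2}\|x\|_2^2, \qquad
R(x) = \sum_i x_i \log x_i, \qquad
R(x) = -\sum_i \log x_i, \qquad
R(x) = \frac{1 - \sum_i x_i^q}{1 - q},
\]

for the squared Euclidean norm, negative entropy, log barrier, and (negative) Tsallis entropy respectively. The matrix versions replace coordinate sums by traces, e.g. \(R(X) = \operatorname{Tr}(X \log X)\) for the negative von Neumann entropy and \(R(X) = -\log\det X\) for the log determinant.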
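Here is also a minimal sketch (my own, not taken from any of the papers above) of what the first two entries look like as concrete update rules; the function names and the toy loss vector are just for illustration:

```python
import numpy as np

def hedge_update(x, loss_grad, eta):
    """OMD step on the probability simplex with the negative-entropy
    regularizer R(x) = sum_i x_i log x_i; this is exactly the Hedge /
    exponential-weights update x_{t+1,i} ∝ x_{t,i} exp(-eta * g_{t,i})."""
    w = x * np.exp(-eta * loss_grad)
    return w / w.sum()

def ogd_update(x, loss_grad, eta):
    """OMD step with the squared Euclidean regularizer R(x) = 0.5*||x||^2
    on an unconstrained domain, i.e. plain online gradient descent
    (projection onto a constraint set is omitted here)."""
    return x - eta * loss_grad

# Toy usage: three experts, uniform start, one observed loss vector.
x = np.ones(3) / 3
g = np.array([0.2, 0.9, 0.5])
print(hedge_update(x, g, eta=0.5))
print(ogd_update(x, g, eta=0.1))
```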
References
1. Foster, Dylan J., et al. “Learning in games: Robustness of fast convergence.” Advances in Neural Information Processing Systems, 2016.
2. Zimmert, Julian, and Yevgeny Seldin. “An optimal algorithm for stochastic and adversarial bandits.” arXiv preprint arXiv:1807.07623 (2018).
3. Ghai, Udaya, Elad Hazan, and Yoram Singer. “Exponentiated gradient meets gradient descent.” arXiv preprint arXiv:1902.01903 (2019).
4. Arora, Sanjeev, Elad Hazan, and Satyen Kale. “The multiplicative weights update method: a meta-algorithm and applications.” Theory of Computing 8.1 (2012): 121–164.
5. Kotłowski, Wojciech, and Gergely Neu. “Bandit Principal Component Analysis.” Proceedings of Machine Learning Research, vol. 99, pp. 1–31, 2019 (32nd Annual Conference on Learning Theory).
6. Allen-Zhu, Zeyuan, Zhenyu Liao, and Lorenzo Orecchia. “Spectral sparsification and regret minimization beyond matrix multiplicative updates.” Proceedings of the Forty-Seventh Annual ACM Symposium on Theory of Computing, ACM, 2015.
7. Bubeck, Sébastien, Michael B. Cohen, and Yuanzhi Li. “Sparsity, variance and curvature in multi-armed bandits.” arXiv preprint arXiv:1711.01037 (2017).
Written on July 21, 2019