Online and Game Theoretic Learning: December 2011

Friday, December 2, 2011

Regret Bound by Variation for Online Convex Optimization

Recently, we tried to derive variation-based regret bounds for the general Online Convex Optimization (OCO) problem. Chapter 2 of Prediction, Learning, and Games shows three different ways to express a regret bound: in terms of the number of rounds T, the loss of the best expert, and the variation of the cost vectors. In Extracting certainty from uncertainty: regret bounded by variation in costs, the authors extend the result to online linear optimization and show that its regret can be bounded by the total variation of the cost vectors. In our work, we extend this result to general OCO. We first analyze the limitations of the algorithm in Extracting certainty from uncertainty: regret bounded by variation in costs when it is applied to online convex optimization. We then present two algorithms for online convex optimization whose regret is bounded by the variation of the cost functions. Finally, we consider the bandit setting and present a randomized algorithm for online bandit convex optimization with a variation-based regret bound. We show that this bound is optimal when the variation of the cost functions is independent of the number of trials.
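To make the two quantities concrete, here is a minimal sketch for the online linear case. It assumes the decision set is the unit Euclidean ball and that "variation" means the sum of squared distances of the cost vectors from their mean. Neither assumption is spelled out in this post, and the function names are hypothetical.

```python
import numpy as np

def total_variation(costs):
    """Sum of squared distances of each cost vector from the mean cost vector.

    Assumed definition for illustration; the post does not state it.
    """
    costs = np.asarray(costs, dtype=float)
    mean = costs.mean(axis=0)
    return float(np.sum((costs - mean) ** 2))

def regret_on_unit_ball(costs, decisions):
    """Regret of the played decisions against the best fixed point in the unit ball.

    For linear costs <f_t, x>, the best fixed point in the unit Euclidean ball
    is -sum(f_t) / ||sum(f_t)||, and its cumulative loss is -||sum(f_t)||.
    """
    costs = np.asarray(costs, dtype=float)
    decisions = np.asarray(decisions, dtype=float)
    incurred = float(np.sum(costs * decisions))
    best_fixed = -float(np.linalg.norm(costs.sum(axis=0)))
    return incurred - best_fixed
```

When every round has the same cost vector, the variation is zero, so a variation-based bound is far tighter than a bound that grows with T.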

Online Convex Optimization without Projection

In this post I would like to briefly discuss our recent work in Online Convex Optimization (OCO), namely OCO with Long Term Constraints. Classical gradient descent techniques usually require a projection step in each iteration to get back to the feasible domain. For simple convex domains such as the Euclidean ball and the simplex, projection is a well-studied problem with efficient algorithms. For a variety of applications, however, it is a non-trivial and computationally burdensome step. One prominent example is semidefinite optimization, where projecting an arbitrary symmetric matrix back onto the PSD matrices requires a complete eigenvalue decomposition. Compared with the linear-time update of first-order gradient methods, the projection dominates the cost of each iteration, so it is worth trying to avoid it. In OCO this matters even more, since each step is expected to run efficiently. We tackled this by posing a new OCO problem. The intuition is to replace the complex convex domain with a simpler one that has an efficient projection algorithm, and to see how this affects the overall performance of the algorithm. The solution we proposed seems interesting in its own right, and it leaves some open problems that need thorough investigation.
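The cost gap between the two kinds of projection can be sketched as follows. This is only an illustration of the projection step itself, not the algorithm from our work. The function names are hypothetical, and NumPy is assumed.

```python
import numpy as np

def project_onto_ball(x, radius=1.0):
    """Euclidean projection onto a ball: one norm computation and a rescale."""
    x = np.asarray(x, dtype=float)
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)

def project_onto_psd(matrix):
    """Frobenius-norm projection of a matrix onto the PSD cone.

    Requires a full eigendecomposition of the symmetrized matrix, followed by
    clipping the negative eigenvalues to zero.
    """
    matrix = np.asarray(matrix, dtype=float)
    sym = (matrix + matrix.T) / 2.0
    eigvals, eigvecs = np.linalg.eigh(sym)
    clipped = np.clip(eigvals, 0.0, None)
    # Rebuild V * diag(clipped) * V^T from the clipped spectrum.
    return (eigvecs * clipped) @ eigvecs.T
```

The ball projection is linear in the dimension, while the eigendecomposition behind the PSD projection is roughly cubic in the matrix size for dense matrices. This is the per-round overhead that motivates avoiding the projection.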