The Gain from Ordering in Online Learning.

NeurIPS(2023)

引用 0|浏览12
暂无评分
摘要
We study fixed-design online learning where the learner is allowed to choose the order of the datapoints in order to minimize their regret (aka self-directed online learning). We focus on the fundamental task of online linear regression: the learner is given a dataset $X$ with $n$ examples in $d$ dimensions and at step $t$ they select a point $x_t \in X$, predict a value $\widetilde y_t$, and suffer loss $(\widetilde y_t - w^\ast \cdot x_t)^2$. The goal is to design algorithms that order the examples and achieve better regret than random- or worst-order online algorithms. For an arbitrary dataset $X$, we show that, under the Exponential Time Hypothesis, no efficient algorithm can approximate the optimal (best-order) regret within a factor of $d^{1/\poly(\log \log d)}$. We then show that, for structured datasets, we can bypass the above hardness result and achieve nearly optimal regret. When the examples of $X$ are drawn i.i.d.\ from the uniform distribution on the sphere, we present an algorithm based on the greedy heuristic of selecting ``easiest'' examples first that achieves a $\log d$-approximation of the optimal regret.
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要