
This paper presents universal algorithms for clustering problems, including the widely studied $k$-median, $k$-means, and $k$-center objectives. The input is a metric space containing all potential client locations. The algorithm must select $k$ cluster centers such that they are a good solution for any subset of clients that is actually realized. Specifically, we aim for low regret, defined as the maximum over all subsets of the difference between the cost of the algorithm's solution and that of an optimal solution. A universal algorithm's solution $SOL$ for a clustering problem is said to be an $(\alpha, \beta)$-approximation if for all subsets of clients $C'$, it satisfies $SOL(C') \leq \alpha \cdot OPT(C') + \beta \cdot MR$, where $OPT(C')$ is the cost of the optimal solution for clients $C'$ and $MR$ is the minimum regret achievable by any solution. Our main results are universal algorithms for the standard clustering objectives of $k$-median, $k$-means, and $k$-center that achieve $(O(1), O(1))$-approximations. These results are obtained via a novel framework for universal algorithms using linear programming (LP) relaxations. These results generalize to other $\ell_p$-objectives and to the setting where some subset of the clients is fixed. We also give hardness results showing that $(\alpha, \beta)$-approximation is NP-hard if $\alpha$ or $\beta$ is at most a certain constant, even for the widely studied special case of Euclidean metric spaces. This shows that in some sense, $(O(1), O(1))$-approximation is the strongest type of guarantee obtainable for universal clustering.
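To make the regret definition concrete, the following minimal Python sketch brute-forces $SOL(C') - OPT(C')$ over all client subsets of a tiny instance for the $k$-median objective. It is not the LP-based algorithm of the paper: the function names are hypothetical, centers are assumed to be chosen among the input points, and the enumeration is exponential, so it is only usable on toy inputs.

```python
from itertools import combinations

def cost(centers, clients, dist):
    """k-median cost: every client pays the distance to its nearest center."""
    return sum(min(dist(c, s) for s in centers) for c in clients)

def opt_cost(points, clients, k, dist):
    """OPT(C'): best cost over all k-subsets of points (assumption: centers are input points)."""
    return min(cost(S, clients, dist) for S in combinations(points, k))

def regret(centers, points, k, dist):
    """Maximum over non-empty client subsets C' of SOL(C') - OPT(C')."""
    worst = 0
    for r in range(1, len(points) + 1):
        for sub in combinations(points, r):
            worst = max(worst, cost(centers, sub, dist) - opt_cost(points, sub, k, dist))
    return worst

def min_regret(points, k, dist):
    """MR: the smallest regret achievable by any choice of k centers."""
    return min(regret(S, points, k, dist) for S in combinations(points, k))

if __name__ == "__main__":
    pts = (0, 1, 10)                      # three points on a line
    d = lambda p, q: abs(p - q)
    print(regret((1,), pts, 1, d))        # 9: the lone client at 10 is served from 1
    print(min_regret(pts, 1, d))          # 9: no single center does better here
```

On this instance the center at 1 attains the minimum regret, so it satisfies the $(\alpha, \beta)$ guarantee with $\alpha = 0$ and $\beta = 1$.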

Related content

CASE · Extensibility · Optimizer · Online · Ranking

September 16, 2021

Randomized Online Algorithms for Adwords

Vijay V. Vazirani

from arxiv, 22 pages

The general adwords problem has remained largely unresolved. We define a subcase called $k$-TYPICAL, $k \in \mathbb{Z}_+$, as follows: the total budget of all the bidders is sufficient to buy $k$ bids for each bidder. This seems a reasonable assumption for a "typical" instance, at least for moderate values of $k$. We give a randomized online algorithm, achieving a competitive ratio of $1 - \frac{1}{e} - \frac{1}{k}$, for this problem. We also give randomized online algorithms for other special cases of adwords. Another subcase, when bids are small compared to budgets, has been of considerable practical significance in ad auctions [MSVV]. For this case, we give an optimal randomized online algorithm achieving a competitive ratio of $1 - \frac{1}{e}$. Previous algorithms for this case were based on LP-duality; the impact of our new approach remains to be seen. The key to these results is a simplification of the proof for RANKING, the optimal algorithm for online bipartite matching, given in [KVV]. Our algorithms for adwords can be seen as natural extensions of RANKING.
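For readers unfamiliar with RANKING, here is a minimal Python sketch of the algorithm as it is commonly described for online bipartite matching: draw one random permutation of the offline vertices up front, then match each arriving vertex to its highest-ranked free neighbor. This illustrates only the classical matching algorithm, not the adwords extensions of the paper; the function name and input format are assumptions.

```python
import random

def ranking(offline, arrivals, seed=None):
    """RANKING sketch for online bipartite matching.

    offline  : list of offline vertices (known in advance)
    arrivals : list of (online_vertex, neighbors) pairs in arrival order
    Returns a dict mapping each matched offline vertex to its online partner.
    """
    rng = random.Random(seed)
    order = list(offline)
    rng.shuffle(order)                       # one random permutation, fixed for the whole run
    rank = {u: i for i, u in enumerate(order)}
    matched = {}
    for v, neighbors in arrivals:
        free = [u for u in neighbors if u not in matched]
        if free:
            # match v to the free neighbor that comes first in the permutation
            matched[min(free, key=rank.__getitem__)] = v
    return matched

if __name__ == "__main__":
    arrivals = [("a", ["x", "y"]), ("b", ["x"])]
    print(ranking(["x", "y"], arrivals, seed=1))
```

In this example the matching has size 2 when `y` is ranked ahead of `x` and size 1 otherwise, which is the kind of randomness the competitive analysis averages over.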

Learned · Graph · Labeled · Examples · Unlabeled

September 16, 2021

Data driven algorithms for limited labeled data learning

Maria-Florina Balcan, Dravyansh Sharma

from arxiv, 33 pages, 11 figures

We consider a novel data driven approach for designing learning algorithms that can effectively learn with only a small number of labeled examples. This is crucial for modern machine learning applications where labels are scarce or expensive to obtain. We focus on graph-based techniques, where the unlabeled examples are connected in a graph under the implicit assumption that similar nodes likely have similar labels. Over the past decades, several elegant graph-based semi-supervised and active learning algorithms for how to infer the labels of the unlabeled examples given the graph and a few labeled examples have been proposed. However, the problem of how to create the graph (which impacts the practical usefulness of these methods significantly) has been relegated to domain-specific art and heuristics and no general principles have been proposed. In this work we present a novel data driven approach for learning the graph and provide strong formal guarantees in both the distributional and online learning formalizations. We show how to leverage problem instances coming from an underlying problem domain to learn the graph hyperparameters from commonly used parametric families of graphs that perform well on new instances coming from the same domain. We obtain low regret and efficient algorithms in the online setting, and generalization guarantees in the distributional setting. We also show how to combine several very different similarity metrics and learn multiple hyperparameters, providing general techniques to apply to large classes of problems. We expect some of the tools and techniques we develop along the way to be of interest beyond semi-supervised and active learning, for data driven algorithms for combinatorial problems more generally.

Optimizer · Partition · Scenario · Design · Algorithms and Data Structures

September 16, 2021

A Quadratic Time Locally Optimal Algorithm for NP-hard Equal Cardinality Partition Optimization

Kaan Gokcesu, Hakan Gokcesu

We study the optimization version of the equal cardinality set partition problem (where the absolute difference between the sums of the two equal-sized partitions is minimized). While this problem is NP-hard and requires exponential complexity to solve in general, we have formulated a weaker version of this NP-hard problem, where the goal is to find a locally optimal solution. The local optimality considered in our work is with respect to any swap of a pair of elements between the two opposing partitions. To this end, we designed an algorithm which can produce such a locally optimal solution in $O(N^2)$ time and $O(N)$ space. Our approach does not require positive or integer inputs and works equally well under arbitrary input precisions. Thus, it is widely applicable in different problem scenarios.
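The notion of swap-local optimality can be illustrated with a naive local search in Python: start from an arbitrary equal split and keep exchanging one element from each side while the absolute difference of sums strictly decreases. This is only a sketch of the optimality condition, with a hypothetical function name; it is not the authors' algorithm and does not achieve their $O(N^2)$ time bound.

```python
def swap_local_opt(values):
    """Return two equal-sized lists such that no single pairwise swap lowers |sum(a) - sum(b)|."""
    n = len(values)
    if n % 2:
        raise ValueError("need an even number of elements")
    a, b = list(values[: n // 2]), list(values[n // 2 :])
    improved = True
    while improved:
        improved = False
        diff = sum(a) - sum(b)
        for i in range(len(a)):
            for j in range(len(b)):
                # swapping a[i] and b[j] changes the difference by -2 * (a[i] - b[j])
                new_diff = diff - 2 * (a[i] - b[j])
                if abs(new_diff) < abs(diff):
                    a[i], b[j] = b[j], a[i]
                    improved = True
                    break
            if improved:
                break
    return a, b

if __name__ == "__main__":
    a, b = swap_local_opt([1, 2, 3, 4, 5, 6])
    print(a, b, abs(sum(a) - sum(b)))
```

The loop terminates because every accepted swap strictly reduces the absolute difference and there are finitely many partitions.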

Networks · Networking · Node · Optimizer · Labeled

September 16, 2021

Optimal Space Lower Bound for Deterministic Self-Stabilizing Leader Election Algorithms

Lélia Blin, Laurent Feuilloley, Gabriel Le Bouder

from arxiv, The paper has been rewritten. It appeared on arXiv, and as a brief announcement at DISC 2019, under the name "Memory Lower Bounds for Self-Stabilization"

Given a boolean predicate $\Pi$ on labeled networks (e.g., proper coloring, leader election, etc.), a self-stabilizing algorithm for $\Pi$ is a distributed algorithm that can start from any initial configuration of the network (i.e., every node has an arbitrary value assigned to each of its variables), and eventually converge to a configuration satisfying $\Pi$. It is known that leader election does not have a deterministic self-stabilizing algorithm using a constant-size register at each node, i.e., for some networks, some of their nodes must have registers whose sizes grow with the size $n$ of the networks. On the other hand, it is also known that leader election can be solved by a deterministic self-stabilizing algorithm using registers of $O(\log \log n)$ bits per node in any $n$-node bounded-degree network. We show that this latter space complexity is optimal. Specifically, we prove that every deterministic self-stabilizing algorithm solving leader election must use registers of $\Omega(\log \log n)$ bits per node in some $n$-node networks. In addition, we show that our lower bounds go beyond leader election, and apply to all problems that cannot be solved by anonymous algorithms.

Integration · Discretization · Conjugate · CASE · GROUP

September 15, 2021

Applying splitting methods with complex coefficients to the numerical integration of unitary problems

S. Blanes, F. Casas, A. Escorihuela-Tomàs

from arxiv, 18 pages, 7 figures. To be published in Journal of Computational Dynamics

We explore the applicability of splitting methods involving complex coefficients to solve numerically the time-dependent Schrödinger equation. We prove that a particular class of integrators is conjugate to unitary methods for sufficiently small step sizes when applied to problems defined in the group $\mathrm{SU}(2)$. In the general case, the error in both the energy and the norm of the numerical approximation provided by these methods does not possess a secular component over long time intervals, when combined with pseudo-spectral discretization techniques in space.

Approximation · Independent · Input distribution · Scenario · Identically distributed

September 14, 2021

Probabilistic Analysis of Euclidean Capacitated Vehicle Routing

Claire Mathieu, Hang Zhou

We give a probabilistic analysis of the unit-demand Euclidean capacitated vehicle routing problem in the random setting, where the input distribution consists of $n$ unit-demand customers modeled as independent, identically distributed uniform random points in the two-dimensional plane. The objective is to visit every customer using a set of routes of minimum total length, such that each route visits at most $k$ customers, where $k$ is the capacity of a vehicle. All of the following results are in the random setting and hold asymptotically almost surely. The best known polynomial-time approximation for this problem is the iterated tour partitioning (ITP) algorithm, introduced in 1985 by Haimovich and Rinnooy Kan. They showed that the ITP algorithm is near-optimal when $k$ is either $o(\sqrt{n})$ or $\omega(\sqrt{n})$, and they asked whether the ITP algorithm was also effective in the intermediate range. In this work, we show that when $k=\sqrt{n}$, the ITP algorithm is at best a $(1+c_0)$-approximation for some positive constant $c_0$. On the other hand, the approximation ratio of the ITP algorithm was known to be at most $0.995+\alpha$ due to Bompadre, Dror, and Orlin, where $\alpha$ is the approximation ratio of an algorithm for the traveling salesman problem. In this work, we improve the upper bound on the approximation ratio of the ITP algorithm to $0.915+\alpha$. Our analysis is based on a new lower bound on the optimal cost for the metric capacitated vehicle routing problem, which may be of independent interest.
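As a rough illustration of iterated tour partitioning, the Python sketch below takes an already computed tour over the customers, cuts it into consecutive segments of at most $k$ customers, connects each segment to the depot, and keeps the best of the first $k$ rotations of the tour. This is a simplified variant under stated assumptions: the tour is supplied by the caller rather than produced by a TSP algorithm, and the function names are hypothetical.

```python
import math

def route_length(depot, stops):
    """Length of the closed route depot -> stops in order -> depot."""
    path = [depot, *stops, depot]
    return sum(math.dist(p, q) for p, q in zip(path, path[1:]))

def iterated_tour_partitioning(depot, tour, k):
    """Try the first k rotations of the cyclic tour; split each into chunks of at most k customers."""
    n = len(tour)
    best_len, best_routes = math.inf, []
    for shift in range(min(k, n)):
        rotated = tour[shift:] + tour[:shift]
        routes = [rotated[i:i + k] for i in range(0, n, k)]
        total = sum(route_length(depot, r) for r in routes)
        if total < best_len:
            best_len, best_routes = total, routes
    return best_len, best_routes

if __name__ == "__main__":
    depot = (0.0, 0.0)
    tour = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
    print(iterated_tour_partitioning(depot, tour, k=2))
```

Each returned route visits at most `k` customers, matching the capacity constraint in the abstract; the quality of the result depends on the quality of the input tour.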

Cluster · Column · Minimum point · Local minimum · Global minimum

September 14, 2021

Biclustering with Alternating K-Means

Nicolas Fraiman, Zichao Li

Biclustering is the task of simultaneously clustering the rows and columns of the data matrix into different subgroups such that the rows and columns within a subgroup exhibit similar patterns. In this paper, we consider the case of producing block-diagonal biclusters. We provide a new formulation of the biclustering problem based on the idea of minimizing the empirical clustering risk. We develop and prove a consistency result with respect to the empirical clustering risk. Since the optimization problem is combinatorial in nature, finding the global minimum is computationally intractable. In light of this fact, we propose a simple and novel algorithm that finds a local minimum by alternating the use of an adapted version of the k-means clustering algorithm between columns and rows. We evaluate and compare the performance of our algorithm to other related biclustering methods on both simulated data and real-world gene expression data sets. The results demonstrate that our algorithm is able to detect meaningful structures in the data and outperform other competing biclustering methods in various settings and situations.

Linear · Weight · Row · Singular vector · Likelihood

September 14, 2021

A Weighted Randomized Kaczmarz Method for Solving Linear Systems

Stefan Steinerberger

The Kaczmarz method for solving a linear system $Ax = b$ interprets such a system as a collection of equations $\left\langle a_i, x\right\rangle = b_i$, where $a_i$ is the $i$-th row of $A$, then picks such an equation and corrects $x_{k+1} = x_k + \lambda a_i$ where $\lambda$ is chosen so that the $i$-th equation is satisfied. Convergence rates are difficult to establish. Assuming the rows to be normalized, $\|a_i\|_{\ell^2}=1$, Strohmer and Vershynin established that if the order of equations is chosen at random, $\mathbb{E}~ \|x_k - x\|_{\ell^2}$ converges exponentially. We prove that if the $i$-th row is selected with likelihood proportional to $\left|\left\langle a_i, x_k \right\rangle - b_i\right|^{p}$, where $0<p<\infty$, then $\mathbb{E}~\|x_k - x\|_{\ell^2}$ converges faster than the purely random method. As $p \rightarrow \infty$, the method de-randomizes and explains, among other things, why the maximal correction method works well. We empirically observe that the method computes approximations of small singular vectors of $A$ as a byproduct.
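The update and the residual-weighted row selection described above can be sketched in a few lines of dependency-free Python. The function name, the default parameters, and the zero starting vector are assumptions for illustration; $\lambda$ is computed as $(b_i - \langle a_i, x_k\rangle)/\|a_i\|^2$, which satisfies the $i$-th equation exactly and reduces to the residual itself when rows are normalized.

```python
import random

def weighted_kaczmarz(A, b, p=2.0, iters=200, seed=0):
    """Kaczmarz iteration where row i is picked with probability proportional to |<a_i, x> - b_i|^p."""
    rng = random.Random(seed)
    x = [0.0] * len(A[0])
    for _ in range(iters):
        residuals = [sum(a * xj for a, xj in zip(row, x)) - bi for row, bi in zip(A, b)]
        weights = [abs(r) ** p for r in residuals]
        if sum(weights) == 0.0:
            break                                   # every equation is already satisfied
        i = rng.choices(range(len(A)), weights=weights)[0]
        row = A[i]
        lam = -residuals[i] / sum(a * a for a in row)   # makes equation i hold exactly
        x = [xj + lam * a for xj, a in zip(x, row)]
    return x

if __name__ == "__main__":
    A = [[1.0, 0.0], [0.6, 0.8]]    # rows have unit Euclidean norm
    b = [1.0, 2.2]                  # consistent system with solution (1, 2)
    print(weighted_kaczmarz(A, b))
```

A row whose equation already holds has weight zero and is never selected, which is one intuitive reason the weighted rule wastes fewer steps than uniform sampling.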

Variance reduction · Estimation/Estimator · Confidence · Optimizer · Learned

April 25, 2018

Variance Reduction Methods for Sublinear Reinforcement Learning

Sham Kakade, Mengdi Wang, Lin F. Yang

from arxiv, Fixed a bug in a previous version

This work considers the problem of provably optimal reinforcement learning for episodic finite horizon MDPs, i.e. how an agent learns to maximize its long term reward in an uncertain environment. The main contribution is in providing a novel algorithm, Variance-reduced Upper Confidence Q-learning (vUCQ), which enjoys a regret bound of $\widetilde{O}(\sqrt{HSAT} + H^5SA)$, where $T$ is the number of time steps the agent acts in the MDP, $S$ is the number of states, $A$ is the number of actions, and $H$ is the (episodic) horizon time. This is the first regret bound that is both sub-linear in the model size and asymptotically optimal. The algorithm is sub-linear in that the time to achieve $\epsilon$-average regret for any constant $\epsilon$ is $O(SA)$, which is a number of samples that is far less than that required to learn any non-trivial estimate of the transition model (the transition model is specified by $O(S^2A)$ parameters). The importance of sub-linear algorithms is largely the motivation for algorithms such as $Q$-learning and other "model free" approaches. The vUCQ algorithm also enjoys minimax optimal regret in the long run, matching the $\Omega(\sqrt{HSAT})$ lower bound. vUCQ is a successive refinement method in which the algorithm reduces the variance in $Q$-value estimates and couples this estimation scheme with an upper confidence based algorithm. Technically, the coupling of these two techniques is what leads to the algorithm enjoying both the sub-linear regret property and the asymptotically optimal regret.

ReQuEST · Graph · Network embedding · Networking · Width

March 12, 2018

(FPT-)Approximation Algorithms for the Virtual Network Embedding Problem

Matthias Rost, Stefan Schmid

Many resource allocation problems in the cloud can be described as a basic Virtual Network Embedding Problem (VNEP): finding mappings of request graphs (describing the workloads) onto a substrate graph (describing the physical infrastructure). In the offline setting, the two natural objectives are profit maximization, i.e., embedding a maximal number of request graphs subject to the resource constraints, and cost minimization, i.e., embedding all requests at minimal overall cost. The VNEP can be seen as a generalization of classic routing and call admission problems, in which requests are arbitrary graphs whose communication endpoints are not fixed. Due to its applications, the problem has been studied intensively in the networking community. However, the underlying algorithmic problem is poorly understood. This paper presents the first fixed-parameter tractable approximation algorithms for the VNEP. Our algorithms are based on randomized rounding. Due to the flexible mapping options and the arbitrary request graph topologies, we show that a novel linear program formulation is required. Only this novel formulation enables the computation of convex combinations of valid mappings, as the formulation needs to account for the structure of the request graphs. Accordingly, to capture the structure of request graphs, we introduce the graph-theoretic notion of extraction orders and extraction width and show that our algorithms have exponential runtime in the request graphs' maximal width. Hence, for request graphs of fixed extraction width, we obtain the first polynomial-time approximations. Studying the new notion of extraction orders, we show that (i) computing extraction orders of minimal width is NP-hard and (ii) computing decomposable LP solutions is in general NP-hard, even when restricting request graphs to planar ones.