
An oblivious subspace embedding (OSE), characterized by parameters $m,n,d,\epsilon,\delta$, is a random matrix $\Pi\in \mathbb{R}^{m\times n}$ such that for any $d$-dimensional subspace $T\subseteq \mathbb{R}^n$, $\Pr_\Pi[\forall x\in T, (1-\epsilon)\|x\|_2 \leq \|\Pi x\|_2\leq (1+\epsilon)\|x\|_2] \geq 1-\delta$. For $\epsilon$ and $\delta$ at most a small constant, we show that any OSE with one nonzero entry in each column must satisfy $m = \Omega(d^2/(\epsilon^2\delta))$, establishing the optimality of the classical Count-Sketch matrix. When an OSE has $1/(9\epsilon)$ nonzero entries in each column, we show that $m = \Omega(\epsilon^{O(\delta)} d^2)$, improving on the previous $\Omega(\epsilon^2 d^2)$ lower bound due to Nelson and Nguyen (ICALP 2014).
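For concreteness, here is a minimal numerical sketch (all parameter choices are illustrative, not the paper's) of the Count-Sketch construction the lower bound addresses: each column of $\Pi$ has a single random $\pm 1$ entry, and $\Pi$ embeds a subspace exactly when all singular values of $\Pi U$ lie in $[1-\epsilon,1+\epsilon]$ for an orthonormal basis $U$ of the subspace.

```python
# A minimal sketch (parameters illustrative only) of a Count-Sketch matrix --
# one nonzero +/-1 entry per column -- applied to a random d-dim subspace.
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 10_000, 10, 4_000          # m scales as d^2/(eps^2 delta) in theory

# Count-Sketch: column j has a single +/-1 in a uniformly random row.
rows = rng.integers(0, m, size=n)
signs = rng.choice([-1.0, 1.0], size=n)

def apply_pi(x):
    """Apply the sparse sketch to a vector without materializing Pi."""
    out = np.zeros(m)
    np.add.at(out, rows, signs * x)
    return out

# Orthonormal basis U of a random d-dimensional subspace of R^n.
U, _ = np.linalg.qr(rng.standard_normal((n, d)))
PiU = np.stack([apply_pi(U[:, j]) for j in range(d)], axis=1)

# Pi embeds the subspace iff every singular value of Pi U is in [1-eps, 1+eps].
s = np.linalg.svd(PiU, compute_uv=False)
print("singular values of Pi U lie in [%.3f, %.3f]" % (s.min(), s.max()))
```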

Related content

We describe a Lanczos-based algorithm for approximating the product of a rational matrix function with a vector. This algorithm, which we call the Lanczos method for optimal rational matrix function approximation (Lanczos-OR), returns the optimal approximation from a Krylov subspace in a norm induced by the rational function's denominator, and can be computed using the information from a slightly larger Krylov subspace. We also provide a low-memory implementation which only requires storing a number of vectors proportional to the denominator degree of the rational function. Finally, we show that Lanczos-OR can also be used to derive algorithms for computing other matrix functions, including the matrix sign function and quadrature-based rational function approximations. In many cases, it improves on the approximation quality of prior approaches, including the standard Lanczos method, with little additional computational overhead.
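As a point of reference, below is a sketch of the standard Lanczos method for $f(A)b$ that Lanczos-OR improves on; the OR variant itself requires the paper's denominator-induced norm and is not reproduced here. The test function and problem sizes are illustrative assumptions.

```python
# Standard Lanczos approximation of f(A)b from a k-dim Krylov subspace:
# f(A) b ~ ||b|| * Q f(T) e_1, with (Q, T) from the Lanczos recurrence.
import numpy as np

def lanczos_fA_b(A, b, k, f):
    n = len(b)
    Q = np.zeros((n, k))
    alpha, beta = np.zeros(k), np.zeros(k - 1)
    Q[:, 0] = b / np.linalg.norm(b)
    q_prev = np.zeros(n)
    for j in range(k):
        w = A @ Q[:, j] - (beta[j - 1] * q_prev if j > 0 else 0)
        alpha[j] = Q[:, j] @ w
        w -= alpha[j] * Q[:, j]
        w -= Q[:, :j + 1] @ (Q[:, :j + 1].T @ w)   # full reorthogonalization
        if j < k - 1:
            beta[j] = np.linalg.norm(w)
            q_prev = Q[:, j]
            Q[:, j + 1] = w / beta[j]
    T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    evals, evecs = np.linalg.eigh(T)
    fT = evecs @ np.diag(f(evals)) @ evecs.T
    return np.linalg.norm(b) * (Q @ fT[:, 0])      # ||b|| * Q f(T) e_1

rng = np.random.default_rng(1)
M = rng.standard_normal((200, 200))
A = M @ M.T / 200 + np.eye(200)                    # symmetric positive definite
b = rng.standard_normal(200)
f = lambda x: 1.0 / (x + 0.5)                      # a simple rational function
approx = lanczos_fA_b(A, b, 30, f)
exact = np.linalg.solve(A + 0.5 * np.eye(200), b)
print("relative error:", np.linalg.norm(approx - exact) / np.linalg.norm(exact))
```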

This paper studies the problem of matching two complete graphs with edge weights correlated through latent geometries, extending a recent line of research on random graph matching with independent edge weights to geometric models. Specifically, given a random permutation $\pi^*$ on $[n]$ and $n$ iid pairs of correlated Gaussian vectors $\{X_{\pi^*(i)}, Y_i\}$ in $\mathbb{R}^d$ with noise parameter $\sigma$, the edge weights are given by $A_{ij}=\kappa(X_i,X_j)$ and $B_{ij}=\kappa(Y_i,Y_j)$ for some link function $\kappa$. The goal is to recover the hidden vertex correspondence $\pi^*$ based on the observation of $A$ and $B$. We focus on the dot-product model with $\kappa(x,y)=\langle x, y \rangle$ and Euclidean distance model with $\kappa(x,y)=\|x-y\|^2$, in the low-dimensional regime of $d=o(\log n)$ wherein the underlying geometric structures are most evident. We derive an approximate maximum likelihood estimator, which provably achieves, with high probability, perfect recovery of $\pi^*$ when $\sigma=o(n^{-2/d})$ and almost perfect recovery with a vanishing fraction of errors when $\sigma=o(n^{-1/d})$. Furthermore, these conditions are shown to be information-theoretically optimal even when the latent coordinates $\{X_i\}$ and $\{Y_i\}$ are observed, complementing the recent results of [DCK19] and [KNW22] in geometric models of the planted bipartite matching problem. As a side discovery, we show that the celebrated spectral algorithm of [Ume88] emerges as a further approximation to the maximum likelihood in the geometric model.
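The following toy simulation of the dot-product model pairs the observation model with a simplified spectral matching in the spirit of [Ume88] (aligning top eigenvectors up to sign, then solving a linear assignment). The dimensions, noise level, and use of only the top $d$ eigenvectors are illustrative assumptions, not the paper's estimator.

```python
# Simulate the dot-product model and recover pi* with a simplified
# Umeyama-flavored spectral matching (top-d eigenvectors, sign-stripped).
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

rng = np.random.default_rng(2)
n, d, sigma = 200, 2, 1e-3

X = rng.standard_normal((n, d))
perm = rng.permutation(n)                         # ground-truth pi*
Y = X[perm] + sigma * rng.standard_normal((n, d)) # correlated Gaussian pairs

A = X @ X.T                                       # A_ij = <X_i, X_j>
B = Y @ Y.T                                       # B_ij = <Y_i, Y_j>

# Top-d eigenvectors (the Gram matrices have rank d); abs removes sign ambiguity.
_, UA = np.linalg.eigh(A)
_, UB = np.linalg.eigh(B)
dist = cdist(np.abs(UB[:, -d:]), np.abs(UA[:, -d:]))
row, col = linear_sum_assignment(dist)            # minimize total mismatch
print("fraction of vertices recovered:", np.mean(col == perm))
```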

In this paper, we consider the problem of black-box optimization using Gaussian Process (GP) bandit optimization with a small number of batches. Assuming the unknown function has a low norm in the Reproducing Kernel Hilbert Space (RKHS), we introduce a batch algorithm inspired by batched finite-arm bandit algorithms, and show that it achieves the cumulative regret upper bound $O^\ast(\sqrt{T\gamma_T})$ using $O(\log\log T)$ batches within time horizon $T$, where the $O^\ast(\cdot)$ notation hides dimension-independent logarithmic factors and $\gamma_T$ is the maximum information gain associated with the kernel. This bound is near-optimal for several kernels of interest and improves on the typical $O^\ast(\sqrt{T}\gamma_T)$ bound, and our approach is arguably the simplest among algorithms attaining this improvement. In addition, in the case of a constant number of batches (not depending on $T$), we propose a modified version of our algorithm, and characterize how the regret is impacted by the number of batches, focusing on the squared exponential and Mat\'ern kernels. The algorithmic upper bounds are shown to be nearly minimax optimal via analogous algorithm-independent lower bounds.
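To make the setting concrete, here is a generic batched GP-UCB-style loop on a 1-D toy function. It only illustrates the constraint that a whole batch is chosen before any of its rewards are observed; the kernel, batch sizes, and confidence width are assumptions, and the paper's algorithm and its $O(\log\log T)$ batch schedule differ.

```python
# A minimal batched GP-UCB-style loop (illustrative only, not the paper's
# algorithm): fit a GP posterior, pick a whole batch by UCB, then observe.
import numpy as np

rng = np.random.default_rng(3)
f = lambda x: np.sin(3 * x) + 0.5 * np.cos(7 * x)   # unknown RKHS function
grid = np.linspace(0, 1, 200)
noise, beta = 0.1, 2.0                               # obs. noise, UCB width

def rbf(a, b, ls=0.15):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls ** 2)

X, y = np.empty(0), np.empty(0)
for batch in range(4):                               # a fixed batch budget
    if len(X) == 0:
        mu, sd = np.zeros_like(grid), np.ones_like(grid)
    else:
        K = rbf(X, X) + noise ** 2 * np.eye(len(X))
        Ks = rbf(grid, X)
        mu = Ks @ np.linalg.solve(K, y)
        var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
        sd = np.sqrt(np.maximum(var, 0))
    ucb = mu + beta * sd
    picks = grid[np.argsort(-ucb)[:8]]               # whole batch chosen at once
    X = np.concatenate([X, picks])
    y = np.concatenate([y, f(picks) + noise * rng.standard_normal(len(picks))])
print("best observed value:", y.max(), "at x =", X[np.argmax(y)])
```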

In this paper, we study the sample complexity of {\em noisy Bayesian quadrature} (BQ), in which we seek to approximate an integral based on noisy black-box queries to the underlying function. We consider functions in a {\em Reproducing Kernel Hilbert Space} (RKHS) with the Mat\'ern-$\nu$ kernel, focusing on combinations of the parameter $\nu$ and dimension $d$ such that the RKHS is equivalent to a Sobolev class. In this setting, we provide near-matching upper and lower bounds on the best possible average error. Specifically, we find that when the black-box queries are subject to Gaussian noise having variance $\sigma^2$, any algorithm making at most $T$ queries (even with adaptive sampling) must incur a mean absolute error of $\Omega(T^{-\frac{\nu}{d}-1} + \sigma T^{-\frac{1}{2}})$, and there exists a non-adaptive algorithm attaining an error of at most $O(T^{-\frac{\nu}{d}-1} + \sigma T^{-\frac{1}{2}})$. Hence, the bounds are order-optimal, and establish that there is no adaptivity gap in terms of scaling laws.
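A quick sanity check on the noise term in these bounds: even naive non-adaptive averaging over a uniform grid attains the $\sigma T^{-1/2}$ component (its bias term is far worse than $T^{-\nu/d-1}$, so it is not the paper's order-optimal scheme). The integrand and noise level below are illustrative assumptions.

```python
# Naive noisy quadrature: average T noisy queries on a midpoint grid.
# The statistical error shrinks like sigma / sqrt(T), matching the
# second term of the paper's bounds (the bias term needs a better rule).
import numpy as np

rng = np.random.default_rng(4)
f = lambda x: np.sin(2 * np.pi * x) ** 2        # smooth test integrand on [0,1]
true_integral = 0.5
sigma = 0.5

for T in [100, 1_000, 10_000, 100_000]:
    xs = (np.arange(T) + 0.5) / T               # non-adaptive midpoint grid
    ys = f(xs) + sigma * rng.standard_normal(T) # noisy black-box queries
    err = abs(ys.mean() - true_integral)
    print(f"T={T:>7}: |error| = {err:.5f}  (sigma/sqrt(T) = {sigma/np.sqrt(T):.5f})")
```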

In the Strip Packing problem (SP), we are given a vertical half-strip $[0,W]\times[0,\infty)$ and a set of $n$ axis-aligned rectangles of width at most $W$. The goal is to find a non-overlapping packing of all rectangles into the strip such that the height of the packing is minimized. A well-studied and frequently used practical constraint is to allow only those packings that are guillotine separable, i.e., every rectangle in the packing can be obtained by recursively applying a sequence of edge-to-edge axis-parallel cuts (guillotine cuts) that do not intersect any item of the solution. In this paper, we study approximation algorithms for the Guillotine Strip Packing problem (GSP), i.e., the Strip Packing problem where we require additionally that the packing needs to be guillotine separable. This problem generalizes the classical Bin Packing problem and also makespan minimization on identical machines, and thus it is already strongly NP-hard. Moreover, due to a reduction from the Partition problem, it is NP-hard to obtain a polynomial-time $(3/2-\varepsilon)$-approximation algorithm for GSP for any $\varepsilon>0$ (exactly as Strip Packing). We provide a matching polynomial time $(3/2+\varepsilon)$-approximation algorithm for GSP. Furthermore, we present a pseudo-polynomial time $(1+\varepsilon)$-approximation algorithm for GSP. This is surprising as it is NP-hard to obtain a $(5/4-\varepsilon)$-approximation algorithm for (general) Strip Packing in pseudo-polynomial time. Thus, our results essentially settle the approximability of GSP for both the polynomial and the pseudo-polynomial settings.
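A simple way to see that guillotine-separable packings are easy to produce (though with far weaker guarantees than the $3/2+\varepsilon$ algorithm above) is the classical Next-Fit Decreasing Height shelf heuristic: shelf packings are guillotine separable by construction, since horizontal cuts separate shelves and vertical cuts then separate items. The instance below is randomly generated for illustration.

```python
# Next-Fit Decreasing Height (NFDH): a shelf packing, hence guillotine
# separable -- cut between shelves, then between items within a shelf.
import numpy as np

def nfdh(rects, W):
    """rects: list of (width, height). Returns packing height and placements."""
    rects = sorted(rects, key=lambda wh: -wh[1])   # sort by decreasing height
    placements, x, shelf_y, shelf_h = [], 0.0, 0.0, rects[0][1]
    for w, h in rects:
        if x + w > W:                              # item doesn't fit: new shelf
            shelf_y += shelf_h
            x, shelf_h = 0.0, h                    # h is tallest on new shelf
        placements.append((x, shelf_y, w, h))
        x += w
    return shelf_y + shelf_h, placements

rng = np.random.default_rng(5)
rects = [(rng.uniform(0.1, 1.0), rng.uniform(0.1, 1.0)) for _ in range(50)]
height, _ = nfdh(rects, W=1.0)
print("NFDH packing height:", round(height, 3),
      " area lower bound:", round(sum(w * h for w, h in rects), 3))
```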

We consider the following oblivious sketching problem: given $\epsilon \in (0,1/3)$ and $n \geq d/\epsilon^2$, design a distribution $\mathcal{D}$ over $\mathbb{R}^{k \times nd}$ and a function $f: \mathbb{R}^k \times \mathbb{R}^{nd} \rightarrow \mathbb{R}$, so that for any $n \times d$ matrix $A$, $$\Pr_{S \sim \mathcal{D}} [(1-\epsilon) \|A\|_{op} \leq f(S(A),S) \leq (1+\epsilon)\|A\|_{op}] \geq 2/3,$$ where $\|A\|_{op}$ is the operator norm of $A$ and $S(A)$ denotes $S \cdot A$, interpreting $A$ as a vector in $\mathbb{R}^{nd}$. We show a tight lower bound of $k = \Omega(d^2/\epsilon^2)$ for this problem. Our result considerably strengthens the result of Nelson and Nguyen (ICALP, 2014), as it (1) applies even to the easier problem of estimating only the operator norm, which can be estimated given any OSE, and (2) applies to distributions over general linear operators $S$ which treat $A$ as a vector and compute $S(A)$, rather than the restricted class of linear operators corresponding to matrix multiplication. Our technique also implies the first tight bounds for approximating the Schatten $p$-norm for even integers $p$ via general linear sketches, improving the previous lower bound from $k = \Omega(n^{2-6/p})$ [Regev, 2014] to $k = \Omega(n^{2-4/p})$. Importantly, for sketching the operator norm up to a factor of $\alpha$, where $\alpha - 1 = \Omega(1)$, we obtain a tight $k = \Omega(n^2/\alpha^4)$ bound, matching the upper bound of Andoni and Nguyen (SODA, 2013), and improving the previous $k = \Omega(n^2/\alpha^6)$ lower bound. Finally, we also obtain the first lower bounds for approximating Ky Fan norms.
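Since the paper's contribution is a lower bound, the sketch below only illustrates the matching upper-bound route it references: any OSE lets one estimate $\|A\|_{op}$ as the top singular value of $SA$. A dense Gaussian matrix serves as the OSE here, and all sizes are illustrative assumptions.

```python
# Estimating the operator norm from a sketch: ||A||_op ~ ||S A||_op when
# S is an oblivious subspace embedding for d-dimensional subspaces.
import numpy as np

rng = np.random.default_rng(6)
n, d, eps = 10_000, 10, 0.25
k = int(4 * d / eps ** 2)                      # illustrative OSE sketch size

A = rng.standard_normal((n, d)) @ np.diag(np.linspace(1, 5, d))
S = rng.standard_normal((k, n)) / np.sqrt(k)   # dense Gaussian OSE

est = np.linalg.norm(S @ A, 2)                 # top singular value of sketch
true = np.linalg.norm(A, 2)
print(f"true ||A||_op = {true:.3f}, sketched estimate = {est:.3f}")
```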

This paper presents local minimax regret lower bounds for adaptively controlling linear-quadratic-Gaussian (LQG) systems. We consider smoothly parametrized instances and provide an understanding of when logarithmic regret is impossible that is both instance-specific and flexible enough to take problem structure into account. This understanding relies on two key notions: local uninformativeness, meaning that the optimal policy does not provide sufficient excitation to identify the optimal policy and thus yields a degenerate Fisher information matrix; and information-regret-boundedness, meaning that the small eigenvalues of a policy-dependent information matrix can be bounded in terms of the regret of that policy. Combined with a reduction to Bayesian estimation and an application of Van Trees' inequality, these two conditions are sufficient for proving regret lower bounds of order $\sqrt{T}$ in the time horizon $T$. This method yields lower bounds that exhibit tight dimensional dependencies and scale naturally with control-theoretic problem constants. For instance, we are able to prove that systems operating near marginal stability are fundamentally hard to learn to control. We further show that large classes of systems satisfy these conditions, among them any state-feedback system with both $A$- and $B$-matrices unknown. Most importantly, we also establish that a nontrivial class of partially observable systems, essentially those that are over-actuated, satisfy these conditions, thus providing a $\sqrt{T}$ lower bound also valid for partially observable systems. Finally, we turn to two simple examples which demonstrate that our lower bound captures classical control-theoretic intuition: our lower bounds diverge for systems operating near marginal stability or with large filter gain -- these can be arbitrarily hard to (learn to) control.
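The notion of local uninformativeness admits a very short numerical illustration. In a scalar system driven by a fixed linear state-feedback policy with no exploratory dither, the regressors $(x_t, u_t)$ are perfectly collinear, so the empirical information matrix for jointly identifying $(a, b)$ is singular. The system and policy parameters below are toy assumptions.

```python
# Toy illustration of local uninformativeness: under u_t = -K x_t the
# regressors (x_t, u_t) are collinear, so the information matrix for
# jointly identifying (a, b) in x_{t+1} = a x_t + b u_t + w_t degenerates.
import numpy as np

rng = np.random.default_rng(7)
a, b, K, T = 0.9, 1.0, 0.5, 10_000
x, X = 0.0, []
for _ in range(T):
    u = -K * x                        # pure state feedback, no dither
    X.append((x, u))
    x = a * x + b * u + rng.standard_normal()
X = np.array(X)
gram = X.T @ X / T                    # proportional to the Fisher information
print("information-matrix eigenvalues:", np.linalg.eigvalsh(gram))
# The (near-)zero eigenvalue shows (a, b) is not jointly identifiable here.
```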

Consider a matrix $\mathbf{F} \in \mathbb{K}^{m \times n}$ of univariate polynomials over a field~$\mathbb{K}$. We study the problem of computing the column rank profile of $\mathbf{F}$. To this end we first give an algorithm which improves the minimal kernel basis algorithm of Zhou, Labahn, and Storjohann (Proceedings ISSAC 2012). We then provide a second algorithm which computes the column rank profile of $\mathbf{F}$ with a rank-sensitive complexity of $\widetilde{O}(r^{\omega-2} n (m+D))$ operations in $\mathbb{K}$. Here, $r$ is the rank of $\mathbf{F}$, $D$ is the sum of its row degrees, $\omega$ is the exponent of matrix multiplication, and $\widetilde{O}(\cdot)$ hides logarithmic factors.
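To pin down the definition the algorithms compute: the column rank profile is the lexicographically smallest set of column indices whose columns form a basis of the column space, i.e., the pivot columns under left-to-right Gaussian elimination. The naive sketch below computes it over a prime field $\mathrm{GF}(p)$ for a constant matrix; the paper's fast algorithms for polynomial matrices are not reproduced.

```python
# Naive column rank profile over GF(p) via left-to-right Gaussian elimination
# (definitional baseline only; not the paper's polynomial-matrix algorithm).
import numpy as np

def column_rank_profile(F, p):
    F = np.array(F, dtype=np.int64) % p
    m, n = F.shape
    profile, pivot_row = [], 0
    for j in range(n):
        rows = np.nonzero(F[pivot_row:, j])[0]   # pivot at/below pivot_row?
        if len(rows) == 0:
            continue                             # column j is dependent
        r = pivot_row + rows[0]
        F[[pivot_row, r]] = F[[r, pivot_row]]    # swap pivot row up
        inv = pow(int(F[pivot_row, j]), p - 2, p)  # inverse via Fermat
        F[pivot_row] = F[pivot_row] * inv % p
        for i in range(m):
            if i != pivot_row and F[i, j]:
                F[i] = (F[i] - F[i, j] * F[pivot_row]) % p
        profile.append(j)
        pivot_row += 1
    return profile

F = [[1, 2, 3, 0], [2, 4, 1, 1], [3, 6, 4, 1]]
# -> [0, 2]: column 1 = 2 * column 0, and row 3 = row 1 + row 2 (rank 2).
print(column_rank_profile(F, p=7))
```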

We show that for the problem of testing if a matrix $A \in \mathbb{F}^{n \times n}$ has rank at most $d$, or requires changing an $\epsilon$-fraction of entries to have rank at most $d$, there is a non-adaptive query algorithm making $\widetilde{O}(d^2/\epsilon)$ queries. Our algorithm works for any field $\mathbb{F}$. This improves upon the previous $O(d^2/\epsilon^2)$ bound (SODA'03), and bypasses an $\Omega(d^2/\epsilon^2)$ lower bound of (KDD'14) which holds if the algorithm is required to read a submatrix. Our algorithm is the first such algorithm which does not read a submatrix, and instead reads a carefully selected non-adaptive pattern of entries in rows and columns of $A$. We complement our algorithm with a matching query complexity lower bound for non-adaptive testers over any field. We also give tight bounds of $\widetilde{\Theta}(d^2)$ queries in the sensing model for which query access comes in the form of $\langle X_i, A\rangle := \operatorname{tr}(X_i^\top A)$; perhaps surprisingly these bounds do not depend on $\epsilon$. We next develop a novel property testing framework for testing numerical properties of a real-valued matrix $A$ more generally, which includes the stable rank, Schatten-$p$ norms, and SVD entropy. Specifically, we propose a bounded entry model, where $A$ is required to have entries bounded by $1$ in absolute value. We give upper and lower bounds for a wide range of problems in this model, and discuss connections to the sensing model above.
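For intuition, here is the older submatrix-sampling tester that the $\Omega(d^2/\epsilon^2)$ lower bound applies to (a SODA'03-style baseline); the paper's improved tester reads a more intricate non-adaptive row/column pattern that we do not attempt to reproduce. The submatrix side length and the test instances are illustrative assumptions.

```python
# Submatrix-sampling rank tester (baseline, ~d^2/eps^2 queries): sample a
# random k x k submatrix and accept iff its rank is at most d.
import numpy as np

def test_rank_at_most_d(A, d, eps, rng):
    k = int(np.ceil(4 * d / eps))               # illustrative submatrix side
    rows = rng.choice(A.shape[0], size=min(k, A.shape[0]), replace=False)
    cols = rng.choice(A.shape[1], size=min(k, A.shape[1]), replace=False)
    sub = A[np.ix_(rows, cols)]                 # the k x k queried entries
    return np.linalg.matrix_rank(sub) <= d      # accept iff sampled rank small

rng = np.random.default_rng(8)
n, d, eps = 500, 3, 0.1
low = rng.standard_normal((n, d)) @ rng.standard_normal((d, n))  # rank d
far = low + (rng.random((n, n)) < eps) * rng.standard_normal((n, n))
print("rank-<=d input accepted:", test_rank_at_most_d(low, d, eps, rng))
print("eps-far input accepted: ", test_rank_at_most_d(far, d, eps, rng))
```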

We consider the exploration-exploitation trade-off in reinforcement learning and we show that an agent imbued with a risk-seeking utility function is able to explore efficiently, as measured by regret. The parameter that controls how risk-seeking the agent is can be optimized exactly, or annealed according to a schedule. We call the resulting algorithm K-learning and show that the corresponding K-values are optimistic for the expected Q-values at each state-action pair. The K-values induce a natural Boltzmann exploration policy for which the `temperature' parameter is equal to the risk-seeking parameter. This policy achieves an expected regret bound of $\tilde O(L^{3/2} \sqrt{S A T})$, where $L$ is the time horizon, $S$ is the number of states, $A$ is the number of actions, and $T$ is the total number of elapsed time-steps. This bound is only a factor of $L$ larger than the established lower bound. K-learning can be interpreted as mirror descent in the policy space, and it is similar to other well-known methods in the literature, including Q-learning, soft-Q-learning, and maximum entropy policy gradient, and is closely related to optimism and count-based exploration methods. K-learning is simple to implement, as it only requires adding a bonus to the reward at each state-action pair and then solving a Bellman equation. We conclude with a numerical example demonstrating that K-learning is competitive with other state-of-the-art algorithms in practice.
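The "bonus plus Bellman equation" recipe admits a short tabular sketch in the spirit of the description above: add an optimism bonus to the reward, solve a soft (log-sum-exp) Bellman equation, and act via the induced Boltzmann policy. The bonus form, temperature, and discounted (rather than finite-horizon) setting below are simplifying assumptions, not the paper's exact algorithm.

```python
# Simplified tabular K-learning-style sketch: bonus-augmented rewards, a
# soft Bellman fixed point, and the induced Boltzmann exploration policy.
import numpy as np

rng = np.random.default_rng(9)
S, A, gamma, tau = 5, 3, 0.9, 0.5            # states, actions, discount, temp

P = rng.dirichlet(np.ones(S), size=(S, A))   # P[s, a] = next-state distribution
R = rng.random((S, A))                       # mean rewards
counts = 1 + rng.integers(0, 50, (S, A))     # visit counts (stand-in for data)
bonus = 1.0 / np.sqrt(counts)                # optimism bonus per state-action

K = np.zeros((S, A))
for _ in range(500):                         # value iteration on the soft MDP
    V = tau * np.log(np.exp(K / tau).sum(axis=1))  # soft value: tau * LSE
    K = R + bonus + gamma * P @ V            # Bellman backup with bonus

policy = np.exp(K / tau)                     # Boltzmann policy, temperature tau
policy /= policy.sum(axis=1, keepdims=True)
print("K-values:\n", K.round(2), "\npolicy for state 0:", policy[0].round(3))
```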
