
The quantization problem aims to find the best possible approximation of probability measures on ${\mathbb{R}}^d$ using finite, discrete measures. The Wasserstein distance is a typical choice to measure the quality of the approximation. This contribution investigates the properties and robustness of the entropy-regularized quantization problem, which relaxes the standard quantization problem. The proposed approximation technique naturally adopts the softmin function, which is well known for its robustness from both theoretical and practical standpoints. Moreover, we use the entropy-regularized Wasserstein distance to evaluate the quality of the soft quantization problem's approximation, and we implement a stochastic gradient approach to compute optimal solutions. The control parameter in our proposed method allows the difficulty of the optimization problem to be adjusted, providing significant advantages when dealing with exceptionally challenging problems of interest. Finally, this contribution empirically illustrates the performance of the method in a variety of settings.
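
As a rough illustration of the idea, the following minimal sketch runs stochastic gradient descent on a softmin (entropy-regularized) quantization objective with squared-Euclidean cost; the parameter `eps` plays the role of the control parameter mentioned above. It is only a sketch of the general technique, not the paper's exact algorithm.

```python
# A minimal sketch of softmin-based soft quantization with stochastic
# gradients, assuming squared-Euclidean cost; `eps` is the regularization
# (control) parameter. Illustration only, not the paper's exact algorithm.
import numpy as np

def softmin(costs, eps):
    """Entropic smoothing of the minimum: -eps * log(mean(exp(-costs / eps)))."""
    z = -costs / eps
    m = z.max()                                    # stabilise the log-sum-exp
    return -eps * (m + np.log(np.mean(np.exp(z - m))))

def soft_quantization(sample, n_points=8, eps=0.1, lr=0.05, steps=2000, seed=0):
    """Fit n_points quantizers by SGD on the softmin quantization objective."""
    rng = np.random.default_rng(seed)
    y = sample[rng.choice(len(sample), n_points, replace=False)].copy()
    for _ in range(steps):
        x = sample[rng.integers(len(sample))]      # one stochastic data point
        costs = np.sum((y - x) ** 2, axis=1)       # squared Euclidean costs
        w = np.exp(-(costs - costs.min()) / eps)
        w /= w.sum()                               # d softmin / d costs
        y -= lr * w[:, None] * 2.0 * (y - x)       # chain rule: d costs / d y
    return y

rng = np.random.default_rng(1)
data = rng.normal(size=(1000, 2))
centers = soft_quantization(data)                  # 8 soft quantization points
print(softmin(np.sum((centers - data[0]) ** 2, axis=1), eps=0.1))
```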

Related Content

Given some binary matrix $M$, suppose we are presented with the collection of its rows and columns in independent arbitrary orderings. From this information, are we able to recover the unique original orderings and matrix? We present an algorithm that identifies whether there is a unique ordering associated with a set of rows and columns, and outputs either the unique correct orderings for the rows and columns or the full collection of all valid orderings and valid matrices. We show that there is a constant $c > 0$ such that the algorithm terminates in $O(n^2)$ time with high probability and in expectation for random $n \times n$ binary matrices with i.i.d.\ Bernoulli$(p)$ entries $(m_{ij})_{i,j=1}^n$ such that $\frac{c\log^2(n)}{n(\log\log(n))^2} \leq p \leq \frac{1}{2}$.
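
For intuition only, the following brute-force sketch reconstructs all valid matrices for a tiny random instance by enumerating row orderings; the paper's algorithm achieves this in $O(n^2)$ time with high probability, which this toy code does not attempt.

```python
# A brute-force toy illustration of the reconstruction problem for very small
# matrices; nothing like the paper's O(n^2) algorithm, just the problem setup.
from collections import Counter
from itertools import permutations

import numpy as np

def valid_matrices(rows, cols):
    """Return all matrices whose rows are an ordering of `rows` and whose
    columns, as a multiset, coincide with `cols`."""
    target = Counter(map(tuple, cols))
    found = set()
    for order in permutations(range(len(rows))):
        m = np.array([rows[i] for i in order])
        if Counter(map(tuple, m.T)) == target:
            found.add(tuple(map(tuple, m)))
    return found

rng = np.random.default_rng(0)
M = rng.integers(0, 2, size=(4, 4))                 # Bernoulli(1/2) entries
rows = list(M[rng.permutation(4)])                  # rows in arbitrary order
cols = list(M.T[rng.permutation(4)])                # columns in arbitrary order
print(len(valid_matrices(rows, cols)), "valid matrix/matrices recovered")
```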

We study the problem of testing whether a symmetric $d \times d$ input matrix $A$ is symmetric positive semidefinite (PSD), or is $\epsilon$-far from the PSD cone, meaning that $\lambda_{\min}(A) \leq - \epsilon \|A\|_p$, where $\|A\|_p$ is the Schatten-$p$ norm of $A$. In applications one often needs to quickly tell if an input matrix is PSD, and a small distance from the PSD cone may be tolerable. We consider two well-studied query models for measuring efficiency, namely, the matrix-vector and vector-matrix-vector query models. We first consider one-sided testers, which are testers that correctly classify any PSD input, but may fail on a non-PSD input with a tiny failure probability. Up to logarithmic factors, in the matrix-vector query model we show a tight $\widetilde{\Theta}(1/\epsilon^{p/(2p+1)})$ bound, while in the vector-matrix-vector query model we show a tight $\widetilde{\Theta}(d^{1-1/p}/\epsilon)$ bound, for every $p \geq 1$. We also show a strong separation between one-sided and two-sided testers in the vector-matrix-vector model, where a two-sided tester can fail on both PSD and non-PSD inputs with a tiny failure probability. In particular, for the important case of the Frobenius norm, we show that any one-sided tester requires $\widetilde{\Omega}(\sqrt{d}/\epsilon)$ queries. However, we introduce a bilinear sketch for two-sided testing from which we construct a Frobenius norm tester achieving the optimal $\widetilde{O}(1/\epsilon^2)$ queries. We also give a number of additional separations between adaptive and non-adaptive testers. Our techniques have implications beyond testing, providing new methods to approximate the spectrum of a matrix with Frobenius norm error using dimensionality reduction in a way that preserves the signs of eigenvalues.
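
To make the query model concrete, here is a naive one-sided tester sketch in the vector-matrix-vector model: it queries $x^\top A x$ for random Gaussian $x$ and rejects only upon seeing a negative value, so it never errs on a PSD input. It does not achieve the query bounds above.

```python
# A naive one-sided tester in the vector-matrix-vector query model, assuming
# access to A only through queries x^T A y. It never rejects a PSD matrix
# (x^T A x >= 0 whenever A is PSD). Illustration only; it does not achieve
# the optimal query complexities stated in the abstract.
import numpy as np

def one_sided_psd_test(query, d, num_queries=200, seed=0):
    """`query(x, y)` returns x^T A y for a symmetric d x d matrix A."""
    rng = np.random.default_rng(seed)
    for _ in range(num_queries):
        x = rng.normal(size=d)
        if query(x, x) < 0:          # a witness that A is not PSD
            return False
    return True                      # never errs when A is actually PSD

A = np.diag([1.0, 1.0, -2.0])        # not PSD: lambda_min = -2
print(one_sided_psd_test(lambda x, y: x @ A @ y, d=3))   # almost surely False
```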

For some $\epsilon > 10^{-36}$, we give a randomized $(3/2-\epsilon)$-approximation algorithm for metric TSP.

Given a conjunctive query $Q$ and a database $\mathbf{D}$, a direct access to the answers of $Q$ over $\mathbf{D}$ is the operation of returning, given an index $j$, the $j^{\mathsf{th}}$ answer for some order on its answers. While this problem is $\#\mathsf{P}$-hard in general with respect to combined complexity, many conjunctive queries have an underlying structure that allows for a direct access to their answers for some lexicographical ordering that takes polylogarithmic time in the size of the database after a polynomial time precomputation. Previous work has precisely characterised the tractable classes and given fine-grained lower bounds on the precomputation time needed depending on the structure of the query. In this paper, we generalise these tractability results to the case of signed conjunctive queries, that is, conjunctive queries that may contain negative atoms. Our technique is based on a class of circuits that can represent relational data. We first show that this class supports tractable direct access after a polynomial time preprocessing. We then give bounds on the size of the circuit needed to represent the answer set of signed conjunctive queries depending on their structure. Both results combined together allow us to prove the tractability of direct access for a large class of conjunctive queries. On the one hand, we recover the known tractable classes from the literature in the case of positive conjunctive queries. On the other hand, we generalise and unify known tractability results about negative conjunctive queries -- that is, queries having only negated atoms. In particular, we show that the class of $\beta$-acyclic negative conjunctive queries and the class of bounded nest set width negative conjunctive queries admit tractable direct access.
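
As a toy illustration of direct access (not the paper's circuit-based technique), consider the simplest possible case of a Cartesian-product query under lexicographic order, where the $j$-th answer follows from index arithmetic after a linear-time preprocessing.

```python
# A toy illustration of direct access for the trivial conjunctive query
# Q(x, y) :- R(x), S(y) under the lexicographic order on (x, y). After a
# linear-time preprocessing (sorting), the j-th answer is obtained in O(1)
# by index arithmetic. The paper's circuit-based technique handles far more
# general (signed) conjunctive queries.
def preprocess(R, S):
    return sorted(R), sorted(S)

def direct_access(R_sorted, S_sorted, j):
    """Return the j-th answer (0-indexed) of R x S in lexicographic order."""
    q, r = divmod(j, len(S_sorted))
    return R_sorted[q], S_sorted[r]

R = {3, 1, 2}
S = {"b", "a"}
Rs, Ss = preprocess(R, S)
print([direct_access(Rs, Ss, j) for j in range(len(Rs) * len(Ss))])
# [(1, 'a'), (1, 'b'), (2, 'a'), (2, 'b'), (3, 'a'), (3, 'b')]
```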

In the impartial selection problem, a subset of agents up to a fixed size $k$ among a group of $n$ is to be chosen based on votes cast by the agents themselves. A selection mechanism is impartial if no agent can influence its own chance of being selected by changing its vote. It is $\alpha$-optimal if, for every instance, the votes received by the selected subset are at least a fraction $\alpha$ of the votes received by the subset of size $k$ with the highest number of votes. We study deterministic impartial mechanisms in a more general setting with arbitrarily weighted votes and provide the first approximation guarantee, roughly $1/\lceil 2n/k\rceil$. When the number of agents to select is large enough compared to the total number of agents, this yields an improvement on the previously best known approximation ratio of $1/k$ for the unweighted setting. We further show that our mechanism can be adapted to the impartial assignment problem, in which multiple sets of up to $k$ agents are to be selected, at a loss of a factor of $1/2$ in the approximation ratio.
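
For intuition, the sketch below implements a simple partition-style impartial mechanism, which is not the mechanism of this paper and carries no particular approximation guarantee: only agents in one fixed half are eligible, and only the weighted votes cast by the other half are counted, so no agent can affect its own selection.

```python
# A sketch of a simple partition-style impartial mechanism, for intuition
# only; NOT the paper's mechanism and with no approximation guarantee.
# Agents 0..n-1 are split into two fixed halves; only the first half is
# eligible, and only votes cast by the second half are counted, so no agent
# can influence its own chance of being selected.
def partition_mechanism(votes, k):
    """votes[i][j] = nonnegative weight of agent i's vote for agent j (i != j)."""
    n = len(votes)
    eligible = range(n // 2)                  # may be selected
    voters = range(n // 2, n)                 # their votes are counted
    score = {j: sum(votes[i].get(j, 0.0) for i in voters) for j in eligible}
    return sorted(eligible, key=lambda j: -score[j])[:k]

votes = {
    0: {1: 2.0},
    1: {0: 1.0},
    2: {0: 3.0, 1: 0.5},
    3: {1: 4.0, 2: 7.0},      # the vote for agent 2 is ignored: 2 is not eligible
}
print(partition_mechanism(votes, k=1))        # selects agent 1 (score 4.5 > 3.0)
```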

Posterior sampling, i.e., the exponential mechanism applied to sampling from the posterior distribution, provides $\varepsilon$-pure differential privacy (DP) guarantees and does not suffer from the potentially unbounded privacy breach introduced by $(\varepsilon,\delta)$-approximate DP. In practice, however, one needs to apply approximate sampling methods such as Markov chain Monte Carlo (MCMC), thus re-introducing the unappealing $\delta$-approximation error into the privacy guarantees. To bridge this gap, we propose the Approximate SAmple Perturbation (abbr. ASAP) algorithm, which perturbs an MCMC sample with noise proportional to its Wasserstein-infinity ($W_\infty$) distance from a reference distribution that satisfies pure DP or pure Gaussian DP (i.e., $\delta=0$). We then leverage a Metropolis-Hastings algorithm to generate the sample and prove that the algorithm converges in $W_\infty$ distance. We show that by combining our new techniques with a careful localization step, we obtain the first nearly linear-time algorithm that achieves the optimal rates in the DP-ERM problem with strongly convex and smooth losses.
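
The following sketch is a generic random-walk Metropolis-Hastings sampler for a posterior proportional to $\exp(-\mathrm{loss}(\theta))$, i.e., the kind of approximate sampler whose output ASAP would subsequently perturb; it is not the paper's algorithm and provides no privacy guarantee on its own.

```python
# A generic random-walk Metropolis-Hastings sampler targeting a posterior
# density proportional to exp(-neg_log_post(theta)). This is the kind of
# approximate sampler whose output ASAP would then perturb; by itself it
# carries no differential-privacy guarantee and is not the paper's algorithm.
import numpy as np

def metropolis_hastings(neg_log_post, theta0, steps=5000, step_size=0.3, seed=0):
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    current = neg_log_post(theta)
    for _ in range(steps):
        proposal = theta + step_size * rng.normal(size=theta.shape)
        candidate = neg_log_post(proposal)
        if np.log(rng.uniform()) < current - candidate:    # accept/reject step
            theta, current = proposal, candidate
    return theta                                           # one approximate sample

# Toy posterior: Gaussian likelihood around the data with a N(0, 1) prior.
data = np.random.default_rng(1).normal(loc=2.0, size=50)
neg_log_post = lambda t: 0.5 * np.sum((data - t) ** 2) + 0.5 * t @ t
print(metropolis_hastings(neg_log_post, theta0=np.zeros(1)))
```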

We study several polygonal curve problems under the Fr\'{e}chet distance via algebraic geometric methods. Let $\mathbb{X}_m^d$ and $\mathbb{X}_k^d$ be the spaces of all polygonal curves of $m$ and $k$ vertices in $\mathbb{R}^d$, respectively. We assume that $k \leq m$. Let $\mathcal{R}^d_{k,m}$ be the set of ranges in $\mathbb{X}_m^d$ for all possible metric balls of polygonal curves in $\mathbb{X}_k^d$ under the Fr\'{e}chet distance. We prove a nearly optimal bound of $O(dk\log (km))$ on the VC dimension of the range space $(\mathbb{X}_m^d,\mathcal{R}_{k,m}^d)$, improving on the previous $O(d^2k^2\log(dkm))$ upper bound and approaching the current $\Omega(dk\log k)$ lower bound. Our upper bound also holds for the weak Fr\'{e}chet distance. We also obtain exact solutions, hitherto unknown, for curve simplification, range searching, nearest neighbor search, and distance oracles.
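
For intuition about the underlying metric, the sketch below computes the discrete Fréchet distance between two polygonal curves with the standard dynamic program; the paper's VC-dimension bounds concern metric balls under the continuous and weak Fréchet distances, for which this serves only as an illustration.

```python
# The standard O(mk) dynamic program for the *discrete* Frechet distance
# between polygonal curves P (m vertices) and Q (k vertices) in R^d,
# included only as intuition for the metric behind the range space above;
# the paper's results concern the continuous and weak Frechet distances.
import numpy as np

def discrete_frechet(P, Q):
    P, Q = np.asarray(P, float), np.asarray(Q, float)
    m, k = len(P), len(Q)
    dist = np.linalg.norm(P[:, None, :] - Q[None, :, :], axis=2)
    dp = np.full((m, k), np.inf)
    dp[0, 0] = dist[0, 0]
    for i in range(m):
        for j in range(k):
            if i == j == 0:
                continue
            best_prev = min(dp[i - 1, j] if i else np.inf,
                            dp[i, j - 1] if j else np.inf,
                            dp[i - 1, j - 1] if i and j else np.inf)
            dp[i, j] = max(best_prev, dist[i, j])
    return dp[-1, -1]

P = [(0, 0), (1, 1), (2, 0)]
Q = [(0, 1), (1, 2), (2, 1)]
print(discrete_frechet(P, Q))   # 1.0: each vertex pairs with its offset copy
```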

Constructing a similarity graph from a set $X$ of data points in $\mathbb{R}^d$ is the first step of many modern clustering algorithms. However, typical constructions of a similarity graph have high time complexity and quadratic space dependency with respect to $|X|$. We address this limitation and present a new algorithmic framework that constructs a sparse approximation of the fully connected similarity graph while preserving its cluster structure. Our algorithm is based on the kernel density estimation problem, and is applicable to arbitrary kernel functions. We compare our algorithm with the well-known implementations from the scikit-learn library and the FAISS library, and find that our method significantly outperforms the implementations from both libraries on a variety of datasets.
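
As a point of reference, the following sketch builds the usual sparse $k$-nearest-neighbour approximation of the fully connected similarity graph with Gaussian kernel weights; it is not the paper's KDE-based algorithm, and its brute-force distance computation is itself quadratic.

```python
# A baseline sketch: approximate the fully connected similarity graph by a
# k-nearest-neighbour graph with Gaussian kernel weights. This shows the kind
# of sparse construction being targeted, not the paper's KDE-based algorithm;
# the brute-force pairwise distances below are themselves quadratic.
import numpy as np
from scipy.sparse import csr_matrix

def knn_similarity_graph(X, k=10, sigma=1.0):
    n = len(X)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # pairwise sq. dists
    np.fill_diagonal(d2, np.inf)                           # no self-loops
    rows, cols, vals = [], [], []
    for i in range(n):
        nbrs = np.argpartition(d2[i], k)[:k]               # k nearest neighbours
        rows += [i] * k
        cols += list(nbrs)
        vals += list(np.exp(-d2[i, nbrs] / (2 * sigma ** 2)))
    W = csr_matrix((vals, (rows, cols)), shape=(n, n))
    return W.maximum(W.T)                                  # symmetrise

X = np.random.default_rng(0).normal(size=(500, 2))
W = knn_similarity_graph(X, k=10)
print(W.nnz, "edges instead of", 500 * 499, "in the fully connected graph")
```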

We consider the distributionally robust optimization (DRO) problem with spectral risk-based uncertainty set and $f$-divergence penalty. This formulation includes common risk-sensitive learning objectives such as regularized conditional value-at-risk (CVaR) and average top-$k$ loss. We present Prospect, a stochastic gradient-based algorithm that only requires tuning a single learning rate hyperparameter, and prove that it enjoys linear convergence for smooth regularized losses. This contrasts with previous algorithms that either require tuning multiple hyperparameters or potentially fail to converge due to biased gradient estimates or inadequate regularization. Empirically, we show that Prospect can converge 2-3$\times$ faster than baselines such as stochastic gradient and stochastic saddle-point methods on distribution shift and fairness benchmarks spanning tabular, vision, and language domains.
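
To make the objective concrete, the sketch below runs plain subgradient descent on the regularized average top-$k$ loss for a linear least-squares model; Prospect itself is a different, stochastic algorithm with the linear convergence guarantee described above.

```python
# A plain full-batch subgradient baseline for the regularized average top-k
# loss on a linear least-squares model, only to make the objective concrete;
# Prospect is a different (stochastic, linearly convergent) algorithm with a
# single learning-rate hyperparameter.
import numpy as np

def avg_top_k_subgradient_descent(X, y, k, reg=0.1, lr=0.01, steps=500):
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):
        residual = X @ w - y
        losses = 0.5 * residual ** 2                 # per-sample squared losses
        top = np.argpartition(losses, -k)[-k:]       # indices of k largest losses
        grad = X[top].T @ residual[top] / k + reg * w
        w -= lr * grad
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=200)
print(avg_top_k_subgradient_descent(X, y, k=20))
```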

A universal partial cycle (or upcycle) for $\mathcal{A}^n$ is a cyclic sequence that covers each word of length $n$ over the alphabet $\mathcal{A}$ exactly once -- like a De Bruijn cycle, except that we also allow a wildcard symbol $\mathord{\diamond}$ that can represent any letter of $\mathcal{A}$. Chen et al. in 2017 and Goeckner et al. in 2018 showed that the existence and structure of upcycles are highly constrained, unlike those of De Bruijn cycles, which exist for any alphabet size and word length. Moreover, it was not known whether any upcycles existed for $n \ge 5$. We present several examples of upcycles over both binary and non-binary alphabets for $n = 8$. We generalize two graph-theoretic representations of De Bruijn cycles to upcycles. We then introduce novel approaches to constructing new upcycles from old ones. Notably, given any upcycle for an alphabet of size $a$, we show how to construct an upcycle for an alphabet of size $ak$ for any $k \in \mathbb{N}$, so each example generates an infinite family of upcycles. We also define folds and lifts of upcycles, which relate upcycles with differing densities of $\mathord{\diamond}$ characters. In particular, we show that every upcycle lifts to a De Bruijn cycle. Our constructions rely on a different generalization of De Bruijn cycles known as perfect necklaces, and we introduce several new examples of perfect necklaces. We extend the definitions of certain pseudorandomness properties to partial words and determine which are satisfied by all upcycles, then draw a conclusion about linear feedback shift registers. Finally, we prove new nonexistence results based on the word length $n$, alphabet size, and $\mathord{\diamond}$ density.
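
As a concrete check of the covering property, the sketch below verifies whether a candidate cyclic sequence (with `*` standing in for the wildcard $\mathord{\diamond}$) covers every word of length $n$ exactly once; a De Bruijn cycle is the wildcard-free special case.

```python
# A small checker for the covering property: a cyclic sequence over an
# alphabet plus the wildcard '*' is an upcycle for A^n iff every word of
# length n is covered exactly once (a De Bruijn cycle is the wildcard-free
# special case). This only verifies candidates; it does not construct them.
from collections import Counter
from itertools import product

def is_upcycle(cycle, alphabet, n, wildcard="*"):
    counts = Counter()
    L = len(cycle)
    for start in range(L):
        window = [cycle[(start + i) % L] for i in range(n)]   # cyclic window
        free = [i for i, c in enumerate(window) if c == wildcard]
        for letters in product(alphabet, repeat=len(free)):   # expand wildcards
            w = list(window)
            for i, c in zip(free, letters):
                w[i] = c
            counts["".join(w)] += 1
    return all(counts["".join(w)] == 1 for w in product(alphabet, repeat=n))

# The classic binary De Bruijn cycle for n = 3 (no wildcards) passes the check.
print(is_upcycle("00010111", "01", 3))          # True
```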
