
Multi-marginal optimal transport (MOT) is a generalization of optimal transport to multiple marginals. Optimal transport has evolved into an important tool in many machine learning applications, and its multi-marginal extension opens the door to addressing new challenges in the field. However, the use of MOT has been largely impeded by its computational complexity, which scales exponentially in the number of marginals. Fortunately, in many applications, such as barycenter or interpolation problems, the cost function adheres to structure, which has recently been exploited to develop efficient computational methods. In this work we derive computational bounds for these methods. With $m$ marginal distributions supported on $n$ points, we provide a $\tilde{\mathcal{O}}(d(G)mn^2\epsilon^{-2})$ bound for $\epsilon$-accuracy when the problem is associated with a tree of diameter $d(G)$. For the special case of the Wasserstein barycenter problem, which corresponds to a star-shaped tree, our bound matches the existing complexity bound.
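The abstract does not reproduce the underlying algorithm, but the tree-structured methods it refers to are typically built on entropic-regularized Sinkhorn iterations applied along the edges of the tree. As a non-authoritative point of reference, a minimal two-marginal Sinkhorn sketch in NumPy (all names and parameter values are ours) might look as follows:

```python
import numpy as np

def sinkhorn(mu, nu, C, eps=0.05, n_iter=500):
    """Entropic-regularized OT between two marginals (illustrative sketch).

    Tree-structured MOT solvers chain updates of this kind along the
    edges of the tree instead of forming the full m-marginal coupling.
    """
    K = np.exp(-C / eps)               # Gibbs kernel
    u = np.ones_like(mu)
    v = np.ones_like(nu)
    for _ in range(n_iter):
        v = nu / (K.T @ u)             # enforce the second marginal
        u = mu / (K @ v)               # enforce the first marginal
    P = u[:, None] * K * v[None, :]    # approximate transport plan
    return P, float(np.sum(P * C))

# toy example: two uniform marginals on n = 8 points of the line
n = 8
x = np.linspace(0.0, 1.0, n)
C = (x[:, None] - x[None, :]) ** 2     # squared-distance cost
P, cost = sinkhorn(np.full(n, 1 / n), np.full(n, 1 / n), C)
```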

Related Content

In this work, we examine sampling problems with non-smooth potentials. We propose a novel Markov chain Monte Carlo algorithm for sampling from non-smooth potentials. We provide a non-asymptotic analysis of our algorithm and establish a polynomial-time complexity $\tilde {\cal O}(d\varepsilon^{-1})$ to obtain $\varepsilon$ total variation distance to the target density, better than most existing results under the same assumptions. Our method is based on the proximal bundle method and an alternating sampling framework. This framework requires the so-called restricted Gaussian oracle, which can be viewed as a sampling counterpart of the proximal mapping in convex optimization. One key contribution of this work is a fast algorithm that realizes the restricted Gaussian oracle for any convex non-smooth potential with a bounded Lipschitz constant.
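The alternating sampling framework mentioned above is a Gibbs scheme on the joint density proportional to $\exp(-f(x) - \|x-y\|^2/(2\eta))$: it alternates a Gaussian step $y \mid x$ with a restricted-Gaussian-oracle step $x \mid y$. A minimal one-dimensional sketch for the toy potential $f(x) = |x|$ (our simplification; the paper's oracle handles general convex non-smooth potentials via a proximal bundle method) could look like:

```python
import numpy as np

rng = np.random.default_rng(0)

def rgo_abs(y, eta):
    """Restricted Gaussian oracle for f(x) = |x| via rejection sampling.

    Target density is proportional to exp(-|x| - (x - y)^2 / (2 * eta)).
    Proposing from N(y, eta) and accepting with probability exp(-|x|)
    is valid because exp(-|x|) <= 1 bounds the density ratio.
    """
    while True:
        x = rng.normal(y, np.sqrt(eta))
        if rng.uniform() < np.exp(-abs(x)):
            return x

def alternating_sampler(n_samples, eta=0.5, x0=0.0):
    """Alternate y | x ~ N(x, eta) with the RGO step x | y."""
    xs, x = [], x0
    for _ in range(n_samples):
        y = rng.normal(x, np.sqrt(eta))   # Gaussian half-step
        x = rgo_abs(y, eta)               # restricted Gaussian oracle
        xs.append(x)
    return np.array(xs)

samples = alternating_sampler(5000)   # approximately Laplace-distributed
```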

Let $\mathbf{X}$ be a random variable uniformly distributed on the discrete cube $\{ -1,1\} ^{n}$, and let $T_{\rho}$ be the noise operator acting on Boolean functions $f:\{ -1,1\} ^{n}\to\{ 0,1\} $, where $\rho\in[0,1]$ is the noise parameter, representing the correlation coefficient between each coordinate of $\mathbf{X}$ and its noise-corrupted version. Given a convex function $\Phi$ and the mean $\mathbb{E}f(\mathbf{X})=a\in[0,1]$, which Boolean function $f$ maximizes the $\Phi$-stability $\mathbb{E}[\Phi(T_{\rho}f(\mathbf{X}))]$ of $f$? Special cases of this problem include the (symmetric and asymmetric) $\alpha$-stability problems and the "Most Informative Boolean Function" problem. In this paper, we provide several upper bounds for the maximal $\Phi$-stability. When specializing $\Phi$ to some particular forms, by these upper bounds, we partially resolve Mossel and O'Donnell's conjecture on $\alpha$-stability with $\alpha>2$, Li and M\'edard's conjecture on $\alpha$-stability with $1<\alpha<2$, and Courtade and Kumar's conjecture on the "Most Informative Boolean Function", which corresponds to a conjecture on $\alpha$-stability with $\alpha=1$. Our proofs are based on discrete Fourier analysis, optimization theory, and improvements of the Friedgut--Kalai--Naor (FKN) theorem. Our improvements of the FKN theorem are sharp or asymptotically sharp for certain cases.
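For reference, the objects in question admit standard closed forms (textbook definitions restated by us, not taken from the paper):

```latex
% Noise operator: each coordinate of X is kept with probability (1 + rho)/2
T_\rho f(x) = \mathbb{E}\bigl[ f(\mathbf{Y}) \mid \mathbf{X} = x \bigr],
\qquad \Pr[Y_i = x_i] = \tfrac{1+\rho}{2} \ \text{independently for each } i;
% equivalent Fourier-analytic form:
T_\rho f = \sum_{S \subseteq [n]} \rho^{|S|}\, \hat{f}(S)\, \chi_S,
\qquad \chi_S(x) = \prod_{i \in S} x_i;
% the quantity being maximized, subject to E f(X) = a:
\mathbb{E}\bigl[ \Phi\bigl( T_\rho f(\mathbf{X}) \bigr) \bigr].
```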

The discrete Wasserstein barycenter problem is a minimum-cost mass transport problem for a set of probability measures with finite support. In this paper, we show that finding a barycenter of sparse support is hard, even in dimension 2 and for only 3 measures. We prove this claim by showing that a special case of an intimately related decision problem, SCMP -- does there exist a measure with a non-mass-splitting transport cost and support size below prescribed bounds? -- is NP-hard for all rational data. Our proof is based on a reduction from planar 3-dimensional matching and follows a strategy laid out by Spieksma and Woeginger (1996) for a reduction to planar, minimum circumference 3-dimensional matching. While we closely mirror the actual steps of their proof, the arguments themselves differ fundamentally due to the complex nature of the discrete barycenter problem. Containment of SCMP in NP remains open. We prove that, for a given measure, the sparsity and cost of an optimal transport to a set of measures can be verified in polynomial time in the size of a bit encoding of the measure. However, the encoding size of a barycenter may be exponential in the encoding size of the underlying measures.
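To fix notation (our restatement of the standard formulation, assuming uniform weights for simplicity): given measures $\mu_1, \dots, \mu_k$ with finite support, a discrete barycenter is any minimizer

```latex
\bar{\mu} \;\in\; \operatorname*{argmin}_{\nu} \; \frac{1}{k} \sum_{i=1}^{k} W_2^2(\nu, \mu_i),
```

where $W_2$ is the 2-Wasserstein distance with squared Euclidean ground cost. SCMP then asks whether some $\nu$ meets prescribed bounds on both its support size and its (non-mass-splitting) transport cost.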

We develop a minimax rate analysis to explain why deep neural networks (DNNs) perform better than other standard methods. For nonparametric regression problems, it is well known that many standard methods attain the minimax optimal rate of estimation error for smooth functions, so it is not straightforward to identify the theoretical advantages of DNNs. This study fills this gap by considering estimation for a class of non-smooth functions that have singularities on hypersurfaces. Our findings are as follows: (i) We derive the generalization error of a DNN estimator and prove that its convergence rate is almost optimal. (ii) We elucidate a phase diagram of estimation problems, which describes the situations where DNNs outperform a general class of estimators, including kernel methods, Gaussian process methods, and others. We additionally show that DNNs outperform harmonic-analysis-based estimators. This advantage of DNNs comes from the fact that the shape of a singularity can be successfully handled by their multi-layered structure.
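As one concrete instance of such a class (our illustration; the paper's exact parameterization may differ), take a function whose two smooth pieces are glued along a smooth hypersurface:

```latex
f(x) \;=\; g_1(x)\, \mathbf{1}\{ h(x) \ge 0 \} \;+\; g_2(x)\, \mathbf{1}\{ h(x) < 0 \},
```

with $g_1, g_2, h$ smooth and $g_1 \neq g_2$ on $\{h = 0\}$, so that $f$ is non-smooth precisely on the hypersurface $\{x : h(x) = 0\}$.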

Kernel-based models such as kernel ridge regression and Gaussian processes are ubiquitous in machine learning applications for regression and optimization. It is well known that a serious downside of kernel-based models is their high computational cost; given a dataset of $n$ samples, the cost grows as $\mathcal{O}(n^3)$. Existing sparse approximation methods can yield a significant reduction in the computational cost, effectively reducing the real-world cost to as low as $\mathcal{O}(n)$ in certain cases. Despite this remarkable empirical success, significant gaps remain in the existing results for analytical confidence bounds on the error due to approximation. In this work, we provide novel confidence intervals for the Nystr\"om method and the sparse variational Gaussian process approximation method. Our confidence intervals lead to improved error bounds in both regression and optimization. We establish these confidence intervals using novel interpretations of the approximate (surrogate) posterior variance of the models.
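While the paper's confidence intervals are its own contribution, the Nystr\"om approximation itself is standard: pick $m \ll n$ landmark points and replace the full kernel matrix by a low-rank surrogate. A minimal kernel-ridge-regression sketch (function names, the RBF kernel choice, and all parameter values are ours):

```python
import numpy as np

def rbf(X, Y, gamma=1.0):
    """Gaussian (RBF) kernel matrix between row-sets X and Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def nystrom_krr(X, y, m=50, lam=1e-3, gamma=1.0, seed=0):
    """Kernel ridge regression with a Nystrom approximation (sketch).

    Using m << n landmarks drops the cost from O(n^3) to O(n m^2):
    solve (Kmn Knm + lam Kmm) a = Kmn y instead of the full n x n system.
    """
    rng = np.random.default_rng(seed)
    Z = X[rng.choice(len(X), size=m, replace=False)]   # landmark points
    Knm = rbf(X, Z, gamma)                             # n x m cross-kernel
    Kmm = rbf(Z, Z, gamma)                             # m x m landmark kernel
    A = Knm.T @ Knm + lam * Kmm
    a = np.linalg.solve(A + 1e-8 * np.eye(m), Knm.T @ y)  # jitter for stability
    return lambda Xt: rbf(Xt, Z, gamma) @ a

# toy usage: n = 2000 noisy samples of sin(x), m = 50 landmarks
rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(2000, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(2000)
predict = nystrom_krr(X, y)
```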

Much of the past work on fairness in machine learning has focused on forcing the predictions of classifiers to have similar statistical properties for individuals of different demographics. Yet, such methods often simply rescale the classifier scores and ignore whether individuals of different groups have similar features. Our proposed method, Optimal Transport to Fairness (OTF), applies Optimal Transport (OT) to take this similarity into account by quantifying unfairness as the smallest OT cost between the classifier's scores and any score function that satisfies the fairness constraints. For a flexible class of linear fairness constraints, we show a practical way to compute OTF as an unfairness cost term that can be added to any standard classification setting. Experiments show that OTF can be used to achieve an effective trade-off between predictive power and fairness.
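As intuition for using an OT cost as an unfairness measure, here is a deliberately simplified sketch that computes the exact OT cost between the score samples of two groups (with equal group sizes and uniform weights, OT reduces to an assignment problem). The paper's OTF instead transports between the classifier's scores and the nearest score function satisfying linear fairness constraints, which this toy does not reproduce:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def ot_unfairness(scores_a, scores_b):
    """Exact OT cost between two equal-size score samples (illustration only)."""
    C = (scores_a[:, None] - scores_b[None, :]) ** 2   # squared score gap
    rows, cols = linear_sum_assignment(C)              # optimal matching
    return C[rows, cols].mean()

rng = np.random.default_rng(0)
unfairness = ot_unfairness(rng.uniform(size=100), rng.uniform(size=100))
```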

We define the $d$-defective incidence chromatic number of a graph, generalizing the notion of incidence chromatic number, and determine it for some classes of graphs including trees, complete bipartite graphs, complete graphs, and outerplanar graphs. Fast algorithms for constructing the optimal $d$-defective incidence colorings of those graphs are presented.
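For context, the underlying notions are as follows (standard definitions restated by us; the precise meaning of the defective variant is our reading and should be checked against the paper):

```latex
% Incidence set of a graph G (Brualdi--Massey, 1993):
I(G) = \{\, (v,e) : v \in V(G),\ e \in E(G),\ v \text{ incident to } e \,\};
% (v,e) and (w,f) are adjacent iff  v = w,  or  e = f,  or  vw \in \{e, f\}.
% The incidence chromatic number is the least number of colors such that
% adjacent incidences receive distinct colors; a d-defective coloring
% presumably allows each incidence up to d like-colored adjacent incidences.
```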

We study ROUND-UFP and ROUND-SAP, two generalizations of the classical BIN PACKING problem that correspond to the unsplittable flow problem on a path (UFP) and the storage allocation problem (SAP), respectively. We are given a path with capacities on its edges and a set of tasks, where for each task we are given a demand and a subpath. In ROUND-UFP, the goal is to find a packing of all tasks into a minimum number of copies (rounds) of the given path such that for each copy, the total demand of tasks on any edge does not exceed the capacity of the respective edge. In ROUND-SAP, the tasks are considered to be rectangles and the goal is to find a non-overlapping packing of these rectangles into a minimum number of rounds such that all rectangles lie completely below the capacity profile of the edges. We show that, in contrast to BIN PACKING, neither problem admits an asymptotic polynomial-time approximation scheme (APTAS), even when all edge capacities are equal. However, for this setting, we obtain asymptotic $(2+\varepsilon)$-approximations for both problems. For the general case, we obtain an $O(\log\log n)$-approximation algorithm and an $O(\log\log\frac{1}{\delta})$-approximation under $(1+\delta)$-resource augmentation for both problems. For the intermediate setting of the no-bottleneck assumption (i.e., the maximum task demand is at most the minimum edge capacity), we obtain absolute $12$- and asymptotic $(16+\varepsilon)$-approximation algorithms for ROUND-UFP and ROUND-SAP, respectively.
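The approximation algorithms themselves are not reproduced in the abstract. As a baseline for intuition only, a simple first-fit heuristic for ROUND-UFP (our sketch, not the paper's $O(\log\log n)$-approximation; the task encoding and feasibility assumption are ours) can be written as:

```python
def first_fit_round_ufp(tasks, capacities):
    """First-fit heuristic for ROUND-UFP (illustration, no guarantee claimed).

    tasks: list of (s, e, d) with subpath covering edges s, ..., e - 1
           and demand d; assumes d fits under the capacities on its subpath.
    capacities: list of per-edge capacities of the path.
    Returns (assignment of tasks to rounds, number of rounds used).
    """
    rounds = []       # residual per-edge capacity, one list per round
    assignment = []
    for s, e, d in sorted(tasks, key=lambda t: -t[2]):  # large demands first
        for i, residual in enumerate(rounds):
            if all(residual[j] >= d for j in range(s, e)):
                break
        else:                                  # no open round fits: open one
            rounds.append(list(capacities))
            i = len(rounds) - 1
        for j in range(s, e):
            rounds[i][j] -= d
        assignment.append((s, e, d, i))
    return assignment, len(rounds)

# toy instance: 4 edges of capacity 2, five unit-demand tasks
assignment, n_rounds = first_fit_round_ufp(
    [(0, 2, 1), (1, 3, 1), (2, 4, 1), (0, 4, 1), (1, 2, 1)], [2, 2, 2, 2])
```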

Optimal transport distances have found many applications in machine learning for their capacity to compare non-parametric probability distributions. Yet their algorithmic complexity generally prevents their direct use on large-scale datasets. Among the possible strategies to alleviate this issue, practitioners can rely on computing estimates of these distances over subsets of data, {\em i.e.}, minibatches. While computationally appealing, we highlight in this paper some limits of this strategy, arguing that it can lead to undesirable smoothing effects. As an alternative, we suggest that the same minibatch strategy coupled with unbalanced optimal transport can yield more robust behavior. We discuss the associated theoretical properties, such as unbiased estimators, existence of gradients, and concentration bounds. Our experimental study shows that in challenging problems associated with domain adaptation, the use of unbalanced optimal transport leads to significantly better results, competing with or surpassing recent baselines.
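A minimal version of the plain (balanced) minibatch estimator, for intuition about the smoothing effect the paper criticizes, follows below; the unbalanced variant the paper advocates relaxes the marginal constraints and is not reproduced here (all names and batch settings are ours):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def minibatch_ot(X, Y, batch=64, n_batches=50, seed=0):
    """Average of exact OT costs over random minibatch pairs (sketch).

    With equal batch sizes and uniform weights, each minibatch OT is an
    assignment problem; averaging over batches gives the minibatch estimate,
    which is biased upward in expectation relative to the full OT cost.
    """
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_batches):
        xb = X[rng.choice(len(X), batch, replace=False)]
        yb = Y[rng.choice(len(Y), batch, replace=False)]
        C = ((xb[:, None, :] - yb[None, :, :]) ** 2).sum(-1)
        rows, cols = linear_sum_assignment(C)
        total += C[rows, cols].mean()
    return total / n_batches

rng = np.random.default_rng(1)
est = minibatch_ot(rng.normal(0, 1, (1000, 2)), rng.normal(1, 1, (1000, 2)))
```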

In this work, we consider the distributed optimization of non-smooth convex functions using a network of computing units. We investigate this problem under two regularity assumptions: (1) the Lipschitz continuity of the global objective function, and (2) the Lipschitz continuity of local individual functions. Under the local regularity assumption, we provide the first optimal first-order decentralized algorithm called multi-step primal-dual (MSPD) and its corresponding optimal convergence rate. A notable aspect of this result is that, for non-smooth functions, while the dominant term of the error is in $O(1/\sqrt{t})$, the structure of the communication network only impacts a second-order term in $O(1/t)$, where $t$ is time. In other words, the error due to limits in communication resources decreases at a fast rate even in the case of non-strongly-convex objective functions. Under the global regularity assumption, we provide a simple yet efficient algorithm called distributed randomized smoothing (DRS) based on a local smoothing of the objective function, and show that DRS is within a $d^{1/4}$ multiplicative factor of the optimal convergence rate, where $d$ is the underlying dimension.
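The smoothing at the heart of DRS replaces $f$ by $f_\gamma(x) = \mathbb{E}[f(x + \gamma Z)]$ with $Z \sim \mathcal{N}(0, I)$, whose gradient admits the standard Monte Carlo estimator $\nabla f_\gamma(x) = \gamma^{-1}\,\mathbb{E}[f(x + \gamma Z)\, Z]$. A single-machine sketch of this estimator (the paper's distributed communication scheme is not reproduced; names and step sizes are ours):

```python
import numpy as np

def smoothed_grad(f, x, gamma=0.1, n_samples=32, rng=None):
    """Monte Carlo gradient of the Gaussian smoothing f_gamma (sketch).

    Uses the identity grad f_gamma(x) = E[f(x + gamma Z) Z] / gamma,
    Z ~ N(0, I), which needs only zeroth-order evaluations of f.
    """
    rng = rng or np.random.default_rng(0)
    Z = rng.standard_normal((n_samples, x.size))
    vals = np.array([f(x + gamma * z) for z in Z])
    return (vals[:, None] * Z).mean(axis=0) / gamma

# toy: descent on the non-smooth f(x) = ||x||_1 without subgradients
f = lambda x: np.abs(x).sum()
rng = np.random.default_rng(1)
x = np.ones(5)
for _ in range(200):
    x = x - 0.05 * smoothed_grad(f, x, rng=rng)
```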
