
We formulate a free probabilistic analog of the Wasserstein manifold on $\mathbb{R}^d$ (the formal Riemannian manifold of smooth probability densities on $\mathbb{R}^d$), and we use it to study smooth non-commutative transport of measure. The points of the free Wasserstein manifold $\mathscr{W}(\mathbb{R}^{*d})$ are smooth tracial non-commutative functions $V$ with quadratic growth at $\infty$, which correspond to minus the log-density in the classical setting. The space of smooth tracial non-commutative functions used here is a new one whose definition and basic properties we develop in the paper; they are scalar-valued functions of self-adjoint $d$-tuples from arbitrary tracial von Neumann algebras that can be approximated by trace polynomials. The space of non-commutative diffeomorphisms $\mathscr{D}(\mathbb{R}^{*d})$ acts on $\mathscr{W}(\mathbb{R}^{*d})$ by transport, and the basic relationship between tangent vectors for $\mathscr{D}(\mathbb{R}^{*d})$ and tangent vectors for $\mathscr{W}(\mathbb{R}^{*d})$ is described using the Laplacian $L_V$ associated to $V$ and its pseudo-inverse $\Psi_V$ (when defined). Following arguments similar to those of arXiv:1204.2182, arXiv:1701.00132, and arXiv:1906.10051 in the new setting, we give a rigorous proof of the existence of smooth transport along any path $t \mapsto V_t$ when $V$ is sufficiently close to $(1/2) \sum_j \operatorname{tr}(x_j^2)$, as well as of smooth triangular transport.
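
For orientation, here is a hedged sketch of the classical picture that the free construction mimics; the notation and sign conventions below are illustrative rather than the paper's.

% Classical Wasserstein-manifold picture: a potential V encodes the density rho_V,
% tangent vectors arise from the continuity equation, and L_V is the weighted Laplacian.
\[
  \rho_V = \frac{e^{-V}}{\int e^{-V}\,dx}, \qquad
  \partial_t \rho_t + \nabla\cdot(\rho_t \nabla\phi_t) = 0, \qquad
  L_V\phi = \Delta\phi - \nabla V\cdot\nabla\phi .
\]
% For a smooth path t -> V_t with densities rho_t = rho_{V_t}, the transporting
% vector field is nabla phi_t, where
\[
  L_{V_t}\phi_t = \dot V_t - \int \dot V_t\, \rho_t\,dx ,
\]
% which is the kind of equation inverted by the pseudo-inverse Psi_V in the free analogue.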

Related Content

When are inferences (whether Direct-Likelihood, Bayesian, or Frequentist) obtained from partial data valid? This paper answers this question by offering a new theory of inference with missing data. It proves that as the sample size increases and the extent of missingness decreases, the mean log-likelihood function generated by partial data while ignoring the missingness mechanism will almost surely converge uniformly to the one that would have been generated by complete data; and if the data are Missing at Random (or "partially missing at random"), this convergence depends only on sample size. Thus, inferences from partial data, such as posterior modes, uncertainty estimates, confidence intervals, likelihood ratios, and indeed all quantities or features derived from the partial-data log-likelihood function, will be consistently estimated and will approximate their complete-data analogues. This adds to previous research, which has only proved the consistency of the posterior mode. Practical implications of this result are discussed, and the theory is verified using a previous study of International Human Rights Law.
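
For context, the standard factorization behind "ignoring the missingness mechanism" (Rubin-style notation, used here only for illustration):

% Full-data likelihood with missingness indicators m and mechanism parameters psi:
\[
  f(y_{\mathrm{obs}}, m \mid \theta, \psi)
  = \int f(y_{\mathrm{obs}}, y_{\mathrm{mis}} \mid \theta)\,
         f(m \mid y_{\mathrm{obs}}, y_{\mathrm{mis}}, \psi)\, dy_{\mathrm{mis}} .
\]
% Under MAR, f(m | y_obs, y_mis, psi) = f(m | y_obs, psi), so (with theta and psi distinct)
% the observed-data likelihood that ignores the mechanism,
\[
  L_{\mathrm{ign}}(\theta \mid y_{\mathrm{obs}})
  = \int f(y_{\mathrm{obs}}, y_{\mathrm{mis}} \mid \theta)\, dy_{\mathrm{mis}} ,
\]
% is proportional in theta to the full likelihood; the paper's contribution concerns the
% uniform almost-sure convergence of the corresponding mean log-likelihood functions.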

We propose nonparametric estimators for the second-order central moments of spherical random fields within a functional data context. We consider a measurement framework where each field among an identically distributed collection of spherical random fields is sampled at a few random directions, possibly subject to measurement error. The collection of fields could be i.i.d. or serially dependent. Though similar setups have already been explored for random functions defined on the unit interval, the nonparametric estimators proposed in the literature often rely on local polynomials, which do not readily extend to the (product) spherical setting. We therefore formulate our estimation procedure as a variational problem involving a generalized Tikhonov regularization term. The latter favours smooth covariance/autocovariance functions, where the smoothness is specified by means of suitable Sobolev-like pseudo-differential operators. Using the machinery of reproducing kernel Hilbert spaces, we establish representer theorems that fully characterize the form of our estimators. We determine their uniform rates of convergence as the number of fields diverges, both for the dense (increasing number of spatial samples) and sparse (bounded number of spatial samples) regimes. We moreover validate and demonstrate the practical feasibility of our estimation procedure in a simulation setting, assuming a fixed number of samples per field. Our numerical estimation procedure leverages the sparsity and second-order Kronecker structure of our setup to reduce the computational and memory requirements by approximately three orders of magnitude compared with what a naive implementation would require.
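
One plausible shape of the penalized variational problem described above, shown only for orientation; all symbols here are illustrative and not taken from the paper.

% Penalized least squares over an RKHS H of candidate covariance kernels C on S^2 x S^2:
\[
  \hat C \;=\; \operatorname*{arg\,min}_{C \in \mathcal{H}}\;
  \frac{1}{N}\sum_{i=1}^{N} \sum_{j \neq k}
     \big( Y_{ij} Y_{ik} - C(\mathbf{x}_{ij}, \mathbf{x}_{ik}) \big)^2
  \;+\; \lambda \,\big\| \mathcal{Q}\, C \big\|_{L^2(\mathbb{S}^2\times\mathbb{S}^2)}^2 ,
\]
% where Y_ij are the (noisy, centred) observations of field i at random directions x_ij,
% Q is a Sobolev-like pseudo-differential penalty operator on the product sphere, and a
% representer theorem reduces the search over H to a finite-dimensional problem.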

We develop a probabilistic characterisation of trajectorial expansion rates in non-autonomous stochastic dynamical systems that can be defined over a finite time interval and used for subsequent uncertainty quantification in Lagrangian (trajectory-based) predictions. These expansion rates are quantified via certain divergences (pre-metrics) between probability measures induced by the laws of the stochastic flow associated with the underlying dynamics. We construct scalar fields of finite-time divergence/expansion rates and show their existence and space-time continuity for general stochastic flows. Combining these divergence rate fields with previously derived 'information inequalities' allows for quantification and mitigation of the uncertainty in path-based observables estimated from simplified models in a way that is amenable to algorithmic implementations; it can also be utilised in information-geometric analysis of statistical estimation and inference, as well as in data-driven machine/deep learning of coarse-grained models. We also derive a link between the divergence rates and finite-time Lyapunov exponents for probability measures and for path-based observables.
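
For reference, the finite-time Lyapunov exponent mentioned at the end is the classical deterministic quantity to which the divergence rates are linked; a standard (illustrative) definition over a window $[t_0, t_0+T]$ is

\[
  \lambda_{t_0}^{T}(x) \;=\; \frac{1}{T}\,
  \log \sigma_{\max}\!\big( \nabla_x \varphi_{t_0}^{\,t_0+T}(x) \big),
\]
% where phi_{t_0}^{t_0+T} is the flow map and sigma_max the largest singular value of its
% Jacobian, i.e. the largest stretching factor of an infinitesimal perturbation of the
% initial condition x over the window. The paper's divergence-rate fields play an analogous
% role for probability measures transported by stochastic flows.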

This paper studies a distributionally robust portfolio optimization model with a cardinality constraint limiting the number of invested assets. We formulate this model as a mixed-integer semidefinite optimization (MISDO) problem by means of a moment-based uncertainty set of probability distributions of asset returns. To solve large-scale problems exactly, we propose a specialized cutting-plane algorithm based on a bilevel optimization reformulation, and we prove its finite convergence. We also apply a matrix completion technique to the lower-level SDO problems to make them substantially smaller. Numerical experiments demonstrate that our cutting-plane algorithm is significantly faster than the state-of-the-art MISDO solver SCIP-SDP. We also show that our portfolio optimization model can achieve good investment performance compared with the conventional mean-variance model.
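
As a rough, hedged illustration of the model class (the exact uncertainty set, loss, and reformulation are the paper's; the Delage-Ye-type moment set below is only one common choice):

\[
  \min_{x \in \mathcal{X},\; \|x\|_0 \le k}\;
  \max_{\mathbb{P} \in \mathcal{U}(\hat\mu,\hat\Sigma)}\;
  \mathbb{E}_{\mathbb{P}}\big[\, \ell(x, \xi) \,\big],
\]
\[
  \mathcal{U}(\hat\mu,\hat\Sigma) = \Big\{ \mathbb{P} :\;
  (\mathbb{E}_{\mathbb{P}}[\xi]-\hat\mu)^\top \hat\Sigma^{-1} (\mathbb{E}_{\mathbb{P}}[\xi]-\hat\mu) \le \gamma_1,\;
  \mathbb{E}_{\mathbb{P}}\big[(\xi-\hat\mu)(\xi-\hat\mu)^\top\big] \preceq \gamma_2 \hat\Sigma \Big\},
\]
% with xi the asset returns, ell a convex loss in xi (e.g. a disutility of the portfolio
% return -x^T xi), the cardinality constraint modelled via binary indicator variables, and
% dualization of the inner maximization producing the semidefinite constraints of the MISDO.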

The cumulative distribution or probability density of a random variable that is itself a function of a large number of independent real-valued random variables can be formulated as a high-dimensional integral of an indicator or a Dirac $\delta$ function, respectively. To approximate the distribution or density at a point, we carry out preintegration with respect to one suitably chosen variable and then apply a quasi-Monte Carlo method to compute the integral of the resulting smoother function. Interpolation is then used to reconstruct the distribution or density on an interval. We provide rigorous regularity and error analysis for the preintegrated function to show that our estimators achieve nearly first-order convergence. Numerical results support the theory.
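
A minimal numerical sketch of the pipeline for a toy example (a linear combination of independent Gaussians), where the preintegration over the first variable can be done in closed form; the function names and the toy integrand are mine, not the paper's.

# Estimate F(t) = P(theta_1 X_1 + ... + theta_d X_d <= t), X_j ~ N(0,1) i.i.d.:
# preintegrate analytically over X_1, then apply QMC to the remaining variables.
import numpy as np
from scipy.stats import norm, qmc

def cdf_preint_qmc(t, theta, n_pts=2**12, seed=0):
    theta = np.asarray(theta, dtype=float)      # theta[0] assumed > 0
    sobol = qmc.Sobol(d=len(theta) - 1, scramble=True, seed=seed)
    z = norm.ppf(sobol.random(n_pts))           # QMC points mapped to N(0,1)^(d-1)
    rest = z @ theta[1:]                        # contribution of X_2, ..., X_d
    # Preintegration: P(theta_1 X_1 <= t - rest | X_2..X_d) = Phi((t - rest)/theta_1),
    # a smooth function of the remaining variables, well suited to QMC.
    return norm.cdf((t - rest) / theta[0]).mean()

# Sanity check against the exact answer Phi(t / ||theta||) for this toy case:
theta = [0.6, 0.3, 0.1]
print(cdf_preint_qmc(1.0, theta), norm.cdf(1.0 / np.linalg.norm(theta)))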

The central levels problem asserts that the subgraph of the $(2m+1)$-dimensional hypercube induced by all bitstrings with at least $m+1-\ell$ many 1s and at most $m+\ell$ many 1s, i.e., the vertices in the middle $2\ell$ levels, has a Hamilton cycle for any $m\geq 1$ and $1\le \ell\le m+1$. This problem was raised independently by Buck and Wiedemann, by Savage, by Gregor and \v{S}krekovski, and by Shen and Williams, and it is a common generalization of the well-known middle levels problem, namely the case $\ell=1$, and of classical binary Gray codes, namely the case $\ell=m+1$. In this paper we present a general constructive solution of the central levels problem. Our results also imply the existence of optimal cycles through any sequence of $\ell$ consecutive levels in the $n$-dimensional hypercube for any $n\ge 1$ and $1\le \ell \le n+1$. Moreover, extending an earlier construction by Streib and Trotter, we construct a Hamilton cycle through the $n$-dimensional hypercube, $n\geq 2$, that contains the symmetric chain decomposition constructed by Greene and Kleitman in the 1970s, and we provide a loopless algorithm for computing the corresponding Gray code.
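
For the boundary case $\ell=m+1$ mentioned above, the classical binary reflected Gray code already gives a Hamilton cycle through all levels; here is a minimal sketch of that classical code (not the paper's construction for general $\ell$).

# n-bit binary reflected Gray code: consecutive strings (and the last/first pair)
# differ in exactly one bit, i.e. a Hamilton cycle through the n-dimensional hypercube.
def gray_code(n):
    return [format(i ^ (i >> 1), f"0{n}b") for i in range(2 ** n)]

print(gray_code(3))
# ['000', '001', '011', '010', '110', '111', '101', '100']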

Many resource allocation problems in the cloud can be described as a basic Virtual Network Embedding Problem (VNEP): finding mappings of request graphs (describing the workloads) onto a substrate graph (describing the physical infrastructure). In the offline setting, the two natural objectives are profit maximization, i.e., embedding a maximal number of request graphs subject to the resource constraints, and cost minimization, i.e., embedding all requests at minimal overall cost. The VNEP can be seen as a generalization of classic routing and call admission problems, in which requests are arbitrary graphs whose communication endpoints are not fixed. Due to its applications, the problem has been studied intensively in the networking community. However, the underlying algorithmic problem is hardly understood. This paper presents the first fixed-parameter tractable approximation algorithms for the VNEP. Our algorithms are based on randomized rounding. Due to the flexible mapping options and the arbitrary request graph topologies, we show that a novel linear programming formulation is required: only with this formulation, which accounts for the structure of the request graphs, can convex combinations of valid mappings be computed. Accordingly, to capture the structure of request graphs, we introduce the graph-theoretic notions of extraction orders and extraction width and show that our algorithms have runtime exponential in the request graphs' maximal extraction width. Hence, for request graphs of fixed extraction width, we obtain the first polynomial-time approximations. Studying the new notion of extraction orders, we show (i) that computing extraction orders of minimal width is NP-hard and (ii) that computing decomposable LP solutions is in general NP-hard, even when request graphs are restricted to planar ones.
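
A generic sketch of the rounding step only, assuming the convex decomposition of the LP solution into valid mappings (the part enabled by the novel formulation) is already available; the names and data layout are mine.

# Round a fractional solution: for each request, the LP decomposition gives pairs
# (probability, valid mapping) summing to at most 1; sample one mapping per request,
# leaving the request unembedded with the remaining probability.
import random

def randomized_rounding(decompositions, rng=random.Random(0)):
    embedding = {}
    for request, convex_combination in decompositions.items():
        r, acc = rng.random(), 0.0
        for prob, mapping in convex_combination:
            acc += prob
            if r < acc:
                embedding[request] = mapping
                break
    return embedding

A full algorithm would additionally need to control capacity violations across the sampled mappings (e.g., by repeating rounding trials), which this sketch omits.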

We propose the Wasserstein Auto-Encoder (WAE)---a new algorithm for building a generative model of the data distribution. WAE minimizes a penalized form of the Wasserstein distance between the model distribution and the target distribution, which leads to a different regularizer than the one used by the Variational Auto-Encoder (VAE). This regularizer encourages the encoded training distribution to match the prior. We compare our algorithm with several other techniques and show that it is a generalization of adversarial auto-encoders (AAE). Our experiments show that WAE shares many of the properties of VAEs (stable training, encoder-decoder architecture, nice latent manifold structure) while generating samples of better quality, as measured by the FID score.
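
The penalized objective referred to above has, up to notation, the form

\[
  \inf_{Q(Z\mid X) \in \mathcal{Q}}\;
  \mathbb{E}_{P_X}\,\mathbb{E}_{Q(Z\mid X)}\big[\, c\big(X, G(Z)\big) \big]
  \;+\; \lambda\, \mathcal{D}_Z\big(Q_Z, P_Z\big),
\]
% where c is a reconstruction cost, G the decoder, Q_Z the aggregated encoding distribution,
% P_Z the prior, and D_Z the regularizer matching Q_Z to P_Z (a GAN-based divergence or the
% MMD in the two WAE variants).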

We investigate the training and performance of generative adversarial networks using the Maximum Mean Discrepancy (MMD) as critic, termed MMD GANs. As our main theoretical contribution, we clarify the situation with bias in GAN loss functions raised by recent work: we show that gradient estimators used in the optimization process for both MMD GANs and Wasserstein GANs are unbiased, but learning a discriminator based on samples leads to biased gradients for the generator parameters. We also discuss the issue of kernel choice for the MMD critic, and characterize the kernel corresponding to the energy distance used for the Cramer GAN critic. Being an integral probability metric, the MMD benefits from training strategies recently developed for Wasserstein GANs. In experiments, the MMD GAN is able to employ a smaller critic network than the Wasserstein GAN, resulting in a simpler and faster-training algorithm with matching performance. We also propose an improved measure of GAN convergence, the Kernel Inception Distance, and show how to use it to dynamically adapt learning rates during GAN training.
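
For concreteness, a minimal sketch of the unbiased squared-MMD estimator underlying the MMD critic; the Gaussian kernel here is just one choice (the paper also discusses kernel choice, including the distance-induced kernel corresponding to the energy distance).

# Unbiased U-statistic estimator of MMD^2 between samples X (n x d) and Y (m x d)
# with a Gaussian (RBF) kernel of bandwidth sigma.
import numpy as np

def mmd2_unbiased(X, Y, sigma=1.0):
    def k(A, B):
        sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-sq / (2.0 * sigma ** 2))
    Kxx, Kyy, Kxy = k(X, X), k(Y, Y), k(X, Y)
    n, m = len(X), len(Y)
    # Diagonal terms are dropped so that each expectation is estimated without bias.
    return ((Kxx.sum() - np.trace(Kxx)) / (n * (n - 1))
            + (Kyy.sum() - np.trace(Kyy)) / (m * (m - 1))
            - 2.0 * Kxy.mean())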

Generative Adversarial Networks (GANs) are powerful generative models, but suffer from training instability. The recently proposed Wasserstein GAN (WGAN) makes progress toward stable training of GANs, but sometimes can still generate only low-quality samples or fail to converge. We find that these problems are often due to the use of weight clipping in WGAN to enforce a Lipschitz constraint on the critic, which can lead to undesired behavior. We propose an alternative to clipping weights: penalize the norm of gradient of the critic with respect to its input. Our proposed method performs better than standard WGAN and enables stable training of a wide variety of GAN architectures with almost no hyperparameter tuning, including 101-layer ResNets and language models over discrete data. We also achieve high quality generations on CIFAR-10 and LSUN bedrooms.
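
A minimal PyTorch-style sketch of the penalty described above (penalizing the norm of the critic's gradient with respect to its input, evaluated at points interpolated between real and generated samples); the helper name and defaults are mine.

# Gradient penalty: push the norm of the critic's input gradient toward 1.
import torch

def gradient_penalty(critic, real, fake, lambda_gp=10.0):
    eps = torch.rand(real.size(0), *([1] * (real.dim() - 1)), device=real.device)
    x_hat = (eps * real + (1.0 - eps) * fake).requires_grad_(True)
    scores = critic(x_hat)
    grads = torch.autograd.grad(
        outputs=scores, inputs=x_hat,
        grad_outputs=torch.ones_like(scores),
        create_graph=True, retain_graph=True,
    )[0]
    grad_norm = grads.flatten(start_dim=1).norm(2, dim=1)
    return lambda_gp * ((grad_norm - 1.0) ** 2).mean()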
