
In this work, we propose a numerical method to compute the Wasserstein Hamiltonian flow (WHF), which is a Hamiltonian system on the probability density manifold. Many well-known PDE systems can be reformulated as WHFs. We use a parameterized function as the push-forward map to characterize the solution of the WHF, converting the PDE into a finite-dimensional ODE system, which is a Hamiltonian system in the phase space of the parameter manifold. We establish error-analysis results for the continuous-time approximation scheme in the Wasserstein metric. For the numerical implementation, we use neural networks as push-forward maps. We apply an effective symplectic scheme to solve the derived Hamiltonian ODE system, so the method preserves important quantities such as the total energy. The computation is carried out by a fully deterministic symplectic integrator without any neural network training. Thus, our method does not involve direct optimization over network parameters and hence avoids the error introduced by stochastic gradient descent (SGD) methods, which is usually hard to quantify and measure. The proposed algorithm is a sampling-based approach that scales well to higher-dimensional problems. In addition, the method provides an alternative connection between the Lagrangian and Eulerian perspectives on the original WHF through the parameterized ODE dynamics.
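To make the last step concrete, the sketch below shows a Störmer-Verlet (leapfrog) step, a standard symplectic integrator, applied to a finite-dimensional Hamiltonian ODE in the parameters. The separable Hamiltonian and its gradients here are hypothetical stand-ins, not the WHF Hamiltonian induced by a particular push-forward map.

```python
import numpy as np

# Minimal sketch: a Stormer-Verlet (leapfrog) step for a separable
# finite-dimensional Hamiltonian system H(theta, p) = K(p) + U(theta),
# as one might obtain after parameterizing the push-forward map.
# The Hamiltonian below is a stand-in, not the WHF Hamiltonian itself.

def grad_U(theta):          # hypothetical potential gradient
    return theta            # e.g. U(theta) = |theta|^2 / 2

def grad_K(p):              # hypothetical kinetic gradient
    return p                # e.g. K(p) = |p|^2 / 2

def leapfrog_step(theta, p, dt):
    """One symplectic update; preserves the symplectic 2-form exactly
    and the energy approximately (no secular drift over long times)."""
    p_half = p - 0.5 * dt * grad_U(theta)
    theta_new = theta + dt * grad_K(p_half)
    p_new = p_half - 0.5 * dt * grad_U(theta_new)
    return theta_new, p_new

theta, p = np.ones(4), np.zeros(4)
for _ in range(1000):
    theta, p = leapfrog_step(theta, p, dt=1e-2)
```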

Related Content


Uncertainty in the timing information pertaining to the start times of microphone recordings and the emission times of sources poses significant challenges in various applications, such as joint microphone and source localization. Traditional optimization methods, which directly estimate this unknown timing information (UTIm), often fall short compared to approaches exploiting the low-rank property (LRP). The LRP encompasses an additional low-rank structure, facilitating a linear constraint on the UTIm that helps formulate the related low-rank structural information. This method allows us to attain globally optimal solutions for the UTIm, given proper initialization. However, the initialization process often involves randomness, leading to suboptimal, local-minimum values. This paper presents a novel combined low-rank approximation (CLRA) method designed to mitigate the effects of this random initialization. We introduce three new LRP variants, underpinned by mathematical proof, which allow the UTIm to draw on a richer pool of low-rank structural information. Utilizing this augmented low-rank structural information from both the LRP and the proposed variants, we formulate four linear constraints on the UTIm. Employing the proposed CLRA algorithm, we derive globally optimal solutions for the UTIm via these four linear constraints. Experimental results highlight the superior performance of our method over existing state-of-the-art approaches, measured in terms of both the recovery number and the reduced estimation errors of the UTIm.
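As a generic illustration of how low-rank structure can be combined with linear constraints (not the paper's CLRA algorithm or its four specific constraints), the sketch below alternates projections between a rank-r set, via truncated SVD, and a hypothetical affine constraint that fixes a subset of entries.

```python
import numpy as np

# Illustrative only: alternating projections between a rank-r set and an
# affine set {X : A(X) = b}. The CLRA method uses four specific linear
# constraints derived from its LRP variants; the constraint here
# (fixing a few observed entries) is a hypothetical stand-in.

def project_rank(X, r):
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r, :]      # truncated SVD

def project_affine(X, mask, values):
    Y = X.copy()
    Y[mask] = values                            # enforce known entries
    return Y

rng = np.random.default_rng(0)
X = rng.standard_normal((8, 8))
mask = rng.random((8, 8)) < 0.3
values = X[mask]                                # pretend these are measured
for _ in range(200):
    X = project_affine(project_rank(X, r=2), mask, values)
```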

Optimal Transport has sparked vivid interest in recent years, in particular thanks to the Wasserstein distance, which provides a geometrically sensible and intuitive way of comparing probability measures. For computational reasons, the Sliced Wasserstein (SW) distance was introduced as an alternative to the Wasserstein distance, and has been used for training generative neural networks (NNs). While convergence of Stochastic Gradient Descent (SGD) has been observed in practice in such a setting, there is to our knowledge no theoretical guarantee for this observation. Leveraging recent works on the convergence of SGD on non-smooth and non-convex functions by Bianchi et al. (2022), we aim to bridge that knowledge gap and provide a realistic context under which fixed-step SGD trajectories for the SW loss on NN parameters converge. More precisely, we show that the trajectories approach the set of (sub)gradient flow equations as the step size decreases. Under stricter assumptions, we show a much stronger convergence result for noised and projected SGD schemes, namely that the long-run limits of the trajectories approach a set of generalised critical points of the loss function.
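For reference, a minimal Monte-Carlo estimator of $\mathrm{SW}_2^2$ between two point clouds, the kind of loss whose SGD minimisation is analysed above, can be written as follows; the uniform sphere sampling and the number of projections are common choices, not prescriptions from the paper.

```python
import numpy as np

# Monte-Carlo estimator of SW_2^2 between two equal-size point clouds.
# Each projection direction reduces the problem to a 1D Wasserstein
# distance, which for uniform discrete measures is computed by sorting.

def sliced_w2_squared(X, Y, n_proj=100, rng=None):
    rng = rng or np.random.default_rng()
    thetas = rng.standard_normal((n_proj, X.shape[1]))
    thetas /= np.linalg.norm(thetas, axis=1, keepdims=True)
    total = 0.0
    for theta in thetas:
        px, py = np.sort(X @ theta), np.sort(Y @ theta)
        total += np.mean((px - py) ** 2)        # 1D W_2^2 via sorting
    return total / n_proj

X = np.random.default_rng(1).standard_normal((256, 3))
Y = np.random.default_rng(2).standard_normal((256, 3)) + 1.0
print(sliced_w2_squared(X, Y))
```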

An asymptotic preserving (AP) and energy stable scheme for the Euler-Poisson system under the quasineutral scaling is designed and analysed. Correction terms are introduced in the convective fluxes and the electrostatic potential, which lead to the dissipation of mechanical energy and entropy stability. The resolution of the semi-implicit-in-time, finite-volume-in-space fully discrete scheme involves two steps: the solution of an elliptic problem for the potential and an explicit evaluation of the density and velocity. The proposed scheme possesses several physically relevant attributes, such as entropy stability and consistency with the weak formulation of the continuous Euler-Poisson system. The AP property of the scheme, i.e. the boundedness of the mesh parameters with respect to the Debye length and the scheme's consistency with the quasineutral limit system, is shown. The results of numerical case studies are presented to substantiate the robustness and efficiency of the proposed method.
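The two-step structure of the resolution can be sketched on a 1D periodic grid as below. This is a heavily simplified illustration with plain Lax-Friedrichs fluxes, an isothermal pressure law, an FFT Poisson solve, and a fully explicit coupling; it omits the paper's correction terms and semi-implicit coupling that deliver the AP and stability properties.

```python
import numpy as np

# Structural sketch only: (1) elliptic solve for the potential,
# (2) explicit update of density and momentum, on a periodic 1D grid.

N, L, lam2, dt = 128, 1.0, 1.0, 1e-3
dx = L / N
x = np.arange(N) * dx
rho = 1.0 + 0.1 * np.sin(2 * np.pi * x)        # perturbed density
q = np.zeros(N)                                 # momentum rho * u

def poisson_potential(rho):
    """Solve lam2 * phi'' = rho - 1 with periodic BCs via FFT."""
    k = 2 * np.pi * np.fft.fftfreq(N, d=dx)
    rhs_hat = np.fft.fft(rho - 1.0)
    phi_hat = np.zeros_like(rhs_hat)
    phi_hat[1:] = -rhs_hat[1:] / (lam2 * k[1:] ** 2)
    return np.fft.ifft(phi_hat).real

def flux_div(u, f, c):
    """Divergence of Lax-Friedrichs numerical fluxes."""
    fl = 0.5 * (f + np.roll(f, 1)) - 0.5 * c * (u - np.roll(u, 1))
    return (np.roll(fl, -1) - fl) / dx

for _ in range(100):
    phi = poisson_potential(rho)                # step 1: elliptic solve
    u = q / rho
    c = np.max(np.abs(u)) + 1.0                 # crude wave-speed bound
    Ex = -(np.roll(phi, -1) - np.roll(phi, 1)) / (2 * dx)
    rho_new = rho - dt * flux_div(rho, q, c)    # step 2: explicit update
    q = q - dt * flux_div(q, q * u + rho, c) + dt * rho * Ex
    rho = rho_new
```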

We study the problem of finding a Hamiltonian cycle under the promise that the input graph has a minimum degree of at least $n/2$, where $n$ denotes the number of vertices in the graph. The classical theorem of Dirac states that such graphs (a.k.a. Dirac graphs) are Hamiltonian, i.e., contain a Hamiltonian cycle. Moreover, finding a Hamiltonian cycle in Dirac graphs can be done in polynomial time in the classical centralized model. This paper presents a randomized distributed CONGEST algorithm that finds w.h.p. a Hamiltonian cycle (as well as a maximum matching) within $O(\log n)$ rounds under the promise that the input graph is a Dirac graph. This upper bound stands in contrast to general graphs, in which both the decision and search variants of Hamiltonicity require $\tilde{\Omega}(n^2)$ rounds, as shown by Bachrach et al. [PODC'19]. In addition, we consider two generalizations of Dirac graphs: Ore graphs and Rahman-Kaykobad graphs [IPL'05]. In Ore graphs, the sum of the degrees of every pair of non-adjacent vertices is at least $n$, and in Rahman-Kaykobad graphs, the sum of the degrees of every pair of non-adjacent vertices plus their distance is at least $n+1$. We show how our algorithm for Dirac graphs can be adapted to work for these more general families of graphs.
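For orientation, the Dirac and Ore degree promises themselves are easy to verify centrally, as in the sketch below (using a plain adjacency-set representation); the distributed CONGEST algorithm in the paper is, of course, far more involved.

```python
from itertools import combinations

# Centralized sanity checks of the degree promises discussed above.

def is_dirac(adj):
    """adj: dict mapping vertex -> set of neighbours."""
    n = len(adj)
    return all(2 * len(nbrs) >= n for nbrs in adj.values())

def is_ore(adj):
    n = len(adj)
    return all(len(adj[u]) + len(adj[v]) >= n
               for u, v in combinations(adj, 2)
               if v not in adj[u])

# C4 (a 4-cycle): minimum degree 2 >= 4/2, so both promises hold.
adj = {0: {1, 2}, 1: {0, 3}, 2: {0, 3}, 3: {1, 2}}
print(is_dirac(adj), is_ore(adj))               # True True
```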

The Sliced Wasserstein (SW) distance has become a popular alternative to the Wasserstein distance for comparing probability measures. Widespread applications include image processing, domain adaptation and generative modelling, where it is common to optimise some parameters in order to minimise SW, which serves as a loss function between discrete probability measures (since measures admitting densities are numerically unattainable). All these optimisation problems share the same sub-problem: minimising the Sliced Wasserstein energy. In this paper we study the properties of $\mathcal{E}: Y \longmapsto \mathrm{SW}_2^2(\gamma_Y, \gamma_Z)$, i.e. the SW distance between two uniform discrete measures with the same number of points, as a function of the support $Y \in \mathbb{R}^{n \times d}$ of one of the measures. We investigate the regularity and optimisation properties of this energy, as well as its Monte-Carlo approximation $\mathcal{E}_p$ (estimating the expectation in SW using only $p$ samples), and show convergence results on the critical points of $\mathcal{E}_p$ to those of $\mathcal{E}$, as well as almost-sure uniform convergence. Finally, we show that in a certain sense, Stochastic Gradient Descent methods minimising $\mathcal{E}$ and $\mathcal{E}_p$ converge towards (Clarke) critical points of these energies.
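A concrete way to read $\mathcal{E}_p$ is as a fixed-projection objective whose (sub)gradient is available in closed form: for each direction, sorting pairs the projected points, and the 1D $W_2^2$ gradient follows directly. The sketch below runs fixed-step (sub)gradient descent on the support $Y$ under these assumptions; the step size and sample counts are arbitrary illustrative choices.

```python
import numpy as np

# (Sub)gradient descent on Y -> E_p(Y), the Monte-Carlo SW energy
# between uniform discrete measures gamma_Y and gamma_Z. For a fixed
# direction theta, sorting pairs the projections, giving an explicit
# subgradient without automatic differentiation.

def swe_grad(Y, Z, thetas):
    n = Y.shape[0]
    grad = np.zeros_like(Y)
    for theta in thetas:
        sy = np.argsort(Y @ theta)
        sz = np.argsort(Z @ theta)
        diff = (Y[sy] @ theta) - (Z[sz] @ theta)   # paired 1D transport
        grad[sy] += (2.0 / n) * np.outer(diff, theta)
    return grad / len(thetas)

rng = np.random.default_rng(0)
Z = rng.standard_normal((100, 2))                   # target support
Y = rng.standard_normal((100, 2)) + 3.0
thetas = rng.standard_normal((50, 2))               # p = 50 samples
thetas /= np.linalg.norm(thetas, axis=1, keepdims=True)
for _ in range(500):
    Y -= 0.5 * swe_grad(Y, Z, thetas)               # fixed-step descent
```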

The landscape of applications and subroutines relying on shortest path computations continues to grow steadily. This growth is driven by the undeniable success of shortest path algorithms in theory and practice. It also introduces new challenges, as the models become more complicated and assessing the optimality of paths becomes more involved. Hence, multiple recent publications in the field adapt existing labeling methods in an ad-hoc fashion to their specific problem variant without considering the underlying general structure: they always deal with multi-criteria scenarios, and those criteria define different partial orders on the paths. In this paper, we introduce the partial order shortest path problem (POSP), a generalization of the multi-objective shortest path problem (MOSP) and in turn also of the classical shortest path problem. POSP captures the particular structure of many shortest path applications as special cases. In this generality, we study optimality conditions, or the lack thereof, depending on the properties of the objective functions. Our final contribution is a large lookup table summarizing our findings and providing the reader with an easy way to choose among the most recent multi-criteria shortest path algorithms depending on their problem's weight structure. Examples range from time-dependent shortest path and bottleneck path problems to the fuzzy shortest path problem and complex financial weight functions studied in the public transportation community. Our results hold for general digraphs and therefore surpass previous generalizations that were limited to acyclic graphs.
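As one instance of a partial order on paths, the sketch below implements a label search under componentwise (Pareto) dominance, the order that MOSP uses and POSP generalises; the two-criterion setup and the small example graph are hypothetical.

```python
from heapq import heappush, heappop

# Label search with Pareto dominance: labels are cost vectors, and a
# label survives at a node only if no stored label there dominates it.

def dominates(a, b):
    return all(x <= y for x, y in zip(a, b)) and a != b

def pareto_shortest_paths(graph, source):
    """graph: dict u -> list of (v, cost_tuple), two criteria here.
    Returns the Pareto front of cost vectors at each reachable node."""
    fronts = {source: [(0, 0)]}
    heap = [((0, 0), source)]
    while heap:
        cost, u = heappop(heap)
        if any(dominates(c, cost) for c in fronts.get(u, [])):
            continue                            # stale, dominated label
        for v, w in graph.get(u, []):
            new = tuple(c + x for c, x in zip(cost, w))
            front = fronts.setdefault(v, [])
            if any(dominates(c, new) or c == new for c in front):
                continue
            front[:] = [c for c in front if not dominates(new, c)]
            front.append(new)
            heappush(heap, (new, v))
    return fronts

graph = {"s": [("a", (1, 5)), ("b", (4, 1))],
         "a": [("t", (1, 5))], "b": [("t", (4, 1))]}
print(pareto_shortest_paths(graph, "s")["t"])   # [(2, 10), (8, 2)]
```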

There are multiple cluster randomised trial designs that vary in when the clusters cross between control and intervention states, when observations are made within clusters, and how many observations are made at each time point. Identifying the most efficient study design is complex, however, owing to the correlation between observations within clusters and over time. In this article, we present a review of statistical and computational methods for identifying optimal cluster randomised trial designs. We also adapt methods from the experimental design literature for designs with correlated observations to the cluster trial context. We identify three broad classes of methods: using exact formulae for the variance of the treatment effect estimator under specific models to derive algorithms or weights for cluster sequences; generalised methods for estimating weights for experimental units; and combinatorial optimisation algorithms that select an optimal subset of experimental units. We also discuss methods for rounding weights to whole numbers of clusters and extensions to non-Gaussian models. We present results from multiple cluster trial examples comparing the different methods, including determining the optimal allocation of clusters across a set of cluster sequences and selecting the optimal number of single observations to make in each cluster-period, for both Gaussian and non-Gaussian models and for exchangeable and exponential-decay covariance structures.
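The first class of methods rests on evaluating the variance of the treatment effect estimator for candidate designs. A minimal sketch, assuming a parallel-arm design, a shared exchangeable within-cluster covariance, and hypothetical ICC and cluster counts, computes the GLS variance, the treatment-column entry of $(X^\top V^{-1} X)^{-1}$, and compares designs:

```python
import numpy as np

# Compare designs by the GLS variance of the treatment effect under an
# exchangeable within-cluster covariance. All numbers are hypothetical.

def treatment_variance(design, m, icc, sigma2=1.0):
    """design: list of 0/1 cluster-level treatment indicators;
    m: observations per cluster."""
    V_cl = sigma2 * ((1 - icc) * np.eye(m) + icc * np.ones((m, m)))
    V = np.kron(np.eye(len(design)), V_cl)      # block-diagonal covariance
    X = np.vstack([np.column_stack([np.ones(m), np.full(m, trt)])
                   for trt in design])          # intercept + treatment
    info = X.T @ np.linalg.solve(V, X)
    return np.linalg.inv(info)[1, 1]

designs = {"balanced 6+6": [1] * 6 + [0] * 6,
           "unbalanced 8+4": [1] * 8 + [0] * 4}
for name, d in designs.items():
    print(name, treatment_variance(d, m=10, icc=0.05))
```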

This paper proposes Gait Decomposition (GD), a method for mathematically decomposing snake robot movements, and the Gait Parameter Gradient (GPG), a method for optimizing the decomposed gait parameters. GD expresses a snake gait mathematically and concisely, covering the whole pipeline from generating motion with a curve function to issuing the motor control commands for the snake robot. Through this method, the gait of a snake robot can be intuitively encoded as a matrix, and the parameters of the curve function required for gait generation can be adjusted flexibly. This addresses the difficulty of parameter tuning, which is one of the main obstacles to the practical use of snake robots. Therefore, if GD is applied to snake robots, various gaits can be generated with a small number of parameters, allowing snake robots to be used in many fields. We also implement the GPG algorithm to optimize the gait curve function, as well as to define the gait of the snake robot through GD.
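A common curve-function parameterization of the kind GD organizes into a matrix is the serpenoid curve, in which each joint follows a phase-shifted sinusoid; the sketch below is this standard gait with an illustrative parameter layout, not the paper's exact decomposition.

```python
import numpy as np

# Parameterized gait in the spirit of GD: the classic serpenoid curve.
# Each joint angle is a phase-shifted sinusoid of four parameters.

N_JOINTS = 8

def serpenoid_angles(t, params):
    """params: (alpha, omega, beta, gamma) = amplitude, temporal
    frequency, inter-joint phase offset, steering bias."""
    alpha, omega, beta, gamma = params
    joints = np.arange(N_JOINTS)
    return alpha * np.sin(omega * t + beta * joints) + gamma

lateral_undulation = np.array([0.6, 2.0, 0.8, 0.0])
turning = np.array([0.6, 2.0, 0.8, 0.15])       # nonzero bias steers

for t in np.linspace(0, 1, 5):
    print(np.round(serpenoid_angles(t, lateral_undulation), 2))
```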

The machine learning explosion has created a prominent trend in modern computer hardware towards low precision floating-point operations. In response, there have been growing efforts to use low and mixed precision in general scientific computing. One important area that has received limited exploration is time-integration methods, which are used for solving differential equations that are ubiquitous in science and engineering applications. In this work, we develop two new approaches for leveraging mixed precision in exponential time integration methods. The first approach is based on a reformulation of the exponential Rosenbrock--Euler method allowing for low precision computations in matrix exponentials independent of the particular algorithm for matrix exponentiation. The second approach is based on an inexact and incomplete Arnoldi procedure in Krylov approximation methods for computing matrix exponentials and is agnostic to the chosen integration method. We show that both approaches improve accuracy compared to using purely low precision and offer better efficiency than using only double precision when solving an advection-diffusion-reaction partial differential equation.
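To illustrate the first approach in miniature: one exponential Rosenbrock-Euler step is $u_{n+1} = u_n + h\,\varphi_1(hJ)F(u_n)$ with $\varphi_1(z) = (e^z-1)/z$, and $\varphi_1(hJ)F(u_n)$ can be read off the exponential of an augmented matrix. The sketch below evaluates that exponential in single precision to mimic a low-precision matrix exponential while keeping the rest in double precision; the test problem and precision choices are illustrative, not the paper's configuration.

```python
import numpy as np
from scipy.linalg import expm

# One exponential Rosenbrock-Euler step with a low-precision matrix
# exponential: expm([[A, v], [0, 0]]) has phi_1(A) @ v in its top-right
# block, and that expm is computed here in float32.

def phi1_apply(A, v, dtype=np.float32):
    """Return phi_1(A) @ v via expm of the augmented matrix in `dtype`."""
    n = A.shape[0]
    M = np.zeros((n + 1, n + 1), dtype=dtype)
    M[:n, :n] = A
    M[:n, n] = v
    return expm(M)[:n, n].astype(np.float64)

def rosenbrock_euler_step(u, h, F, J):
    return u + h * phi1_apply(h * J(u), F(u))

# Hypothetical stiff linear test problem u' = A u (step is exact up to
# the phi_1 evaluation, so the result should track expm(A t) u0).
A = -np.diag([1.0, 10.0, 100.0])
F = lambda u: A @ u
J = lambda u: A                                  # Jacobian of F
u = np.ones(3)
for _ in range(10):
    u = rosenbrock_euler_step(u, h=0.1, F=F, J=J)
print(u, np.exp(np.diag(A) * 1.0))               # compare with exact
```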

The growing energy and performance costs of deep learning have driven the community to reduce the size of neural networks by selectively pruning components. Similarly to their biological counterparts, sparse networks generalize just as well as, if not better than, the original dense networks. Sparsity can reduce the memory footprint of regular networks to fit mobile devices, as well as shorten training time for ever-growing networks. In this paper, we survey prior work on sparsity in deep learning and provide an extensive tutorial of sparsification for both inference and training. We describe approaches to remove and add elements of neural networks, different training strategies to achieve model sparsity, and mechanisms to exploit sparsity in practice. Our work distills ideas from more than 300 research papers and provides guidance to practitioners who wish to utilize sparsity today, as well as to researchers whose goal is to push the frontier forward. We include the necessary background on mathematical methods in sparsification, describe phenomena such as early structure adaptation, the intricate relations between sparsity and the training process, and show techniques for achieving acceleration on real hardware. We also define a metric of pruned parameter efficiency that could serve as a baseline for comparison of different sparse networks. We close by speculating on how sparsity can improve future workloads and outline major open problems in the field.
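As a small concrete example of the mechanisms surveyed, the sketch below performs unstructured magnitude pruning with a persistent mask, one of the basic sparsification techniques; thresholding by absolute value and reapplying the mask after updates are generic practice, not a specific method from the survey.

```python
import numpy as np

# Unstructured magnitude pruning: zero out the smallest-magnitude
# weights and keep a binary mask so the sparsity pattern can be
# enforced during subsequent training steps.

def magnitude_prune(weights, sparsity):
    """Return (pruned_weights, mask) with `sparsity` fraction zeroed."""
    k = int(sparsity * weights.size)
    threshold = np.partition(np.abs(weights).ravel(), k)[k]
    mask = (np.abs(weights) >= threshold).astype(weights.dtype)
    return weights * mask, mask

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))
W_sparse, mask = magnitude_prune(W, sparsity=0.9)
print(1.0 - mask.mean())        # achieved sparsity, ~0.9
# During training, reapply the mask after each update: W = (W - lr*g) * mask
```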
