
The main computational cost per iteration of adaptive cubic regularization methods for solving large-scale nonconvex problems is the computation of the step $s_k$, which requires an approximate minimizer of the cubic model. We propose a new approach in which this minimizer is sought in a low-dimensional subspace that, in contrast to classical approaches, is reused for a number of iterations. A regularized Newton step to correct $s_k$ is also incorporated whenever needed. We show that our method increases efficiency while preserving the worst-case complexity of classical cubic regularization methods. We also explore the use of rational Krylov subspaces for the subspace minimization, to overcome some of the issues encountered when using polynomial Krylov subspaces. We provide several experimental results illustrating the gains of the new approach when compared to classical implementations.
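As a concrete illustration (a minimal sketch in our own notation, not the authors' implementation), the step computation amounts to minimizing the cubic model $m_k(s) = g_k^\top s + \tfrac{1}{2} s^\top H_k s + \tfrac{\sigma_k}{3}\|s\|^3$ over a subspace spanned by the columns of an orthonormal basis $Q$; the function and variable names below are our own assumptions:

    # Minimal sketch: minimize the cubic model restricted to span(Q),
    # where Q is orthonormal (n x m, m << n), so ||Q y|| = ||y||.
    import numpy as np
    from scipy.optimize import minimize

    def cubic_model_subspace_step(g, H, sigma, Q):
        """Return s = Q @ y, where y minimizes the projected cubic model."""
        g_hat = Q.T @ g              # projected gradient  (m,)
        H_hat = Q.T @ (H @ Q)        # projected Hessian   (m, m)

        def model(y):
            return g_hat @ y + 0.5 * y @ (H_hat @ y) \
                   + sigma / 3.0 * np.linalg.norm(y) ** 3

        def grad(y):
            return g_hat + H_hat @ y + sigma * np.linalg.norm(y) * y

        y0 = np.zeros(len(g_hat))
        res = minimize(model, y0, jac=grad, method="L-BFGS-B")
        return Q @ res.x

Because the minimization happens entirely in the low-dimensional coordinates $y$, the same basis $Q$ can in principle be reused across several consecutive iterations, which is the reuse idea described above.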

Related content

Many real-world processes have complex tail dependence structures that cannot be characterized using classical Gaussian processes. More flexible spatial extremes models exhibit appealing extremal dependence properties but are often prohibitively expensive to fit and simulate from in high dimensions. In this paper, we develop a new spatial extremes model that has flexible and non-stationary dependence properties, and we integrate it into the encoding-decoding structure of a variational autoencoder (XVAE), whose parameters are estimated via variational Bayes combined with deep learning. The XVAE can be used as a spatio-temporal emulator that characterizes the distribution of potential mechanistic model output states and produces outputs that have the same statistical properties as the inputs, especially in the tail. As an aside, our approach also provides a novel way of making fast inference with complex extreme-value processes. Through extensive simulation studies, we show that our XVAE is substantially more time-efficient than traditional Bayesian inference while also outperforming many spatial extremes models with a stationary dependence structure. To further demonstrate the computational power of the XVAE, we analyze a high-resolution satellite-derived dataset of sea surface temperature in the Red Sea, which includes 30 years of daily measurements at 16703 grid cells. We find that the extremal dependence strength is weaker in the interior of the Red Sea and has decreased slightly over time.
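For readers unfamiliar with the encoding-decoding machinery, the following sketch shows the generic variational-autoencoder skeleton assumed here (hypothetical linear encoder/decoder and Gaussian reconstruction likelihood for brevity; the actual XVAE replaces these with components tailored to spatial extremes):

    # Minimal VAE sketch: amortized inference and the reparameterization
    # trick that makes the ELBO differentiable in the encoder weights.
    import numpy as np

    rng = np.random.default_rng(0)

    def encode(x, W_mu, W_logvar):
        """Variational posterior q(z | x) = N(mu, diag(exp(logvar)))."""
        return W_mu @ x, W_logvar @ x

    def decode(z, W_dec):
        """Decoder mean; the emulated field is reconstructed from z."""
        return W_dec @ z

    def elbo(x, W_mu, W_logvar, W_dec):
        mu, logvar = encode(x, W_mu, W_logvar)
        eps = rng.standard_normal(mu.shape)
        z = mu + np.exp(0.5 * logvar) * eps        # reparameterization
        recon = -0.5 * np.sum((x - decode(z, W_dec)) ** 2)  # Gaussian log-lik
        kl = -0.5 * np.sum(1 + logvar - mu**2 - np.exp(logvar))
        return recon - kl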

We introduce the modified planar rotator method (MPRS), a physically inspired machine learning method for spatial/temporal regression. MPRS is a non-parametric model that incorporates spatial or temporal correlations via short-range, distance-dependent ``interactions'' without assuming a specific form for the underlying probability distribution. Predictions are obtained by means of a fully autonomous learning algorithm which employs equilibrium conditional Monte Carlo simulations. MPRS is able to handle scattered data and arbitrary spatial dimensions. We report tests on various synthetic and real-world data in one, two and three dimensions which demonstrate that the MPRS prediction performance (without parameter tuning) is competitive with standard interpolation methods such as ordinary kriging and inverse distance weighting. In particular, MPRS is an effective gap-filling method for rough and non-Gaussian data (e.g., daily precipitation time series). MPRS shows superior computational efficiency and scalability for large samples. Massive data sets involving millions of nodes can be processed in a few seconds on a standard personal computer.
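To make the ``interaction'' idea concrete, here is a deliberately simplified one-dimensional sketch (our own construction, not the authors' code): observations are mapped to rotator angles, missing sites are relaxed by conditional Metropolis updates under nearest-neighbour cosine couplings, and the equilibrated angles are mapped back to data values. The function name, inverse temperature `beta` and sweep count are illustrative assumptions:

    # Minimal gap-filling sketch with a planar-rotator-style energy on a
    # 1-D grid. `y` is a float array with np.nan at missing sites.
    import numpy as np

    rng = np.random.default_rng(1)

    def mprs_fill(y, missing, n_sweeps=200, beta=2.0):
        lo, hi = np.nanmin(y), np.nanmax(y)
        theta = (y - lo) / (hi - lo) * np.pi          # data -> angles
        theta[missing] = rng.uniform(0, np.pi, missing.sum())

        def local_energy(t, i):
            e = 0.0
            if i > 0:
                e -= np.cos(t - theta[i - 1])
            if i < len(theta) - 1:
                e -= np.cos(t - theta[i + 1])
            return e

        idx = np.flatnonzero(missing)
        for _ in range(n_sweeps):
            for i in idx:                             # update missing sites only
                prop = rng.uniform(0, np.pi)
                dE = local_energy(prop, i) - local_energy(theta[i], i)
                if dE < 0 or rng.random() < np.exp(-beta * dE):
                    theta[i] = prop
        return lo + theta / np.pi * (hi - lo)         # angles -> data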

The complexity class Quantum Statistical Zero-Knowledge ($\mathsf{QSZK}$) captures the computational difficulty of the time-bounded quantum state testing problem with respect to the trace distance, known as the Quantum State Distinguishability Problem (QSDP) introduced by Watrous (FOCS 2002). However, QSDP is known to be in $\mathsf{QSZK}$ only within the constant polarizing regime, similar to its classical counterpart shown by Sahai and Vadhan (JACM 2003) via the polarization lemma (error reduction for SDP). Recently, Berman, Degwekar, Rothblum, and Vasudevan (TCC 2019) extended the $\mathsf{SZK}$ containment for SDP beyond the polarizing regime via the time-bounded distribution testing problems with respect to the triangular discrimination and the Jensen-Shannon divergence. Our work introduces proper quantum analogs for these problems by defining quantum counterparts for triangular discrimination. We investigate whether the quantum analogs behave similarly to their classical counterparts and examine the limitations of existing approaches to polarization regarding quantum distances. These new $\mathsf{QSZK}$-complete problems improve $\mathsf{QSZK}$ containments for QSDP beyond the polarizing regime and establish a simple $\mathsf{QSZK}$-hardness for the quantum entropy difference problem (QEDP) defined by Ben-Aroya, Schwartz, and Ta-Shma (ToC 2010). Furthermore, we prove that QSDP with some exponentially small errors is in $\mathsf{PP}$, while the same problem without error is in $\mathsf{NQP}$.
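For orientation, recall the classical triangular discrimination and its standard two-sided relation to the total variation distance $\delta$ (our notation; the paper's quantum counterparts are defined for density operators and need not coincide with the naive analog):
\[
\Delta(P,Q) \;=\; \sum_{x} \frac{\big(P(x)-Q(x)\big)^2}{P(x)+Q(x)},
\qquad
2\,\delta(P,Q)^2 \;\le\; \Delta(P,Q) \;\le\; 2\,\delta(P,Q).
\]
This sandwich between a quadratic and a linear function of $\delta$ is what makes triangular discrimination a useful intermediary beyond the polarizing regime; any quantum definition must preserve an analogous sandwich with respect to the trace distance to serve the same role.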

We present ReCAT, a recursive composition augmented Transformer that is able to explicitly model hierarchical syntactic structures of raw texts without relying on gold trees during either learning or inference. Existing research along this line restricts data to follow a hierarchical tree structure and thus lacks inter-span communication. To overcome the problem, we propose a novel contextual inside-outside (CIO) layer that learns contextualized representations of spans through bottom-up and top-down passes, where a bottom-up pass forms representations of high-level spans by composing low-level spans, while a top-down pass combines information inside and outside a span. By stacking several CIO layers between the embedding layer and the attention layers in Transformer, the ReCAT model can perform both deep intra-span and deep inter-span interactions, and thus generate multi-grained representations fully contextualized with other spans. Moreover, the CIO layers can be jointly pre-trained with Transformers, making ReCAT enjoy scaling ability, strong performance, and interpretability at the same time. We conduct experiments on various sentence-level and span-level tasks. Evaluation results indicate that ReCAT can significantly outperform vanilla Transformer models on all span-level tasks and baselines that combine recursive networks with Transformers on natural language inference tasks. More interestingly, the hierarchical structures induced by ReCAT exhibit strong consistency with human-annotated syntactic trees, indicating good interpretability brought by the CIO layers.
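A heavily simplified sketch of a single inside-outside pass over contiguous spans (our own toy construction; `compose` stands in for a learned composition network, splits are averaged rather than scored, and only width-plus-one parents are considered in the outside pass):

    # tokens: list of 1-D numpy vectors, one per input position.
    import numpy as np

    def compose(a, b):
        return np.tanh(a + b)                 # placeholder for a learned MLP

    def inside_pass(tokens):
        """inside[(i, j)]: representation of span tokens[i..j] (inclusive)."""
        n = len(tokens)
        inside = {(i, i): tokens[i] for i in range(n)}
        for width in range(1, n):
            for i in range(n - width):
                j = i + width
                inside[(i, j)] = np.mean(
                    [compose(inside[(i, k)], inside[(k + 1, j)])
                     for k in range(i, j)], axis=0)
        return inside

    def outside_pass(inside, n):
        """outside[(i, j)]: context of span (i, j); root context is zero."""
        outside = {(0, n - 1): np.zeros_like(inside[(0, n - 1)])}
        for width in range(n - 2, -1, -1):
            for i in range(n - width):
                j = i + width
                parents = []
                if j + 1 < n:                 # parent (i, j+1), sibling (j+1, j+1)
                    parents.append(compose(outside[(i, j + 1)],
                                           inside[(j + 1, j + 1)]))
                if i - 1 >= 0:                # parent (i-1, j), sibling (i-1, i-1)
                    parents.append(compose(outside[(i - 1, j)],
                                           inside[(i - 1, i - 1)]))
                outside[(i, j)] = np.mean(parents, axis=0)
        return outside

Combining `inside[(i, j)]` and `outside[(i, j)]` yields a span representation informed both by its own content and by its context, which is the intuition behind the inter-span communication described above.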

In this paper we propose and analyze a virtual element method for the two-dimensional non-symmetric diffusion-convection eigenvalue problem, in order to derive a priori and a posteriori error estimates. Under classical assumptions on the meshes, and with the aid of the classical theory of compact operators, we prove error estimates for the eigenvalues and eigenfunctions. We also develop an a posteriori error estimator that, on the one hand, proves to be reliable and, on the other, by standard bubble-function arguments, also proves to be efficient. We test our method on domains where the complex eigenfunctions are not sufficiently regular, in order to assess the performance of the estimator, which we compare with the uniform refinement suggested by the a priori analysis.
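For concreteness, a model problem of the type considered reads as follows (our notation): find eigenpairs $(\lambda, u)$ such that
\[
-\operatorname{div}(\kappa \nabla u) \;+\; \boldsymbol{\beta}\cdot\nabla u \;=\; \lambda\, u \quad \text{in } \Omega,
\qquad u = 0 \ \text{on } \partial\Omega,
\]
where the convection term $\boldsymbol{\beta}\cdot\nabla u$ destroys symmetry, so eigenvalues and eigenfunctions may be complex; this is why the analysis relies on the theory of compact, rather than self-adjoint, operators.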

We present a registration method for model reduction of parametric partial differential equations with dominating advection effects and moving features. Registration refers to the use of a parameter-dependent mapping to make the set of solutions to these equations more amenable to approximation using classical reduced basis methods. The proposed approach draws on optimal transport theory, using Monge embeddings to construct these mappings in a purely data-driven way. The method relies on a single interpretable hyper-parameter. We discuss how our approach relates to existing works that combine model order reduction and optimal transport theory. Numerical results are provided to demonstrate the effect of the registration. This includes a model problem where the solution is itself a probability density and one where it is not.
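In one spatial dimension the Monge map has a closed form as the composition of the target quantile function with the source distribution function, which the following sketch exploits (hypothetical names; real registration maps are parameter-dependent and typically multi-dimensional):

    # Data-driven 1-D Monge map from empirical quantiles.
    import numpy as np

    def monge_map_1d(source_samples, target_samples):
        src = np.sort(source_samples)
        tgt = np.sort(target_samples)
        q = np.linspace(0, 1, len(src))

        def T(x):
            u = np.interp(x, src, q)                      # empirical source CDF
            return np.interp(u, np.linspace(0, 1, len(tgt)), tgt)  # target quantiles
        return T

    rng = np.random.default_rng(2)
    T = monge_map_1d(rng.normal(0, 1, 1000), rng.normal(3, 2, 1000))
    # T approximately pushes N(0, 1) forward to N(3, 4): T(0) is close to 3.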

Effective application of mathematical models to interpret biological data and make accurate predictions often requires that model parameters are identifiable. Approaches to assess the so-called structural identifiability of models are well-established for ordinary differential equation models, yet there are no commonly adopted approaches that can be applied to assess the structural identifiability of the partial differential equation (PDE) models that are requisite to capture spatial features inherent to many phenomena. The differential algebra approach to structural identifiability has recently been demonstrated to be applicable to several specific PDE models. In this brief article, we present a general methodology for performing structural identifiability analysis on partially observed linear reaction-advection-diffusion (RAD) PDE models. We show that the differential algebra approach can always, in theory, be applied to linear RAD models. Moreover, despite the perceived complexity introduced by the addition of advection and diffusion terms, the spatial analogue of a non-spatial model can never be less structurally identifiable than the original model. Finally, we show that our approach can also be applied to a class of non-linear PDE models that are linear in the unobserved variables, and conclude by discussing future possibilities and the computational cost of performing structural identifiability analysis on more general PDE models in mathematical biology.
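A representative linear RAD model in one spatial dimension, in our notation, is
\[
\frac{\partial u}{\partial t} \;=\; D\,\frac{\partial^2 u}{\partial x^2} \;-\; a\,\frac{\partial u}{\partial x} \;+\; r\,u,
\]
and structural identifiability asks whether two distinct parameter vectors $(D, a, r)$ can produce identical observed output for all admissible inputs when only some components of the state are measured; if not, each parameter is structurally identifiable from the data.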

We study the multivariate deconvolution problem of recovering the distribution of a signal from independent and identically distributed observations additively contaminated with random errors (noise) from a known distribution. For errors with independent coordinates having ordinary smooth densities, we derive an inversion inequality relating the $L^1$-Wasserstein distance between two distributions of the signal to the $L^1$-distance between the corresponding mixture densities of the observations. This smoothing inequality outperforms existing inversion inequalities. As an application of the inversion inequality to the Bayesian framework, we consider $1$-Wasserstein deconvolution with Laplace noise in dimension one using a Dirichlet process mixture of normal densities as a prior measure on the mixing distribution (or distribution of the signal). We construct an adaptive approximation of the sampling density by convolving the Laplace density with a well-chosen mixture of normal densities and show that the posterior measure concentrates around the sampling density at a nearly minimax rate, up to a log-factor, in the $L^1$-distance. The same posterior law is also shown to automatically adapt to the unknown Sobolev regularity of the mixing density, thus leading to a new Bayesian adaptive estimation procedure for mixing distributions with regular densities under the $L^1$-Wasserstein metric. We illustrate the utility of the inversion inequality also in a frequentist setting by showing that an appropriate isotone approximation of the classical kernel deconvolution estimator attains the minimax rate of convergence for $1$-Wasserstein deconvolution in any dimension $d\geq 1$, when only a tail condition is imposed on the latent mixing density. We also derive sharp lower bounds for these problems.
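A bare-bones sketch of the classical kernel deconvolution estimator in the Laplace-noise case discussed above (our own illustrative code; the grids, bandwidth `h` and kernel characteristic function are assumptions): the Laplace$(0, b)$ characteristic function $1/(1 + b^2 t^2)$ never vanishes, so dividing it out in the Fourier domain is stable.

    # Kernel deconvolution estimator via Fourier inversion on a grid.
    import numpy as np

    def deconvolve_laplace(Y, b, h, x_grid, t_grid):
        """Estimate the signal density from noisy observations Y = X + eps."""
        # empirical characteristic function of the observations
        ecf = np.mean(np.exp(1j * np.outer(t_grid, Y)), axis=1)
        phi_K = np.clip(1 - t_grid**2 * h**2, 0, None)  # compactly supported kernel FT (assumed)
        phi_eps = 1.0 / (1.0 + b**2 * t_grid**2)        # Laplace noise cf
        integrand = ecf * phi_K / phi_eps               # divide out the noise
        dt = t_grid[1] - t_grid[0]
        f_hat = np.real(
            np.exp(-1j * np.outer(x_grid, t_grid)) @ integrand) * dt / (2 * np.pi)
        return np.clip(f_hat, 0, None)                  # crude positivity fix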

A non-intrusive proper generalized decomposition (PGD) strategy, coupled with an overlapping domain decomposition (DD) method, is proposed to efficiently construct surrogate models of parametric linear elliptic problems. A parametric multi-domain formulation is presented, with local subproblems featuring arbitrary Dirichlet interface conditions represented through the traces of the finite element functions used for spatial discretization at the subdomain level, with no need for additional auxiliary basis functions. The linearity of the operator is exploited to devise low-dimensional problems with only a few active boundary parameters. An overlapping Schwarz method is used to glue the local surrogate models, solving a linear system for the nodal values of the parametric solution at the interfaces, without introducing Lagrange multipliers to enforce the continuity in the overlapping region. The proposed DD-PGD methodology relies on a fully algebraic formulation allowing for real-time computation based on the efficient interpolation of the local surrogate models in the parametric space, with no additional problems to be solved during the execution of the Schwarz algorithm. Numerical results for parametric diffusion and convection-diffusion problems are presented to showcase the accuracy of the DD-PGD approach, its robustness in different regimes and its superior performance with respect to standard high-fidelity DD methods.
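The gluing step can be illustrated by classical alternating Schwarz on a toy problem (a minimal sketch, not the DD-PGD implementation): two overlapping subdomains exchange Dirichlet interface values, and each local solve stands in for evaluating a precomputed local surrogate at those interface values.

    # Alternating overlapping Schwarz on [0, 1] for -u'' = 1, with
    # subdomains [0, 0.6] and [0.4, 1]; no global solve is performed.
    import numpy as np

    def local_solve(a, b, ua, ub, x):
        """Exact solution of -u'' = 1 on [a, b] with u(a)=ua, u(b)=ub."""
        # u(x) = -x^2/2 + c1 x + c0, coefficients fixed by the boundary data
        c1 = (ub - ua + (b**2 - a**2) / 2) / (b - a)
        c0 = ua + a**2 / 2 - c1 * a
        return -x**2 / 2 + c1 * x + c0

    g1, g2 = 0.0, 0.0            # interface values at x = 0.6 and x = 0.4
    for _ in range(20):          # Schwarz iterations on the interface data
        g2 = local_solve(0.0, 0.6, 0.0, g1, np.array([0.4]))[0]  # trace at 0.4
        g1 = local_solve(0.4, 1.0, g2, 0.0, np.array([0.6]))[0]  # trace at 0.6

    # Converged interface values reproduce the global solution u(x) = x(1-x)/2.
    print(g1, 0.6 * 0.4 / 2)     # both approximately 0.12

The overlap makes the interface iteration a contraction, so only the low-dimensional interface data are updated at run time, consistent with the real-time claim above.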

This note presents a refined local approximation for the logarithm of the ratio between the negative multinomial probability mass function and a multivariate normal density, both having the same mean-covariance structure. This approximation, which is derived using Stirling's formula and a meticulous treatment of Taylor expansions, yields an upper bound on the Hellinger distance between the jittered negative multinomial distribution and the corresponding multivariate normal distribution. Upper bounds on the Le Cam distance between negative multinomial and multivariate normal experiments ensue.
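In standard notation (ours, not necessarily the note's), the squared Hellinger distance and its relation to total variation, which in turn controls the Le Cam deficiency between experiments, are
\[
H^2(P,Q) \;=\; \frac{1}{2} \int \big(\sqrt{\mathrm{d}P} - \sqrt{\mathrm{d}Q}\big)^2,
\qquad
H^2(P,Q) \;\le\; \mathrm{TV}(P,Q) \;\le\; \sqrt{2}\,H(P,Q),
\]
so an upper bound on the Hellinger distance between the jittered negative multinomial distribution and the matching multivariate normal immediately yields deficiency bounds of the kind stated.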
