
We propose a CPU-GPU heterogeneous computing method for solving time-evolution partial differential equation problems many times with guaranteed accuracy, short time-to-solution, and low energy-to-solution. On a single GH200 node, the proposed method improved computation speed by factors of 86.4 and 8.67 compared to the conventional method running only on the CPU and only on the GPU, respectively. Furthermore, the energy-to-solution was reduced by 32.2-fold (from 9944 J to 309 J) and 7.01-fold (from 2163 J to 309 J) compared to using only the CPU and only the GPU, respectively. Using the proposed method on the Alps supercomputer, 51.6-fold and 6.98-fold speedups were attained compared to using only the CPU and only the GPU, respectively, and a high weak-scaling efficiency of 94.3% was obtained on up to 1,920 compute nodes. These implementations were realized using directive-based parallel programming models while retaining portability, indicating that directives are highly effective for analyses in heterogeneous computing environments.
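
As a minimal sketch of the directive-based programming style referred to above (using OpenMP target offload in C; the kernel, data mapping, and CPU-GPU work partitioning below are illustrative assumptions, not the paper's implementation), a single explicit time step of a one-dimensional diffusion update could be offloaded as follows:

/* Minimal sketch: one explicit time step of a 1D diffusion update,
 * offloaded to the GPU via OpenMP target directives. Boundary values
 * of u_new are assumed to be handled by the caller. */
void diffusion_step(const double *u, double *u_new, int n, double r)
{
    #pragma omp target teams distribute parallel for \
        map(to: u[0:n]) map(tofrom: u_new[0:n])
    for (int i = 1; i < n - 1; i++)
        u_new[i] = u[i] + r * (u[i-1] - 2.0 * u[i] + u[i+1]);
}

The same loop body can be retargeted to the CPU or the GPU at compile time, which is the portability property that directive-based models provide.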

Generalized linear mixed models (GLMMs) are a widely used tool in statistical analysis. The main bottleneck of many computational approaches lies in the inversion of the high-dimensional precision matrices associated with the random effects. Such matrices are typically sparse; however, the sparsity pattern resembles that of a multipartite random graph, which does not lend itself well to default sparse linear algebra techniques. Notably, we show that, for typical GLMMs, the Cholesky factor is dense even when the original precision matrix is sparse. We thus turn to approximate iterative techniques, in particular the conjugate gradient (CG) method. Combining a detailed analysis of the spectrum of these precision matrices with results from random graph theory, we show that CG-based methods applied to high-dimensional GLMMs typically achieve a fixed approximation error at a total cost that scales linearly with the number of parameters and observations. Numerical illustrations with both real and simulated data confirm the theoretical findings, while also illustrating situations, such as nested structures, where CG-based methods struggle.
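
For readers less familiar with the method, the following is a textbook conjugate gradient iteration in C for a small symmetric positive definite system $Qx=b$ (a dense toy matrix is used for brevity; in the GLMM setting $Q$ would be the sparse, high-dimensional precision matrix and the matrix-vector product would exploit that sparsity):

/* Textbook conjugate gradient for an SPD system Q x = b.
 * Dense 3x3 toy example for brevity; for GLMMs, Q would be the sparse
 * high-dimensional precision matrix and matvec would exploit sparsity. */
#include <math.h>
#include <stdio.h>

#define N 3

static void matvec(const double Q[N][N], const double x[N], double y[N]) {
    for (int i = 0; i < N; i++) {
        y[i] = 0.0;
        for (int j = 0; j < N; j++) y[i] += Q[i][j] * x[j];
    }
}

int main(void) {
    const double Q[N][N] = {{4, 1, 0}, {1, 3, 1}, {0, 1, 2}};  /* SPD */
    double b[N] = {1, 2, 3}, x[N] = {0, 0, 0};
    double r[N], p[N], Qp[N], rr = 0.0;
    for (int i = 0; i < N; i++) { r[i] = b[i]; p[i] = r[i]; rr += r[i] * r[i]; }
    for (int k = 0; k < 100 && sqrt(rr) > 1e-12; k++) {
        matvec(Q, p, Qp);
        double pQp = 0.0;
        for (int i = 0; i < N; i++) pQp += p[i] * Qp[i];
        double alpha = rr / pQp;                    /* step length */
        for (int i = 0; i < N; i++) { x[i] += alpha * p[i]; r[i] -= alpha * Qp[i]; }
        double rr_new = 0.0;
        for (int i = 0; i < N; i++) rr_new += r[i] * r[i];
        double beta = rr_new / rr;                  /* new search direction */
        for (int i = 0; i < N; i++) p[i] = r[i] + beta * p[i];
        rr = rr_new;
    }
    printf("x = (%f, %f, %f)\n", x[0], x[1], x[2]);
    return 0;
}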

Many natural computational problems in computer science, mathematics, physics, and other sciences amount to deciding whether two objects are equivalent. Often this equivalence is defined in terms of group actions. A natural question is to ask when two objects can be distinguished by polynomial functions that are invariant under the group action. For finite groups, this is the usual notion of equivalence, but for continuous groups such as the general linear groups it gives rise to a new notion, called orbit closure intersection. It captures, among others, the graph isomorphism problem, noncommutative polynomial identity testing (PIT), null cone problems in invariant theory, equivalence problems for tensor networks, and the classification of multiparty quantum states. Despite recent algorithmic progress in celebrated special cases, the computational complexity of general orbit closure intersection problems remains quite unclear. In particular, tensors seem to give rise to the most difficult problems. In this work we start a systematic study of orbit closure intersection from the complexity-theoretic viewpoint. To this end, we define a complexity class TOCI that captures the power of orbit closure intersection problems for general tensor actions; give an appropriate notion of algebraic reductions, which imply polynomial-time reductions in the usual sense but are amenable to invariant-theoretic techniques; identify natural tensor problems that are complete for TOCI, including the equivalence of 2D tensor networks with constant physical dimension; and show that the graph isomorphism problem can be reduced to these complete problems, hence GI$\subseteq$TOCI. As such, our work establishes the first lower bound on the computational complexity of orbit closure intersection problems, and it explains the difficulty of finding unconditional polynomial-time algorithms beyond special cases, as has been observed in the literature.
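
For concreteness, recall the classical invariant-theoretic characterization underlying this notion (stated in generic notation for a reductive group $G$, such as $\mathrm{GL}_n$, acting linearly on a vector space $V$; this is standard background rather than a result of the paper): two points are identified exactly when no invariant polynomial separates them,
\[
\overline{G\cdot v}\;\cap\;\overline{G\cdot w}\;\neq\;\emptyset
\quad\Longleftrightarrow\quad
p(v)=p(w)\ \text{ for all } p\in\mathbb{C}[V]^{G}.
\]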

In decision-making, maxitive functions are used for worst-case and best-case evaluations. Maxitivity gives rise to a rich structure that is well-studied in the context of the pointwise order. In this article, we investigate maxitivity with respect to general preorders and provide a representation theorem for such functionals. The results are illustrated for different stochastic orders in the literature, including the usual stochastic order, the increasing convex/concave order, and the dispersive order.
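
As background (in generic notation, not necessarily that of the article), a functional $\phi$ is maxitive with respect to the pointwise order if
\[
\phi(f\vee g)\;=\;\max\{\phi(f),\,\phi(g)\}\qquad\text{for all } f,g,
\]
where $f\vee g$ denotes the pointwise maximum; the article studies the analogous property when the pointwise order is replaced by a general preorder.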

This study presents a scalable Bayesian estimation algorithm for sparse estimation in exploratory item factor analysis, based on a classical Bayesian estimation method, namely Bayesian joint modal estimation (BJME). BJME estimates the model parameters and factor scores that maximize the complete-data joint posterior density. Simulation studies show that the proposed algorithm has high computational efficiency and accuracy both in variable selection over latent factors and in the recovery of the model parameters. Moreover, we conducted a real-data analysis using large-scale data from a psychological assessment targeting the Big Five personality traits. The results indicate that the proposed algorithm achieves computationally efficient parameter estimation and extracts an interpretable factor loading structure.
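
In symbols (generic notation assumed here for illustration, not taken from the paper), the joint mode sought by BJME as described above is
\[
(\hat{\Lambda},\hat{\Theta})
\;=\;
\arg\max_{\Lambda,\,\Theta}\;
p(Y\mid \Lambda,\Theta)\,p(\Lambda)\,p(\Theta),
\]
where $Y$ denotes the item responses, $\Lambda$ the item parameters (including the factor loadings, to which a sparsity-inducing prior would be attached), and $\Theta$ the factor scores.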

We develop both first- and second-order numerical optimization methods to solve non-smooth optimization problems featuring a shared sparsity penalty, constrained by differential equations with uncertainty. To alleviate the curse of dimensionality, we use tensor product approximations. To handle the non-smoothness of the objective function, we introduce a smoothed version of the shared sparsity objective. We consider both a benchmark elliptic PDE constraint and a more realistic topology optimization problem. We demonstrate that the error converges linearly in the number of iterations and in the smoothing parameter, and faster than algebraically in the number of degrees of freedom, which consists of the number of quadrature points in one variable and the tensor ranks. Moreover, in the topology optimization problem, the smoothed shared sparsity penalty actually reduces the tensor ranks compared to the unpenalised solution. This enables us to find a sparse high-resolution design under high-dimensional uncertainty.
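
As an illustration of the kind of smoothing involved (the exact penalty used in the paper may differ; the form below is an assumption for illustration), a shared sparsity term that couples the control across the uncertain parameters can be regularized by the standard $\varepsilon$-smoothing of the norm,
\[
\int_{D}\big\|z(x,\cdot)\big\|\,dx
\;\approx\;
\int_{D}\sqrt{\big\|z(x,\cdot)\big\|^{2}+\varepsilon^{2}}\;dx,
\]
where $z$ is the control, $D$ the spatial domain, the norm is taken over the uncertain parameters, and $\varepsilon>0$ is the smoothing parameter in which the error is reported to converge linearly.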

We present a new $hp$-version space-time discontinuous Galerkin (dG) finite element method for the numerical approximation of parabolic evolution equations on general spatial meshes consisting of polygonal/polyhedral (polytopic) elements, giving rise to prismatic space-time elements. A key feature of the proposed method is the use of space-time elemental polynomial bases of \emph{total} degree, say $p$, defined in the physical coordinate system, as opposed to standard dG time-stepping methods, whereby spatial elemental bases are tensorized with temporal basis functions. This approach leads to a fully discrete $hp$-dG scheme using fewer degrees of freedom per time step than standard dG time-stepping schemes employing tensorized space-time bases, with acceptable deterioration of the approximation properties. A second key feature of the new space-time dG method is the incorporation of very general spatial meshes consisting of possibly polygonal/polyhedral elements with an \emph{arbitrary} number of faces. A priori error bounds are shown for the proposed method in various norms. An extensive comparison among the new space-time dG method, the (standard) tensorized space-time dG method, the classical dG time-stepping scheme, and a conforming finite element method in space is presented in a series of numerical experiments.
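
To quantify the reduction in degrees of freedom, compare the dimension of the total-degree space on a $(d+1)$-dimensional prismatic space-time element with that of a tensorized space (assuming, for illustration, temporal degree equal to the spatial degree $p$):
\[
\dim\mathcal{P}_{p}(\mathbb{R}^{d+1})=\binom{p+d+1}{d+1}
\qquad\text{versus}\qquad
\dim\big(\mathcal{P}_{p}(\mathbb{R}^{d})\otimes\mathcal{P}_{p}(\mathbb{R})\big)=\binom{p+d}{d}(p+1);
\]
for example, with $d=2$ and $p=3$ this gives $20$ versus $40$ basis functions per space-time element.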

We propose a novel, highly efficient, second-order accurate, long-time unconditionally stable numerical scheme for a class of finite-dimensional nonlinear models that are of importance in geophysical fluid dynamics. The scheme is highly efficient in the sense that only a (fixed) symmetric positive definite linear problem (with varying right-hand sides) needs to be solved at each time step. The solutions to the scheme are uniformly bounded for all time. We show that the scheme is able to capture the long-time dynamics of the underlying geophysical model, with the global attractors as well as the invariant measures of the scheme converging to those of the original model as the step size approaches zero. In our numerical experiments, we take an indirect approach, using long-term statistics to approximate the invariant measures. Our results suggest that the convergence rate of the long-term statistics, as a function of terminal time, is approximately first order in the Jensen-Shannon metric and half order in the $L^1$ metric. This implies that very long time simulations are needed in order to capture a few significant digits of the long-term statistics (climate) correctly. Nevertheless, the second-order scheme's performance remains superior to that of the first-order one, requiring significantly less time to reach a small neighborhood of statistical equilibrium for a given step size.
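
For reference, the Jensen-Shannon divergence between two probability distributions $P$ and $Q$ is defined as
\[
\mathrm{JS}(P,Q)\;=\;\tfrac12\,\mathrm{KL}(P\,\|\,M)+\tfrac12\,\mathrm{KL}(Q\,\|\,M),
\qquad M=\tfrac12(P+Q),
\]
with the Jensen-Shannon metric usually taken as its square root (standard definitions given here for context, not specific to the paper).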

The gradient bounds of generalized barycentric coordinates play an essential role in the $H^1$ norm approximation error estimate for generalized barycentric interpolations. Similarly, the $H^k$ norm estimate, $k>1$, requires upper bounds on higher-order derivatives, which are not available in the literature. In this paper, we derive such upper bounds for the Wachspress generalized barycentric coordinates on simple convex $d$-dimensional polytopes, $d\ge 1$. The result can be used to prove optimal convergence for Wachspress-based polytopal finite element approximations of, for example, fourth-order elliptic equations. Another contribution of this paper is to compare various shape-regularity conditions for simple convex polytopes and to clarify their relations using knowledge from convex geometry.
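
For orientation, in the planar case $d=2$ the Wachspress coordinates of a point $x$ in a convex polygon with vertices $v_1,\dots,v_n$ take the classical form (the paper treats simple convex polytopes in general dimension)
\[
\lambda_i(x)=\frac{w_i(x)}{\sum_{j=1}^{n}w_j(x)},
\qquad
w_i(x)=\frac{A(v_{i-1},v_i,v_{i+1})}{A(v_{i-1},v_i,x)\,A(v_i,v_{i+1},x)},
\]
where $A(a,b,c)$ denotes the signed area of the triangle with vertices $a$, $b$, $c$.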

The rapid advancement of quantum technologies calls for the design and deployment of quantum-safe cryptographic protocols and communication networks. There are two primary approaches to achieving quantum-resistant security: quantum key distribution (QKD) and post-quantum cryptography (PQC). While each offers unique advantages, both have drawbacks in practical implementation. In this work, we outline the pros and cons of these approaches and explore how they can be combined to achieve a higher level of security and/or improved performance in key distribution. We hope our discussion inspires further research into the design of hybrid cryptographic protocols for quantum-classical communication networks.

This work presents a novel global digital image correlation (DIC) method based on a newly developed convolution finite element (C-FE) approximation. The convolution approximation can be built on a mesh of linear finite elements and enables arbitrarily high-order approximations without adding more degrees of freedom. Therefore, the C-FE based DIC can be more accurate than the usual FE based DIC, providing highly smooth and accurate displacement and strain results with the same element size. The detailed formulation and implementation of the method are discussed in this work. The controlling parameters in the method include the polynomial order, patch size, and dilation. A general choice of the parameters and their potential adaptivity are discussed. The proposed DIC method has been tested on several representative examples, including the DIC Challenge 2.0 benchmark problems, with comparison to the usual FE based DIC. C-FE outperformed FE in all the DIC results for the tested examples. This work demonstrates the potential of C-FE and opens a new avenue to enable highly smooth, accurate, and robust DIC analysis for full-field displacement and strain measurements.
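
In the usual global DIC setting (a generic formulation given here for context, not the paper's exact functional), the displacement field is found by minimizing the gray-level residual between the reference image $f$ and the deformed image $g$ over the region of interest $\Omega$,
\[
\min_{\mathbf{u}}\;\int_{\Omega}\big[f(\mathbf{x})-g\big(\mathbf{x}+\mathbf{u}(\mathbf{x})\big)\big]^{2}\,d\mathbf{x},
\]
with $\mathbf{u}$ expanded in a chosen basis; in the proposed method that basis is the C-FE approximation built on a linear finite element mesh.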
