
In this paper, we construct and analyze divergence-free finite element methods for the Stokes problem on smooth domains. The discrete spaces are based on the Scott-Vogelius finite element pair of arbitrary polynomial degree greater than two. By combining the Piola transform with the classical isoparametric framework, and with a judicious choice of degrees of freedom, we prove that the method converges with optimal order in the energy norm. We also show that the discrete velocity error converges with optimal order in the $L^2$-norm. Numerical experiments are presented, which support the theoretical results.

Related content

Approximation · Random sampling · Performer · Integration · Stochastic gradient descent

May 30, 2024

On the randomized Euler scheme for SDEs with integral-form drift

Paweł Przybyłowicz, Michał Sobieraj

In this paper, we investigate the problem of strong approximation of the solution of SDEs in the case when the drift coefficient is given in the integral form. Such drift often appears when analyzing stochastic dynamics of optimization procedures in machine learning problems. We discuss connections of the defined randomized Euler approximation scheme with the perturbed version of the stochastic gradient descent (SGD) algorithm. We investigate its upper error bounds, in terms of the discretization parameter n and the size M of the random sample drawn at each step of the algorithm, in different subclasses of coefficients of the underlying SDE. Finally, the results of numerical experiments performed by using GPU architecture are also reported.
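The idea of drawing a random sample of size M at each of the n steps can be sketched as follows. This is a minimal illustration, not the authors' scheme: it assumes the drift has the form a(x) = E[H(x, ξ)] and replaces it with a sample mean at every Euler step. The names `randomized_euler`, `H`, `sample_xi`, and the constant diffusion `sigma` are hypothetical choices made for this example.

```python
import math
import random


def randomized_euler(x0, T, n, M, H, sample_xi, sigma, rng):
    """Euler scheme where the integral-form drift a(x) = E[H(x, xi)]
    is replaced at each step by a Monte Carlo mean over M fresh samples.

    All argument names are illustrative assumptions, not the paper's notation.
    """
    h = T / n  # step size for n equal steps on [0, T]
    x = x0
    for _ in range(n):
        # Sample-mean approximation of the drift at the current state.
        drift = sum(H(x, sample_xi(rng)) for _ in range(M)) / M
        # Brownian increment over a step of length h.
        dW = rng.gauss(0.0, math.sqrt(h))
        x = x + h * drift + sigma * dW
    return x


# Toy example: H(x, xi) = -(x - xi) with xi ~ N(0, 1), so a(x) = -x.
rng = random.Random(0)
x_T = randomized_euler(
    x0=1.0, T=1.0, n=100, M=200,
    H=lambda x, xi: -(x - xi),
    sample_xi=lambda r: r.gauss(0.0, 1.0),
    sigma=0.0,  # no diffusion, to isolate the effect of the sampled drift
    rng=rng,
)
```

With `sigma=0.0` the only randomness comes from the sampled drift, so `x_T` should stay close to the deterministic Euler value `0.99 ** 100`; increasing `M` shrinks the remaining fluctuation.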

Block · Mutually independent · PAC learning theory · Generalization theory · Constraint

May 30, 2024

Length independent generalization bounds for deep SSM architectures with stability constraints

Dániel Rácz, Mihály Petreczky, Bálint Daróczy

from arxiv, 25 pages, no figures, under submission

Many state-of-the-art models trained on long-range sequences, for example S4, S5 or LRU, are made of sequential blocks combining State-Space Models (SSMs) with neural networks. In this paper we provide a PAC bound that holds for this kind of architecture with stable SSM blocks and does not depend on the length of the input sequence. Imposing stability of the SSM blocks is a standard practice in the literature, and it is known to help performance. Our results provide a theoretical justification for the use of stable SSM blocks as the proposed PAC bound decreases as the degree of stability of the SSM blocks increases.

Analysis · Factored · Factor analysis · PCA · Latent/hidden variable

May 30, 2024

A unified framework of principal component analysis and factor analysis

Shifeng Xiong

from arxiv, 24 pages, 2 figures

Principal component analysis and factor analysis are fundamental multivariate analysis methods. In this paper a unified framework to connect them is introduced. Under a general latent variable model, we present matrix optimization problems from the viewpoint of loss function minimization, and show that the two methods can be viewed as solutions to the optimization problems with specific loss functions. Specifically, principal component analysis can be derived from a broad class of loss functions including the L2 norm, while factor analysis corresponds to a modified L0 norm problem. Related problems are discussed, including algorithms, penalized maximum likelihood estimation under the latent variable model, and a principal component factor model. These results can lead to new tools of data analysis and research topics.

Optimizer · Continuity · Vectorization · Extensibility · Continuous optimization

May 30, 2024

A random-key GRASP for combinatorial optimization

Antonio A. Chaves, Mauricio G. C. Resende, Ricardo M. A. Silva

from arxiv, 24 pages, 8 figures

This paper proposes a problem-independent GRASP metaheuristic using the random-key optimizer (RKO) paradigm. GRASP (greedy randomized adaptive search procedure) is a metaheuristic for combinatorial optimization that repeatedly applies a semi-greedy construction procedure followed by a local search procedure. The best solution found over all iterations is returned as the solution of the GRASP. Continuous GRASP (C-GRASP) is an extension of GRASP for continuous optimization in the unit hypercube. A random-key optimizer (RKO) uses a vector of random keys to encode a solution to a combinatorial optimization problem. It uses a decoder to evaluate a solution encoded by the vector of random keys. A random-key GRASP is a C-GRASP where points in the unit hypercube are evaluated employing a decoder. We describe a random-key GRASP consisting of a problem-independent component and a problem-dependent decoder. As a proof of concept, the random-key GRASP is tested on five NP-hard combinatorial optimization problems: traveling salesman problem, tree of hubs location problem, Steiner triple covering problem, node capacitated graph partitioning problem, and job sequencing and tool switching problem.
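To make the role of the decoder concrete, here is a minimal sketch for the traveling salesman problem. It assumes a sorting-based decoder, a common choice for permutation problems; the abstract does not say which decoders the authors use, and the names `decode_tour` and `tour_length` are hypothetical.

```python
import math


def decode_tour(keys):
    """Map a vector of random keys in [0, 1) to a permutation of cities
    by sorting city indices by their key (assumed sorting-based decoder)."""
    return sorted(range(len(keys)), key=lambda i: keys[i])


def tour_length(tour, coords):
    """Length of the closed tour that visits the cities in the given order."""
    return sum(
        math.dist(coords[tour[i]], coords[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


# Four cities on the corners of the unit square.
coords = [(0, 0), (1, 0), (1, 1), (0, 1)]
# A point in the unit hypercube, as a continuous optimizer would propose it.
keys = [0.10, 0.35, 0.60, 0.90]

tour = decode_tour(keys)            # visiting order induced by the keys
length = tour_length(tour, coords)  # objective value seen by the optimizer
```

The optimizer only ever manipulates `keys`; all problem-specific knowledge sits in the decoder and the objective, which is what makes the search component reusable across problems.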

Origin · Error rate · Controller · Sample · Statistic

May 30, 2024

The assessment of replicability using the sum of p-values

Leonhard Held, Samuel Pawel, Charlotte Micheloud

from arxiv, 6 figures, 0 tables, 1 box

Statistical significance of both the original and the replication study is a commonly used criterion to assess replication attempts, also known as the two-trials rule in drug development. However, replication studies are sometimes conducted although the original study is non-significant, in which case Type-I error rate control across both studies is no longer guaranteed. We propose an alternative method to assess replicability using the sum of p-values from the two studies. The approach provides a combined p-value and can be calibrated to control the overall Type-I error rate at the same level as the two-trials rule but allows for replication success even if the original study is non-significant. The unweighted version requires a less restrictive level of significance at replication if the original study is already convincing, which facilitates sample size reductions of up to 10%. Downweighting the original study accounts for possible bias and requires a more stringent significance level and larger sample sizes at replication. Data from four large-scale replication projects are used to illustrate and compare the proposed method with the two-trials rule, meta-analysis and Fisher's combination method.
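A combined p-value from a sum can be illustrated with the unweighted case. The sketch below relies only on a standard fact: under the null hypothesis, the sum of two independent uniform p-values has a triangular distribution on [0, 2]. The function name `combined_p_value` is hypothetical, and the weighted version and the exact calibration described in the abstract are not reproduced here.

```python
def combined_p_value(p_original, p_replication):
    """Combined p-value for the unweighted sum of two independent p-values.

    Uses the CDF of the sum of two independent Uniform(0, 1) variables:
    P(S <= s) = s^2 / 2 for s <= 1, and 1 - (2 - s)^2 / 2 for 1 < s <= 2.
    """
    for p in (p_original, p_replication):
        if not 0.0 <= p <= 1.0:
            raise ValueError("p-values must lie in [0, 1]")
    s = p_original + p_replication
    if s <= 1.0:
        return s * s / 2.0
    return 1.0 - (2.0 - s) ** 2 / 2.0


# Both studies are individually convincing.
p_both_small = combined_p_value(0.03, 0.02)
# The original study is non-significant at 0.05, the replication is strong.
p_mixed = combined_p_value(0.06, 0.001)
```

Unlike a rule that requires each study to be significant on its own, the sum lets a strong replication compensate for a borderline original study, which matches the behaviour the abstract describes.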

Scalar · Linear · Extensibility · Lecture notes · Trace

May 29, 2024

A high-order Eulerian-Lagrangian Runge-Kutta finite volume (EL-RK-FV) method for scalar nonlinear conservation laws

Jiajie Chen, Joseph Nakao, Jing-Mei Qiu, Yang Yang

from arxiv, 29 pages

We present a class of high-order Eulerian-Lagrangian Runge-Kutta finite volume methods that can numerically solve Burgers' equation with shock formations, which could be extended to general scalar conservation laws. Eulerian-Lagrangian (EL) and semi-Lagrangian (SL) methods have recently seen increased development and have become a staple for allowing large time-stepping sizes. Yet, maintaining relatively large time-stepping sizes post shock formation remains quite challenging. Our proposed scheme integrates the partial differential equation on a space-time region partitioned by linear approximations to the characteristics determined by the Rankine-Hugoniot jump condition. We trace the characteristics forward in time and present a merging procedure for the mesh cells to handle intersecting characteristics due to shocks. Following this partitioning, we write the equation in a time-differential form and evolve with Runge-Kutta methods in a method-of-lines fashion. High-resolution methods such as ENO and WENO-AO schemes are used for spatial reconstruction. Extension to higher dimensions is done via dimensional splitting. Numerical experiments demonstrate our scheme's high-order accuracy and ability to sharply capture post-shock solutions with large time-stepping sizes.

Optimizer · Graph · Robot · Agent · Pairwise

May 29, 2024

An optimal algorithm for geodesic mutual visibility on hexagonal grids

Sahar Badri, Serafino Cicerone, Alessia Di Fonso, Gabriele Di Stefano

from arxiv, 24 pages, 13 figures

For a set of robots (or agents) moving in a graph, two properties are highly desirable: confidentiality (i.e., a message between two agents must not pass through any intermediate agent) and efficiency (i.e., messages are delivered through shortest paths). These properties can be obtained if the Geodesic Mutual Visibility (GMV, for short) problem is solved: oblivious robots move along the edges of the graph, without collisions, to occupy some vertices that guarantee they become pairwise geodesic mutually visible. This means there is a shortest path (i.e., a "geodesic") between each pair of robots along which no other robots reside. In this work, we optimally solve GMV on finite hexagonal grids $G_k$. This, in turn, requires first solving a graph combinatorial problem, i.e., determining the maximum number of mutually visible vertices in $G_k$.

Attention · Functional · Performer · CASE · Transformation

May 29, 2024

Are queries and keys always relevant? A case study on Transformer wave functions

Riccardo Rende, Luciano Loris Viteritti

from arxiv, 9 pages, 4 figures

The dot product attention mechanism, originally designed for natural language processing (NLP) tasks, is a cornerstone of modern Transformers. It adeptly captures semantic relationships between word pairs in sentences by computing a similarity overlap between queries and keys. In this work, we explore the suitability of Transformers, focusing on their attention mechanisms, in the specific domain of the parametrization of variational wave functions to approximate ground states of quantum many-body spin Hamiltonians. Specifically, we perform numerical simulations on the two-dimensional $J_1$-$J_2$ Heisenberg model, a common benchmark in the field of quantum many-body systems on a lattice. By comparing the performance of standard attention mechanisms with a simplified version that excludes queries and keys, relying solely on positions, we achieve competitive results while reducing computational cost and parameter usage. Furthermore, through the analysis of the attention maps generated by standard attention mechanisms, we show that the attention weights become effectively input-independent at the end of the optimization. We support the numerical results with analytical calculations, providing physical insights into why queries and keys should be, in principle, omitted from the attention mechanism when studying large systems. Interestingly, the same arguments can be extended to the NLP domain, in the limit of long input sentences.
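The difference between standard attention and a version that relies solely on positions can be sketched in a few lines. This is an illustrative simplification, not the authors' architecture: it assumes the attention weights are a softmax over a learnable score table indexed only by the pair of positions (i, j), so they never depend on the input. The names `positional_attention` and `pos_scores` are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax of a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def positional_attention(pos_scores, values):
    """Attention without queries and keys.

    pos_scores[i][j] is a (learnable) score for position i attending to
    position j; it replaces the input-dependent term q_i . k_j / sqrt(d).
    values[j] is the value vector at position j.
    """
    weights = [softmax(row) for row in pos_scores]
    dim = len(values[0])
    return [
        [sum(w[j] * values[j][k] for j in range(len(values))) for k in range(dim)]
        for w in weights
    ]


# Three positions with two-dimensional value vectors.
values = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
# All-zero scores give uniform weights, so each output is the mean value.
uniform_scores = [[0.0] * 3 for _ in range(3)]
output = positional_attention(uniform_scores, values)
```

Because the weights come from a fixed table, there are no query or key projections to store or evaluate, which is one way to read the reduction in parameters and compute that the abstract reports.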

Sample · Kernelization · CASES · Normalized · Multimodal

May 28, 2024

Sampling metastable systems using collective variables and Jarzynski-Crooks paths

Christoph Schönle, Marylou Gabrié, Tony Lelièvre, Gabriel Stoltz

We consider the problem of sampling a high dimensional multimodal target probability measure. We assume that a good proposal kernel to move only a subset of the degrees of freedom (also known as collective variables) is known a priori. This proposal kernel can for example be built using normalizing flows. We show how to extend the move from the collective variable space to the full space and how to implement an accept-reject step in order to get a reversible chain with respect to a target probability measure. The accept-reject step does not require knowing the marginal of the original measure in the collective variable (namely, the free energy). The obtained algorithm admits several variants, some of them being very close to methods which have been proposed previously in the literature. We show how the obtained acceptance ratio can be expressed in terms of the work which appears in the Jarzynski-Crooks equality, at least for some variants. Numerical illustrations demonstrate the efficiency of the approach on various simple test cases, and allow us to compare the variants of the algorithm.

Microsoft Surface · Neural Networks · Networking · MoDELS · Loss function (machine learning)

May 28, 2021

Incorporating prior financial domain knowledge into neural networks for implied volatility surface prediction

Yu Zheng, Yongxin Yang, Bowei Chen

from arxiv, 8 pages, SIGKDD 2021

In this paper we develop a novel neural network model for predicting the implied volatility surface. Prior financial domain knowledge is taken into account. A new activation function that incorporates volatility smile is proposed, which is used for the hidden nodes that process the underlying asset price. In addition, financial conditions, such as the absence of arbitrage, the boundaries and the asymptotic slope, are embedded into the loss function. This is one of the very first studies that discusses a methodological framework that incorporates prior financial domain knowledge into neural network architecture design and model training. The proposed model outperforms the benchmarked models with the option data on the S&P 500 index over 20 years. More importantly, the domain knowledge is satisfied empirically, showing the model is consistent with the existing financial theories and conditions related to the implied volatility surface.