
Although Regge finite element functions are not continuous, useful generalizations of nonlinear derivatives such as the curvature can be defined using them. This paper is devoted to studying the convergence of the finite element lifting of a generalized (distributional) Gauss curvature defined using a metric tensor in the Regge finite element space. Specifically, we investigate the interplay between the polynomial degree of the curvature lifting by Lagrange elements and the degree of the metric tensor in the Regge finite element space. Previously, a superconvergence result, with a convergence rate one order higher than expected, was obtained when the metric is the canonical Regge interpolant of the exact metric. In this work, we show that an even higher order can be obtained if the degree of the curvature lifting is reduced by one polynomial degree and if at least linear Regge elements are used. These improved convergence rates are confirmed by numerical examples.
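
For context, the abstract does not restate the definition it builds on; in the literature on distributional curvature of Regge metrics, the generalized Gauss curvature tested against a function $\varphi$ is commonly taken to be of the form

\[
\langle K\omega, \varphi \rangle \;=\; \sum_{T\in\mathcal{T}} \int_T K\,\varphi\,\omega \;+\; \sum_{E\in\mathcal{E}^\circ} \int_E [\![\kappa_g]\!]\,\varphi\,\mathrm{d}s \;+\; \sum_{V\in\mathcal{V}} \Big(2\pi - \sum_{T\ni V}\theta_V^T\Big)\,\varphi(V),
\]

combining the elementwise Gauss curvature $K$, the jump $[\![\kappa_g]\!]$ of the geodesic curvature across interior edges, and the angle defect at vertices; the lifting studied here projects this functional onto a Lagrange finite element space.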

Related Content

As the development of formal proofs is a time-consuming task, it is important to devise ways of sharing the already written proofs to prevent wasting time redoing them. One of the challenges in this domain is to translate proofs written in proof assistants based on impredicative logics to proof assistants based on predicative logics, whenever impredicativity is not used in an essential way. In this paper we present a transformation for sharing proofs with a core predicative system supporting prenex universe polymorphism (like in Agda). It consists in trying to elaborate each term into a predicative universe polymorphic term as general as possible. The use of universe polymorphism is justified by the fact that mapping each universe to a fixed one in the target theory is not sufficient in most cases. During the elaboration, we need to solve unification problems in the equational theory of universe levels. In order to do this, we give a complete characterization of when a single equation admits a most general unifier. This characterization is then employed in a partial algorithm which uses a constraint-postponement strategy for trying to solve unification problems. The proposed translation is of course partial, but in practice allows one to translate many proofs that do not use impredicativity in an essential way. Indeed, it was implemented in the tool Predicativize and then used to translate semi-automatically many non-trivial developments from Matita's library to Agda, including proofs of Bertrand's Postulate and Fermat's Little Theorem, which (as far as we know) were not available in Agda yet.
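
To make the equational theory concrete, here is a minimal Python sketch (illustrative only, not the Predicativize implementation) of normal forms for universe levels built from 0, successor, max, and level variables; unification problems of the kind discussed above are posed modulo exactly this sort of normal-form equality.

```python
# A minimal normal-form sketch for universe levels (illustrative only,
# not the Predicativize implementation). Every level built from 0,
# successor, max and variables normalizes to max(c, x1 + a1, ..., xn + an),
# represented here as (c, {x1: a1, ...}).

def const(c):
    return (c, {})

def var(x, shift=0):
    return (0, {x: shift})

def suc(l):
    c, vs = l
    return (c + 1, {x: a + 1 for x, a in vs.items()})

def lmax(l1, l2):
    (c1, v1), (c2, v2) = l1, l2
    vs = dict(v1)
    for x, a in v2.items():
        vs[x] = max(vs.get(x, -1), a)
    return (max(c1, c2), vs)

def canon(l):
    # max(c, x + a) = x + a whenever a >= c, so the constant part is
    # absorbed once some variable carries an equal or larger shift
    c, vs = l
    if vs and c <= max(vs.values()):
        c = 0
    return (c, tuple(sorted(vs.items())))

def levels_equal(l1, l2):
    return canon(l1) == canon(l2)

# max(1, x + 1) and x + 1 denote the same level function ...
assert levels_equal(lmax(const(1), var("x", 1)), var("x", 1))
# ... while max(1, x) and x differ (they disagree at x = 0)
assert not levels_equal(lmax(const(1), var("x")), var("x"))
```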

For problems of time-harmonic scattering by rational polygonal obstacles, embedding formulae express the far-field pattern induced by any incident plane wave in terms of the far-field patterns for a relatively small (frequency-independent) set of canonical incident angles. Although these remarkable formulae are exact in theory, here we demonstrate that: (i) they are highly sensitive to numerical errors in practice, and (ii) direct calculation of the coefficients in these formulae may be impossible for particular sets of canonical incident angles, even in exact arithmetic. Only by overcoming these practical issues can embedding formulae provide a highly efficient approach to computing the far-field pattern induced by a large number of incident angles. Here we address challenges (i) and (ii), supporting our theory with numerical experiments. Challenge (i) is solved using techniques from computational complex analysis: we reformulate the embedding formula as a complex contour integral and prove that this is much less sensitive to numerical errors. In practice, this contour integral can be efficiently evaluated by residue calculus. Challenge (ii) is addressed using techniques from numerical linear algebra: we oversample, considering more canonical incident angles than are necessary, thus expanding the set of valid coefficient vectors. The coefficient vector can then be selected using either a least squares approach or column subset selection.
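
As a concrete illustration of the oversampling remedy for challenge (ii), the sketch below (with a synthetic matrix standing in for actual far-field data, so the numbers are not a scattering computation) selects a coefficient vector either by least squares over all oversampled canonical angles or by pivoted-QR column subset selection:

```python
# Hedged sketch: choosing embedding-formula coefficients from an
# oversampled set of canonical incident angles. A is synthetic stand-in
# data, not the output of a scattering solver.
import numpy as np
from scipy.linalg import qr

rng = np.random.default_rng(0)
m, M = 8, 12          # m coefficients needed, M > m oversampled angles
A = rng.standard_normal((m, M)) + 1j * rng.standard_normal((m, M))
b = A[:, :m] @ rng.standard_normal(m)   # consistent right-hand side

# Option 1: least squares over all oversampled columns.
c_ls, *_ = np.linalg.lstsq(A, b, rcond=None)

# Option 2: pivoted QR picks the m best-conditioned columns.
_, _, piv = qr(A, pivoting=True)
cols = piv[:m]
c_css = np.linalg.solve(A[:, cols], b)

print(np.linalg.norm(A @ c_ls - b), np.linalg.norm(A[:, cols] @ c_css - b))
```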

We introduce a novel quantum programming language featuring higher-order programs and quantum control flow which ensures that all qubit transformations are unitary. Our language boasts a type system guaranteeing both unitarity and polynomial-time normalization. Unitarity is achieved by using a special modality for superpositions while requiring orthogonality among superposed terms. Polynomial-time normalization is achieved using a linear-logic-based type discipline employing Barber and Plotkin duality along with a specific modality to account for potential duplications. This type discipline also guarantees that derived values have polynomial size. Our language seamlessly combines the two modalities: quantum circuit programs uphold unitarity, and all programs are evaluated in polynomial time, ensuring their feasibility.
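
The unitarity-from-orthogonality idea can be checked numerically; the following sketch (ours, not the paper's type system) shows that a map whose superposed branches are orthonormal amplitude vectors is unitary, while non-orthogonal branches break unitarity:

```python
# Numerical illustration of why orthogonality of superposed terms
# enforces unitarity: a map sending |0> and |1> to two branch vectors
# is unitary exactly when those vectors are orthonormal.
import numpy as np

def branch_map(v0, v1):
    """Columns are the images of |0> and |1>."""
    return np.column_stack([v0, v1])

def is_unitary(U, tol=1e-12):
    return np.allclose(U.conj().T @ U, np.eye(U.shape[1]), atol=tol)

h = 1 / np.sqrt(2)
ok  = branch_map(np.array([h,  h]), np.array([h, -h]))  # orthonormal (Hadamard)
bad = branch_map(np.array([h,  h]), np.array([1,  0]))  # non-orthogonal branches

print(is_unitary(ok))   # True
print(is_unitary(bad))  # False
```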

It is well known that the class of rotation invariant algorithms is suboptimal even for learning sparse linear problems when the number of examples is below the "dimension" of the problem. This class includes any gradient descent trained neural net with a fully-connected input layer (initialized with a rotationally symmetric distribution). The simplest sparse problem is learning a single feature out of $d$ features. In that case the classification error or regression loss grows with $1-k/d$, where $k$ is the number of examples seen. These lower bounds become vacuous when the number of examples $k$ reaches the dimension $d$. We show that when noise is added to this sparse linear problem, rotation invariant algorithms are still suboptimal after seeing $d$ or more examples. We prove this via a lower bound for the Bayes optimal algorithm on a rotationally symmetrized problem. We then prove much smaller upper bounds on the same problem for simple non-rotation-invariant algorithms. Finally, we analyze the gradient flow trajectories of many standard optimization algorithms in some simple cases and show how they veer toward or away from the sparse targets. We believe that our trajectory categorization will be useful in designing algorithms that can exploit sparse targets, and that our method for proving lower bounds will be crucial for analyzing other families of algorithms that admit different classes of invariances.
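
The gap described above is easy to reproduce in simulation. The following hedged sketch (our own toy setup, not the paper's construction) compares the rotation invariant minimum-norm least-squares estimator against a trivial coordinate-selection rule on a noisy 1-sparse target, in the regime $k \geq d$:

```python
# Toy comparison on a noisy 1-sparse linear problem: the rotation
# invariant estimator keeps paying for all d coordinates, while a
# simple coordinate-selection rule does not.
import numpy as np

rng = np.random.default_rng(1)
d, k, sigma, trials = 200, 400, 1.0, 20    # k >= d: past the classical regime
err_rot, err_sparse = [], []
for _ in range(trials):
    X = rng.standard_normal((k, d))
    y = X[:, 0] + sigma * rng.standard_normal(k)   # noisy 1-sparse target
    w_rot = np.linalg.pinv(X) @ y                  # rotation invariant (min-norm LS)
    corr = X.T @ y
    j = np.argmax(np.abs(corr))                    # pick the single best coordinate
    w_sp = np.zeros(d)
    w_sp[j] = corr[j] / (X[:, j] @ X[:, j])
    e1 = np.zeros(d)
    e1[0] = 1.0
    err_rot.append(np.sum((w_rot - e1) ** 2))
    err_sparse.append(np.sum((w_sp - e1) ** 2))

# roughly sigma^2 * d / k for the invariant estimator vs sigma^2 / k here
print(np.mean(err_rot), np.mean(err_sparse))
```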

Robust generalization is a major challenge in deep learning, particularly when the number of trainable parameters is very large. In general, it is very difficult to know if the network has memorized a particular set of examples or understood the underlying rule (or both). Motivated by this challenge, we study an interpretable model where generalizing representations are understood analytically, and are easily distinguishable from the memorizing ones. Namely, we consider multi-layer perceptron (MLP) and Transformer architectures trained on modular arithmetic tasks, where ($\xi \cdot 100\%$) of labels are corrupted (\emph{i.e.} some results of the modular operations in the training set are incorrect). We show that (i) it is possible for the network to memorize the corrupted labels \emph{and} achieve $100\%$ generalization at the same time; (ii) the memorizing neurons can be identified and pruned, lowering the accuracy on corrupted data and improving the accuracy on uncorrupted data; (iii) regularization methods such as weight decay, dropout and BatchNorm force the network to ignore the corrupted data during optimization, and achieve $100\%$ accuracy on the uncorrupted dataset; and (iv) the effect of these regularization methods is (``mechanistically'') interpretable: weight decay and dropout force all the neurons to learn generalizing representations, while BatchNorm de-amplifies the output of memorizing neurons and amplifies the output of the generalizing ones. Finally, we show that in the presence of regularization, the training dynamics involves two consecutive stages: first, the network undergoes \emph{grokking} dynamics reaching high train \emph{and} test accuracy; second, it unlearns the memorizing representations, where the train accuracy suddenly drops from $100\%$ to $100 (1-\xi)\%$.
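
For concreteness, a minimal sketch of the corrupted training setup (under assumed conventions for modular addition; not the authors' code):

```python
# Modular addition dataset (a, b) -> (a + b) mod p, with a fraction xi
# of labels replaced by uniformly random *incorrect* residues.
import numpy as np

def modular_dataset(p=97, xi=0.2, seed=0):
    rng = np.random.default_rng(seed)
    a, b = np.meshgrid(np.arange(p), np.arange(p), indexing="ij")
    pairs = np.stack([a.ravel(), b.ravel()], axis=1)
    labels = (pairs[:, 0] + pairs[:, 1]) % p
    corrupt = rng.random(len(labels)) < xi
    # adding a random offset in [1, p-1] guarantees a wrong label
    labels[corrupt] = (labels[corrupt]
                       + rng.integers(1, p, corrupt.sum())) % p
    return pairs, labels, corrupt

pairs, labels, corrupt = modular_dataset()
print(pairs.shape, corrupt.mean())   # (9409, 2), roughly 0.2
```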

Classical tests are available for the two-sample problem of testing the correspondence of distribution functions. Among these, the Kolmogorov-Smirnov test also provides a graphical interpretation of the test results, in different forms. Here, we propose modifications of the Kolmogorov-Smirnov test with higher power. The proposed tests are based on the so-called global envelope test, which allows for a graphical interpretation similar to that of the Kolmogorov-Smirnov test. The tests are based on rank statistics and are also suitable for the comparison of $n$ samples, with $n \geq 2$. We compare the alternatives for the two-sample case through an extensive simulation study and discuss their interpretation. Finally, we apply the tests to real data. Specifically, we compare the height distributions between boys and girls at different ages, as well as sepal length distributions of different flower species, using the proposed methodologies.
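
A hedged sketch of the envelope idea for the two-sample case (a pointwise min/max envelope over label permutations, i.e. the extreme-rank special case; the paper's actual tests refine this with rank statistics):

```python
# Permutation envelope around the ECDF difference of two samples.
# The level is only approximately controlled in this crude version.
import numpy as np

def ecdf_diff(x, y, grid):
    fx = np.searchsorted(np.sort(x), grid, side="right") / len(x)
    fy = np.searchsorted(np.sort(y), grid, side="right") / len(y)
    return fx - fy

def envelope_two_sample(x, y, n_perm=999, seed=0):
    rng = np.random.default_rng(seed)
    pooled = np.concatenate([x, y])
    grid = np.sort(pooled)
    obs = ecdf_diff(x, y, grid)
    sims = np.empty((n_perm, grid.size))
    for i in range(n_perm):
        perm = rng.permutation(pooled)
        sims[i] = ecdf_diff(perm[:len(x)], perm[len(x):], grid)
    lo, hi = sims.min(axis=0), sims.max(axis=0)
    # graphical interpretation: wherever obs exits [lo, hi],
    # the two distribution functions are judged to differ
    return bool(np.any((obs < lo) | (obs > hi))), obs, lo, hi

rng = np.random.default_rng(1)
x = rng.normal(0.0, 1.0, 100)
y = rng.normal(0.8, 1.0, 100)
reject, *_ = envelope_two_sample(x, y)
print(reject)   # expect True for this mean shift
```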

We develop a new, powerful method for counting elements in a multiset. As a first application, we use this algorithm to study the number of occurrences of patterns in a permutation. For patterns of length 3 there are two Wilf classes, and the general behaviour of these is reasonably well-known. We slightly extend some of the known results in that case, and exhaustively study the case of patterns of length 4, about which there is little previous knowledge. For such patterns, there are seven Wilf classes, and based on extensive enumerations and careful series analysis, we have conjectured the asymptotic behaviour for all classes.
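
For reference, the quantity being enumerated, the number of occurrences of a pattern in a permutation, can be computed by brute force on small cases (a generic sketch, useful for validating series data, not the paper's counting method):

```python
# Count subsequences of perm whose relative order matches pattern.
# Exponential in general, so only suitable for small instances.
from itertools import combinations

def pattern_count(perm, pattern):
    k = len(pattern)
    order = tuple(sorted(range(k), key=lambda i: pattern[i]))
    return sum(
        1 for sub in combinations(perm, k)
        if tuple(sorted(range(k), key=lambda i: sub[i])) == order
    )

# occurrences of the increasing pattern 1234 in a small permutation
print(pattern_count((1, 3, 2, 4, 5), (1, 2, 3, 4)))   # 2: 1-3-4-5 and 1-2-4-5
```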

Privacy protection methods, such as differentially private mechanisms, introduce noise into resulting statistics which often produces complex and intractable sampling distributions. In this paper, we propose a simulation-based "repro sample" approach to produce statistically valid confidence intervals and hypothesis tests, which builds on the work of Xie and Wang (2022). We show that this methodology is applicable to a wide variety of private inference problems, appropriately accounts for biases introduced by privacy mechanisms (such as by clamping), and improves over other state-of-the-art inference methods such as the parametric bootstrap in terms of the coverage and type I error of the private inference. We also develop significant improvements and extensions for the repro sample methodology for general models (not necessarily related to privacy), including 1) modifying the procedure to ensure guaranteed coverage and type I errors, even accounting for Monte Carlo error, and 2) proposing efficient numerical algorithms to implement the confidence intervals and $p$-values.
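
A simplified test-inversion sketch in the spirit of the repro sample approach (illustrative names and constants, not the paper's algorithm): a mean is released through clamping and Laplace noise, and a confidence set is formed by simulating the full mechanism over a grid of candidate means, which automatically accounts for the clamping bias:

```python
# Simulation-based confidence set for a DP-released mean; sigma is
# assumed known for simplicity, and Monte Carlo error is ignored here.
import numpy as np

def dp_mean(x, lo=-1.0, hi=1.0, eps=1.0, rng=None):
    rng = rng if rng is not None else np.random.default_rng()
    clamped = np.clip(x, lo, hi)
    scale = (hi - lo) / (eps * len(x))   # Laplace scale for a clamped mean
    return clamped.mean() + rng.laplace(0.0, scale)

def repro_ci(obs, n, sigma, grid, alpha=0.05, n_sim=500, seed=0):
    rng = np.random.default_rng(seed)
    keep = []
    for theta in grid:
        sims = np.array([dp_mean(rng.normal(theta, sigma, n), rng=rng)
                         for _ in range(n_sim)])
        lo_q, hi_q = np.quantile(sims, [alpha / 2, 1 - alpha / 2])
        if lo_q <= obs <= hi_q:   # simulating the full mechanism absorbs
            keep.append(theta)    # the clamping bias automatically
    return (min(keep), max(keep)) if keep else None

rng = np.random.default_rng(3)
data = rng.normal(0.3, 1.0, 200)
obs = dp_mean(data, rng=rng)
print(repro_ci(obs, n=200, sigma=1.0, grid=np.linspace(-0.3, 0.9, 41)))
```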

Machine learning is employed for solving physical systems governed by general nonlinear partial differential equations (PDEs). However, complex multi-physics systems such as acoustic-structure coupling are often described by a series of PDEs that incorporate variable physical quantities, which are referred to as parametric systems. Strategies are lacking for solving parametric systems governed by PDEs that involve both explicit and implicit quantities. In this paper, a deep learning-based Multi Physics-Informed PointNet (MPIPN) is proposed for solving parametric acoustic-structure systems. First, the MPIPN introduces an enhanced point-cloud architecture that encompasses explicit physical quantities and geometric features of computational domains. Then, the MPIPN extracts local and global features of the reconstructed point-cloud as parts of the solving criteria of parametric systems. In addition, implicit physical quantities are embedded by encoding techniques as another part of the solving criteria. Finally, all solving criteria that characterize the parametric systems are amalgamated to form distinctive sequences as the input of the MPIPN, whose outputs are the solutions of the systems. The proposed framework is trained by adaptive physics-informed loss functions for the corresponding computational domains, and it generalizes to new parametric conditions of the systems. The effectiveness of the MPIPN is validated by applying it to solve steady parametric acoustic-structure coupling systems governed by the Helmholtz equations. An ablation experiment demonstrates the efficacy of the physics-informed component when only a small amount of supervised data is available. The proposed method yields reasonable precision across all computational domains, under both constant parametric conditions and changeable combinations of parametric conditions, for acoustic-structure systems.
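
One ingredient above, the physics-informed loss for the acoustic field, can be sketched as follows (PyTorch; a generic Helmholtz residual term, not the MPIPN architecture itself):

```python
# Physics-informed loss ingredient: the Helmholtz residual
# u_xx + u_yy + k^2 u evaluated by automatic differentiation at
# collocation points and penalized during training.
import torch

def helmholtz_residual(model, xy, k):
    xy = xy.clone().requires_grad_(True)
    u = model(xy)
    grad = torch.autograd.grad(u.sum(), xy, create_graph=True)[0]
    u_xx = torch.autograd.grad(grad[:, 0].sum(), xy, create_graph=True)[0][:, 0]
    u_yy = torch.autograd.grad(grad[:, 1].sum(), xy, create_graph=True)[0][:, 1]
    return u_xx + u_yy + (k ** 2) * u.squeeze(-1)

model = torch.nn.Sequential(
    torch.nn.Linear(2, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 1),
)
pts = torch.rand(256, 2)                      # collocation points in the domain
loss_pde = helmholtz_residual(model, pts, k=2.0).pow(2).mean()
loss_pde.backward()                           # gradients flow to the network weights
```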

Activation Patching is a method of directly computing causal attributions of behavior to model components. However, applying it exhaustively requires a sweep whose cost scales linearly in the number of model components, which can be prohibitively expensive for SoTA Large Language Models (LLMs). We investigate Attribution Patching (AtP), a fast gradient-based approximation to Activation Patching, and find two classes of failure modes of AtP which lead to significant false negatives. We propose a variant of AtP called AtP*, with two changes that address these failure modes while retaining scalability. We present the first systematic study of AtP and alternative methods for faster activation patching, and show that AtP significantly outperforms all other investigated methods, with AtP* providing further significant improvement. Finally, we provide a method to bound the probability of remaining false negatives of AtP* estimates.
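
The core AtP estimate can be sketched in a few lines (our toy reconstruction of the first-order approximation, not the paper's code): the effect of patching an activation $a$ is approximated by $(a_{\text{patch}} - a_{\text{clean}}) \cdot \partial L/\partial a$, so one clean forward+backward pass plus one patched forward pass prices every node at once:

```python
# First-order (attribution patching) estimate of a patching effect
# on a toy model; the activation is cached via a forward hook.
import torch

torch.manual_seed(0)
model = torch.nn.Sequential(torch.nn.Linear(8, 16), torch.nn.ReLU(),
                            torch.nn.Linear(16, 1))
x_clean, x_patch = torch.randn(1, 8), torch.randn(1, 8)

acts = {}
hook = model[0].register_forward_hook(lambda m, i, o: acts.__setitem__("pre", o))

# clean pass: cache the activation and its gradient
out = model(x_clean)
a_clean = acts["pre"]
a_clean.retain_grad()        # keep dL/da for the attribution estimate
out.sum().backward()
grad = a_clean.grad

# patched pass: only the activation value is needed, no gradients
with torch.no_grad():
    model(x_patch)
a_patch = acts["pre"]
hook.remove()

atp_estimate = ((a_patch - a_clean.detach()) * grad).sum()
print(atp_estimate)
```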
