Supercomputers have revolutionized how industries and scientific fields process large amounts of data. These machines group hundreds or thousands of computing nodes that work together to execute time-consuming programs requiring large amounts of computational resources. Over the years, supercomputers have expanded to include new and different technologies, which makes them heterogeneous. However, executing a program in a heterogeneous environment requires attention to a specific source of performance degradation: load imbalance. In this research, we address the challenges associated with load imbalance when scheduling many homogeneous tasks in a heterogeneous environment. To address this issue, we introduce the concept of adaptive asynchronous work-stealing. This approach collects information about the nodes and uses it to improve aspects of work-stealing such as victim selection and task offloading. Additionally, the proposed approach eliminates the need for extra threads to communicate information, thereby reducing overhead when implementing a fully asynchronous approach. Our experimental results demonstrate a performance improvement of approximately 10.1% compared to other conventional and state-of-the-art implementations.
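A minimal sketch of the load-imbalance problem and of information-driven victim selection. It assumes a synchronous, time-stepped toy model, which is not the asynchronous implementation described in the abstract. The function name `simulate`, the integer `speeds`, and the steal-half policy are hypothetical choices made for illustration.

```python
from collections import deque


def simulate(speeds, tasks_per_node, steal=True):
    """Return the number of time steps needed to finish all tasks.

    speeds[i] is the (assumed, integer) number of identical tasks that
    node i completes per time step. Every node starts with the same
    number of tasks, so faster nodes run dry first.
    """
    queues = [deque(range(tasks_per_node)) for _ in speeds]
    steps = 0
    while any(queues):
        steps += 1
        for i, speed in enumerate(speeds):
            for _ in range(speed):
                if not queues[i] and steal:
                    # Victim selection using node information: choose the
                    # node whose remaining work would take the longest.
                    victim = max(
                        range(len(speeds)),
                        key=lambda j: len(queues[j]) / speeds[j],
                    )
                    # Task offloading: take half of the victim's queue.
                    for _ in range(len(queues[victim]) // 2):
                        queues[i].append(queues[victim].pop())
                if queues[i]:
                    queues[i].popleft()  # execute one task
    return steps


if __name__ == "__main__":
    # One slow node and one node that is four times faster.
    print("no stealing:  ", simulate([1, 4], 20, steal=False))
    print("with stealing:", simulate([1, 4], 20, steal=True))
```

Without stealing, the slow node dictates the finishing time. With stealing, the fast node keeps taking work from the node with the longest projected remaining time, which is the kind of decision that collected node information is meant to support.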

Related content

University · Unification · CASES · Transformation · Lecture notes

March 5, 2024

Sharing proofs with predicative theories through universe polymorphic elaboration

Thiago Felicissimo, Frédéric Blanqui

from arxiv, Journal version of //doi.org/10.4230/LIPIcs.CSL.2023.19 to be submitted to LMCS, also supersedes arXiv:2211.05700

As the development of formal proofs is a time-consuming task, it is important to devise ways of sharing the already written proofs to prevent wasting time redoing them. One of the challenges in this domain is to translate proofs written in proof assistants based on impredicative logics to proof assistants based on predicative logics, whenever impredicativity is not used in an essential way. In this paper we present a transformation for sharing proofs with a core predicative system supporting prenex universe polymorphism (like in Agda). It consists in trying to elaborate each term into a predicative universe polymorphic term as general as possible. The use of universe polymorphism is justified by the fact that mapping each universe to a fixed one in the target theory is not sufficient in most cases. During the elaboration, we need to solve unification problems in the equational theory of universe levels. In order to do this, we give a complete characterization of when a single equation admits a most general unifier. This characterization is then employed in a partial algorithm which uses a constraint-postponement strategy for trying to solve unification problems. The proposed translation is of course partial, but in practice allows one to translate many proofs that do not use impredicativity in an essential way. Indeed, it was implemented in the tool Predicativize and then used to translate semi-automatically many non-trivial developments from Matita's library to Agda, including proofs of Bertrand's Postulate and Fermat's Little Theorem, which (as far as we know) were not available in Agda yet.

Discretization · Operation · Loss · Loss function (machine learning) · Performer

March 5, 2024

Accelerating the convergence of Newton's method for nonlinear elliptic PDEs using Fourier neural operators

Joubine Aghili, Emmanuel Franck, Romain Hild, Victor Michel-Dansac, Vincent Vigon

It is well known that Newton's method, especially when applied to large problems such as the discretization of nonlinear partial differential equations (PDEs), can have trouble converging if the initial guess is too far from the solution. This work focuses on accelerating this convergence, in the context of the discretization of nonlinear elliptic PDEs. We first provide a quick review of existing methods, and justify our choice of learning an initial guess with a Fourier neural operator (FNO). This choice was motivated by the mesh-independence of such operators, whose training and evaluation can be performed on grids with different resolutions. The FNO is trained by minimizing, over generated data, loss functions based on the PDE discretization. Numerical results, in one and two dimensions, show that the proposed initial guess accelerates the convergence of Newton's method by a large margin compared to a naive initial guess, especially for highly nonlinear or anisotropic problems.
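The sensitivity of Newton's method to the initial guess can be seen even on a scalar equation. The sketch below is an illustration under stated assumptions, not the paper's PDE setting: it solves arctan(x) = 0, a standard textbook example in which the iteration converges from a nearby guess and diverges from a distant one. The names `newton`, `f`, and `df` and the divergence threshold are hypothetical.

```python
import math


def newton(f, df, x0, tol=1e-10, max_iter=50, blowup=1e8):
    """Plain Newton iteration for f(x) = 0.

    Returns (x, iterations) on convergence, or (x, None) if the iterates
    leave the interval [-blowup, blowup] or max_iter is exhausted.
    """
    x = x0
    for k in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            return x, k
        x = x - fx / df(x)
        if not math.isfinite(x) or abs(x) > blowup:
            return x, None  # treated as divergence
    return x, None


def f(x):
    return math.atan(x)


def df(x):
    return 1.0 / (1.0 + x * x)


if __name__ == "__main__":
    for guess in (0.5, 2.0):
        root, iterations = newton(f, df, guess)
        status = "diverged" if iterations is None else f"converged in {iterations} iterations"
        print(f"x0 = {guess}: {status}")
```

A learned initial guess plays the role of the closer starting point here: it does not change the solver, only where the iteration begins.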

Weight · Analysis · Linear · Score · MoDELS

March 4, 2024

Covariate adjustment in randomized experiments with missing outcomes and covariates

Anqi Zhao, Peng Ding, Fan Li

Covariate adjustment can improve precision in analyzing randomized experiments. With fully observed data, regression adjustment and propensity score weighting are asymptotically equivalent in improving efficiency over unadjusted analysis. When some outcomes are missing, we consider combining these two adjustment methods with inverse probability of observation weighting for handling missing outcomes, and show that the equivalence between the two methods breaks down. Regression adjustment no longer ensures efficiency gain over unadjusted analysis unless the true outcome model is linear in covariates or the outcomes are missing completely at random. Propensity score weighting, in contrast, still guarantees efficiency over unadjusted analysis, and including more covariates in adjustment never harms asymptotic efficiency. Moreover, we establish the value of using partially observed covariates to secure additional efficiency by the missingness indicator method, which imputes all missing covariates by zero and uses the union of the completed covariates and corresponding missingness indicators as the new, fully observed covariates. Based on these findings, we recommend using regression adjustment in combination with the missingness indicator method if the linear outcome model or missing complete at random assumption is plausible and using propensity score weighting with the missingness indicator method otherwise.
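The missingness indicator method described above can be sketched in a few lines. This is a minimal illustration, assuming covariates are stored as a list of rows with `None` marking a missing value. The function name `missingness_indicator` is hypothetical, and dropping indicators for fully observed covariates is a choice made here to avoid all-zero columns, not something stated in the abstract.

```python
def missingness_indicator(rows):
    """Impute missing covariates by zero and append missingness indicators.

    rows: list of equal-length lists; None marks a missing covariate.
    Returns new rows: completed covariates followed by one 0/1 indicator
    per covariate that is missing in at least one row.
    """
    n_cov = len(rows[0])
    # Assumption: only keep indicators for covariates that are ever missing.
    ever_missing = [any(row[j] is None for row in rows) for j in range(n_cov)]
    augmented = []
    for row in rows:
        completed = [0.0 if value is None else float(value) for value in row]
        indicators = [
            1.0 if row[j] is None else 0.0
            for j in range(n_cov)
            if ever_missing[j]
        ]
        augmented.append(completed + indicators)
    return augmented


if __name__ == "__main__":
    covariates = [[1.0, None], [2.0, 3.0], [None, 4.0]]
    for row in missingness_indicator(covariates):
        print(row)
```

The augmented rows are fully observed, so they can be passed to either regression adjustment or propensity score weighting as ordinary covariates.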

MoDELS · Continuity · Performer · Discretization · Model evaluation

March 4, 2024

Speech emotion recognition from voice messages recorded in the wild

Lucía Gómez-Zaragozá, Óscar Valls, Rocío del Amor, María José Castro-Bleda, Valery Naranjo, Mariano Alcañiz Raya, Javier Marín-Morales

from arxiv, This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

Emotion datasets used for Speech Emotion Recognition (SER) often contain acted or elicited speech, limiting their applicability in real-world scenarios. In this work, we used the Emotional Voice Messages (EMOVOME) database, including spontaneous voice messages from conversations of 100 Spanish speakers on a messaging app, labeled in continuous and discrete emotions by expert and non-expert annotators. We created speaker-independent SER models using the eGeMAPS features, transformer-based models and their combination. We compared the results with reference databases and analyzed the influence of annotators and gender fairness. The pre-trained Unispeech-L model and its combination with eGeMAPS achieved the highest results, with 61.64% and 55.57% Unweighted Accuracy (UA) for 3-class valence and arousal prediction respectively, a 10% improvement over baseline models. For the emotion categories, 42.58% UA was obtained. EMOVOME performed lower than the acted RAVDESS database. The elicited IEMOCAP database also outperformed EMOVOME in the prediction of emotion categories, while similar results were obtained in valence and arousal. Additionally, EMOVOME outcomes varied with annotator labels, showing superior results and better fairness when combining expert and non-expert annotations. This study significantly contributes to the evaluation of SER models in real-life situations, advancing in the development of applications for analyzing spontaneous voice messages.
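Unweighted Accuracy is commonly computed in speech emotion recognition as the mean of the per-class recalls, so that rare classes count as much as frequent ones. The sketch below assumes that standard definition; the abstract does not spell out the formula, and the function name `unweighted_accuracy` and the labels are hypothetical.

```python
from collections import defaultdict


def unweighted_accuracy(y_true, y_pred):
    """Mean of per-class recall (each class weighted equally)."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for truth, prediction in zip(y_true, y_pred):
        total[truth] += 1
        if truth == prediction:
            correct[truth] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


if __name__ == "__main__":
    # Imbalanced toy data: a classifier that always predicts "negative".
    y_true = ["negative"] * 8 + ["positive"] * 2
    y_pred = ["negative"] * 10
    plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    print("plain accuracy:     ", plain)
    print("unweighted accuracy:", unweighted_accuracy(y_true, y_pred))
```

On imbalanced data the two measures diverge, which is why a 3-class UA figure should be read against a chance level of one third rather than against the majority-class rate.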

CASE · Power method · Multiset · Seven · Extensibility

March 4, 2024

Counting occurrences of patterns in permutations

Andrew R Conway, Anthony J Guttmann

from arXiv, 32 pages. Updated references from previous version. Removal of earlier discussion of Stieltjes sequences, which was incomplete and confusing

We develop a new, powerful method for counting elements in a multiset. As a first application, we use this algorithm to study the number of occurrences of patterns in a permutation. For patterns of length 3 there are two Wilf classes, and the general behaviour of these is reasonably well-known. We slightly extend some of the known results in that case, and exhaustively study the case of patterns of length 4, about which there is little previous knowledge. For such patterns, there are seven Wilf classes, and based on extensive enumerations and careful series analysis, we have conjectured the asymptotic behaviour for all classes.
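To make the counted quantity concrete: an occurrence of a pattern in a permutation is a subsequence whose entries have the same relative order as the pattern. The brute-force sketch below only illustrates that definition; it is not the counting method developed in the paper and is practical only for short permutations. The function names are hypothetical.

```python
from itertools import combinations


def standardize(sequence):
    """Replace each entry by its rank, so [5, 9, 2] becomes (1, 2, 0)."""
    rank = {value: index for index, value in enumerate(sorted(sequence))}
    return tuple(rank[value] for value in sequence)


def count_occurrences(permutation, pattern):
    """Count subsequences of `permutation` that are order-isomorphic to `pattern`."""
    target = standardize(pattern)
    # combinations() preserves positional order, so each tuple is a subsequence.
    return sum(
        1
        for subsequence in combinations(permutation, len(pattern))
        if standardize(subsequence) == target
    )


if __name__ == "__main__":
    # Occurrences of the pattern 21 are exactly the inversions.
    print(count_occurrences([3, 1, 4, 2], [2, 1]))
```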

Convolution · Layer · Performer · Directed · Performance

March 2, 2024

Performance evaluation of acceleration of convolutional layers on OpenEdgeCGRA

Nicolò Carpentieri, Juan Sapriza, Davide Schiavone, Daniele Jahier Pagliari, David Atienza, Maurizio Martina, Alessio Burrello

Recently, efficiently deploying deep learning solutions on the edge has received increasing attention. New platforms are emerging to support the increasing demand for flexibility and high performance. In this work, we explore the efficient mapping of convolutional layers on an open-hardware, low-power Coarse-Grain Reconfigurable Array (CGRA), namely OpenEdgeCGRA. We explore both direct implementations of convolution and solutions that transform it into a matrix multiplication through an Im2col transformation, and experiment with various tensor parallelism axes. We show that for this hardware target, direct convolution, coupled with weight parallelism reaches the best latency and energy efficiency, outperforming a CPU implementation by 3.4x and 9.9x in terms of energy and latency, respectively.
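The two convolution strategies compared above can be sketched on a single-channel image. This is a pure-Python illustration under simplifying assumptions (one channel, no padding, stride 1, cross-correlation as used in deep learning); it says nothing about how the layers are mapped onto the CGRA. All function names are hypothetical.

```python
def conv2d_direct(image, kernel):
    """Direct convolution: slide the kernel and accumulate in place."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b] for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def im2col(image, kh, kw):
    """Unroll every kh x kw patch into one row of a matrix."""
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [image[i + a][j + b] for a in range(kh) for b in range(kw)]
        for i in range(out_h)
        for j in range(out_w)
    ]


def conv2d_im2col(image, kernel):
    """Convolution as a matrix-vector product over the im2col matrix."""
    kh, kw = len(kernel), len(kernel[0])
    flat_kernel = [kernel[a][b] for a in range(kh) for b in range(kw)]
    patches = im2col(image, kh, kw)
    values = [sum(x * w for x, w in zip(row, flat_kernel)) for row in patches]
    out_w = len(image[0]) - kw + 1
    return [values[r:r + out_w] for r in range(0, len(values), out_w)]


if __name__ == "__main__":
    image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    kernel = [[1, 0], [0, -1]]
    print(conv2d_direct(image, kernel))
    print(conv2d_im2col(image, kernel))
```

Both functions return the same result. The im2col variant trades extra memory (each pixel is copied into several rows) for a regular matrix product, which is one reason the best choice depends on the hardware target.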

Loss function (machine learning) · Precision/accuracy · Functional · Machine Learning · Loss

March 2, 2024

MPIPN: A Multi Physics-Informed PointNet for solving parametric acoustic-structure systems

Chu Wang, Jinhong Wu, Yanzhi Wang, Zhijian Zha, Qi Zhou

from arxiv, The number of figures is 16. The number of tables is 5. The number of words is 9717

Machine learning is employed for solving physical systems governed by general nonlinear partial differential equations (PDEs). However, complex multi-physics systems such as acoustic-structure coupling are often described by a series of PDEs that incorporate variable physical quantities, which are referred to as parametric systems. There is a lack of strategies for solving parametric systems governed by PDEs that involve explicit and implicit quantities. In this paper, a deep learning-based Multi Physics-Informed PointNet (MPIPN) is proposed for solving parametric acoustic-structure systems. First, the MPIPN induces an enhanced point-cloud architecture that encompasses explicit physical quantities and geometric features of computational domains. Then, the MPIPN extracts local and global features of the reconstructed point-cloud as parts of solving criteria of parametric systems, respectively. Besides, implicit physical quantities are embedded by encoding techniques as another part of solving criteria. Finally, all solving criteria that characterize parametric systems are amalgamated to form distinctive sequences as the input of the MPIPN, whose outputs are solutions of systems. The proposed framework is trained by adaptive physics-informed loss functions for corresponding computational domains. The framework is generalized to deal with new parametric conditions of systems. The effectiveness of the MPIPN is validated by applying it to solve steady parametric acoustic-structure coupling systems governed by the Helmholtz equations. An ablation experiment has been implemented to demonstrate the efficacy of physics-informed impact with a minority of supervised data. The proposed method yields reasonable precision across all computational domains under constant parametric conditions and changeable combinations of parametric conditions for acoustic-structure systems.

False negative · Large language model · MoDELS · Language modeling · FAST

March 1, 2024

AtP*: An efficient and scalable method for localizing LLM behaviour to components

János Kramár, Tom Lieberum, Rohin Shah, Neel Nanda

Activation Patching is a method of directly computing causal attributions of behavior to model components. However, applying it exhaustively requires a sweep with cost scaling linearly in the number of model components, which can be prohibitively expensive for SoTA Large Language Models (LLMs). We investigate Attribution Patching (AtP), a fast gradient-based approximation to Activation Patching and find two classes of failure modes of AtP which lead to significant false negatives. We propose a variant of AtP called AtP*, with two changes to address these failure modes while retaining scalability. We present the first systematic study of AtP and alternative methods for faster activation patching and show that AtP significantly outperforms all other investigated methods, with AtP* providing further significant improvement. Finally, we provide a method to bound the probability of remaining false negatives of AtP* estimates.
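The difference between exhaustive activation patching and a gradient-based approximation can be sketched on a toy model. The code below is an illustration under stated assumptions, not the paper's method or code: the "model" is a single tanh over a weighted sum of component activations, and the approximation is taken to be the first-order estimate (noise activation minus clean activation) times the gradient at the clean input. All names are hypothetical.

```python
import math


def model(activations, weights):
    """Toy scalar output of a 'network' whose components are the activations."""
    return math.tanh(sum(w * a for w, a in zip(weights, activations)))


def activation_patching(clean, noise, weights):
    """Exact effect of patching each component: one forward pass per component."""
    baseline = model(clean, weights)
    effects = []
    for i in range(len(clean)):
        patched = list(clean)
        patched[i] = noise[i]
        effects.append(model(patched, weights) - baseline)
    return effects


def attribution_patching(clean, noise, weights):
    """First-order estimate of the same effects from a single gradient."""
    pre_activation = sum(w * a for w, a in zip(weights, clean))
    slope = 1.0 - math.tanh(pre_activation) ** 2  # derivative of tanh at the clean input
    return [(noise[i] - clean[i]) * weights[i] * slope for i in range(len(clean))]


if __name__ == "__main__":
    weights = [1.0, 1.0]
    noise = [0.0, 0.0]
    for clean in ([0.1, 0.0], [5.0, 0.0]):
        exact = activation_patching(clean, noise, weights)
        approx = attribution_patching(clean, noise, weights)
        print(f"clean={clean}: exact={exact[0]:.4f}, first-order={approx[0]:.4f}")
```

For the small perturbation the two estimates agree closely. For the saturated input the gradient is nearly zero, so the first-order estimate reports almost no effect even though the exact effect is large: a false negative of the general kind that linearization can produce. The specific failure modes analyzed in the paper are not reproduced here.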

Predictor/decision function · Identifiable · Reviewer · Extensibility · Regularization term

March 1, 2024

Multivariate Bayesian variable selection with application to multi-trait genetic fine mapping

Travis Canida, Hongjie Ke, Shuo Chen, Zhenayo Ye, Tianzhou Ma

from arxiv, 46 pages, 4 figures

Variable selection has played a critical role in modern statistical learning and scientific discoveries. Numerous regularization and Bayesian variable selection methods have been developed in the past two decades, but most of these methods consider selecting variables for only one response. As more data is being collected nowadays, it is common to analyze multiple related responses from the same study. Existing multivariate variable selection methods select variables for all responses without considering the possible heterogeneity across different responses, i.e. some features may only predict a subset of responses but not the rest. Motivated by the multi-trait fine mapping problem in genetics to identify the causal variants for multiple related traits, we developed a novel multivariate Bayesian variable selection method to select critical predictors from a large number of grouped predictors that target multiple correlated and possibly heterogeneous responses. Our new method features selection at multiple levels, incorporation of prior biological knowledge to guide selection, and identification of the best subset of responses that the predictors target. We showed the advantage of our method via extensive simulations and a real fine mapping example to identify causal variants associated with different subsets of addictive behaviors.

Optimizer · INTERACT · Networking · Knowledge · Performer

May 11, 2022

Dynamic neighbourhood optimisation for task allocation using multi-agent

Niall Creech, Natalia Criado Pacheco, Simon Miles

from arxiv, 28 pages

In large-scale systems there are fundamental challenges when centralised techniques are used for task allocation. The number of interactions is limited by resource constraints such as on computation, storage, and network communication. We can increase scalability by implementing the system as a distributed task-allocation system, sharing tasks across many agents. However, this also increases the resource cost of communications and synchronisation, and is difficult to scale. In this paper we present four algorithms to solve these problems. The combination of these algorithms enables each agent to improve its task allocation strategy through reinforcement learning, while changing how much it explores the system in response to how optimal it believes its current strategy is, given its past experience. We focus on distributed agent systems where the agents' behaviours are constrained by resource usage limits, limiting agents to local rather than system-wide knowledge. We evaluate these algorithms in a simulated environment where agents are given a task composed of multiple subtasks that must be allocated to other agents with differing capabilities, to then carry out those tasks. We also simulate real-life system effects such as networking instability. Our solution is shown to solve the task allocation problem to within 6.7% of the theoretical optimum for the system configurations considered. It provides 5x better performance recovery over no-knowledge retention approaches when system connectivity is impacted, and is tested against systems of up to 100 agents with less than a 9% impact on the algorithms' performance.