男插曲女视频免费观看_蜜桃AV噜噜一区二区三区麻豆_欧美精品午夜理论片不卡在线播放_美女的隐私秘视频网站图片_点击进入不卡无毒毛片_九九99久久精品国产免费_免费看好看黄的视频中国在线观看

RRAM-based multi-core systems improve the energy efficiency and performance of CNNs. Thereby, the distributed parallel execution of convolutional layers causes critical data dependencies that limit the potential speedup. This paper presents synchronization techniques for parallel inference of convolutional layers on RRAM-based CIM architectures. We propose an architecture optimization that enables efficient data exchange and discuss the impact of different architecture setups on the performance. The corresponding compiler algorithms are optimized for high speedup and low memory consumption during CNN inference. We achieve more than 99% of the theoretical acceleration limit with a marginal data transmission overhead of less than 4% for state-of-the-art CNN benchmarks.

相關內容

層(ceng)

關注 2

移動平均 · Learning · 預測器/決策函數 · Performer · 批量學習 ·

2023 年 12 月 13 日

Mixed moving average field guided learning for spatio-temporal data

Imma Valentina Curato,Orkun Furat,Lorenzo Proietti,Bennet Stroeh

Influenced mixed moving average fields are a versatile modeling class for spatio-temporal data. However, their predictive distribution is not generally known. Under this modeling assumption, we define a novel spatio-temporal embedding and a theory-guided machine learning approach that employs a generalized Bayesian algorithm to make ensemble forecasts. We employ Lipschitz predictors and determine fixed-time and any-time PAC Bayesian bounds in the batch learning setting. Performing causal forecast is a highlight of our methodology as its potential application to data with spatial and temporal short and long-range dependence. We then test the performance of our learning methodology by using linear predictors and data sets simulated from a spatio-temporal Ornstein-Uhlenbeck process.

估計/估計量 · Weight · motivation · Performer · Analysis ·

2023 年 12 月 12 日

Proving the stability estimates of variational least-squares Kernel-Based methods

Meng Chen,Leevan Ling,Dongfang Yun

Motivated by the need for the rigorous analysis of the numerical stability of variational least-squares kernel-based methods for solving second-order elliptic partial differential equations, we provide previously lacking stability inequalities. This fills a significant theoretical gap in the previous work [Comput. Math. Appl. 103 (2021) 1-11], which provided error estimates based on a conjecture on the stability. With the stability estimate now rigorously proven, we complete the theoretical foundations and compare the convergence behavior to the proven rates. Furthermore, we establish another stability inequality involving weighted-discrete norms, and provide a theoretical proof demonstrating that the exact quadrature weights are not necessary for the weighted least-squares kernel-based collocation method to converge. Our novel theoretical insights are validated by numerical examples, which showcase the relative efficiency and accuracy of these methods on data sets with large mesh ratios. The results confirm our theoretical predictions regarding the performance of variational least-squares kernel-based method, least-squares kernel-based collocation method, and our new weighted least-squares kernel-based collocation method. Most importantly, our results demonstrate that all methods converge at the same rate, validating the convergence theory of weighted least-squares in our proven theories.

Learning · 通用動力公司 · Performer · MoDELS · 泛化理論 ·

2023 年 12 月 12 日

Rethinking Gauss-Newton for learning over-parameterized models

Michael Arbel,Romain Menegaux,Pierre Wolinski

This work studies the global convergence and implicit bias of Gauss Newton's (GN) when optimizing over-parameterized one-hidden layer networks in the mean-field regime. We first establish a global convergence result for GN in the continuous-time limit exhibiting a faster convergence rate compared to GD due to improved conditioning. We then perform an empirical study on a synthetic regression task to investigate the implicit bias of GN's method. While GN is consistently faster than GD in finding a global optimum, the learned model generalizes well on test data when starting from random initial weights with a small variance and using a small step size to slow down convergence. Specifically, our study shows that such a setting results in a hidden learning phenomenon, where the dynamics are able to recover features with good generalization properties despite the model having sub-optimal training and test performances due to an under-optimized linear layer. This study exhibits a trade-off between the convergence speed of GN and the generalization ability of the learned solution.

層 · 近似 · 線性的 · 縮放 · Networking ·

2023 年 12 月 11 日

DYAD: A Descriptive Yet Abjuring Density efficient approximation to linear neural network layers

Sarin Chandy,Varun Gangal,Yi Yang,Gabriel Maggiotti

from arxiv, Accepted at WANT workshop at NeurIPS 2023; code at //github.com/asappresearch/dyad

We devise, implement and performance-asses DYAD, a layer which can serve as a faster and more memory-efficient approximate replacement for linear layers, (nn.Linear() in Pytorch). These layers appear in common subcomponents, such as in the ff module of Transformers. DYAD is based on a bespoke near-sparse matrix structure which approximates the dense "weight" matrix W that matrix-multiplies the input in the typical realization of such a layer, a.k.a DENSE. Our alternative near-sparse matrix structure is decomposable to a sum of 2 matrices permutable to a block-sparse counterpart. These can be represented as 3D tensors, which in unison allow a faster execution of matrix multiplication with the mini-batched input matrix X compared to DENSE (O(rows(W ) x cols(W )) --> O( rows(W ) x cols(W ) # of blocks )). As the crux of our experiments, we pretrain both DYAD and DENSE variants of 2 sizes of the OPT arch and 1 size of the Pythia arch, including at different token scales of the babyLM benchmark. We find DYAD to be competitive (>= 90%) of DENSE performance on zero-shot (e.g. BLIMP), few-shot (OPENLM) and finetuning (GLUE) benchmarks, while being >=7-15% faster to train on-GPU even at 125m scale, besides surfacing larger speedups at increasing scale and model width.

Unstructured · Extensibility · Performer · Integration · SimPLe ·

2023 年 12 月 9 日

Rapid evaluation of Newtonian potentials on planar domains

Zewen Shen,Kirill Serkh

from arxiv, 25 pages, 5 tables, 11 figures. Accepted by SIAM J. Sci. Comput

The accurate and efficient evaluation of Newtonian potentials over general 2-D domains is important for the numerical solution of Poisson's equation and volume integral equations. In this paper, we present a simple and efficient high-order algorithm for computing the Newtonian potential over a planar domain discretized by an unstructured mesh. The algorithm is based on the use of Green's third identity for transforming the Newtonian potential into a collection of layer potentials over the boundaries of the mesh elements, which can be easily evaluated by the Helsing-Ojala method. One important component of our algorithm is the use of high-order (up to order 20) bivariate polynomial interpolation in the monomial basis, for which we provide extensive justification. The performance of our algorithm is illustrated through several numerical experiments.

DeepFakes · MoDELS · TOOLS · XAI · 圖片分類 ·

2023 年 12 月 8 日

An adversarial attack approach for eXplainable AI evaluation on deepfake detection models

Balachandar Gowrisankar,Vrizlynn L. L. Thing

With the rising concern on model interpretability, the application of eXplainable AI (XAI) tools on deepfake detection models has been a topic of interest recently. In image classification tasks, XAI tools highlight pixels influencing the decision given by a model. This helps in troubleshooting the model and determining areas that may require further tuning of parameters. With a wide range of tools available in the market, choosing the right tool for a model becomes necessary as each one may highlight different sets of pixels for a given image. There is a need to evaluate different tools and decide the best performing ones among them. Generic XAI evaluation methods like insertion or removal of salient pixels/segments are applicable for general image classification tasks but may produce less meaningful results when applied on deepfake detection models due to their functionality. In this paper, we perform experiments to show that generic removal/insertion XAI evaluation methods are not suitable for deepfake detection models. We also propose and implement an XAI evaluation approach specifically suited for deepfake detection models.

Projection · Weight · CASES · BASIC ·

2023 年 12 月 8 日

A recursive construction for projective Reed-Muller codes

Rodrigo San-José

We give a recursive construction for projective Reed-Muller codes in terms of affine Reed-Muller codes and projective Reed-Muller codes in fewer variables. From this construction, we obtain the dimension of the subfield subcodes of projective Reed-Muller codes for some particular degrees that give codes with good parameters. Moreover, from this recursive construction we are able to derive a lower bound for the generalized Hamming weights of projective Reed-Muller codes which, together with the basic properties of the generalized Hamming weights, allows us to determine most of the weight hierarchy of projective Reed-Muller codes in many cases.

共線性 · 數據集增強 · CASES · 判別器 · MoDELS ·

2023 年 12 月 8 日

Collinear datasets augmentation using Procrustes validation sets

Sergey Kucheryavskiy,Sergei Zhilin

from arxiv, 21 pages, 7 figures

In this paper, we propose a new method for the augmentation of numeric and mixed datasets. The method generates additional data points by utilizing cross-validation resampling and latent variable modeling. It is particularly efficient for datasets with moderate to high degrees of collinearity, as it directly utilizes this property for generation. The method is simple, fast, and has very few parameters, which, as shown in the paper, do not require specific tuning. It has been tested on several real datasets; here, we report detailed results for two cases, prediction of protein in minced meat based on near infrared spectra (fully numeric data with high degree of collinearity) and discrimination of patients referred for coronary angiography (mixed data, with both numeric and categorical variables, and moderate collinearity). In both cases, artificial neural networks were employed for developing the regression and the discrimination models. The results show a clear improvement in the performance of the models; thus for the prediction of meat protein, fitting the model to the augmented data resulted in a reduction in the root mean squared error computed for the independent test set by 1.5 to 3 times.

線性的 · 分解的 · SimPLe · 樣例 · 泛函 ·

2023 年 12 月 8 日

On the convergence of restarted Anderson acceleration for symmetric linear systems

Hans De Sterck,Oliver A. Krzysik,Adam Smith

Anderson acceleration (AA) is a technique for accelerating the convergence of an underlying fixed-point iteration. AA is widely used within computational science, with applications ranging from electronic structure calculation to the training of neural networks. Despite AA's widespread use, relatively little is understood about it theoretically. An important and unanswered question in this context is: To what extent can AA actually accelerate convergence of the underlying fixed-point iteration? While simple enough to state, this question appears rather difficult to answer. For example, it is unanswered even in the simplest (non-trivial) case where the underlying fixed-point iteration consists of applying a two-dimensional affine function. In this note we consider a restarted variant of AA applied to solve symmetric linear systems with restart window of size one. Several results are derived from the analytical solution of a nonlinear eigenvalue problem characterizing residual propagation of the AA iteration. This includes a complete characterization of the method to solve $2 \times 2$ linear systems, rigorously quantifying how the asymptotic convergence factor depends on the initial iterate, and quantifying by how much AA accelerates the underlying fixed-point iteration. We also prove that even if the underlying fixed-point iteration diverges, the associated AA iteration may still converge.

entity · Performer · 圖 · 知識圖譜 · MoDELS ·

2019 年 6 月 4 日

Learning Attention-based Embeddings for Relation Prediction in Knowledge Graphs

Deepak Nathani,Jatin Chauhan,Charu Sharma,Manohar Kaul

from arxiv, accepted as long paper in ACL 2019

The recent proliferation of knowledge graphs (KGs) coupled with incomplete or partial information, in the form of missing relations (links) between entities, has fueled a lot of research on knowledge base completion (also known as relation prediction). Several recent works suggest that convolutional neural network (CNN) based models generate richer and more expressive feature embeddings and hence also perform well on relation prediction. However, we observe that these KG embeddings treat triples independently and thus fail to cover the complex and hidden information that is inherently implicit in the local neighborhood surrounding a triple. To this effect, our paper proposes a novel attention based feature embedding that captures both entity and relation features in any given entity's neighborhood. Additionally, we also encapsulate relation clusters and multihop relations in our model. Our empirical study offers insights into the efficacy of our attention based model and we show marked performance gains in comparison to state of the art methods on all datasets.