亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

RRAM-based multi-core systems improve the energy efficiency and performance of CNNs. Thereby, the distributed parallel execution of convolutional layers causes critical data dependencies that limit the potential speedup. This paper presents synchronization techniques for parallel inference of convolutional layers on RRAM-based CIM architectures. We propose an architecture optimization that enables efficient data exchange and discuss the impact of different architecture setups on the performance. The corresponding compiler algorithms are optimized for high speedup and low memory consumption during CNN inference. We achieve more than 99% of the theoretical acceleration limit with a marginal data transmission overhead of less than 4% for state-of-the-art CNN benchmarks.

相關內容

signSGD is popular in nonconvex optimization due to its communication efficiency. Yet, existing analyses of signSGD rely on assuming that data are sampled with replacement in each iteration, contradicting the practical implementation where data are randomly reshuffled and sequentially fed into the algorithm. We bridge this gap by proving the first convergence result of signSGD with random reshuffling (SignRR) for nonconvex optimization. Given the dataset size $n$, the number of epochs of data passes $T$, and the variance bound of a stochastic gradient $\sigma^2$, we show that SignRR has the same convergence rate $O(\log(nT)/\sqrt{nT} + \|\sigma\|_1)$ as signSGD \citep{bernstein2018signsgd}. We then present SignRVR and SignRVM, which leverage variance-reduced gradients and momentum updates respectively, both converging at $O(\log(nT)/\sqrt{nT})$. In contrast with the analysis of signSGD, our results do not require an extremely large batch size in each iteration to be of the same order as the total number of iterations \citep{bernstein2018signsgd} or the signs of stochastic and true gradients match element-wise with a minimum probability of 1/2 \citep{safaryan2021stochastic}. We also extend our algorithms to cases where data are distributed across different machines, yielding dist-SignRVR and dist-SignRVM, both converging at $O(\log(n_0T)/\sqrt{n_0T})$, where $n_0$ is the dataset size of a single machine. We back up our theoretical findings through experiments on simulated and real-world problems, verifying that randomly reshuffled sign methods match or surpass existing baselines.

This paper presents a new algorithm for generating random inverse-Wishart matrices that directly generates the Cholesky factor of the matrix without computing the factorization. Whenever parameterized in terms of a precision matrix $\Omega=\Sigma^{-1}$, or its Cholesky factor, instead of a covariance matrix $\Sigma$, the new algorithm is more efficient than the current standard algorithm.

Estimating the structure of directed acyclic graphs (DAGs) from observational data remains a significant challenge in machine learning. Most research in this area concentrates on learning a single DAG for the entire population. This paper considers an alternative setting where the graph structure varies across individuals based on available "contextual" features. We tackle this contextual DAG problem via a neural network that maps the contextual features to a DAG, represented as a weighted adjacency matrix. The neural network is equipped with a novel projection layer that ensures the output matrices are sparse and satisfy a recently developed characterization of acyclicity. We devise a scalable computational framework for learning contextual DAGs and provide a convergence guarantee and an analytical gradient for backpropagating through the projection layer. Our experiments suggest that the new approach can recover the true context-specific graph where existing approaches fail.

Forced alignment systems automatically determine boundaries between segments in speech data, given an orthographic transcription. These tools are commonplace in phonetics to facilitate the use of speech data that would be infeasible to manually transcribe and segment. In the present paper, we describe a new neural network-based forced alignment system, the Mason-Alberta Phonetic Segmenter (MAPS). The MAPS aligner serves as a testbed for two possible improvements we pursue for forced alignment systems. The first is treating the acoustic model in a forced aligner as a tagging task, rather than a classification task, motivated by the common understanding that segments in speech are not truly discrete and commonly overlap. The second is an interpolation technique to allow boundaries more precise than the common 10 ms limit in modern forced alignment systems. We compare configurations of our system to a state-of-the-art system, the Montreal Forced Aligner. The tagging approach did not generally yield improved results over the Montreal Forced Aligner. However, a system with the interpolation technique had a 27.92% increase relative to the Montreal Forced Aligner in the amount of boundaries within 10 ms of the target on the test set. We also reflect on the task and training process for acoustic modeling in forced alignment, highlighting how the output targets for these models do not match phoneticians' conception of similarity between phones and that reconciliation of this tension may require rethinking the task and output targets or how speech itself should be segmented.

In this work, we introduce Regularity Structures B-series which are used for describing solutions of singular stochastic partial differential equations (SPDEs). We define composition and substitutions of these B-series and as in the context of B-series for ordinary differential equations, these operations can rewritten via products and Hopf algebras which have been used for building up renormalised models. These models provide a suitable topology for solving singular SPDEs. This new construction sheds a new light on these products and open interesting perspectives for the study of singular SPDEs in connection with B-series.

We propose a novel approach to numerically approximate McKean-Vlasov stochastic differential equations (MV-SDE) using stochastic gradient descent (SGD) while avoiding the use of interacting particle systems. The technique of SGD is deployed to solve a Euclidean minimization problem, which is obtained by first representing the MV-SDE as a minimization problem over the set of continuous functions of time, and then by approximating the domain with a finite-dimensional subspace. Convergence is established by proving certain intermediate stability and moment estimates of the relevant stochastic processes (including the tangent ones). Numerical experiments illustrate the competitive performance of our SGD based method compared to the IPS benchmarks. This work offers a theoretical foundation for using the SGD method in the context of numerical approximation of MV-SDEs, and provides analytical tools to study its stability and convergence.

In this article, an efficient numerical method for computing finite-horizon controllability Gramians in Cholesky-factored form is proposed. The method is applicable to general dense matrices of moderate size and produces a Cholesky factor of the Gramian without computing the full product. In contrast to other methods applicable to this task, the proposed method is a generalization of the scaling-and-squaring approach for approximating the matrix exponential. It exploits a similar doubling formula for the Gramian, and thereby keeps the required computational effort modest. Most importantly, a rigorous backward error analysis is provided, which guarantees that the approximation is accurate to the round-off error level in double precision. This accuracy is illustrated in practice on a large number of standard test examples. The method has been implemented in the Julia package FiniteHorizonGramians.jl, which is available online under the MIT license. Code for reproducing the experimental results is included in this package, as well as code for determining the optimal method parameters. The analysis can thus easily be adapted to a different finite-precision arithmetic.

We propose a neural network-based meta-learning method to efficiently solve partial differential equation (PDE) problems. The proposed method is designed to meta-learn how to solve a wide variety of PDE problems, and uses the knowledge for solving newly given PDE problems. We encode a PDE problem into a problem representation using neural networks, where governing equations are represented by coefficients of a polynomial function of partial derivatives, and boundary conditions are represented by a set of point-condition pairs. We use the problem representation as an input of a neural network for predicting solutions, which enables us to efficiently predict problem-specific solutions by the forwarding process of the neural network without updating model parameters. To train our model, we minimize the expected error when adapted to a PDE problem based on the physics-informed neural network framework, by which we can evaluate the error even when solutions are unknown. We demonstrate that our proposed method outperforms existing methods in predicting solutions of PDE problems.

The recent proliferation of knowledge graphs (KGs) coupled with incomplete or partial information, in the form of missing relations (links) between entities, has fueled a lot of research on knowledge base completion (also known as relation prediction). Several recent works suggest that convolutional neural network (CNN) based models generate richer and more expressive feature embeddings and hence also perform well on relation prediction. However, we observe that these KG embeddings treat triples independently and thus fail to cover the complex and hidden information that is inherently implicit in the local neighborhood surrounding a triple. To this effect, our paper proposes a novel attention based feature embedding that captures both entity and relation features in any given entity's neighborhood. Additionally, we also encapsulate relation clusters and multihop relations in our model. Our empirical study offers insights into the efficacy of our attention based model and we show marked performance gains in comparison to state of the art methods on all datasets.

Within the rapidly developing Internet of Things (IoT), numerous and diverse physical devices, Edge devices, Cloud infrastructure, and their quality of service requirements (QoS), need to be represented within a unified specification in order to enable rapid IoT application development, monitoring, and dynamic reconfiguration. But heterogeneities among different configuration knowledge representation models pose limitations for acquisition, discovery and curation of configuration knowledge for coordinated IoT applications. This paper proposes a unified data model to represent IoT resource configuration knowledge artifacts. It also proposes IoT-CANE (Context-Aware recommendatioN systEm) to facilitate incremental knowledge acquisition and declarative context driven knowledge recommendation.

北京阿比特科技有限公司