Online speech recognition, where the model only accesses context to the left, is an important and challenging use case for ASR systems. In this work, we investigate augmenting neural encoders for online ASR by incorporating structured state-space sequence models (S4), a family of models that provide a parameter-efficient way of accessing arbitrarily long left context. We performed systematic ablation studies to compare variants of S4 models and propose two novel approaches that combine them with convolutions. We found that the most effective design is to stack a small S4 using real-valued recurrent weights with a local convolution, allowing them to work complementarily. Our best model achieves WERs of 4.01%/8.53% on test sets from Librispeech, outperforming Conformers with extensively tuned convolution.
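To make the "left context only" idea concrete, here is a minimal sketch of a causal diagonal linear recurrence with real-valued weights, followed by a causal local convolution. It is not the paper's S4 parameterization: the function names, the order of the two layers, and the plain diagonal recurrence are assumptions made for illustration only.

```python
from typing import List


def diagonal_ssm(u: List[float], a: List[float], b: List[float], c: List[float]) -> List[float]:
    """Causal recurrence x_t = a * x_{t-1} + b * u_t (elementwise), y_t = c . x_t.

    a, b, c are real-valued vectors of equal length (hypothetical parameters).
    """
    state = [0.0] * len(a)
    out = []
    for u_t in u:
        # Each state channel decays by a_i and accumulates the new input.
        state = [a_i * s_i + b_i * u_t for a_i, s_i, b_i in zip(a, state, b)]
        out.append(sum(c_i * s_i for c_i, s_i in zip(c, state)))
    return out


def causal_conv(u: List[float], kernel: List[float]) -> List[float]:
    """Local convolution y_t = sum_k kernel[k] * u_{t-k}; only past inputs are read."""
    out = []
    for t in range(len(u)):
        acc = 0.0
        for k, w in enumerate(kernel):
            if t - k >= 0:
                acc += w * u[t - k]
        out.append(acc)
    return out


def stacked_block(u, a, b, c, kernel):
    """Assumed stacking order: recurrence first, then the local convolution."""
    return causal_conv(diagonal_ssm(u, a, b, c), kernel)
```

The recurrence carries information from arbitrarily far in the past through a fixed-size state, while the convolution mixes only a short window, which is one way to read the claim that the two components are complementary.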

Related content

Functional · Approximation · Scenario · Discretization · TEAM

February 15, 2024

Sketching stochastic valuation functions

Milan Vojnović, Yiliu Wang

We consider the problem of sketching a set valuation function, which is defined as the expectation of a valuation function of independent random item values. We show that for monotone subadditive or submodular valuation functions satisfying a weak homogeneity condition, or certain other conditions, there exist discretized distributions of item values with $O(k\log(k))$ support sizes that yield a sketch valuation function which is a constant-factor approximation, for any value query for a set of items of cardinality less than or equal to $k$. The discretized distributions can be efficiently computed by an algorithm for each item's value distribution separately. Our results hold under conditions that accommodate a wide range of valuation functions arising in applications, such as the value of a team corresponding to the best performance of a team member, constant elasticity of substitution production functions exhibiting diminishing returns used in economics and consumer theory, and others. Sketch valuation functions are particularly valuable for finding approximate solutions to optimization problems such as best set selection and welfare maximization. They enable computationally efficient evaluation of approximate value oracle queries and provide an approximation guarantee for the underlying optimization problem.
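As an illustration of the "value of a team" example, the sketch below evaluates the expected maximum of independent item values given as discrete distributions. It does not implement the paper's discretization algorithm; the function name and the dictionary representation (value mapped to probability) are assumptions chosen for the example.

```python
from typing import Dict, List


def expected_max(dists: List[Dict[float, float]]) -> float:
    """Exact E[max_i X_i] for independent discrete X_i, each given as {value: probability}."""
    support = sorted({v for d in dists for v in d})
    total = 0.0
    prev_cdf = 0.0
    for v in support:
        # P(max <= v) is the product of the individual CDFs at v, by independence.
        cdf = 1.0
        for d in dists:
            cdf *= sum(p for val, p in d.items() if val <= v)
        # P(max == v) is the jump of that CDF at v.
        total += v * (cdf - prev_cdf)
        prev_cdf = cdf
    return total
```

The cost of one such query grows with the total support size, which is why replacing each item's distribution by one with a small support can make value queries cheaper.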

MoDELS · Scenario · Performer · Lecture notes · Correlation coefficient

February 14, 2024

Long-form evaluation of model editing

Domenic Rosati, Robie Gonzales, Jinkun Chen, Xuemin Yu, Melis Erkan, Yahya Kayani, Satya Deepika Chavatapalli, Frank Rudzicz, Hassan Sajjad

Evaluations of model editing currently only use the 'next few token' completions after a prompt. As a result, the impact of these methods on longer natural language generation is largely unknown. We introduce long-form evaluation of model editing (LEME), a novel evaluation protocol that measures the efficacy and impact of model editing in long-form generative settings. Our protocol consists of a machine-rated survey and a classifier which correlates well with human ratings. Importantly, we find that our protocol has very little relationship with previous short-form metrics (despite being designed to extend efficacy, generalization, locality, and portability into a long-form setting), indicating that our method introduces a novel set of dimensions for understanding model editing methods. Using this protocol, we benchmark a number of model editing techniques and present several findings including that, while some methods (ROME and MEMIT) perform well in making consistent edits within a limited scope, they suffer much more from factual drift than other methods. Finally, we present a qualitative analysis that illustrates common failure modes in long-form generative settings including internal consistency, lexical cohesion, and locality issues.

Structural Equation Modeling · Latent variable · MoDELS · INFORMS · Processing (programming language)

February 14, 2024

Quasi-Akaike information criterion of structural equation modeling with latent variables for diffusion processes

Shogo Kusano, Masayuki Uchida

from arXiv, 45 pages, 7 figures

We consider a model selection problem for structural equation modeling (SEM) with latent variables for diffusion processes based on high-frequency data. First, we propose the quasi-Akaike information criterion of the SEM and study the asymptotic properties. Next, we consider the situation where the set of competing models includes some misspecified parametric models. It is shown that the probability of choosing the misspecified models converges to zero. Furthermore, examples and simulation results are given.

Graph · Graphics processing unit · Networking · Neural Networks · Performer

February 14, 2024

Multiscale graph neural networks with adaptive mesh refinement for accelerating mesh-based simulations

Roberto Perera, Vinamra Agrawal

Mesh-based Graph Neural Networks (GNNs) have recently shown capabilities to simulate complex multiphysics problems with accelerated performance times. However, mesh-based GNNs require a large number of message-passing (MP) steps and suffer from over-smoothing for problems involving very fine meshes. In this work, we develop a multiscale mesh-based GNN framework mimicking a conventional iterative multigrid solver, coupled with adaptive mesh refinement (AMR), to mitigate challenges with conventional mesh-based GNNs. We use the framework to accelerate phase field (PF) fracture problems involving coupled partial differential equations with a near-singular operator due to near-zero modulus inside the crack. We define the initial graph representation using all mesh resolution levels. We perform a series of downsampling steps using Transformer MP GNNs to reach the coarsest graph, followed by upsampling steps to reach the original graph. We use skip connectors from the generated embedding during coarsening to prevent over-smoothing. We use Transfer Learning (TL) to significantly reduce the size of training datasets needed to simulate different crack configurations and loading conditions. The trained framework showed accelerated simulation times, while maintaining high accuracy for all cases compared to the physics-based PF fracture model. Finally, this work provides a new approach to accelerate a variety of mesh-based engineering multiphysics problems.

Inference · Sample · MoDELS · Controller · Replay buffer

February 13, 2024

On diffusion models for amortized inference: Benchmarking and improving stochastic control and sampling

Marcin Sendera, Minsu Kim, Sarthak Mittal, Pablo Lemos, Luca Scimeca, Jarrid Rector-Brooks, Alexandre Adam, Yoshua Bengio, Nikolay Malkin

from arXiv, 21 pages; code: //github.com/GFNOrg/gfn-diffusion

We study the problem of training diffusion models to sample from a distribution with a given unnormalized density or energy function. We benchmark several diffusion-structured inference methods, including simulation-based variational approaches and off-policy methods (continuous generative flow networks). Our results shed light on the relative advantages of existing algorithms while bringing into question some claims from past work. We also propose a novel exploration strategy for off-policy methods, based on local search in the target space with the use of a replay buffer, and show that it improves the quality of samples on a variety of target distributions. Our code for the sampling methods and benchmarks studied is made public at //github.com/GFNOrg/gfn-diffusion as a base for future work on diffusion models for amortized inference.

Functional · Manifold · MoDELS · Modeling scenario · Empirical distribution

February 13, 2024

Manifold functional multiple regression model with LRD error term

Diana P. Ovalle-Muñoz, M. Dolores Ruiz-Medina

This paper considers the problem of manifold functional multiple regression with functional response, time-varying scalar regressors, and functional error term displaying Long Range Dependence (LRD) in time. Specifically, the error term is given by a manifold multifractionally integrated functional time series (see, e.g., Ovalle-Muñoz & Ruiz-Medina, 2024). The manifold is defined by a connected and compact two-point homogeneous space. The functional regression parameters have support in the manifold. The Generalized Least-Squares (GLS) estimator of the vector functional regression parameter is computed, and its asymptotic properties are analyzed under a totally specified and misspecified model scenario. A multiscale residual correlation analysis in the simulation study undertaken illustrates the empirical distributional properties of the errors at different spherical resolution levels.

MoDELS · Class · Pattern Recognition · Estimation/Estimator · Operation

February 13, 2024

Algebraic methods for solving recognition problems with non-crossing classes

Anvar Kabulov, Alimdzhan Babadzhanov, Islambek Saymanov

from arXiv, I will rework and improve it and post it again

In this paper, we propose to consider various models of pattern recognition. At the same time, it is proposed to consider models in the form of two operators: a recognizing operator and a decision rule. Algebraic operations are introduced on recognizing operators, and based on the application of these operators, a family of recognizing algorithms is created. An upper estimate is constructed for the model, which guarantees the completeness of the extension.

Approximation · Oversampling · Sample · Approximation error · Decomposed

February 13, 2024

Randomized least-squares with minimal oversampling and interpolation in general spaces

Abdellah Chkifa, Matthieu Dolbeault

from arXiv, 17 pages

In approximation of functions based on point values, least-squares methods provide more stability than interpolation, at the expense of increasing the sampling budget. We show that near-optimal approximation error can nevertheless be achieved, in an expected $L^2$ sense, as soon as the sample size $m$ is larger than the dimension $n$ of the approximation space by a constant ratio. On the other hand, for $m=n$, we obtain an interpolation strategy with a stability factor of order $n$. The proposed sampling algorithms are greedy procedures based on arXiv:0808.0163 and arXiv:1508.03261, with polynomial computational complexity.
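The trade-off between sample size $m$ and space dimension $n$ can be tried out with an ordinary least-squares fit from point values. The sketch below is only a baseline under stated assumptions: it draws i.i.d. uniform points on $[-1, 1]$ and uses a monomial basis, whereas the abstract's greedy sampling procedures are not reproduced here. All function names are hypothetical.

```python
import random
from typing import Callable, List


def solve_linear(A: List[List[float]], b: List[float]) -> List[float]:
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def least_squares_fit(f: Callable[[float], float], n: int, m: int, seed: int = 0) -> List[float]:
    """Fit coefficients of 1, x, ..., x^(n-1) to m >= n random point values of f."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(m)]
    rows = [[x ** j for j in range(n)] for x in xs]
    ys = [f(x) for x in xs]
    # Normal equations: (V^T V) c = V^T y, with V the m-by-n design matrix.
    gram = [[sum(row[i] * row[j] for row in rows) for j in range(n)] for i in range(n)]
    rhs = [sum(row[i] * y for row, y in zip(rows, ys)) for i in range(n)]
    return solve_linear(gram, rhs)
```

For example, `least_squares_fit(lambda x: 1 + 2 * x - 3 * x * x, n=3, m=12)` fits a quadratic from twelve samples, an oversampling ratio of 4. Setting `m = n` turns the fit into interpolation at the sampled points.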

Estimation/Estimator · Linear · Maximum likelihood estimation · Linear regression · Weight

February 13, 2024

Nonparametric velocity estimation in stochastic convection-diffusion equations from multiple local measurements

Claudia Strauch, Anton Tiepner

from arXiv, 37 pages, 1 figure

We investigate pointwise estimation of the function-valued velocity field of a second-order linear SPDE. Based on multiple spatially localised measurements, we construct a weighted augmented MLE and study its convergence properties as the spatial resolution of the observations tends to zero and the number of measurements increases. By imposing Hölder smoothness conditions, we recover the pointwise convergence rate known to be minimax-optimal in the linear regression framework. The optimality of the rate in the current setting is verified by adapting the lower bound ansatz based on the RKHS of local measurements to the nonparametric situation.

Linear · Weight · Graph · Performer · Sparse

February 13, 2024

Efficient parallel implementation of the multiplicative weight update method for graph-based linear programs

Caleb Ju, Serif Yesil, Mengyuan Sun, Chandra Chekuri, Edgar Solomonik

from arXiv, Updates to funding and small revisions

Positive linear programs (LPs) model many graph and operations research problems. One can solve for a $(1+\epsilon)$-approximation for positive LPs, for any selected $\epsilon$, in polylogarithmic depth and near-linear work via variations of the multiplicative weight update (MWU) method. Despite extensive theoretical work on these algorithms through the decades, their empirical performance is not well understood. In this work, we implement and test an efficient parallel algorithm for solving positive LP relaxations, and apply it to graph problems such as densest subgraph, bipartite matching, vertex cover and dominating set. We accelerate the algorithm via a new step size search heuristic. Our implementation uses sparse linear algebra optimization techniques such as fusion of vector operations and use of sparse format. Furthermore, we devise an implicit representation for graph incidence constraints. We demonstrate the parallel scalability with the use of threading OpenMP and MPI on the Stampede2 supercomputer. We compare this implementation with exact libraries and specialized libraries for the above problems in order to evaluate MWU's practical standing for both accuracy and performance among other methods. Our results show this implementation is faster than general purpose LP solvers (IBM CPLEX, Gurobi) in all of our experiments, and in some instances, outperforms state-of-the-art specialized parallel graph algorithms.
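For readers unfamiliar with the multiplicative weight update idea, the following is a small sequential sketch for a packing LP of the form: maximize the sum of $x_j$ subject to $Ax \le 1$, $x \ge 0$, with $A \ge 0$. It follows a textbook-style scheme and is not the parallel algorithm, step size heuristic, or implicit constraint representation described in the abstract. The function name, stopping threshold, and dense matrix layout are assumptions.

```python
import math
from typing import List


def mwu_packing(A: List[List[float]], eps: float = 0.1) -> List[float]:
    """Approximately maximize sum(x) subject to A x <= 1, x >= 0, for nonnegative A.

    A is a dense list of constraint rows; every column is assumed to have
    at least one positive entry.
    """
    m, n = len(A), len(A[0])
    weights = [1.0] * m   # one weight per constraint
    x = [0.0] * n
    load = [0.0] * m      # load[i] tracks (A x)_i for the unscaled x
    threshold = max(1.0, math.log(m) / eps ** 2)
    while max(load) < threshold:
        # Pick the variable whose constraints are currently cheapest.
        j = min(range(n), key=lambda col: sum(weights[i] * A[i][col] for i in range(m)))
        # Step size chosen so that no constraint load grows by more than 1.
        step = 1.0 / max(A[i][j] for i in range(m))
        x[j] += step
        for i in range(m):
            inc = A[i][j] * step
            load[i] += inc
            # Constraints that were used become more expensive.
            weights[i] *= 1.0 + eps * inc
    # Scaling by the largest load makes the solution feasible by construction.
    scale = max(load)
    return [v / scale for v in x]
```

On the vertex-edge incidence matrix of a graph this computes a fractional matching. The returned point is always feasible because of the final scaling; how close it gets to the optimum depends on `eps` and on the stopping threshold.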