This tutorial serves as an introduction to recently developed non-asymptotic methods in the theory of system identification, with a focus on the linear setting. We emphasize tools we deem particularly useful for a range of problems in this domain, such as the covering technique, the Hanson-Wright inequality, and the method of self-normalized martingales. We then employ these tools to give streamlined proofs of the performance of various least-squares-based estimators for identifying the parameters in autoregressive models. We conclude by sketching out how the ideas presented herein can be extended to certain nonlinear identification problems.
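As a point of reference for the estimators analyzed here, the following minimal sketch (not taken from the tutorial; all names are illustrative) simulates a linear system x_{t+1} = A x_t + w_t and computes the ordinary least-squares estimate of A from a single trajectory:

```python
# Minimal sketch: ordinary least-squares estimation of the state-transition matrix A
# in x_{t+1} = A x_t + w_t from a single noisy trajectory.
import numpy as np

rng = np.random.default_rng(0)
d, T = 3, 500
A_true = 0.9 * np.eye(d) + 0.05 * rng.standard_normal((d, d))  # roughly stable system

# Simulate one trajectory driven by Gaussian noise.
X = np.zeros((T + 1, d))
for t in range(T):
    X[t + 1] = A_true @ X[t] + 0.1 * rng.standard_normal(d)

# Least-squares estimate: A_hat = argmin_A sum_t ||x_{t+1} - A x_t||^2,
# i.e. A_hat^T = (X^T X)^{-1} X^T Y with X = past states, Y = next states.
Xp, Y = X[:-1], X[1:]
A_hat = np.linalg.solve(Xp.T @ Xp, Xp.T @ Y).T

print("estimation error (operator norm):", np.linalg.norm(A_hat - A_true, 2))
```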
We investigate the transport of intensity equation (TIE) and the transport of phase equation (TPE) for solving the phase retrieval problem. Both the TIE and the TPE are derived from the paraxial Helmholtz equation and relate phase information to the intensity. The TIE is usually favored since the TPE is nonlinear. The main contribution of this paper is a discussion of when each of the two equations is preferable and of the potential benefits of a hybrid use. Moreover, we discuss the solution of the TPE with the method of characteristics and with viscosity methods. Both the TIE and the viscosity method are numerically implemented with finite element methods.
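For orientation, in one common sign convention the two equations obtained from the paraxial field $u = \sqrt{I}\,e^{i\varphi}$ read as follows (the exact signs depend on the chosen propagation convention):

```latex
\begin{align}
  \text{(TIE)}\qquad & \nabla_{\!\perp}\cdot\bigl(I\,\nabla_{\!\perp}\varphi\bigr)
      \;=\; -\,k\,\frac{\partial I}{\partial z},\\
  \text{(TPE)}\qquad & k\,\frac{\partial \varphi}{\partial z}
      \;+\; \tfrac12\,\bigl|\nabla_{\!\perp}\varphi\bigr|^{2}
      \;=\; \frac{\nabla_{\!\perp}^{2}\sqrt{I}}{2\sqrt{I}}.
\end{align}
```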
The advent of quantum computing holds the potential to revolutionize various fields by solving complex problems more efficiently than classical computers. Despite this promise, practical quantum advantage is hindered by current hardware limitations, notably the small number of qubits and high noise levels. In this study, we leverage adiabatic quantum computers to optimize Kolmogorov-Arnold Networks, a powerful neural network architecture for representing complex functions with minimal parameters. By modifying the network to use Bezier curves as the basis functions and formulating the optimization problem as a Quadratic Unconstrained Binary Optimization (QUBO) problem, we create a fixed-size solution space, independent of the number of training samples. Our approach demonstrates sparks of quantum advantage through faster training times compared to classical optimizers such as Adam, Stochastic Gradient Descent, Adaptive Gradient, and simulated annealing. Additionally, we introduce a novel rapid retraining capability, enabling the network to be retrained with new data without reprocessing old samples, thus enhancing learning efficiency in dynamic environments. Experimental results on initial training of classification and regression tasks validate the efficacy of our approach, showcasing significant speedups and comparable performance to classical methods. Experiments on retraining demonstrate a sixty-fold speedup of adiabatic-quantum-computing-based optimization over gradient-descent-based optimizers, and theoretical models suggest that this speedup could be even larger. Our findings suggest that with further advancements in quantum hardware and algorithm optimization, quantum-optimized machine learning models could have broad applications across various domains, with an initial focus on rapid retraining.
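The following sketch illustrates the generic idea of casting a least-squares fit of basis-function coefficients as a QUBO via a fixed-point binary encoding; it is an illustrative stand-in, not the paper's exact construction, and the helper `least_squares_qubo` is hypothetical:

```python
# Generic sketch: fit coefficients w of a fixed basis expansion y ~ Phi @ w by least
# squares, recast as a QUBO over binary variables via a fixed-point encoding w = S @ b.
# The QUBO size depends only on the number of coefficients and bits per coefficient,
# not on the number of training samples.
import numpy as np

def least_squares_qubo(Phi, y, n_bits=4, scale=1.0):
    """Return (Q, S): QUBO matrix Q and the bits-to-coefficients decoding matrix S."""
    n_coef = Phi.shape[1]
    # Offset-binary fixed-point encoding: each coefficient uses n_bits + 1 binary variables.
    weights = scale * (2.0 ** np.arange(n_bits)) / (2 ** n_bits - 1)
    weights = np.concatenate([weights, [-scale]])          # extra bit acts as a negative offset
    S = np.kron(np.eye(n_coef), weights)                   # shape (n_coef, n_coef*(n_bits+1))
    # ||Phi S b - y||^2 = b^T (S^T Phi^T Phi S) b - 2 (y^T Phi S) b + const;
    # for binary b the linear terms go on the diagonal of Q.
    Q = S.T @ Phi.T @ Phi @ S
    Q[np.diag_indices_from(Q)] -= 2.0 * (Phi @ S).T @ y
    return Q, S

# Tiny usage example with a random basis matrix; a QUBO solver (an annealer, or brute
# force for small sizes) would then minimize b^T Q b over b in {0,1}^n.
rng = np.random.default_rng(1)
Phi = rng.standard_normal((200, 5))
y = Phi @ rng.uniform(-1, 1, 5)
Q, S = least_squares_qubo(Phi, y)
print(Q.shape)  # size is independent of the 200 training samples
```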
We propose a combination of machine learning and flux limiting for property-preserving subgrid scale modeling in the context of flux-limited finite volume methods for the one-dimensional shallow-water equations. The numerical fluxes of a conservative target scheme are fitted to the coarse-mesh averages of a monotone fine-grid discretization using a neural network to parametrize the subgrid scale components. To ensure positivity preservation and the validity of local maximum principles, we use a flux limiter that constrains the intermediate states of an equivalent fluctuation form to stay in a convex admissible set. The results of our numerical studies confirm that the proposed combination of machine learning with monolithic convex limiting produces meaningful closures even in scenarios for which the network was not trained.
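For background, the sketch below shows a classical Zalesak-type flux-corrected limiter for a one-dimensional scalar conservation law on a periodic mesh; the paper's monolithic convex limiting for the shallow-water system is more involved, but the underlying idea of constraining a high-order (here, network-predicted) flux toward a bound-preserving low-order flux is the same. All function names are illustrative.

```python
# Simplified Zalesak-style flux-corrected limiting for a 1-D scalar law, periodic mesh.
import numpy as np

def fct_update(u, f_low, f_high, dt_dx):
    """One limited update; f_low/f_high are face fluxes with f[i] = flux at face i+1/2."""
    A = f_high - f_low                                   # antidiffusive face fluxes
    u_low = u - dt_dx * (f_low - np.roll(f_low, 1))      # bound-preserving low-order update
    # Local bounds from the low-order solution and its neighbours.
    u_max = np.maximum.reduce([u_low, np.roll(u_low, 1), np.roll(u_low, -1)])
    u_min = np.minimum.reduce([u_low, np.roll(u_low, 1), np.roll(u_low, -1)])
    # How much antidiffusion each cell can absorb without leaving [u_min, u_max].
    P_plus  = np.maximum(0.0, np.roll(A, 1)) - np.minimum(0.0, A)
    P_minus = np.maximum(0.0, A) - np.minimum(0.0, np.roll(A, 1))
    Q_plus  = (u_max - u_low) / dt_dx
    Q_minus = (u_low - u_min) / dt_dx
    R_plus  = np.where(P_plus  > 1e-14, np.minimum(1.0, Q_plus  / np.maximum(P_plus,  1e-14)), 0.0)
    R_minus = np.where(P_minus > 1e-14, np.minimum(1.0, Q_minus / np.maximum(P_minus, 1e-14)), 0.0)
    # Face correction factor: take the more restrictive of the two adjacent cells.
    alpha = np.where(A >= 0.0,
                     np.minimum(np.roll(R_plus, -1), R_minus),
                     np.minimum(R_plus, np.roll(R_minus, -1)))
    f_limited = f_low + alpha * A
    return u - dt_dx * (f_limited - np.roll(f_limited, 1))
```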
Behavioural distances of transition systems modelled via coalgebras for endofunctors generalize traditional notions of behavioural equivalence to a quantitative setting, in which states are equipped with a measure of how (dis)similar they are. Endowing transition systems with such distances essentially relies on the ability to lift functors describing the one-step behaviour of the transition systems to the category of pseudometric spaces. We consider the category-theoretic generalization of the Kantorovich lifting from transportation theory to the case of lifting functors to quantale-valued relations, which subsumes equivalences, preorders and (directed) metrics. We use tools from fibred category theory, which allow one to see the Kantorovich lifting as arising from an appropriate fibred adjunction. Our main contributions are compositionality results for the Kantorovich lifting, where we show that the lifting of a composed functor coincides with the composition of the liftings. In addition, we describe how to lift distributive laws in the case where one of the two functors is polynomial (with finite coproducts). These results are essential ingredients for adapting up-to techniques to the case of quantale-valued behavioural distances. Up-to techniques are a well-known coinductive technique for efficiently showing lower bounds for behavioural distances. We illustrate the results of our paper in two case studies.
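For orientation, the prototypical instance of this construction is the Kantorovich lifting of the finitely supported distribution functor over the quantale $[0,1]$, which recovers the classical transport distance:

```latex
% For a pseudometric space $(X,d)$ and finitely supported distributions $\mu,\nu$ on $X$:
\[
  d^{\uparrow\mathcal D}(\mu,\nu)
  \;=\; \sup_{\substack{f\colon (X,d)\to([0,1],d_e)\\ f\ \text{nonexpansive}}}
        \Bigl|\,\textstyle\sum_{x} f(x)\,\mu(x) \;-\; \sum_{x} f(x)\,\nu(x)\,\Bigr|,
\]
% which, by Kantorovich--Rubinstein duality, coincides with the Wasserstein (optimal
% transport) distance when $d$ is bounded by $1$.
```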
The convergence rate of a Markov chain to its stationary distribution is typically assessed using the concept of total variation mixing time. However, this worst-case measure often yields pessimistic estimates and is challenging to infer from observations. In this paper, we advocate for the use of the average mixing time as a more optimistic and demonstrably easier-to-estimate alternative. We further illustrate its applicability across a range of settings, from two-point to countable spaces, and discuss some practical implications.
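For reference, the worst-case notion is the standard total variation mixing time shown below; the average variant studied here replaces the maximum over initial states with an average (the precise definition is given in the paper):

```latex
\[
  t_{\mathrm{mix}}(\varepsilon)
  \;=\; \min\bigl\{\, t \ge 0 \;:\; \max_{x}\,\bigl\|P^{t}(x,\cdot) - \pi\bigr\|_{\mathrm{TV}} \le \varepsilon \,\bigr\}.
\]
```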
Steering vectors (SVs) are a new approach to efficiently adjusting language model behaviour at inference time by intervening on intermediate model activations. They have shown promise for improving both capabilities and model alignment. However, the reliability and generalisation properties of this approach are unknown. In this work, we rigorously investigate these properties and show that steering vectors have substantial limitations both in- and out-of-distribution. In-distribution, steerability is highly variable across different inputs. Depending on the concept, spurious biases can substantially contribute to how effective steering is for each input, presenting a challenge for the widespread use of steering vectors. Out-of-distribution, steering vectors often generalise, but for several concepts they are brittle to reasonable changes in the prompt and consequently fail to generalise well. Overall, our findings show that while steering can work well in the right circumstances, there remain many technical difficulties in applying steering vectors to guide models' behaviour at scale.
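As background on the intervention itself, the following minimal sketch (layer path and names are placeholders, not the paper's code) adds a fixed steering vector to the residual-stream activations of one transformer block at inference time via a PyTorch forward hook:

```python
# Minimal activation-steering sketch using a PyTorch forward hook.
import torch

def add_steering_hook(block, steering_vector, scale=1.0):
    """Register a hook that adds `scale * steering_vector` to the block's hidden states."""
    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        hidden = hidden + scale * steering_vector.to(hidden.dtype).to(hidden.device)
        return (hidden,) + output[1:] if isinstance(output, tuple) else hidden
    return block.register_forward_hook(hook)

# Usage sketch (assuming a Hugging Face-style causal LM exposing `model.model.layers`):
#   v = mean_pos_activations - mean_neg_activations     # contrastive steering vector
#   handle = add_steering_hook(model.model.layers[13], v, scale=4.0)
#   out = model.generate(**inputs)
#   handle.remove()
```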
The convergence rates of various first-order optimization algorithms are a pivotal concern within the numerical optimization community, as they directly reflect the efficiency of these algorithms across different optimization problems. Our goal is to take a significant step forward in the formal mathematical representation of optimization techniques using the Lean4 theorem prover. We first formalize the gradient for smooth functions and the subgradient for convex functions on a Hilbert space, laying the groundwork for the accurate formalization of algorithmic structures. Then, we extend our contribution by proving several properties of differentiable convex functions that have not yet been formalized in Mathlib. Finally, a comprehensive formalization of these algorithms is presented. These developments are not only noteworthy on their own but also serve as essential precursors to the formalization of a broader spectrum of numerical algorithms and their applications in machine learning as well as many other areas.
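As a flavor of what such a formalization looks like, the following hypothetical Lean 4 sketch states the subgradient inequality on a real inner-product space; the definition names are illustrative and do not claim to match the paper's development or Mathlib's:

```lean
-- Hypothetical sketch (not the paper's actual definitions): one way to state the
-- subgradient inequality for a function on a real inner-product space.
import Mathlib

open RealInnerProductSpace

variable {E : Type*} [NormedAddCommGroup E] [InnerProductSpace ℝ E]

/-- `g` is a subgradient of `f` at `x`: `f y ≥ f x + ⟪g, y - x⟫` for all `y`. -/
def IsSubgradientAt (f : E → ℝ) (g x : E) : Prop :=
  ∀ y, f x + ⟪g, y - x⟫_ℝ ≤ f y

/-- The subdifferential of `f` at `x` as a set of vectors. -/
def subdifferential (f : E → ℝ) (x : E) : Set E :=
  {g | IsSubgradientAt f g x}
```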
Active subspace (AS) methods are a valuable tool for understanding the relationship between the inputs and outputs of a physics simulation. In this paper, an elegant generalization of the traditional AS method is developed to assess the co-activity of two computer models. This generalization, which we refer to as a Co-Active Subspace (C-AS) Method, allows for the joint analysis of two or more computer models, enabling a thorough exploration of the alignment (or non-alignment) of the respective gradient spaces. We define co-active directions, co-sensitivity indices, and a scalar ``concordance'' metric (and complementary ``discordance'' pseudo-metric), and we demonstrate that these are powerful tools for understanding the behavior of a class of computer models, especially when used to supplement traditional AS analysis. Details for efficient estimation of the C-AS and an accompanying R package (github.com/knrumsey/concordance) are provided. Practical application is demonstrated through the analysis of a set of simulated rate stick experiments for PBX 9501, a high explosive, offering insights into complex model dynamics.
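For context, the classical single-model active subspace is estimated from Monte Carlo samples of the gradient; the sketch below shows this standard construction (it is generic background, not the C-AS method or the accompanying R package):

```python
# Standard active subspace estimation from gradient samples (Constantine-style).
import numpy as np

def active_subspace(grad_samples, k=2):
    """Estimate the top-k active directions from gradient samples (rows = samples)."""
    # C = E[grad f grad f^T], estimated by Monte Carlo over the input distribution.
    C = grad_samples.T @ grad_samples / grad_samples.shape[0]
    eigvals, eigvecs = np.linalg.eigh(C)            # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order[:k]]    # activity spectrum and active directions

# Usage sketch: grads[i] = gradient of the simulator output at the i-th sampled input.
rng = np.random.default_rng(2)
grads = rng.standard_normal((1000, 5)) @ np.diag([5.0, 1.0, 0.2, 0.1, 0.05])
vals, W = active_subspace(grads, k=2)
print(vals, W.shape)
```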
We study the asymptotic frequentist coverage and Gaussian approximation of Bayes posterior credible sets in nonlinear inverse problems when a Gaussian prior is placed on the parameter of the PDE. The aim is to ensure valid frequentist coverage of Bayes credible intervals when estimating continuous linear functionals of the parameter. Our results show that Bayes credible intervals have conservative coverage under certain smoothness assumptions on the parameter and a compatibility condition between the likelihood and the prior, regardless of whether an efficient limit exists and/or a Bernstein-von Mises theorem holds. In the latter case, our results yield a corollary with more relaxed sufficient conditions than previous works. We illustrate the practical utility of the results through the example of estimating the conductivity coefficient of a second-order elliptic PDE, where a near-$N^{-1/2}$ contraction rate and conservative coverage results are obtained for linear functionals that were shown not to be estimable efficiently.
We contribute to a better understanding of the class of functions that can be represented by a neural network with ReLU activations and a given architecture. Using techniques from mixed-integer optimization, polyhedral theory, and tropical geometry, we provide a mathematical counterbalance to the universal approximation theorems which suggest that a single hidden layer is sufficient for learning any function. In particular, we investigate whether the class of exactly representable functions strictly increases by adding more layers (with no restrictions on size). As a by-product of our investigations, we settle an old conjecture about piecewise linear functions by Wang and Sun (2005) in the affirmative. We also present upper bounds on the sizes of neural networks required to represent functions with logarithmic depth.
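As a small concrete instance of exact representability, the maximum of two affine functions is computable by a ReLU network with a single hidden layer:

```latex
\[
  \max\{a^{\top}x + b,\; c^{\top}x + d\}
  \;=\; a^{\top}x + b + \max\bigl\{0,\; (c-a)^{\top}x + (d-b)\bigr\},
\]
% whereas the question studied here is whether exact representations of more complicated
% piecewise linear functions genuinely require additional hidden layers.
```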