A well-known line of work (Barron, 1993; Breiman, 1993; Klusowski & Barron, 2018) provides bounds on the width $n$ of a ReLU two-layer neural network needed to approximate a function $f$ over the ball $\mathcal{B}_R(\mathbb{R}^d)$ up to error $\epsilon$, when the Fourier-based quantity $C_f = \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} \|\xi\|^2 |\hat{f}(\xi)| \ d\xi$ is finite. More recently, Ongie et al. (2019) used the Radon transform as a tool for analysis of infinite-width ReLU two-layer networks. In particular, they introduce the concept of Radon-based $\mathcal{R}$-norms and show that a function defined on $\mathbb{R}^d$ can be represented as an infinite-width two-layer neural network if and only if its $\mathcal{R}$-norm is finite. In this work, we extend the framework of Ongie et al. (2019) and define similar Radon-based semi-norms ($\mathcal{R},\mathcal{U}$-norms) such that a function admits an infinite-width neural network representation on a bounded open set $\mathcal{U} \subseteq \mathbb{R}^d$ when its $\mathcal{R},\mathcal{U}$-norm is finite. Building on this, we derive sparse (finite-width) neural network approximation bounds that refine those of Breiman (1993) and Klusowski & Barron (2018). Finally, we show that infinite-width neural network representations on bounded open sets are not unique and study their structure, providing a functional view of mode connectivity.
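As an illustrative sanity check (our computation, not drawn from the cited works), $C_f$ is finite for a standard Gaussian: taking $f(x) = e^{-\|x\|^2/2}$, whose unitary Fourier transform is $\hat{f}(\xi) = e^{-\|\xi\|^2/2}$,

$$C_f = \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} \|\xi\|^2 e^{-\|\xi\|^2/2} \, d\xi = \mathbb{E}_{\xi \sim \mathcal{N}(0, I_d)}\big[\|\xi\|^2\big] = d,$$

so Barron-type width bounds apply to this $f$ with $C_f = d$.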
The most scalable approaches to certifying neural network robustness depend on computing sound linear lower and upper bounds for the network's activation functions. Current approaches are limited in that the linear bounds must be handcrafted by an expert and can be sub-optimal, especially when the network's architecture composes operations using multiplication, as in LSTMs and the recently popular Swish activation. The dependence on an expert prevents the application of robustness certification to developments in the state-of-the-art of activation functions, and the lack of tightness guarantees may give a false sense of insecurity about a particular model. To the best of our knowledge, we are the first to consider the problem of automatically computing tight linear bounds for arbitrary n-dimensional activation functions. We propose LinSyn, the first approach that achieves tight bounds for arbitrary activation functions, while leveraging only the mathematical definition of the activation function itself. Our approach uses an efficient heuristic to synthesize bounds that are tight and usually sound, and then verifies the soundness (adjusting the bounds if necessary) using the highly optimized branch-and-bound SMT solver dReal. Even though our approach depends on an SMT solver, we show that the runtime is reasonable in practice and, compared with the state of the art, our approach often achieves 2-5X tighter final output bounds and more than quadruple the certified robustness.
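To make the synthesize-then-verify loop concrete, here is a minimal sketch for a 1-D activation (Swish, $x \cdot \sigma(x)$). Everything below is ours for illustration, not LinSyn's API: synthesis is a least-squares fit shifted past the worst residual on a dense grid, and dense sampling stands in for the dReal soundness query (so it can only falsify, not prove).

```python
import numpy as np

def swish(x):
    return x / (1.0 + np.exp(-x))  # x * sigmoid(x)

def synthesize_linear_bounds(f, lo, hi, n_fit=50, n_shift=10000, slack=1e-3):
    """Heuristically synthesize linear lower/upper bounds a*x + b for f on
    [lo, hi]: fit a line, then shift it past the extreme residuals."""
    xs = np.linspace(lo, hi, n_fit)
    a, b = np.polyfit(xs, f(xs), 1)
    xd = np.linspace(lo, hi, n_shift)
    resid = f(xd) - (a * xd + b)
    return (a, b + resid.min() - slack), (a, b + resid.max() + slack)

def check_soundness(f, lo, hi, lower, upper, n_check=200000):
    """Stand-in for the dReal query: dense sampling can only falsify,
    whereas a branch-and-bound SMT call actually proves soundness (and in
    LinSyn a failed proof triggers an adjustment of the bounds)."""
    xs = np.linspace(lo, hi, n_check)
    (al, bl), (au, bu) = lower, upper
    return bool(np.all(al * xs + bl <= f(xs)) and np.all(f(xs) <= au * xs + bu))

lower, upper = synthesize_linear_bounds(swish, -3.0, 3.0)
print(lower, upper, check_soundness(swish, -3.0, 3.0, lower, upper))
```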
In this paper, we show several parameterized problems to be complete for the class XNLP: parameterized problems that can be solved with a non-deterministic algorithm that uses $f(k)\log n$ space and $f(k)n^c$ time, with $f$ a computable function, $n$ the input size, $k$ the parameter, and $c$ a constant. The problems include Maximum Regular Induced Subgraph and Max Cut parameterized by linear clique-width, Capacitated (Red-Blue) Dominating Set parameterized by pathwidth, Odd Cycle Transversal parameterized by a new parameter we call logarithmic linear clique-width (defined as $k/\log n$ for an $n$-vertex graph of linear clique-width $k$), and Bipartite Bandwidth.
The User Plane Function (UPF) provides network services in the 3GPP 5G core network. These services need to be implemented on demand, inexpensively, and with provable properties. Existing network dataplane programming languages are not up to the task. We present a new software paradigm for the UPF, inspired by model checking of concurrent reactive systems: conceptually, each component of the system is modeled as an extended finite-state machine, and their product is verified. We show how such a product can be computed for one example of a UPF and how its state invariants can be inferred, thereby eliminating the need to formally verify the product separately. Code can be generated from the product and regenerated on the fly to remain optimal for the probability distribution of network traffic the UPF must process.
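As a toy illustration of the product construction (our example, not the paper's system; a real UPF component is an *extended* FSM with data variables, which we elide here), consider a session-management machine and a packet-forwarding machine that synchronize on shared events:

```python
# Toy finite-state machines as transition dicts: (state, event) -> state.
session = {("IDLE", "establish"): "ACTIVE", ("ACTIVE", "release"): "IDLE"}
forwarder = {("OFF", "establish"): "ON", ("ON", "packet"): "ON",
             ("ON", "release"): "OFF"}

def product_fsm(t1, t2, start):
    """Synchronous product: an event shared by both alphabets fires only if
    both machines define it; an event known to one machine moves it alone."""
    alpha1 = {e for _, e in t1}
    alpha2 = {e for _, e in t2}
    trans, frontier, seen = {}, [start], {start}
    while frontier:
        s1, s2 = frontier.pop()
        for e in alpha1 | alpha2:
            n1 = t1.get((s1, e)) if e in alpha1 else s1
            n2 = t2.get((s2, e)) if e in alpha2 else s2
            if n1 is None or n2 is None:
                continue  # event is blocked in a machine that knows it
            trans[((s1, s2), e)] = (n1, n2)
            if (n1, n2) not in seen:
                seen.add((n1, n2))
                frontier.append((n1, n2))
    return trans

for (state, event), nxt in sorted(product_fsm(session, forwarder,
                                              ("IDLE", "OFF")).items()):
    print(state, f"--{event}->", nxt)
```

Only the reachable product states are built, which is what makes invariants of the product (here, "forwarding is ON exactly when the session is ACTIVE") inferable by construction.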
Zero-freeness-based algorithms are a major technique for deterministic approximate counting. In Barvinok's original framework [Bar17], a quasi-polynomial time algorithm for estimating zero-free partition functions was obtained by calculating truncated Taylor expansions. Patel and Regts [PR17] later gave a refinement of Barvinok's framework that yields a polynomial-time algorithm for a class of zero-free graph polynomials expressible as counting induced subgraphs in bounded-degree graphs. In this paper, we give a polynomial-time algorithm for estimating classical and quantum partition functions specified by local Hamiltonians with bounded maximum degree, assuming a zero-free property for the temperature. Consequently, when the inverse temperature is close enough to zero (within a constant gap), we obtain polynomial-time approximation algorithms for all such partition functions. Our result is based on a new abstract framework that extends and generalizes the approach of Patel and Regts.
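The heart of Barvinok's framework is easy to state: if the partition function $p(z)$ is zero-free on a disk around the origin, then $\log p(z)$ is well approximated inside the disk by the first $O(\log(n/\epsilon))$ terms of its Taylor series, which are computable from the low-order coefficients of $p$. A minimal sketch of that step (ours, with a made-up demo polynomial):

```python
from math import comb, log

def log_partition_truncated(a, z, m):
    """Approximate log p(z), where p(z) = sum_k a[k] * z**k with a[0] = 1,
    by the first m Taylor coefficients of log p, computed via the recurrence
    k*a[k] = sum_{j<=k} j*c[j]*a[k-j] (obtained from p' = (log p)' * p).
    Zero-freeness of p on a disk of radius > |z| is assumed, not checked;
    Barvinok's lemma then makes the truncation error decay geometrically."""
    coef = lambda i: a[i] if i < len(a) else 0.0
    c = [0.0] * (m + 1)
    for k in range(1, m + 1):
        c[k] = coef(k) - sum(j * c[j] * coef(k - j) for j in range(1, k)) / k
    return sum(c[k] * z ** k for k in range(1, m + 1))

# Demo polynomial (ours): p(z) = (1 + z/4)**8 has its only zero at z = -4,
# hence is zero-free on |z| < 4; compare truncations at z = 1 to the truth.
a = [comb(8, k) / 4 ** k for k in range(9)]
for m in (2, 5, 10, 20):
    print(m, log_partition_truncated(a, 1.0, m), 8 * log(1.25))
```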
We study the conjectured relationship between the implicit regularization in neural networks, trained with gradient-based methods, and rank minimization of their weight matrices. Previously, it was proved that for linear networks (of depth 2 and vector-valued outputs), gradient flow (GF) w.r.t. the square loss acts as a rank minimization heuristic. However, understanding to what extent this generalizes to nonlinear networks is an open problem. In this paper, we focus on nonlinear ReLU networks, providing several new positive and negative results. On the negative side, we prove (and demonstrate empirically) that, unlike the linear case, GF on ReLU networks may no longer tend to minimize ranks, in a rather strong sense (even approximately, for "most" datasets of size 2). On the positive side, we reveal that ReLU networks of sufficient depth are provably biased towards low-rank solutions in several reasonable settings.
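A quick empirical probe of the kind this line of work suggests is easy to set up (this sketch is ours, not the authors' code): train a deep ReLU network with plain gradient descent, a crude discretization of gradient flow, on a tiny dataset and watch the numerical ranks of the weight matrices.

```python
import torch

torch.manual_seed(0)

# Toy probe: a depth-4 ReLU net trained by full-batch gradient descent on a
# two-point dataset; we track the numerical rank of each weight matrix.
d, width, depth, n = 10, 100, 4, 2
layers = []
for i in range(depth):
    layers += [torch.nn.Linear(d if i == 0 else width, width), torch.nn.ReLU()]
layers += [torch.nn.Linear(width, 1)]
net = torch.nn.Sequential(*layers)

X, y = torch.randn(n, d), torch.randn(n, 1)
opt = torch.optim.SGD(net.parameters(), lr=1e-2)
for step in range(20001):
    loss = ((net(X) - y) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
    if step % 5000 == 0:
        with torch.no_grad():
            ranks = [torch.linalg.matrix_rank(m.weight, rtol=1e-3).item()
                     for m in net if isinstance(m, torch.nn.Linear)]
        print(f"step {step:5d}  loss {loss.item():.2e}  ranks {ranks}")
```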
Given a probability distribution $\mathcal{D}$ over the non-negative integers, a $\mathcal{D}$-repeat channel acts on an input symbol by repeating it a number of times distributed as $\mathcal{D}$. For example, the binary deletion channel ($\mathcal{D} = \mathrm{Bernoulli}$) and the Poisson repeat channel ($\mathcal{D} = \mathrm{Poisson}$) are special cases. We say a $\mathcal{D}$-repeat channel is square-integrable if $\mathcal{D}$ has finite first and second moments. In this paper, we construct explicit codes for all square-integrable $\mathcal{D}$-repeat channels with rate arbitrarily close to capacity that are encodable and decodable in linear and quasi-linear time, respectively. We also consider possible extensions to the repeat channel model and illustrate how our construction can be extended to an even broader class of channels capturing insertions, deletions, and substitutions. Our work offers an alternative, simplified, and more general construction compared to the recent work of Rubinstein (arXiv:2111.00261), who attains similar results to ours in the cases of the deletion channel and the Poisson repeat channel. It also improves on the decoding failure probability of the polar code constructions of Tal et al. for the deletion channel (ISIT 2019) and certain insertion/deletion/substitution channels (arXiv:2102.02155). Our techniques closely follow the approaches of Guruswami and Li (IEEE ToIT 2019) and Con and Shpilka (IEEE ToIT 2020); what sets our work apart is that we show that a capacity-achieving code for the channels in question can be assumed to have an "approximate balance" in the frequency of zeros and ones in all sufficiently long substrings of all codewords. This allows us to attain near-capacity-achieving codes in a general setting. We consider this "approximate balance" result to be of independent interest, as it can be cast in much greater generality than repeat channels.
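For concreteness, the channel model is a few lines of simulation (ours, for illustration): the deletion channel keeps each bit with probability $1-p$, and the Poisson repeat channel replaces each bit with a $\mathrm{Poisson}(\lambda)$ number of copies.

```python
import numpy as np

rng = np.random.default_rng(0)

def repeat_channel(bits, sample_repeats):
    """Emit each input bit r times, with r drawn i.i.d. from D via
    sample_repeats(n), which returns n samples from D."""
    return np.repeat(bits, sample_repeats(len(bits)))

bits = rng.integers(0, 2, size=20)
# Deletion channel with deletion probability 0.3: D = Bernoulli(0.7).
out_del = repeat_channel(bits, lambda n: rng.binomial(1, 0.7, n))
# Poisson repeat channel with mean 1.2: D = Poisson(1.2).
out_poi = repeat_channel(bits, lambda n: rng.poisson(1.2, n))
print(bits, out_del, out_poi, sep="\n")
```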
We present an algorithm for the maximum matching problem in dynamic (insertion-deletion) streams with *asymptotically optimal* space complexity: for any $n$-vertex graph, our algorithm with high probability outputs an $\alpha$-approximate matching in a single pass using $O(n^2/\alpha^3)$ bits of space. A long line of work on the dynamic streaming matching problem has reduced the gap between space upper and lower bounds first to $n^{o(1)}$ factors [Assadi-Khanna-Li-Yaroslavtsev; SODA 2016] and subsequently to $\text{polylog}(n)$ factors [Dark-Konrad; CCC 2020]. Our upper bound now matches the Dark-Konrad lower bound up to $O(1)$ factors, thus completing this research direction. Our approach consists of two main steps: we first (provably) identify a family of graphs, similar to the instances used in prior work to establish the lower bounds for this problem, as the only "hard" instances to focus on. These graphs include an induced subgraph which is both sparse and contains a large matching. We then design a dynamic streaming algorithm for this family of graphs which is more efficient than prior work. The key to this efficiency is a novel sketching method, which bypasses the typical loss of $\text{polylog}(n)$ factors in space compared to standard $L_0$-sampling primitives, and can be of independent interest in designing optimal algorithms for other streaming problems.
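For readers unfamiliar with the $L_0$-sampling primitive mentioned above, here is a stripped-down toy version (ours; real samplers add a fingerprint test and sparse recovery, and the algorithm's point is precisely to bypass the $\text{polylog}(n)$ overhead such primitives incur). It maintains, per subsampling level, a count and an index-sum over a dynamic set such as a graph's edge set:

```python
import random

class ToyL0Sampler:
    """Toy L_0-sampler over a dynamic set of items in {0, ..., n-1} with
    0/1 multiplicities (e.g., graph edges under insertions and deletions).
    Level l subsamples items with probability 2**-l; if some level ends with
    exactly one survivor, its (count, index_sum) pair reveals that item."""
    def __init__(self, n, seed=0):
        self.levels = max(1, n.bit_length())
        rnd = random.Random(seed)
        self.salt = [rnd.getrandbits(64) for _ in range(self.levels)]
        self.stats = [[0, 0] for _ in range(self.levels)]  # [count, index_sum]

    def _survives(self, item, lvl):
        return hash((self.salt[lvl], item)) % (1 << lvl) == 0

    def update(self, item, delta):          # +1 to insert, -1 to delete
        for lvl in range(self.levels):
            if self._survives(item, lvl):
                self.stats[lvl][0] += delta
                self.stats[lvl][1] += delta * item

    def sample(self):
        # With 0/1 multiplicities, count is exactly the number of survivors,
        # so count == 1 certifies a unique survivor whose index is index_sum.
        for count, isum in self.stats:
            if count == 1:
                return isum
        return None

s = ToyL0Sampler(1000)
for e in (3, 5, 7):
    s.update(e, +1)
s.update(5, -1)                             # delete item 5
print(s.sample())                           # 3 or 7, or None on collisions
```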
Optimum parameter estimation methods require knowledge of a parametric probability density that statistically describes the available observations. In this work, we examine Bayesian and non-Bayesian parameter estimation problems under a data-driven formulation where the necessary parametric probability density is replaced by available data. We present various data-driven versions that either result in neural network approximations of the optimum estimators or in well-defined optimization problems that can be solved numerically. In particular, for the data-driven equivalent of non-Bayesian estimation, we end up with optimization problems similar to the ones encountered in the design of generative networks.
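As a minimal sketch of the Bayesian variant (ours, with a toy model chosen so the answer is known): under squared-error loss the optimum estimator is the conditional mean $\mathbb{E}[\theta \mid X]$, which is also the population minimizer of the regression risk, so fitting a small network to simulated $(x, \theta)$ pairs should roughly recover it.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy Bayesian model: theta ~ N(0, 1), X | theta ~ N(theta, 0.5**2).
# Here the MMSE estimator is known in closed form, E[theta | X=x] = x / 1.25,
# so we can check what the network learns.
n = 5000
theta = rng.normal(0.0, 1.0, n)
x = theta + rng.normal(0.0, 0.5, n)

# One-hidden-layer network trained on the empirical squared error; the
# population minimizer of this risk is the conditional mean E[theta | X].
W1 = rng.normal(0.0, 1.0, (1, 32)); b1 = np.zeros(32)
w2 = rng.normal(0.0, 0.1, 32);      b2 = 0.0
lr = 1e-2
for _ in range(5000):
    h = np.tanh(x[:, None] @ W1 + b1)        # hidden layer, shape (n, 32)
    g = 2.0 * (h @ w2 + b2 - theta) / n      # d(MSE)/d(prediction)
    w2 -= lr * (h.T @ g); b2 -= lr * g.sum()
    gh = np.outer(g, w2) * (1.0 - h ** 2)    # backprop through tanh
    W1 -= lr * (x[:, None].T @ gh); b1 -= lr * gh.sum(0)

xt = np.array([-2.0, 0.0, 2.0])
h = np.tanh(xt[:, None] @ W1 + b1)
print(h @ w2 + b2)        # learned estimates
print(xt / 1.25)          # exact MMSE estimates for comparison
```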
We call a multigraph $(k,d)$-edge colourable if its edge set can be partitioned into $k$ subgraphs of maximum degree at most $d$, and denote by $\chi'_{d}(G)$ the minimum $k$ such that $G$ is $(k,d)$-edge colourable. We prove that for every integer $d$, every multigraph $G$ with maximum degree $\Delta$ is $(\lceil \frac{\Delta}{d} \rceil, d)$-edge colourable if $d$ is even and $(\lceil \frac{3\Delta - 1}{3d - 1} \rceil, d)$-edge colourable if $d$ is odd, and that these bounds are tight. We also prove that for every simple graph $G$, $\chi'_{d}(G) \in \{ \lceil \frac{\Delta}{d} \rceil, \lceil \frac{\Delta+1}{d} \rceil \}$, and characterize the values of $d$ and $\Delta$ for which it is NP-complete to compute $\chi'_d(G)$. These results generalize several classic results on the chromatic index of a graph by Shannon, Vizing, Holyer, and Leven and Galil.
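To see how the odd-$d$ bound relates to the classics (our arithmetic): setting $d = 1$ recovers Shannon's theorem $\chi'(G) \le \lfloor 3\Delta/2 \rfloor$ for multigraphs, since

$$\Big\lceil \frac{3\Delta - 1}{3 \cdot 1 - 1} \Big\rceil = \Big\lceil \frac{3\Delta - 1}{2} \Big\rceil = \Big\lfloor \frac{3\Delta}{2} \Big\rfloor,$$

while for, say, $\Delta = 7$ the theorem gives $\chi'_2(G) \le \lceil 7/2 \rceil = 4$ for the even case $d = 2$ and $\chi'_3(G) \le \lceil 20/8 \rceil = 3$ for the odd case $d = 3$.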
Recent contrastive representation learning methods rely on estimating mutual information (MI) between multiple views of an underlying context. For example, we can derive multiple views of a given image by applying data augmentation, or we can split a sequence into views comprising the past and future of some step in the sequence. Contrastive lower bounds on MI are easy to optimize but have a strong underestimation bias when estimating large amounts of MI. We propose decomposing the full MI estimation problem into a sum of smaller estimation problems by splitting one of the views into progressively more informed subviews and applying the chain rule for MI between the decomposed views. This expression contains a sum of unconditional and conditional MI terms, each measuring a modest chunk of the total MI, which facilitates approximation via contrastive bounds. To maximize the sum, we formulate a contrastive lower bound on the conditional MI which can be approximated efficiently. We refer to our general approach as Decomposed Estimation of Mutual Information (DEMI). We show that DEMI can capture a larger amount of MI than standard non-decomposed contrastive bounds in a synthetic setting, and that it learns better representations in a vision domain and for dialogue generation.
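In symbols (our notation), splitting a view $y$ into subviews $y_{1:K}$ and applying the chain rule gives

$$I(x; y) = I(x; y_{1:K}) = \sum_{k=1}^{K} I\big(x; y_k \mid y_{1:k-1}\big),$$

with each term bounded below by a contrastive estimator. Since an InfoNCE-style bound computed with $N$ negatives can never exceed $\log N$, estimating many modest terms and summing them can capture more total MI than a single contrastive bound on $I(x; y)$.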