两个人的电影全免费观看720,日本一区二区三区不卡网站,日韩A级毛片免费视频,老司机国内精品久久久久精品,国产又色又爽又黄激视频在线

We consider the problem of joint learning of multiple linear dynamical systems. This has received significant attention recently under different types of assumptions on the model parameters. The setting we consider involves a collection of $m$ linear systems each of which resides on a node of a given undirected graph $G = ([m], \mathcal{E})$. We assume that the system matrices are marginally stable, and satisfy a smoothness constraint w.r.t $G$ -- akin to the quadratic variation of a signal on a graph. Given access to the states of the nodes over $T$ time points, we then propose two estimators for joint estimation of the system matrices, along with non-asymptotic error bounds on the mean-squared error (MSE). In particular, we show conditions under which the MSE converges to zero as $m$ increases, typically polynomially fast w.r.t $m$. The results hold under mild (i.e., $T \sim \log m$), or sometimes, even no assumption on $T$ (i.e. $T \geq 2$).

相關內容

線性的

關注 1

正則化項 · Learning · 數據集 · Performance · 變分自編碼 ·

2024 年 7 月 15 日

Adaptive Advantage-Guided Policy Regularization for Offline Reinforcement Learning

Tenglong Liu,Yang Li,Yixing Lan,Hao Gao,Wei Pan,Xin Xu

from arxiv, ICML 2024, 19 pages

In offline reinforcement learning, the challenge of out-of-distribution (OOD) is pronounced. To address this, existing methods often constrain the learned policy through policy regularization. However, these methods often suffer from the issue of unnecessary conservativeness, hampering policy improvement. This occurs due to the indiscriminate use of all actions from the behavior policy that generates the offline dataset as constraints. The problem becomes particularly noticeable when the quality of the dataset is suboptimal. Thus, we propose Adaptive Advantage-guided Policy Regularization (A2PR), obtaining high-advantage actions from an augmented behavior policy combined with VAE to guide the learned policy. A2PR can select high-advantage actions that differ from those present in the dataset, while still effectively maintaining conservatism from OOD actions. This is achieved by harnessing the VAE capacity to generate samples matching the distribution of the data points. We theoretically prove that the improvement of the behavior policy is guaranteed. Besides, it effectively mitigates value overestimation with a bounded performance gap. Empirically, we conduct a series of experiments on the D4RL benchmark, where A2PR demonstrates state-of-the-art performance. Furthermore, experimental results on additional suboptimal mixed datasets reveal that A2PR exhibits superior performance. Code is available at //github.com/ltlhuuu/A2PR.

統計量 · MoDELS · INFORMS · 信息理論 · 有向 ·

2024 年 7 月 13 日

Statistical Response of ENSO Complexity to Initial Condition and Model Parameter Perturbations

Marios Andreou,Nan Chen

from arxiv, This is the second revision version. 54 pages, 11 figures, 2 tables (1 in main text and 1 in the appendix), typeset in LaTeX. Submitted for peer-review to AMS' Journal of Climate (JCLI). For more info see //sites.google.com/wisc.edu/mariosandreou/cv-publications/statistical-response-enso-complexity

Studying the response of a climate system to perturbations has practical significance. Standard methods in computing the trajectory-wise deviation caused by perturbations may suffer from the chaotic nature that makes the model error dominate the true response after a short lead time. Statistical response, which computes the return described by the statistics, provides a systematic way of reaching robust outcomes with an appropriate quantification of the uncertainty and extreme events. In this paper, information theory is applied to compute the statistical response and find the most sensitive perturbation direction of different El Ni\~no-Southern Oscillation (ENSO) events to initial value and model parameter perturbations. Depending on the initial phase and the time horizon, different state variables contribute to the most sensitive perturbation direction. While initial perturbations in sea surface temperature (SST) and thermocline depth usually lead to the most significant response of SST at short- and long-range, respectively, initial adjustment of the zonal advection can be crucial to trigger strong statistical responses at medium-range around 5 to 7 months, especially at the transient phases between El Ni\~no and La Ni\~na. It is also shown that the response in the variance triggered by external random forcing perturbations, such as the wind bursts, often dominates the mean response, making the resulting most sensitive direction very different from the trajectory-wise methods. Finally, despite the strong non-Gaussian climatology distributions, using Gaussian approximations in the information theory is efficient and accurate for computing the statistical response, allowing the method to be applied to sophisticated operational systems.

得分 · 隨機動力系統 · 動力系統 · Processing（編程語言） · 縮放 ·

2024 年 7 月 13 日

Probabilistic Predictability of Stochastic Dynamical Systems

Tao Xu,Yushan Li,Jianping He

To assess the quality of a probabilistic prediction for stochastic dynamical systems (SDSs), scoring rules assign a numerical score based on the predictive distribution and the measured state. In this paper, we propose an $\epsilon$-logarithm score that generalizes the celebrated logarithm score by considering a neighborhood with radius $\epsilon$. We characterize the probabilistic predictability of an SDS by optimizing the expected score over the space of probability measures. We show how the probabilistic predictability is quantitatively determined by the neighborhood radius, the differential entropies of process noises, and the system dimension. Given any predictor, we provide approximations for the expected score with an error of scale $\mathcal{O}(\epsilon)$. In addition to the expected score, we also analyze the asymptotic behaviors of the score on individual trajectories. Specifically, we prove that the score on a trajectory can converge to the expected score when the process noises are independent and identically distributed. Moreover, the convergence speed against the trajectory length $T$ is of scale $\mathcal{O}(T^{-\frac{1}{2}})$ in the sense of probability. Finally, numerical examples are given to elaborate the results.

離散化 · 詞元分析器 · Learning · contrastive · 掩碼 ·

2024 年 7 月 12 日

On the Role of Discrete Tokenization in Visual Representation Learning

Tianqi Du,Yifei Wang,Yisen Wang

from arxiv, ICLR 2024 Spotlight

In the realm of self-supervised learning (SSL), masked image modeling (MIM) has gained popularity alongside contrastive learning methods. MIM involves reconstructing masked regions of input images using their unmasked portions. A notable subset of MIM methodologies employs discrete tokens as the reconstruction target, but the theoretical underpinnings of this choice remain underexplored. In this paper, we explore the role of these discrete tokens, aiming to unravel their benefits and limitations. Building upon the connection between MIM and contrastive learning, we provide a comprehensive theoretical understanding on how discrete tokenization affects the model's generalization capabilities. Furthermore, we propose a novel metric named TCAS, which is specifically designed to assess the effectiveness of discrete tokens within the MIM framework. Inspired by this metric, we contribute an innovative tokenizer design and propose a corresponding MIM method named ClusterMIM. It demonstrates superior performance on a variety of benchmark datasets and ViT backbones. Code is available at //github.com/PKU-ML/ClusterMIM.

Continuity · 流 · MoDELS · 輸出 · 可辨認的 ·

2024 年 7 月 10 日

Counting Distinct Elements in the Turnstile Model with Differential Privacy under Continual Observation

Palak Jain,Iden Kalemaj,Sofya Raskhodnikova,Satchit Sivakumar,Adam Smith

from arxiv, NeurIPS 2023

Privacy is a central challenge for systems that learn from sensitive data sets, especially when a system's outputs must be continuously updated to reflect changing data. We consider the achievable error for differentially private continual release of a basic statistic - the number of distinct items - in a stream where items may be both inserted and deleted (the turnstile model). With only insertions, existing algorithms have additive error just polylogarithmic in the length of the stream $T$. We uncover a much richer landscape in the turnstile model, even without considering memory restrictions. We show that every differentially private mechanism that handles insertions and deletions has worst-case additive error at least $T^{1/4}$ even under a relatively weak, event-level privacy definition. Then, we identify a parameter of the input stream, its maximum flippancy, that is low for natural data streams and for which we give tight parameterized error guarantees. Specifically, the maximum flippancy is the largest number of times that the contribution of a single item to the distinct elements count changes over the course of the stream. We present an item-level differentially private mechanism that, for all turnstile streams with maximum flippancy $w$, continually outputs the number of distinct elements with an $O(\sqrt{w} \cdot poly\log T)$ additive error, without requiring prior knowledge of $w$. We prove that this is the best achievable error bound that depends only on $w$, for a large range of values of $w$. When $w$ is small, the error of our mechanism is similar to the polylogarithmic in $T$ error in the insertion-only setting, bypassing the hardness in the turnstile model.

Neural Networks · Networking · 可約的 · Continuity · 推斷 ·

2021 年 6 月 21 日

A Survey of Quantization Methods for Efficient Neural Network Inference

Amir Gholami,Sehoon Kim,Zhen Dong,Zhewei Yao,Michael W. Mahoney,Kurt Keutzer

from arxiv, Book Chapter: Low-Power Computer Vision: Improving the Efficiency of Artificial Intelligence

As soon as abstract mathematical computations were adapted to computation on digital computers, the problem of efficient representation, manipulation, and communication of the numerical values in those computations arose. Strongly related to the problem of numerical representation is the problem of quantization: in what manner should a set of continuous real-valued numbers be distributed over a fixed discrete set of numbers to minimize the number of bits required and also to maximize the accuracy of the attendant computations? This perennial problem of quantization is particularly relevant whenever memory and/or computational resources are severely restricted, and it has come to the forefront in recent years due to the remarkable performance of Neural Network models in computer vision, natural language processing, and related areas. Moving from floating-point representations to low-precision fixed integer values represented in four bits or less holds the potential to reduce the memory footprint and latency by a factor of 16x; and, in fact, reductions of 4x to 8x are often realized in practice in these applications. Thus, it is not surprising that quantization has emerged recently as an important and very active sub-area of research in the efficient implementation of computations associated with Neural Networks. In this article, we survey approaches to the problem of quantizing the numerical values in deep Neural Network computations, covering the advantages/disadvantages of current methods. With this survey and its organization, we hope to have presented a useful snapshot of the current research in quantization for Neural Networks and to have given an intelligent organization to ease the evaluation of future research in this area.

Performer · 學成 · 維數災難 · 泛化理論 · 數學 ·

2021 年 5 月 9 日

The Modern Mathematics of Deep Learning

Julius Berner,Philipp Grohs,Gitta Kutyniok,Philipp Petersen

from arxiv, This review paper will appear as a book chapter in the book "Theory of Deep Learning" by Cambridge University Press

We describe the new field of mathematical analysis of deep learning. This field emerged around a list of research questions that were not answered within the classical framework of learning theory. These questions concern: the outstanding generalization power of overparametrized neural networks, the role of depth in deep architectures, the apparent absence of the curse of dimensionality, the surprisingly successful optimization performance despite the non-convexity of the problem, understanding what features are learned, why deep architectures perform exceptionally well in physical problems, and which fine aspects of an architecture affect the behavior of a learning task in which way. We present an overview of modern approaches that yield partial answers to these questions. For selected approaches, we describe the main ideas in more detail.

異常點 · 異常檢測 · CIFAR-10 · Extensibility · Performance ·

2018 年 12 月 21 日

Deep Anomaly Detection with Outlier Exposure

Dan Hendrycks,Mantas Mazeika,Thomas G. Dietterich

from arxiv, ICLR 2019; PyTorch code available at //github.com/hendrycks/outlier-exposure

It is important to detect anomalous inputs when deploying machine learning systems. The use of larger and more complex inputs in deep learning magnifies the difficulty of distinguishing between anomalous and in-distribution examples. At the same time, diverse image and text data are available in enormous quantities. We propose leveraging these data to improve deep anomaly detection by training anomaly detectors against an auxiliary dataset of outliers, an approach we call Outlier Exposure (OE). This enables anomaly detectors to generalize and detect unseen anomalies. In extensive experiments on natural language processing and small- and large-scale vision tasks, we find that Outlier Exposure significantly improves detection performance. We also observe that cutting-edge generative models trained on CIFAR-10 may assign higher likelihoods to SVHN images than to CIFAR-10 images; we use OE to mitigate this issue. We also analyze the flexibility and robustness of Outlier Exposure, and identify characteristics of the auxiliary dataset that improve performance.

圖像分割 · 代價 · Performer · SCAN · Better ·

2018 年 1 月 31 日

Improved Image Segmentation via Cost Minimization of Multiple Hypotheses

Marc Bosch,Christopher M. Gifford,Austin G. Dress,Clare W. Lau,Jeffrey G. Skibo,Gordon A. Christie

from arxiv, Accepted BMVC 17

Image segmentation is an important component of many image understanding systems. It aims to group pixels in a spatially and perceptually coherent manner. Typically, these algorithms have a collection of parameters that control the degree of over-segmentation produced. It still remains a challenge to properly select such parameters for human-like perceptual grouping. In this work, we exploit the diversity of segments produced by different choices of parameters. We scan the segmentation parameter space and generate a collection of image segmentation hypotheses (from highly over-segmented to under-segmented). These are fed into a cost minimization framework that produces the final segmentation by selecting segments that: (1) better describe the natural contours of the image, and (2) are more stable and persistent among all the segmentation hypotheses. We compare our algorithm's performance with state-of-the-art algorithms, showing that we can achieve improved results. We also show that our framework is robust to the choice of segmentation kernel that produces the initial set of hypotheses.

Neural Networks · 目標跟蹤 · 學成 · Networking · RNN ·

2018 年 1 月 6 日

Learning Hierarchical Features for Visual Object Tracking with Recursive Neural Networks

Li Wang,Ting Liu,Bing Wang,Xulei Yang,Gang Wang

Recently, deep learning has achieved very promising results in visual object tracking. Deep neural networks in existing tracking methods require a lot of training data to learn a large number of parameters. However, training data is not sufficient for visual object tracking as annotations of a target object are only available in the first frame of a test sequence. In this paper, we propose to learn hierarchical features for visual object tracking by using tree structure based Recursive Neural Networks (RNN), which have fewer parameters than other deep neural networks, e.g. Convolutional Neural Networks (CNN). First, we learn RNN parameters to discriminate between the target object and background in the first frame of a test sequence. Tree structure over local patches of an exemplar region is randomly generated by using a bottom-up greedy search strategy. Given the learned RNN parameters, we create two dictionaries regarding target regions and corresponding local patches based on the learned hierarchical features from both top and leaf nodes of multiple random trees. In each of the subsequent frames, we conduct sparse dictionary coding on all candidates to select the best candidate as the new target location. In addition, we online update two dictionaries to handle appearance changes of target objects. Experimental results demonstrate that our feature learning algorithm can significantly improve tracking performance on benchmark datasets.