This work is devoted to the construction of a new interval arithmetic that combines algorithmic efficiency with high-quality estimation of the ranges of expressions.
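For context, the following is a minimal Python sketch of classical interval arithmetic, not the new arithmetic constructed in this work; it illustrates the dependency problem that makes naively evaluated ranges overly wide, which is precisely what an improved interval arithmetic aims to mitigate.

```python
# A minimal sketch of *classical* interval arithmetic (not the new
# arithmetic proposed in this work), illustrating why naive evaluation
# overestimates the ranges of expressions.

class Interval:
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other):
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval(min(products), max(products))

    def __repr__(self):
        return f"[{self.lo}, {self.hi}]"

x = Interval(0.0, 1.0)
# The "dependency problem": x - x evaluates to [-1, 1] instead of the true
# range [0, 0], because the two occurrences of x are treated as independent.
print(x - x)   # [-1.0, 1.0]
print(x * x)   # [0.0, 1.0]
```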
Age of information (AoI) is an effective performance metric that measures the freshness of information and is popular for applications involving status updates. Most existing works adopt the average AoI as the metric, which cannot provide strict performance guarantees. In this work, the outage probability of the peak AoI exceeding a given threshold is analyzed in a multi-source system under round-robin scheduling. Two queueing disciplines are considered, namely the first-come-first-serve (FCFS) queue and the single-packet queue. For FCFS, upper and lower bounds on the outage probability are derived that coincide asymptotically, characterizing its true scaling. For the single-packet queue, an upper bound is derived whose effectiveness is validated by simulation. The analysis concretizes the common belief that single-packet queueing has better AoI performance than FCFS. Moreover, it also reveals that the two disciplines have similar asymptotic performance when the inter-arrival time is much larger than the total transmission time.
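As a rough illustration of the quantity being bounded, the following hedged Monte Carlo sketch estimates the peak-AoI outage probability for a single source served by an FCFS queue with Poisson arrivals and exponential service; the rates and threshold are illustrative, and this simplified single-source setup merely stands in for the paper's multi-source round-robin system.

```python
import numpy as np

# Hedged Monte Carlo sketch: outage probability of the peak AoI for one
# source in an FCFS M/M/1-style queue. lam, mu, and the threshold are
# illustrative parameters, not values from the paper.
rng = np.random.default_rng(0)
lam, mu, threshold, n = 0.5, 1.0, 8.0, 200_000

gen = np.cumsum(rng.exponential(1 / lam, n))      # generation (arrival) times
service = rng.exponential(1 / mu, n)              # service times
dep = np.empty(n)                                 # departures via Lindley recursion
dep[0] = gen[0] + service[0]
for i in range(1, n):
    dep[i] = max(dep[i - 1], gen[i]) + service[i]

# Peak AoI just before update i is delivered: D_i - G_{i-1}.
peak_aoi = dep[1:] - gen[:-1]
print("P(peak AoI > threshold):", np.mean(peak_aoi > threshold))
```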
This work studies the problem of transfer learning under the functional linear model framework, which aims to improve the fit of the target model by leveraging knowledge from related source models. We measure the relatedness between target and source models using Reproducing Kernel Hilbert Spaces (RKHS), allowing the type of knowledge being transferred to be interpreted via the structure of the spaces. Two algorithms are proposed: one transfers knowledge when the set of transferable sources is known, while the other uses aggregation to achieve knowledge transfer without prior information about the sources. Furthermore, we establish the optimal convergence rates for excess risk, making the statistical gain from transfer learning mathematically provable. The effectiveness of the proposed algorithms is demonstrated on synthetic data as well as real financial data.
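To convey the flavour of knowledge transfer in this setting, here is a hedged sketch of a generic two-step offset transfer for a discretized functional linear model: fit the slope on abundant source data, then correct it with a heavily regularized ridge fit on scarce target data. All parameters are illustrative, and this is not the paper's RKHS-based algorithm or its aggregation variant.

```python
import numpy as np

# Hedged sketch of a generic two-step transfer for a discretized functional
# linear model y = <X, beta> + eps. Illustrative only; not the paper's method.
rng = np.random.default_rng(1)
T = 50                                            # grid points of the curve X
t = np.linspace(0, 1, T)
beta_src = np.sin(2 * np.pi * t) / T              # source slope (discretized)
beta_tgt = beta_src + 0.1 * np.cos(np.pi * t) / T # nearby target slope

def sample(beta, n, noise=0.3):
    X = rng.standard_normal((n, T))
    return X, X @ beta + noise * rng.standard_normal(n)

def ridge(X, y, lam):
    n = len(y)
    return np.linalg.solve(X.T @ X + lam * n * np.eye(T), X.T @ y)

Xs, ys = sample(beta_src, 5000)                   # abundant source data
Xt, yt = sample(beta_tgt, 100)                    # scarce target data

b_src = ridge(Xs, ys, 1e-3)                       # step 1: source fit
delta = ridge(Xt, yt - Xt @ b_src, 10.0)          # step 2: small, heavily
b_transfer = b_src + delta                        #   regularized correction

print("target-only error:", np.linalg.norm(ridge(Xt, yt, 0.1) - beta_tgt))
print("transfer error:   ", np.linalg.norm(b_transfer - beta_tgt))
```

Because the source fit pins down the bulk of the slope, the transfer estimate is typically closer to the target slope than a fit on the target data alone.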
Functional linear and single-index models are core regression methods in functional data analysis, widely used to perform regression when the covariates are observed random functions coupled with scalar responses. In the existing literature, however, the construction of the associated estimators and the study of their theoretical properties are invariably carried out on a case-by-case basis for the specific models under consideration. In this work, we provide a unified methodological and theoretical framework for estimating the index in functional linear and single-index models; in the latter case, the proposed approach does not require specification of the link function. In terms of methodology, we show that the reproducing kernel Hilbert space (RKHS) based functional linear least-squares estimator, when viewed through the lens of an infinite-dimensional Gaussian Stein's identity, also provides an estimator of the index of the single-index model. On the theoretical side, we characterize the convergence rates of the proposed estimators for both linear and single-index models. Our analysis has several key advantages: (i) we do not require restrictive commutativity assumptions between the covariance operator of the random covariates and the integral operator associated with the reproducing kernel; and (ii) we allow the true index parameter to lie outside of the chosen RKHS, thereby accommodating index mis-specification as well as quantifying the degree of such mis-specification. Several existing results emerge as special cases of our analysis.
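The Stein's-identity observation can be illustrated in a simple finite-dimensional analogue: with Gaussian covariates, a plain linear least-squares fit recovers the direction of the single-index parameter even though the link function is unknown and nonlinear. The sketch below is only this analogue, not the infinite-dimensional RKHS estimator analyzed in the paper.

```python
import numpy as np

# Finite-dimensional analogue of the Stein's-identity argument: for standard
# Gaussian X, E[g(X^T beta) X] is proportional to beta, so the linear
# least-squares fit points in the direction of the index.
rng = np.random.default_rng(2)
d, n = 10, 5000
beta = rng.standard_normal(d)
beta /= np.linalg.norm(beta)

X = rng.standard_normal((n, d))                        # Gaussian design (essential)
y = np.tanh(X @ beta) + 0.1 * rng.standard_normal(n)   # unknown nonlinear link

b_ls, *_ = np.linalg.lstsq(X, y, rcond=None)           # plain linear least squares
b_ls /= np.linalg.norm(b_ls)

print("cosine with true index:", abs(b_ls @ beta))     # close to 1
```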
Due to the widespread use of complex machine learning models in real-world applications, it is becoming critical to explain model predictions. However, these models are typically black-box deep neural networks, explained post-hoc via methods with known faithfulness limitations. Generalized Additive Models (GAMs) are an inherently interpretable class of models that address this limitation by learning a non-linear shape function for each feature separately, followed by a linear model on top. However, these models are typically hard to train, require numerous parameters, and are difficult to scale. We propose an entirely new subfamily of GAMs that utilizes basis decomposition of shape functions. A small number of basis functions are shared among all features and are learned jointly for a given task, making our model scale much better to large-scale data with high-dimensional features, especially when features are sparse. We propose an architecture denoted as the Neural Basis Model (NBM), which uses a single neural network to learn these bases. On a variety of tabular and image datasets, we demonstrate that for interpretable machine learning, NBMs are the state of the art in accuracy, model size, and throughput, and can easily model all higher-order feature interactions. Source code is available at //github.com/facebookresearch/nbm-spam.
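The basis-decomposition idea can be sketched in a few lines of PyTorch: a single small network maps each scalar feature to a handful of shared basis values, and each feature keeps only a vector of linear coefficients on those bases. Layer sizes below are illustrative rather than the paper's exact architecture; see the linked repository for the reference implementation.

```python
import torch
import torch.nn as nn

# Hedged sketch of the Neural Basis Model idea: one shared network produces
# B basis values per scalar feature; each feature has its own coefficients.
class NeuralBasisModel(nn.Module):
    def __init__(self, num_features, num_bases=16, hidden=64):
        super().__init__()
        self.basis = nn.Sequential(               # shared across all features
            nn.Linear(1, hidden), nn.ReLU(),
            nn.Linear(hidden, num_bases),
        )
        # per-feature coefficients on the shared bases, plus a global bias
        self.coef = nn.Parameter(torch.zeros(num_features, num_bases))
        self.bias = nn.Parameter(torch.zeros(1))

    def forward(self, x):                         # x: (batch, num_features)
        b, f = x.shape
        bases = self.basis(x.reshape(b * f, 1)).reshape(b, f, -1)
        shape_fns = (bases * self.coef).sum(-1)   # (batch, features): f_j(x_j)
        return shape_fns.sum(-1) + self.bias      # additive model output

model = NeuralBasisModel(num_features=8)
print(model(torch.randn(4, 8)).shape)             # torch.Size([4])
```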
Most machine learning methods are used as a black box for modelling. One way to extract some knowledge from the model is to use physics-based training methods, such as neural ODEs (ordinary differential equations). Neural ODEs offer advantages such as a possibly richer class of represented functions, extended interpretability compared to black-box machine learning models, and the ability to describe both trend and local behaviour. Such advantages are especially critical for time series with complicated trends. However, a known drawback is the long training time compared to the autoregressive models and long short-term memory (LSTM) networks widely used for data-driven time-series modelling. Therefore, we need to balance interpretability and training time to apply neural ODEs in practice. The paper shows that modern neural ODEs cannot be reduced to simpler models for time-series modelling applications; their complexity matches or exceeds that of conventional time-series modelling tools. The only interpretation that can be extracted is the eigenspace of the operator, which is an ill-posed problem for a large system. Spectra can instead be extracted using classical analysis methods that do not suffer from extended training time. Consequently, we reduce the neural ODE to a simpler linear form and propose a new view on time-series modelling that combines neural networks with an ODE-system approach.
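The reduced linear form admits a compact sketch: fit a linear ODE dx/dt = Ax to an observed trajectory by least squares on finite differences, then read trend and oscillation off the spectrum of A. This is a DMD-style baseline under the assumption of linear dynamics, not the full pipeline of the paper.

```python
import numpy as np

# Hedged sketch of the "reduced linear form": recover A in dx/dt = A x from a
# trajectory via least squares on finite differences, then inspect its spectrum.
rng = np.random.default_rng(3)
A_true = np.array([[0.0, 1.0], [-1.0, -0.1]])     # damped oscillator
dt, steps = 0.01, 2000

x = np.empty((steps, 2))
x[0] = [1.0, 0.0]
for k in range(steps - 1):                        # forward-Euler trajectory
    x[k + 1] = x[k] + dt * (A_true @ x[k])

dxdt = (x[1:] - x[:-1]) / dt                      # finite-difference derivative
M, *_ = np.linalg.lstsq(x[:-1], dxdt, rcond=None) # solve dxdt ~ x @ M
A_hat = M.T

# Eigenvalues encode decay (real part) and oscillation (imaginary part).
print("estimated spectrum:", np.linalg.eigvals(A_hat))
print("true spectrum:     ", np.linalg.eigvals(A_true))
```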
In data-parallel optimization of machine learning models, workers collaborate to improve their estimates of the model: more accurate gradients allow them to use larger learning rates and optimize faster. We consider the setting in which all workers sample from the same dataset, and communicate over a sparse graph (decentralized). In this setting, current theory fails to capture important aspects of real-world behavior. First, the 'spectral gap' of the communication graph is not predictive of its empirical performance in (deep) learning. Second, current theory does not explain that collaboration enables larger learning rates than training alone. In fact, it prescribes smaller learning rates, which further decrease as graphs become larger, failing to explain convergence in infinite graphs. This paper aims to paint an accurate picture of sparsely-connected distributed optimization when workers share the same data distribution. We quantify how the graph topology influences convergence in a quadratic toy problem and provide theoretical results for general smooth and (strongly) convex objectives. Our theory matches empirical observations in deep learning, and accurately describes the relative merits of different graph topologies.
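As a toy illustration of how topology enters the analysis, the sketch below runs decentralized gradient descent on an isotropic quadratic with gossip averaging over a ring graph; the mixing matrix W is where the graph structure appears. Parameters are illustrative, and this is not the paper's experimental setup.

```python
import numpy as np

# Hedged toy sketch: decentralized gradient descent on f(x) = ||x||^2 / 2 with
# gossip averaging over a ring. Topology enters only through the matrix W.
rng = np.random.default_rng(4)
workers, d, steps, lr = 16, 10, 200, 0.1

# Ring gossip matrix: each worker averages with its two neighbours.
W = np.zeros((workers, workers))
for i in range(workers):
    W[i, i] = 0.5
    W[i, (i - 1) % workers] = 0.25
    W[i, (i + 1) % workers] = 0.25

x = rng.standard_normal((workers, d))             # local iterates
for _ in range(steps):
    grad = x + 0.1 * rng.standard_normal((workers, d))  # noisy gradients
    x = W @ (x - lr * grad)                       # local step, then gossip

print("distance of averaged iterate to optimum:", np.linalg.norm(x.mean(0)))
```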
Estimating the conditional quantile of a variable of interest with respect to changes in the covariates is common in many economic applications, as it can offer comprehensive insight. In this paper, we propose a novel semiparametric model averaging method to predict the conditional quantile even if all models under consideration are potentially misspecified. Specifically, we first build a series of non-nested partially linear sub-models, each with a different nonlinear component. Then a leave-one-out cross-validation criterion is applied to choose the model weights. Under some regularity conditions, we prove that the resulting model averaging estimator is asymptotically optimal in terms of minimizing the out-of-sample average quantile prediction error. Our modelling strategy not only effectively avoids the problem of specifying which covariate should be nonlinear when one fits a partially linear model, but also results in more accurate prediction than traditional model-based procedures because of the optimality of the weights selected by the cross-validation criterion. Simulation experiments and an illustrative application show that the proposed model averaging method is superior to other commonly used alternatives.
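The weight-selection step can be sketched as follows: given leave-one-out quantile predictions from a set of candidate sub-models, choose simplex weights minimizing the average check (pinball) loss. The candidate fits are faked with noisy predictions here, so this is only a schematic of the criterion, not the full partially linear estimation procedure.

```python
import numpy as np
from scipy.optimize import minimize

# Hedged sketch of cross-validation weight selection for quantile prediction.
def pinball(u, tau):
    return np.maximum(tau * u, (tau - 1) * u)     # check (pinball) loss

def cv_weights(loo_preds, y, tau):
    """loo_preds: (n, M) leave-one-out predictions; y: (n,) responses."""
    M = loo_preds.shape[1]
    obj = lambda w: pinball(y - loo_preds @ w, tau).mean()
    cons = ({'type': 'eq', 'fun': lambda w: w.sum() - 1},)
    res = minimize(obj, np.full(M, 1 / M), bounds=[(0, 1)] * M,
                   constraints=cons, method='SLSQP')
    return res.x

rng = np.random.default_rng(5)
y = rng.standard_normal(200)
loo_preds = y[:, None] + rng.standard_normal((200, 3))  # three fake sub-models
print("selected weights:", cv_weights(loo_preds, y, tau=0.5))
```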
In selection processes such as hiring, promotion, and college admissions, implicit bias toward socially salient attributes such as race, gender, or sexual orientation of candidates is known to produce persistent inequality and reduce aggregate utility for the decision maker. Interventions such as the Rooney Rule and its generalizations, which require the decision maker to select at least a specified number of individuals from each affected group, have been proposed to mitigate the adverse effects of implicit bias in selection. Recent works have established that such lower-bound constraints can be very effective in improving aggregate utility when each individual belongs to at most one affected group. However, in several settings, individuals may belong to multiple affected groups and, consequently, face more extreme implicit bias due to this intersectionality. We consider independently drawn utilities and show that, in the intersectional case, the aforementioned non-intersectional constraints can only recover part of the total utility achievable in the absence of implicit bias. On the other hand, we show that if one includes appropriate lower-bound constraints on the intersections, almost all of the utility achievable in the absence of implicit bias can be recovered. Thus, intersectional constraints can offer a significant advantage over a reductionist, dimension-by-dimension non-intersectional approach to reducing inequality.
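A hedged Monte Carlo sketch of the phenomenon: true utilities are drawn i.i.d., implicit bias multiplicatively discounts observed utility once per affected attribute, and top-k selection is compared with and without proportional lower-bound constraints on the intersections. The bias factors, group shares, and selection size are illustrative, not the paper's exact model.

```python
import numpy as np

# Illustrative simulation of intersectional implicit bias in top-k selection.
rng = np.random.default_rng(6)
n, k, b1, b2 = 10_000, 100, 0.7, 0.7              # assumed bias factors < 1

true_u = rng.exponential(1.0, n)
g1 = rng.random(n) < 0.3                          # affected attribute 1
g2 = rng.random(n) < 0.3                          # affected attribute 2
observed = true_u * np.where(g1, b1, 1.0) * np.where(g2, b2, 1.0)

def top(idx, scores, m):
    # indices of the m largest scores within idx
    return idx[np.argsort(scores[idx])[::-1][:m]]

best = true_u[top(np.arange(n), true_u, k)].sum()       # no-bias benchmark
naive = true_u[top(np.arange(n), observed, k)].sum()    # biased, unconstrained

# Intersectional constraint: give each intersection its proportional share
# of the k slots, filled by observed utility within that intersection.
chosen = []
for s1 in (False, True):
    for s2 in (False, True):
        idx = np.flatnonzero((g1 == s1) & (g2 == s2))
        chosen.extend(top(idx, observed, round(k * len(idx) / n)))
constrained = true_u[np.array(chosen)].sum()

print("fraction of no-bias utility, unconstrained: ", naive / best)
print("fraction of no-bias utility, intersectional:", constrained / best)
```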
Learning interpretable representations of neural dynamics at a population level is a crucial first step to understanding how neural activity relates to perception and behavior. Models of neural dynamics often focus either on low-dimensional projections of neural activity or on learning dynamical systems that explicitly relate to the neural state over time. We discuss how these two approaches are interrelated by considering dynamical systems as representative of flows on a low-dimensional manifold. Building on this concept, we propose a new decomposed dynamical system model that represents complex non-stationary and nonlinear dynamics of time-series data as a sparse combination of simpler, more interpretable components. The decomposed nature of the dynamics generalizes previous switched approaches and enables modeling of overlapping and non-stationary drifts in the dynamics. We further present a dictionary-learning-driven approach to model fitting, where we leverage recent results on tracking sparse vectors over time. We demonstrate that our model can learn efficient representations and smooth transitions between dynamical modes in both continuous-time and discrete-time examples. We show results on low-dimensional linear and nonlinear attractors to demonstrate that our decomposed dynamical systems model can approximate nonlinear dynamics well. Additionally, we apply our model to C. elegans data, illustrating a diversity of dynamics that is obscured when classified into discrete states.
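The sparse-combination principle can be illustrated with a static analogue: express observed one-step increments as a sparse combination of fixed candidate linear operators, with coefficients fit by a LASSO. The paper additionally learns the dictionary and tracks time-varying coefficients; this sketch assumes a fixed dictionary and constant coefficients.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Hedged static analogue of decomposed dynamics: dx = sum_k c_k * (ops_k @ x),
# with only a few active components recovered by sparse regression.
rng = np.random.default_rng(7)
d, n_ops, T = 3, 10, 500

ops = rng.standard_normal((n_ops, d, d)) * 0.1    # dictionary of linear modes
c_true = np.zeros(n_ops)
c_true[[1, 4]] = [1.0, -0.5]                      # only two active components
A = np.tensordot(c_true, ops, axes=1)             # effective dynamics

X = rng.standard_normal((T, d))                   # observed states
dX = X @ A.T + 0.01 * rng.standard_normal((T, d)) # noisy one-step increments

# Regression targets: increments; features: each candidate op applied to x.
dx = dX.ravel()
F = np.stack([(X @ op.T).ravel() for op in ops], axis=1)

c_hat = Lasso(alpha=1e-3, fit_intercept=False).fit(F, dx).coef_
print("recovered coefficients:", np.round(c_hat, 2))  # sparse, near c_true
```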
Since deep neural networks were developed, they have made huge contributions to everyday life. Machine learning provides more rational advice than humans are capable of in almost every aspect of daily life. However, despite this achievement, the design and training of neural networks are still challenging and unpredictable procedures. To lower the technical threshold for common users, automated hyper-parameter optimization (HPO) has become a popular topic in both academia and industry. This paper provides a review of the most essential topics in HPO. The first section introduces the key hyper-parameters related to model training and structure, and discusses their importance and methods to define their value ranges. Then, the review focuses on major optimization algorithms and their applicability, covering their efficiency and accuracy, especially for deep learning networks. This study next reviews major services and toolkits for HPO, comparing their support for state-of-the-art search algorithms, compatibility with major deep learning frameworks, and extensibility for new modules designed by users. The paper concludes with problems that arise when HPO is applied to deep learning, a comparison between optimization algorithms, and prominent approaches for model evaluation with limited computational resources.
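As a concrete baseline among the surveyed algorithms, the sketch below implements plain random search: sample configurations from declared ranges and keep the best by validation score. The train_and_validate routine is a hypothetical stand-in for a real training run.

```python
import numpy as np

# Hedged minimal sketch of random-search HPO. The objective is a synthetic
# response surface standing in for an actual training-and-validation run.
rng = np.random.default_rng(8)

def sample_config():
    return {
        "lr": 10 ** rng.uniform(-5, -1),          # log-uniform learning rate
        "batch_size": int(rng.choice([32, 64, 128, 256])),
        "dropout": rng.uniform(0.0, 0.5),
    }

def train_and_validate(cfg):                      # hypothetical placeholder
    return -(np.log10(cfg["lr"]) + 3) ** 2 - (cfg["dropout"] - 0.2) ** 2

best_cfg, best_score = None, -np.inf
for _ in range(50):
    cfg = sample_config()
    score = train_and_validate(cfg)
    if score > best_score:
        best_cfg, best_score = cfg, score

print("best configuration found:", best_cfg)
```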