A central problem in Binary Hypothesis Testing (BHT) is to determine the optimal tradeoff between the Type I error (referred to as false alarm) and the Type II error (referred to as miss). In this context, the exponential rate of convergence of the optimal miss error probability -- as the sample size tends to infinity -- given some (positive) restrictions on the false alarm probabilities is a fundamental theoretical question. Considering the more realistic context of a BHT with a finite number of observations, this paper presents a new non-asymptotic result for the scenario with a monotonic (sub-exponentially decreasing) restriction on the Type I error probability, which extends the result presented by Strassen in 2009. Building on the use of concentration inequalities, we offer new upper and lower bounds on the optimal Type II error probability for the case of finite observations. Finally, the derived bounds are evaluated and interpreted numerically (as a function of the number of samples) for some vanishing Type I error restrictions.
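
For orientation, the classical asymptotic benchmark for this tradeoff is the Chernoff-Stein lemma. It is a standard textbook result, not the finite-sample bound of the paper. The notation is assumed here for illustration: i.i.d. samples on a finite alphabet drawn from $P$ under the null hypothesis and from $Q$ under the alternative, a finite divergence $D(P \| Q)$, and $\beta_n(\epsilon)$ for the smallest miss probability achievable with $n$ samples when the false alarm probability is at most a fixed $\epsilon \in (0,1)$.

```latex
\lim_{n \to \infty} -\frac{1}{n} \log \beta_n(\epsilon)
  = D(P \,\|\, Q)
  = \sum_{x} P(x) \log \frac{P(x)}{Q(x)}
```

The limit does not depend on $\epsilon$, which is why non-asymptotic results that let the Type I restriction vanish with $n$ carry additional information.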

Related content

CASE · SMC · Graph · Block · Minimum point

November 12, 2021

Scheduling with Machine Conflicts

Moritz Buchem,Linda Kleist,Daniel Schmidt genannt Waldschmidt

from arxiv, 20 pages, 8 figures

We study the scheduling problem of makespan minimization while taking machine conflicts into account. Machine conflicts arise in various settings, e.g., shared resources for pre- and post-processing of tasks or spatial restrictions. In this context, each job has a blocking time before and after its processing time, i.e., three parameters. We seek conflict-free schedules in which the blocking times of no two jobs intersect on conflicting machines. Given a set of jobs, a set of machines, and a graph representing machine conflicts, the problem SchedulingWithMachineConflicts (SMC) asks for a conflict-free schedule of minimum makespan. We show that, unless $\textrm{P}=\textrm{NP}$, SMC on $m$ machines does not allow for an $\mathcal{O}(m^{1-\varepsilon})$-approximation algorithm for any $\varepsilon>0$, even in the case of identical jobs and every choice of fixed positive parameters, including the unit case. Complementarily, we provide approximation algorithms when a suitable collection of independent sets is given. Finally, we present polynomial-time algorithms to solve the problem for the case of unit jobs on special graph classes. Most prominently, we solve it for bipartite graphs by using structural insights for conflict graphs of star forests.

Functional · Example · Round · Range · Decomposition

November 12, 2021

An Axiomatic Approach to Formalized Responsibility Ascription

Sarah Hiller,Jonas Israel,Jobst Heitzig

Quantified responsibility ascription in complex scenarios is of crucial importance in current debates regarding collective action, for example in the face of various environmental crises. Within this endeavor, we recently proposed considering a probabilistic view of causation, rather than the deterministic views employed in much of the previous formal responsibility literature, and presented a corresponding framework as well as initial candidate functions applicable to a range of scenarios. In the current paper, we extend this contribution by formally evaluating the qualities of the proposed functions through an axiomatic approach. First, we decompose responsibility ascription into distinct contributing functions, before defining a number of desirable properties, or axioms, for these. Afterwards, we evaluate the proposed functions regarding compliance with these axioms. We find an incompatibility between axioms determining upper and lower bounds in one of the contributing functions, imposing a choice of one variant: upper bound or lower bound. For the other contributing function, we are able to axiomatically characterize one specific function. As the previously mentioned incompatibility extends to the combined responsibility function, we finally present maximally axiom-compliant combined functions for each variant, including the upper bound axiom or including the lower bound axiom.

Jensen's inequality · Estimation/Estimator · INFORMS · Mutual information · Convex function

November 12, 2021

A Reverse Jensen Inequality Result with Application to Mutual Information Estimation

Gerhard Wunder,Benedikt Groß,Rick Fritschek,Rafael F. Schaefer

from arxiv, 6 pages, ITW 2021

The Jensen inequality is a widely used tool in a multitude of fields, such as information theory and machine learning. It can also be used to derive other standard inequalities such as the inequality of arithmetic and geometric means or the H\"older inequality. In a probabilistic setting, the Jensen inequality describes the relationship between a convex function and the expected value. In this work, we look at the probabilistic setting from the reverse direction of the inequality. We show that under minimal constraints and with a proper scaling, the Jensen inequality can be reversed. We believe that the resulting tool can be helpful for many applications and provide a variational estimation of mutual information, where the reverse inequality leads to a new estimator with superior training behavior compared to current estimators.
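
As a reminder of the forward direction, the standard statement and the textbook derivation of the arithmetic-geometric mean inequality can be written as follows. This is a sketch of the classical result only, not of the reversed inequality proved in the paper. The symbols $\varphi$, $X$ and $x_1, \dots, x_n$ are notation assumed here: $\varphi$ is convex, $X$ is integrable, and the $x_i$ are positive reals weighted uniformly.

```latex
% Jensen's inequality for a convex function \varphi
\varphi\big(\mathbb{E}[X]\big) \le \mathbb{E}\big[\varphi(X)\big]

% Applied to the concave function \log (so the inequality flips)
\log\!\Big(\frac{1}{n}\sum_{i=1}^{n} x_i\Big)
  \ge \frac{1}{n}\sum_{i=1}^{n} \log x_i
  = \log\!\Big(\prod_{i=1}^{n} x_i\Big)^{1/n}

% Exponentiating both sides gives AM >= GM
\frac{1}{n}\sum_{i=1}^{n} x_i \ge \Big(\prod_{i=1}^{n} x_i\Big)^{1/n}
```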

Approximation · Shaping · Networking · Identifiable · Performer

November 12, 2021

Generalized Convexity Properties and Shape Based Approximation in Networks Reliability

Gabriela Cristescu,Vlad-Florin Dragoi,Sorin-Horatiu Hoara

from arxiv, 22 pages, 5 figures, 5 tables

Some properties of generalized convexity for sets and for functions are identified in the case of the reliability polynomials of two dual minimal networks. A method of approximating the reliability polynomials of two dual minimal networks is developed based on their mutual complementarity properties. The approximating objects are from the class of quadratic spline functions, constructed based both on interpolation conditions and on shape knowledge. It is proved that the approximant objects preserve the shape properties of the exact reliability polynomials. Numerical examples and simulations show the performance of the algorithm in terms of low complexity, small error, and shape preservation. Possibilities of increasing the accuracy of approximation are discussed.

Momentum · tuning · Optimizer · INTERACT · Attention mechanism

November 11, 2021

Convergence and Stability of the Stochastic Proximal Point Algorithm with Momentum

Junhyung Lyle Kim,Panos Toulis,Anastasios Kyrillidis

Stochastic gradient descent with momentum (SGDM) is the dominant algorithm in many optimization scenarios, including convex optimization instances and non-convex neural network training. Yet, in the stochastic setting, momentum interferes with gradient noise, often requiring specific step size and momentum choices in order to guarantee convergence, let alone acceleration. Proximal point methods, on the other hand, have gained much attention due to their numerical stability and elasticity against imperfect tuning. Their stochastic accelerated variants, though, have received limited attention: how momentum interacts with the stability of (stochastic) proximal point methods remains largely unstudied. To address this, we focus on the convergence and stability of the stochastic proximal point algorithm with momentum (SPPAM), and show that SPPAM allows a faster linear convergence rate compared to the stochastic proximal point algorithm (SPPA) with a better contraction factor, under proper hyperparameter tuning. In terms of stability, we show that SPPAM depends on problem constants more favorably than SGDM, allowing a wider range of step sizes and momentum values that lead to convergence.
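
The stability claim for proximal point methods can be illustrated on a toy problem. The following is a minimal, deterministic sketch and not the SPPAM algorithm of the paper: it has no momentum and no noise. It assumes the one-dimensional quadratic f(x) = 0.5 * a * x^2, and the function names are hypothetical. For this f, the proximal step has the closed form x / (1 + eta * a), which contracts for every positive step size, whereas the gradient step x * (1 - eta * a) diverges once eta exceeds 2 / a.

```python
def gradient_step(x, a, eta):
    """One gradient descent step on f(x) = 0.5 * a * x**2."""
    return x - eta * a * x


def proximal_step(x, a, eta):
    """One proximal point step on f(x) = 0.5 * a * x**2.

    Solves argmin_y 0.5 * a * y**2 + (y - x)**2 / (2 * eta).
    Setting the derivative to zero gives a * y + (y - x) / eta = 0,
    hence y = x / (1 + eta * a).
    """
    return x / (1.0 + eta * a)


def run(step, x0, a, eta, iters):
    """Apply the given step function iters times starting from x0."""
    x = x0
    for _ in range(iters):
        x = step(x, a, eta)
    return x


if __name__ == "__main__":
    a, eta = 1.0, 2.5  # eta is deliberately larger than 2 / a
    print("gradient descent:", run(gradient_step, 1.0, a, eta, 50))
    print("proximal point:  ", run(proximal_step, 1.0, a, eta, 50))
```

With this deliberately poor step size, the gradient iterates grow in magnitude while the proximal iterates shrink toward the minimizer at zero.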

MoDELS · Reducible · Logistic regression · binary · CASE

November 11, 2021

Minding non-collapsibility of odds ratios when recalibrating risk prediction models

Mohsen Sadatsafavi,Hamid Tavakoli,Abdollah Safari

from arxiv, 12 Pages, 1 Figure, 1567 words

In clinical prediction modeling, model updating refers to the practice of modifying a prediction model before it is used in a new setting. In the context of logistic regression for a binary outcome, one of the simplest updating methods is a fixed odds-ratio transformation of predicted risks to improve calibration-in-the-large. Previous authors have proposed equations for calculating this odds ratio based on the discrepancy between the prevalence in the original and the new population, or between the average of predicted and observed risks. We show that this method fails to consider the non-collapsibility of the odds ratio. Consequently, it under-corrects predicted risks, especially when predicted risks are more dispersed (i.e., for models with good discrimination). We suggest an approximate equation for recovering the conditional odds ratio from the mean and variance of predicted risks. Brief simulations and a case study show that this approach reduces under-correction, sometimes substantially. R code for implementation is provided.
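
The under-correction can be reproduced with a few lines of arithmetic. The following is a minimal sketch of the prevalence-based updating method that the abstract criticizes. It is written in Python rather than the R code mentioned by the authors, and it does not implement their proposed approximate equation. The function names and the three example risks are assumptions made for illustration.

```python
def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)


def apply_odds_ratio(p, odds_ratio):
    """Multiply the odds of p by odds_ratio and convert back to a probability."""
    return odds_ratio * p / (1.0 - p + odds_ratio * p)


def naive_odds_ratio(risks, target_prevalence):
    """Odds ratio between the target prevalence and the mean predicted risk.

    This is a marginal odds ratio; applying it to individual risks ignores
    non-collapsibility.
    """
    mean_risk = sum(risks) / len(risks)
    return odds(target_prevalence) / odds(mean_risk)


def recalibrate(risks, target_prevalence):
    """Apply the naive fixed odds-ratio update to every predicted risk."""
    ratio = naive_odds_ratio(risks, target_prevalence)
    return [apply_odds_ratio(p, ratio) for p in risks]


if __name__ == "__main__":
    target = 0.6
    for risks in ([0.5, 0.5, 0.5], [0.1, 0.5, 0.9]):
        updated = recalibrate(risks, target)
        print(risks, "-> mean updated risk:", sum(updated) / len(updated))
```

When all predicted risks are equal, the updated mean reaches the target. When they are dispersed around the same mean, the updated mean falls short of the target, which is the under-correction described above.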

Minimum point · Isotropic · Overfitting · Sparse · Computational learning theory

November 10, 2021

Tight bounds for minimum l1-norm interpolation of noisy data

Guillaume Wang,Konstantin Donhauser,Fanny Yang

from arxiv, 29 pages, 1 figure

We provide matching upper and lower bounds of order $\sigma^2/\log(d/n)$ for the prediction error of the minimum $\ell_1$-norm interpolator, a.k.a. basis pursuit. Our result is tight up to negligible terms when $d \gg n$, and is the first to imply asymptotic consistency of noisy minimum-norm interpolation for isotropic features and sparse ground truths. Our work complements the literature on "benign overfitting" for minimum $\ell_2$-norm interpolation, where asymptotic consistency can be achieved only when the features are effectively low-dimensional.

MoDELS · Machine Learning · Learned · Linear · Linear model

September 6, 2021

A Farewell to the Bias-Variance Tradeoff? An Overview of the Theory of Overparameterized Machine Learning

Yehuda Dar,Vidya Muthukumar,Richard G. Baraniuk

The rapid recent progress in machine learning (ML) has raised a number of scientific questions that challenge the longstanding dogma of the field. One of the most important riddles is the good empirical generalization of overparameterized models. Overparameterized models are excessively complex with respect to the size of the training dataset, which results in them perfectly fitting (i.e., interpolating) the training data, which is usually noisy. Such interpolation of noisy data is traditionally associated with detrimental overfitting, and yet a wide range of interpolating models -- from simple linear models to deep neural networks -- have recently been observed to generalize extremely well on fresh test data. Indeed, the recently discovered double descent phenomenon has revealed that highly overparameterized models often improve over the best underparameterized model in test performance. Understanding learning in this overparameterized regime requires new theory and foundational empirical studies, even for the simplest case of the linear model. The underpinnings of this understanding have been laid in very recent analyses of overparameterized linear regression and related statistical learning tasks, which resulted in precise analytic characterizations of double descent. This paper provides a succinct overview of this emerging theory of overparameterized ML (henceforth abbreviated as TOPML) that explains these recent findings through a statistical signal processing perspective. We emphasize the unique aspects that define the TOPML research area as a subfield of modern ML theory and outline interesting open questions that remain.

Link prediction · Graph · Attention mechanism · Extensibility · Performer

May 18, 2021

Link Prediction on N-ary Relational Facts: A Graph-based Approach

Quan Wang,Haifeng Wang,Yajuan Lyu,Yong Zhu

from arxiv, Accepted to Findings of ACL 2021

Link prediction on knowledge graphs (KGs) is a key research topic. Previous work mainly focused on binary relations, paying less attention to higher-arity relations although they are ubiquitous in real-world KGs. This paper considers link prediction upon n-ary relational facts and proposes a graph-based approach to this task. The key to our approach is to represent the n-ary structure of a fact as a small heterogeneous graph, and model this graph with edge-biased fully-connected attention. The fully-connected attention captures universal inter-vertex interactions, while the edge-aware attentive biases specifically encode the graph structure and its heterogeneity. In this fashion, our approach fully models global and local dependencies in each n-ary fact, and hence can more effectively capture associations therein. Extensive evaluation verifies the effectiveness and superiority of our approach. It performs substantially and consistently better than the current state of the art across a variety of n-ary relational benchmarks. Our code is publicly available.

Performer · Estimation/Estimator · Empirical risk minimization · Empirical risk · Variance

December 14, 2017

Variance-based regularization with convex objectives

John Duchi,Hongseok Namkoong

We develop an approach to risk minimization and stochastic optimization that provides a convex surrogate for variance, allowing near-optimal and computationally efficient trading between approximation and estimation error. Our approach builds off of techniques for distributionally robust optimization and Owen's empirical likelihood, and we provide a number of finite-sample and asymptotic results characterizing the theoretical performance of the estimator. In particular, we show that our procedure comes with certificates of optimality, achieving (in some scenarios) faster rates of convergence than empirical risk minimization by virtue of automatically balancing bias and variance. We give corroborating empirical evidence showing that in practice, the estimator indeed trades between variance and absolute performance on a training sample, improving out-of-sample (test) performance over standard empirical risk minimization for a number of classification problems.