Can two separate case-control studies, one about Hepatitis disease and the other about Fibrosis, for example, be combined? It would be hugely beneficial if two or more separately conducted case-control studies, even ones run for entirely unrelated purposes, could be merged into a unified analysis with better statistical properties, e.g., more accurate parameter estimation. In this paper, we show that, under the popular logistic regression model, the combined/integrative analysis produces a more accurate estimate of the slope parameters than a single case-control study. It is known that, in a single logistic case-control study, the intercept is not identifiable, in contrast to prospective studies. In combined case-control studies, however, the intercepts are proved to be identifiable under mild conditions. The resulting maximum likelihood estimates of the intercepts and slopes are proved to be consistent and asymptotically normal, with asymptotic variances achieving the semiparametric efficiency lower bound.
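As a toy illustration of such a pooled fit (not the paper's estimator; the sample sizes and coefficients below are invented, and the data are simulated prospectively rather than by case-control sampling), one can stack the two samples, give each study its own intercept via a study indicator, and share the slopes:

```python
# A minimal illustration (not the paper's estimator): stack two
# case-control samples, give each study its own intercept through a
# study indicator, and share the slope parameters across studies.
# Data are simulated prospectively here purely to have something to fit.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n1, n2, d = 500, 400, 3
beta = np.array([1.0, -0.5, 0.2])            # slopes shared by both studies
X1, X2 = rng.normal(size=(n1, d)), rng.normal(size=(n2, d))
y1 = rng.binomial(1, 1 / (1 + np.exp(-(X1 @ beta))))        # intercept 0.0
y2 = rng.binomial(1, 1 / (1 + np.exp(-(X2 @ beta - 0.3))))  # intercept -0.3

# Design: common intercept, study-2 dummy (its coefficient is the
# difference between the two intercepts), and the shared covariates.
design = np.column_stack([
    np.ones(n1 + n2),
    np.r_[np.zeros(n1), np.ones(n2)],
    np.vstack([X1, X2]),
])
y = np.r_[y1, y2]

fit = sm.Logit(y, design).fit(disp=0)
print(fit.params)   # [intercept_1, intercept_2 - intercept_1, slopes...]
```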

Related content

In the Bayes paradigm and for a given loss function, we propose the construction of a new type of posterior distribution for estimating the law of an $n$-sample. The loss functions we have in mind are based on the total variation distance, the Hellinger distance, as well as some $\mathbb{L}_{j}$-distances. We prove that, with a probability close to one, this new posterior distribution concentrates its mass in a neighbourhood of the law of the data, for the chosen loss function, provided that this law belongs to the support of the prior or, at least, lies close enough to it. We therefore establish that the new posterior distribution enjoys some robustness properties with respect to a possible misspecification of the prior or, more precisely, of its support. For the total variation and squared Hellinger losses, we also show that the posterior distribution keeps its concentration properties when the data are only independent, hence not necessarily i.i.d., provided that most of their marginals are close enough to some probability distribution around which the prior puts enough mass. The posterior distribution is therefore also stable with respect to the equidistribution assumption. We illustrate these results with several applications. We consider the problems of estimating a location parameter or both the location and the scale of a density in a nonparametric framework. Finally, we also tackle the problem of estimating a density, with the squared Hellinger loss, in a high-dimensional parametric model under some sparsity conditions. The results established in this paper are non-asymptotic and provide, as much as possible, explicit constants.
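For reference, the total variation and Hellinger losses mentioned above are the standard ones (textbook definitions, not specific to the paper): for probability measures $P$ and $Q$,

\[
\|P-Q\|_{TV} \;=\; \sup_{A}\,\lvert P(A)-Q(A)\rvert,
\qquad
h^{2}(P,Q) \;=\; \frac{1}{2}\int \bigl(\sqrt{\mathrm{d}P}-\sqrt{\mathrm{d}Q}\,\bigr)^{2}.
\]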

Heatmap-based methods dominate the field of human pose estimation by modelling the output distribution through likelihood heatmaps. In contrast, regression-based methods are more efficient but suffer from inferior performance. In this work, we explore maximum likelihood estimation (MLE) to develop an efficient and effective regression-based method. From the MLE perspective, adopting different regression losses amounts to making different assumptions about the output density function. A density function closer to the true distribution leads to better regression performance. In light of this, we propose a novel regression paradigm with Residual Log-likelihood Estimation (RLE) to capture the underlying output distribution. Concretely, RLE learns the change of the distribution instead of the unreferenced underlying distribution to facilitate the training process. With the proposed reparameterization design, our method is compatible with off-the-shelf flow models. The proposed method is effective, efficient and flexible. We show its potential in various human pose estimation tasks with comprehensive experiments. Compared to the conventional regression paradigm, regression with RLE brings a 12.4 mAP improvement on MSCOCO without any test-time overhead. Moreover, on multi-person pose estimation in particular, our regression method outperforms heatmap-based methods for the first time. Our code is available at //github.com/Jeff-sjtu/res-loglikelihood-regression
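The correspondence between regression losses and output densities invoked above can be spelled out in a few lines: a Gaussian output density gives the squared error and a Laplace density gives the absolute error, up to constants. A minimal numpy sketch (the function names are ours; RLE itself goes further and learns the residual density with a flow model):

```python
import numpy as np

def gaussian_nll(pred, target, sigma=1.0):
    # -log N(target | pred, sigma^2), up to an additive constant:
    # proportional to the squared error, i.e. L2 regression.
    return 0.5 * ((target - pred) / sigma) ** 2 + np.log(sigma)

def laplace_nll(pred, target, b=1.0):
    # -log Laplace(target | pred, b), up to an additive constant:
    # proportional to the absolute error, i.e. L1 regression.
    return np.abs(target - pred) / b + np.log(b)

# Minimizing gaussian_nll over `pred` is least squares; minimizing
# laplace_nll is median regression. RLE replaces these fixed densities
# with a learned residual density, so the loss adapts to the data.
```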

Statistical signal processing applications usually require the estimation of some parameters of interest given a set of observed data. These estimates are typically obtained either by solving a multi-variate optimization problem, as in the maximum likelihood (ML) or maximum a posteriori (MAP) estimators, or by performing a multi-dimensional integration, as in the minimum mean squared error (MMSE) estimators. Unfortunately, analytical expressions for these estimators cannot be found in most real-world applications, and the Monte Carlo (MC) methodology is one feasible approach. MC methods proceed by drawing random samples, either from the desired distribution or from a simpler one, and using them to compute consistent estimators. The most important families of MC algorithms are Markov chain MC (MCMC) and importance sampling (IS). On the one hand, MCMC methods draw candidate samples from a proposal density and then build an ergodic Markov chain whose stationary distribution is the desired distribution by accepting or rejecting those candidates as the new state of the chain. On the other hand, IS techniques draw samples from a simple proposal density and then assign them suitable weights that measure their quality in some appropriate way. In this paper, we perform a thorough review of MC methods for the estimation of static parameters in signal processing applications. A historical note on the development of MC schemes is also provided, followed by the basic MC method and a brief description of the rejection sampling (RS) algorithm, as well as three sections describing many of the most relevant MCMC and IS algorithms, and their combined use.
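As a concrete companion to the two families just described, here is a minimal, self-contained sketch on a toy one-dimensional target (our own illustration, not code from the paper) of random-walk Metropolis-Hastings and self-normalized importance sampling:

```python
import numpy as np

rng = np.random.default_rng(1)

def log_target(x):
    # Unnormalized log-density of the (toy) target: standard normal.
    return -0.5 * x ** 2

def metropolis_hastings(n_samples, step=1.0, x0=0.0):
    """Random-walk MH: accept/reject candidates to build an ergodic chain."""
    x, chain = x0, []
    for _ in range(n_samples):
        cand = x + step * rng.normal()
        # Accept with probability min(1, target(cand) / target(x)).
        if np.log(rng.uniform()) < log_target(cand) - log_target(x):
            x = cand
        chain.append(x)
    return np.array(chain)

def importance_estimate(f, n_samples, scale=2.0):
    """Self-normalized IS with a wider Gaussian proposal."""
    xs = scale * rng.normal(size=n_samples)
    log_proposal = -0.5 * (xs / scale) ** 2 - np.log(scale)
    w = np.exp(log_target(xs) - log_proposal)     # importance weights
    return np.sum(w * f(xs)) / np.sum(w)

print(metropolis_hastings(5000).mean())             # approx. 0 (target mean)
print(importance_estimate(lambda x: x ** 2, 5000))  # approx. 1 (second moment)
```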

Low rank matrix recovery problems appear widely in statistics, combinatorics, and imaging. One celebrated method for solving these problems is to formulate and solve a semidefinite program (SDP). It is often known that the exact solution to the SDP with perfect data recovers the solution to the original low rank matrix recovery problem. It is more challenging to show that an approximate solution to the SDP formulated with noisy problem data acceptably solves the original problem; arguments are usually ad hoc for each problem setting and can be complex. In this note, we identify a set of conditions that we call simplicity that limit the error due to noisy problem data or incomplete convergence. In this sense, simple SDPs are robust: they can be (approximately) solved efficiently at scale, and the resulting approximate solutions, even with noisy data, can be trusted. Moreover, we show that simplicity holds generically, and also for many structured low rank matrix recovery problems, including the stochastic block model, $\mathbb{Z}_2$ synchronization, and matrix completion. Formally, we call an SDP simple if it has a surjective constraint map, admits a unique primal and dual solution pair, and satisfies strong duality and strict complementarity. However, simplicity is not a panacea: we show the Burer-Monteiro formulation of the SDP may have spurious second-order critical points, even for a simple SDP with a rank 1 solution.
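For concreteness, one of the structured problems listed above, $\mathbb{Z}_2$ synchronization, admits the standard SDP relaxation of maximizing $\langle C, X \rangle$ over positive semidefinite matrices with unit diagonal. A minimal sketch, assuming the cvxpy package and a toy noise model of our own choosing (illustrative only, not the paper's code):

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(2)
n = 20
z = rng.choice([-1.0, 1.0], size=n)          # hidden signs to recover
C = np.outer(z, z) + 0.5 * rng.normal(size=(n, n))
C = (C + C.T) / 2                            # symmetrize the noisy observation

# Standard SDP relaxation: maximize <C, X> s.t. X PSD, diag(X) = 1.
X = cp.Variable((n, n), PSD=True)
prob = cp.Problem(cp.Maximize(cp.trace(C @ X)), [cp.diag(X) == 1])
prob.solve()

# Round: signs of the top eigenvector estimate z (up to a global flip).
w, V = np.linalg.eigh(X.value)
z_hat = np.sign(V[:, -1])
print(np.abs(z_hat @ z) / n)                 # close to 1 means good recovery
```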

In this paper we introduce a simple modal logic framework to reason about the expertise of an information source. In the framework, a source is an expert on a proposition $p$ if they are able to correctly determine the truth value of $p$ in any possible world. We also consider how information may be false, but true after accounting for the lack of expertise of the source. This is relevant for modelling situations in which information sources make claims beyond their domain of expertise. We use non-standard semantics for the language based on an expertise set with certain closure properties. It turns out there is a close connection between our semantics and S5 epistemic logic, so that expertise can be expressed in terms of knowledge at all possible states. We use this connection to obtain a sound and complete axiomatisation.
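One natural way to make this concrete, under our own assumption that the expertise set behaves like an S5 partition of the worlds (the abstract does not spell out the closure conditions), is the following toy model checker: the source is an expert on $p$ exactly when the set of $p$-worlds is a union of partition cells.

```python
# A toy reading of the semantics above (our assumption, not the
# paper's definitions): the source's expertise is an S5 partition of
# the worlds; the source is an expert on P iff P is a union of cells,
# i.e. iff at every world the source knows whether P holds.
worlds = {1, 2, 3, 4}
partition = [{1, 2}, {3}, {4}]    # cells the source cannot tell apart internally

def cell_of(w):
    return next(c for c in partition if w in c)

def knows_whether(P, w):
    # S5 knowledge at w: the cell containing w settles P one way or the other.
    c = cell_of(w)
    return c <= P or c.isdisjoint(P)

def expert(P):
    # Expertise: the truth value of P is determined at every possible world.
    return all(knows_whether(P, w) for w in worlds)

print(expert({1, 2, 3}))   # True: {1,2,3} is the union of cells {1,2} and {3}
print(expert({1, 3}))      # False: {1,3} splits the cell {1,2}
```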

Topological semantics for modal logic based on the Cantor derivative operator gives rise to derivative logics, also referred to as $d$-logics. Unlike logics based on the topological closure operator, $d$-logics have not previously been studied in the framework of dynamical systems, which are pairs $(X,f)$ consisting of a topological space $X$ equipped with a continuous function $f\colon X\to X$. We introduce the logics $\bf{wK4C}$, $\bf{K4C}$ and $\bf{GLC}$ and show that they all have the finite Kripke model property and are sound and complete with respect to the $d$-semantics in this dynamical setting. In particular, we prove that $\bf{wK4C}$ is the $d$-logic of all dynamic topological systems, $\bf{K4C}$ is the $d$-logic of all $T_D$ dynamic topological systems, and $\bf{GLC}$ is the $d$-logic of all dynamic topological systems based on a scattered space. We also prove a general result for the case where $f$ is a homeomorphism, which in particular yields soundness and completeness for the corresponding systems $\bf{wK4H}$, $\bf{K4H}$ and $\bf{GLH}$. The main contribution of this work is the foundation of a general proof method for finite model property and completeness of dynamic topological $d$-logics. Furthermore, our result for $\bf{GLC}$ constitutes the first step towards a proof of completeness for the trimodal topo-temporal language with respect to a finite axiomatisation, something known to be impossible over the class of all spaces.
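The Cantor derivative underlying these $d$-semantics is easy to compute on a finite space, which may help fix intuition. A toy sketch of our own (not from the paper): $x \in d(A)$ iff every open set containing $x$ meets $A \setminus \{x\}$.

```python
# Cantor derivative on a finite topological space: d(A) is the set of
# limit points of A. Toy four-point space with a nested topology.
X = {0, 1, 2, 3}
opens = [set(), {0}, {0, 1}, {0, 1, 2, 3}]   # closed under unions and intersections

def derivative(A):
    # x is a limit point of A iff every open U containing x
    # intersects A somewhere other than x itself.
    return {x for x in X
            if all(U & (A - {x}) for U in opens if x in U)}

print(derivative({0}))   # {1, 2, 3}: every open set around them contains 0
print(derivative({3}))   # {2}: the only open set containing 2 is the whole space
```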

In this paper, we study the properties of robust nonparametric estimation using deep neural networks for regression models with heavy-tailed error distributions. We establish non-asymptotic error bounds for a class of robust nonparametric regression estimators using deep neural networks with ReLU activation under suitable smoothness conditions on the regression function and mild conditions on the error term. In particular, we only assume that the error distribution has a finite $p$-th moment with $p>1$. We also show that the deep robust regression estimators are able to circumvent the curse of dimensionality when the distribution of the predictor is supported on an approximate lower-dimensional set. An important feature of our error bound is that, for ReLU neural networks with network width and network size (number of parameters) no more than the order of the square of the dimensionality $d$ of the predictor, our excess risk bounds depend sub-linearly on $d$. Our assumption relaxes the exact manifold support assumption, which could be restrictive and unrealistic in practice. We also relax several crucial assumptions on the data distribution, the target regression function and the neural networks required in the recent literature. Our simulation studies demonstrate that the robust methods can significantly outperform the least squares method when the errors have heavy-tailed distributions and illustrate that the choice of loss function is important in the context of deep nonparametric regression.
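To see the practical point of the last sentence, here is a minimal sketch of our own toy setup (PyTorch, with the L1 loss standing in for the paper's broader class of robust losses) comparing robust and least-squares training under Student-$t$ noise:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
n, d = 1000, 5
X = torch.randn(n, d)
# Heavy-tailed errors: Student-t with 2 degrees of freedom has a
# finite p-th moment only for p < 2, matching the paper's regime.
noise = torch.distributions.StudentT(df=2.0).sample((n, 1))
y = torch.sin(X[:, :1]) + 0.1 * noise

def train(loss_fn):
    net = nn.Sequential(nn.Linear(d, 64), nn.ReLU(), nn.Linear(64, 1))
    opt = torch.optim.Adam(net.parameters(), lr=1e-2)
    for _ in range(500):
        opt.zero_grad()
        loss_fn(net(X), y).backward()
        opt.step()
    return net

robust = train(nn.L1Loss())          # robust loss: resists heavy-tailed outliers
least_squares = train(nn.MSELoss())  # squared loss: dragged by extreme errors
```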

We present and analyze a momentum-based gradient method for training linear classifiers with an exponentially-tailed loss (e.g., the exponential or logistic loss), which maximizes the classification margin on separable data at a rate of $\widetilde{\mathcal{O}}(1/t^2)$. This contrasts with a rate of $\mathcal{O}(1/\log(t))$ for standard gradient descent, and $\mathcal{O}(1/t)$ for normalized gradient descent. This momentum-based method is derived via the convex dual of the maximum-margin problem, specifically by applying Nesterov acceleration to this dual, which yields a simple and intuitive method in the primal. This dual view can also be used to derive a stochastic variant, which performs adaptive non-uniform sampling via the dual variables.
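For intuition about the baseline rate, the following toy sketch of our own runs plain gradient descent on the exponential loss over separable data and tracks the margin of the normalized iterate, which improves only at the slow $\mathcal{O}(1/\log(t))$ rate that the accelerated method improves upon:

```python
import numpy as np

rng = np.random.default_rng(3)
# Separable toy data with labels folded into the rows, so the margin
# of a classifier w is simply min_i <w, x_i> / ||w||.
X = rng.normal(size=(100, 2)) + 3.0

def margin(w):
    return np.min(X @ w) / np.linalg.norm(w)

w = np.full(2, 1e-6)
for t in range(1, 10001):
    # Gradient of the exponential loss sum_i exp(-<w, x_i>).
    grad = -(np.exp(-(X @ w))[:, None] * X).sum(axis=0)
    w -= 0.01 * grad
    if t in (10, 100, 1000, 10000):
        print(t, margin(w))   # margin creeps up roughly like 1/log(t)
```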

Recent work has proposed stochastic Plackett-Luce (PL) ranking models as a robust choice for optimizing relevance and fairness metrics. Unlike their deterministic counterparts, which require heuristic optimization algorithms, PL models are fully differentiable. Theoretically, they can be used to optimize ranking metrics via stochastic gradient descent. However, in practice, computing the gradient is infeasible because it requires iterating over all possible permutations of items. Consequently, actual applications rely on approximating the gradient via sampling techniques. In this paper, we introduce a novel algorithm, PL-Rank, which estimates the gradient of a PL ranking model w.r.t. both relevance and fairness metrics. Unlike existing approaches based on policy gradients, PL-Rank makes use of the specific structure of PL models and ranking metrics. Our experimental analysis shows that PL-Rank has greater sample-efficiency and is computationally less costly than existing policy gradients, resulting in faster convergence at higher performance. PL-Rank further enables practitioners to apply PL models in more relevant and fairer real-world ranking systems.
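Sampling rankings from a PL model, the operation the gradient approximations above rely on, can be done in one vectorized step via the standard Gumbel trick (a textbook construction, not PL-Rank itself): perturb the log-scores with i.i.d. Gumbel noise and sort.

```python
import numpy as np

rng = np.random.default_rng(4)

def sample_pl_rankings(log_scores, n_rankings):
    # Gumbel trick: argsort of log_scores + Gumbel(0, 1) noise is an
    # exact sample from the Plackett-Luce model with weights
    # exp(log_scores). Descending sort puts the best item first.
    gumbel = rng.gumbel(size=(n_rankings, len(log_scores)))
    return np.argsort(-(log_scores + gumbel), axis=1)

log_scores = np.log([5.0, 3.0, 1.0, 1.0])    # item relevance weights
rankings = sample_pl_rankings(log_scores, 10000)
# Item 0 should be ranked first with probability 5 / (5+3+1+1) = 0.5.
print((rankings[:, 0] == 0).mean())
```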

Implicit probabilistic models are defined naturally in terms of a sampling procedure and often induce a likelihood function that cannot be expressed explicitly. We develop a simple method for estimating parameters in implicit models that does not require knowledge of the form of the likelihood function or any derived quantities, but can be shown to be equivalent to maximizing likelihood under some conditions. Our result holds in the non-asymptotic parametric setting, where both the capacity of the model and the number of data examples are finite. We also demonstrate encouraging experimental results.
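The abstract does not describe the estimator itself, so as generic context only, here is a minimal likelihood-free baseline, simple moment matching with common random numbers, which shares the setting (sampling is easy, the likelihood is never written down) but is not the paper's method:

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(5)

def simulator(theta, eps):
    # Implicit model: easy to sample (push noise through a map), but
    # the induced density of the output is not written down anywhere.
    return np.tanh(theta + eps)

data = simulator(0.7, rng.normal(size=5000))   # observed, true theta = 0.7
noise = rng.normal(size=5000)                  # common random numbers keep the objective smooth

def moment_gap(theta):
    sim = simulator(theta, noise)
    # Match the first two moments of simulated vs. observed samples.
    return (sim.mean() - data.mean()) ** 2 + (sim.var() - data.var()) ** 2

print(minimize_scalar(moment_gap, bounds=(-3, 3), method="bounded").x)
```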
