欧美丰满大乳屁股流白浆_一区二区三区性色福利在线视频_中文精品久久久久国产不卡_国产欧美日韩一级视频在线观看_久久久久国色A免费看_无码AV永久免费专区网站_永久免费无码AV片在线播放

Plausibility is a formalization of exact tests for parametric models and generalizes procedures such as Fisher's exact test. The resulting tests are based on cumulative probabilities of the probability density function and evaluate consistency with a parametric family while providing exact control of the $\alpha$ level for finite sample size. Model comparisons are inefficient in this approach. We generalize plausibility by incorporating weighing which allows to perform model comparisons. We show that one weighing scheme is asymptotically equivalent to the likelihood ratio test (LRT) and has finite sample guarantees for the test size under the null hypothesis unlike the LRT. We confirm theoretical properties in simulations that mimic the data set of our data application. We apply the method to a retinoblastoma data set and demonstrate a parent-of-origin effect. Weighted plausibility also has applications in high-dimensional data analysis and P-values for penalized regression models can be derived. We demonstrate superior performance as compared to a data-splitting procedure in a simulation study. We apply weighted plausibility to a high-dimensional gene expression, case-control prostate cancer data set. We discuss the flexibility of the approach by relating weighted plausibility to targeted learning, the bootstrap, and sparsity selection.

相關內容

確切的

關注 0

可約的 · 優化器 · 控制器 · MoDELS · 離散化 ·

2021 年 10 月 31 日

Reduced Order Model Predictive Control for Parametrized Parabolic Partial Differential Equations

Saskia Dietze,Martin A. Grepl

Model Predictive Control (MPC) is a well-established approach to solve infinite horizon optimal control problems. Since optimization over an infinite time horizon is generally infeasible, MPC determines a suboptimal feedback control by repeatedly solving finite time optimal control problems. Although MPC has been successfully used in many applications, applying MPC to large-scale systems -- arising, e.g., through discretization of partial differential equations -- requires the solution of high-dimensional optimal control problems and thus poses immense computational effort. We consider systems governed by parametrized parabolic partial differential equations and employ the reduced basis (RB) method as a low-dimensional surrogate model for the finite time optimal control problem. The reduced order optimal control serves as feedback control for the original large-scale system. We analyze the proposed RB-MPC approach by first developing a posteriori error bounds for the errors in the optimal control and associated cost functional. These bounds can be evaluated efficiently in an offline-online computational procedure and allow us to guarantee asymptotic stability of the closed-loop system using the RB-MPC approach. We also propose an adaptive strategy to choose the prediction horizon of the finite time optimal control problem. Numerical results are presented to illustrate the theoretical properties of our approach.

泛化理論 · 核回歸 · 核化 · Machine Learning · 數據生成過程 ·

2021 年 10 月 30 日

Out-of-Distribution Generalization in Kernel Regression

Abdulkadir Canatar,Blake Bordelon,Cengiz Pehlevan

In real word applications, data generating process for training a machine learning model often differs from what the model encounters in the test stage. Understanding how and whether machine learning models generalize under such distributional shifts have been a theoretical challenge. Here, we study generalization in kernel regression when the training and test distributions are different using methods from statistical physics. Using the replica method, we derive an analytical formula for the out-of-distribution generalization error applicable to any kernel and real datasets. We identify an overlap matrix that quantifies the mismatch between distributions for a given kernel as a key determinant of generalization performance under distribution shift. Using our analytical expressions we elucidate various generalization phenomena including possible improvement in generalization when there is a mismatch. We develop procedures for optimizing training and test distributions for a given data budget to find best and worst case generalizations under the shift. We present applications of our theory to real and synthetic datasets and for many kernels. We compare results of our theory applied to Neural Tangent Kernel with simulations of wide networks and show agreement. We analyze linear regression in further depth.

策略改進 · 優化器 · 樣本 · Performer · 評論員 ·

2021 年 10 月 29 日

Generalized Proximal Policy Optimization with Sample Reuse

James Queeney,Ioannis Ch. Paschalidis,Christos G. Cassandras

from arxiv, To appear in 35th Conference on Neural Information Processing Systems (NeurIPS 2021)

In real-world decision making tasks, it is critical for data-driven reinforcement learning methods to be both stable and sample efficient. On-policy methods typically generate reliable policy improvement throughout training, while off-policy methods make more efficient use of data through sample reuse. In this work, we combine the theoretically supported stability benefits of on-policy algorithms with the sample efficiency of off-policy algorithms. We develop policy improvement guarantees that are suitable for the off-policy setting, and connect these bounds to the clipping mechanism used in Proximal Policy Optimization. This motivates an off-policy version of the popular algorithm that we call Generalized Proximal Policy Optimization with Sample Reuse. We demonstrate both theoretically and empirically that our algorithm delivers improved performance by effectively balancing the competing goals of stability and sample efficiency.

共軛先驗 · 后驗推斷 · 共軛 · 離散化 · 蒙特卡羅 ·

2021 年 10 月 29 日

Conjugate priors for count and rounded data regression

Daniel R. Kowal

Discrete data are abundant and often arise as counts or rounded data. However, even for linear regression models, conjugate priors and closed-form posteriors are typically unavailable, thereby necessitating approximations or Markov chain Monte Carlo for posterior inference. For a broad class of count and rounded data regression models, we introduce conjugate priors that enable closed-form posterior inference. Key posterior and predictive functionals are computable analytically or via direct Monte Carlo simulation. Crucially, the predictive distributions are discrete to match the support of the data and can be evaluated or simulated jointly across multiple covariate values. These tools are broadly useful for linear regression, nonlinear models via basis expansions, and model and variable selection. Multiple simulation studies demonstrate significant advantages in computing, predictive modeling, and selection relative to existing alternatives.

優化器 · 線性的 · 表示定理 · 線性回歸 · 泛函 ·

2021 年 10 月 29 日

Optimal prediction for kernel-based semi-functional linear regression

Keli Guo,Jun Fan,Lixing Zhu

In this paper, we establish minimax optimal rates of convergence for prediction in a semi-functional linear model that consists of a functional component and a less smooth nonparametric component. Our results reveal that the smoother functional component can be learned with the minimax rate as if the nonparametric component were known. More specifically, a double-penalized least squares method is adopted to estimate both the functional and nonparametric components within the framework of reproducing kernel Hilbert spaces. By virtue of the representer theorem, an efficient algorithm that requires no iterations is proposed to solve the corresponding optimization problem, where the regularization parameters are selected by the generalized cross validation criterion. Numerical studies are provided to demonstrate the effectiveness of the method and to verify the theoretical analysis.

分解的 · MoDELS · CP · 正交 · 估計/估計量 ·

2021 年 10 月 29 日

CP Factor Model for Dynamic Tensors

Yuefeng Han,Cun-Hui Zhang,Rong Chen

Observations in various applications are frequently represented as a time series of multidimensional arrays, called tensor time series, preserving the inherent multidimensional structure. In this paper, we present a factor model approach, in a form similar to tensor CP decomposition, to the analysis of high-dimensional dynamic tensor time series. As the loading vectors are uniquely defined but not necessarily orthogonal, it is significantly different from the existing tensor factor models based on Tucker-type tensor decomposition. The model structure allows for a set of uncorrelated one-dimensional latent dynamic factor processes, making it much more convenient to study the underlying dynamics of the time series. A new high order projection estimator is proposed for such a factor model, utilizing the special structure and the idea of the higher order orthogonal iteration procedures commonly used in Tucker-type tensor factor model and general tensor CP decomposition procedures. Theoretical investigation provides statistical error bounds for the proposed methods, which shows the significant advantage of utilizing the special model structure. Simulation study is conducted to further demonstrate the finite sample properties of the estimators. Real data application is used to illustrate the model and its interpretations.

近似 · 圖 · Performer · 估計/估計量 · 分解的 ·

2021 年 10 月 28 日

Approximating the Arboricity in Sublinear Time

Talya Eden,Saleet Mossel,Dana Ron

We consider the problem of approximating the arboricity of a graph $G= (V,E)$, which we denote by $\mathsf{arb}(G)$, in sublinear time, where the arboricity of a graph is the minimal number of forests required to cover its edges. An algorithm for this problem may perform degree and neighbor queries, and is allowed a small error probability. We design an algorithm that outputs an estimate $\hat{\alpha}$, such that with probability $1-1/\textrm{poly}(n)$, $\mathsf{arb}(G)/c\log^2 n \leq \hat{\alpha} \leq \mathsf{arb}(G)$, where $n=|V|$ and $c$ is a constant. The expected query complexity and running time of the algorithm are $O(n/\mathsf{arb}(G))\cdot \textrm{poly}(\log n)$, and this upper bound also holds with high probability. %($\widetilde{O}(\cdot)$ is used to suppress $\textrm{poly}(\log n)$ dependencies). This bound is optimal for such an approximation up to a $\textrm{poly}(\log n)$ factor.

相互獨立的 · 真實值 · 無監督 · Machine Learning · 學成 ·

2021 年 10 月 28 日

Aggregation as Unsupervised Learning and its Evaluation

Maria Ulan,Welf L?we,Morgan Ericsson,Anna Wingkvist

from arxiv, 13 pages, 8 figures

Regression uses supervised machine learning to find a model that combines several independent variables to predict a dependent variable based on ground truth (labeled) data, i.e., tuples of independent and dependent variables (labels). Similarly, aggregation also combines several independent variables to a dependent variable. The dependent variable should preserve properties of the independent variables, e.g., the ranking or relative distance of the independent variable tuples, and/or represent a latent ground truth that is a function of these independent variables. However, ground truth data is not available for finding the aggregation model. Consequently, aggregation models are data agnostic or can only be derived with unsupervised machine learning approaches. We introduce a novel unsupervised aggregation approach based on intrinsic properties of unlabeled training data, such as the cumulative probability distributions of the single independent variables and their mutual dependencies. We present an empirical evaluation framework that allows assessing the proposed approach against other aggregation approaches from two perspectives: (i) how well the aggregation output represents properties of the input tuples, and (ii) how well can aggregated output predict a latent ground truth. To this end, we use data sets for assessing supervised regression approaches that contain explicit ground truth labels. However, the ground truth is not used for deriving the aggregation models, but it allows for the assessment from a perspective (ii). More specifically, we use regression data sets from the UCI machine learning repository and benchmark several data-agnostic and unsupervised approaches for aggregation against ours. The benchmark results indicate that our approach outperforms the other data-agnostic and unsupervised aggregation approaches. It is almost on par with linear regression.

估計/估計量 · 配分函數 · 泛函 · 規范化因子 · 樣本 ·

2021 年 10 月 27 日

Parameter estimation for Gibbs distributions

David G. Harris,Vladimir Kolmogorov

from arxiv, This is a significantly extended version of a paper "A Faster Approximation Algorithm for the Gibbs Partition Function" (arXiv:1608.04223), which was published in COLT 2018. It covers many additional topics; most importantly, algorithms to estimate counts and algorithm specialized for integer-valued distributions

We consider \emph{Gibbs distributions}, which are families of probability distributions over a discrete space $\Omega$ with probability mass function of the form $\mu^\Omega_\beta(\omega) \propto e^{\beta H(\omega)}$ for $\beta$ in an interval $[\beta_{\min}, \beta_{\max}]$ and $H( \omega ) \in \{0 \} \cup [1, n]$. The \emph{partition function} is the normalization factor $Z(\beta)=\sum_{\omega \in\Omega}e^{\beta H(\omega)}$. Two important parameters of these distributions are the log partition ratio $q = \log \tfrac{Z(\beta_{\max})}{Z(\beta_{\min})}$ and the counts $c_x = |H^{-1}(x)|$. These are correlated with system parameters in a number of physical applications and sampling algorithms. Our first main result is to estimate the counts $c_x$ using roughly $\tilde O( \frac{q}{\varepsilon^2})$ samples for general Gibbs distributions and $\tilde O( \frac{n^2}{\varepsilon^2} )$ samples for integer-valued distributions (ignoring some second-order terms and parameters), and we show this is optimal up to logarithmic factors. We illustrate with improved algorithms for counting connected subgraphs and perfect matchings in a graph. We develop a key subroutine to estimate the partition function $Z$. Specifically, it generates a data structure to estimate $Z(\beta)$ for \emph{all} values $\beta$, without further samples. Constructing the data structure requires $O(\frac{q \log n}{\varepsilon^2})$ samples for general Gibbs distributions and $O(\frac{n^2 \log n}{\varepsilon^2} + n \log q)$ samples for integer-valued distributions. This improves over a prior algorithm of Huber (2015) which computes a single point estimate $Z(\beta_\max)$ using $O( q \log n( \log q + \log \log n + \varepsilon^{-2}))$ samples. We show matching lower bounds, demonstrating that this complexity is optimal as a function of $n$ and $q$ up to logarithmic terms.

Extensibility · 標記空間 · 目標檢測 · 類別 · Principle ·

2018 年 4 月 12 日

Zero-Shot Object Detection

Ankan Bansal,Karan Sikka,Gaurav Sharma,Rama Chellappa,Ajay Divakaran

from arxiv, 16 pages

We introduce and tackle the problem of zero-shot object detection (ZSD), which aims to detect object classes which are not observed during training. We work with a challenging set of object classes, not restricting ourselves to similar and/or fine-grained categories cf. prior works on zero-shot classification. We follow a principled approach by first adapting visual-semantic embeddings for ZSD. We then discuss the problems associated with selecting a background class and motivate two background-aware approaches for learning robust detectors. One of these models uses a fixed background class and the other is based on iterative latent assignments. We also outline the challenge associated with using a limited number of training classes and propose a solution based on dense sampling of the semantic label space using auxiliary data with a large number of categories. We propose novel splits of two standard detection datasets - MSCOCO and VisualGenome and discuss extensive empirical results to highlight the benefits of the proposed methods. We provide useful insights into the algorithm and conclude by posing some open questions to encourage further research.