
Point processes in time have a wide range of applications that include the claims arrival process in insurance or the analysis of queues in operations research. Due to advances in technology, samples of point processes are increasingly encountered. A key object of interest is the local intensity function. It has a straightforward interpretation that allows one to understand and explore point process data. We consider functional approaches for point processes, where one has a sample of repeated realizations of the point process. This situation is inherently connected with Cox processes, where the intensity functions of the replications are modeled as random functions. Here we study a situation where one records covariates for each replication of the process, such as the daily temperature for bike rentals. For modeling point processes as responses with vector covariates as predictors we propose a novel regression approach for the intensity function that is intrinsically nonparametric. While the intensity function of a point process that is only observed once on a fixed domain cannot be identified, we show how covariates and repeated observations of the process can be utilized to make consistent estimation possible, and we also derive asymptotic rates of convergence without invoking parametric assumptions.
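As a rough illustration of what a local intensity function is and how repeated realizations make it estimable, the following Python sketch pools events from simulated replications of an inhomogeneous Poisson process and kernel-smooths them. It is not the covariate-based regression estimator proposed above; the bandwidth, simulation settings, and function names are illustrative assumptions.

```python
import numpy as np

def mean_intensity(event_times, grid, bandwidth=0.5):
    """Kernel estimate of the mean intensity from repeated realizations.

    event_times : list of 1-D arrays, one array of event times per replication
    grid        : time points at which to evaluate the estimate
    """
    n = len(event_times)
    pooled = np.concatenate(event_times)
    # Gaussian kernel smoothing of the pooled events, averaged over replications
    diffs = (grid[:, None] - pooled[None, :]) / bandwidth
    kernel = np.exp(-0.5 * diffs**2) / (np.sqrt(2 * np.pi) * bandwidth)
    return kernel.sum(axis=1) / n

# toy example: 200 replications of an inhomogeneous Poisson process on [0, 10]
rng = np.random.default_rng(0)
lam = lambda t: 2.0 + 1.5 * np.sin(t)          # true intensity
reps = []
for _ in range(200):
    # thinning: simulate a homogeneous process at rate lam_max, keep points w.p. lam(t)/lam_max
    lam_max = 3.5
    m = rng.poisson(lam_max * 10)
    t = rng.uniform(0, 10, m)
    reps.append(t[rng.uniform(0, lam_max, m) < lam(t)])

grid = np.linspace(0, 10, 101)
print(mean_intensity(reps, grid)[:5])          # estimated mean intensity near t = 0
```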

Related content

Processing is the name of an open-source programming language and its accompanying integrated development environment (IDE). Processing is used in the electronic arts and visual design communities to teach the fundamentals of programming, and it has been applied in a large number of new media and interactive art works.

Prediction models often fail if train and test data do not stem from the same distribution. Out-of-distribution (OOD) generalization to unseen, perturbed test data is a desirable but difficult-to-achieve property for prediction models and in general requires strong assumptions on the data generating process (DGP). In a causally inspired perspective on OOD generalization, the test data arise from a specific class of interventions on exogenous random variables of the DGP, called anchors. Anchor regression models, introduced by Rothenhaeusler et al. (2021), protect against distributional shifts in the test data by employing causal regularization. However, so far anchor regression has only been used with a squared-error loss which is inapplicable to common responses such as censored continuous or ordinal data. Here, we propose a distributional version of anchor regression which generalizes the method to potentially censored responses with at least an ordered sample space. To this end, we combine a flexible class of parametric transformation models for distributional regression with an appropriate causal regularizer under a more general notion of residuals. In an exemplary application and several simulation scenarios we demonstrate the extent to which OOD generalization is possible.
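For context, the squared-error anchor regression of Rothenhaeusler et al. (2021) that this abstract generalizes can be written as a penalized least-squares problem and solved in closed form. The sketch below implements that baseline, not the proposed distributional version; the anchor matrix, gamma value, and toy data are chosen purely for illustration.

```python
import numpy as np

def anchor_regression(X, y, A, gamma=5.0):
    """Squared-error anchor regression (Rothenhaeusler et al., 2021).

    Minimizes ||(I - P_A)(y - X b)||^2 + gamma * ||P_A (y - X b)||^2,
    where P_A projects onto the column space of the anchors A.
    Equivalent to OLS after transforming X and y with W = I + (sqrt(gamma) - 1) * P_A.
    """
    n = len(y)
    P_A = A @ np.linalg.pinv(A)                # projection onto span(A)
    W = np.eye(n) + (np.sqrt(gamma) - 1.0) * P_A
    Xt, yt = W @ X, W @ y
    beta, *_ = np.linalg.lstsq(Xt, yt, rcond=None)
    return beta

# toy data with a one-hot anchor (e.g., an environment indicator)
rng = np.random.default_rng(1)
n, p = 500, 3
env = rng.integers(0, 2, n)
A = np.column_stack([env == 0, env == 1]).astype(float)
X = rng.normal(size=(n, p)) + env[:, None]     # the anchor shifts the covariates
y = X @ np.array([1.0, -2.0, 0.5]) + env + rng.normal(size=n)
print(anchor_regression(X, y, A, gamma=10.0))
```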

We develop approximation algorithms for set-selection problems with deterministic constraints, but random objective values, i.e., stochastic probing problems. When the goal is to maximize the objective, approximation algorithms for probing problems are well-studied. On the other hand, few techniques are known for minimizing the objective, especially in the adaptive setting, where information about the random objective is revealed during the set-selection process and allowed to influence it. For minimization problems in particular, incorporating adaptivity can have a considerable effect on performance. In this work, we seek approximation algorithms that compare well to the optimal adaptive policy. We develop new techniques for adaptive minimization, applying them to a few problems of interest. The core technique we develop here is an approximate reduction from an adaptive expectation minimization problem to a set of adaptive probability minimization problems which we call threshold problems. By providing near-optimal solutions to these threshold problems, we obtain bicriteria adaptive policies. We apply this method to obtain an adaptive approximation algorithm for the MIN-ELEMENT problem, where the goal is to adaptively pick random variables to minimize the expected minimum value seen among them, subject to a knapsack constraint. This partially resolves an open problem raised in Goel et al.'s "How to probe for an extreme value". We further consider three extensions of the MIN-ELEMENT problem, where our objective is the sum of the smallest k element-weights, or the weight of the min-weight basis of a given matroid, or where the constraint is not given by a knapsack but by a matroid constraint. For all three variations we explore, we develop adaptive approximation algorithms for their corresponding threshold problems, and prove their near-optimality via coupling arguments.
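To make the MIN-ELEMENT objective concrete, the following toy Monte Carlo sketch evaluates a naive adaptive greedy policy under a knapsack constraint: probe the item with the largest expected reduction of the running minimum per unit cost. It is only a baseline heuristic for illustration, not the bicriteria policy or threshold reduction developed in the paper; the distributions, costs, and budget are made up.

```python
import numpy as np

rng = np.random.default_rng(2)

# Each item i has a probing cost costs[i] and a random weight W_i ~ Exponential(rates[i]).
costs = np.array([1.0, 1.0, 2.0, 3.0])
rates = np.array([0.5, 1.0, 2.0, 4.0])
budget = 4.0

def expected_new_min(current_min, rate, n_mc=2000):
    """Monte Carlo estimate of E[min(current_min, X)] for X ~ Exp(rate)."""
    x = rng.exponential(1.0 / rate, n_mc)
    return np.minimum(current_min, x).mean()

def greedy_adaptive_run():
    """One run of a naive adaptive greedy policy: probe the item with the largest
    expected reduction of the running minimum per unit cost until the budget is spent."""
    remaining, spent, cur_min = set(range(len(costs))), 0.0, np.inf
    while remaining:
        feasible = [i for i in remaining if spent + costs[i] <= budget]
        if not feasible:
            break
        cur = min(cur_min, 1e9)   # finite stand-in for +inf before the first probe
        scores = {i: (cur - expected_new_min(cur, rates[i])) / costs[i] for i in feasible}
        i = max(scores, key=scores.get)
        cur_min = min(cur_min, rng.exponential(1.0 / rates[i]))   # probe: observe the realized weight
        spent += costs[i]
        remaining.remove(i)
    return cur_min

# estimated expected minimum achieved by the greedy adaptive policy
print(np.mean([greedy_adaptive_run() for _ in range(500)]))
```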

Survival analysis is a challenging variation of regression modeling because of the presence of censoring, where the outcome measurement is only partially known, due to, for example, loss to follow up. Such problems come up frequently in medical applications, making survival analysis a key endeavor in biostatistics and machine learning for healthcare, with Cox regression models being amongst the most commonly employed models. We describe a new approach for survival analysis regression models, based on learning mixtures of Cox regressions to model individual survival distributions. We propose an approximation to the Expectation Maximization algorithm for this model that does hard assignments to mixture groups to make optimization efficient. Given the group assignments, we fit the hazard ratios within each group using deep neural networks, and the baseline hazard for each mixture component non-parametrically. We perform experiments on multiple real world datasets, and look at the mortality rates of patients across ethnicity and gender. We emphasize the importance of calibration in healthcare settings and demonstrate that our approach outperforms classical and modern survival analysis baselines, both in terms of discriminative performance and calibration, with large gains in performance on the minority demographics.
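A minimal sketch of the hard-assignment EM idea is given below, with constant-hazard exponential components standing in for the paper's deep-network hazard ratios and nonparametric baseline hazards: alternate between assigning each subject to the component with the highest censored-data log-likelihood and refitting each component's rate. All names and the toy data are illustrative.

```python
import numpy as np

def hard_em_exponential_mixture(times, events, K=2, n_iter=30, seed=0):
    """Hard-assignment EM for a K-component mixture of exponential survival models.

    times  : observed times (event or censoring)
    events : 1 if the event was observed, 0 if right-censored
    Each component k has a constant hazard rates[k]; the censored-exponential MLE
    is rate = (# events in group) / (total observed time in group).
    """
    rng = np.random.default_rng(seed)
    z = rng.integers(0, K, len(times))                  # random initial hard assignments
    rates = np.ones(K)
    for _ in range(n_iter):
        # M-step: censored-exponential MLE within each group
        for k in range(K):
            idx = z == k
            rates[k] = max(events[idx].sum(), 1) / max(times[idx].sum(), 1e-12)
        # E-step (hard): assign each subject to the component with the highest log-likelihood
        # log f(t) = log(rate) - rate*t for events;  log S(t) = -rate*t for censored observations
        loglik = events[:, None] * np.log(rates)[None, :] - times[:, None] * rates[None, :]
        z = loglik.argmax(axis=1)
    return rates, z

# toy data: two latent groups with different hazards and roughly 30% censoring
rng = np.random.default_rng(3)
true_group = rng.integers(0, 2, 1000)
t_event = rng.exponential(np.where(true_group == 0, 5.0, 1.0))
t_cens = rng.exponential(8.0, 1000)
times, events = np.minimum(t_event, t_cens), (t_event <= t_cens).astype(int)
print(hard_em_exponential_mixture(times, events))
```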

The function-on-function linear regression model, in which the response and predictors consist of random curves, has become a general framework to investigate the relationship between the functional response and functional predictors. Existing methods to estimate the model parameters may be sensitive to outlying observations, which are common in empirical applications, and may be severely affected by such observations, leading to undesirable estimation and prediction results. A robust estimation method, based on iteratively reweighted simple partial least squares, is introduced to improve the prediction accuracy of the function-on-function linear regression model in the presence of outliers. The performance of the proposed method depends on the number of partial least squares components used to estimate the function-on-function linear regression model; thus, the optimum number of components is determined via a data-driven error criterion. The finite-sample performance of the proposed method is investigated via several Monte Carlo experiments and an empirical data analysis. In addition, a nonparametric bootstrap method is applied to construct pointwise prediction intervals for the response function. The results are compared with some of the existing methods to illustrate the improvement potentially gained by the proposed method.
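The reweighting idea can be sketched for curves discretized on a grid: fit partial least squares, downweight rows with large residual norms using Huber-type weights, and refit on rescaled rows. The sketch below uses scikit-learn's PLSRegression and approximates weighting by multiplying rows with the square root of the weights; it is not the authors' simple-PLS algorithm, and the tuning constants and toy data are assumptions.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

def reweighted_pls(X, Y, n_components=3, n_iter=10, c=1.345):
    """Iteratively reweighted PLS for (discretized) function-on-function regression.

    X, Y : curves evaluated on dense grids, one row per curve.
    Rows with large residual norms receive Huber-type weights; weighting is
    approximated by rescaling rows with sqrt(w) before refitting.
    """
    w = np.ones(len(X))
    for _ in range(n_iter):
        pls = PLSRegression(n_components=n_components)
        sw = np.sqrt(w)[:, None]
        pls.fit(X * sw, Y * sw)
        resid = Y - pls.predict(X)
        r = np.linalg.norm(resid, axis=1)
        s = np.median(r) / 0.6745                      # robust scale estimate
        w = np.minimum(1.0, c * s / (r + 1e-12))       # Huber weights
    return pls, w

# toy functional data on a 50-point grid with 10% outlying response curves
rng = np.random.default_rng(4)
n = 100
X = rng.normal(size=(n, 5)) @ rng.normal(size=(5, 50)) + 0.05 * rng.normal(size=(n, 50))
Y = X @ (0.02 * np.ones((50, 50))) + 0.1 * rng.normal(size=(n, 50))
Y[:10] += 5.0                                          # contaminated curves
model, weights = reweighted_pls(X, Y)
print(weights[:12].round(2))                           # the first 10 rows receive small weights
```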

Quantile regression is a powerful data analysis tool that accommodates heterogeneous covariate-response relationships. We find that by coupling the asymmetric Laplace working likelihood with appropriate shrinkage priors, we can deliver posterior inference that automatically adapts to possible sparsity in quantile regression analysis. After a suitable adjustment of the posterior variance, the posterior inference provides asymptotically valid inference under heterogeneity. Furthermore, the proposed approach leads to oracle asymptotic efficiency for the active (nonzero) quantile regression coefficients and super-efficiency for the non-active ones. By avoiding the need to pursue dichotomous variable selection, the Bayesian computational framework demonstrates desirable inference stability with respect to tuning parameter selection. Our work helps to uncloak the value of Bayesian computational methods in frequentist inference for quantile regression.
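The key working-likelihood fact is that the negative asymmetric Laplace log-likelihood is proportional to the quantile check loss, so a MAP estimate under a shrinkage prior is a penalized quantile regression. The sketch below illustrates only this connection, with an L1 penalty standing in for the shrinkage priors; it does not reproduce the posterior inference or the variance adjustment described above, and the toy data and tuning values are assumptions.

```python
import numpy as np
from scipy.optimize import minimize

def check_loss(u, tau):
    """Quantile (check) loss rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (u < 0))

def ald_map_quantile_regression(X, y, tau=0.5, lam=0.1):
    """MAP estimate under the asymmetric Laplace working likelihood.

    The negative asymmetric Laplace log-likelihood is proportional to the summed
    check loss, so the MAP under a shrinkage prior is a penalized quantile
    regression; here a simple L1 penalty stands in for the shrinkage priors.
    """
    def objective(b):
        return check_loss(y - X @ b, tau).sum() + lam * np.abs(b).sum()
    b0 = np.zeros(X.shape[1])
    return minimize(objective, b0, method="Nelder-Mead",
                    options={"maxiter": 20000, "xatol": 1e-6, "fatol": 1e-6}).x

# toy heteroscedastic example: intercept plus three covariates, one of them active
rng = np.random.default_rng(5)
n = 400
X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
y = 1.0 + 2.0 * X[:, 1] + (1.0 + 0.5 * X[:, 1]) * rng.normal(size=n)
print(ald_map_quantile_regression(X, y, tau=0.75).round(2))
```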

Active learning can reduce the number of samples needed to perform a hypothesis test and to estimate the parameters of a model. In this paper, we revisit the work of Chernoff, which described an asymptotically optimal algorithm for performing a hypothesis test. We obtain a novel sample complexity bound for Chernoff's algorithm, with a non-asymptotic term that characterizes its performance at a fixed confidence level. We also develop an extension of Chernoff sampling that can be used to estimate the parameters of a wide variety of models, and we obtain a non-asymptotic bound on the estimation error. We apply our extension of Chernoff sampling to actively learn neural network models and to estimate parameters in real-data linear and non-linear regression problems, where our approach compares favorably to state-of-the-art methods.
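A stripped-down version of Chernoff's procedure for two hypotheses and Bernoulli experiments is sketched below: repeatedly query the experiment that maximizes the KL divergence between the currently leading hypothesis and the alternative, and stop once the log-likelihood ratio clears a confidence threshold. The probability table, stopping threshold, and function names are illustrative, and none of the paper's non-asymptotic analysis or parameter-estimation extension is reproduced.

```python
import numpy as np

def bern_kl(p, q):
    """KL divergence between Bernoulli(p) and Bernoulli(q)."""
    p, q = np.clip(p, 1e-12, 1 - 1e-12), np.clip(q, 1e-12, 1 - 1e-12)
    return p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))

def chernoff_test(p, true_h=0, delta=1e-3, seed=0, max_steps=10_000):
    """Chernoff-style active hypothesis test with two hypotheses and Bernoulli experiments.

    p[h, e] is the success probability of experiment e under hypothesis h.
    At each step, the experiment maximizing the KL divergence between the
    current leading hypothesis and the alternative is queried.
    """
    rng = np.random.default_rng(seed)
    loglik = np.zeros(2)                       # accumulated log-likelihood of each hypothesis
    for t in range(max_steps):
        lead, alt = (0, 1) if loglik[0] >= loglik[1] else (1, 0)
        if loglik[lead] - loglik[alt] >= np.log(1.0 / delta):
            return lead, t                     # decision and number of samples used
        e = int(np.argmax(bern_kl(p[lead], p[alt])))     # most informative experiment
        x = rng.binomial(1, p[true_h, e])                # query it, observe an outcome
        loglik += np.log(np.where(x == 1, p[:, e], 1 - p[:, e]))
    return None, max_steps

# two hypotheses, three experiments; experiment 2 is the most discriminative one
p = np.array([[0.5, 0.55, 0.2],
              [0.5, 0.45, 0.8]])
print(chernoff_test(p, true_h=1, delta=1e-4))
```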

In real-world applications, the data generating process used to train a machine learning model often differs from the one the model encounters at test time. Understanding how and whether machine learning models generalize under such distributional shifts has been a theoretical challenge. Here, we study generalization in kernel regression when the training and test distributions are different using methods from statistical physics. Using the replica method, we derive an analytical formula for the out-of-distribution generalization error applicable to any kernel and real datasets. We identify an overlap matrix that quantifies the mismatch between distributions for a given kernel as a key determinant of generalization performance under distribution shift. Using our analytical expressions we elucidate various generalization phenomena including possible improvement in generalization when there is a mismatch. We develop procedures for optimizing training and test distributions for a given data budget to find best and worst case generalizations under the shift. We present applications of our theory to real and synthetic datasets and for many kernels. We compare results of our theory applied to the Neural Tangent Kernel with simulations of wide networks and show agreement. We analyze linear regression in further depth.
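The quantity the theory characterizes can be probed numerically: train kernel ridge regression on one input distribution and measure the error on a shifted test distribution. The sketch below performs this Monte Carlo evaluation with an RBF kernel; it does not implement the replica-method formula, and the target function, shift sizes, and ridge parameter are arbitrary choices.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def krr_ood_error(n_train=200, n_test=2000, shift=1.5, lam=1e-3, seed=6):
    """Monte Carlo estimate of the out-of-distribution error of kernel ridge regression:
    train on x ~ N(0, 1), test on x ~ N(shift, 1)."""
    rng = np.random.default_rng(seed)
    f = lambda x: np.sin(3 * x[:, 0])                    # target function
    Xtr = rng.normal(0.0, 1.0, (n_train, 1))
    Xte = rng.normal(shift, 1.0, (n_test, 1))            # shifted test distribution
    ytr = f(Xtr) + 0.1 * rng.normal(size=n_train)
    K = rbf_kernel(Xtr, Xtr)
    alpha = np.linalg.solve(K + lam * np.eye(n_train), ytr)
    pred = rbf_kernel(Xte, Xtr) @ alpha
    return np.mean((pred - f(Xte)) ** 2)

for s in [0.0, 0.5, 1.0, 2.0]:
    print(f"shift = {s:.1f}  OOD test MSE = {krr_ood_error(shift=s):.4f}")
```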

In the era of open data, Poisson and other count regression models are increasingly important. Still, conventional Poisson regression has remaining issues in terms of identifiability and computational efficiency. In particular, due to an identification problem, Poisson regression can be unstable for small samples with many zeros. Given this, we develop a closed-form inference for an over-dispersed Poisson regression, including Poisson additive mixed models. The approach is derived via a mode-based log-Gaussian approximation. The resulting method is fast, practical, and free from the identification problem. Monte Carlo experiments demonstrate that the proposed method has considerably smaller estimation error than the closed-form alternatives and error as small as that of the usual Poisson regression. For counts with many zeros, our approximation has better estimation accuracy than conventional Poisson regression. We obtained similar results for Poisson additive mixed modeling with spatial or group effects. The developed method was applied to analyze COVID-19 data in Japan; the results suggest that the influence of pedestrian density, age, and other factors on the number of cases changes over time.
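The setting being addressed can be illustrated with a small simulation that compares a plain Poisson GLM against a crude log-Gaussian stand-in (OLS on log(y + 0.5)) on small samples with many zeros. Neither estimator is the closed-form over-dispersed method proposed above, the printed numbers are not the paper's results, and the simulation settings are assumptions.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)

def one_replication(n=30):
    """Small sample with many zeros: compare a plain Poisson GLM with a crude
    log-Gaussian approximation (OLS on log(y + 0.5)); neither is the paper's
    closed-form over-dispersed estimator."""
    x = rng.normal(size=n)
    X = sm.add_constant(x)
    y = rng.poisson(np.exp(-1.0 + 0.5 * x))          # low counts -> many zeros
    b_pois = sm.GLM(y, X, family=sm.families.Poisson()).fit().params[1]
    b_log = sm.OLS(np.log(y + 0.5), X).fit().params[1]
    return b_pois, b_log

est = np.array([one_replication() for _ in range(200)])
print("Poisson GLM   slope: mean %.2f, sd %.2f" % (est[:, 0].mean(), est[:, 0].std()))
print("log-Gaussian  slope: mean %.2f, sd %.2f" % (est[:, 1].mean(), est[:, 1].std()))
```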

Because physics-based building models are difficult to obtain, as each building is individual, there is an increasing interest in generating models suitable for building MPC directly from measurement data. Machine learning methods have been widely applied to this problem and validated mostly in simulation; there are, however, few studies in the literature that directly compare different models or validate them in real buildings. Methods that are indeed validated in application often lead to computationally complex non-convex optimization problems. Here we compare physics-informed Autoregressive-Moving-Average with Exogenous Inputs (ARMAX) models to Machine Learning models based on Random Forests and Input Convex Neural Networks, and the resulting convex MPC schemes, in experiments on a practical building application with the goal of minimizing energy consumption while maintaining occupant comfort, as well as in a numerical case study. We demonstrate that Predictive Control in general leads to savings between 26% and 49% of heating and cooling energy, compared to the building's baseline hysteresis controller. Moreover, we show that all model types lead to satisfactory control performance in terms of constraint satisfaction and energy reduction. However, we also see that the physics-informed ARMAX models have a lower computational burden and superior sample efficiency compared to the Machine Learning based models. Moreover, even if abundant training data is available, the ARMAX models have a significantly lower prediction error than the Machine Learning models, which indicates that the encoded physics-based prior of the former cannot independently be found by the latter.
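A generic ARMAX identification step of the kind compared here can be sketched with statsmodels' SARIMAX using exogenous inputs such as outside temperature and heating power. The model orders, synthetic building dynamics, and hold-out split below are illustrative assumptions, and no MPC scheme or physics-informed structure from the paper is reproduced.

```python
import numpy as np
from statsmodels.tsa.statespace.sarimax import SARIMAX

rng = np.random.default_rng(8)

# synthetic hourly data: room temperature driven by outside temperature and heating power
T = 500
t = np.arange(T)
outside = 10 + 5 * np.sin(2 * np.pi * t / 24) + rng.normal(0, 0.5, T)
heat = rng.uniform(0, 2, T)                                 # control input (kW)
room = np.zeros(T)
for k in range(1, T):
    room[k] = 0.9 * room[k - 1] + 0.05 * outside[k] + 0.8 * heat[k] + rng.normal(0, 0.1)

exog = np.column_stack([outside, heat])
# ARMAX(2, 1): AR order 2, MA order 1, with exogenous inputs
model = SARIMAX(room[:400], exog=exog[:400], order=(2, 0, 1)).fit(disp=False)
forecast = model.forecast(steps=100, exog=exog[400:])       # out-of-sample prediction
print("RMSE on hold-out:", np.sqrt(np.mean((forecast - room[400:]) ** 2)).round(3))
```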

Selective regression allows abstention from prediction if the confidence to make an accurate prediction is not sufficient. In general, by allowing a reject option, one expects the performance of a regression model to increase at the cost of reduced coverage (i.e., by predicting on fewer samples). However, as shown in this work, in some cases the performance of a minority group can decrease as coverage is reduced, and thus selective regression can magnify disparities between different sensitive groups. We show that such unwanted behavior can be avoided if we can construct features satisfying the sufficiency criterion, so that the mean prediction and the associated uncertainty are calibrated across all the groups. Further, to mitigate the disparity in performance across groups, we introduce two approaches based on this calibration criterion: (a) regularizing an upper bound of conditional mutual information under a Gaussian assumption and (b) regularizing a contrastive loss for mean and uncertainty prediction. The effectiveness of these approaches is demonstrated on synthetic as well as real-world datasets.
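The evaluation protocol behind these claims can be sketched as follows: a model reports a prediction and an uncertainty, the most uncertain samples are rejected to reach a target coverage, and the error is tracked per sensitive group. The synthetic data, the deliberately group-blind uncertainty, and all names below are illustrative; the sketch shows only the mechanics, not the paper's mitigation methods or findings.

```python
import numpy as np

rng = np.random.default_rng(9)

# synthetic data with a majority group (g = 0) and a minority group (g = 1)
n = 5000
g = (rng.uniform(size=n) < 0.2).astype(int)                  # 20% minority
x = rng.normal(size=n)
y = np.where(g == 0, 2 * x, -x) + rng.normal(0, 0.5, n)      # different relationships per group

# a deliberately miscalibrated model: it fits the majority relationship and
# reports an uncertainty that ignores the minority group's larger error
pred = 2 * x
uncertainty = np.abs(rng.normal(1.0, 0.1, n))                # roughly constant, group-blind

def selective_mse(coverage):
    """Abstain on the most 'uncertain' samples and report per-group MSE on the rest."""
    keep = uncertainty <= np.quantile(uncertainty, coverage)
    err = (pred - y) ** 2
    return [err[keep & (g == k)].mean() for k in (0, 1)]

for c in (1.0, 0.8, 0.5, 0.2):
    mse0, mse1 = selective_mse(c)
    print(f"coverage {c:.1f}:  majority MSE {mse0:.2f}   minority MSE {mse1:.2f}")
```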
