
When analyzing large datasets, it is common to select a model prior to making inferences. For reliable inferences, it is important to make adjustments that account for the model selection process, resulting in selective inferences. Our paper introduces an asymptotic pivot for inference about the effects of selected variables on conditional quantile functions. Utilizing estimators from smoothed quantile regression, our proposed pivot is easy to compute and ensures asymptotically exact selective inferences without strict distributional assumptions on the response variable. At the core of the pivot is the use of external randomization, which enables us to use the full sample for both selection and inference without partitioning the data into independent subsets or discarding data at either step. On simulated data, we find that: (i) the asymptotic confidence intervals based on our pivot achieve the desired coverage rates, even in cases where sample splitting fails due to insufficient sample size for inference; (ii) our intervals are consistently shorter than those produced by sample splitting across various models and signal settings. We report similar findings when we apply our approach to study risk factors for low birth weight in a publicly accessible dataset of US birth records from 2022.
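To make the two ingredients named above concrete, here is a minimal sketch of (i) convolution-smoothed quantile regression and (ii) an externally randomized selection step. It is purely illustrative: the Gaussian smoothing kernel, the bandwidth `h`, the randomization scale, and the selection threshold are all our own assumptions, and the pivot that corrects the subsequent inference is not shown.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)
n, p, tau = 500, 10, 0.5                 # sample size, dimension, quantile level
X = rng.standard_normal((n, p))
y = X[:, 0] - 2 * X[:, 1] + rng.standard_normal(n)

def smoothed_quantile_loss(beta, h=0.5):
    """Check loss rho_tau convolved with a Gaussian kernel of bandwidth h."""
    r = y - X @ beta
    return np.mean(r * (tau - norm.cdf(-r / h)) + h * norm.pdf(r / h))

beta_hat = minimize(smoothed_quantile_loss, np.zeros(p)).x

# External randomization: perturb the estimate with analyst-injected noise
# before selecting variables, so the same sample can be reused for inference.
# (The pivot that corrects for this selection event is beyond this sketch.)
omega = rng.normal(scale=0.1, size=p)    # randomization scale is an assumption
selected = np.flatnonzero(np.abs(beta_hat + omega) > 0.2)
print("selected variables:", selected)
```

Because omega is independent noise injected by the analyst, the same observations can inform both the selection event and the pivot-based confidence intervals, which is the point of the randomization.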

Related Content

Molecular communication is a bio-inspired communication paradigm in which molecules are used as the information carrier. This paper considers a molecular communication network where the transmitter uses concentration-modulated signals for communication. Our focus is to design receivers that can demodulate these signals. We impose three requirements on our receivers: they should use enzymatic cycles as their building blocks, have high input impedance, and work approximately as maximum a posteriori (MAP) demodulators. No receiver with all three of these features exists in the current molecular communication literature. We consider enzymatic cycles because they are a very common class of chemical reactions found in living cells. Since a receiver is to be placed in the communication environment, it should ideally have a high input impedance so that it has minimal impact on the environment and on other receivers. Lastly, a MAP receiver has good statistical performance. In this paper, we show how time-scale separation can be used to give an enzymatic cycle high input impedance, and how the parameters of the enzymatic cycle can be chosen so that the receiver approximately implements a MAP demodulator. We use simulation to study the performance of this receiver. In particular, we consider an environment with multiple receivers and show that a receiver has little impact on the bit error ratio of a nearby receiver because both have high input impedance.
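As a statistical reference point (independent of any chemical implementation), the following sketch shows what a MAP demodulator for concentration-modulated on-off signaling computes, under the assumed model that the receiver's molecule count in a symbol interval is Poisson with a bit-dependent mean; the priors and mean counts are illustrative, not taken from the paper.

```python
import numpy as np
from scipy.stats import poisson

prior = {0: 0.5, 1: 0.5}         # prior bit probabilities (assumed)
mean_count = {0: 20.0, 1: 60.0}  # expected molecule counts per bit (assumed)

def map_demodulate(observed_count):
    """Return the bit with the largest posterior probability."""
    post = {b: poisson.logpmf(observed_count, mean_count[b]) + np.log(prior[b])
            for b in (0, 1)}
    return max(post, key=post.get)

rng = np.random.default_rng(1)
bits = rng.integers(0, 2, size=10_000)
counts = rng.poisson([mean_count[b] for b in bits])
decoded = np.array([map_demodulate(c) for c in counts])
print("bit error ratio:", np.mean(decoded != bits))
```

The paper's contribution is to realize a computation of this kind with enzymatic cycles; the sketch only pins down the decision rule such a receiver approximates.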

Multiple imputation (MI) models can be improved by including auxiliary covariates (AC), but their performance in high-dimensional data is not well understood. We aimed to develop and compare high-dimensional MI (HDMI) approaches using structured and natural language processing (NLP)-derived AC in studies with partially observed confounders. We conducted a plasmode simulation study using data from opioid vs. non-steroidal anti-inflammatory drug (NSAID) initiators (X) with observed serum creatinine labs (Z2) and time-to-acute kidney injury as the outcome. We simulated 100 cohorts with a null treatment effect, including X, Z2, atrial fibrillation (U), and 13 other investigator-derived confounders (Z1) in the outcome generation. We then imposed missingness (MZ2) on 50% of Z2 measurements as a function of Z2 and U, and created different HDMI candidate AC using structured and NLP-derived features. We mimicked scenarios where U was unobserved by omitting it from all AC candidate sets. Using LASSO, we data-adaptively selected HDMI covariates associated with Z2 and MZ2 for MI, and with U for inclusion in propensity score models. The treatment effect was estimated following propensity score matching in the MI datasets, and we benchmarked the HDMI approaches against a baseline imputation and a complete-case analysis with Z1 only. HDMI using claims data showed the lowest bias (0.072). Combining claims data with sentence embeddings improved efficiency, yielding the lowest root-mean-squared error (0.173) with 94% coverage. NLP-derived AC alone did not perform better than baseline MI. HDMI approaches may decrease bias in studies with partially observed confounders where missingness depends on unobserved factors.
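The AC-selection step described above might look as follows in outline. This is a schematic sketch on synthetic data: the cohort size, penalties, and the use of scikit-learn's cross-validated lasso are our assumptions, not the study's exact specification.

```python
import numpy as np
from sklearn.linear_model import LassoCV, LogisticRegressionCV

rng = np.random.default_rng(0)
n, p = 2000, 200
AC = rng.standard_normal((n, p))                     # candidate auxiliary covariates
z2 = AC[:, 0] - AC[:, 1] + rng.standard_normal(n)    # confounder (before masking)
mz2 = rng.binomial(1, 1 / (1 + np.exp(-AC[:, 2])))   # missingness indicator

# Features predictive of Z2, fit on the rows where Z2 is observed.
obs = mz2 == 0
lasso_z2 = LassoCV(cv=5).fit(AC[obs], z2[obs])
keep_z2 = np.flatnonzero(lasso_z2.coef_ != 0)

# Features predictive of the missingness mechanism itself.
lasso_mz2 = LogisticRegressionCV(penalty="l1", solver="saga", cv=5,
                                 max_iter=2000).fit(AC, mz2)
keep_mz2 = np.flatnonzero(lasso_mz2.coef_.ravel() != 0)

# The union of selected ACs enters the multiple-imputation model for Z2.
mi_covariates = np.union1d(keep_z2, keep_mz2)
print(len(mi_covariates), "auxiliary covariates selected for imputation")
```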

When modeling scientific and industrial problems, geometries are typically described by explicit boundary representations obtained from computer-aided design software. Unfitted (also known as embedded or immersed) finite element methods offer a significant advantage in dealing with complex geometries, eliminating the need to generate unstructured body-fitted meshes. However, current unfitted finite elements on nonlinear geometries are restricted to implicit (possibly high-order) level set geometries. In this work, we introduce a novel automatic computational pipeline to approximate solutions of partial differential equations on domains defined by explicit nonlinear boundary representations. For the geometrical discretization, we propose a novel algorithm to generate quadratures for the bulk and surface integration on nonlinear polytopes required to compute all the terms in unfitted finite element methods. The algorithm relies on a nonlinear triangulation of the boundary, a kd-tree refinement of the surface cells that reduces the nonlinear intersections of surface and background cells to simple cases diffeomorphically equivalent to linear intersections, robust polynomial root-finding algorithms, and surface parameterization techniques. We prove the correctness of the proposed algorithm. We have successfully applied this algorithm to simulate partial differential equations with unfitted finite elements on nonlinear domains described by computer-aided design models, demonstrating the robustness of the geometric algorithm and the high-order accuracy of the overall method.
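To give a flavor of the quadrature-generation ingredient, the sketch below builds a bulk quadrature for the part of a background cell cut off by an implicitly defined nonlinear boundary, using polynomial root finding along vertical quadrature lines. It is a generic cut-cell construction on a hand-picked circle, far simpler than the paper's algorithm for general boundary representations.

```python
import numpy as np

# Implicit boundary phi(x, y) = x^2 + y^2 - r^2 = 0; keep {phi < 0} in [0,1]^2.
r = 0.8
gl_x, gl_w = np.polynomial.legendre.leggauss(6)   # Gauss-Legendre on [-1, 1]

def height(x):
    """Upper limit of the kept region on the vertical line at x: the real root
    in [0, 1] of phi(x, .) viewed as a polynomial in y, clipped to the cell."""
    roots = np.polynomial.polynomial.polyroots([x * x - r * r, 0.0, 1.0])
    real = roots[np.isreal(roots)].real
    inside = real[(real >= 0.0) & (real <= 1.0)]
    if inside.size:
        return inside[0]
    return 1.0 if x * x < r * r else 0.0          # line fully inside / outside

pts, wts = [], []
for xi, wx in zip(0.5 * (gl_x + 1), 0.5 * gl_w):  # map nodes to [0, 1]
    h = height(xi)
    for eta, wy in zip(0.5 * (gl_x + 1), 0.5 * gl_w):
        pts.append((xi, h * eta))                 # stretch the 1D rule to [0, h]
        wts.append(wx * wy * h)
print("quarter-disc area ~", sum(wts), "exact:", np.pi * r * r / 4)
```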

Robust optimisation is a well-established framework for optimising functions in the presence of uncertainty. The goal of this problem is to identify a collection of inputs whose outputs are both desirable to the decision maker and robust to the underlying uncertainties in the problem. In this work, we study the multi-objective extension of this problem from a computational standpoint. We observe that the majority of robust multi-objective algorithms rely on two key operations: robustification and scalarisation. Robustification refers to the strategy used to marginalise over the uncertainty in the problem, whilst scalarisation refers to the procedure used to encode the relative importance of each objective. Because these operations are not necessarily commutative, the order in which they are performed affects the solutions that are identified and the final decisions that are made. This work aims to give an exposition of the philosophical differences between these two operations and to highlight when one should opt for one ordering over the other. As part of our analysis, we showcase how many existing risk concepts can be easily integrated into the specification and solution of a robust multi-objective optimisation problem. Besides this, we also demonstrate how one can define, in a principled way, the notions of a robust Pareto front and a robust performance metric based on our robustify-and-scalarise methodology. To illustrate the efficacy of these new ideas, we present two numerical case studies based on real-world data sets.
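A small numerical example makes the non-commutativity tangible. Below, robustification is taken to be the expectation over a set of uncertainty scenarios and scalarisation the weighted Chebyshev (max) rule; both choices are ours for illustration. By Jensen's inequality, scalarising before robustifying can only increase the resulting value relative to the reverse order.

```python
import numpy as np

rng = np.random.default_rng(0)
weights = np.array([0.5, 0.5])
# F[u, j] holds the j-th objective value of one fixed design under scenario u.
F = rng.normal(size=(6, 2))

# Order 1: robustify (average over scenarios) first, then scalarise (max).
robustify_then_scalarise = np.max(weights * F.mean(axis=0))

# Order 2: scalarise each scenario first, then robustify (average the maxima).
scalarise_then_robustify = np.mean((weights * F).max(axis=1))

# By Jensen's inequality, order 2 >= order 1, and typically strictly so.
print(robustify_then_scalarise, scalarise_then_robustify)
```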

Missing values have been thoroughly analyzed in the context of linear models, where the final aim is to estimate coefficients. However, estimating coefficients does not directly solve the problem of prediction with missing entries: a way to handle the missing components must be designed. Major approaches to prediction with missing values are empirically driven and fall into two families: imputation (filling in empty fields) and pattern-by-pattern prediction, where a predictor is built for each missingness pattern. Unfortunately, most simple imputation techniques used in practice (such as constant imputation) are not consistent when combined with linear models. In this paper, we focus on the more flexible pattern-by-pattern approaches and study their predictive performance on Missing Completely At Random (MCAR) data. We first show that a pattern-by-pattern logistic regression model is intrinsically ill-defined, implying that even classical logistic regression cannot be applied to missing data. We then analyze the perceptron model and show how the linear separability property extends to partially observed inputs. Finally, we use Linear Discriminant Analysis (LDA) to prove that pattern-by-pattern LDA is consistent in a high-dimensional regime, and we refine our analysis for more complex Missing Not At Random (MNAR) data.
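For concreteness, here is a compact sketch of the pattern-by-pattern strategy with LDA under MCAR: one classifier per missingness pattern, trained only on the coordinates observed under that pattern. The data-generating process, the masking rate, and the fallback prediction for unusable patterns are all illustrative assumptions.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
n, d = 4000, 5
X = rng.standard_normal((n, d))
y = rng.integers(0, 2, size=n)
X[y == 1] += 1.0                           # class-dependent mean shift
mask = rng.random((n, d)) < 0.2            # MCAR: each entry missing w.p. 0.2
X[mask] = np.nan

models = {}
for pattern in {tuple(row) for row in np.isnan(X)}:
    rows = (np.isnan(X) == pattern).all(axis=1)
    obs = ~np.array(pattern)
    # Fit only when some coordinate is observed and both classes are present.
    if obs.any() and len(np.unique(y[rows])) == 2:
        models[pattern] = LinearDiscriminantAnalysis().fit(
            X[np.ix_(rows, obs)], y[rows])

def predict(x):
    pattern = tuple(np.isnan(x))
    obs = ~np.array(pattern)
    model = models.get(pattern)
    # Fallback to class 0 (an arbitrary choice) for unusable patterns.
    return model.predict(x[obs][None, :])[0] if model is not None else 0

print("train accuracy:", np.mean([predict(x) == t for x, t in zip(X, y)]))
```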

The Lamport diagram is a pervasive and intuitive tool for informal reasoning about "happens-before" relationships in a concurrent system. However, traditional axiomatic formalizations of Lamport diagrams can be painful to work with in a mechanized setting like Agda. We propose an alternative, inductive formalization -- the causal separation diagram (CSD) -- that takes inspiration from string diagrams and concurrent separation logic, but enjoys a graphical syntax similar to Lamport diagrams. Critically, CSDs are based on the idea that causal relationships between events are witnessed by the paths that information follows between them. To that end, we model happens-before as a dependent type of paths between events. The inductive formulation of CSDs enables their interpretation into a variety of semantic domains. We demonstrate the interpretability of CSDs with a case study on properties of logical clocks, widely used mechanisms for reifying causal relationships as data. We carry out this study by implementing a series of interpreters for CSDs, culminating in a generic proof of Lamport's clock condition that is parametric in a choice of clock. We instantiate this proof on Lamport's scalar clock, on Mattern's vector clock, and on the matrix clocks of Raynal et al. and of Wuu and Bernstein, yielding verified implementations of each. The CSD formalism and our case study are mechanized in the Agda proof assistant.
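For orientation, the clock condition verified in the case study can be stated over the textbook scalar clock. The sketch below is ordinary Python rather than the paper's Agda development, and it only illustrates the property (happens-before implies a strictly smaller clock value), not its generic CSD-based proof.

```python
class Process:
    def __init__(self, pid):
        self.pid, self.clock = pid, 0

    def local_event(self):
        self.clock += 1
        return (self.pid, self.clock)

    def send(self):
        self.clock += 1
        return self.clock                  # timestamp travels with the message

    def receive(self, msg_ts):
        self.clock = max(self.clock, msg_ts) + 1
        return (self.pid, self.clock)

p, q = Process("p"), Process("q")
e1 = p.local_event()   # e1 on p
ts = p.send()          # p sends a message to q ...
e2 = q.receive(ts)     # ... so e1 happens-before e2
assert e1[1] < e2[1], "clock condition: happens-before implies smaller clock"
print(e1, e2)
```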

As medical devices become more complex, they routinely collect extensive and complicated data. While classical regression typically examines the relationship between an outcome and a vector of predictors, it becomes imperative to characterize the relationship with predictors that possess functional structure. In this article, we introduce a novel inference procedure for examining the relationship between outcomes and large-scale functional predictors. We target testing linear hypotheses on the functional parameters under the generalized functional linear regression framework, where the number of functional parameters grows with the sample size. We develop an estimation procedure for the high-dimensional generalized functional linear model that incorporates B-spline functional approximation and amenable regularization. Furthermore, we construct a procedure that can test local alternative hypotheses on linear combinations of the functional parameters. We establish statistical guarantees in terms of non-asymptotic convergence of the parameter estimates, the oracle property, and asymptotic normality of the estimators. Moreover, we derive the asymptotic distribution of the test statistic. We carry out intensive simulations and illustrate the method with a new dataset from an Alzheimer's disease magnetoencephalography study.
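In outline, the estimation pipeline reduces to expanding each functional predictor in a B-spline basis and fitting a penalized GLM on the resulting basis scores. The sketch below uses an ordinary lasso as a stand-in for the paper's amenable regularizer (a group penalty respecting the basis blocks would be closer in spirit), and all dimensions are illustrative.

```python
import numpy as np
from scipy.interpolate import BSpline
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, n_funcs, n_grid, n_basis = 300, 20, 50, 6
t = np.linspace(0, 1, n_grid)

# Cubic B-spline basis evaluated on the observation grid.
knots = np.concatenate(([0.0] * 3, np.linspace(0, 1, n_basis - 2), [1.0] * 3))
basis = BSpline.design_matrix(t, knots, 3).toarray()     # (n_grid, n_basis)

# X[i, j, :] is the j-th functional predictor of subject i on the grid t.
X = rng.standard_normal((n, n_funcs, n_grid))
scores = np.einsum("ijt,tb->ijb", X, basis) / n_grid     # ~ integral X_ij B_b
Z = scores.reshape(n, n_funcs * n_basis)                 # design of basis scores

# Binary outcome driven by the first functional predictor only.
y = (10 * Z[:, :n_basis].sum(axis=1)
     + 0.5 * rng.standard_normal(n) > 0).astype(int)

fit = LogisticRegression(penalty="l1", solver="liblinear", C=0.5).fit(Z, y)
active = np.flatnonzero(fit.coef_.ravel() != 0) // n_basis
print("functional predictors with nonzero coefficient blocks:",
      np.unique(active))
```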

In many modern regression applications, the response consists of multiple categorical random variables whose probability mass is a function of a common set of predictors. In this article, we propose a new method for modeling such a probability mass function in settings where the number of response variables, the number of categories per response, and the dimension of the predictor are all large. Our method relies on a functional probability tensor decomposition: a decomposition of a tensor-valued function such that its range is a restricted set of low-rank probability tensors. This decomposition is motivated by the connection between the conditional independence of responses, or lack thereof, and their probability tensor rank. We show that the model implied by such a low-rank functional probability tensor decomposition can be interpreted in terms of a mixture of regressions and can thus be fit using maximum likelihood. We derive an efficient and scalable penalized expectation-maximization algorithm to fit this model and examine its statistical properties. We demonstrate the encouraging performance of our method through both simulation studies and an application to modeling the functional classes of genes.
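Stripping away the covariate dependence, the latent structure behind a low-rank probability tensor admits a short EM sketch: responses are conditionally independent given a latent class, so P(y_1,...,y_q) = sum_k pi_k prod_j p_jk(y_j). The paper's model additionally lets these quantities vary with the predictor x (a mixture of regressions) and adds a penalty; both are omitted here for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
n, q, C, K = 5000, 3, 4, 2            # samples, responses, categories, rank

# Simulate from the latent-class model.
true_pi = np.array([0.6, 0.4])
true_p = rng.dirichlet(np.ones(C), size=(q, K))   # true_p[j, k] is a C-vector
z = rng.choice(K, size=n, p=true_pi)
Y = np.array([[rng.choice(C, p=true_p[j, z[i]]) for j in range(q)]
              for i in range(n)])

# EM from a random start.
pi = np.full(K, 1.0 / K)
p = rng.dirichlet(np.ones(C), size=(q, K))
for _ in range(200):
    # E-step: responsibilities r[i, k] proportional to pi_k prod_j p_jk(Y_ij).
    log_r = np.log(pi) + sum(np.log(p[j].T[Y[:, j]]) for j in range(q))
    log_r -= log_r.max(axis=1, keepdims=True)
    r = np.exp(log_r)
    r /= r.sum(axis=1, keepdims=True)
    # M-step: mixing weights and per-response category probabilities.
    pi = r.mean(axis=0)
    for j in range(q):
        for c in range(C):
            p[j, :, c] = r[Y[:, j] == c].sum(axis=0)
        p[j] /= p[j].sum(axis=1, keepdims=True)

print("true mixing weights:     ", np.sort(true_pi))
print("estimated mixing weights:", np.round(np.sort(pi), 3))
```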

We extend the scope of a recently introduced dependence coefficient between a scalar response $Y$ and a multivariate covariate $X$ to the case where $X$ takes values in a general metric space. Particular attention is paid to the case where $X$ is a curve. While this extension is straightforward at the population level, the asymptotic behavior of the estimator we consider is delicate. It crucially depends on the nearest-neighbor structure of the infinite-dimensional covariate sample, where the deterministic bounds on the degrees of the nearest-neighbor graphs available in multivariate settings no longer exist. The main contribution of this paper is to give some insight into this matter and to suggest a way to overcome the problem for our purposes. As an important application of our results, we consider an independence test.
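As one concrete reading of the estimator (treat the exact statistic as our assumption, in the style of the Azadkia-Chatterjee coefficient), the sketch below computes a nearest-neighbor dependence coefficient in which the only metric-space ingredient is the pairwise distance matrix between covariate curves.

```python
import numpy as np
from scipy.spatial.distance import cdist

def dependence_coefficient(dist, y):
    """dist: (n, n) pairwise covariate distances; y: scalar responses."""
    n = len(y)
    dist = dist.copy()
    np.fill_diagonal(dist, np.inf)
    M = dist.argmin(axis=1)                      # nearest neighbor of each X_i
    R = np.array([np.sum(y <= yi) for yi in y])  # R_i = #{j : y_j <= y_i}
    L = np.array([np.sum(y >= yi) for yi in y])  # L_i = #{j : y_j >= y_i}
    num = np.sum(n * np.minimum(R, R[M]) - L ** 2)
    den = np.sum(L * (n - L))
    return num / den

rng = np.random.default_rng(0)
n, grid = 400, np.linspace(0, 1, 100)
A = rng.standard_normal(n)
X = np.sin(2 * np.pi * np.outer(A, grid))       # covariates are curves
y = A ** 2 + 0.05 * rng.standard_normal(n)      # y is (nearly) a function of X
D = cdist(X, X)                                  # L2 distance on the grid
print("coefficient:", dependence_coefficient(D, y))  # should be close to 1
```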

Regularized models have been applied in many areas, and are especially popular for high-dimensional data sets. Because the tuning parameter determines the theoretical performance and computational efficiency of a regularized model, tuning parameter selection is a basic and important issue. We consider tuning parameter selection for adaptive nuclear norm regularized trace regression, carried out via a Bayesian information criterion (BIC). The proposed BIC is established with the help of an unbiased estimator of the degrees of freedom. Under some regularity conditions, this BIC is proved to achieve rank consistency of the tuning parameter selection; that is, the solution under the selected tuning parameter converges to the true solution and, with probability tending to one, has the same rank as the true solution. Some numerical results are presented to evaluate the performance of the proposed BIC for tuning parameter selection.
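A schematic version of the procedure: fit the nuclear-norm-penalized trace regression over a grid of tuning parameters and pick the one minimizing a BIC built from a degrees-of-freedom term. The sketch uses proximal gradient with singular-value soft-thresholding for the fit, and the rank-based parameter count r(p1 + p2 - r) as a crude proxy for the paper's unbiased degrees-of-freedom estimator.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p1, p2 = 400, 10, 10
B_true = rng.standard_normal((p1, 2)) @ rng.standard_normal((2, p2))  # rank 2
X = rng.standard_normal((n, p1, p2))
y = np.einsum("ijk,jk->i", X, B_true) + rng.standard_normal(n)

def fit(lam, steps=500, lr=0.3):
    """ISTA for (1/2n)||y - <X, B>||^2 + lam * ||B||_* ."""
    B = np.zeros((p1, p2))
    for _ in range(steps):
        resid = np.einsum("ijk,jk->i", X, B) - y
        grad = np.einsum("i,ijk->jk", resid, X) / n
        U, s, Vt = np.linalg.svd(B - lr * grad, full_matrices=False)
        B = U @ np.diag(np.maximum(s - lr * lam, 0.0)) @ Vt
    return B

def bic(lam):
    B = fit(lam)
    rss = np.sum((np.einsum("ijk,jk->i", X, B) - y) ** 2)
    r = np.linalg.matrix_rank(B, tol=1e-6)
    df = r * (p1 + p2 - r)                 # rank-based df proxy (assumption)
    return n * np.log(rss / n) + np.log(n) * df, r

scores = {lam: bic(lam) for lam in [0.02, 0.05, 0.1, 0.2, 0.5, 1.0]}
best = min(scores, key=lambda lam: scores[lam][0])
print("selected lambda:", best, "selected rank:", scores[best][1])
```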
