99热日韩这里只有国产中文精品,高中小鲜肉自慰GAY免费,国产一产区二产区,国产男女无套内谢免费视频,天堂8中文在线官网

The case-cohort design is a commonly used cost-effective sampling strategy for large cohort studies, where some covariates are expensive to measure or obtain. In this paper, we consider regression analysis under a case-cohort study with interval-censored failure time data, where the failure time is only known to fall within an interval instead of being exactly observed. A common approach to analyze data from a case-cohort study is the inverse probability weighting approach, where only subjects in the case-cohort sample are used in estimation, and the subjects are weighted based on the probability of inclusion into the case-cohort sample. This approach, though consistent, is generally inefficient as it does not incorporate information outside the case-cohort sample. To improve efficiency, we first develop a sieve maximum weighted likelihood estimator under the Cox model based on the case-cohort sample, and then propose a procedure to update this estimator by using information in the full cohort. We show that the update estimator is consistent, asymptotically normal, and more efficient than the original estimator. The proposed method can flexibly incorporate auxiliary variables to further improve estimation efficiency. We employ a weighted bootstrap procedure for variance estimation. Simulation results indicate that the proposed method works well in practical situations. A real study on diabetes is provided for illustration.

相關內容

估計/估計量

關注 3

查準率/準確率 · 估計/估計量 · Learning · INFORMS · 流 ·

2023 年 12 月 8 日

Precision estimation and second-order errors in cortical circuits

Arno Granier,Mihai A. Petrovici,Walter Senn,Katharina A. Wilmes

Minimization of cortical prediction errors has been considered a key computational goal of the cerebral cortex underlying perception, action and learning. However, it is still unclear how the cortex should form and use information about uncertainty in this process of prediction error minimization. Here we derive neural dynamics that minimize prediction errors under the assumption that cortical areas must not only predict the activity in other areas and sensory streams, but also jointly estimate the precision of their predictions. This results in a dynamic modulatory balancing of cortical streams based on context-dependent precision estimates. Moreover, the theory predicts the existence of cortical second-order errors, comparing estimated and actual precision, propagated through the cortical hierarchy alongside classical prediction errors. These second-order errors are used to learn weights of synapses responsible for precision estimation through an error-correcting synaptic learning rule. Finally, we propose a detailed mapping of the theory to cortical circuitry.

估計/估計量 · 得分 · Analysis · SimPLe · 線性的 ·

2023 年 12 月 7 日

A simple sensitivity analysis method for unmeasured confounders via linear programming with estimating equation constraints

Chengyao Tang,Yi Zhou,Ao Huang,Satoshi Hattori

from arxiv, 16 pages, 5 tables, 2 figures

In estimating the average treatment effect in observational studies, the influence of confounders should be appropriately addressed. To this end, the propensity score is widely used. If the propensity scores are known for all the subjects, bias due to confounders can be adjusted by using the inverse probability weighting (IPW) by the propensity score. Since the propensity score is unknown in general, it is usually estimated by the parametric logistic regression model with unknown parameters estimated by solving the score equation under the strongly ignorable treatment assignment (SITA) assumption. Violation of the SITA assumption and/or misspecification of the propensity score model can cause serious bias in estimating the average treatment effect. To relax the SITA assumption, the IPW estimator based on the outcome-dependent propensity score has been successfully introduced. However, it still depends on the correctly specified parametric model and its identification. In this paper, we propose a simple sensitivity analysis method for unmeasured confounders. In the standard practice, the estimating equation is used to estimate the unknown parameters in the parametric propensity score model. Our idea is to make inference on the average causal effect by removing restrictive parametric model assumptions while still utilizing the estimating equation. Using estimating equations as constraints, which the true propensity scores asymptotically satisfy, we construct the worst-case bounds for the average treatment effect with linear programming. Different from the existing sensitivity analysis methods, we construct the worst-case bounds with minimal assumptions. We illustrate our proposal by simulation studies and a real-world example.

統計量 · 有偏 · 估計/估計量 · 查準率/準確率 · 設計 ·

2023 年 12 月 7 日

Test-negative designs with various reasons for testing: statistical bias and solution

Mengxin Yu,Kendrick Qijun Li,Nicholas Jewell,Eric Tchetgen Tchetgen,Dylan Small,Xu Shi,Bingkai Wang

Test-negative designs are widely used for post-market evaluation of vaccine effectiveness. Different from classical test-negative designs where only healthcare-seekers with symptoms are included, recent test-negative designs have involved individuals with various reasons for testing, especially in an outbreak setting. While including these data can increase sample size and hence improve precision, concerns have been raised about whether they will introduce bias into the current framework of test-negative designs, thereby demanding a formal statistical examination of this modified design. In this article, using statistical derivations, causal graphs, and numerical simulations, we show that the standard odds ratio estimator may be biased if various reasons for testing are not accounted for. To eliminate this bias, we identify three categories of reasons for testing, including symptoms, disease-unrelated reasons, and case contact tracing, and characterize associated statistical properties and estimands. Based on our characterization, we propose stratified estimators that can incorporate multiple reasons for testing to achieve consistent estimation and improve precision by maximizing the use of data. The performance of our proposed method is demonstrated through simulation studies.

Analysis · 蒙特卡羅 · 統計量 · 線性的 · Excel ·

2023 年 12 月 6 日

Analysis and preconditioning of a probabilistic domain decomposition algorithm for elliptic boundary value problems

Francisco Bernal,Jorge Morón-Vidal

PDDSparse is a new hybrid parallelisation scheme for solving large-scale elliptic boundary value problems on supercomputers, which can be described as a Feynman-Kac formula for domain decomposition. At its core lies a stochastic linear, sparse system for the solutions on the interfaces, whose entries are generated via Monte Carlo simulations. Assuming small statistical errors, we show that the random system matrix ${\tilde G}(\omega)$ is near a nonsingular M-matrix $G$, i.e. ${\tilde G}(\omega)+E=G$ where $||E||/||G||$ is small. Using nonstandard arguments, we bound $||G^{-1}||$ and the condition number of $G$, showing that both of them grow moderately with the degrees of freedom of the discretisation. Moreover, the truncated Neumann series of $G^{-1}$ -- which is straightforward to calculate -- is the basis for an excellent preconditioner for ${\tilde G}(\omega)$. These findings are supported by numerical evidence.

估計/估計量 · GROUP · 分解的 · MoDELS · 流 ·

2023 年 12 月 6 日

A tight lower bound on non-adaptive group testing estimation

Nader H. Bshouty,Tsun-Ming Cheung,Gergely Harcos,Hamed Hatami,Anthony Ostuni

from arxiv, This work is a merger of arXiv:2309.09613 and arXiv:2309.10286

Efficiently counting or detecting defective items is a crucial task in various fields ranging from biological testing to quality control to streaming algorithms. The \emph{group testing estimation problem} concerns estimating the number of defective elements $d$ in a collection of $n$ total within a given factor. We primarily consider the classical query model, in which a query reveals whether the selected group of elements contains a defective one. We show that any non-adaptive randomized algorithm that estimates the value of $d$ within a constant factor requires $\Omega(\log n)$ queries. This confirms that a known $O(\log n)$ upper bound by Bshouty (2019) is tight and resolves a conjecture by Damaschke and Sheikh Muhammad (2010). Additionally, we prove similar matching upper and lower bounds in the threshold query model.

流形 · 正則化項 · 線性的 · 動力系統 · Weight ·

2023 年 12 月 6 日

Autoencoders for discovering manifold dimension and coordinates in data from complex dynamical systems

Kevin Zeng,Carlos E. Pérez De Jesús,Andrew J. Fox,Michael D. Graham

While many phenomena in physics and engineering are formally high-dimensional, their long-time dynamics often live on a lower-dimensional manifold. The present work introduces an autoencoder framework that combines implicit regularization with internal linear layers and $L_2$ regularization (weight decay) to automatically estimate the underlying dimensionality of a data set, produce an orthogonal manifold coordinate system, and provide the mapping functions between the ambient space and manifold space, allowing for out-of-sample projections. We validate our framework's ability to estimate the manifold dimension for a series of datasets from dynamical systems of varying complexities and compare to other state-of-the-art estimators. We analyze the training dynamics of the network to glean insight into the mechanism of low-rank learning and find that collectively each of the implicit regularizing layers compound the low-rank representation and even self-correct during training. Analysis of gradient descent dynamics for this architecture in the linear case reveals the role of the internal linear layers in leading to faster decay of a "collective weight variable" incorporating all layers, and the role of weight decay in breaking degeneracies and thus driving convergence along directions in which no decay would occur in its absence. We show that this framework can be naturally extended for applications of state-space modeling and forecasting by generating a data-driven dynamic model of a spatiotemporally chaotic partial differential equation using only the manifold coordinates. Finally, we demonstrate that our framework is robust to hyperparameter choices.

Networking · 生成器網絡 · 情景 · 推斷 · 估計/估計量 ·

2023 年 12 月 6 日

Design-based inference for generalized network experiments with stochastic interventions

Ambarish Chattopadhyay,Kosuke Imai,Jose R. Zubizarreta

A growing number of scholars and data scientists are conducting randomized experiments to analyze causal relationships in network settings where units influence one another. A dominant methodology for analyzing these network experiments has been design-based, leveraging randomization of treatment assignment as the basis for inference. In this paper, we generalize this design-based approach so that it can be applied to more complex experiments with a variety of causal estimands with different target populations. An important special case of such generalized network experiments is a bipartite network experiment, in which the treatment assignment is randomized among one set of units and the outcome is measured for a separate set of units. We propose a broad class of causal estimands based on stochastic intervention for generalized network experiments. Using a design-based approach, we show how to estimate the proposed causal quantities without bias, and develop conservative variance estimators. We apply our methodology to a randomized experiment in education where a group of selected students in middle schools are eligible for the anti-conflict promotion program, and the program participation is randomized within this group. In particular, our analysis estimates the causal effects of treating each student or his/her close friends, for different target populations in the network. We find that while the treatment improves the overall awareness against conflict among students, it does not significantly reduce the total number of conflicts.

MoDELS · 優化器 · Minimax · 類別 · 預測器/決策函數 ·

2023 年 12 月 5 日

T-Cal: An optimal test for the calibration of predictive models

Donghwan Lee,Xinmeng Huang,Hamed Hassani,Edgar Dobriban

from arxiv, The implementation of T-Cal is available at //github.com/dh7401/T-Cal

The prediction accuracy of machine learning methods is steadily increasing, but the calibration of their uncertainty predictions poses a significant challenge. Numerous works focus on obtaining well-calibrated predictive models, but less is known about reliably assessing model calibration. This limits our ability to know when algorithms for improving calibration have a real effect, and when their improvements are merely artifacts due to random noise in finite datasets. In this work, we consider detecting mis-calibration of predictive models using a finite validation dataset as a hypothesis testing problem. The null hypothesis is that the predictive model is calibrated, while the alternative hypothesis is that the deviation from calibration is sufficiently large. We find that detecting mis-calibration is only possible when the conditional probabilities of the classes are sufficiently smooth functions of the predictions. When the conditional class probabilities are H\"older continuous, we propose T-Cal, a minimax optimal test for calibration based on a debiased plug-in estimator of the $\ell_2$-Expected Calibration Error (ECE). We further propose Adaptive T-Cal, a version that is adaptive to unknown smoothness. We verify our theoretical findings with a broad range of experiments, including with several popular deep neural net architectures and several standard post-hoc calibration methods. T-Cal is a practical general-purpose tool, which -- combined with classical tests for discrete-valued predictors -- can be used to test the calibration of virtually any probabilistic classification method.

哈希學習 · 圖像檢索 · 卷積神經網絡 · 優化器 · 損失函數（機器學習） ·

2020 年 6 月 10 日

A survey on deep hashing for image retrieval

Xiaopeng Zhang

Hashing has been widely used in approximate nearest search for large-scale database retrieval for its computation and storage efficiency. Deep hashing, which devises convolutional neural network architecture to exploit and extract the semantic information or feature of images, has received increasing attention recently. In this survey, several deep supervised hashing methods for image retrieval are evaluated and I conclude three main different directions for deep supervised hashing methods. Several comments are made at the end. Moreover, to break through the bottleneck of the existing hashing methods, I propose a Shadow Recurrent Hashing(SRH) method as a try. Specifically, I devise a CNN architecture to extract the semantic features of images and design a loss function to encourage similar images projected close. To this end, I propose a concept: shadow of the CNN output. During optimization process, the CNN output and its shadow are guiding each other so as to achieve the optimal solution as much as possible. Several experiments on dataset CIFAR-10 show the satisfying performance of SRH.

圖形處理器 · 圖 · INTERACT · Performer · Neural Networks ·

2019 年 11 月 6 日

Hyper-SAGNN: a self-attention based graph neural network for hypergraphs

Ruochi Zhang,Yuesong Zou,Jian Ma

Graph representation learning for hypergraphs can be used to extract patterns among higher-order interactions that are critically important in many real world problems. Current approaches designed for hypergraphs, however, are unable to handle different types of hypergraphs and are typically not generic for various learning tasks. Indeed, models that can predict variable-sized heterogeneous hyperedges have not been available. Here we develop a new self-attention based graph neural network called Hyper-SAGNN applicable to homogeneous and heterogeneous hypergraphs with variable hyperedge sizes. We perform extensive evaluations on multiple datasets, including four benchmark network datasets and two single-cell Hi-C datasets in genomics. We demonstrate that Hyper-SAGNN significantly outperforms the state-of-the-art methods on traditional tasks while also achieving great performance on a new task called outsider identification. Hyper-SAGNN will be useful for graph representation learning to uncover complex higher-order interactions in different applications.