
For the analysis of a time-to-event endpoint in a single-arm or randomized clinical trial, it is generally perceived that the interpretation of a given estimate of the survival function, or of a comparison between two groups, hinges on some quantification of the amount of follow-up. Typically, a median of some loosely defined quantity is reported. However, whatever median is reported typically does not answer the question(s) trialists actually have in terms of follow-up quantification. In this paper, inspired by the estimand framework, we formulate a comprehensive list of relevant scientific questions that trialists have when reporting time-to-event data. We illustrate how these questions should be answered, and show that reference to an unclearly defined follow-up quantity is not needed at all. In drug development, key decisions are made based on randomized controlled trials, and we therefore also discuss the relevant scientific questions not only for a time-to-event endpoint in one group, but also for comparisons between groups. We find that some of the relevant scientific questions around follow-up require different thinking depending on whether a proportional hazards assumption can be made or other patterns of survival functions are anticipated, e.g., delayed separation, crossing survival functions, or the potential for cure. We conclude the paper with practical recommendations.
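
As a concrete point of reference, the quantity most often reported under the label "median follow-up" is the reverse Kaplan-Meier estimate, one of the loosely defined summaries the abstract critiques. Below is a minimal sketch of that computation, assuming the lifelines package is available; the toy data are purely illustrative and not from the paper.

```python
# A minimal sketch of the reverse Kaplan-Meier "median follow-up".
# Assumes the lifelines package; the simulated data are illustrative only.
import numpy as np
from lifelines import KaplanMeierFitter

rng = np.random.default_rng(1)
time = rng.exponential(12.0, size=200)   # observed time on study (months)
event = rng.integers(0, 2, size=200)     # 1 = event observed, 0 = censored

# Reverse KM: treat censoring as the "event", so the curve describes
# follow-up rather than survival.
kmf = KaplanMeierFitter()
kmf.fit(durations=time, event_observed=1 - event)
print("Reverse-KM median follow-up:", kmf.median_survival_time_)
```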


The sequential multiple assignment randomized trial (SMART) is the gold standard trial design to generate data for the evaluation of multi-stage treatment regimes. As with conventional (single-stage) randomized clinical trials, interim monitoring allows early stopping; however, there are few methods for principled interim analysis in SMARTs. Because SMARTs involve multiple stages of treatment, a key challenge is that not all enrolled participants will have progressed through all treatment stages at the time of an interim analysis. Wu et al. (2021) propose basing interim analyses on an estimator for the mean outcome under a given regime that uses data only from participants who have completed all treatment stages. We propose an estimator for the mean outcome under a given regime that gains efficiency by using partial information from enrolled participants regardless of their progression through treatment stages. Using the asymptotic distribution of this estimator, we derive associated Pocock and O'Brien-Fleming testing procedures for early stopping. In simulation experiments, the associated testing procedures control the type I error and achieve nominal power while reducing the expected sample size relative to the method of Wu et al. (2021). We present an illustrative application of the proposed estimator based on a recent SMART evaluating behavioral pain interventions for breast cancer patients.
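
To make the two boundary families concrete, here is a sketch of how Pocock and O'Brien-Fleming stopping constants can be calibrated by Monte Carlo for K equally spaced interim looks. This illustrates only the generic group-sequential machinery, not the paper's SMART-specific estimator; the number of looks, the alpha level, and the simulation size are assumptions.

```python
# Calibrate Pocock (flat) and O'Brien-Fleming (c / sqrt(t)) boundaries for
# K equally spaced analyses by Monte Carlo; illustrative, not the paper's method.
import numpy as np

def crossing_prob(c, shape, K, n_sim=200_000, seed=0):
    rng = np.random.default_rng(seed)
    t = np.arange(1, K + 1) / K                        # information fractions
    incr = rng.standard_normal((n_sim, K)) * np.sqrt(1.0 / K)
    z = np.cumsum(incr, axis=1) / np.sqrt(t)           # standardized statistics
    bound = c * shape(t)                               # boundary at each look
    return np.mean((np.abs(z) >= bound).any(axis=1))   # P(reject | H0)

def calibrate(shape, K=4, alpha=0.05):
    lo, hi = 1.0, 6.0
    for _ in range(40):                                # bisect on the constant
        mid = 0.5 * (lo + hi)
        if crossing_prob(mid, shape, K) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

pocock = calibrate(lambda t: np.ones_like(t))          # constant boundary
obf = calibrate(lambda t: 1.0 / np.sqrt(t))            # steep early, flat late
print(f"Pocock constant ~ {pocock:.2f}, O'Brien-Fleming constant ~ {obf:.2f}")
```

For K=4 and two-sided alpha 0.05 this reproduces the familiar constants (roughly 2.36 for Pocock and 2.02 for O'Brien-Fleming on the standardized scale).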

Observational studies of causal effects require adjustment for confounding factors. In the tabular setting, where these factors are well-defined, separate random variables, the effect of confounding is well understood. However, in public policy, ecology, and medicine, decisions are often made in non-tabular settings, informed by patterns or objects detected in images (e.g., maps, satellite or tomography imagery). Using such imagery for causal inference presents an opportunity because objects in the image may be related to the treatment and outcome of interest. In these cases, we rely on the images to adjust for confounding, but the observed data do not directly label the existence of the important objects. Motivated by real-world applications, we formalize this challenge, describe how it can be handled, and give conditions that are sufficient to identify and estimate causal effects. We analyze finite-sample performance using simulation experiments, estimating effects with a propensity adjustment algorithm that employs a machine learning model to estimate the confounding induced by the images. Our experiments also examine sensitivity to misspecification of the image pattern mechanism. Finally, we use our methodology to estimate the effects of policy interventions on poverty in African communities from satellite imagery.
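
The following toy sketch illustrates the general idea of propensity adjustment with an image confounder: a bright "object" in each synthetic image drives both treatment and outcome, the propensity score is learned from raw pixels, and the average treatment effect is recovered by inverse probability weighting. The data-generating process and the use of logistic regression as a stand-in for the paper's machine learning image model are assumptions for illustration.

```python
# Toy image-confounding example with IPW; illustrative, not the paper's algorithm.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, side = 2000, 8
u = rng.binomial(1, 0.5, n)                     # latent: object present or not
images = rng.normal(0, 1, (n, side, side))
images[u == 1, 2:5, 2:5] += 2.0                 # object = bright patch

p_treat = 1 / (1 + np.exp(-(2 * u - 1)))        # treatment confounded by u
a = rng.binomial(1, p_treat)
y = 1.0 * a + 2.0 * u + rng.normal(0, 1, n)     # true treatment effect = 1.0

x = images.reshape(n, -1)                       # simple stand-in for a CNN
e_hat = LogisticRegression(max_iter=1000).fit(x, a).predict_proba(x)[:, 1]
e_hat = np.clip(e_hat, 0.01, 0.99)              # stabilize extreme weights

naive = y[a == 1].mean() - y[a == 0].mean()
ipw = np.mean(a * y / e_hat) - np.mean((1 - a) * y / (1 - e_hat))
print(f"naive: {naive:.2f}, IPW with image propensity: {ipw:.2f} (truth: 1.00)")
```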

Multiway data are becoming increasingly common. While there are many approaches to extending principal component analysis (PCA) from usual data matrices to multiway arrays, their conceptual differences from the usual PCA, and the methodological implications of those differences, remain largely unexplored. This work aims to address these questions. In particular, we clarify the subtle difference between PCA and the singular value decomposition (SVD) for multiway data, and show that multiway principal components (PCs) can be estimated reliably in the absence of the eigengaps required by the usual PCA, and in general much more efficiently than the usual PCs. Furthermore, the sample multiway PCs are asymptotically independent and hence allow for separate and more accurate inferences about the population PCs. The practical merits of multiway PCA are further demonstrated through numerical examples on both simulated and real data.
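
A minimal sketch of the mode-wise idea: for a sample of order-2 arrays, the mode-k PCs can be estimated from the leading eigenvectors of the averaged mode-k unfolding covariances (an HOSVD-style estimate). The sizes and the rank-one signal below are illustrative assumptions, not the paper's setup.

```python
# Mode-wise multiway PC estimation on simulated rank-one data; illustrative only.
import numpy as np

rng = np.random.default_rng(0)
n, p, q = 500, 20, 30
u = rng.standard_normal(p); u /= np.linalg.norm(u)
v = rng.standard_normal(q); v /= np.linalg.norm(v)

# X_i = f_i * u v^T + noise : a rank-one population structure.
f = rng.standard_normal(n) * 3.0
X = f[:, None, None] * np.outer(u, v)[None] + rng.standard_normal((n, p, q))

S1 = np.mean([Xi @ Xi.T for Xi in X], axis=0)   # mode-1 covariance (rows)
S2 = np.mean([Xi.T @ Xi for Xi in X], axis=0)   # mode-2 covariance (columns)

u_hat = np.linalg.eigh(S1)[1][:, -1]            # leading eigenvectors
v_hat = np.linalg.eigh(S2)[1][:, -1]
print("mode-1 alignment:", abs(u_hat @ u), " mode-2 alignment:", abs(v_hat @ v))
```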

Deep learning has contributed greatly to many successes in artificial intelligence in recent years. Today, it is possible to train models that have thousands of layers and hundreds of billions of parameters. Large-scale deep models have achieved great success, but their enormous computational complexity and gigantic storage requirements make it extremely difficult to deploy them in real-time applications. On the other hand, the size of the dataset remains a real problem in many domains; data are often missing, too expensive, or impossible to obtain for other reasons. Ensemble learning offers a partial solution to the problems of small datasets and overfitting, but in its basic version it entails a linear increase in computational complexity. We analyze the impact of the ensemble decision-fusion mechanism and examine various methods of sharing the decisions, including voting algorithms. We use a modified knowledge distillation framework as the decision-fusion mechanism, which additionally allows the entire ensemble to be compressed into the weight space of a single model. We show that knowledge distillation can aggregate knowledge from multiple teachers into a single student model and, at the same computational complexity, obtain a better-performing model than one trained in the standard manner. We also develop our own method for mimicking the responses of all teachers simultaneously, and test these solutions on several benchmark datasets. Finally, we present a broad range of applications of the efficient multi-teacher knowledge distillation framework. In the first example, knowledge distillation is used to develop models that automate corrosion detection on aircraft fuselage; the second example concerns the detection of smoke on observation cameras in order to counteract wildfires in forests.
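
As a sketch of the basic ingredient, a multi-teacher distillation objective lets the student match the softened outputs of several teachers at once while also fitting the labels. The architectures, temperature, and loss weights below are illustrative assumptions, not the specific fusion mechanism developed in the paper.

```python
# A generic multi-teacher knowledge distillation loss in PyTorch; illustrative.
import torch
import torch.nn.functional as F

def multi_teacher_kd_loss(student_logits, teacher_logits_list, labels,
                          temperature=4.0, alpha=0.7):
    """Cross-entropy on labels plus averaged KL to each teacher's soft targets."""
    ce = F.cross_entropy(student_logits, labels)
    t = temperature
    kd = 0.0
    for teacher_logits in teacher_logits_list:
        kd += F.kl_div(F.log_softmax(student_logits / t, dim=1),
                       F.softmax(teacher_logits / t, dim=1),
                       reduction="batchmean") * (t * t)   # scale back gradients
    kd /= len(teacher_logits_list)
    return alpha * kd + (1 - alpha) * ce

# Toy usage: 3 teachers, batch of 8, 10 classes.
labels = torch.randint(0, 10, (8,))
student = torch.randn(8, 10, requires_grad=True)
teachers = [torch.randn(8, 10) for _ in range(3)]
loss = multi_teacher_kd_loss(student, teachers, labels)
loss.backward()
print(float(loss))
```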

Modern time series analysis requires the ability to handle datasets that are inherently high-dimensional; examples include applications in climatology, where measurements from numerous sensors must be taken into account, and inventory tracking in large shops, where the dimension is defined by the number of tracked items. The standard way to mitigate computational issues arising from the high dimensionality of the data is to apply a dimension reduction technique that preserves the structural properties of the ambient space. The dissimilarity between two time series is often measured by ``discrete'' notions of distance, e.g., the dynamic time warping distance or the discrete Fr\'echet distance. Since all these distance functions are computed directly on the points of a time series, they are sensitive to different sampling rates or gaps. The continuous Fr\'echet distance offers a popular alternative that aims to alleviate this by taking into account all points on the polygonal curve obtained by linearly interpolating between consecutive points of the sequence. We study the ability of random projections \`a la Johnson and Lindenstrauss to preserve the continuous Fr\'echet distance of polygonal curves by effectively reducing the dimension. In particular, we show that one can reduce the dimension to $O(\epsilon^{-2} \log N)$, where $N$ is the total number of input points, while preserving the continuous Fr\'echet distance between any two fixed polygonal curves within a factor of $1\pm \epsilon$. We conclude with applications to clustering.
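
A small sketch of the mechanism being studied: a Gaussian Johnson-Lindenstrauss projection applied to the vertices of two polygonal curves. As a runnable proxy it checks distortion of the discrete Fr\'echet distance; the paper's guarantee concerns the continuous Fr\'echet distance. The dimensions and curve lengths are assumptions.

```python
# Gaussian JL projection of polygonal curves, with the discrete Frechet
# distance as a runnable proxy for the continuous one; illustrative only.
import numpy as np

def discrete_frechet(P, Q):
    m, k = len(P), len(Q)
    D = np.linalg.norm(P[:, None, :] - Q[None, :, :], axis=2)
    C = np.full((m, k), np.inf)
    C[0, 0] = D[0, 0]
    for i in range(m):
        for j in range(k):
            if i == 0 and j == 0:
                continue
            prev = min(C[i - 1, j] if i else np.inf,
                       C[i, j - 1] if j else np.inf,
                       C[i - 1, j - 1] if i and j else np.inf)
            C[i, j] = max(prev, D[i, j])        # standard free-space recursion
    return C[-1, -1]

rng = np.random.default_rng(0)
d, target, n_pts = 200, 30, 40
P = np.cumsum(rng.standard_normal((n_pts, d)), axis=0)   # random-walk curves
Q = np.cumsum(rng.standard_normal((n_pts, d)), axis=0)

G = rng.standard_normal((d, target)) / np.sqrt(target)   # JL projection matrix
ratio = discrete_frechet(P @ G, Q @ G) / discrete_frechet(P, Q)
print(f"distance ratio after projection: {ratio:.3f}")    # expect ~ 1 +/- eps
```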

This work studies the estimation of many statistical quantiles under differential privacy. More precisely, given a distribution and access to i.i.d. samples from it, we study the estimation of the inverse of its cumulative distribution function (the quantile function) at specific points. This task is of key importance, for instance, in private data generation. We present two different approaches. The first consists of privately estimating the empirical quantiles of the samples and using the result as an estimator of the quantiles of the distribution. In particular, we study the statistical properties of the recently published algorithm of Kaplan et al. (2022) that estimates the quantiles recursively under privacy. The second approach uses techniques from density estimation in order to estimate the quantile function uniformly over an interval. We show that there is a tradeoff between the two methods: when we want to estimate many quantiles, it is better to estimate the density than to estimate the quantile function at specific points.
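
For intuition on the first family of methods, here is a sketch of a single private quantile via the exponential mechanism (in the spirit of Smith, 2011), a simpler relative of the recursive Kaplan et al. (2022) algorithm the paper studies. The data range [lo, hi] and the privacy budget epsilon are assumptions the analyst must supply.

```python
# Single differentially private quantile via the exponential mechanism; a
# simplified illustration, not the recursive algorithm analyzed in the paper.
import numpy as np

def private_quantile(samples, q, eps, lo, hi, rng):
    x = np.clip(np.sort(samples), lo, hi)
    x = np.concatenate([[lo], x, [hi]])
    n = len(samples)
    m = q * n                                   # target rank
    i = np.arange(n + 1)                        # interval (x[i], x[i+1])
    gaps = np.diff(x)
    # utility of interval i is -|i - m| (points below minus target rank);
    # sample an interval with prob ~ gap * exp(eps/2 * utility), then a
    # uniform point inside it.
    logw = np.log(np.maximum(gaps, 1e-12)) - (eps / 2) * np.abs(i - m)
    w = np.exp(logw - logw.max())
    idx = rng.choice(n + 1, p=w / w.sum())
    return rng.uniform(x[idx], x[idx + 1])

rng = np.random.default_rng(0)
data = rng.normal(0, 1, 5000)
est = private_quantile(data, q=0.5, eps=1.0, lo=-10, hi=10, rng=rng)
print(f"private median estimate: {est:.3f} (true ~ 0)")
```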

Cylindrical Algebraic Decomposition (CAD) was the first practical means of performing real quantifier elimination (QE), and it remains a major method, with many improvements since Collins' original algorithm. Nevertheless, its complexity is inherently doubly exponential in the number of variables. Where applicable, virtual term substitution (VTS) is more effective, turning a QE problem in $n$ variables into one in $n-1$ variables with a single application, and so on. Hence there is scope for hybrid methods: applying VTS where possible, then using CAD. This paper describes such a poly-algorithmic implementation, based on the second author's Ph.D. thesis. The version of CAD used is based on a new implementation of Lazard's recently justified method, with some improvements to handle equational constraints.
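
A textbook-style illustration of a single VTS step (not an example from the paper): substituting the candidate root terms of a quadratic eliminates the quantified variable, leaving a condition on the remaining variables.

```latex
% Eliminating x from a quadratic constraint by virtual term substitution:
\[
  \exists x.\; x^2 + b x + c = 0
  \;\Longleftrightarrow\;
  b^2 - 4c \ge 0,
\]
% obtained by substituting the test terms
\[
  x \mapsto \frac{-b \pm \sqrt{\,b^2 - 4c\,}}{2},
\]
% each of which is real exactly when the discriminant is non-negative,
% so the QE problem in (x, b, c) becomes one in (b, c) alone.
```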

We consider a class of density-driven flow problems, with particular interest in the salinization of coastal aquifers. As a test case we consider the Henry saltwater intrusion problem with uncertain porosity, permeability, and recharge parameters. The uncertainties stem from a lack of knowledge, inaccurate measurements, and the inability to measure parameters at every spatial or temporal location. The problem is nonlinear and time-dependent; the solution is the salt mass fraction, which is uncertain and changes over time. Uncertainties in porosity, permeability, recharge, and mass fraction are modeled using random fields. This work investigates the applicability of the well-known multilevel Monte Carlo (MLMC) method to such problems. The MLMC method can reduce the total computational and storage costs: it runs multiple scenarios on different spatial and temporal meshes and then estimates the mean value of the mass fraction. The parallelization is performed in both physical and stochastic space. To solve each deterministic scenario, we run the parallel multigrid solver ug4 in a black-box fashion. As a reference we use the solution obtained with the quasi-Monte Carlo method.
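
The structure of the MLMC estimator is easy to show on a toy problem standing in for the expensive Henry simulations: level $l$ halves the step size, and the telescoping sum $E[Q_L] = E[Q_0] + \sum_l E[Q_l - Q_{l-1}]$ is estimated with many cheap coarse samples and few expensive fine ones. The toy SDE and the per-level sample counts below are illustrative assumptions.

```python
# MLMC mean estimation on a toy SDE (a stand-in for the PDE solves); the
# coarse and fine solves at each level share one Brownian path.
import numpy as np

rng = np.random.default_rng(0)

def solve(level, xi):
    """Euler scheme for dX = -X dt + 0.5 dW on [0,1]; xi holds Brownian
    increments at the finest resolution, coarsened to this level's grid."""
    n = 2 ** level
    dw = xi.reshape(n, -1).sum(axis=1)          # coarsen the Brownian path
    x = 1.0
    for k in range(n):
        x += -x / n + 0.5 * dw[k]
    return x

L, n_fine = 6, 2 ** 6
samples = [4000, 2000, 1000, 500, 250, 125, 60] # many coarse, few fine
est = 0.0
for level in range(L + 1):
    diffs = []
    for _ in range(samples[level]):
        xi = rng.standard_normal(n_fine) * np.sqrt(1.0 / n_fine)
        fine = solve(level, xi)
        coarse = solve(level - 1, xi) if level > 0 else 0.0
        diffs.append(fine - coarse)             # telescoping correction
    est += np.mean(diffs)
print(f"MLMC estimate of E[X(1)]: {est:.4f} (exact mean: {np.exp(-1):.4f})")
```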

We show how to establish TLS connections using one less round trip. In our approach, which we call TurboTLS, the initial client-to-server and server-to-client flows of the TLS handshake are sent over UDP rather than TCP, while the three-way TCP handshake is carried out in parallel in the same flights. Once the TCP connection is established, the client and server complete the final flight of the TLS handshake over the TCP connection and continue using it for application data. No changes are made to the contents of the TLS handshake protocol, only to its delivery mechanism. We avoid problems with UDP fragmentation by using request-based fragmentation, in which the client sends in advance enough UDP requests to give the server sufficient room to fit its response, with one response packet per request packet. Clients can detect which servers support this without an additional round trip if the server advertises its support in a DNS HTTPS resource record. Experiments using our software implementation show substantial latency improvements. On reliable connections, we effectively eliminate a round trip without any noticeable cost. To ensure adequate performance on unreliable connections, we use lightweight packet ordering and buffering: the client can wait a short time (e.g., a fraction of the RTT observed for the first fragment) for a potentially lost packet before falling back to TCP without further delay, since the TCP connection was already being established. This approach offers substantial performance improvements with low complexity, even in heterogeneous network environments with poorly configured middleboxes.
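
A conceptual sketch of the request-based fragmentation trick on loopback: the client sends k numbered request datagrams up front, the server answers each with one fragment of its larger response, and the client reassembles. This illustrates the transport idea only; the real TurboTLS handshake contents, retransmission logic, and TCP fallback are not modeled, and for the demo the client is simply told k (in practice it would over-provision requests).

```python
# Request-based fragmentation over UDP, single-process demo on localhost.
import socket

FRAG = 1200                                      # payload bytes per datagram
srv = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
srv.bind(("127.0.0.1", 0))
addr = srv.getsockname()

response = b"X" * 5000                           # server's oversized reply
k = -(-len(response) // FRAG)                    # fragments needed (ceil div)

cli = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
cli.settimeout(2.0)
for i in range(k):                               # client: k requests up front
    cli.sendto(i.to_bytes(2, "big") + b"hello", addr)

for _ in range(k):                               # server: one fragment each
    data, peer = srv.recvfrom(2048)
    i = int.from_bytes(data[:2], "big")
    srv.sendto(i.to_bytes(2, "big") + response[i * FRAG:(i + 1) * FRAG], peer)

frags = {}
while len(frags) < k:                            # client: reorder and rebuild
    data, _ = cli.recvfrom(2048)
    frags[int.from_bytes(data[:2], "big")] = data[2:]
print(b"".join(frags[i] for i in sorted(frags)) == response)
```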

With the rapid increase of large-scale, real-world datasets, it becomes critical to address the problem of long-tailed data distribution (i.e., a few classes account for most of the data, while most classes are under-represented). Existing solutions typically adopt class re-balancing strategies such as re-sampling and re-weighting based on the number of observations for each class. In this work, we argue that as the number of samples increases, the additional benefit of a newly added data point will diminish. We introduce a novel theoretical framework to measure data overlap by associating with each sample a small neighboring region rather than a single point. The effective number of samples is defined as the volume of samples and can be calculated by a simple formula $(1-\beta^{n})/(1-\beta)$, where $n$ is the number of samples and $\beta \in [0,1)$ is a hyperparameter. We design a re-weighting scheme that uses the effective number of samples for each class to re-balance the loss, thereby yielding a class-balanced loss. Comprehensive experiments are conducted on artificially induced long-tailed CIFAR datasets and large-scale datasets including ImageNet and iNaturalist. Our results show that when trained with the proposed class-balanced loss, the network is able to achieve significant performance gains on long-tailed datasets.
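
The weighting scheme is simple enough to show directly: the effective number $E_n = (1-\beta^{n})/(1-\beta)$ replaces the raw count $n$ when re-weighting the per-class loss. The class counts and $\beta$ values below are illustrative; the normalization (weights summing to the number of classes) follows a common convention and is an assumption here.

```python
# Class-balanced weights from the effective number of samples; illustrative.
import numpy as np

def class_balanced_weights(counts, beta=0.999):
    counts = np.asarray(counts, dtype=float)
    effective_num = (1.0 - beta ** counts) / (1.0 - beta)
    w = 1.0 / effective_num                      # inverse effective number
    return w * len(counts) / w.sum()             # normalize to sum to #classes

counts = [5000, 2000, 500, 100, 10]              # long-tailed class sizes
for beta in (0.9, 0.99, 0.999):
    print(beta, np.round(class_balanced_weights(counts, beta), 3))
# The weights then multiply the per-class loss, e.g. a weighted cross-entropy;
# beta -> 0 recovers uniform weights, beta -> 1 recovers inverse-frequency.
```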
