
Many clinical studies require the follow-up of patients over time. This is challenging: apart from frequently observed drop-out, there are often also organizational and financial challenges, which can lead to reduced data collection and, in turn, can complicate subsequent analyses. In contrast, there is often plenty of baseline data available for patients with similar characteristics and background information, e.g., from patients that fall outside the study time window. In this article, we investigate whether we can benefit from including such unlabeled data instances to predict accurate survival times. In other words, we introduce a third level of supervision in the context of survival analysis: apart from fully observed and censored instances, we also include unlabeled instances. We propose three approaches to deal with this novel setting and provide an empirical comparison over fifteen real-life clinical and gene expression survival datasets. Our results demonstrate that all approaches are able to increase the predictive performance over independent test data. We also show that integrating the partial supervision provided by censored data in a semi-supervised wrapper approach generally provides the best results, often yielding substantial improvements over not using unlabeled data.
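
To make the wrapper idea concrete, here is a minimal, hypothetical self-training sketch in Python (not the authors' exact algorithm): a regressor is fitted on the observed survival times, censored and unlabeled patients are pseudo-labeled in each round, and censored pseudo-labels are clipped so that they never fall below the observed censoring time.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def self_training_survival(X_obs, t_obs, X_cens, t_cens, X_unlab, n_rounds=5):
    """Hypothetical self-training wrapper: pseudo-label censored and unlabeled
    patients and refit a regressor on the enlarged training set."""
    model = RandomForestRegressor(n_estimators=200, random_state=0)
    X_train, y_train = X_obs, t_obs
    for _ in range(n_rounds):
        model.fit(X_train, y_train)
        # Censored patients: the prediction must respect the censoring time as a lower bound.
        y_cens = np.maximum(model.predict(X_cens), t_cens)
        # Unlabeled patients: use the current prediction as a pseudo-label.
        y_unlab = model.predict(X_unlab)
        X_train = np.vstack([X_obs, X_cens, X_unlab])
        y_train = np.concatenate([t_obs, y_cens, y_unlab])
    return model
```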

Related content

In this paper we present a general, axiomatic framework for the rigorous approximation of invariant densities and other important statistical features of dynamics. We approximate the system through a finite element reduction, by composing the associated transfer operator with a suitable finite dimensional projection (a discretization scheme), as in the well-known Ulam method. We introduce a general framework based on a list of properties (of the system and of the projection) that need to be verified so that we can take advantage of a so-called ``coarse-fine'' strategy. This strategy is a novel method in which we exploit information coming from a coarser approximation of the system to obtain useful information on a finer approximation, speeding up the computation. The coarse-fine strategy allows a precise estimation of invariant densities and also allows us to rigorously estimate the speed of mixing of the system by the speed of mixing of a coarse approximation of it, which can easily be estimated by the computer. The estimates obtained here are rigorous, i.e., they come with exact error bounds that are guaranteed to hold and take into account both the discretization and the approximations induced by finite-precision arithmetic. We apply this framework to several discretization schemes and examples of invariant density computation from previous works, obtaining a remarkable reduction in computation time. We have implemented the numerical methods described here in the Julia programming language and released our implementation publicly as a Julia package.
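
For intuition, the sketch below implements a plain, non-rigorous Ulam-type discretization in Python (the paper works with validated error bounds and interval arithmetic, which this sketch omits): entries of the discretized transfer operator are estimated by Monte Carlo sampling, and the invariant density is approximated by power iteration.

```python
import numpy as np

def ulam_matrix(T, n_bins, samples_per_bin=1000, seed=0):
    """Ulam discretization of the transfer operator of a map T on [0, 1]:
    entry (j, i) estimates the fraction of bin i that T sends into bin j."""
    rng = np.random.default_rng(seed)
    P = np.zeros((n_bins, n_bins))
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    for i in range(n_bins):
        x = rng.uniform(edges[i], edges[i + 1], samples_per_bin)
        img = T(x) % 1.0
        idx = np.minimum((img * n_bins).astype(int), n_bins - 1)
        P[:, i] = np.bincount(idx, minlength=n_bins) / samples_per_bin
    return P

def invariant_density(P, iters=10_000):
    """Approximate the invariant density as a fixed point of the discretized operator."""
    v = np.full(P.shape[0], 1.0 / P.shape[0])
    for _ in range(iters):
        v = P @ v
        v /= v.sum()
    return v * P.shape[0]   # rescale so the density integrates to 1 on [0, 1]

# Example: the doubling map x -> 2x mod 1, whose invariant density is the constant 1.
rho = invariant_density(ulam_matrix(lambda x: 2.0 * x, n_bins=128))
```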

In recent years, a number of static application security testing tools have been proposed that make use of so-called code property graphs (CPGs), a graph model which retains rich information about the source code while enabling its users to write language-agnostic analyses. However, these tools suffer from several shortcomings. They work mostly on source code and exclude the analysis of third-party dependencies if these are only available as compiled binaries. Furthermore, their analyses are limited to the programming languages that are explicitly supported. While support for well-established languages such as C/C++ or Java is often included, languages that are still heavily evolving, such as Rust, are not considered because of the constant changes in the language design. To overcome these limitations, we extend an open source implementation of a code property graph to support LLVM-IR, which can be emitted by many compilers and binary lifters. In this paper, we discuss how we address the challenges that arise when mapping concepts of an intermediate representation to a CPG. At the same time, we optimize the resulting graph to be minimal and close to the representation of equivalent source code. Our evaluation indicates that existing analyses can be reused without modification and that the performance requirements are comparable to operating on source code. This makes the approach suitable for the analysis of large-scale projects.

The paper traces the development of the use of martingale methods in survival analysis from the mid-1970s to the early 1990s. This development was initiated by Aalen's Berkeley PhD thesis in 1975, progressed through the work on estimation of Markov transition probabilities, non-parametric tests and Cox's regression model in the late 1970s and early 1980s, and was consolidated in the early 1990s with the publication of the monographs by Fleming and Harrington (1991) and Andersen, Borgan, Gill and Keiding (1993). The development was made possible by an unusually fast technology transfer of pure mathematical concepts, primarily from French probability, into practical biostatistical methodology, and we attempt to outline some of the personal relationships that helped this happen. We also point out that survival analysis was ready for this development, since the martingale ideas inherent in the deep understanding of temporal development so intrinsic to the French theory of processes were already quite close to the surface in survival analysis.

This paper proposes a novel approach to explaining the predictions made by data-driven methods. Since such predictions rely heavily on the data used for training, explanations that convey information about how the training data affects the predictions are useful. The paper proposes a novel approach to quantify how different clusters of the training data affect a prediction. The quantification is based on Shapley values, a concept that originates from coalitional game theory and was developed to fairly distribute the payout among a set of cooperating players. A player's Shapley value is a measure of that player's contribution. Shapley values are often used to quantify feature importance, i.e., how features affect a prediction. This paper extends this to cluster importance, letting clusters of the training data act as players in a game in which the predictions are the payouts. The proposed methodology lets us explore and investigate how different clusters of the training data affect the predictions made by any black-box model, allowing new aspects of the reasoning and inner workings of a prediction model to be conveyed to the users. The methodology is fundamentally different from existing explanation methods, providing insight that would not be available otherwise, and should complement existing explanation methods, including explanations based on feature importance.
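
As a hedged illustration of the game-theoretic idea (exact enumeration over a handful of clusters, with a retrained ridge model as the value function; the paper's own value function and estimation scheme may differ), cluster-level Shapley values can be computed as follows.

```python
import numpy as np
from itertools import combinations
from math import factorial
from sklearn.cluster import KMeans
from sklearn.linear_model import Ridge

def cluster_shapley(X_train, y_train, x_test, n_clusters=4, seed=0):
    """Exact Shapley values where the 'players' are clusters of the training data and
    the payout of a coalition is the prediction of a model trained on those clusters."""
    labels = KMeans(n_clusters=n_clusters, random_state=seed, n_init=10).fit_predict(X_train)
    baseline = y_train.mean()   # value of the empty coalition

    def value(coalition):
        if not coalition:
            return baseline
        mask = np.isin(labels, list(coalition))
        model = Ridge().fit(X_train[mask], y_train[mask])
        return model.predict(x_test.reshape(1, -1))[0]

    players = list(range(n_clusters))
    phi = np.zeros(n_clusters)
    for i in players:
        others = [p for p in players if p != i]
        for k in range(len(others) + 1):
            for S in combinations(others, k):
                w = factorial(len(S)) * factorial(n_clusters - len(S) - 1) / factorial(n_clusters)
                phi[i] += w * (value(S + (i,)) - value(S))
    # Efficiency: phi.sum() + baseline equals the prediction of the model trained on all clusters.
    return phi
```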

Denoising diffusion probabilistic models are a promising new class of generative models that mark a milestone in high-quality image generation. This paper showcases their ability to sequentially generate video, surpassing prior methods in perceptual and probabilistic forecasting metrics. We propose an autoregressive, end-to-end optimized video diffusion model inspired by recent advances in neural video compression. The model successively generates future frames by correcting a deterministic next-frame prediction using a stochastic residual generated by an inverse diffusion process. We compare this approach against five baselines on four datasets involving natural and simulation-based videos. We find significant improvements in terms of perceptual quality for all datasets. Furthermore, by introducing a scalable version of the Continuous Ranked Probability Score (CRPS) applicable to video, we show that our model also outperforms existing approaches in their probabilistic frame forecasting ability.
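
As a rough structural sketch (untrained toy modules, not the paper's architecture), the next frame can be formed as a deterministic prediction plus a residual drawn by DDPM-style ancestral sampling conditioned on the previous frame.

```python
import torch
import torch.nn as nn

T = 50                                             # number of diffusion steps (toy value)
betas = torch.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

class ToyDenoiser(nn.Module):
    """Predicts the noise in the residual, conditioned on the previous frame and step."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2 * dim + 1, 128), nn.ReLU(), nn.Linear(128, dim))
    def forward(self, r_t, prev_frame, t):
        t_emb = torch.full((r_t.shape[0], 1), float(t) / T)
        return self.net(torch.cat([r_t, prev_frame, t_emb], dim=-1))

@torch.no_grad()
def sample_residual(denoiser, prev_frame, dim):
    r = torch.randn(prev_frame.shape[0], dim)          # start the reverse process from noise
    for t in reversed(range(T)):
        eps = denoiser(r, prev_frame, t)
        mean = (r - betas[t] / torch.sqrt(1 - alpha_bars[t]) * eps) / torch.sqrt(alphas[t])
        noise = torch.randn_like(r) if t > 0 else torch.zeros_like(r)
        r = mean + torch.sqrt(betas[t]) * noise
    return r

def predict_next_frame(predictor, denoiser, prev_frame):
    deterministic = predictor(prev_frame)               # point prediction of the next frame
    residual = sample_residual(denoiser, prev_frame, prev_frame.shape[-1])
    return deterministic + residual                     # stochastic correction

# Usage with untrained toy modules (illustration only):
dim = 64 * 64
next_frame = predict_next_frame(nn.Linear(dim, dim), ToyDenoiser(dim), torch.randn(4, dim))
```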

Conformal prediction constructs a confidence set for an unobserved response of a feature vector based on previous identically distributed and exchangeable observations of responses and features. It has a coverage guarantee at any nominal level without additional assumptions on their distribution. Unfortunately, its computation requires a refitting procedure for every replacement candidate of the target response. In regression settings, this corresponds to an infinite number of model fits. Apart from relatively simple estimators that can be written as piecewise linear functions of the response, efficiently computing such sets is difficult and is still considered an open problem. We exploit the fact that, \emph{often}, conformal prediction sets are intervals whose boundaries can be efficiently approximated by classical root-finding algorithms. We investigate how this approach can overcome many limitations of previously used strategies; we discuss its complexity and drawbacks.
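
As a sketch of that idea (assuming a ridge model, that the conformal set is an interval around the point prediction, and enough observations that the p-value drops below the level alpha far from it), the two boundaries can be located with a standard root-finder.

```python
import numpy as np
from scipy.optimize import brentq
from sklearn.linear_model import Ridge

def conformal_pvalue(X, y, x_new, y_cand):
    """Full conformal p-value: refit with the candidate response and rank its residual."""
    X_aug = np.vstack([X, x_new])
    y_aug = np.append(y, y_cand)
    model = Ridge().fit(X_aug, y_aug)
    scores = np.abs(y_aug - model.predict(X_aug))
    return np.mean(scores >= scores[-1])

def conformal_interval(X, y, x_new, alpha=0.1, search_width=None):
    """Approximate the conformal set [lo, hi] by root-finding on p(y) - alpha."""
    x_new = x_new.reshape(1, -1)
    center = Ridge().fit(X, y).predict(x_new)[0]
    width = search_width or 10.0 * np.std(y)
    f = lambda y_cand: conformal_pvalue(X, y, x_new, y_cand) - alpha
    lo = brentq(f, center - width, center)   # assumes a sign change on each side of center
    hi = brentq(f, center, center + width)
    return lo, hi
```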

Algorithms with predictions is a recent framework that has been used to overcome pessimistic worst-case bounds in incomplete information settings. In the context of scheduling, very recent work has leveraged machine-learned predictions to design algorithms that achieve improved approximation ratios in settings where the processing times of the jobs are initially unknown. In this paper, we study the speed-robust scheduling problem where the speeds of the machines, instead of the processing times of the jobs, are unknown and augment this problem with predictions. Our main result is an algorithm that achieves a $\min\{\eta^2(1+\alpha), (2 + 2/\alpha)\}$ approximation, for any $\alpha \in (0,1)$, where $\eta \geq 1$ is the prediction error. When the predictions are accurate, this approximation outperforms the best known approximation for speed-robust scheduling without predictions of $2-1/m$, where $m$ is the number of machines, while simultaneously maintaining a worst-case approximation of $2 + 2/\alpha$ even when the predictions are arbitrarily wrong. In addition, we obtain improved approximations for three special cases: equal job sizes, infinitesimal job sizes, and binary machine speeds. We also complement our algorithmic results with lower bounds. Finally, we empirically evaluate our algorithm against existing algorithms for speed-robust scheduling.
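
The stated trade-off is easy to evaluate directly; the snippet below simply plugs values into the bound (purely illustrative, not part of the algorithm itself).

```python
def guarantee(eta, alpha):
    """Approximation factor min{eta^2 (1 + alpha), 2 + 2/alpha} for prediction error
    eta >= 1 and trade-off parameter alpha in (0, 1)."""
    return min(eta ** 2 * (1 + alpha), 2 + 2 / alpha)

# Accurate predictions (eta = 1) give a bound of 1 + alpha, approaching 1 as alpha shrinks,
# while even arbitrarily bad predictions never cost more than 2 + 2/alpha.
print(guarantee(1.0, 0.25), guarantee(10.0, 0.25))   # 1.25 10.0
```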

Generative models are now capable of producing highly realistic images that look nearly indistinguishable from the data on which they are trained. This raises the question: if we have good enough generative models, do we still need datasets? We investigate this question in the setting of learning general-purpose visual representations from a black-box generative model rather than directly from data. Given an off-the-shelf image generator without any access to its training data, we train representations from the samples output by this generator. We compare several representation learning methods that can be applied to this setting, using the latent space of the generator to generate multiple "views" of the same semantic content. We show that for contrastive methods, this multiview data can naturally be used to identify positive pairs (nearby in latent space) and negative pairs (far apart in latent space). We find that the resulting representations rival those learned directly from real data, but that good performance requires care in the sampling strategy applied and the training method. Generative models can be viewed as a compressed and organized copy of a dataset, and we envision a future where more and more "model zoos" proliferate while datasets become increasingly unwieldy, missing, or private. This paper suggests several techniques for dealing with visual representation learning in such a future. Code is released on our project page: //ali-design.github.io/GenRep/
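
To illustrate how latent-space neighborhoods yield contrastive views (a hedged sketch with a toy stand-in generator, not the released GenRep code), positives can be generated from perturbed anchor latents and negatives from independently drawn anchors.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
G_WEIGHTS = torch.randn(128, 3 * 32 * 32)        # stand-in for a frozen black-box generator

def toy_generator(z):
    """Hypothetical generator: maps latent codes to flattened 'images'."""
    return torch.tanh(z @ G_WEIGHTS)

def sample_contrastive_batch(generator, batch_size=64, z_dim=128, sigma=0.1):
    """Positive pairs come from nearby latent codes (anchor plus a small perturbation);
    every other row in the batch acts as a negative (a far-apart latent code)."""
    z = torch.randn(batch_size, z_dim)
    z_pos = z + sigma * torch.randn_like(z)
    return generator(z), generator(z_pos)

def info_nce(h1, h2, temperature=0.1):
    """Standard InfoNCE loss treating matching rows of h1 and h2 as positives."""
    h1, h2 = F.normalize(h1, dim=-1), F.normalize(h2, dim=-1)
    logits = h1 @ h2.T / temperature
    return F.cross_entropy(logits, torch.arange(h1.shape[0]))

# Usage with a hypothetical encoder to be trained:
encoder = nn.Sequential(nn.Linear(3 * 32 * 32, 256), nn.ReLU(), nn.Linear(256, 128))
v1, v2 = sample_contrastive_batch(toy_generator)
loss = info_nce(encoder(v1), encoder(v2))
```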

The accurate and interpretable prediction of future events in time-series data often requires capturing representative patterns (also referred to as states) underpinning the observed data. To this end, most existing studies focus on the representation and recognition of states, but ignore the changing transitional relations among them. In this paper, we present the evolutionary state graph, a dynamic graph structure designed to systematically represent the evolving relations (edges) among states (nodes) over time. We analyze the dynamic graphs constructed from time-series data and show that changes in the graph structure (e.g., edges connecting certain state nodes) can signal the occurrence of events (i.e., time-series fluctuations). Inspired by this, we propose a novel graph neural network model, the Evolutionary State Graph Network (EvoNet), which encodes the evolutionary state graph for accurate and interpretable time-series event prediction. Specifically, EvoNet models both the node-level (state-to-state) and graph-level (segment-to-segment) propagation, and captures the node-graph (state-to-segment) interactions over time. Experimental results on five real-world datasets show that our approach not only achieves clear improvements over 11 baselines, but also provides more insight into explaining the results of event predictions.
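
As a rough, hypothetical illustration of how such a graph could be constructed (clustering in place of the paper's state recognition, and simple transition counts as edge weights), short windows are mapped to states and each segment yields one weighted adjacency matrix.

```python
import numpy as np
from sklearn.cluster import KMeans

def evolutionary_state_graphs(series, window=8, segment_len=96, n_states=6, seed=0):
    """Sketch of an evolving state graph: short sliding windows are clustered into
    discrete states; within each longer segment, transitions between the states of
    consecutive windows are counted, giving one weighted adjacency matrix per segment."""
    # Non-overlapping short windows -> state assignments.
    windows = np.lib.stride_tricks.sliding_window_view(series, window)[::window]
    states = KMeans(n_clusters=n_states, random_state=seed, n_init=10).fit_predict(windows)
    # One graph per segment of consecutive windows.
    per_segment = segment_len // window
    graphs = []
    for s in range(len(states) // per_segment):
        seg = states[s * per_segment : (s + 1) * per_segment]
        A = np.zeros((n_states, n_states))
        for a, b in zip(seg[:-1], seg[1:]):
            A[a, b] += 1.0
        graphs.append(A)
    return graphs   # changes across these graphs are what a model like EvoNet would encode
```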

Multi-view networks are ubiquitous in real-world applications. In order to extract knowledge or business value, it is of interest to transform such networks into representations that are easily machine-actionable. Meanwhile, network embedding has emerged as an effective approach to generating distributed network representations. We are therefore motivated to study the problem of multi-view network embedding, with a focus on the characteristics that are specific and important to embedding this type of network. In our practice of embedding real-world multi-view networks, we identify two such characteristics, which we refer to as preservation and collaboration. We then explore the feasibility of achieving better embedding quality by simultaneously modeling preservation and collaboration, and propose the mvn2vec algorithms. With experiments on a series of synthetic datasets, an internal Snapchat dataset, and two public datasets, we further confirm the presence and importance of preservation and collaboration. These experiments also demonstrate that better embeddings can be obtained by simultaneously modeling the two characteristics, without over-complicating the model or requiring additional supervision.
