
In medical imaging, large-scale public datasets with high-quality annotations are rare because of data privacy constraints and annotation cost. To address this issue, we release SynFundus-1M, a high-quality synthetic dataset containing over 1 million fundus images covering 11 disease types. Moreover, we intentionally diversify the readability of the images and accordingly provide four types of quality score for each image. To the best of our knowledge, SynFundus-1M is currently the largest fundus dataset with the most sophisticated annotations. All the images are generated by a Denoising Diffusion Probabilistic Model named SynFundus-Generator. Trained on over 1.3 million private fundus images, SynFundus-Generator significantly outperforms some recent related works in generating fundus images. Furthermore, when we blend synthetic images from SynFundus-1M with real fundus images, ophthalmologists can hardly distinguish the synthetic images from the real ones. Through extensive experiments, we demonstrate that both convolutional neural networks (CNNs) and Vision Transformers (ViTs) can benefit from SynFundus-1M, either through pretraining or through direct training. Compared to models trained on datasets like ImageNet or EyePACS, models trained on SynFundus-1M not only achieve better performance but also converge faster on various downstream tasks.

Related content

Dataset

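A dataset (also written "data set") is a collection of data.
A data set (or dataset) is a collection of data, usually presented in tabular form. Each column represents a particular variable. Each row corresponds to a given member of the data set in question and lists the values of each variable for that member, such as the height and weight of an object or the values of random numbers. Each value is known as a datum. Depending on the number of rows, the data set may contain data for one or more members.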

Mutually independent · Bayesian network · Marginal independence · Networking · Marginalization

January 31, 2024

Combinatorial and algebraic perspectives on the marginal independence structure of Bayesian networks

Danai Deligeorgaki, Alex Markham, Pratik Misra, Liam Solus

from arXiv, 54 pages, 13 figures, 3 tables

We consider the problem of estimating the marginal independence structure of a Bayesian network from observational data, learning an undirected graph we call the unconditional dependence graph. We show that unconditional dependence graphs of Bayesian networks correspond to the graphs having equal independence and intersection numbers. Using this observation, a Gröbner basis for a toric ideal associated to unconditional dependence graphs of Bayesian networks is given and then extended by additional binomial relations to connect the space of all such graphs. An MCMC method, called GrUES (Gröbner-based Unconditional Equivalence Search), is implemented based on the resulting moves and applied to synthetic Gaussian data. GrUES recovers the true marginal independence structure via a penalized maximum likelihood or MAP estimate at a higher rate than simple independence tests while also yielding an estimate of the posterior, for which the 20% HPD credible sets include the true structure at a high rate for data-generating graphs with density at least 0.5.

Controller · Processing (programming language) · Estimation/Estimator · Tensor · MoDELS

January 31, 2024

Tensor-based process control and monitoring for semiconductor manufacturing with unstable disturbances

Yanrong Li, Juan Du, Fugee Tsung, Wei Jiang

from arXiv, 30 pages, 5 figures

With the growing use of sensors in manufacturing systems, complex data are collected during manufacturing processes, which poses challenges for traditional process control methods. This paper proposes a novel process control and monitoring method for the complex structure of high-dimensional image-based overlay errors (modeled in tensor form), which are collected in semiconductor manufacturing processes. The proposed method aims to reduce overlay errors using limited control recipes. We first build a high-dimensional process model and propose different tensor-on-vector regression algorithms to estimate the model parameters while alleviating the curse of dimensionality. Then, based on the estimated tensor parameters, we design an exponentially weighted moving average (EWMA) controller for tensor data whose stability is theoretically guaranteed. Because low-dimensional control recipes cannot compensate for all high-dimensional disturbances on the image, control residuals are monitored to prevent significant drifts of uncontrollable high-dimensional disturbances. Through extensive simulations and real case studies, the performance of the parameter estimation algorithms and of the EWMA controller in tensor space is evaluated. Compared with existing image-based feedback controllers, the superiority of our method is verified, especially when disturbances are not stable.
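
To make the EWMA control idea concrete, here is a minimal scalar sketch of a textbook single-EWMA run-to-run controller. It is not the tensor-valued controller of the paper: the linear process `y = a + b*u + disturbance`, the function name `ewma_run_to_run`, and all parameter names are assumptions chosen for illustration.

```python
def ewma_run_to_run(target, a_true, b_true, b_hat, weight, n_runs, disturbance=None):
    """Simulate a scalar EWMA run-to-run controller (illustrative sketch).

    Assumed process: y_t = a_true + b_true * u_t + d_t.
    The controller knows only a gain estimate b_hat and tracks the
    intercept with an exponentially weighted moving average.
    """
    a_hat = 0.0                       # initial intercept estimate
    u = (target - a_hat) / b_hat      # first recipe from the initial estimate
    outputs = []
    for t in range(n_runs):
        d = disturbance(t) if disturbance is not None else 0.0
        y = a_true + b_true * u + d   # observed process output for this run
        outputs.append(y)
        # EWMA update: blend the newly implied intercept with the old estimate.
        a_hat = weight * (y - b_hat * u) + (1.0 - weight) * a_hat
        # Choose the next recipe so the predicted output hits the target.
        u = (target - a_hat) / b_hat
    return outputs


if __name__ == "__main__":
    # A slow linear drift as a hypothetical disturbance.
    ys = ewma_run_to_run(target=10.0, a_true=2.0, b_true=1.5, b_hat=1.5,
                         weight=0.3, n_runs=20, disturbance=lambda t: 0.05 * t)
    print([round(y, 3) for y in ys])
```

With a correct gain estimate and no disturbance, the output error shrinks by a factor of `1 - weight` per run in this scalar setting; a drifting disturbance leaves a residual offset, which is the kind of behaviour residual monitoring is meant to catch.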

Latent variable/Hidden variable · MoDELS · Linear · Linear regression · Scaling

January 30, 2024

A brief introduction on latent variable based ordinal regression models with an application to survey data

Johannes Wieditz, Clemens Miller, Jan Scholand, Marcus Nemeth

The analysis of survey data is a frequently arising issue in clinical trials, particularly when capturing quantities which are difficult to measure. Typical examples are questionnaires about patients' well-being, pain, or consent to an intervention. In these, data is captured on a discrete scale containing only a limited number of possible answers, from which the respondent has to pick the answer that best fits their personal opinion. This data is generally located on an ordinal scale, as answers can usually be arranged in ascending order, e.g., "bad", "neutral", "good" for well-being. Since responses are usually stored numerically for data processing purposes, ordinary linear regression models are commonly applied to analyse survey data. However, the assumptions of these models are often not met, as linear regression requires constant variability of the response variable and can yield predictions outside the range of the response categories. Linear models also only give insight into the mean response, which may affect representativeness. In contrast, ordinal regression models can provide probability estimates for all response categories and yield information about the full response scale beyond the mean. In this work, we provide a concise overview of the fundamentals of latent variable based ordinal models and applications to a real data set, and we outline the use of state-of-the-art software for this purpose. Moreover, we discuss strengths, limitations and typical pitfalls. This is a companion work to a current vignette-based structured interview study in paediatric anaesthesia.
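
As a small illustration of how a latent variable based ordinal model yields a probability for every response category, the sketch below uses the cumulative logit (proportional odds) form, one common member of this model family. The abstract does not state which link the paper uses, so the logistic link, the function names, and the threshold values are assumptions.

```python
import math


def logistic_cdf(z):
    """CDF of the standard logistic distribution."""
    return 1.0 / (1.0 + math.exp(-z))


def ordinal_probs(eta, thresholds):
    """Category probabilities of a cumulative logit model (illustrative).

    Latent-variable view: Y* = eta + eps with logistic eps, and the
    observed category is k when Y* falls between thresholds k-1 and k.
    Hence P(Y <= k) = logistic_cdf(thresholds[k] - eta).
    `thresholds` must be sorted in increasing order.
    """
    cumulative = [logistic_cdf(t - eta) for t in thresholds] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)   # P(Y = k) = P(Y <= k) - P(Y <= k-1)
        previous = c
    return probs


if __name__ == "__main__":
    # Three categories ("bad", "neutral", "good") need two thresholds.
    for eta in (-1.0, 0.0, 1.0):
        print(eta, [round(p, 3) for p in ordinal_probs(eta, [-1.0, 1.0])])
```

Unlike a linear regression on numerically coded answers, every prediction here is a proper probability vector over the categories, and a larger linear predictor `eta` shifts mass toward the higher categories.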

Estimation/Estimator · Copulas · Maximum likelihood estimation · Marginalization · MoDELS

January 29, 2024

Efficient estimation of parameters in marginals in semiparametric multivariate models

Ivan Medovikov, Valentyn Panchenko, Artem Prokhorov

We consider a general multivariate model where univariate marginal distributions are known up to a parameter vector and we are interested in estimating that parameter vector without specifying the joint distribution, except for the marginals. If we assume independence between the marginals and maximize the resulting quasi-likelihood, we obtain a consistent but inefficient quasi-maximum likelihood estimator (QMLE). If we assume a parametric copula (other than independence), we obtain a full MLE (FMLE), which is efficient only under a correct copula specification and may be biased if the copula is misspecified. Instead, we propose a sieve MLE (SMLE), which improves over QMLE but does not have the drawbacks of full MLE. We model the unknown part of the joint distribution using the Bernstein-Kantorovich polynomial copula and assess the resulting improvement over QMLE and over misspecified FMLE in terms of relative efficiency and robustness. We derive the asymptotic distribution of the new estimator and show that it reaches the relevant semiparametric efficiency bound. Simulations suggest that the sieve MLE can be almost as efficient as FMLE relative to QMLE provided there is enough dependence between the marginals. We demonstrate the practical value of the new estimator with several applications. First, we apply SMLE in an insurance context where we build a flexible semiparametric claim loss model for a scenario where one of the variables is censored. As in simulations, the use of SMLE leads to tighter parameter estimates. Next, we consider financial risk management examples and show how the use of SMLE leads to superior Value-at-Risk predictions. The paper comes with an online archive which contains all code and datasets.

Inference · Frequentist school · Smoothing · Identifiable · Estimation/Estimator

January 29, 2024

Bayesian one- and two-sided inference on the local effective dimension

Eduard Belitser

It is a challenge to manage infinite- or high-dimensional data in situations where storage, transmission, or computation resources are constrained. In the simplest scenario, when the data consists of a noisy infinite-dimensional signal, we introduce the notion of local effective dimension (i.e., pertinent to the underlying signal), and formulate and study the problem of its recovery on the basis of noisy data. This problem is related to the problems of adaptive quantization, (lossy) data compression, and oracle signal estimation. We apply a Bayesian approach and study frequentist properties of the resulting posterior; a purely frequentist version of the results is also proposed. We derive certain upper and lower bound results on identifying the local effective dimension, which show that only the so-called one-sided inference on the local effective dimension can be ensured, whereas two-sided inference is in general impossible. We establish the minimal conditions under which two-sided inference can be made. Finally, the connection to the problem of smoothness estimation for some traditional smoothness scales (Sobolev scales) is considered.

SIR · Statistic · MoDELS · COVID-19 · Inference

January 28, 2024

A filtering approach for statistical inference in a stochastic SIR model with an application to Covid-19 data

Katia Colaneri, Camilla Damian, Rüdiger Frey

from arXiv, 21 pages, 9 figures, 1 table

In this paper, we consider a discrete-time stochastic SIR model, where the transmission rate and the true number of infectious individuals are random and unobservable. An advantage of this model is that it permits us to account for random fluctuations in infectiousness and for non-detected infections. However, a difficulty arises because statistical inference has to be done in a partial information setting. We adopt a nested particle filtering approach to estimate the reproduction rate and the model parameters. As a case study, we apply our methodology to Austrian Covid-19 infection data. Moreover, we discuss forecasts and model tests.
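
The partial-information setting can be sketched by simulating a discrete-time stochastic SIR model in which the transmission rate fluctuates randomly and only a fraction of new infections is reported. The abstract does not give the paper's exact specification, so the chain-binomial transitions, the log random walk for the transmission rate, the binomial reporting model, and every name below are assumptions made for illustration; no particle filter is implemented here.

```python
import math
import random


def binomial(rng, n, p):
    """Draw from Binomial(n, p) as a sum of Bernoulli trials (small n only)."""
    return sum(1 for _ in range(n) if rng.random() < p)


def simulate_sir(n_pop, i0, beta0, gamma, sigma, detect_prob, n_steps, seed=0):
    """Simulate hidden SIR states and partially observed case counts."""
    rng = random.Random(seed)
    s, i, r = n_pop - i0, i0, 0
    log_beta = math.log(beta0)
    path = []
    for _ in range(n_steps):
        # Hidden transmission rate: random walk on the log scale (assumed).
        log_beta += rng.gauss(0.0, sigma)
        beta = math.exp(log_beta)
        # Each susceptible is infected with probability 1 - exp(-beta * I / N).
        p_inf = 1.0 - math.exp(-beta * i / n_pop)
        new_inf = binomial(rng, s, p_inf)
        new_rec = binomial(rng, i, gamma)
        # Observation: only a fraction of new infections is detected.
        reported = binomial(rng, new_inf, detect_prob)
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
        path.append({"S": s, "I": i, "R": r, "beta": beta,
                     "new_infections": new_inf, "reported": reported})
    return path


if __name__ == "__main__":
    for step in simulate_sir(500, 5, 0.4, 0.1, 0.05, 0.6, n_steps=10, seed=1):
        print(step["reported"], step["I"], round(step["beta"], 3))
```

An inference procedure would see only the `reported` column and would have to recover `beta` and `I` from it, which is what makes a filtering approach natural for this problem.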

Inference · MoDELS · Confidence · Correlation coefficient · Marginalization

January 27, 2024

Improved confidence intervals for nonlinear mixed-effects and nonparametric regression models

Nan Zheng, Noel Cadigan

Statistical inference for high dimensional parameters (HDPs) can be based on their intrinsic correlation; that is, parameters that are close spatially or temporally tend to have more similar values. This is why nonlinear mixed-effects models (NMMs) are commonly (and appropriately) used for models with HDPs. Conversely, in many practical applications of NMM, the random effects (REs) are actually correlated HDPs that should remain constant during repeated sampling for frequentist inference. In both scenarios, the inference should be conditional on REs, instead of marginal inference by integrating out REs. In this paper, we first summarize recent theory of conditional inference for NMM, and then propose a bias-corrected RE predictor and confidence interval (CI). We also extend this methodology to accommodate the case where some REs are not associated with data. Simulation studies indicate that this new approach leads to substantial improvement in the conditional coverage rate of RE CIs, including CIs for smooth functions in generalized additive models, as compared to the existing method based on marginal inference.

Predictor/Decision function · Functional · Sparse · Estimation/Estimator · Linear

January 26, 2024

Sparse semiparametric regression when predictors are mixture of functional and high-dimensional variables

Silvia Novo, Germán Aneiros, Philippe Vieu

from arXiv, 40 pages, 7 figures, 5 tables

This paper addresses dimensionality reduction in the regression setting when the predictors are a mixture of a functional variable and a high-dimensional vector. A flexible model, combining sparse linear ideas with semiparametric ones, is proposed. A wide range of asymptotic results is provided, covering both the rates of convergence of the estimators and the asymptotic behaviour of the variable selection procedure. Practical issues are analysed through finite-sample simulated experiments, while an application to Tecator's data illustrates the usefulness of our methodology.

entity · Graph · Knowledge graph · MoDELS · Link prediction

August 10, 2020

A survey of embedding models of entities and relationships for knowledge graph completion

Dat Quoc Nguyen

from arXiv, 13 pages, 2 figures and 6 tables

Knowledge graphs (KGs) of real-world facts about entities and their relationships are useful resources for a variety of natural language processing tasks. However, because knowledge graphs are typically incomplete, it is useful to perform knowledge graph completion or link prediction, i.e. predict whether a relationship not in the knowledge graph is likely to be true. This paper serves as a comprehensive survey of embedding models of entities and relationships for knowledge graph completion, summarizing up-to-date experimental results on standard benchmark datasets and pointing out potential future research directions.
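
Link prediction with an embedding model can be illustrated with a translation-style score in the spirit of TransE, one well-known model of this kind: a triple (head, relation, tail) is plausible when head + relation lies close to tail. The abstract does not name specific models, so the choice of scoring function, the hand-picked two-dimensional embeddings, and the entity names are all assumptions for illustration, not learned values or results from the survey.

```python
import math


def translation_score(head, relation, tail):
    """Negative Euclidean distance between head + relation and tail.

    Higher (closer to zero) means the triple is judged more plausible.
    """
    return -math.sqrt(sum((h + r - t) ** 2
                          for h, r, t in zip(head, relation, tail)))


def rank_tails(head, relation, candidates):
    """Rank candidate tail entities for the query (head, relation, ?)."""
    scored = [(translation_score(head, relation, vec), name)
              for name, vec in candidates.items()]
    return [name for _, name in sorted(scored, reverse=True)]


# Toy, hand-picked embeddings (hypothetical, not trained).
ENTITIES = {
    "Paris": [1.0, 0.0],
    "France": [1.0, 1.0],
    "Germany": [3.0, 1.0],
}
CAPITAL_OF = [0.0, 1.0]

if __name__ == "__main__":
    tails = {k: v for k, v in ENTITIES.items() if k != "Paris"}
    print(rank_tails(ENTITIES["Paris"], CAPITAL_OF, tails))
```

In practice the embeddings are learned so that triples observed in the graph score higher than corrupted ones, and missing links are proposed by ranking candidates as `rank_tails` does.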

Graphics processing unit · Graph · INTERACT · Performer · Neural Networks

November 6, 2019

Hyper-SAGNN: a self-attention based graph neural network for hypergraphs

Ruochi Zhang, Yuesong Zou, Jian Ma

Graph representation learning for hypergraphs can be used to extract patterns among higher-order interactions that are critically important in many real world problems. Current approaches designed for hypergraphs, however, are unable to handle different types of hypergraphs and are typically not generic for various learning tasks. Indeed, models that can predict variable-sized heterogeneous hyperedges have not been available. Here we develop a new self-attention based graph neural network called Hyper-SAGNN applicable to homogeneous and heterogeneous hypergraphs with variable hyperedge sizes. We perform extensive evaluations on multiple datasets, including four benchmark network datasets and two single-cell Hi-C datasets in genomics. We demonstrate that Hyper-SAGNN significantly outperforms the state-of-the-art methods on traditional tasks while also achieving great performance on a new task called outsider identification. Hyper-SAGNN will be useful for graph representation learning to uncover complex higher-order interactions in different applications.