
Image acquisition and segmentation are likely to introduce noise, and further image processing such as image registration and parameterization can introduce more. It is thus imperative to reduce measurement noise and boost the signal. To increase the signal-to-noise ratio (SNR) and the smoothness of the data required for the subsequent random field theory based statistical inference, some type of smoothing is necessary. Among many image smoothing methods, Gaussian kernel smoothing has emerged as the de facto smoothing technique among brain imaging researchers due to its simplicity in numerical implementation. Gaussian kernel smoothing also increases statistical sensitivity and statistical power as well as Gaussianness. It can be viewed as weighted averaging of voxel values, so by the central limit theorem the weighted average should be more Gaussian.
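
The weighted-averaging view can be made concrete with a minimal sketch. It assumes a 1D signal on a regular grid rather than a 3D voxel image, and the function name `gaussian_kernel_smooth` and the bandwidth `sigma` are illustrative choices, not taken from the text.

```python
import numpy as np

def gaussian_kernel_smooth(values, sigma):
    """Smooth a 1D signal by Gaussian-weighted averaging of its samples."""
    values = np.asarray(values, dtype=float)
    positions = np.arange(values.size)
    # Squared distance between every pair of sample positions.
    sq_dist = (positions[:, None] - positions[None, :]) ** 2
    weights = np.exp(-sq_dist / (2.0 * sigma ** 2))
    # Normalize each row so the weights sum to 1, i.e. a weighted average.
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ values

# Illustrative data: a sine wave corrupted by Gaussian noise.
rng = np.random.default_rng(0)
signal = np.sin(np.linspace(0, 2 * np.pi, 200))
noisy = signal + rng.normal(scale=0.5, size=signal.size)
smoothed = gaussian_kernel_smooth(noisy, sigma=5.0)
```

Because each output value is a normalized weighted sum of many noisy inputs, the noise variance drops while a slowly varying signal is largely preserved, which is the SNR gain described above.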

Related content

Researchers may use a sketch of data of size $m$ instead of the full sample of size $n$ sometimes to relieve computation burden, and other times to maintain data privacy. This paper considers the case when full sample estimation would have required the Eicker-Huber-White robust standard errors to account for heteroskedasticity. We show that random projections have a smoothing effect on the sketched data, with the consequence that the least squares estimates using such sketched data behave 'as if' the errors were homoskedastic. This result is obtained by expressing the difference between the moments computed from the full sample and the sketched data as a degenerate $U$-statistic which is asymptotically normal with a homoskedastic variance when the conditions in Hall (1984) are satisfied. This result also holds for two-stage least squares for which algorithmic and statistical properties are analyzed. Sketches produced by random sampling will not, however, have the effect of homogenizing the error variances.
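
As a rough illustration of least squares on sketched data, the snippet below projects the full sample of size $n$ down to $m$ rows and runs ordinary least squares on the result. The Gaussian sketching matrix, the function name `sketched_least_squares`, and the simulated heteroskedastic data are assumptions made for this example; the abstract does not specify which random projection is used.

```python
import numpy as np

def sketched_least_squares(X, y, m, rng):
    """OLS on a random-projection sketch (Pi @ X, Pi @ y) with m rows."""
    n = X.shape[0]
    # Assumption: Gaussian sketching matrix with i.i.d. N(0, 1/m) entries.
    Pi = rng.normal(scale=1.0 / np.sqrt(m), size=(m, n))
    beta, *_ = np.linalg.lstsq(Pi @ X, Pi @ y, rcond=None)
    return beta

rng = np.random.default_rng(0)
n, m = 2000, 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([1.0, 2.0])
# Heteroskedastic errors: the noise scale grows with |x|.
y = X @ beta_true + np.abs(X[:, 1]) * rng.normal(size=n)

beta_full, *_ = np.linalg.lstsq(X, y, rcond=None)    # full-sample estimate
beta_sketch = sketched_least_squares(X, y, m, rng)   # sketched estimate
```

Each sketched observation is a random linear combination of all $n$ original observations, which gives some intuition for the smoothing effect on the errors that the abstract describes. A sketch built by sampling rows would instead keep individual observations and their individual error variances.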

The classification of histopathology images fundamentally differs from traditional image classification tasks because histopathology images naturally exhibit a range of diagnostic features, resulting in a diverse range of annotator agreement levels. However, examples with high annotator disagreement are often either assigned the majority label or discarded entirely when training histopathology image classifiers. This widespread practice often yields classifiers that do not account for example difficulty and exhibit poor model calibration. In this paper, we ask: can we improve model calibration by endowing histopathology image classifiers with inductive biases about example difficulty? We propose several label smoothing methods that utilize per-image annotator agreement. Though our methods are simple, we find that they substantially improve model calibration, while maintaining (or even improving) accuracy. For colorectal polyp classification, a common yet challenging task in gastrointestinal pathology, we find that our proposed agreement-aware label smoothing methods reduce calibration error by almost 70%. Moreover, we find that using model confidence as a proxy for annotator agreement also improves calibration and accuracy, suggesting that datasets without multiple annotators can still benefit from our proposed label smoothing methods via our proposed confidence-aware label smoothing methods. Given the importance of calibration (especially in histopathology image analysis), the improvements from our proposed techniques merit further exploration and potential implementation in other histopathology image classification tasks.
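
One way such a method could look is sketched below. The linear rule that ties the smoothing strength to annotator agreement, the cap `max_eps`, and the function name are hypothetical; the abstract does not give the exact formulas the authors use.

```python
import numpy as np

def agreement_aware_smoothing(labels, agreement, num_classes, max_eps=0.2):
    """Soft targets whose smoothing strength depends on annotator agreement.

    labels:    integer class index per image (e.g. the majority label)
    agreement: fraction of annotators agreeing with that label, in [0, 1]
    """
    labels = np.asarray(labels)
    agreement = np.asarray(agreement, dtype=float)
    one_hot = np.eye(num_classes)[labels]
    # Hypothetical rule: smoothing grows linearly as agreement falls.
    eps = max_eps * (1.0 - agreement)
    # Standard label smoothing, applied with a per-example epsilon.
    return (1.0 - eps)[:, None] * one_hot + eps[:, None] / num_classes

# First image has full agreement, second has only half of the annotators agreeing.
targets = agreement_aware_smoothing(labels=[0, 2], agreement=[1.0, 0.5], num_classes=4)
```

Under this rule an image with unanimous annotators keeps a hard one-hot target, while a contested image gets a softer target, so the classifier is not pushed toward full confidence on difficult examples.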

Causal Optimal Transport (COT) results from imposing a temporal causality constraint on classic optimal transport problems, which naturally generates a new concept of distances between distributions on path spaces. The first application of the COT theory for sequential learning was given in Xu et al. (2020), where COT-GAN was introduced as an adversarial algorithm to train implicit generative models optimized for producing sequential data. Building on Xu et al. (2020), the contribution of the present paper is twofold. First, we develop a conditional version of COT-GAN suitable for sequence prediction. This means that the dataset is now used to learn how a sequence will evolve given the observation of its past evolution. Second, we improve on the convergence results by working with modifications of the empirical measures via kernel smoothing due to Pflug and Pichler (2016). The resulting kernel conditional COT-GAN algorithm is illustrated with an application for video prediction.

Gaussian process regression is a widely-applied method for function approximation and uncertainty quantification. The technique has gained popularity recently in the machine learning community due to its robustness and interpretability. The mathematical methods we discuss in this paper are an extension of the Gaussian-process framework. We are proposing advanced kernel designs that only allow for functions with certain desirable characteristics to be elements of the reproducing kernel Hilbert space (RKHS) that underlies all kernel methods and serves as the sample space for Gaussian process regression. These desirable characteristics reflect the underlying physics; two obvious examples are symmetry and periodicity constraints. In addition, non-stationary kernel designs can be defined in the same framework to yield flexible multi-task Gaussian processes. We will show the impact of advanced kernel designs on Gaussian processes using several synthetic and two scientific data sets. The results show that including domain knowledge, communicated through advanced kernel designs, has a significant impact on the accuracy and relevance of the function approximation.
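
A symmetry constraint of this kind can be illustrated with a small sketch. It assumes 1D inputs, a squared-exponential base kernel, and the reflection symmetry $f(x) = f(-x)$; the symmetrized kernel is obtained by averaging the base kernel over the reflection, which is a standard construction and not necessarily the design used in the paper. All function names are illustrative.

```python
import numpy as np

def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between 1D input arrays a and b."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length_scale ** 2)

def even_kernel(a, b):
    """Kernel averaged over x -> -x; its RKHS contains only even functions."""
    return 0.25 * (rbf(a, b) + rbf(-a, b) + rbf(a, -b) + rbf(-a, -b))

def gp_posterior_mean(kernel, x_train, y_train, x_test, noise=1e-6):
    """Posterior mean of a zero-mean GP with the given kernel."""
    K = kernel(x_train, x_train) + noise * np.eye(x_train.size)
    return kernel(x_test, x_train) @ np.linalg.solve(K, y_train)

x_train = np.array([0.5, 1.0, 2.0])   # observations only on the positive side
y_train = np.cos(x_train)             # cos is an even function
x_test = np.array([-2.0, -1.0, 1.0, 2.0])
mean = gp_posterior_mean(even_kernel, x_train, y_train, x_test)
```

Since `even_kernel(x, .)` and `even_kernel(-x, .)` coincide, the posterior mean at `-x` equals the one at `x`, so data observed on one side of the origin informs predictions on the other.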

Kernel matrices, which arise from discretizing a kernel function $k(x,x')$, have a variety of applications in mathematics and engineering. Classically, the celebrated fast multipole method was designed to perform matrix multiplication on kernel matrices of dimension $N$ in time almost linear in $N$ by using techniques later generalized into the linear algebraic framework of hierarchical matrices. In light of this success, we propose a quantum algorithm for efficiently performing matrix operations on hierarchical matrices by implementing a quantum block-encoding of the hierarchical matrix structure. When applied to many kernel matrices, our quantum algorithm can solve quantum linear systems of dimension $N$ in time $O(\kappa \operatorname{polylog}(\frac{N}{\varepsilon}))$, where $\kappa$ and $\varepsilon$ are the condition number and error bound of the matrix operation. This runtime is exponentially faster than any existing quantum algorithms for implementing dense kernel matrices. Finally, we discuss possible applications of our methodology in solving integral equations or accelerating computations in N-body problems.

Gaussian processes are among the most useful tools in modeling continuous processes in machine learning and statistics. If the value of a process is known at a finite collection of points, one may use Gaussian processes to construct a surface which interpolates these values to be used for prediction and uncertainty quantification in other locations. However, it is not always the case that the available information is in the form of a finite collection of points. For example, boundary value problems contain information on the boundary of a domain, which is an uncountable collection of points that cannot be incorporated into typical Gaussian process techniques. In this paper we construct a Gaussian process model which utilizes reproducing kernel Hilbert spaces to unify the typical finite case with the case of having uncountable information by exploiting the equivalence of conditional expectation and orthogonal projections. We discuss this construction in statistical models, including numerical considerations and a proof of concept.

Cluster analysis aims at partitioning data into groups or clusters. In applications, it is common to deal with problems where the number of clusters is unknown. Bayesian mixture models employed in such applications usually specify a flexible prior that takes into account the uncertainty with respect to the number of clusters. However, a major empirical challenge involving the use of these models is in the characterisation of the induced prior on the partitions. This work introduces an approach to compute descriptive statistics of the prior on the partitions for three selected Bayesian mixture models developed in the areas of Bayesian finite mixtures and Bayesian nonparametrics. The proposed methodology involves computationally efficient enumeration of the prior on the number of clusters in-sample (termed "data clusters") and determining the first two prior moments of symmetric additive statistics characterising the partitions. The accompanying reference implementation is made available in the R package 'fipp'. Finally, we illustrate the proposed methodology through comparisons and also discuss the implications for prior elicitation in applications.

The area of Data Analytics on graphs promises a paradigm shift as we approach information processing of classes of data, which are typically acquired on irregular but structured domains (social networks, various ad-hoc sensor networks). Yet, despite its long history, current approaches mostly focus on the optimization of graphs themselves, rather than on directly inferring learning strategies, such as detection, estimation, statistical and probabilistic inference, clustering and separation from signals and data acquired on graphs. To fill this void, we first revisit graph topologies from a Data Analytics point of view, and establish a taxonomy of graph networks through a linear algebraic formalism of graph topology (vertices, connections, directivity). This serves as a basis for spectral analysis of graphs, whereby the eigenvalues and eigenvectors of graph Laplacian and adjacency matrices are shown to convey physical meaning related to both graph topology and higher-order graph properties, such as cuts, walks, paths, and neighborhoods. Next, to illustrate estimation strategies performed on graph signals, spectral analysis of graphs is introduced through eigenanalysis of mathematical descriptors of graphs and in a generic way. Finally, a framework for vertex clustering and graph segmentation is established based on graph spectral representation (eigenanalysis) which illustrates the power of graphs in various data association tasks. The supporting examples demonstrate the promise of Graph Data Analytics in modeling structural and functional/semantic inferences. At the same time, Part I serves as a basis for Part II and Part III which deal with theory, methods and applications of processing Data on Graphs and Graph Topology Learning from data.
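
The link between Laplacian eigenvectors and vertex clustering can be sketched on a toy graph. The example below assumes an undirected, unweighted graph and uses the sign of the Fiedler vector (the eigenvector of the second-smallest Laplacian eigenvalue) to split the vertices in two. The function name and the toy graph are illustrative; this is one textbook instance of spectral clustering, not the full framework of the paper.

```python
import numpy as np

def fiedler_partition(adjacency):
    """Split vertices in two using the sign of the Fiedler vector."""
    A = np.asarray(adjacency, dtype=float)
    L = np.diag(A.sum(axis=1)) - A         # combinatorial Laplacian L = D - A
    eigvals, eigvecs = np.linalg.eigh(L)   # eigenvalues in ascending order
    fiedler = eigvecs[:, 1]                # second-smallest eigenvalue
    return fiedler > 0

# Toy graph: two triangles {0, 1, 2} and {3, 4, 5} joined by the edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
labels = fiedler_partition(A)
```

The resulting boolean labels put each triangle in its own group, so the partition cuts only the single bridging edge. Which triangle receives `True` depends on the arbitrary sign of the eigenvector.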

Multi-source translation is an approach that exploits multiple inputs (e.g. in two different languages) to increase translation accuracy. In this paper, we examine approaches for multi-source neural machine translation (NMT) using an incomplete multilingual corpus in which some translations are missing. In practice, many multilingual corpora are incomplete because of the difficulty of providing translations in all of the relevant languages (for example, in TED talks, most English talks only have subtitles for a small portion of the languages that TED supports). Existing studies on multi-source translation did not explicitly handle such situations. This study focuses on the use of incomplete multilingual corpora in multi-encoder NMT and mixture of NMT experts, and examines a very simple implementation where missing source translations are replaced by a special symbol <NULL>. These methods allow us to use incomplete corpora at both training time and test time. In experiments with real incomplete multilingual corpora of TED Talks, the multi-source NMT with the <NULL> tokens achieved higher translation accuracy, as measured by BLEU, than any one-to-one NMT system.

This paper describes a suite of algorithms for constructing low-rank approximations of an input matrix from a random linear image of the matrix, called a sketch. These methods can preserve structural properties of the input matrix, such as positive-semidefiniteness, and they can produce approximations with a user-specified rank. The algorithms are simple, accurate, numerically stable, and provably correct. Moreover, each method is accompanied by an informative error bound that allows users to select parameters a priori to achieve a given approximation quality. These claims are supported by numerical experiments with real and synthetic data.
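
A minimal sketch of reconstructing a low-rank approximation from a sketch is given below. It assumes Gaussian test matrices and a two-sided sketch (a range sketch `Y` and a co-range sketch `W`), with sketch sizes `k` and `l` chosen by the user. The function and variable names are illustrative, and the paper's own algorithms (including the structure-preserving and fixed-rank variants) may differ in detail.

```python
import numpy as np

def sketch_low_rank(A, k, l, rng):
    """Approximate A by Q @ X using only the sketches Y = A @ Omega and W = Psi @ A."""
    m, n = A.shape
    Omega = rng.normal(size=(n, k))   # test matrix for the range of A
    Psi = rng.normal(size=(l, m))     # test matrix for the co-range of A
    Y = A @ Omega                     # m x k range sketch
    W = Psi @ A                       # l x n co-range sketch
    # From here on, only Y, W and Psi are used; A itself is not touched.
    Q, _ = np.linalg.qr(Y)            # orthonormal basis containing range(Y)
    # Solve (Psi @ Q) X = W in the least-squares sense.
    X, *_ = np.linalg.lstsq(Psi @ Q, W, rcond=None)
    return Q, X

rng = np.random.default_rng(0)
# Illustrative input: an exactly rank-5 matrix of size 50 x 40.
A = rng.normal(size=(50, 5)) @ rng.normal(size=(5, 40))
Q, X = sketch_low_rank(A, k=10, l=20, rng=rng)
A_hat = Q @ X
```

For an input whose rank does not exceed `k`, this reconstruction recovers the matrix up to rounding error. For general inputs the approximation quality depends on how quickly the singular values decay and on the sketch sizes.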
