亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

Computing the marginal likelihood (also called the Bayesian model evidence) is an important task in Bayesian model selection, providing a principled quantitative way to compare models. The learned harmonic mean estimator solves the exploding variance problem of the original harmonic mean estimation of the marginal likelihood. The learned harmonic mean estimator learns an importance sampling target distribution that approximates the optimal distribution. While the approximation need not be highly accurate, it is critical that the probability mass of the learned distribution is contained within the posterior in order to avoid the exploding variance problem. In previous work a bespoke optimization problem is introduced when training models in order to ensure this property is satisfied. In the current article we introduce the use of normalizing flows to represent the importance sampling target distribution. A flow-based model is trained on samples from the posterior by maximum likelihood estimation. Then, the probability density of the flow is concentrated by lowering the variance of the base distribution, i.e. by lowering its "temperature", ensuring its probability mass is contained within the posterior. This approach avoids the need for a bespoke optimisation problem and careful fine tuning of parameters, resulting in a more robust method. Moreover, the use of normalizing flows has the potential to scale to high dimensional settings. We present preliminary experiments demonstrating the effectiveness of the use of flows for the learned harmonic mean estimator. The harmonic code implementing the learned harmonic mean, which is publicly available, has been updated to now support normalizing flows.

相關內容

We address the classical inverse problem of recovering the position and shape of obstacles immersed in a planar Stokes flow using boundary measurements. We prove that this problem can be transformed into a shape-from-moments problem to which ad hoc reconstruction methods can be applied. The effectiveness of this approach is confirmed by numerical tests that show significant improvements over those available in the literature to date.

Model averaging (MA), a technique for combining estimators from a set of candidate models, has attracted increasing attention in machine learning and statistics. In the existing literature, there is an implicit understanding that MA can be viewed as a form of shrinkage estimation that draws the response vector towards the subspaces spanned by the candidate models. This paper explores this perspective by establishing connections between MA and shrinkage in a linear regression setting with multiple nested models. We first demonstrate that the optimal MA estimator is the best linear estimator with monotone non-increasing weights in a Gaussian sequence model. The Mallows MA, which estimates weights by minimizing the Mallows' $C_p$, is a variation of the positive-part Stein estimator. Motivated by these connections, we develop a novel MA procedure based on a blockwise Stein estimation. Our resulting Stein-type MA estimator is asymptotically optimal across a broad parameter space when the variance is known. Numerical results support our theoretical findings. The connections established in this paper may open up new avenues for investigating MA from different perspectives. A discussion on some topics for future research concludes the paper.

This note presents a refined local approximation for the logarithm of the ratio between the negative multinomial probability mass function and a multivariate normal density, both having the same mean-covariance structure. This approximation, which is derived using Stirling's formula and a meticulous treatment of Taylor expansions, yields an upper bound on the Hellinger distance between the jittered negative multinomial distribution and the corresponding multivariate normal distribution. Upper bounds on the Le Cam distance between negative multinomial and multivariate normal experiments ensue.

We present a method for computing nearly singular integrals that occur when single or double layer surface integrals, for harmonic potentials or Stokes flow, are evaluated at points nearby. Such values could be needed in solving an integral equation when one surface is close to another or to obtain values at grid points. We replace the singular kernel with a regularized version having a length parameter $\delta$ in order to control discretization error. Analysis near the singularity leads to an expression for the error due to regularization which has terms with unknown coefficients multiplying known quantities. By computing the integral with three choices of $\delta$ we can solve for an extrapolated value that has regularization error reduced to $O(\delta^5)$. In examples with $\delta/h$ constant and moderate resolution we observe total error about $O(h^5)$. For convergence as $h \to 0$ we can choose $\delta$ proportional to $h^q$ with $q < 1$ to ensure the discretization error is dominated by the regularization error. With $q = 4/5$ we find errors about $O(h^4)$. For harmonic potentials we extend the approach to a version with $O(\delta^7)$ regularization; it typically has smaller errors but the order of accuracy is less predictable.

We investigate the combinatorics of max-pooling layers, which are functions that downsample input arrays by taking the maximum over shifted windows of input coordinates, and which are commonly used in convolutional neural networks. We obtain results on the number of linearity regions of these functions by equivalently counting the number of vertices of certain Minkowski sums of simplices. We characterize the faces of such polytopes and obtain generating functions and closed formulas for the number of vertices and facets in a 1D max-pooling layer depending on the size of the pooling windows and stride, and for the number of vertices in a special case of 2D max-pooling.

As Basu (1977) writes, "Eliminating nuisance parameters from a model is universally recognized as a major problem of statistics," but after more than 50 years since Basu wrote these words, the two mainstream schools of thought in statistics have yet to solve the problem. Fortunately, the two mainstream frameworks aren't the only options. This series of papers rigorously develops a new and very general inferential model (IM) framework for imprecise-probabilistic statistical inference that is provably valid and efficient, while simultaneously accommodating incomplete or partial prior information about the relevant unknowns when it's available. The present paper, Part III in the series, tackles the marginal inference problem. Part II showed that, for parametric models, the likelihood function naturally plays a central role and, here, when nuisance parameters are present, the same principles suggest that the profile likelihood is the key player. When the likelihood factors nicely, so that the interest and nuisance parameters are perfectly separated, the valid and efficient profile-based marginal IM solution is immediate. But even when the likelihood doesn't factor nicely, the same profile-based solution remains valid and leads to efficiency gains. This is demonstrated in several examples, including the famous Behrens--Fisher and gamma mean problems, where I claim the proposed IM solution is the best solution available. Remarkably, the same profiling-based construction offers validity guarantees in the prediction and non-parametric inference problems. Finally, I show how a broader view of this new IM construction can handle non-parametric inference on risk minimizers and makes a connection between non-parametric IMs and conformal prediction.

In this paper, a high-order approximation to Caputo-type time-fractional diffusion equations involving an initial-time singularity of the solution is proposed. At first, we employ a numerical algorithm based on the Lagrange polynomial interpolation to approximate the Caputo derivative on the non-uniform mesh. Then truncation error rate and the optimal grading constant of the approximation on a graded mesh are obtained as $\min\{4-\alpha,r\alpha\}$ and $\frac{4-\alpha}{\alpha}$, respectively, where $\alpha\in(0,1)$ is the order of fractional derivative and $r\geq 1$ is the mesh grading parameter. Using this new approximation, a difference scheme for the Caputo-type time-fractional diffusion equation on graded temporal mesh is formulated. The scheme proves to be uniquely solvable for general $r$. Then we derive the unconditional stability of the scheme on uniform mesh. The convergence of the scheme, in particular for $r=1$, is analyzed for non-smooth solutions and concluded for smooth solutions. Finally, the accuracy of the scheme is verified by analyzing the error through a few numerical examples.

We propose a new class of models for variable clustering called Asymptotic Independent block (AI-block) models, which defines population-level clusters based on the independence of the maxima of a multivariate stationary mixing random process among clusters. This class of models is identifiable, meaning that there exists a maximal element with a partial order between partitions, allowing for statistical inference. We also present an algorithm for recovering the clusters of variables without specifying the number of clusters \emph{a priori}. Our work provides some theoretical insights into the consistency of our algorithm, demonstrating that under certain conditions it can effectively identify clusters in the data with a computational complexity that is polynomial in the dimension. This implies that groups can be learned nonparametrically in which block maxima of a dependent process are only sub-asymptotic. To further illustrate the significance of our work, we applied our method to neuroscience and environmental real-datasets. These applications highlight the potential and versatility of the proposed approach.

This paper studies the extreme singular values of non-harmonic Fourier matrices. Such a matrix of size $m\times s$ can be written as $\Phi=[ e^{-2\pi i j x_k}]_{j=0,1,\dots,m-1, k=1,2,\dots,s}$ for some set $\mathcal{X}=\{x_k\}_{k=1}^s$. The main results provide explicit lower bounds for the smallest singular value of $\Phi$ under the assumption $m\geq 6s$ and without any restrictions on $\mathcal{X}$. They show that for an appropriate scale $\tau$ determined by a density criteria, interactions between elements in $\mathcal{X}$ at scales smaller than $\tau$ are most significant and depends on the multiscale structure of $\mathcal{X}$ at fine scales, while distances larger than $\tau$ are less important and only depend on the local sparsity of the far away points. Theoretical and numerical comparisons show that the main results significantly improve upon classical bounds and achieve the same rate that was previously discovered for more restrictive settings.

We propose an approach to compute inner and outer-approximations of the sets of values satisfying constraints expressed as arbitrarily quantified formulas. Such formulas arise for instance when specifying important problems in control such as robustness, motion planning or controllers comparison. We propose an interval-based method which allows for tractable but tight approximations. We demonstrate its applicability through a series of examples and benchmarks using a prototype implementation.

北京阿比特科技有限公司