
The information geometry of Markov chains has been studied by Nagaoka, Takeuchi and others using the dually flat structure of the space of transition probabilities. In this context, a submanifold of the space is called a Markov model. In the present paper, we seek a theory of extended spaces of Markov models in the following sense. As a prototype, for the space of probability distributions on a finite set, Amari introduced the space of positive measures simply by removing the constraint that the total mass equals $1$, and investigated the extended space by identifying suitable Bregman and $F$-divergences. Following this line, we introduce an extension of the space of transition probabilities, equipped with a suitable $F$-divergence, for a given Markov chain. We regard it as the space of positive transition measures on a Markov chain and study the dually flat structure on this space. This provides new insight into the geometry of Markov chains. We also discuss the relation to existing work.
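As a small illustration of the prototype case on a finite set, the sketch below computes the extended Kullback-Leibler divergence between two positive measures, $D(m\|n) = \sum_i \left( m_i \log(m_i/n_i) - m_i + n_i \right)$, which is the Bregman divergence generated by $\sum_i (m_i \log m_i - m_i)$. This is a standard divergence on positive measures; whether it coincides with the divergence used in the paper is an assumption, and the function name `generalized_kl` is hypothetical.

```python
import numpy as np


def generalized_kl(m, n):
    """Extended KL divergence between positive measures m and n on a finite set.

    Illustrative only: the total mass of m and n is NOT required to be 1.
    When both sum to 1, the extra terms (-m + n) cancel and the usual
    KL divergence between probability distributions is recovered.
    """
    m = np.asarray(m, dtype=float)
    n = np.asarray(n, dtype=float)
    if m.shape != n.shape:
        raise ValueError("m and n must have the same shape")
    if np.any(m <= 0) or np.any(n <= 0):
        raise ValueError("all masses must be strictly positive")
    return float(np.sum(m * np.log(m / n) - m + n))
```

For two probability vectors the result agrees with the ordinary KL divergence, and for general positive measures it is non-negative and vanishes only when the two measures coincide.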

Related content

We prove asymptotic results for a modification of the cross-entropy estimator originally introduced by Ziv and Merhav in the Markovian setting in 1993. Our results concern a more general class of decoupled measures. In particular, our results imply strong asymptotic consistency of the modified estimator for all pairs of functions of stationary, irreducible, finite-state Markov chains satisfying a mild decay condition. Our approach is based on the study of a rescaled cumulant-generating function called the cross-entropic pressure, importing to information theory some techniques from the study of large deviations within the thermodynamic formalism.

The Manifold Hypothesis is a widely accepted tenet of Machine Learning which asserts that nominally high-dimensional data are in fact concentrated near a low-dimensional manifold, embedded in high-dimensional space. This phenomenon is observed empirically in many real-world situations, has led to the development of a wide range of statistical methods in the last few decades, and has been suggested as a key factor in the success of modern AI technologies. We show that rich and sometimes intricate manifold structure in data can emerge from a generic and remarkably simple statistical model -- the Latent Metric Model -- via elementary concepts such as latent variables, correlation and stationarity. This establishes a general statistical explanation for why the Manifold Hypothesis seems to hold in so many situations. Informed by the Latent Metric Model, we derive procedures to discover and interpret the geometry of high-dimensional data, and to explore hypotheses about the data-generating mechanism. These procedures operate under minimal assumptions and make use of well-known, scalable graph-analytic algorithms.

We give the first numerical calculation of the spectrum of the Laplacian acting on bundle-valued forms on a Calabi-Yau three-fold. Specifically, we show how to compute the approximate eigenvalues and eigenmodes of the Dolbeault Laplacian acting on bundle-valued $(p,q)$-forms on K\"ahler manifolds. We restrict our attention to line bundles over complex projective space and Calabi-Yau hypersurfaces therein. We give three examples. For two of these, $\mathbb{P}^3$ and a Calabi-Yau one-fold (a torus), we compare our numerics with exact results available in the literature and find complete agreement. For the third example, the Fermat quintic three-fold, there are no known analytic results, so our numerical calculations are the first of their kind. The resulting spectra pass a number of non-trivial checks that arise from Serre duality and the Hodge decomposition. The outputs of our algorithm include all the ingredients one needs to compute physical Yukawa couplings in string compactifications.

We characterize the strength, in terms of Weihrauch degrees, of certain problems related to Ramsey-like theorems concerning colourings of the rationals and of the natural numbers. The theorems we are chiefly interested in assert the existence of almost-homogeneous sets for colourings of pairs of rationals and of pairs of natural numbers, respectively, satisfying properties determined by some additional algebraic structure on the set of colours. In the context of reverse mathematics, most of the principles we study are equivalent to $\Sigma^0_2$-induction over $\mathrm{RCA}_0$. The associated problems in the Weihrauch lattice are related to $\mathrm{TC}_\mathbb{N}^*$, $(\mathrm{LPO}')^*$ or their product, depending on their precise formalizations.

We consider the numerical approximation of Gaussian random fields on closed surfaces defined as the solution to a fractional stochastic partial differential equation (SPDE) with additive white noise. The SPDE involves two parameters controlling the smoothness and the correlation length of the Gaussian random field. The proposed numerical method relies on the Balakrishnan integral representation of the solution and does not require the approximation of eigenpairs. Rather, it consists of a sinc quadrature coupled with a standard surface finite element method. We provide a complete error analysis of the method and illustrate its performance with several numerical experiments.

This work is concerned with the analysis of a space-time finite element discontinuous Galerkin method on polytopal meshes (XT-PolydG) for the numerical discretization of wave propagation in coupled poroelastic-elastic media. The mathematical model consists of the low-frequency Biot's equations in the poroelastic medium and the elastodynamics equation for the elastic one. To realize the coupling, suitable transmission conditions on the interface between the two domains are (weakly) embedded in the formulation. The proposed PolydG discretization in space is then coupled with a dG time integration scheme, resulting in a full space-time dG discretization. We present the stability analysis for both the continuous and the semidiscrete formulations, and we derive error estimates for the semidiscrete formulation in a suitable energy norm. The method is applied to a wide set of numerical test cases to verify the theoretical bounds. Examples of physical interest are also presented to investigate the capability of the proposed method in relevant geophysical scenarios.

In this note we use the State of the Union Address dataset from Kaggle to make some surprising (and some not so surprising) observations pertaining to the general timeline of American history, and the character and nature of the addresses themselves. Our main approach is to use vector embeddings, such as BERT (DistilBERT) and GPT-2. While it is widely believed that BERT (and its variations) is most suitable for NLP classification tasks, we find that GPT-2 in conjunction with nonlinear dimension reduction methods such as UMAP provides better separation and stronger clustering. This makes GPT-2 + UMAP an interesting alternative. In our case, no model fine-tuning is required, and the pre-trained out-of-the-box GPT-2 model is enough. We also used a fine-tuned DistilBERT model for classification (detecting which president delivered which address), with very good results (accuracy 93% - 95% depending on the run). All computations can be replicated by using the accompanying code on GitHub.

In many applications, it is desirable to obtain extreme eigenvalues and eigenvectors of large Hermitian matrices by efficient and compact algorithms. In particular, orthogonalization-free methods are preferred for large-scale problems, since they find eigenspaces of extreme eigenvalues without explicitly computing orthogonal vectors in each iteration. For the top $p$ eigenvalues, the simplest orthogonalization-free method is to find the best rank-$p$ approximation to a positive semi-definite Hermitian matrix by algorithms solving the unconstrained Burer-Monteiro formulation. We show that the nonlinear conjugate gradient method for the unconstrained Burer-Monteiro formulation is equivalent to a Riemannian conjugate gradient method on a quotient manifold with the Bures-Wasserstein metric; thus its global convergence to a stationary point can be proven. Numerical tests suggest that it is efficient for computing the largest $k$ eigenvalues of large-scale matrices if the largest $k$ eigenvalues are nearly uniformly distributed.
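The unconstrained Burer-Monteiro idea can be sketched as minimizing $f(X) = \|A - XX^{\top}\|_F^2$ over $X \in \mathbb{R}^{n \times p}$, with no orthogonalization inside the loop. The sketch below makes several assumptions that are not from the abstract: it uses plain gradient descent rather than the nonlinear conjugate gradient method studied there, it treats a real symmetric positive semi-definite matrix, and the function name, step-size rule and iteration count are hypothetical choices for a small example.

```python
import numpy as np


def top_eigenvalues_bm(A, p, iters=10000, seed=0):
    """Estimate the top-p eigenvalues of a symmetric PSD matrix A.

    Illustrative sketch: plain gradient descent on the unconstrained
    Burer-Monteiro objective f(X) = ||A - X X^T||_F^2, whose gradient is
    -4 (A - X X^T) X. No orthogonalization is performed during iteration.
    """
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    rng = np.random.default_rng(seed)
    X = 0.1 * rng.standard_normal((n, p))  # small random starting point
    # Assumed heuristic: a conservative step scaled by the Frobenius norm,
    # which upper-bounds the spectral norm of A.
    step = 0.05 / np.linalg.norm(A, "fro")
    for _ in range(iters):
        residual = A - X @ X.T
        X = X + 4.0 * step * (residual @ X)  # descent step: X - step * grad
    # The nonzero eigenvalues of X X^T equal those of the small p-by-p
    # matrix X^T X, so only a p-by-p eigenproblem is solved at the end.
    return np.sort(np.linalg.eigvalsh(X.T @ X))[::-1]
```

At a minimizer, $XX^{\top}$ is the best rank-$p$ approximation of $A$, so the eigenvalues of $X^{\top}X$ approximate the $p$ largest eigenvalues of $A$. Convergence speed in this sketch depends on the gap between the $p$-th and $(p+1)$-th eigenvalues.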

Robots operating in an open world will encounter novel objects with unknown physical properties, such as mass, friction, or size. These robots will need to sense these properties through interaction prior to performing downstream tasks with the objects. We propose a method that autonomously learns tactile exploration policies by developing a generative world model that is leveraged to 1) estimate the object's physical parameters using a differentiable Bayesian filtering algorithm and 2) develop an exploration policy using an information-gathering model predictive controller. We evaluate our method on three simulated tasks where the goal is to estimate a desired object property (mass, height or toppling height) through physical interaction. We find that our method is able to discover policies that efficiently gather information about the desired property in an intuitive manner. Finally, we validate our method on a real robot system for the height estimation task, where our method is able to successfully learn and execute an information-gathering policy from scratch.

Hashing has been widely used in approximate nearest neighbor search for large-scale database retrieval because of its computational and storage efficiency. Deep hashing, which uses convolutional neural network architectures to extract semantic information or features from images, has received increasing attention recently. In this survey, several deep supervised hashing methods for image retrieval are evaluated, and I identify three main directions for deep supervised hashing methods. Several comments are made at the end. Moreover, to break through the bottleneck of existing hashing methods, I propose a Shadow Recurrent Hashing (SRH) method as an attempt. Specifically, I devise a CNN architecture to extract the semantic features of images and design a loss function that encourages similar images to be projected close together. To this end, I propose a concept: the shadow of the CNN output. During optimization, the CNN output and its shadow guide each other so as to approach the optimal solution as closely as possible. Several experiments on the CIFAR-10 dataset show the satisfactory performance of SRH.
