
The motivation for this paper is to detect when an irreducible projective variety V is not toric. We do this by analyzing a Lie group and a Lie algebra associated to V. If the dimension of V is strictly less than the dimension of these associated objects, then V is not a toric variety. We provide an algorithm to compute the Lie algebra of an irreducible variety and use it to produce examples of non-toric statistical models in algebraic statistics.

Related content

We propose an extension of Akaike's relative power contribution that can be applied to data with correlated noise terms. The method decomposes the power spectrum into the contributions of the independent noises plus additional terms arising from the correlation between pairs of noises. Numerical examples confirm that some of the correlated noise has the effect of reducing the power spectrum.
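
Akaike's classical relative power contribution, which the paper extends, is computed from a fitted vector autoregressive (VAR) model. A minimal Python sketch of the uncorrelated-noise case follows; the function name and the toy VAR(1) coefficients are illustrative, and the paper's extension adds cross terms from correlated noise pairs that are not shown here.

```python
import numpy as np

def power_contributions(A_coeffs, Sigma, freqs):
    """Classical relative power contribution for a VAR(p) model
    x_t = sum_k A_k x_{t-k} + e_t with diagonal noise covariance Sigma.
    Returns, for each frequency, the power of series i contributed by noise j
    (up to a constant spectral normalization factor)."""
    m = Sigma.shape[0]
    out = []
    for f in freqs:
        # Transfer function A(f) = I - sum_k A_k exp(-2*pi*i*k*f), H(f) = A(f)^{-1}.
        Af = np.eye(m, dtype=complex)
        for k, Ak in enumerate(A_coeffs, start=1):
            Af -= Ak * np.exp(-2j * np.pi * k * f)
        H = np.linalg.inv(Af)
        # S_ii(f) = sum_j |H_ij(f)|^2 * sigma_j^2 when the noises are independent.
        out.append(np.abs(H) ** 2 * np.diag(Sigma)[None, :])
    return np.array(out)  # shape (len(freqs), m, m)

# Toy 2-variable VAR(1) example.
A1 = np.array([[0.5, 0.2],
               [0.1, 0.4]])
Sigma = np.diag([1.0, 0.5])
contrib = power_contributions([A1], Sigma, np.linspace(0.0, 0.5, 5))
relative = contrib / contrib.sum(axis=2, keepdims=True)  # relative contributions
print(relative[0])
```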

For industrial learning-to-rank (LTR) systems, it is common for the output of a ranking model to be modified, either as a result of post-processing logic that enforces business requirements or as a result of unforeseen design flaws or bugs in real-world production systems. This poses a challenge for deploying off-policy learning and evaluation methods, as these often rely on the assumption that the rankings implied by the model's scores coincide with the items displayed to users. Further requirements for reliable offline evaluation are proper randomization and correct estimation of the propensity of displaying each item in any given position of the ranking, both of which are also affected by the aforementioned post-processing. We investigate empirically how these scenarios impair off-policy evaluation for learning-to-rank models. We then propose a novel correction method based on the Birkhoff-von-Neumann decomposition that is robust to this type of post-processing. In offline experiments we obtain more accurate off-policy estimates, overcoming the problem of post-processed rankings. To the best of our knowledge, this is the first study of the impact of real-world business rules on the offline evaluation of LTR models.
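
The correction method itself is specified in the paper; as background, the following Python sketch (function name and toy matrix are illustrative) computes the Birkhoff-von-Neumann decomposition it builds on, writing a doubly stochastic matrix of item-position display probabilities as a convex combination of permutation matrices, i.e., of deterministic rankings.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def birkhoff_von_neumann(D, tol=1e-9):
    """Decompose a doubly stochastic matrix D into weights and permutations
    such that D ~= sum_k w_k P_k (Birkhoff's theorem)."""
    D = D.astype(float).copy()
    n = D.shape[0]
    weights, perms = [], []
    for _ in range(n * n):  # at most O(n^2) terms are ever needed
        if D.sum() < tol:
            break
        # Find a permutation supported on the positive entries of D by
        # maximizing the product of the selected entries (sum of logs).
        cost = -np.log(np.maximum(D, 1e-300))
        rows, cols = linear_sum_assignment(cost)
        w = D[rows, cols].min()
        if w < tol:
            break
        weights.append(w)
        perms.append(cols.copy())        # permutation stored as column indices
        D[rows, cols] -= w               # subtract w * P_k from D
    return weights, perms

# Toy usage: display probabilities of 3 items over 3 positions.
D = np.array([[0.5, 0.3, 0.2],
              [0.3, 0.4, 0.3],
              [0.2, 0.3, 0.5]])
weights, perms = birkhoff_von_neumann(D)
print([(round(w, 3), p.tolist()) for w, p in zip(weights, perms)])
```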

This paper discusses the formalization of proofs "by diagram chasing", a standard technique for proving properties in abelian categories. We discuss how the essence of diagram chases can be captured by a simple many-sorted first-order theory, and we study the models and decidability of this theory. The longer-term motivation of this work is the design of a computer-aided instrument for writing reliable proofs in homological algebra, based on interactive theorem provers.

It has been shown in previous works that an optimal control formulation for incompressible ideal fluid flow yields Euler's fluid equations. In this paper we consider a modification of Euler's equations obtained by adding a potential function that plays the role of a barrier function in the corresponding optimal control problem, with the motivation of studying obstacle avoidance in the motion of fluid particles for incompressible ideal flows of an inviscid fluid. From the physical point of view, imposing an artificial potential in the fluid context is equivalent to generating a desired pressure. Simulation results for the obstacle avoidance task are provided.
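
Schematically, and as a reading of the abstract rather than the paper's exact formulation, the modified dynamics take the form of the incompressible Euler equations with an extra potential term, $\partial_t u + (u \cdot \nabla)u = -\nabla p - \nabla V_{\mathrm{obs}}$ with $\nabla \cdot u = 0$, where the barrier potential $V_{\mathrm{obs}}$ grows near the obstacle, so that $-\nabla V_{\mathrm{obs}}$ acts like an additional pressure gradient steering fluid particles away from it.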

We consider the lossless compression bound of any single data sequence. If we fit the data by a parametric model, the entropy quantity $nH({\hat \theta}_n)$ obtained by plugging in the maximum likelihood estimate is an underestimate of the bound, where $n$ is the number of words. Shtarkov showed that the normalized maximum likelihood (NML) distribution or code length is optimal in a minimax sense for any parametric family. We show by local asymptotic normality that the NML code length for exponential families is $nH(\hat \theta_n) +\frac{d}{2}\log \, \frac{n}{2\pi} +\log \int_{\Theta} |I(\theta)|^{1/2}\, d\theta+o(1)$, where $d$ is the model dimension or dictionary size and $|I(\theta)|$ is the determinant of the Fisher information matrix. We also demonstrate that sequentially predicting the optimal code length for the next word via a Bayesian mechanism leads to the mixture code, whose pathwise length is $nH({\hat \theta}_n) +\frac{d}{2}\log \, \frac{n}{2\pi} +\log \frac{|\, I({\hat \theta}_n)|^{1/2}}{w({\hat \theta}_n)}+o(1)$, where $w(\theta)$ is a prior. The asymptotics apply not only to discrete symbols but also to continuous data if the code length for the former is replaced by the description length of the latter. The analytical result is exemplified by calculating compression bounds of protein-coding DNA sequences under different parsing models. Typically, the highest compression is achieved when the parsing is in phase with the amino-acid codons. On the other hand, the compression rates of pseudo-random sequences are larger than 1 regardless of the parsing model. These model-based results are consistent with the assertion of Kolmogorov complexity theory that random sequences are incompressible. The empirical lossless compression bound is particularly accurate when the dictionary size is relatively large.
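
To illustrate the leading terms of this expansion, the following Python sketch evaluates the asymptotic bound for an i.i.d. multinomial "dictionary" model, for which $d = k - 1$ and the Jeffreys integral $\int_{\Theta}|I(\theta)|^{1/2}\,d\theta$ equals $\pi^{k/2}/\Gamma(k/2)$. The helper name and toy DNA string are illustrative, lengths are in nats until the final conversion to bits, and the paper's parsing models are more elaborate.

```python
import numpy as np
from collections import Counter
from scipy.special import gammaln

def nml_code_length_multinomial(symbols):
    """Asymptotic NML code length (in nats) for an i.i.d. multinomial model:
    n*H(theta_hat) + (d/2)*log(n/(2*pi)) + log(Jeffreys integral),
    with k taken as the number of distinct symbols actually observed."""
    counts = np.array(list(Counter(symbols).values()), dtype=float)
    n, k = counts.sum(), len(counts)
    p = counts / n
    nH = -np.sum(counts * np.log(p))                        # n * H(theta_hat)
    d = k - 1                                               # free parameters of the multinomial
    complexity = 0.5 * d * np.log(n / (2 * np.pi))          # (d/2) log(n / 2pi)
    jeffreys = 0.5 * k * np.log(np.pi) - gammaln(0.5 * k)   # log(pi^{k/2} / Gamma(k/2))
    return nH + complexity + jeffreys

# Toy usage: codon-level parsing (non-overlapping 3-mers) of a short DNA string.
dna = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGA"
codons = [dna[i:i + 3] for i in range(0, len(dna) - 2, 3)]
bits = nml_code_length_multinomial(codons) / np.log(2)
print(f"asymptotic NML bound: {bits:.1f} bits for {len(codons)} codons")
```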

Can a micron-sized sack of interacting molecules autonomously learn an internal model of a complex and fluctuating environment? We draw insights from control theory, machine learning theory, chemical reaction network theory, and statistical physics to develop a general architecture whereby a broad class of chemical systems can autonomously learn complex distributions. Our construction takes the form of a chemical implementation of machine learning's optimization workhorse: gradient descent on the relative entropy cost function. We show how this method can be applied to optimize any detailed balanced chemical reaction network and that the construction is capable of using hidden units to learn complex distributions. This result is then recast as a form of integral feedback control. Finally, because we use an explicit physical model of learning, we are able to derive the thermodynamic costs and trade-offs associated with this process.
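
The chemical implementation itself does not reduce to a code snippet, but the optimization it realizes, gradient descent on the relative entropy $D(q\,\|\,p_\theta)$, can be sketched in a few lines of Python for a toy model with visible units only; the Gibbs parameterization, state count, and step size below are illustrative assumptions, and the paper's construction additionally handles hidden units and detailed balanced reaction networks.

```python
import numpy as np

def kl(q, p):
    """Relative entropy D(q || p) in nats."""
    return float(np.sum(q * np.log(q / p)))

def gibbs(E):
    """Boltzmann-like distribution p(x) proportional to exp(-E_x)."""
    w = np.exp(-E)
    return w / w.sum()

rng = np.random.default_rng(0)
q = rng.dirichlet(np.ones(6))   # target distribution over 6 states
E = np.zeros(6)                 # adjustable energies (the learned parameters)

eta = 0.5                       # step size
for _ in range(2000):
    p = gibbs(E)
    E -= eta * (q - p)          # since dD(q||p)/dE_x = q(x) - p(x)

print(kl(q, gibbs(E)))          # close to zero after training
```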

A recent development in Bayesian optimization is the use of local optimization strategies, which can deliver strong empirical performance on high-dimensional problems compared to traditional global strategies. The "folk wisdom" in the literature is that the focus on local optimization sidesteps the curse of dimensionality; however, little is known concretely about the expected behavior or convergence of Bayesian local optimization routines. We first study the behavior of the local approach, and find that the statistics of individual local solutions of Gaussian process sample paths are surprisingly good compared to what we would expect to recover from global methods. We then present the first rigorous analysis of such a Bayesian local optimization algorithm recently proposed by M\"uller et al. (2021), and derive convergence rates in both the noisy and noiseless settings.

Large Language Models (LLMs) have shown excellent generalization capabilities that have led to the development of numerous models. These models propose new architectures, tweak existing architectures with refined training strategies, increase context length, use higher-quality training data, and increase training time to outperform baselines. Analyzing new developments is crucial for identifying changes that enhance training stability and improve generalization in LLMs. This survey comprehensively analyses LLM architectures and their categorization, training strategies, training datasets, and performance evaluations, and discusses future research directions. Moreover, the paper covers the basic building blocks and concepts behind LLMs, followed by a complete overview of LLMs, including their important features and functions. Finally, the paper summarizes significant findings from LLM research and consolidates essential architectural and training strategies for developing advanced LLMs. Given the continuous advancements in LLMs, we intend to update this paper regularly by incorporating new sections and featuring the latest LLM models.

Visual Question Answering (VQA) models have so far struggled with counting objects in natural images. We identify the use of soft attention in these models as a fundamental cause. To circumvent this problem, we propose a neural network component that allows robust counting from object proposals. Experiments on a toy task show the effectiveness of this component, and we obtain state-of-the-art accuracy on the number category of the VQA v2 dataset without negatively affecting other categories, even outperforming ensemble models with our single model. On a difficult balanced pair metric, the component improves counting over a strong baseline by 6.6%.
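
The counting component itself operates on object proposals and is described in the paper; the underlying problem with soft attention can be seen in the purely illustrative NumPy toy below, where the attended feature for an image containing one object is identical to that for an image containing two identical objects, so the count is averaged away by the softmax normalization.

```python
import numpy as np

def soft_attend(features, scores):
    """Standard soft attention: softmax-normalized weighted average of the features."""
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ features

cat = np.array([1.0, 0.0, 2.0])        # feature vector of a hypothetical "cat" proposal
one_cat = np.stack([cat])              # image with one cat proposal
two_cats = np.stack([cat, cat])        # image with two identical cat proposals

# Both calls return the same attended vector, so a downstream classifier
# cannot tell "one" from "two" based on this representation alone.
print(soft_attend(one_cat, np.array([5.0])))
print(soft_attend(two_cats, np.array([5.0, 5.0])))
```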

While it is nearly effortless for humans to quickly assess the perceptual similarity between two images, the underlying processes are thought to be quite complex. Despite this, the most widely used perceptual metrics today, such as PSNR and SSIM, are simple, shallow functions that fail to account for many nuances of human perception. Recently, the deep learning community has found that features of the VGG network trained on the ImageNet classification task have been remarkably useful as a training loss for image synthesis. But how perceptual are these so-called "perceptual losses"? What elements are critical for their success? To answer these questions, we introduce a new Full Reference Image Quality Assessment (FR-IQA) dataset of perceptual human judgments, orders of magnitude larger than previous datasets. We systematically evaluate deep features across different architectures and tasks and compare them with classic metrics. We find that deep features outperform all previous metrics by large margins. More surprisingly, this result is not restricted to ImageNet-trained VGG features, but holds across different deep architectures and levels of supervision (supervised, self-supervised, or even unsupervised). Our results suggest that perceptual similarity is an emergent property shared across deep visual representations.
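
The learned metric and the dataset are described in the paper, and the authors released a reference implementation (the lpips Python package). As a rough sketch of the general recipe, comparing unit-normalized deep features layer by layer, one might write something like the following; the layer choice and the uniform weighting are assumptions, whereas the paper learns per-channel weights from its human-judgment data.

```python
import torch
import torch.nn.functional as F
from torchvision import models

# ImageNet-pretrained VGG16 feature extractor.
vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).features.eval()
LAYERS = {3, 8, 15, 22, 29}   # relu1_2, relu2_2, relu3_3, relu4_3, relu5_3

def deep_feature_distance(x, y):
    """x, y: (N, 3, H, W) image tensors (ideally ImageNet-normalized).
    Sums squared differences of unit-normalized features over selected layers."""
    d = torch.zeros(x.shape[0])
    for i, layer in enumerate(vgg):
        x, y = layer(x), layer(y)
        if i in LAYERS:
            xn, yn = F.normalize(x, dim=1), F.normalize(y, dim=1)
            d = d + ((xn - yn) ** 2).sum(dim=1).mean(dim=(1, 2))
        if i == max(LAYERS):
            break
    return d

with torch.no_grad():
    a, b = torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64)  # toy inputs
    print(deep_feature_distance(a, b))
```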
