亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

We present a comparison between various algorithms of inference of covariance and precision matrices in small datasets of real vectors, of the typical length and dimension of human brain activity time series retrieved by functional Magnetic Resonance Imaging (fMRI). Assuming a Gaussian model underlying the neural activity, the problem consists in denoising the empirically observed matrices in order to obtain a better estimator of the true precision and covariance matrices. We consider several standard noise-cleaning algorithms and compare them on two types of datasets. The first type are time series of fMRI brain activity of human subjects at rest. The second type are synthetic time series sampled from a generative Gaussian model of which we can vary the fraction of dimensions per sample q = N/T and the strength of off-diagonal correlations. The reliability of each algorithm is assessed in terms of test-set likelihood and, in the case of synthetic data, of the distance from the true precision matrix. We observe that the so called Optimal Rotationally Invariant Estimator, based on Random Matrix Theory, leads to a significantly lower distance from the true precision matrix in synthetic data, and higher test likelihood in natural fMRI data. We propose a variant of the Optimal Rotationally Invariant Estimator in which one of its parameters is optimised by cross-validation. In the severe undersampling regime (large q) typical of fMRI series, it outperforms all the other estimators. We furthermore propose a simple algorithm based on an iterative likelihood gradient ascent, providing an accurate estimation for weakly correlated datasets.

相關內容

Learning to optimize (L2O) has gained increasing popularity, which automates the design of optimizers by data-driven approaches. However, current L2O methods often suffer from poor generalization performance in at least two folds: (i) applying the L2O-learned optimizer to unseen optimizees, in terms of lowering their loss function values (optimizer generalization, or ``generalizable learning of optimizers"); and (ii) the test performance of an optimizee (itself as a machine learning model), trained by the optimizer, in terms of the accuracy over unseen data (optimizee generalization, or ``learning to generalize"). While the optimizer generalization has been recently studied, the optimizee generalization (or learning to generalize) has not been rigorously studied in the L2O context, which is the aim of this paper. We first theoretically establish an implicit connection between the local entropy and the Hessian, and hence unify their roles in the handcrafted design of generalizable optimizers as equivalent metrics of the landscape flatness of loss functions. We then propose to incorporate these two metrics as flatness-aware regularizers into the L2O framework in order to meta-train optimizers to learn to generalize, and theoretically show that such generalization ability can be learned during the L2O meta-training process and then transformed to the optimizee loss function. Extensive experiments consistently validate the effectiveness of our proposals with substantially improved generalization on multiple sophisticated L2O models and diverse optimizees. Our code is available at: //github.com/VITA-Group/Open-L2O/tree/main/Model_Free_L2O/L2O-Entropy.

We propose a text-to-image generation algorithm based on deep neural networks when text captions for images are unavailable during training. In this work, instead of simply generating pseudo-ground-truth sentences of training images using existing image captioning methods, we employ a pretrained CLIP model, which is capable of properly aligning embeddings of images and corresponding texts in a joint space and, consequently, works well on zero-shot recognition tasks. We optimize a text-to-image generation model by maximizing the data log-likelihood conditioned on pairs of image-text CLIP embeddings. To better align data in the two domains, we employ a principled way based on a variational inference, which efficiently estimates an approximate posterior of the hidden text embedding given an image and its CLIP feature. Experimental results validate that the proposed framework outperforms existing approaches by large margins under unsupervised and semi-supervised text-to-image generation settings.

Here, we investigate whether (and how) experimental design could aid in the estimation of the precision matrix in a Gaussian chain graph model, especially the interplay between the design, the effect of the experiment and prior knowledge about the effect. Estimation of the precision matrix is a fundamental task to infer biological graphical structures like microbial networks. We compare the marginal posterior precision of the precision matrix under four priors: flat, conjugate Normal-Wishart, Normal-MGIG and a general independent. Under the flat and conjugate priors, the Laplace-approximated posterior precision is not a function of the design matrix rendering useless any efforts to find an optimal experimental design to infer the precision matrix. In contrast, the Normal-MGIG and general independent priors do allow for the search of optimal experimental designs, yet there is a sharp upper bound on the information that can be extracted from a given experiment. We confirm our theoretical findings via a simulation study comparing i) the KL divergence between prior and posterior and ii) the Stein's loss difference of MAPs between random and no experiment. Our findings provide practical advice for domain scientists conducting experiments to better infer the precision matrix as a representation of a biological network.

In this work, we study the concentration behavior of a stochastic approximation (SA) algorithm under a contractive operator with respect to an arbitrary norm. We consider two settings where the iterates are potentially unbounded: (1) bounded multiplicative noise, and (2) additive sub-Gaussian noise. We obtain maximal concentration inequalities on the convergence errors, and show that these errors have sub-Gaussian tails in the additive noise setting, and super-polynomial tails (faster than polynomial decay) in the multiplicative noise setting. In addition, we provide an impossibility result showing that it is in general not possible to achieve sub-exponential tails for SA with multiplicative noise. To establish these results, we develop a novel bootstrapping argument that involves bounding the moment generating function of the generalized Moreau envelope of the error and the construction of an exponential supermartingale to enable using Ville's maximal inequality. To demonstrate the applicability of our theoretical results, we use them to provide maximal concentration bounds for a large class of reinforcement learning algorithms, including but not limited to on-policy TD-learning with linear function approximation, off-policy TD-learning with generalized importance sampling factors, and $Q$-learning. To the best of our knowledge, super-polynomial concentration bounds for off-policy TD-learning have not been established in the literature due to the challenge of handling the combination of unbounded iterates and multiplicative noise.

We continue the study of the performance for fixed-price mechanisms in the bilateral trade problem, and improve approximation ratios of welfare-optimal mechanisms in several settings. Specifically, in the case where only the buyer distribution is known, we prove that there exists a distribution over different fixed-price mechanisms, such that the approximation ratio lies within the interval of [0.71, 0.7381]. Furthermore, we show that the same approximation ratio holds for the optimal fixed-price mechanism, when both buyer and seller distributions are known. As a result, the previously best-known (1 - 1/e+0.0001)-approximation can be improved to $0.71$. Additionally, we examine randomized fixed-price mechanisms when we receive just one single sample from the seller distribution, for both symmetric and asymmetric settings. Our findings reveal that posting the single sample as the price remains optimal among all randomized fixed-price mechanisms.

Unsplittable flow problems cover a wide range of telecommunication and transportation problems and their efficient resolution is key to a number of applications. In this work, we study algorithms that can scale up to large graphs and important numbers of commodities. We present and analyze in detail a heuristic based on the linear relaxation of the problem and randomized rounding. We provide empirical evidence that this approach is competitive with state-of-the-art resolution methods either by its scaling performance or by the quality of its solutions. We provide a variation of the heuristic which has the same approximation factor as the state-of-the-art approximation algorithm. We also derive a tighter analysis for the approximation factor of both the variation and the state-of-the-art algorithm. We introduce a new objective function for the unsplittable flow problem and discuss its differences with the classical congestion objective function. Finally, we discuss the gap in practical performance and theoretical guarantees between all the aforementioned algorithms.

The eigenvalue method, suggested by the developer of the extensively used Analytic Hierarchy Process methodology, exhibits right-left asymmetry: the priorities derived from the right eigenvector do not necessarily coincide with the priorities derived from the reciprocal left eigenvector. This paper offers a comprehensive numerical experiment to compare the two eigenvector-based weighting procedures and their reasonable alternative of the row geometric mean with respect to four measures. The underlying pairwise comparison matrices are constructed randomly with different dimensions and levels of inconsistency. The disagreement between the two eigenvectors turns out to be not always a monotonic function of these important characteristics of the matrix. The ranking contradictions can affect alternatives with relatively distant priorities. The row geometric mean is found to be almost at the midpoint between the right and inverse left eigenvectors, making it a straightforward compromise between them.

Machine learning has emerged as a powerful tool for time series analysis. Existing methods are usually customized for different analysis tasks and face challenges in tackling practical problems such as partial labeling and domain shift. To achieve universal analysis and address the aforementioned problems, we develop UniTS, a novel framework that incorporates self-supervised representation learning (or pre-training). The components of UniTS are designed using sklearn-like APIs to allow flexible extensions. We demonstrate how users can easily perform an analysis task using the user-friendly GUIs, and show the superior performance of UniTS over the traditional task-specific methods without self-supervised pre-training on five mainstream tasks and two practical settings.

In this paper, we consider a new approach for semi-discretization in time and spatial discretization of a class of semi-linear stochastic partial differential equations (SPDEs) with multiplicative noise. The drift term of the SPDEs is only assumed to satisfy a one-sided Lipschitz condition and the diffusion term is assumed to be globally Lipschitz continuous. Our new strategy for time discretization is based on the Milstein method from stochastic differential equations. We use the energy method for its error analysis and show a strong convergence order of nearly $1$ for the approximate solution. The proof is based on new H\"older continuity estimates of the SPDE solution and the nonlinear term. For the general polynomial-type drift term, there are difficulties in deriving even the stability of the numerical solutions. We propose an interpolation-based finite element method for spatial discretization to overcome the difficulties. Then we obtain $H^1$ stability, higher moment $H^1$ stability, $L^2$ stability, and higher moment $L^2$ stability results using numerical and stochastic techniques. The nearly optimal convergence orders in time and space are hence obtained by coupling all previous results. Numerical experiments are presented to implement the proposed numerical scheme and to validate the theoretical results.

With the rapid increase of large-scale, real-world datasets, it becomes critical to address the problem of long-tailed data distribution (i.e., a few classes account for most of the data, while most classes are under-represented). Existing solutions typically adopt class re-balancing strategies such as re-sampling and re-weighting based on the number of observations for each class. In this work, we argue that as the number of samples increases, the additional benefit of a newly added data point will diminish. We introduce a novel theoretical framework to measure data overlap by associating with each sample a small neighboring region rather than a single point. The effective number of samples is defined as the volume of samples and can be calculated by a simple formula $(1-\beta^{n})/(1-\beta)$, where $n$ is the number of samples and $\beta \in [0,1)$ is a hyperparameter. We design a re-weighting scheme that uses the effective number of samples for each class to re-balance the loss, thereby yielding a class-balanced loss. Comprehensive experiments are conducted on artificially induced long-tailed CIFAR datasets and large-scale datasets including ImageNet and iNaturalist. Our results show that when trained with the proposed class-balanced loss, the network is able to achieve significant performance gains on long-tailed datasets.

北京阿比特科技有限公司