
Spectral approximation and variational inducing learning for the Gaussian process are two popular methods for reducing computational complexity. However, previous research on these methods has tended to adopt orthonormal basis functions, such as eigenvectors in the Hilbert space for the spectral method, or decoupled orthogonal components in the variational framework. In this paper, inspired by quantum physics, we introduce a novel family of basis functions, which are tunable, local, and bounded, to approximate the kernel function of the Gaussian process. These functions have two adjustable parameters, which control their mutual orthogonality and their bounds. We conduct extensive experiments on open-source datasets to evaluate the method's performance. Compared with several state-of-the-art methods, the proposed method obtains comparable or even better results, especially with poorly chosen kernel functions.
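For concreteness, here is a minimal sketch of the generic degenerate-kernel idea underlying such approximations, $k(x, x') \approx \phi(x)^\top \phi(x')$. A hypothetical local, bounded basis (Gaussian bumps with a width parameter governing mutual overlap and a height parameter bounding amplitude) stands in for the paper's quantum-inspired basis, which the abstract does not specify:

```python
import numpy as np

def local_basis(x, centers, width, height):
    """Hypothetical tunable, local, bounded basis functions: Gaussian bumps
    whose `width` controls mutual (near-)orthogonality and whose `height`
    bounds their amplitude. A stand-in for the paper's quantum-inspired
    basis, which is not specified in the abstract."""
    # x: (n,) inputs; centers: (m,) basis-function centers
    return height * np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2 * width ** 2))

def approx_kernel(x1, x2, centers, width=0.5, height=1.0):
    """Degenerate kernel approximation k(x, x') ~ phi(x) @ phi(x')."""
    return local_basis(x1, centers, width, height) @ local_basis(x2, centers, width, height).T

# A rank-m approximation reduces GP training cost from O(n^3) to O(n m^2).
x = np.linspace(0.0, 1.0, 200)
K_approx = approx_kernel(x, x, centers=np.linspace(0.0, 1.0, 20))
```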

Related content

Processing is the name of an open-source programming language and its accompanying integrated development environment (IDE). Processing is used in the electronic art and visual design communities to teach programming fundamentals, and it has been employed in a large number of new-media and interactive art works.

In many applications, we have access to the complete dataset but are only interested in prediction over a particular region of the predictor variables. A standard approach is to find the globally best modeling method from a set of candidate methods. However, it is perhaps rare in reality that one candidate method is uniformly better than the others. A natural approach for this scenario is to apply a weighted $L_2$ loss in performance assessment to reflect the region-specific interest. We propose a targeted cross-validation (TCV) to select models or procedures based on a general weighted $L_2$ loss. We show that the TCV is consistent in selecting the best-performing candidate under the weighted $L_2$ loss. Experimental studies demonstrate the use of TCV and its potential advantage over the global CV or the approach of using only local data to model a local region. Previous investigations on CV have relied on the condition that, when the sample size is large enough, the ranking of two candidates stays the same. However, in many applications with changing data-generating processes or highly adaptive modeling methods, the relative performance of the methods is not static as the sample size varies. Even with a fixed data-generating process, it is possible that the ranking of two methods switches infinitely many times. In this work, we broaden the concept of selection consistency by allowing the best candidate to switch as the sample size varies, and then establish the consistency of the TCV. This flexible framework can be applied to high-dimensional and complex machine learning scenarios where the relative performance of modeling procedures is dynamic.
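A minimal sketch of the core idea, assuming a scikit-learn-style estimator and a user-supplied nonnegative weight function; the function name `targeted_cv_score` and the K-fold setup are illustrative, not the authors' implementation:

```python
import numpy as np
from sklearn.model_selection import KFold

def targeted_cv_score(model, X, y, weight_fn, n_splits=5, seed=0):
    """Cross-validation under a weighted L2 loss: held-out errors are
    weighted by weight_fn(x), which encodes the region of predictor space
    of interest. A sketch of the idea behind TCV, not the exact procedure."""
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    num, den = 0.0, 0.0
    for train, test in kf.split(X):
        fitted = model.fit(X[train], y[train])
        w = weight_fn(X[test])
        num += np.sum(w * (y[test] - fitted.predict(X[test])) ** 2)
        den += np.sum(w)
    return num / den  # weighted mean squared prediction error

# Example weight function: emphasize predictions for x1 in (0.8, 1.0).
# weight_fn = lambda X: ((X[:, 0] > 0.8) & (X[:, 0] < 1.0)).astype(float)
```

Candidate models can then be ranked by this score, mirroring how TCV selects the best performer under the weighted loss.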

Sleeve functions are generalizations of the well-established ridge functions, which play a major role in the theory of partial differential equations, medical imaging, statistics, and neural networks. Whereas ridge functions are non-linear, univariate functions of the distance to hyperplanes, sleeve functions are based on the squared distance to lower-dimensional manifolds. The present work is a first step towards studying general sleeve functions, starting with sleeve functions based on finite-length curves. To capture these curve-based sleeve functions, we propose and study a two-step method, in which first the outer univariate function (the profile) is recovered, and second the underlying curve is represented by a polygonal chain. Introducing a concept of well-separation, we ensure that the proposed method always terminates and approximates the true sleeve function with a certain quality. Investigating the local geometry, we study an inexact version of our method and show its success under certain conditions.
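A small illustration of the objects involved, under the assumption that the curve is already given as a polygonal chain; the helper names are hypothetical, and the code only evaluates a sleeve function, it does not implement the paper's recovery method:

```python
import numpy as np

def dist2_to_segment(x, a, b):
    """Squared distance from point x to the segment [a, b]."""
    ab, ax = b - a, x - a
    t = np.clip((ab @ ax) / (ab @ ab), 0.0, 1.0)
    return np.sum((x - (a + t * ab)) ** 2)

def sleeve_function(x, chain, profile):
    """Sleeve function f(x) = g(dist(x, curve)^2), with the curve given as
    a polygonal chain (as in the method's second step) and the outer
    univariate `profile` g (the first step's target)."""
    d2 = min(dist2_to_segment(x, chain[i], chain[i + 1])
             for i in range(len(chain) - 1))
    return profile(d2)

# Toy example: a bump concentrated around an L-shaped chain.
chain = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]])
value = sleeve_function(np.array([0.5, 0.3]), chain, profile=lambda s: np.exp(-s))
```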

In this work we introduce a concept of complexity for undirected graphs in terms of the spectral analysis of the Laplacian operator defined by the incidence matrix of the graph. Precisely, we compute the norm of the vector of eigenvalues of both the graph and its complement and take their product. Doing so, we obtain a quantity that satisfies two basic properties expected of a measure of complexity. First, the complexity of fully connected and fully disconnected graphs vanishes. Second, the complexities of complementary graphs coincide. This notion of complexity allows us to distinguish different kinds of graphs by placing them in a "croissant-shaped" region of the link density-complexity plane, highlighting features such as connectivity, concentration, uniformity or regularity, and the existence of clique-like clusters. Indeed, considering graphs with a fixed number of nodes and plotting link density versus complexity, we find that graphs generated by different methods occupy different regions of the plane. We consider several generated graphs, in particular the Erd\"os-R\'enyi, Watts-Strogatz, and Barab\'asi-Albert models. We also place some particular deterministic graphs, namely lattices, stars, hyper-concentrated graphs, and graphs containing cliques. It is worth noticing that these deterministic classical models of graphs delineate the boundary of the croissant-shaped region. Finally, as an application to graphs generated by real measurements, we consider the brain connectivity graphs of two epileptic patients obtained from magnetoencephalography (MEG) recordings, both in a baseline period and in ictal periods. In this case, our definition of complexity could be used as a tool for discerning between states by analyzing differences at distinct frequencies of the MEG recordings.
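Under one direct reading of this definition (Euclidean norm of the Laplacian spectrum of the graph times that of its complement), the measure can be computed in a few lines. Both stated properties then follow immediately: a complete graph has an empty complement with zero spectrum, so one factor vanishes, and the product is symmetric under complementation:

```python
import numpy as np
import networkx as nx

def spectral_complexity(G):
    """Product of the Euclidean norms of the Laplacian eigenvalue vectors
    of the graph and of its complement, per the definition above."""
    lam_g = np.linalg.eigvalsh(nx.laplacian_matrix(G).toarray().astype(float))
    lam_c = np.linalg.eigvalsh(
        nx.laplacian_matrix(nx.complement(G)).toarray().astype(float))
    return np.linalg.norm(lam_g) * np.linalg.norm(lam_c)

# One point of the link density vs. complexity plane for a random graph.
G = nx.erdos_renyi_graph(50, 0.1, seed=0)
print(nx.density(G), spectral_complexity(G))
```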

We consider increasingly complex models of matrix denoising and dictionary learning in the Bayes-optimal setting, in the challenging regime where the matrices to infer have a rank growing linearly with the system size. This is in contrast with most existing literature, which is concerned with the low-rank (i.e., constant-rank) regime. We first consider a class of rotationally invariant matrix denoising problems whose mutual information and minimum mean-square error are computable using standard techniques from random matrix theory. Next, we analyze the more challenging models of dictionary learning. To do so we introduce a novel combination of the replica method from statistical mechanics with random matrix theory, coined the spectral replica method. It allows us to conjecture variational formulas for the mutual information between hidden representations and the noisy data, as well as for the overlaps quantifying the optimal reconstruction error. The proposed method reduces the number of degrees of freedom from $\Theta(N^2)$ (matrix entries) to $\Theta(N)$ (eigenvalues or singular values), and yields Coulomb gas representations of the mutual information that are reminiscent of matrix models in physics. The main ingredients are Harish-Chandra-Itzykson-Zuber spherical integrals combined with a new replica-symmetric decoupling ansatz at the level of the probability distributions of eigenvalues (or singular values) of certain overlap matrices.
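As a loose illustration of why the degrees of freedom reduce from matrix entries to the spectrum: in the rotationally invariant setting an estimator can keep the observed eigenvectors and act on the eigenvalues alone. The sketch below uses a naive linear shrinkage as a placeholder, not the Bayes-optimal map analyzed in the paper:

```python
import numpy as np

def eigenvalue_shrinkage_denoiser(Y, shrink):
    """Toy rotationally invariant denoiser for symmetric Y = S + Z: keep
    the eigenvectors of Y and apply a scalar map to its eigenvalues. This
    illustrates the Theta(N) effective degrees of freedom (the spectrum)
    versus the Theta(N^2) entries; `shrink` is a placeholder map."""
    lam, U = np.linalg.eigh(Y)
    return (U * shrink(lam)) @ U.T  # U diag(shrink(lam)) U^T

# Example with naive linear shrinkage toward zero.
rng = np.random.default_rng(0)
N = 100
S = rng.standard_normal((N, N)); S = (S + S.T) / np.sqrt(2 * N)
Y = S + 0.5 * rng.standard_normal((N, N)); Y = (Y + Y.T) / 2
S_hat = eigenvalue_shrinkage_denoiser(Y, shrink=lambda l: 0.8 * l)
```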

This paper examines the problem of testing whether a discrete time-series vector contains a periodic signal or is merely noise. To do this we examine the stochastic behaviour of the maximum intensity of the observed time-series vector and formulate a simple hypothesis test that rejects the null hypothesis of exchangeability if the maximum intensity spike in the Fourier domain is "too big" relative to its null distribution. This comparison is undertaken by simulating the null distribution of the maximum intensity using random permutations of the time-series vector. We show that this test has a p-value that is uniformly distributed for an exchangeable time-series vector, and that the p-value becomes stochastically smaller when a periodic signal is present in the observed vector. We compare our test to Fisher's spectrum test, which assumes normality of the underlying noise terms. We show that our test is more robust and accommodates noise vectors with fat tails.
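A minimal sketch of such a permutation test, with the periodogram computed via the FFT; the +1 correction is a standard way to keep a permutation p-value valid under the null, though the authors' exact construction may differ:

```python
import numpy as np

def max_intensity_pvalue(x, n_perm=999, seed=0):
    """Permutation test for a periodic signal: compare the maximum
    periodogram intensity of x against its null distribution, simulated
    by randomly permuting the series (permutation preserves
    exchangeability but destroys periodic structure)."""
    rng = np.random.default_rng(seed)
    def max_intensity(v):
        return np.max(np.abs(np.fft.rfft(v - v.mean())[1:]) ** 2)
    obs = max_intensity(x)
    null = np.array([max_intensity(rng.permutation(x)) for _ in range(n_perm)])
    return (1 + np.sum(null >= obs)) / (n_perm + 1)

# Example: a weak sinusoid in noise should yield a small p-value.
t = np.arange(256)
x = 0.7 * np.sin(2 * np.pi * t / 16) + np.random.default_rng(1).standard_normal(256)
print(max_intensity_pvalue(x))
```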

Value iteration is a fixed-point iteration technique used to obtain the optimal value function and policy in a discounted-reward Markov Decision Process (MDP). Here, a contraction operator is constructed and applied repeatedly to arrive at the optimal solution. Value iteration is a first-order method and may therefore take a large number of iterations to converge. Successive relaxation is a popular technique for solving fixed-point equations, and it has been shown in the literature that, under a special structure of the MDP, the successive over-relaxation technique computes the optimal value function faster than standard value iteration. In this work, we propose a second-order value iteration procedure obtained by applying the Newton-Raphson method to the successive relaxation value iteration scheme. We prove the asymptotic global convergence of our algorithm to the optimal solution and establish its second-order convergence. Through experiments, we demonstrate the effectiveness of the proposed approach.
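The two ingredients can be sketched as follows, assuming a tabular MDP given as a transition tensor P of shape (A, S, S) and rewards r of shape (A, S). This illustrates successive-relaxation value iteration plus a Newton-Raphson step on the Bellman equation, not the authors' exact combined algorithm:

```python
import numpy as np

def sor_value_iteration(P, r, gamma, w=1.2, tol=1e-8, max_iter=10_000):
    """Successive-relaxation value iteration: blend the Bellman update
    T(V) with the current iterate using relaxation factor w. Setting
    w = 1 recovers standard value iteration; a suitable w > 1
    (over-relaxation) can speed convergence under structural conditions."""
    V = np.zeros(P.shape[1])
    for _ in range(max_iter):
        T_V = np.max(r + gamma * (P @ V), axis=0)   # Bellman operator
        V_new = (1 - w) * V + w * T_V               # relaxation step
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new
    return V

def newton_step(P, r, gamma, V):
    """One Newton-Raphson step on the Bellman equation: evaluate the
    greedy policy exactly by solving a linear system. A sketch of the
    second-order ingredient only; the paper applies Newton-Raphson to
    the successive-relaxation scheme itself."""
    pi = np.argmax(r + gamma * (P @ V), axis=0)     # greedy policy
    S = P.shape[1]
    P_pi, r_pi = P[pi, np.arange(S), :], r[pi, np.arange(S)]
    return np.linalg.solve(np.eye(S) - gamma * P_pi, r_pi)
```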

Multitask Gaussian processes (MTGP) are the Gaussian process (GP) framework's solution for multioutput regression problems in which the $T$ elements of the regressors cannot be considered conditionally independent given the observations. Standard MTGP models assume both a multitask covariance matrix, defined as a function of an intertask matrix, and a noise covariance matrix. These matrices need to be approximated by a low-rank simplification of order $P$ in order to reduce the number of parameters to be learnt from $T^2$ to $TP$. Here we introduce a novel approach that simplifies multitask learning by reducing it to a set of conditioned univariate GPs without any low-rank approximations, thereby completely eliminating the need to select an adequate value for the hyperparameter $P$. At the same time, by extending this approach with both a hierarchical and an approximate model, the proposed extensions can recover the multitask covariance and noise matrices after learning only $2T$ parameters, avoiding the validation of any model hyperparameter and reducing the overall complexity of the model as well as the risk of overfitting. Experimental results on synthetic and real problems confirm the advantages of this inference approach in its ability to accurately recover the original noise and signal matrices, as well as the performance improvement achieved in comparison with other state-of-the-art MTGP approaches. We have also integrated the model with standard GP toolboxes, showing that it is computationally competitive with state-of-the-art options.
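For reference, a sketch of the standard low-rank parameterization that the paper avoids; the names W, kappa, and the function signatures are illustrative. The point is that W has only $TP$ entries in place of the $T^2$ entries of a full intertask matrix:

```python
import numpy as np

def intertask_covariance(W, kappa):
    """Standard low-rank MTGP parameterization: the T x T intertask matrix
    B = W W^T + diag(kappa), with W of shape (T, P), so only T*P (+ T)
    parameters are learnt instead of T^2. The paper's contribution is to
    avoid choosing the rank P altogether."""
    return W @ W.T + np.diag(kappa)

def multitask_kernel(K_input, W, kappa):
    """Separable multitask covariance: Kron(B, K) over tasks and inputs."""
    return np.kron(intertask_covariance(W, kappa), K_input)

# Toy usage: T = 4 tasks, rank P = 2, n = 3 inputs with an RBF kernel.
rng = np.random.default_rng(0)
K_input = np.exp(-0.5 * np.subtract.outer([0.0, 1.0, 2.0], [0.0, 1.0, 2.0]) ** 2)
K_mt = multitask_kernel(K_input, W=rng.standard_normal((4, 2)), kappa=np.full(4, 0.1))
```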

Attackers can access sensitive information of programs by exploiting the side effects of speculatively executed instructions using Spectre attacks. To mitigate these attacks, popular compilers have deployed a wide range of countermeasures. The security of these countermeasures, however, has not been ascertained: while some of them are believed to be secure, others are known to be insecure and result in vulnerable programs. To reason about the security guarantees of these compiler-inserted countermeasures, this paper presents a framework comprising several secure compilation criteria characterizing when compilers produce code resistant against Spectre attacks. With this framework, we perform a comprehensive security analysis of compiler-level countermeasures against Spectre attacks implemented in major compilers. This work provides sound foundations to formally reason about the security of compiler-level countermeasures against Spectre attacks, as well as the first proofs of security and insecurity of said countermeasures.

Random walk based node embedding algorithms learn vector representations of nodes by optimizing an objective function of node embedding vectors and skip-bigram statistics computed from random walks on the network. They have been applied to many supervised learning problems such as link prediction and node classification and have demonstrated state-of-the-art performance. Yet, their properties remain poorly understood. This paper studies properties of random walk based node embeddings in the unsupervised setting of discovering hidden block structure in the network, i.e., learning node representations whose cluster structure in Euclidean space reflects their adjacency structure within the network. We characterize the ergodic limits of the embedding objective, its generalization, and related convex relaxations to derive corresponding non-randomized versions of the node embedding objectives. We also characterize the optimal node embedding Grammians of the non-randomized objectives for the expected graph of a two-community Stochastic Block Model (SBM). We prove that the solution Grammian has rank $1$ for a suitable nuclear norm relaxation of the non-randomized objective. Comprehensive experimental results on SBM random networks reveal that our non-randomized ergodic objectives yield node embeddings whose distribution is Gaussian-like, centered at the node embeddings of the expected network within each community, and concentrate in the linear degree-scaling regime as the number of nodes increases.
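A minimal sketch of the skip-bigram statistics these objectives are built on, assuming a dense adjacency matrix and uniform random walks (DeepWalk-style; the function name and defaults are illustrative):

```python
import numpy as np

def skip_bigram_counts(adj, num_walks=10, walk_len=40, window=5, seed=0):
    """Co-occurrence (skip-bigram) statistics from random walks on an
    undirected graph, the raw material of node-embedding objectives such
    as DeepWalk/node2vec. adj: dense adjacency matrix."""
    rng = np.random.default_rng(seed)
    n = adj.shape[0]
    C = np.zeros((n, n))
    for start in range(n):
        for _ in range(num_walks):
            walk, v = [start], start
            for _ in range(walk_len - 1):
                nbrs = np.flatnonzero(adj[v])
                if nbrs.size == 0:
                    break
                v = rng.choice(nbrs)
                walk.append(v)
            for i, u in enumerate(walk):  # count pairs within the window
                for w_ in walk[i + 1 : i + 1 + window]:
                    C[u, w_] += 1
                    C[w_, u] += 1
    return C  # its ergodic limit underlies the non-randomized objectives
```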

Spectral clustering is a leading and popular technique in unsupervised data analysis. Two of its major limitations are scalability and generalization of the spectral embedding (i.e., out-of-sample extension). In this paper we introduce a deep learning approach to spectral clustering that overcomes the above shortcomings. Our network, which we call SpectralNet, learns a map that embeds input data points into the eigenspace of their associated graph Laplacian matrix and subsequently clusters them. We train SpectralNet using a procedure that involves constrained stochastic optimization. Stochastic optimization allows it to scale to large datasets, while the constraints, which are implemented using a special-purpose output layer, keep the network output orthogonal. Moreover, the map learned by SpectralNet naturally generalizes the spectral embedding to unseen data points. To further improve the quality of the clustering, we replace the standard pairwise Gaussian affinities with affinities learned from unlabeled data using a Siamese network. Additional improvement can be achieved by applying the network to code representations produced, e.g., by standard autoencoders. Our end-to-end learning procedure is fully unsupervised. In addition, we apply VC dimension theory to derive a lower bound on the size of SpectralNet. State-of-the-art clustering results are reported on the Reuters dataset. Our implementation is publicly available at https://github.com/kstant0725/SpectralNet.
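For orientation, here is the classical spectral clustering pipeline that SpectralNet emulates and scales; this sketch uses a plain eigendecomposition and k-means, not the SpectralNet architecture itself:

```python
import numpy as np
from sklearn.cluster import KMeans

def spectral_clustering(X, k, sigma=1.0):
    """Classical pipeline: Gaussian affinities, graph Laplacian, embedding
    into the bottom-k eigenspace, then k-means. SpectralNet replaces the
    eigendecomposition with a trained network (for scalability) and
    extends the embedding to unseen points."""
    d2 = np.sum((X[:, None] - X[None, :]) ** 2, axis=-1)
    W = np.exp(-d2 / (2 * sigma ** 2))          # pairwise affinities
    L = np.diag(W.sum(axis=1)) - W              # unnormalized Laplacian
    _, U = np.linalg.eigh(L)
    emb = U[:, :k]                              # bottom-k eigenvectors
    return KMeans(n_clusters=k, n_init=10).fit_predict(emb), emb
```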
