亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

Kernel matrices, as well as weighted graphs represented by them, are ubiquitous objects in machine learning, statistics and other related fields. The main drawback of using kernel methods (learning and inference using kernel matrices) is efficiency -- given $n$ input points, most kernel-based algorithms need to materialize the full $n \times n$ kernel matrix before performing any subsequent computation, thus incurring $\Omega(n^2)$ runtime. Breaking this quadratic barrier for various problems has therefore, been a subject of extensive research efforts. We break the quadratic barrier and obtain $\textit{subquadratic}$ time algorithms for several fundamental linear-algebraic and graph processing primitives, including approximating the top eigenvalue and eigenvector, spectral sparsification, solving linear systems, local clustering, low-rank approximation, arboricity estimation and counting weighted triangles. We build on the recent Kernel Density Estimation framework, which (after preprocessing in time subquadratic in $n$) can return estimates of row/column sums of the kernel matrix. In particular, we develop efficient reductions from $\textit{weighted vertex}$ and $\textit{weighted edge sampling}$ on kernel graphs, $\textit{simulating random walks}$ on kernel graphs, and $\textit{importance sampling}$ on matrices to Kernel Density Estimation and show that we can generate samples from these distributions in $\textit{sublinear}$ (in the support of the distribution) time. Our reductions are the central ingredient in each of our applications and we believe they may be of independent interest. We empirically demonstrate the efficacy of our algorithms on low-rank approximation (LRA) and spectral sparsification, where we observe a $\textbf{9x}$ decrease in the number of kernel evaluations over baselines for LRA and a $\textbf{41x}$ reduction in the graph size for spectral sparsification.

相關內容

Multi-robot cooperative control has gained extensive research interest due to its wide applications in civil, security, and military domains. This paper proposes a cooperative control algorithm for multi-robot systems with general linear dynamics. The algorithm is based on distributed cooperative optimisation and output regulation, and it achieves global optimum by utilising only information shared among neighbouring robots. Technically, a high-level distributed optimisation algorithm for multi-robot systems is presented, which will serve as an optimal reference generator for each individual agent. Then, based on the distributed optimisation algorithm, an output regulation method is utilised to solve the optimal coordination problem for general linear dynamic systems. The convergence of the proposed algorithm is theoretically proved. Both numerical simulations and real-time physical robot experiments are conducted to validate the effectiveness of the proposed cooperative control algorithms.

Causal inference on populations embedded in social networks poses technical challenges, since the typical no interference assumption may no longer hold. For instance, in the context of social research, the outcome of a study unit will likely be affected by an intervention or treatment received by close neighbors. While inverse probability-of-treatment weighted (IPW) estimators have been developed for this setting, they are often highly inefficient. In this work, we assume that the network is a union of disjoint components and propose doubly robust (DR) estimators combining models for treatment and outcome that are consistent and asymptotically normal if either model is correctly specified. We present empirical results that illustrate the DR property and the efficiency gain of DR over IPW estimators when both the outcome and treatment models are correctly specified. Simulations are conducted for networks with equal and unequal component sizes and outcome data with and without a multilevel structure. We apply these methods in an illustrative analysis using the Add Health network, examining the impact of maternal college education on adolescent school performance, both direct and indirect.

We revisit the Heaviest Induced Ancestors (HIA) problem that was introduced by Gagie, Gawrychowski, and Nekrich [CCCG 2013] and has a number of applications in string algorithms. Let $T_1$ and $T_2$ be two rooted trees whose nodes have weights that are increasing in all root-to-leaf paths, and labels on the leaves, such that no two leaves of a tree have the same label. A pair of nodes $(u, v)\in T_1 \times T_2$ is \emph{induced} if and only if there is a label shared by leaf-descendants of $u$ and $v$. In an HIA query, given nodes $x \in T_1$ and $y \in T_2$, the goal is to find an induced pair of nodes $(u, v)$ of the maximum total weight such that $u$ is an ancestor of~$x$ and $v$ is an ancestor of $y$. Let $n$ be the upper bound on the sizes of the two trees. It is known that no data structure of size $\tilde{\mathcal{O}}(n)$ can answer HIA queries in $o(\log n / \log \log n)$ time [Charalampopoulos, Gawrychowski, Pokorski; ICALP 2020]. This (unconditional) lower bound is a $\operatorname{polyloglog} n$ factor away from the query time of the fastest $\tilde{\mathcal{O}}(n)$-size data structure known to date for the HIA problem [Abedin, Hooshmand, Ganguly, Thankachan; Algorithmica 2022]. In this work, we resolve the query-time complexity of the HIA problem for the near-linear space regime by presenting a data structure that can be built in $\tilde{\mathcal{O}}(n)$ time and answers HIA queries in $\mathcal{O}(\log n/\log\log n)$ time. As a direct corollary, we obtain an $\tilde{\mathcal{O}}(n)$-size data structure that maintains the LCS of a static string and a dynamic string, both of length at most $n$, in time optimal for this space regime. The main ingredients of our approach are fractional cascading and the utilization of an $\mathcal{O}(\log n/ \log\log n)$-depth tree decomposition.

We introduce a new stochastic algorithm for solving entropic optimal transport (EOT) between two absolutely continuous probability measures $\mu$ and $\nu$. Our work is motivated by the specific setting of Monge-Kantorovich quantiles where the source measure $\mu$ is either the uniform distribution on the unit hypercube or the spherical uniform distribution. Using the knowledge of the source measure, we propose to parametrize a Kantorovich dual potential by its Fourier coefficients. In this way, each iteration of our stochastic algorithm reduces to two Fourier transforms that enables us to make use of the Fast Fourier Transform (FFT) in order to implement a fast numerical method to solve EOT. We study the almost sure convergence of our stochastic algorithm that takes its values in an infinite-dimensional Banach space. Then, using numerical experiments, we illustrate the performances of our approach on the computation of regularized Monge-Kantorovich quantiles. In particular, we investigate the potential benefits of entropic regularization for the smooth estimation of multivariate quantiles using data sampled from the target measure $\nu$.

Many real-world systems can be described by mathematical formulas that are human-comprehensible, easy to analyze and can be helpful in explaining the system's behaviour. Symbolic regression is a method that generates nonlinear models from data in the form of analytic expressions. Historically, symbolic regression has been predominantly realized using genetic programming, a method that iteratively evolves a population of candidate solutions that are sampled by genetic operators crossover and mutation. This gradient-free evolutionary approach suffers from several deficiencies: it does not scale well with the number of variables and samples in the training data, models tend to grow in size and complexity without an adequate accuracy gain, and it is hard to fine-tune the inner model coefficients using just genetic operators. Recently, neural networks have been applied to learn the whole analytic formula, i.e., its structure as well as the coefficients, by means of gradient-based optimization algorithms. We propose a novel neural network-based symbolic regression method that constructs physically plausible models based on limited training data and prior knowledge about the system. The method employs an adaptive weighting scheme to effectively deal with multiple loss function terms and an epoch-wise learning process to reduce the chance of getting stuck in poor local optima. Furthermore, we propose a parameter-free method for choosing the model with the best interpolation and extrapolation performance out of all models generated through the whole learning process. We experimentally evaluate the approach on the TurtleBot 2 mobile robot, the magnetic manipulation system, the equivalent resistance of two resistors in parallel, and the anti-lock braking system. The results clearly show the potential of the method to find sparse and accurate models that comply with the prior knowledge provided.

Modern reinforcement learning (RL) often faces an enormous state-action space. Existing analytical results are typically for settings with a small number of state-actions, or simple models such as linearly modeled Q-functions. To derive statistically efficient RL policies handling large state-action spaces, with more general Q-functions, some recent works have considered nonlinear function approximation using kernel ridge regression. In this work, we derive sample complexities for kernel based Q-learning when a generative model exists. We propose a nonparametric Q-learning algorithm which finds an $\epsilon$-optimal policy in an arbitrarily large scale discounted MDP. The sample complexity of the proposed algorithm is order optimal with respect to $\epsilon$ and the complexity of the kernel (in terms of its information gain). To the best of our knowledge, this is the first result showing a finite sample complexity under such a general model.

Black box optimisation of an unknown function from expensive and noisy evaluations is a ubiquitous problem in machine learning, academic research and industrial production. An abstraction of the problem can be formulated as a kernel based bandit problem (also known as Bayesian optimisation), where a learner aims at optimising a kernelized function through sequential noisy observations. The existing work predominantly assumes feedback is immediately available; an assumption which fails in many real world situations, including recommendation systems, clinical trials and hyperparameter tuning. We consider a kernel bandit problem under stochastically delayed feedback, and propose an algorithm with $\tilde{\mathcal{O}}(\sqrt{\Gamma_k(T)T}+\mathbb{E}[\tau])$ regret, where $T$ is the number of time steps, $\Gamma_k(T)$ is the maximum information gain of the kernel with $T$ observations, and $\tau$ is the delay random variable. This represents a significant improvement over the state of the art regret bound of $\tilde{\mathcal{O}}(\Gamma_k(T)\sqrt{T}+\mathbb{E}[\tau]\Gamma_k(T))$ reported in Verma et al. (2022). In particular, for very non-smooth kernels, the information gain grows almost linearly in time, trivializing the existing results. We also validate our theoretical results with simulations.

Often, polynomials or rational functions, orthogonal for a particular inner product are desired. In practical numerical algorithms these polynomials are not constructed, but instead the associated recurrence relations are computed. Moreover, also typically the inner product is changed to a discrete inner product, which is the finite sum of weighted functions evaluated in specific nodes. For particular applications it is beneficial to have an efficient procedure to update the recurrence relations when adding or removing nodes from the inner product. The construction of the recurrence relations is equivalent to computing a structured matrix (polynomial) or pencil (rational) having prescribed spectral properties. Hence the solution of this problem is often referred to as solving an Inverse Eigenvalue Problem. In Van Buggenhout et al. (2022) we proposed updating techniques to add nodes to the inner product while efficiently updating the recurrences. To complete this study we present in this article manners to efficiently downdate the recurrences when removing nodes from the inner product. The link between removing nodes and the QR algorithm to deflate eigenvalues is exploited to develop efficient algorithms. We will base ourselves on the perfect shift strategy and develop algorithms, both for the polynomial case and the rational function setting. Numerical experiments validate our approach.

Empirical neural tangent kernels (eNTKs) can provide a good understanding of a given network's representation: they are often far less expensive to compute and applicable more broadly than infinite width NTKs. For networks with O output units (e.g. an O-class classifier), however, the eNTK on N inputs is of size $NO \times NO$, taking $O((NO)^2)$ memory and up to $O((NO)^3)$ computation. Most existing applications have therefore used one of a handful of approximations yielding $N \times N$ kernel matrices, saving orders of magnitude of computation, but with limited to no justification. We prove that one such approximation, which we call "sum of logits", converges to the true eNTK at initialization for any network with a wide final "readout" layer. Our experiments demonstrate the quality of this approximation for various uses across a range of settings.

Approximate Message Passing (AMP) algorithms are a class of iterative procedures for computationally-efficient estimation in high-dimensional inference and estimation tasks. Due to the presence of an 'Onsager' correction term in its iterates, for $N \times M$ design matrices $\mathbf{A}$ with i.i.d. Gaussian entries, the asymptotic distribution of the estimate at any iteration of the algorithm can be exactly characterized in the large system limit as $M/N \rightarrow \delta \in (0, \infty)$ via a scalar recursion referred to as state evolution. In this paper, we show that appropriate functionals of the iterates, in fact, concentrate around their limiting values predicted by these asymptotic distributions with rates exponentially fast in $N$ for a large class of AMP-style algorithms, including those that are used when high-dimensional generalized linear regression models are assumed to be the data-generating process, like the generalized AMP algorithm, or those that are used when the measurement matrix is assumed to be right rotationally invariant instead of i.i.d. Gaussian, like vector AMP and generalized vector AMP. In practice, these more general AMP algorithms have many applications, for example in in communications or imaging, and this work provides the first study of finite sample behavior of such algorithms.

北京阿比特科技有限公司