亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

We introduce kernel thinning, a new procedure for compressing a distribution $\mathbb{P}$ more effectively than i.i.d. sampling or standard thinning. Given a suitable reproducing kernel $\mathbf{k}_{\star}$ and $\mathcal{O}(n^2)$ time, kernel thinning compresses an $n$-point approximation to $\mathbb{P}$ into a $\sqrt{n}$-point approximation with comparable worst-case integration error across the associated reproducing kernel Hilbert space. The maximum discrepancy in integration error is $\mathcal{O}_d(n^{-1/2}\sqrt{\log n})$ in probability for compactly supported $\mathbb{P}$ and $\mathcal{O}_d(n^{-\frac{1}{2}} (\log n)^{(d+1)/2}\sqrt{\log\log n})$ for sub-exponential $\mathbb{P}$ on $\mathbb{R}^d$. In contrast, an equal-sized i.i.d. sample from $\mathbb{P}$ suffers $\Omega(n^{-1/4})$ integration error. Our sub-exponential guarantees resemble the classical quasi-Monte Carlo error rates for uniform $\mathbb{P}$ on $[0,1]^d$ but apply to general distributions on $\mathbb{R}^d$ and a wide range of common kernels. Moreover, the same construction delivers near-optimal $L^\infty$ coresets in $\mathcal O(n^2)$ time. We use our results to derive explicit non-asymptotic maximum mean discrepancy bounds for Gaussian, Mat\'ern, and B-spline kernels and present two vignettes illustrating the practical benefits of kernel thinning over i.i.d. sampling and standard Markov chain Monte Carlo thinning, in dimensions $d=2$ through $100$.

相關內容

For flexible yet safe imitation learning (IL), we propose theory and a modular method, with a safety layer that enables a closed-form probability density/gradient of the safe generative continuous policy, end-to-end generative adversarial training, and worst-case safety guarantees. The safety layer maps all actions into a set of safe actions, and uses the change-of-variables formula plus additivity of measures for the density. The set of safe actions is inferred by first checking safety of a finite sample of actions via adversarial reachability analysis of fallback maneuvers, and then concluding on the safety of these actions' neighborhoods using, e.g., Lipschitz continuity. We provide theoretical analysis showing the robustness advantage of using the safety layer already during training (imitation error linear in the horizon) compared to only using it at test time (up to quadratic error). In an experiment on real-world driver interaction data, we empirically demonstrate tractability, safety and imitation performance of our approach.

The research detailed in this paper scrutinizes Principal Component Analysis (PCA), a seminal method employed in statistics and machine learning for the purpose of reducing data dimensionality. Singular Value Decomposition (SVD) is often employed as the primary means for computing PCA, a process that indispensably includes the step of centering - the subtraction of the mean location from the data set. In our study, we delve into a detailed exploration of the influence of this critical yet often ignored or downplayed data centering step. Our research meticulously investigates the conditions under which two PCA embeddings, one derived from SVD with centering and the other without, can be viewed as aligned. As part of this exploration, we analyze the relationship between the first singular vector and the mean direction, subsequently linking this observation to the congruity between two SVDs of centered and uncentered matrices. Furthermore, we explore the potential implications arising from the absence of centering in the context of performing PCA via SVD from a spectral analysis standpoint. Our investigation emphasizes the importance of a comprehensive understanding and acknowledgment of the subtleties involved in the computation of PCA. As such, we believe this paper offers a crucial contribution to the nuanced understanding of this foundational statistical method and stands as a valuable addition to the academic literature in the field of statistics.

Inference of transfer operators from data is often formulated as a classical problem that hinges on the Ulam method. The conventional description, known as the Ulam-Galerkin method, involves projecting onto basis functions represented as characteristic functions supported over a fine grid of rectangles. From this perspective, the Ulam-Galerkin approach can be interpreted as density estimation using the histogram method. In this study, we recast the problem within the framework of statistical density estimation. This alternative perspective allows for an explicit and rigorous analysis of bias and variance, thereby facilitating a discussion on the mean square error. Through comprehensive examples utilizing the logistic map and a Markov map, we demonstrate the validity and effectiveness of this approach in estimating the eigenvectors of the Frobenius-Perron operator. We compare the performance of Histogram Density Estimation(HDE) and Kernel Density Estimation(KDE) methods and find that KDE generally outperforms HDE in terms of accuracy. However, it is important to note that KDE exhibits limitations around boundary points and jumps. Based on our research findings, we suggest the possibility of incorporating other density estimation methods into this field and propose future investigations into the application of KDE-based estimation for high-dimensional maps. These findings provide valuable insights for researchers and practitioners working on estimating the Frobenius-Perron operator and highlight the potential of density estimation techniques in this area of study. Keywords: Transfer Operators; Frobenius-Perron operator; probability density estimation; Ulam-Galerkin method; Kernel Density Estimation; Histogram Density Estimation.

We consider the problem of clustering privately a dataset in $\mathbb{R}^d$ that undergoes both insertion and deletion of points. Specifically, we give an $\varepsilon$-differentially private clustering mechanism for the $k$-means objective under continual observation. This is the first approximation algorithm for that problem with an additive error that depends only logarithmically in the number $T$ of updates. The multiplicative error is almost the same as non privately. To do so we show how to perform dimension reduction under continual observation and combine it with a differentially private greedy approximation algorithm for $k$-means. We also partially extend our results to the $k$-median problem.

Rescheduling problems arise in a variety of situations where a previously planned schedule needs to be adjusted to deal with unforeseen events. A common problem is the arrival of new orders, i.e. jobs, which have to be integrated into the schedule of the so-called old jobs. The maximum and total absolute time deviations of the completion times of these jobs are modeled as a disruption constraint to limit the change in the original schedule. Disruption constraints affect the shape of an optimal schedule, particularly with respect to the sequencing of old jobs and the insertion of idle time. We therefore give a classification into idle and no-idle problems for a set of single-machine rescheduling problems with different objective functions. We then prove the complexity of five rescheduling problems that have been left open in the literature.

Physics-informed neural networks (PINNs) offer a novel and efficient approach to solving partial differential equations (PDEs). Their success lies in the physics-informed loss, which trains a neural network to satisfy a given PDE at specific points and to approximate the solution. However, the solutions to PDEs are inherently infinite-dimensional, and the distance between the output and the solution is defined by an integral over the domain. Therefore, the physics-informed loss only provides a finite approximation, and selecting appropriate collocation points becomes crucial to suppress the discretization errors, although this aspect has often been overlooked. In this paper, we propose a new technique called good lattice training (GLT) for PINNs, inspired by number theoretic methods for numerical analysis. GLT offers a set of collocation points that are effective even with a small number of points and for multi-dimensional spaces. Our experiments demonstrate that GLT requires 2--20 times fewer collocation points (resulting in lower computational cost) than uniformly random sampling or Latin hypercube sampling, while achieving competitive performance.

We study the consistent k-center clustering problem. In this problem, the goal is to maintain a constant factor approximate $k$-center solution during a sequence of $n$ point insertions and deletions while minimizing the recourse, i.e., the number of changes made to the set of centers after each point insertion or deletion. Previous works by Lattanzi and Vassilvitskii [ICML '12] and Fichtenberger, Lattanzi, Norouzi-Fard, and Svensson [SODA '21] showed that in the incremental setting, where deletions are not allowed, one can obtain $k \cdot \textrm{polylog}(n) / n$ amortized recourse for both $k$-center and $k$-median, and demonstrated a matching lower bound. However, no algorithm for the fully dynamic setting achieves less than the trivial $O(k)$ changes per update, which can be obtained by simply reclustering the full dataset after every update. In this work, we give the first algorithm for consistent $k$-center clustering for the fully dynamic setting, i.e., when both point insertions and deletions are allowed, and improves upon a trivial $O(k)$ recourse bound. Specifically, our algorithm maintains a constant factor approximate solution while ensuring worst-case constant recourse per update, which is optimal in the fully dynamic setting. Moreover, our algorithm is deterministic and is therefore correct even if an adaptive adversary chooses the insertions and deletions.

Machine learning has attracted widespread attention and evolved into an enabling technology for a wide range of highly successful applications, such as intelligent computer vision, speech recognition, medical diagnosis, and more. Yet a special need has arisen where, due to privacy, usability, and/or the right to be forgotten, information about some specific samples needs to be removed from a model, called machine unlearning. This emerging technology has drawn significant interest from both academics and industry due to its innovation and practicality. At the same time, this ambitious problem has led to numerous research efforts aimed at confronting its challenges. To the best of our knowledge, no study has analyzed this complex topic or compared the feasibility of existing unlearning solutions in different kinds of scenarios. Accordingly, with this survey, we aim to capture the key concepts of unlearning techniques. The existing solutions are classified and summarized based on their characteristics within an up-to-date and comprehensive review of each category's advantages and limitations. The survey concludes by highlighting some of the outstanding issues with unlearning techniques, along with some feasible directions for new research opportunities.

While existing work in robust deep learning has focused on small pixel-level $\ell_p$ norm-based perturbations, this may not account for perturbations encountered in several real world settings. In many such cases although test data might not be available, broad specifications about the types of perturbations (such as an unknown degree of rotation) may be known. We consider a setup where robustness is expected over an unseen test domain that is not i.i.d. but deviates from the training domain. While this deviation may not be exactly known, its broad characterization is specified a priori, in terms of attributes. We propose an adversarial training approach which learns to generate new samples so as to maximize exposure of the classifier to the attributes-space, without having access to the data from the test domain. Our adversarial training solves a min-max optimization problem, with the inner maximization generating adversarial perturbations, and the outer minimization finding model parameters by optimizing the loss on adversarial perturbations generated from the inner maximization. We demonstrate the applicability of our approach on three types of naturally occurring perturbations -- object-related shifts, geometric transformations, and common image corruptions. Our approach enables deep neural networks to be robust against a wide range of naturally occurring perturbations. We demonstrate the usefulness of the proposed approach by showing the robustness gains of deep neural networks trained using our adversarial training on MNIST, CIFAR-10, and a new variant of the CLEVR dataset.

Most existing works in visual question answering (VQA) are dedicated to improving the accuracy of predicted answers, while disregarding the explanations. We argue that the explanation for an answer is of the same or even more importance compared with the answer itself, since it makes the question and answering process more understandable and traceable. To this end, we propose a new task of VQA-E (VQA with Explanation), where the computational models are required to generate an explanation with the predicted answer. We first construct a new dataset, and then frame the VQA-E problem in a multi-task learning architecture. Our VQA-E dataset is automatically derived from the VQA v2 dataset by intelligently exploiting the available captions. We have conducted a user study to validate the quality of explanations synthesized by our method. We quantitatively show that the additional supervision from explanations can not only produce insightful textual sentences to justify the answers, but also improve the performance of answer prediction. Our model outperforms the state-of-the-art methods by a clear margin on the VQA v2 dataset.

北京阿比特科技有限公司