
We study the differential privacy (DP) of a core ML problem, linear ordinary least squares (OLS), a.k.a. $\ell_2$-regression. Our key result is that the approximate LS algorithm (ALS) (Sarlos, 2006), a randomized solution to the OLS problem primarily used to improve performance on large datasets, also preserves privacy. ALS achieves a better privacy/utility tradeoff, without modifications or further noising, when compared to alternative private OLS algorithms which modify and/or noise OLS. We give the first {\em tight} DP-analysis for the ALS algorithm and the standard Gaussian mechanism (Dwork et al., 2014) applied to OLS. Our methodology directly improves the privacy analysis of (Blocki et al., 2012) and (Sheffet, 2019) and introduces new tools which may be of independent interest: (1) the exact spectrum of $(\epsilon, \delta)$-DP parameters (``DP spectrum'') for mechanisms whose output is a $d$-dimensional Gaussian, and (2) an improved DP spectrum for random projection (compared to (Blocki et al., 2012) and (Sheffet, 2019)). All methods for private OLS (including ours) assume, often implicitly, restrictions on the input database, such as bounds on leverage and residuals. We prove that such restrictions are necessary. Hence, computing the privacy of mechanisms such as ALS requires estimating these database parameters, which can be infeasible in big datasets. For more complex ML models, DP bounds may not even be tractable. There is a need for blackbox DP-estimators (Lu et al., 2022) which empirically estimate a data-dependent privacy. We demonstrate the effectiveness of such a DP-estimator by empirically recovering a DP-spectrum that matches our theory for OLS. This validates the DP-estimator in a nontrivial ML application, opening the door to its use in more complex nonlinear ML settings where theory is unavailable.
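
For orientation, the objects named in this abstract can be written out as follows. This is a sketch in standard notation rather than the paper's own: the symbols $X$ (an $n \times d$ data matrix), $y$ (the response vector), $S$ (an $r \times n$ random sketching matrix, for example with i.i.d. Gaussian entries), and $M$ (a randomized mechanism) are assumptions introduced here, and the paper's exact sketching distribution may differ.

$$
\hat{\beta}_{\mathrm{OLS}} = \arg\min_{\beta \in \mathbb{R}^d} \|X\beta - y\|_2^2,
\qquad
\hat{\beta}_{\mathrm{ALS}} = \arg\min_{\beta \in \mathbb{R}^d} \|S(X\beta - y)\|_2^2 .
$$

A mechanism $M$ is $(\epsilon, \delta)$-DP if, for all neighboring databases $D, D'$ and all measurable output sets $E$,

$$
\Pr[M(D) \in E] \le e^{\epsilon} \, \Pr[M(D') \in E] + \delta .
$$

Read this way, a "DP spectrum" would be the set of all $(\epsilon, \delta)$ pairs for which this inequality holds for a given mechanism.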

Related content

A new approach to analyzing intrinsic properties of the Josephus function, $J_k$, is presented in this paper. The linear structure between extreme points of $J_k$ is fully revealed, leading to the design of an efficient algorithm for evaluating $J_k(n)$. Algebraic expressions that describe how to recursively compute extreme points, including fixed points, are derived. The existence of consecutive extreme and also fixed points for all $k\geq 2$ is proven as a consequence, which generalizes Knuth's result for $k=2$. Moreover, an extensive comparative numerical experiment is conducted to illustrate the performance of the proposed algorithm for evaluating the Josephus function compared to established algorithms. The results show that the proposed scheme is highly effective in computing $J_k(n)$ for large inputs.
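
As a point of reference, the Josephus function can be evaluated with the textbook linear-time recurrence. The sketch below is that classical baseline, not the algorithm proposed in the paper, and it assumes the common convention that $J_k(n)$ is the 1-indexed position of the survivor when every $k$-th person in a circle of $n$ is eliminated. The function name `josephus` is illustrative.

```python
def josephus(n: int, k: int) -> int:
    """Return the 1-indexed survivor position for n people, eliminating every k-th.

    Uses the classical recurrence on 0-indexed positions:
        J(1) = 0,  J(m) = (J(m - 1) + k) mod m
    and shifts the result to 1-indexed form at the end. Runs in O(n) time.
    """
    if n < 1 or k < 1:
        raise ValueError("n and k must be positive integers")
    survivor = 0  # 0-indexed survivor for a circle of size 1
    for circle_size in range(2, n + 1):
        survivor = (survivor + k) % circle_size
    return survivor + 1


if __name__ == "__main__":
    for n in (1, 7, 10, 41):
        print(n, josephus(n, 2), josephus(n, 3))
```

Under this convention a fixed point is an $n$ with $J_k(n) = n$; for $k = 2$, $n = 7$ is one such value.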

We study the minority-opinion dynamics over a fully-connected network of $n$ nodes with binary opinions. Upon activation, a node receives a sample of opinions from a limited number of neighbors chosen uniformly at random. Each activated node then adopts the opinion that is least common within the received sample. Unlike all other known consensus dynamics, we prove that this elementary protocol behaves in dramatically different ways, depending on whether activations occur sequentially or in parallel. Specifically, we show that its expected consensus time is exponential in $n$ under asynchronous models, such as asynchronous GOSSIP. On the other hand, despite its chaotic nature, we show that it converges within $O(\log^2 n)$ rounds with high probability under synchronous models, such as synchronous GOSSIP. Finally, our results shed light on the bit-dissemination problem, which was previously introduced to model the spread of information in biological scenarios. Specifically, our analysis implies that the minority-opinion dynamics is the first stateless solution to this problem, in the parallel passive-communication setting, achieving convergence within a polylogarithmic number of rounds. This, together with a known lower bound for sequential stateless dynamics, implies a parallel-vs-sequential gap for this problem that is nearly quadratic in the number $n$ of nodes. This is in contrast to all known results for problems in this area, which exhibit a linear gap between the parallel and the sequential setting.
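
One synchronous round of the update rule described above can be sketched as follows. Several details are assumptions made for illustration and are not taken from the paper: samples are drawn with replacement from all nodes, a unanimous sample is adopted as is, and an exact tie is broken uniformly at random. The names `minority_of` and `synchronous_round` are hypothetical.

```python
import random


def minority_of(sample, rng):
    """Return the least common binary opinion in a non-empty sample.

    Assumptions: a unanimous sample returns its single opinion,
    and an exact tie is broken uniformly at random.
    """
    ones = sum(sample)
    zeros = len(sample) - ones
    if ones == 0 or zeros == 0:
        return sample[0]
    if ones == zeros:
        return rng.choice((0, 1))
    return 1 if ones < zeros else 0


def synchronous_round(opinions, sample_size, rng):
    """All nodes update in parallel, each from its own uniform random sample."""
    n = len(opinions)
    return [
        minority_of([opinions[rng.randrange(n)] for _ in range(sample_size)], rng)
        for _ in range(n)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    state = [rng.randrange(2) for _ in range(200)]
    for round_index in range(10):
        state = synchronous_round(state, 5, rng)
        print(round_index, sum(state))
```

With the unanimous-sample rule assumed here, consensus is an absorbing state: once all nodes agree, every sample is unanimous and no node changes its opinion.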

Among the wide variety of evolutionary computing models, Finite State Machines (FSMs) have several attractions for fundamental research. They are easy to understand in concept and can be visualised clearly in simple cases. They have a ready fitness criterion through their relationship with Regular Languages. They have also been shown to be tractably evolvable, even up to exhibiting evidence of open-ended evolution in specific scenarios. In addition to theoretical attraction, they also have industrial applications, as a paradigm of both automated and user-initiated control. Improving the understanding of the factors affecting FSM evolution has relevance to both computer science and practical optimisation of control. We investigate an evolutionary scenario of FSMs adapting to recognise one of a family of Regular Languages by categorising positive and negative samples, while also being under a counteracting selection pressure that favours fewer states. The results appear to indicate that longer strings provided as samples reduce the speed of fitness gain, when fitness is measured against a fixed number of sample strings. We draw the inference that additional information from longer strings is not sufficient to compensate for sparser coverage of the combinatorial space of positive and negative sample strings.

Within the realm of deep learning, the interpretability of Convolutional Neural Networks (CNNs), particularly in the context of image classification tasks, remains a formidable challenge. To this end we present a neurosymbolic framework, NeSyFOLD-G, that generates a symbolic rule-set using the last layer kernels of the CNN to make its underlying knowledge interpretable. What makes NeSyFOLD-G different from other similar frameworks is that we first find groups of similar kernels in the CNN (kernel-grouping) using the cosine-similarity between the feature maps generated by various kernels. Once such kernel groups are found, we binarize each kernel group's output in the CNN and use it to generate a binarization table which serves as input data to FOLD-SE-M, a Rule Based Machine Learning (RBML) algorithm. FOLD-SE-M then generates a rule-set that can be used to make predictions. We present a novel kernel grouping algorithm and show that grouping similar kernels leads to a significant reduction in the size of the rule-set generated by FOLD-SE-M, consequently improving interpretability. This rule-set symbolically encapsulates the connectionist knowledge of the trained CNN. The rule-set can be viewed as a normal logic program wherein each predicate's truth value depends on a kernel group in the CNN. Each predicate in the rule-set is mapped to a concept using a few semantic segmentation masks of the images used for training, to make it human-understandable. The last layers of the CNN can then be replaced by this rule-set to obtain the NeSy-G model which can then be used for the image classification task. The goal directed ASP system s(CASP) can be used to obtain the justification of any prediction made using the NeSy-G model. We also propose a novel algorithm for labeling each predicate in the rule-set with the semantic concept(s) that its corresponding kernel group represents.

We propose tests of fit for classes of distributions that include the Weibull, the Pareto and the Fr\'echet distributions. The new tests employ the novel tool of the min--characteristic function and are based on an L2--type weighted distance between this function and its empirical counterpart applied to suitably standardized data. If data--standardization is performed using the MLE of the distributional parameters then the method reduces to testing for the standard member of the family, with parameter values known and set equal to one. We investigate asymptotic properties of the tests, while a Monte Carlo study is presented that includes the new procedure as well as competitors for the purpose of specification testing with three extreme value distributions. The new tests are also applied to a few real--data sets.

Adiabatic quantum computing (AQC) is a promising quantum computing approach for discrete and often NP-hard optimization problems. Current AQCs allow the implementation of problems of research interest, which has sparked the development of quantum representations for many machine learning and computer vision tasks. Despite requiring multiple measurements from the noisy AQC, current approaches only utilize the best measurement, discarding information contained in the remaining ones. In this work, we explore the potential of using this information for probabilistic balanced k-means clustering. Instead of discarding non-optimal solutions, we propose to use them to compute calibrated posterior probabilities with little additional compute cost. This allows us to identify ambiguous solutions and data points, which we demonstrate on a D-Wave AQC on synthetic and real data.

Reachability analysis plays a central role in system design and verification. The reachability problem, denoted $\Diamond^J\,\Phi$, asks whether the system will meet the property $\Phi$ after some time in a given time interval $J$. Recently, it has been considered on a novel kind of real-time systems -- quantum continuous-time Markov chains (QCTMCs), and embedded into the model-checking algorithm. In this paper, we further study the repeated reachability problem in QCTMCs, denoted $\Box^I\,\Diamond^J\,\Phi$, which concerns whether the system starting from each \emph{absolute} time in $I$ will meet the property $\Phi$ after some coming \emph{relative} time in $J$. First of all, we reduce it to the real root isolation of a class of real-valued functions (exponential polynomials), whose solvability is conditional on Schanuel's conjecture being true. To speed up the procedure, we employ the strategy of sampling. The original problem is shown to be equivalent to the existence of a finite collection of satisfying samples. We then present a sample-driven procedure, which can effectively refine the sample space after each round of sampling, regardless of whether the sample itself is successful or conflicting. The improvement in efficiency is validated on randomly generated instances. Hence the proposed method is a promising approach to repeated reachability problems, as well as to checking other $\omega$-regular properties, in a wide range of real-time systems.

Nested simulation encompasses the estimation of functionals linked to conditional expectations through simulation techniques. In this paper, we treat conditional expectation as a function of the multidimensional conditioning variable and provide asymptotic analyses of general least squares estimators on sieves, without imposing specific assumptions on the function's form. Our study explores scenarios in which the convergence rate surpasses that of the standard Monte Carlo method and the one recently proposed based on kernel ridge regression. We also delve into the conditions that allow for achieving the best possible square root convergence rate among all methods. Numerical experiments are conducted to support our statements.

Online algorithms with predictions have become a trending topic in the field of beyond worst-case analysis of algorithms. These algorithms incorporate predictions about the future to obtain performance guarantees that are of high quality when the predictions are good, while still maintaining bounded worst-case guarantees when predictions are arbitrarily poor. In general, the algorithm is assumed to be unaware of the prediction's quality. However, recent developments in the machine learning literature have studied techniques for providing uncertainty quantification on machine-learned predictions, which describes how certain a model is about its quality. This paper examines the question of how to optimally utilize uncertainty-quantified predictions in the design of online algorithms. In particular, we consider predictions augmented with uncertainty quantification describing the likelihood of the ground truth falling in a certain range, designing online algorithms with these probabilistic predictions for two classic online problems: ski rental and online search. In each case, we demonstrate that non-trivial modifications to algorithm design are needed to fully leverage the probabilistic predictions. Moreover, we consider how to utilize more general forms of uncertainty quantification, proposing a framework based on online learning that learns to exploit uncertainty quantification to make optimal decisions in multi-instance settings.
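
For context on the ski rental problem mentioned above, the sketch below implements the classical prediction-free break-even strategy, which is the usual baseline that prediction-augmented algorithms are compared against. It is not the algorithm from the paper. It assumes renting costs 1 per day, buying costs an integer `buy_price`, and the season length `days` is unknown to the online algorithm; the function names are illustrative.

```python
def break_even_cost(days: int, buy_price: int) -> int:
    """Cost of renting for buy_price - 1 days, then buying on day buy_price.

    If the season ends before day buy_price, only rent is paid.
    """
    if days < 1 or buy_price < 1:
        raise ValueError("days and buy_price must be positive integers")
    if days < buy_price:
        return days
    return (buy_price - 1) + buy_price


def offline_optimum(days: int, buy_price: int) -> int:
    """Cost of the best decision when the season length is known in advance."""
    return min(days, buy_price)


if __name__ == "__main__":
    buy_price = 10
    for days in (3, 9, 10, 50):
        online = break_even_cost(days, buy_price)
        best = offline_optimum(days, buy_price)
        print(days, online, best, online / best)
```

The ratio of online to offline cost never exceeds $2 - 1/b$ for buy price $b$, which is the worst-case guarantee that a design using probabilistic predictions would aim to improve on when the predictions are informative.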

We consider the problem of discovering $K$ related Gaussian directed acyclic graphs (DAGs), where the involved graph structures share a consistent causal order and sparse unions of supports. Under the multi-task learning setting, we propose an $l_1/l_2$-regularized maximum likelihood estimator (MLE) for learning $K$ linear structural equation models. We theoretically show that the joint estimator, by leveraging data across related tasks, can achieve a better sample complexity for recovering the causal order (or topological order) than separate estimations. Moreover, the joint estimator is able to recover non-identifiable DAGs, by estimating them together with some identifiable DAGs. Lastly, our analysis also shows the consistency of union support recovery of the structures. To allow practical implementation, we design a continuous optimization problem whose optimizer is the same as the joint estimator and can be approximated efficiently by an iterative algorithm. We validate the theoretical analysis and the effectiveness of the joint estimator in experiments.
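
To illustrate what an $l_1/l_2$ penalty typically looks like in a multi-task setting, the sketch below computes the usual group norm: for each matrix entry $(i, j)$, take the $l_2$ norm of that entry across the $K$ tasks, then sum over entries. The grouping by entry across tasks is an assumption made here, and the paper's exact regularizer may group differently. The name `l1_l2_penalty` is hypothetical.

```python
import math


def l1_l2_penalty(coefficient_matrices):
    """Sum over entries (i, j) of the l2 norm of that entry across all tasks.

    coefficient_matrices: list of K equally shaped matrices (lists of lists),
    one weighted adjacency matrix per task.
    """
    first = coefficient_matrices[0]
    rows, cols = len(first), len(first[0])
    total = 0.0
    for i in range(rows):
        for j in range(cols):
            # l2 norm of entry (i, j) taken across the K tasks
            total += math.sqrt(sum(b[i][j] ** 2 for b in coefficient_matrices))
    return total


if __name__ == "__main__":
    task_a = [[0.0, 3.0], [0.0, 0.0]]
    task_b = [[0.0, 4.0], [0.0, 0.0]]
    print(l1_l2_penalty([task_a, task_b]))
```

Because the outer sum is an $l_1$ norm over groups, a penalty of this form pushes an entire edge to zero in all tasks at once, which fits the abstract's assumption of sparse unions of supports.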
