亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

A surprising phenomenon in modern machine learning is the ability of a highly overparameterized model to generalize well (small error on the test data) even when it is trained to memorize the training data (zero error on the training data). This has led to an arms race towards increasingly overparameterized models (c.f., deep learning). In this paper, we study an underexplored hidden cost of overparameterization: the fact that overparameterized models may be more vulnerable to privacy attacks, in particular the membership inference attack that predicts the (potentially sensitive) examples used to train a model. We significantly extend the relatively few empirical results on this problem by theoretically proving for an overparameterized linear regression model in the Gaussian data setting that membership inference vulnerability increases with the number of parameters. Moreover, a range of empirical studies indicates that more complex, nonlinear models exhibit the same behavior. Finally, we extend our analysis towards ridge-regularized linear regression and show in the Gaussian data setting that increased regularization also increases membership inference vulnerability in the overparameterized regime.

相關內容

In the age of big data and interpretable machine learning, approaches need to work at scale and at the same time allow for a clear mathematical understanding of the method's inner workings. While there exist inherently interpretable semi-parametric regression techniques for large-scale applications to account for non-linearity in the data, their model complexity is still often restricted. One of the main limitations are missing interactions in these models, which are not included for the sake of better interpretability, but also due to untenable computational costs. To address this shortcoming, we derive a scalable high-order tensor product spline model using a factorization approach. Our method allows to include all (higher-order) interactions of non-linear feature effects while having computational costs proportional to a model without interactions. We prove both theoretically and empirically that our methods scales notably better than existing approaches, derive meaningful penalization schemes and also discuss further theoretical aspects. We finally investigate predictive and estimation performance both with synthetic and real data.

Regression models that ignore measurement error in predictors may produce highly biased estimates leading to erroneous inferences. It is well known that it is extremely difficult to take measurement error into account in Gaussian nonparametric regression. This problem becomes tremendously more difficult when considering other families such as logistic regression, Poisson and negative-binomial. For the first time, we present a method aiming to correct for measurement error when estimating regression functions flexibly covering virtually all distributions and link functions regularly considered in generalized linear models. This approach depends on approximating the first and the second moment of the response after integrating out the true unobserved predictors in a semiparametric generalized linear model. Unlike previous methods, this method is not restricted to truncated splines and can utilize various basis functions. Through extensive simulation studies, we study the performance of our method under many scenarios.

We study the stability of posterior predictive inferences to the specification of the likelihood model and perturbations of the data generating process. In modern big data analyses, the decision-maker may elicit useful broad structural judgements but a level of interpolation is required to arrive at a likelihood model. One model, often a computationally convenient canonical form, is chosen, when many alternatives would have been equally consistent with the elicited judgements. Equally, observational datasets often contain unforeseen heterogeneities and recording errors. Acknowledging such imprecisions, a faithful Bayesian analysis should be stable across reasonable equivalence classes for these inputs. We show that traditional Bayesian updating provides stability across a very strict class of likelihood models and DGPs, while a generalised Bayesian alternative using the beta-divergence loss function is shown to be stable across practical and interpretable neighbourhoods. We illustrate this in linear regression, binary classification, and mixture modelling examples, showing that stable updating does not compromise the ability to learn about the DGP. These stability results provide a compelling justification for using generalised Bayes to facilitate inference under simplified canonical models.

We study the implicit regularization of gradient descent towards structured sparsity via a novel neural reparameterization, which we call a diagonally grouped linear neural network. We show the following intriguing property of our reparameterization: gradient descent over the squared regression loss, without any explicit regularization, biases towards solutions with a group sparsity structure. In contrast to many existing works in understanding implicit regularization, we prove that our training trajectory cannot be simulated by mirror descent. We analyze the gradient dynamics of the corresponding regression problem in the general noise setting and obtain minimax-optimal error rates. Compared to existing bounds for implicit sparse regularization using diagonal linear networks, our analysis with the new reparameterization shows improved sample complexity. In the degenerate case of size-one groups, our approach gives rise to a new algorithm for sparse linear regression. Finally, we demonstrate the efficacy of our approach with several numerical experiments.

Combining ideas coming from Stone duality and Reynolds parametricity, we formulate in a clean and principled way a notion of profinite lambda-term which, we show, generalizes at every type the traditional notion of profinite word coming from automata theory. We start by defining the Stone space of profinite lambda-terms as a projective limit of finite sets of usual lambda-terms, considered modulo a notion of equivalence based on the finite standard model. One main contribution of the paper is to establish that, somewhat surprisingly, the resulting notion of profinite lambda-term coming from Stone duality lives in perfect harmony with the principles of Reynolds parametricity. In addition, we show that the notion of profinite lambda-term is compositional by constructing a cartesian closed category of profinite lambda-terms, and we establish that the embedding from lambda-terms modulo beta-eta-conversion to profinite lambda-terms is faithful using Statman's finite completeness theorem. Finally, we prove a parametricity theorem for Church encodings of word and ranked tree languages, which states that every parametric family of elements in the finite standard model is the interpretation of a profinite lambda-term. This result shows that our notion of profinite lambda-term conservatively extends the existing notion of profinite word and provides a natural framework for defining and studying profinite trees.

In this paper, we present a unified and general framework for analyzing the batch updating approach to nonlinear, high-dimensional optimization. The framework encompasses all the currently used batch updating approaches, and is applicable to nonconvex as well as convex functions. Moreover, the framework permits the use of noise-corrupted gradients, as well as first-order approximations to the gradient (sometimes referred to as "gradient-free" approaches). By viewing the analysis of the iterations as a problem in the convergence of stochastic processes, we are able to establish a very general theorem, which includes most known convergence results for zeroth-order and first-order methods. The analysis of "second-order" or momentum-based methods is not a part of this paper, and will be studied elsewhere. However, numerical experiments indicate that momentum-based methods can fail if the true gradient is replaced by its first-order approximation. This requires further theoretical analysis.

Federated Learning is a distributed machine learning environment, which ensures that clients complete collaborative training without sharing private data, only by exchanging parameters. However, the data does not satisfy the same distribution and the computing resources of clients are different, which brings challenges to the related research. To better solve the above heterogeneous problems, we designed a novel federated learning method. The local model consists of the pre-trained model as the backbone and fully connected layers as the head. The backbone can extract features for the head, and the embedding vector of classes is shared between clients to optimize the head so that the local model can perform better. By sharing the embedding vector of classes, instead of parameters based on gradient space, clients can better adapt to private data, and it is more efficient in the communication between the server and clients. To better protect privacy, we proposed a privacy-preserving hybrid method to add noise to the embedding vector of classes, which has less impact on the local model performance under the premise of satisfying differential privacy. We conduct a comprehensive evaluation with other federated learning methods on the self-built vehicle dataset under non-independent identically distributed(Non-IID)

We study the error of linear regression in the face of adversarial attacks. In this framework, an adversary changes the input to the regression model in order to maximize the prediction error. We provide bounds on the prediction error in the presence of an adversary as a function of the parameter norm and the error in the absence of such an adversary. We show how these bounds make it possible to study the adversarial error using analysis from non-adversarial setups. The obtained results shed light on the robustness of overparameterized linear models to adversarial attacks. Adding features might be either a source of additional robustness or brittleness. On the one hand, we use asymptotic results to illustrate how double-descent curves can be obtained for the adversarial error. On the other hand, we derive conditions under which the adversarial error can grow to infinity as more features are added, while at the same time, the test error goes to zero. We show this behavior is caused by the fact that the norm of the parameter vector grows with the number of features. It is also established that $\ell_\infty$ and $\ell_2$-adversarial attacks might behave fundamentally differently due to how the $\ell_1$ and $\ell_2$-norms of random projections concentrate. We also show how our reformulation allows for solving adversarial training as a convex optimization problem. This fact is then exploited to establish similarities between adversarial training and parameter-shrinking methods and to study how the training might affect the robustness of the estimated models.

The industrialization of catalytic processes is of far more importance today than it has ever been before and kinetic models are essential tools for their industrialization. Kinetic models affect the design, the optimization and the control of catalytic processes, but they are not easy to obtain. Classical paradigms, such as mechanistic modeling require substantial domain knowledge, while data-driven and hybrid modeling lack interpretability. Consequently, a different approach called automated knowledge discovery has recently gained popularity. Many methods under this paradigm have been developed, where ALAMO, SINDy and genetic programming are notable examples. However, these methods suffer from important drawbacks: they require assumptions about model structures, scale poorly, lack robust and well-founded model selection routines, and they are sensitive to noise. To overcome these challenges, the present work constructs two methodological frameworks, Automated Discovery of Kinetics using a Strong/Weak formulation of symbolic regression, ADoK-S and ADoK-W, for the automated generation of catalytic kinetic models. We leverage genetic programming for model generation, a sequential optimization routine for model refinement, and a robust criterion for model selection. Both frameworks are tested against three computational case studies of increasing complexity. We showcase their ability to retrieve the underlying kinetic rate model with a limited amount of noisy data from the catalytic system, indicating a strong potential for chemical reaction engineering applications.

The rapid recent progress in machine learning (ML) has raised a number of scientific questions that challenge the longstanding dogma of the field. One of the most important riddles is the good empirical generalization of overparameterized models. Overparameterized models are excessively complex with respect to the size of the training dataset, which results in them perfectly fitting (i.e., interpolating) the training data, which is usually noisy. Such interpolation of noisy data is traditionally associated with detrimental overfitting, and yet a wide range of interpolating models -- from simple linear models to deep neural networks -- have recently been observed to generalize extremely well on fresh test data. Indeed, the recently discovered double descent phenomenon has revealed that highly overparameterized models often improve over the best underparameterized model in test performance. Understanding learning in this overparameterized regime requires new theory and foundational empirical studies, even for the simplest case of the linear model. The underpinnings of this understanding have been laid in very recent analyses of overparameterized linear regression and related statistical learning tasks, which resulted in precise analytic characterizations of double descent. This paper provides a succinct overview of this emerging theory of overparameterized ML (henceforth abbreviated as TOPML) that explains these recent findings through a statistical signal processing perspective. We emphasize the unique aspects that define the TOPML research area as a subfield of modern ML theory and outline interesting open questions that remain.

北京阿比特科技有限公司