Penalized linear regression is of fundamental importance in high-dimensional statistics and has been routinely used to regress a response on a high-dimensional set of predictors. In many scientific applications, there exists external information that encodes the predictive power and sparsity structure of the predictors. In this article, we propose the Structure Adaptive Elastic-Net (SA-Enet), which provides a new framework for incorporating potentially useful side information into a penalized regression. The basic idea is to translate the external information into different penalization strengths for the regression coefficients. We focus in particular on group and covariate-dependent structures and study the risk properties of the resulting estimator. To this end, we generalize the state evolution framework, recently introduced for the analysis of the approximate message-passing algorithm, to the SA-Enet framework. We show that the finite-sample risk of the SA-Enet estimator is consistent with the theoretical risk predicted by the state evolution equation. Our theory suggests that the SA-Enet with an informative group or covariate structure can outperform the Lasso, Adaptive Lasso, Sparse Group Lasso, Feature-weighted Elastic-Net, and Graper. This evidence is further confirmed in our numerical studies. We also demonstrate the usefulness and superiority of our method on leukemia data from molecular biology and precision medicine.
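As a hedged illustration of the core idea (not the authors' implementation), side information can be translated into per-coefficient penalty strengths and the resulting weighted elastic net solved by proximal gradient descent. The function name, interface, and weighting convention below are all assumptions made for this sketch.

```python
import numpy as np

def sa_enet_sketch(X, y, penalty_weights, lam=1.0, alpha=0.5, n_iter=500):
    """Weighted elastic net via proximal gradient descent (illustrative).

    penalty_weights[j] scales the penalty on coefficient j: informative
    side information would assign small weights to likely-active
    predictors and large weights to likely-null ones.
    """
    n, p = X.shape
    lr = n / np.linalg.norm(X, 2) ** 2   # 1 / Lipschitz constant of the gradient
    beta = np.zeros(p)
    w = lam * np.asarray(penalty_weights, dtype=float)
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / n  # least-squares gradient
        z = beta - lr * grad
        # Proximal step: soft-threshold (L1 part), then shrink (L2 part).
        beta = np.sign(z) * np.maximum(np.abs(z) - lr * alpha * w, 0.0)
        beta /= 1.0 + lr * (1.0 - alpha) * w
    return beta
```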
Parameterized quantum circuits can be used as quantum neural networks and have the potential to outperform their classical counterparts when trained to address learning problems. To date, most results on their performance on practical problems are heuristic in nature. In particular, the convergence rate for the training of quantum neural networks is not fully understood. Here, we analyze the dynamics of gradient descent for the training error of a class of variational quantum machine learning models. We define wide quantum neural networks as parameterized quantum circuits in the limit of a large number of qubits and variational parameters. We then find a simple analytic formula that captures the average behavior of their loss function and discuss the consequences of our findings. For example, for random quantum circuits, we predict and characterize an exponential decay of the residual training error as a function of the parameters of the system. Finally, we validate our analytic results with numerical experiments.
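The abstract's analytic formula is not reproduced here; as a purely schematic rendering of the claimed behavior for random circuits, the circuit-averaged residual training error decays exponentially at a rate set by the system parameters (all symbols below are assumptions for illustration):

```latex
% Schematic only: \overline{\epsilon}(t) is the circuit-averaged residual
% training error and \gamma an effective rate determined by the number of
% qubits and variational parameters (notation assumed, not the paper's).
\[
  \overline{\epsilon}(t) \;\approx\; \overline{\epsilon}(0)\, e^{-\gamma t} .
\]
```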
This paper addresses the deconvolution problem of estimating a square-integrable probability density from observations contaminated with additive measurement errors having a known density. The estimator begins with a density estimate of the contaminated observations and minimizes a reconstruction error penalized by an integrated squared $m$-th derivative. Theory for deconvolution has mainly focused on kernel- or wavelet-based techniques, but other methods, including spline-based techniques and this smoothness-penalized estimator, have been found to outperform kernel methods in simulation studies. This paper fills in some of these gaps by establishing asymptotic guarantees for the smoothness-penalized approach. Consistency is established in mean integrated squared error, and rates of convergence are derived for Gaussian, Cauchy, and Laplace error densities, attaining lower bounds already established in the literature. The assumptions are weak for most results; in particular, the estimator can be used with a broader class of error densities than the deconvoluting kernel estimator. Our application example estimates the density of the mean cytotoxicity of certain bacterial isolates under random sampling; this mean cytotoxicity can only be measured experimentally with additive error, leading to the deconvolution problem. We also describe a method for approximating the solution by a cubic spline, which reduces the computation to a quadratic program.
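Read literally from the abstract, the estimator has the following variational form; the notation (target density $f$, known error density $e$, pilot estimate $\hat g$ of the contaminated density) is assumed for this sketch:

```latex
% f: target density; e: known error density; \hat g: density estimate of
% the contaminated observations; m-th derivative penalty with weight \lambda.
\[
  \hat f \;=\; \operatorname*{arg\,min}_{f}\;
  \int \bigl( (e * f)(x) - \hat g(x) \bigr)^{2}\,dx
  \;+\; \lambda \int \bigl( f^{(m)}(x) \bigr)^{2}\,dx .
\]
```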
The article bridges between two major paradigms in computation: the functional, at its basis computation from input to output, and the interactive, where computation reacts to its environment while underway. Central to any compositional theory of interaction is the dichotomy between a system and its environment. Concurrent games and strategies address the dichotomy in fine detail, very locally, in a distributed fashion, through distinctions between Player moves (events of the system) and Opponent moves (those of the environment). A functional approach has to handle the dichotomy much more ingeniously, through its blunter distinction between input and output. This has led to a variety of functional approaches, specialised to particular interactive demands. Through concurrent games we can more clearly see what separates and connects the differing paradigms, and show how:

* to lift functions to strategies; the "Scott order" intrinsic to concurrent games plays a key role in turning functional dependency into causal dependency;
* several paradigms of functional programming and logic arise naturally as subcategories of concurrent games, including stable domain theory; nondeterministic dataflow; geometry of interaction; the dialectica interpretation; lenses and optics; and their extensions to containers in dependent lenses and optics (a minimal lens sketch follows this list);
* to transfer enrichments of strategies (such as to probabilistic, quantum or real-number computation) to functional cases.
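Of the constructs listed, lenses admit a compact functional reading; the following minimal Python sketch shows only the get/put pair and its composition, not the concurrent-games formulation, and all names are assumptions:

```python
from dataclasses import dataclass
from typing import Callable, Generic, TypeVar

S = TypeVar("S")
A = TypeVar("A")
B = TypeVar("B")

@dataclass(frozen=True)
class Lens(Generic[S, A]):
    """A lens: paired forward (get) and backward (put) maps."""
    get: Callable[[S], A]        # view the focus inside the source
    put: Callable[[S, A], S]     # write a new focus back into the source

    def then(self, inner: "Lens[A, B]") -> "Lens[S, B]":
        # Composition: view through both lenses; write back through both.
        return Lens(
            get=lambda s: inner.get(self.get(s)),
            put=lambda s, b: self.put(s, inner.put(self.get(s), b)),
        )

# Usage: a lens onto the first component of a pair.
fst = Lens(get=lambda p: p[0], put=lambda p, a: (a, p[1]))
assert fst.get((1, "x")) == 1
assert fst.put((1, "x"), 9) == (9, "x")
```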
Approximate computing methods have shown great potential for deep learning. Owing to their reduced hardware costs, these methods are especially suitable for inference tasks on battery-operated devices constrained by their power budget. However, approximate computing has not reached its full potential due to the lack of work on training methods. In this work, we discuss training methods for approximate hardware. We demonstrate how training needs to be specialized for approximate hardware, and propose methods that speed up the training process by up to 18X.
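The abstract does not detail the proposed training methods; one common ingredient in this area, assumed here purely for illustration, is to model the approximate unit's error in the forward pass while keeping exact gradients in the backward pass (a straight-through estimator). The error model below is a toy placeholder:

```python
import torch

class ApproxMatMul(torch.autograd.Function):
    """Matmul with a multiplicative error model of an approximate
    multiplier in the forward pass; exact gradients in the backward
    pass (straight-through). Illustrative sketch only."""

    @staticmethod
    def forward(ctx, x, w, rel_err=0.01):
        ctx.save_for_backward(x, w)
        out = x @ w
        noise = 1.0 + rel_err * torch.randn_like(out)  # toy error model
        return out * noise

    @staticmethod
    def backward(ctx, grad_out):
        x, w = ctx.saved_tensors
        # Gradients of the exact matmul; None for the rel_err argument.
        return grad_out @ w.T, x.T @ grad_out, None

# Usage inside a model's forward pass: y = ApproxMatMul.apply(x, weight)
```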
This paper presents a fully multidimensional kernel-based reconstruction scheme for finite volume methods applied to systems of hyperbolic conservation laws, with a particular emphasis on the compressible Euler equations. Non-oscillatory reconstruction is achieved through an adaptive order weighted essentially non-oscillatory (WENO-AO) method cast into a form suited to multidimensional stencils and reconstruction. The reconstruction takes a kernel-based approach inspired by Gaussian process (GP) modeling, which allows the creation of a scheme of arbitrary order with simply defined multidimensional stencils and substencils. Furthermore, the fully multidimensional nature of the reconstruction allows a more straightforward extension to higher spatial dimensions and removes the need for complicated boundary conditions on intermediate quantities in modified dimension-by-dimension methods. In addition, a new simple-yet-effective set of reconstruction variables is introduced, as well as an easy-to-implement and effective limiter for positivity preservation, both of which could be useful in existing schemes with little modification. The proposed scheme is applied to a suite of stringent and informative benchmark problems to demonstrate its efficacy and utility.
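A minimal sketch of the kernel-based ingredient, under simplifying assumptions (point values on the stencil rather than the cell averages and integrated kernels a finite volume scheme would use; the squared-exponential kernel and all names are choices made for illustration):

```python
import numpy as np

def gp_reconstruct(stencil_pts, stencil_vals, query_pt, ell=1.0):
    """Kernel-based reconstruction at query_pt from a multidimensional
    stencil, via Gaussian-process prediction weights (illustrative)."""
    def k(a, b):  # squared-exponential kernel
        d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
        return np.exp(-0.5 * d2 / ell ** 2)

    K = k(stencil_pts, stencil_pts) + 1e-10 * np.eye(len(stencil_pts))
    k_star = k(query_pt[None, :], stencil_pts)      # shape (1, n)
    weights = np.linalg.solve(K, k_star.T).ravel()  # GP prediction weights
    return weights @ stencil_vals

# Usage: reconstruct at a face center from a 3x3 stencil of cell centers.
xs, ys = np.meshgrid([-1.0, 0.0, 1.0], [-1.0, 0.0, 1.0])
pts = np.column_stack([xs.ravel(), ys.ravel()])
vals = np.sin(pts[:, 0]) + np.cos(pts[:, 1])
print(gp_reconstruct(pts, vals, np.array([0.5, 0.0])))
```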
Penalized logistic regression is extremely useful for binary classification with a large number of covariates (possibly exceeding the sample size), with several real-life applications, including genomic disease classification. However, the existing methods based on the likelihood loss function are sensitive to data contamination and other noise, so robust methods are needed for stable and more accurate inference. In this paper, we propose a family of robust estimators for sparse logistic models utilizing the popular density power divergence based loss function and general adaptively weighted LASSO penalties. We study the local robustness of the proposed estimators through their influence functions and also derive their oracle properties and asymptotic distribution. With extensive empirical illustrations, we demonstrate the significantly improved performance of our proposed estimators over the existing ones, with particular gains in robustness. Our proposal is finally applied to analyse four different real datasets for cancer classification, obtaining robust and accurate models that simultaneously perform gene selection and patient classification.
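For the Bernoulli model underlying logistic regression, the density power divergence (DPD) loss with tuning parameter $\alpha > 0$ has a standard closed form; a hedged numpy sketch of the penalized objective (names and the penalty convention are assumptions):

```python
import numpy as np

def dpd_logistic_loss(beta, X, y, alpha=0.5, lam=0.1, pen_w=None):
    """Penalized density-power-divergence objective for sparse logistic
    regression (illustrative; as alpha -> 0 the DPD term recovers the
    negative log-likelihood up to additive constants)."""
    pi = 1.0 / (1.0 + np.exp(-X @ beta))    # P(y = 1 | x)
    f_obs = np.where(y == 1, pi, 1.0 - pi)  # model pmf at observed y
    dpd = (pi ** (1 + alpha) + (1 - pi) ** (1 + alpha)
           - (1 + 1 / alpha) * f_obs ** alpha).mean()
    if pen_w is None:                        # adaptive LASSO weights
        pen_w = np.ones_like(beta)
    return dpd + lam * np.sum(pen_w * np.abs(beta))
```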
With the recent surge of deep learning in scientific computing, the Physics-Informed Neural Networks (PINNs) method has drawn widespread attention for solving Partial Differential Equations (PDEs). Compared to traditional methods, PINNs can efficiently handle high-dimensional problems, but their accuracy is relatively low, especially for highly irregular problems. Inspired by the ideas of adaptive finite element methods and incremental learning, we propose GAS, a Gaussian-mixture-distribution-based adaptive sampling method for PINNs. During training, GAS uses the current residual information to generate a Gaussian mixture distribution for sampling additional points, which are then trained together with historical data to speed up the convergence of the loss and achieve higher accuracy. Several numerical simulations on 2D and 10D problems show that GAS is a promising method that achieves state-of-the-art accuracy among deep solvers, while being comparable with traditional numerical solvers.
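A minimal sketch of one GAS-style resampling step, assuming a set of candidate points and their PDE residuals are available (sklearn's GaussianMixture is used for the mixture fit; since its fit has no sample-weight argument, residual-proportional resampling stands in for weighting):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def gas_resample(candidates, residuals, n_new=200, n_components=4):
    """One adaptive sampling step (illustrative): fit a Gaussian mixture
    to candidate points weighted by residual magnitude, then draw
    additional collocation points from it."""
    w = np.abs(residuals)
    w = w / w.sum()
    idx = np.random.choice(len(candidates), size=len(candidates), p=w)
    gmm = GaussianMixture(n_components=n_components).fit(candidates[idx])
    new_pts, _ = gmm.sample(n_new)  # points to add to the training set
    return new_pts
```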
Recent developments in applications of artificial neural networks with over $n=10^{14}$ parameters make it extremely important to study the large-$n$ behavior of such networks. Most works studying wide neural networks have focused on the infinite-width limit $n \to +\infty$ of such networks and have shown that, at initialization, they correspond to Gaussian processes. In this work we study their behavior for large but finite $n$. Our main contributions are the following: (1) The computation of the corrections to Gaussianity in terms of an asymptotic series in $n^{-\frac{1}{2}}$. The coefficients in this expansion are determined by the statistics of parameter initialization and by the activation function. (2) Controlling the evolution of the outputs of finite-width networks during training by computing deviations from the limiting infinite-width case (in which the network evolves through a linear flow). This improves previous estimates and yields sharper decay rates for the finite-width NTK in terms of $n$, valid during the entire training procedure. As a corollary, we also prove that, with arbitrarily high probability, the training of sufficiently wide neural networks converges to a global minimum of the corresponding quadratic loss function. (3) Estimating how the deviations from Gaussianity evolve with training in terms of $n$. In particular, using a certain metric on the space of measures, we find that, along training, the resulting measure is within $n^{-\frac{1}{2}}(\log n)^{1+}$ of the time-dependent Gaussian process corresponding to the infinite-width network (which is explicitly given by precomposing the initial Gaussian process with the linear flow corresponding to training in the infinite-width limit).
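Item (1) can be rendered schematically as follows; the form of the expansion is stated in the abstract, while the symbols (network output $z$, test function $F$, coefficients $c_k$) are assumptions for this sketch:

```latex
% Schematic finite-width expansion: the leading term is the infinite-width
% Gaussian process, and the coefficients c_k are determined by the
% initialization statistics and the activation function.
\[
  \mathbb{E}\bigl[ F(z) \bigr]
  \;=\; \mathbb{E}_{\mathrm{GP}}\bigl[ F(z) \bigr]
  \;+\; \sum_{k \ge 1} n^{-\frac{k}{2}}\, c_k(F) .
\]
```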
We hypothesize that, due to the greedy nature of learning in multi-modal deep neural networks, these models tend to rely on just one modality while under-fitting the others. Such behavior is counter-intuitive and hurts the models' generalization, as we observe empirically. To estimate a model's dependence on each modality, we compute the gain in accuracy when the model has access to that modality in addition to another one. We refer to this gain as the conditional utilization rate. In our experiments, we consistently observe an imbalance in conditional utilization rates between modalities, across multiple tasks and architectures. Since the conditional utilization rate cannot be computed efficiently during training, we introduce a proxy for it based on the pace at which the model learns from each modality, which we refer to as the conditional learning speed. We propose an algorithm to balance the conditional learning speeds between modalities during training and demonstrate that it indeed addresses the issue of greedy learning. The proposed algorithm improves the model's generalization on three datasets: Colored MNIST, Princeton ModelNet40, and NVIDIA Dynamic Hand Gesture.
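The conditional utilization rate follows directly from held-out accuracies; a small sketch using the abstract's definition (function and variable names are assumptions):

```python
def conditional_utilization_rates(acc_both, acc_m1, acc_m2):
    """u(m1|m2) and u(m2|m1): the accuracy gained by adding each
    modality on top of the other."""
    u1 = acc_both - acc_m2  # gain from adding modality 1 to modality 2
    u2 = acc_both - acc_m1  # gain from adding modality 2 to modality 1
    return u1, u2

# Hypothetical example: multi-modal accuracy 0.91, modality-1-only 0.88,
# modality-2-only 0.70 -> u1 = 0.21, u2 = 0.03: the model leans heavily
# on modality 1, the imbalance the proposed algorithm aims to correct.
print(conditional_utilization_rates(0.91, 0.88, 0.70))
```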
With the rapid increase of large-scale, real-world datasets, it becomes critical to address the problem of long-tailed data distribution (i.e., a few classes account for most of the data, while most classes are under-represented). Existing solutions typically adopt class re-balancing strategies such as re-sampling and re-weighting based on the number of observations for each class. In this work, we argue that as the number of samples increases, the additional benefit of a newly added data point will diminish. We introduce a novel theoretical framework to measure data overlap by associating with each sample a small neighboring region rather than a single point. The effective number of samples is defined as the volume of samples and can be calculated by a simple formula $(1-\beta^{n})/(1-\beta)$, where $n$ is the number of samples and $\beta \in [0,1)$ is a hyperparameter. We design a re-weighting scheme that uses the effective number of samples for each class to re-balance the loss, thereby yielding a class-balanced loss. Comprehensive experiments are conducted on artificially induced long-tailed CIFAR datasets and large-scale datasets including ImageNet and iNaturalist. Our results show that when trained with the proposed class-balanced loss, the network is able to achieve significant performance gains on long-tailed datasets.
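The weighting implied by the abstract's formula can be written in a few lines; normalizing the weights to sum to the number of classes is a common convention assumed here:

```python
import numpy as np

def class_balanced_weights(samples_per_class, beta=0.999):
    """Per-class weights from the effective number of samples
    E_n = (1 - beta**n) / (1 - beta), inverted and normalized."""
    n = np.asarray(samples_per_class, dtype=float)
    effective_num = (1.0 - np.power(beta, n)) / (1.0 - beta)
    w = 1.0 / effective_num
    return w * len(n) / w.sum()

# Usage: a long-tailed 3-class problem; the head class gets a small
# weight and the tail class a large one.
print(class_balanced_weights([5000, 500, 50], beta=0.999))
```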