
Principal components computed via PCA (principal component analysis) are traditionally used to reduce dimensionality in genomic data or to correct for population stratification. In this paper, we explore the penalized eigenvalue problem (PEP), which reformulates the computation of the first eigenvector as an optimization problem and adds an L1 penalty constraint. The contribution of our article is threefold. First, we extend PEP by applying Nesterov smoothing to the original LASSO-type L1 penalty. This allows one to compute analytical gradients, which enable faster and more efficient minimization of the objective function associated with the optimization problem. Second, we demonstrate how higher-order eigenvectors can be calculated with PEP using established results from singular value decomposition (SVD). Third, using data from the 1000 Genomes Project, we empirically demonstrate that our proposed smoothed PEP increases numerical stability and yields meaningful eigenvectors. We further investigate the utility of the penalized eigenvector approach over traditional PCA.
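
As a rough illustration of the penalized eigenvalue idea (not the authors' implementation), the sketch below computes a sparse leading eigenvector of a covariance matrix by projected gradient ascent on x^T A x - lambda * ||x||_1, with the L1 term replaced by a Huber-style smooth surrogate (one common outcome of applying Nesterov smoothing to the absolute value) so that gradients exist everywhere. All function names and parameter values are illustrative.

```python
import numpy as np

def smoothed_l1(x, mu):
    """Huber-style smoothing of |x|: quadratic near zero, linear elsewhere (for monitoring the objective)."""
    a = np.abs(x)
    return np.where(a <= mu, x**2 / (2 * mu), a - mu / 2)

def smoothed_l1_grad(x, mu):
    """Gradient of the smoothed absolute value, clipped to [-1, 1]."""
    return np.clip(x / mu, -1.0, 1.0)

def penalized_leading_eigvec(A, lam=0.1, mu=0.01, lr=0.05, iters=2000):
    """Projected gradient ascent on x^T A x - lam * sum(smoothed |x_i|) over the unit sphere."""
    rng = np.random.default_rng(0)
    x = rng.normal(size=A.shape[0])
    x /= np.linalg.norm(x)
    for _ in range(iters):
        grad = 2 * A @ x - lam * smoothed_l1_grad(x, mu)
        x = x + lr * grad
        x /= np.linalg.norm(x)  # project back onto the unit sphere
    return x

# toy usage: sparse leading eigenvector of a sample covariance matrix
X = np.random.default_rng(1).normal(size=(200, 10))
A = np.cov(X, rowvar=False)
v = penalized_leading_eigvec(A, lam=0.5)
```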

Related content

In statistics, principal component analysis (PCA) is a method for projecting data from a higher-dimensional space into a lower-dimensional space while maximizing the variance along each retained dimension. Given a collection of points in two, three, or more dimensions, a "best-fitting" line can be defined as the line that minimizes the average squared distance from the points to the line. The next best-fitting line can be chosen in the same way from the directions perpendicular to the first line. Repeating this process produces an orthogonal basis in which the individual dimensions of the data are uncorrelated. These basis vectors are called the principal components.
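
A minimal numpy sketch of this construction, using the standard equivalence between the principal components and the right singular vectors of the centered data matrix (function names are illustrative):

```python
import numpy as np

def principal_components(X, k=2):
    """Return the first k principal directions and the data projected onto them."""
    Xc = X - X.mean(axis=0)                     # center each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]                         # rows are orthonormal principal directions
    scores = Xc @ components.T                  # coordinates of the data in the new basis
    return components, scores

X = np.random.default_rng(0).normal(size=(100, 5))
dirs, proj = principal_components(X, k=2)
```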

Neural network models have achieved high performance on a wide variety of complex tasks, but the algorithms that they implement are notoriously difficult to interpret. In order to understand these algorithms, it is often necessary to hypothesize intermediate variables involved in the network's computation. For example, does a language model depend on particular syntactic properties when generating a sentence? However, existing analysis tools make it difficult to test hypotheses of this type. We propose a new analysis technique -- circuit probing -- that automatically uncovers low-level circuits that compute hypothesized intermediate variables. This enables causal analysis through targeted ablation at the level of model parameters. We apply this method to models trained on simple arithmetic tasks, demonstrating its effectiveness at (1) deciphering the algorithms that models have learned, (2) revealing modular structure within a model, and (3) tracking the development of circuits over training. We compare circuit probing to other methods across these three experiments, and find it to be on par with or more effective than existing analysis methods. Finally, we demonstrate circuit probing on a real-world use case, uncovering circuits that are responsible for subject-verb agreement and reflexive anaphora in GPT2-Small and Medium.

Gaussian processes are frequently deployed as part of larger machine learning and decision-making systems, for instance in geospatial modeling, Bayesian optimization, or in latent Gaussian models. Within a system, the Gaussian process model needs to perform in a stable and reliable manner to ensure it interacts correctly with other parts of the system. In this work, we study the numerical stability of scalable sparse approximations based on inducing points. To do so, we first review numerical stability, and illustrate typical situations in which Gaussian process models can be unstable. Building on stability theory originally developed in the interpolation literature, we derive sufficient and in certain cases necessary conditions on the inducing points for the computations performed to be numerically stable. For low-dimensional tasks such as geospatial modeling, we propose an automated method for computing inducing points satisfying these conditions. This is done via a modification of the cover tree data structure, which is of independent interest. We additionally propose an alternative sparse approximation for regression with a Gaussian likelihood which trades off a small amount of performance to further improve stability. We provide illustrative examples showing the relationship between stability of calculations and predictive performance of inducing point methods on spatial tasks.
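
As a hedged illustration of the kind of instability discussed here (not the paper's own diagnostic), the sketch below compares the condition number of the inducing-point Gram matrix K_uu for clustered versus well-spread inducing points under an RBF kernel; a very large condition number is a standard warning sign that Cholesky-based computations in sparse GP code may lose precision or fail. All names and values are illustrative.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel matrix between row-vector sets A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def inducing_matrix_condition(Z, lengthscale=1.0, jitter=1e-6):
    """Condition number of the (jittered) inducing-point Gram matrix K_uu."""
    Kuu = rbf_kernel(Z, Z, lengthscale) + jitter * np.eye(len(Z))
    return np.linalg.cond(Kuu)

# nearly duplicated inducing points are far worse conditioned than well-separated ones
Z_bad = np.full((20, 2), 0.5) + 1e-4 * np.random.default_rng(0).normal(size=(20, 2))
Z_good = np.random.default_rng(1).uniform(size=(20, 2))
print(inducing_matrix_condition(Z_bad), inducing_matrix_condition(Z_good))
```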

Model fusion is becoming a crucial component in the context of model-as-a-service scenarios, enabling the delivery of high-quality model services to local users. However, this approach introduces privacy risks and imposes certain limitations on its applications. Ensuring secure model exchange and knowledge fusion among users becomes a significant challenge in this setting. To tackle this issue, we propose PrivFusion, a novel architecture that preserves privacy while facilitating model fusion under the constraints of local differential privacy. PrivFusion leverages a graph-based structure, enabling the fusion of models from multiple parties without necessitating retraining. By employing randomized mechanisms, PrivFusion ensures privacy guarantees throughout the fusion process. To enhance model privacy, our approach incorporates a hybrid local differentially private mechanism and decentralized federated graph matching, effectively protecting both activation values and weights. Additionally, we introduce a perturbation filter adapter to alleviate the impact of randomized noise, thereby preserving the utility of the fused model. Through extensive experiments conducted on diverse image datasets and real-world healthcare applications, we provide empirical evidence showcasing the effectiveness of PrivFusion in maintaining model performance while preserving privacy. Our contributions offer valuable insights and practical solutions for secure and collaborative data analysis within the domain of privacy-preserving model fusion.
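
For readers unfamiliar with local differential privacy, the sketch below shows a generic Laplace mechanism applied to a weight vector. It is only a standard building block, not PrivFusion's hybrid mechanism, perturbation filter adapter, or graph-matching procedure; the parameter values are illustrative.

```python
import numpy as np

def laplace_perturb(weights, epsilon, sensitivity):
    """Classic Laplace mechanism: add noise with scale sensitivity/epsilon to each value.

    This is a generic local-DP primitive, not the mechanism proposed in the paper."""
    scale = sensitivity / epsilon
    noise = np.random.default_rng().laplace(0.0, scale, size=weights.shape)
    return weights + noise

w = np.random.default_rng(0).normal(size=(256,))
w_private = laplace_perturb(w, epsilon=1.0, sensitivity=0.1)
```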

Model explainability is crucial for human users to be able to interpret how a proposed classifier assigns labels to data based on its feature values. We study generalized linear models constructed using sets of feature value rules, which can capture nonlinear dependencies and interactions. An inherent trade-off exists between rule set sparsity and its prediction accuracy. It is computationally expensive to find the right choice of sparsity -- e.g., via cross-validation -- with existing methods. We propose a new formulation to learn an ensemble of rule sets that simultaneously addresses these competing factors. Good generalization is ensured while keeping computational costs low by utilizing distributionally robust optimization. The formulation utilizes column generation to efficiently search the space of rule sets and constructs a sparse ensemble of rule sets, in contrast with techniques like random forests or boosting and their variants. We present theoretical results that motivate and justify the use of our distributionally robust formulation. Extensive numerical experiments establish that our method improves over competing methods -- on a large set of publicly available binary classification problem instances -- with respect to one or more of the following metrics: generalization quality, computational cost, and explainability.
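
The sketch below illustrates only the underlying representation: a handful of hand-written rules are turned into binary features for an L1-penalized logistic regression. It is a simple stand-in, not the paper's distributionally robust, column-generation formulation, and the rules and parameters are invented for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# each rule is a conjunction of (feature index, threshold, direction) conditions
rules = [
    [(0, 0.5, ">"), (2, 1.0, "<=")],
    [(1, 0.0, ">")],
]

def apply_rules(X, rules):
    """Turn every rule into one binary column: 1 when all of its conditions hold."""
    cols = []
    for rule in rules:
        col = np.ones(len(X), dtype=bool)
        for feat, thr, op in rule:
            col &= (X[:, feat] > thr) if op == ">" else (X[:, feat] <= thr)
        cols.append(col)
    return np.column_stack(cols).astype(float)

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = ((X[:, 0] > 0.5) & (X[:, 2] <= 1.0)).astype(int)
# the L1 penalty keeps the rule ensemble sparse (a crude stand-in for the DRO objective)
clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.5).fit(apply_rules(X, rules), y)
```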

Given imbalanced data, it is hard to train a good classifier using deep learning because of the poor generalization of minority classes. Traditionally, the well-known synthetic minority oversampling technique (SMOTE) for data augmentation, a data mining approach for imbalanced learning, has been used to improve this generalization. However, it is unclear whether SMOTE also benefits deep learning. In this work, we study why the original SMOTE is insufficient for deep learning, and enhance SMOTE using soft labels. Connecting the resulting soft SMOTE with Mixup, a modern data augmentation technique, leads to a unified framework that puts traditional and modern data augmentation techniques under the same umbrella. A careful study within this framework shows that Mixup improves generalization by implicitly achieving uneven margins between majority and minority classes. We then propose a novel margin-aware Mixup technique that more explicitly achieves uneven margins. Extensive experimental results demonstrate that our proposed technique yields state-of-the-art performance on deep imbalanced classification and is particularly effective on extremely imbalanced data. The code is open-sourced in our package https://github.com/ntucllab/imbalanced-DL to foster future research in this direction.
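
A minimal sketch of standard Mixup with soft labels, which is the generic building block that the framework above connects to soft-label SMOTE (this is not the proposed margin-aware variant, and all names and values are illustrative):

```python
import numpy as np

def mixup_batch(X, Y, alpha=0.2, rng=None):
    """Standard Mixup: convex-combine inputs and (soft) labels with a Beta-distributed weight.

    Y holds one-hot or soft labels; the mixed labels are themselves soft."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    idx = rng.permutation(len(X))
    X_mix = lam * X + (1 - lam) * X[idx]
    Y_mix = lam * Y + (1 - lam) * Y[idx]
    return X_mix, Y_mix

X = np.random.default_rng(0).normal(size=(32, 10))
Y = np.eye(3)[np.random.default_rng(1).integers(0, 3, size=32)]
Xm, Ym = mixup_batch(X, Y)
```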

Using persistent homology to guide optimization has emerged as a novel application of topological data analysis. Existing methods treat persistence calculation as a black box and backpropagate gradients only onto the simplices involved in particular pairs. We show how the cycles and chains used in the persistence calculation can be used to prescribe gradients to larger subsets of the domain. In particular, we show that in a special case, which serves as a building block for general losses, the problem can be solved exactly in linear time. This relies on another contribution of this paper, which eliminates the need to examine a factorial number of permutations of simplices with the same value. We present empirical experiments that show the practical benefits of our algorithm: the number of steps required for the optimization is reduced by an order of magnitude.

Triple extraction is an essential task in information extraction for natural language processing and knowledge graph construction. In this paper, we revisit end-to-end triple extraction as a sequence generation task. Since generative triple extraction may struggle to capture long-term dependencies and can generate unfaithful triples, we introduce a novel model: contrastive triple extraction with a generative transformer. Specifically, we introduce a single shared transformer module for encoder-decoder-based generation. To generate faithful results, we propose a novel triplet contrastive training objective. Moreover, we introduce two mechanisms to further improve model performance, namely batch-wise dynamic attention masking and triple-wise calibration. Experimental results on three datasets (NYT, WebNLG, and MIE) show that our approach outperforms the baselines.

Graph neural networks (GNNs) are a popular class of machine learning models whose major advantage is their ability to incorporate a sparse and discrete dependency structure between data points. Unfortunately, GNNs can only be used when such a graph-structure is available. In practice, however, real-world graphs are often noisy and incomplete or might not be available at all. With this work, we propose to jointly learn the graph structure and the parameters of graph convolutional networks (GCNs) by approximately solving a bilevel program that learns a discrete probability distribution on the edges of the graph. This allows one to apply GCNs not only in scenarios where the given graph is incomplete or corrupted but also in those where a graph is not available. We conduct a series of experiments that analyze the behavior of the proposed method and demonstrate that it outperforms related methods by a significant margin.
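
To make the setup concrete, the sketch below parameterizes a graph as independent Bernoulli edge probabilities, samples an adjacency matrix, and runs one standard GCN propagation step. The bilevel training loop of the proposed method is omitted, and all shapes and names are illustrative.

```python
import numpy as np

def sample_graph(edge_logits, rng):
    """Sample a symmetric adjacency matrix from independent Bernoulli edge probabilities."""
    probs = 1.0 / (1.0 + np.exp(-edge_logits))
    A = (rng.uniform(size=probs.shape) < probs).astype(float)
    A = np.triu(A, 1)
    return A + A.T

def gcn_layer(A, X, W):
    """One graph-convolution propagation step with symmetric normalization and ReLU."""
    A_hat = A + np.eye(len(A))                  # add self-loops
    D_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(1)))
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ X @ W, 0.0)

rng = np.random.default_rng(0)
n, f, h = 8, 5, 4
edge_logits = rng.normal(size=(n, n))           # learnable parameters in the full method
A = sample_graph(edge_logits, rng)
H = gcn_layer(A, rng.normal(size=(n, f)), rng.normal(size=(f, h)))
```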

It is important to detect anomalous inputs when deploying machine learning systems. The use of larger and more complex inputs in deep learning magnifies the difficulty of distinguishing between anomalous and in-distribution examples. At the same time, diverse image and text data are available in enormous quantities. We propose leveraging these data to improve deep anomaly detection by training anomaly detectors against an auxiliary dataset of outliers, an approach we call Outlier Exposure (OE). This enables anomaly detectors to generalize and detect unseen anomalies. In extensive experiments on natural language processing and small- and large-scale vision tasks, we find that Outlier Exposure significantly improves detection performance. We also observe that cutting-edge generative models trained on CIFAR-10 may assign higher likelihoods to SVHN images than to CIFAR-10 images; we use OE to mitigate this issue. We also analyze the flexibility and robustness of Outlier Exposure, and identify characteristics of the auxiliary dataset that improve performance.
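
A minimal sketch of the Outlier Exposure objective for classification: cross-entropy on in-distribution data plus a term that pushes predictions on auxiliary outliers toward the uniform distribution. Function names and the weighting value are illustrative, not taken from the paper's code.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def outlier_exposure_loss(logits_in, labels_in, logits_out, lam=0.5):
    """Cross-entropy on in-distribution samples plus cross-entropy of outlier
    predictions against the uniform distribution over classes."""
    p_in = softmax(logits_in)
    ce_in = -np.log(p_in[np.arange(len(labels_in)), labels_in]).mean()
    p_out = softmax(logits_out)
    ce_uniform = -(np.log(p_out).mean(axis=1)).mean()   # H(uniform, p) = -(1/k) * sum_j log p_j
    return ce_in + lam * ce_uniform

rng = np.random.default_rng(0)
loss = outlier_exposure_loss(rng.normal(size=(16, 10)),
                             rng.integers(0, 10, size=16),
                             rng.normal(size=(16, 10)))
```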

Aspect-based sentiment analysis (ABSA) can provide more detailed information than general sentiment analysis because it aims to predict the sentiment polarities of given aspects or entities in text. We summarize previous approaches into two subtasks: aspect-category sentiment analysis (ACSA) and aspect-term sentiment analysis (ATSA). Most previous approaches employ long short-term memory networks and attention mechanisms to predict the sentiment polarity of the targets of interest, which are often complicated and require more training time. We propose a model based on convolutional neural networks and gating mechanisms that is more accurate and efficient. First, the novel Gated Tanh-ReLU units selectively output sentiment features according to the given aspect or entity; this architecture is much simpler than the attention layers used in existing models. Second, the computations of our model are easily parallelized during training, because convolutional layers do not have the time dependency of LSTM layers, and the gating units also work independently. Experiments on SemEval datasets demonstrate the efficiency and effectiveness of our models.
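
A simplified sketch of the gating idea: a tanh branch produces sentiment features and a ReLU branch, conditioned on the aspect embedding, gates which of them pass. In the paper the two branches use separate convolution filters; here, for brevity, both read from the same feature map, and all shapes and names are illustrative.

```python
import numpy as np

def gated_tanh_relu(conv_feats, aspect_vec, V_a):
    """Gated Tanh-ReLU unit: sentiment features gated by an aspect-conditioned ReLU branch."""
    s = np.tanh(conv_feats)                              # sentiment features
    a = np.maximum(conv_feats + aspect_vec @ V_a, 0.0)   # aspect-conditioned gate
    return s * a

rng = np.random.default_rng(0)
conv_feats = rng.normal(size=(20, 64))    # one row per convolution window over the sentence
aspect_vec = rng.normal(size=(32,))       # embedding of the given aspect
V_a = rng.normal(size=(32, 64))           # projects the aspect into the gate's feature space
gated = gated_tanh_relu(conv_feats, aspect_vec, V_a)
```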
