
In this paper, a new feature selection algorithm, called SFE (Simple, Fast, and Efficient), is proposed for high-dimensional datasets. The SFE algorithm performs its search process using a search agent and two operators: non-selection and selection. It comprises two phases: exploration and exploitation. In the exploration phase, the non-selection operator performs a global search over the entire problem space for irrelevant, redundant, trivial, and noisy features, and changes the status of such features from selected to non-selected. In the exploitation phase, the selection operator searches the problem space for features with a high impact on the classification results, and changes the status of such features from non-selected to selected. The proposed SFE is successful at feature selection from high-dimensional datasets; however, once it has reduced the dimensionality of a dataset, its performance cannot be improved significantly by further search. In these situations, an evolutionary computation method can be used to find a more efficient subset of features in the new, reduced search space. To overcome this issue, this paper also proposes a hybrid algorithm, SFE-PSO (SFE combined with particle swarm optimization), to find an optimal feature subset. The efficiency and effectiveness of SFE and SFE-PSO for feature selection are evaluated on 40 high-dimensional datasets and compared with six recently proposed feature selection algorithms. The results indicate that the two proposed algorithms significantly outperform the other algorithms and can serve as efficient and effective methods for selecting features from high-dimensional datasets.
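
A minimal sketch of the search loop described above, assuming a binary feature mask, a simple k-NN cross-validation fitness, and an illustrative exploration/exploitation split; the operator rates and schedule are assumptions for illustration, not the authors' exact implementation.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def fitness(mask, X, y):
    # Classification accuracy of a simple classifier on the selected features.
    if mask.sum() == 0:
        return 0.0
    clf = KNeighborsClassifier(n_neighbors=5)
    return cross_val_score(clf, X[:, mask == 1], y, cv=5).mean()

def sfe_search(X, y, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    mask = np.ones(n_features, dtype=int)             # start with all features selected
    best_fit = fitness(mask, X, y)
    for t in range(iters):
        cand = mask.copy()
        if t < iters // 2:                             # exploration: non-selection operator
            selected = np.flatnonzero(cand == 1)
            if len(selected) > 1:
                drop = rng.choice(selected, size=max(1, len(selected) // 10), replace=False)
                cand[drop] = 0
        else:                                          # exploitation: selection operator
            unselected = np.flatnonzero(cand == 0)
            if len(unselected) > 0:
                cand[rng.choice(unselected)] = 1
        cand_fit = fitness(cand, X, y)
        if cand_fit >= best_fit:                       # keep the candidate if it is no worse
            mask, best_fit = cand, cand_fit
    return mask, best_fit
```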

Related content

Feature Selection, also known as Feature Subset Selection (FSS) or Attribute Selection, refers to choosing N features from an existing set of M features so as to optimize a specific criterion of the system. It is the process of selecting the most effective features from the original feature set in order to reduce the dimensionality of the dataset. It is an important means of improving the performance of learning algorithms and a key data preprocessing step in pattern recognition. For a learning algorithm, good training samples are the key to training the model.
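
A small illustration of this definition: selecting N of the original M features according to a chosen criterion. The dataset, the univariate ANOVA F-score filter, and the value of N are arbitrary choices for the example, not part of the definition above.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_breast_cancer(return_X_y=True)           # M = 30 original features
selector = SelectKBest(score_func=f_classif, k=10)   # keep N = 10 features
X_reduced = selector.fit_transform(X, y)
print(X.shape, "->", X_reduced.shape)                 # (569, 30) -> (569, 10)
```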

AI systems are not intrinsically neutral, and bias can creep into any technological tool. In particular, when dealing with people, AI algorithms reflect technical errors that originate in mislabeled data, and because these systems are not systematically guarded against bias, they produce wrong and discriminatory classifications that perpetuate structural racism and marginalization. In this article, we consider the problem of bias in AI systems from the point of view of information quality dimensions. We illustrate potential improvements of a bias mitigation tool on gender classification errors, referring to two typically difficult contexts: the classification of non-binary individuals and the classification of transgender individuals. Identifying the data quality dimensions to implement in a bias mitigation tool may help achieve greater fairness. Hence, we propose to frame this issue in terms of completeness, consistency, timeliness, and reliability, and we offer some theoretical results.
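
A toy, assumed illustration of two of the information quality dimensions named above (completeness and consistency) computed on a small gender-label table; the column names, values, and rules are hypothetical and only meant to make the dimensions concrete.

```python
import pandas as pd

df = pd.DataFrame({
    "self_reported_gender": ["woman", "non-binary", None, "man", "trans man"],
    "model_predicted_gender": ["woman", "man", "woman", "man", "woman"],
})

# Completeness: share of records with a usable self-reported label.
completeness = df["self_reported_gender"].notna().mean()

# Consistency: share of labeled records where the prediction agrees with the
# self-reported label (a crude proxy for classification error).
labeled = df.dropna(subset=["self_reported_gender"])
consistency = (labeled["model_predicted_gender"] == labeled["self_reported_gender"]).mean()

print(f"completeness={completeness:.2f}, consistency={consistency:.2f}")
```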

We develop the first active learning method in the predict-then-optimize framework. Specifically, we develop a learning method that sequentially decides whether to request the "labels" of feature samples from an unlabeled data stream, where the labels correspond to the parameters of an optimization model for decision-making. Our active learning method is the first to be directly informed by the decision error induced by the predicted parameters, which is referred to as the Smart Predict-then-Optimize (SPO) loss. Motivated by the structure of the SPO loss, our algorithm adopts a margin-based criterion utilizing the concept of distance to degeneracy and minimizes a tractable surrogate of the SPO loss on the collected data. In particular, we develop an efficient active learning algorithm with both hard and soft rejection variants, each with theoretical excess risk (i.e., generalization) guarantees. We further derive bounds on the label complexity, which refers to the number of samples whose labels are acquired to achieve a desired small level of SPO risk. Under some natural low-noise conditions, we show that these bounds can be better than the naive supervised learning approach that labels all samples. Furthermore, when using the SPO+ loss function, a specialized surrogate of the SPO loss, we derive a significantly smaller label complexity under separability conditions. We also present numerical evidence showing the practical value of our proposed algorithms in the settings of personalized pricing and the shortest path problem.
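
A hedged sketch of the margin-based query rule described above: request the true cost vector only when the predicted cost is close to degeneracy, i.e. when the best and second-best decisions are nearly tied. The small enumerable decision set and the gap-based proxy for distance to degeneracy are simplifying assumptions for illustration, not the paper's formal definition.

```python
import numpy as np

def distance_to_degeneracy_proxy(c_hat, decisions):
    # decisions: array of shape (n_decisions, dim); each row is a feasible decision.
    values = decisions @ c_hat                        # objective value of each decision
    best, second = np.sort(values)[:2]                # two smallest (best) objective values
    return (second - best) / (np.linalg.norm(c_hat) + 1e-12)

def should_query(c_hat, decisions, margin=0.1):
    # Hard-rejection variant: request the label only when the predicted cost
    # vector lies within the margin of the degenerate set.
    return distance_to_degeneracy_proxy(c_hat, decisions) < margin
```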

High-dimensional linear regression under heavy-tailed noise or outlier corruption is challenging, both computationally and statistically. Convex approaches have been proven statistically optimal but suffer from high computational costs, especially since the robust loss functions are usually non-smooth. More recently, computationally fast non-convex approaches via sub-gradient descent have been proposed, which, unfortunately, fail to deliver a statistically consistent estimator even under sub-Gaussian noise. In this paper, we introduce a projected sub-gradient descent algorithm for both the sparse linear regression and low-rank linear regression problems. The algorithm is not only computationally efficient with linear convergence but also statistically optimal, whether the noise is Gaussian or heavy-tailed with a finite (1 + epsilon)-th moment. The convergence theory is established for a general framework, and its specific applications to the absolute loss, Huber loss, and quantile loss are investigated. Compared with existing non-convex methods, ours reveals a surprising phenomenon of two-phase convergence. In phase one, the algorithm behaves as in typical non-smooth optimization that requires gradually decaying stepsizes. However, phase one only delivers a statistically sub-optimal estimator, a limitation already observed in the existing literature. Interestingly, during phase two, the algorithm converges linearly as if minimizing a smooth and strongly convex objective function, and thus a constant stepsize suffices. Underlying the phase-two convergence is the smoothing effect of random noise on the non-smooth robust losses in an area close, but not too close, to the truth. Numerical simulations confirm our theoretical discovery and showcase the superiority of our algorithm over prior methods.
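
A minimal sketch of the kind of projected sub-gradient scheme described above, for sparse regression with the absolute (L1) loss: a sub-gradient step followed by projection onto the set of s-sparse vectors via hard thresholding. The step sizes and the phase-one/phase-two switch are illustrative assumptions rather than the paper's tuned schedule.

```python
import numpy as np

def hard_threshold(beta, s):
    # Keep the s largest-magnitude coordinates, zero out the rest.
    out = np.zeros_like(beta)
    idx = np.argsort(np.abs(beta))[-s:]
    out[idx] = beta[idx]
    return out

def robust_sparse_regression(X, y, s, n_iters=500, eta0=0.5):
    n, p = X.shape
    beta = np.zeros(p)
    for t in range(n_iters):
        residual = X @ beta - y
        subgrad = X.T @ np.sign(residual) / n         # sub-gradient of mean |residual|
        # Phase one: decaying step size; phase two: constant step size.
        eta = eta0 / np.sqrt(t + 1) if t < n_iters // 2 else eta0 / np.sqrt(n_iters // 2)
        beta = hard_threshold(beta - eta * subgrad, s)
    return beta
```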

In the last decade, most research in Machine Learning contributed to the improvement of existing models, with the aim of increasing the performance of neural networks for the solution of a variety of different tasks. However, such advancements often come at the cost of increased model memory and computational requirements. This represents a significant limitation for the deployability of research output in realistic settings, where the cost, the energy consumption, and the complexity of the framework play a crucial role. To solve this issue, the designer should search for models that maximise performance while limiting the model's footprint. Typical approaches to reach this goal rely either on manual procedures, which cannot guarantee the optimality of the final design, or upon Neural Architecture Search algorithms to automatise the process, at the expense of extremely high computational time. This paper provides a solution for the fast identification of a neural network that maximises model accuracy while respecting the size and computational constraints typical of tiny devices. Our approach, named FreeREA, is a custom cell-based evolutionary NAS algorithm that exploits an optimised combination of training-free metrics to rank architectures during the search, thus without the need for model training. Our experiments, carried out on the common benchmarks NAS-Bench-101 and NATS-Bench, demonstrate that i) FreeREA is a fast, efficient, and effective search method for automatic model design; ii) it outperforms State of the Art training-based and training-free techniques on all the datasets and benchmarks considered; and iii) it can easily generalise to constrained scenarios, representing a competitive solution for fast Neural Architecture Search in generic constrained applications. The code is available at \url{//github.com/NiccoloCavagnero/FreeREA}.
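
A hedged sketch of the cell-based, training-free evolutionary search idea: a regularized-evolution loop that ranks candidates with a training-free score instead of trained accuracy. The architecture encoding, mutation rule, and score function below are placeholders, not FreeREA's actual search space or metrics.

```python
import random
from collections import deque

def training_free_score(arch):
    # Placeholder for a combination of training-free proxies
    # (the real method combines metrics computed without any training).
    return -abs(sum(arch) - 10)

def mutate(arch):
    child = list(arch)
    i = random.randrange(len(child))
    child[i] = random.randint(0, 4)                   # assumed 5 candidate operations per edge
    return tuple(child)

def evolve(arch_len=6, population_size=20, cycles=200, sample_size=5, seed=0):
    random.seed(seed)
    population = deque(maxlen=population_size)        # oldest members age out (regularized evolution)
    history = []
    for _ in range(population_size):
        arch = tuple(random.randint(0, 4) for _ in range(arch_len))
        population.append(arch)
        history.append(arch)
    for _ in range(cycles):
        parents = random.sample(list(population), sample_size)
        parent = max(parents, key=training_free_score)
        child = mutate(parent)
        population.append(child)
        history.append(child)
    return max(history, key=training_free_score)

print(evolve())
```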

Vanilla Reinforcement Learning (RL) can efficiently solve complex tasks but does not provide any guarantees on system behavior. Yet, for real-world systems, which are often safety-critical, such guarantees on safety specifications are necessary. To bridge this gap, we propose a safe RL procedure for continuous action spaces with verified probabilistic guarantees specified via temporal logic. First, our approach probabilistically verifies a candidate controller with respect to a temporal logic specification while randomizing the controller's inputs within an expansion set. Then, we use RL to improve the performance of this probabilistically verified controller, exploring within the given expansion set around the controller's inputs. Finally, we calculate probabilistic safety guarantees with respect to temporal logic specifications for the learned agent. Our approach is efficiently implementable for continuous action and state spaces and separates safety verification and performance improvement into two distinct steps. We evaluate our approach on an evasion task where a robot has to reach a goal while evading a dynamic obstacle with a specific maneuver. Our results show that our safe RL approach leads to efficient learning while probabilistically maintaining safety specifications.
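
A hedged sketch of the probabilistic-verification step described above: estimate, by Monte Carlo rollouts, the probability that a candidate controller with inputs randomized inside an expansion set satisfies a reach-avoid specification, and attach a Hoeffding-style confidence margin. The toy dynamics, specification, and expansion set below are assumptions for illustration, not the paper's setup.

```python
import numpy as np

def satisfies_spec(trajectory, goal, obstacle, radius=0.5):
    # "Eventually reach the goal while always avoiding the obstacle."
    reached = any(np.linalg.norm(s - goal) < radius for s in trajectory)
    safe = all(np.linalg.norm(s - obstacle) >= radius for s in trajectory)
    return reached and safe

def simulate(policy, start=np.array([0.0, 0.0]), steps=50, dt=0.2):
    # Toy single-integrator dynamics, used only to make the sketch runnable.
    state, trajectory = start.copy(), [start.copy()]
    for _ in range(steps):
        state = state + dt * policy(state)
        trajectory.append(state.copy())
    return trajectory

def verify(controller, goal, obstacle, n_rollouts=1000, delta=0.05, eps=0.2, seed=0):
    rng = np.random.default_rng(seed)
    successes = 0
    for _ in range(n_rollouts):
        noise = rng.uniform(-eps, eps, size=2)        # randomize inputs within the expansion set
        successes += satisfies_spec(simulate(lambda s: controller(s) + noise), goal, obstacle)
    p_hat = successes / n_rollouts
    margin = np.sqrt(np.log(2 / delta) / (2 * n_rollouts))   # Hoeffding confidence margin
    return p_hat - margin                              # high-confidence lower bound on satisfaction

# Example: a proportional controller steering toward the goal.
goal, obstacle = np.array([5.0, 5.0]), np.array([2.5, 3.5])
print(verify(lambda s: 0.8 * (goal - s), goal, obstacle))
```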

Tensor decompositions have been successfully applied to compress neural networks. The compression algorithms using tensor decompositions commonly minimize the approximation error on the weights. Recent work assumes the approximation error on the weights is a proxy for the performance of the model when compressing multiple layers and fine-tuning the compressed model. Surprisingly, little research has systematically evaluated which approximation errors can be used to make choices regarding the layer, the tensor decomposition method, and the level of compression. To close this gap, we perform an experimental study to test whether this assumption holds across different layers and types of decompositions, and what the effect of fine-tuning is. We include the approximation error on the features resulting from a compressed layer in our analysis to test whether this provides a better proxy, as it explicitly takes the data into account. We find that the approximation error on the weights has a positive correlation with the performance error, both before and after fine-tuning. Basing the approximation error on the features does not improve the correlation significantly. While scaling the approximation error is commonly used to account for the different sizes of layers, the average correlation across layers is smaller than across all choices (i.e. layers, decompositions, and level of compression) before fine-tuning. When calculating the correlation across the different decompositions, the average rank correlation is larger than across all choices. This means multiple decompositions can be considered for compression and the approximation error can be used to choose between them.
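
A small sketch of the quantity studied above: the relative approximation error on the weights when a layer is compressed with a low-rank factorization. Truncated SVD of a dense weight matrix is used here as a stand-in for the tensor decompositions considered in the paper; the matrix shape and ranks are arbitrary.

```python
import numpy as np

def weight_approximation_error(W, rank):
    # Relative Frobenius-norm error between the original and low-rank weights.
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    W_hat = (U[:, :rank] * s[:rank]) @ Vt[:rank, :]
    return np.linalg.norm(W - W_hat) / np.linalg.norm(W)

W = np.random.default_rng(0).normal(size=(256, 128))   # e.g. a dense layer's weights
for r in (8, 32, 64):
    print(r, round(weight_approximation_error(W, r), 3))
```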

We revisit the problem of property testing for convex position for point sets in $\mathbb{R}^d$. Our results draw from previous ideas of Czumaj, Sohler, and Ziegler (ESA 2000). First, the algorithm is redesigned and its analysis is revised for correctness. Second, its functionality is expanded by (i)~exhibiting both negative and positive certificates along with the convexity determination, and (ii)~significantly extending the input range for moderate and higher dimensions. The behavior of the randomized tester is as follows: (i)~if $P$ is in convex position, it accepts; (ii)~if $P$ is far from convex position, with probability at least $2/3$, it rejects and outputs a $(d+2)$-point witness of non-convexity as a negative certificate; (iii)~if $P$ is close to convex position, with probability at least $2/3$, it accepts and outputs an approximation of the largest subset in convex position. The algorithm examines a sublinear number of points and runs in subquadratic time for every dimension $d$.
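
A hedged sketch of the rejection side of such a tester: repeatedly sample $(d+2)$-point subsets and check, via an LP feasibility problem, whether one sampled point lies in the convex hull of the other $d+1$; such a subset is a witness of non-convex position. The sample sizes and trial count are illustrative, not the paper's tuned parameters, and the positive-certificate machinery is omitted.

```python
import numpy as np
from scipy.optimize import linprog

def in_convex_hull(p, Q):
    """Is point p a convex combination of the rows of Q?"""
    k = Q.shape[0]
    A_eq = np.vstack([Q.T, np.ones(k)])               # sum_i lam_i q_i = p, sum_i lam_i = 1
    b_eq = np.concatenate([p, [1.0]])
    res = linprog(np.zeros(k), A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * k)
    return res.success

def test_convex_position(P, trials=200, seed=0):
    rng = np.random.default_rng(seed)
    n, d = P.shape
    for _ in range(trials):
        subset = P[rng.choice(n, size=d + 2, replace=False)]
        for j in range(d + 2):
            others = np.delete(subset, j, axis=0)
            if in_convex_hull(subset[j], others):
                return False, subset                   # reject with a negative certificate
    return True, None                                  # accept (no witness found)
```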

Spectral clustering (SC) is a popular clustering technique to find strongly connected communities on a graph. SC can be used in Graph Neural Networks (GNNs) to implement pooling operations that aggregate nodes belonging to the same cluster. However, the eigendecomposition of the Laplacian is expensive and, since clustering results are graph-specific, pooling methods based on SC must perform a new optimization for each new sample. In this paper, we propose a graph clustering approach that addresses these limitations of SC. We formulate a continuous relaxation of the normalized minCUT problem and train a GNN to compute cluster assignments that minimize this objective. Our GNN-based implementation is differentiable, does not require computing the spectral decomposition, and learns a clustering function that can be quickly evaluated on out-of-sample graphs. From the proposed clustering method, we design a graph pooling operator that overcomes some important limitations of state-of-the-art graph pooling techniques and achieves the best performance in several supervised and unsupervised tasks.
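
A sketch of the kind of relaxed objective mentioned above: given a soft cluster assignment matrix S (n x K, rows on the simplex) produced by a GNN, the loss combines a normalized-cut term with an orthogonality regularizer. The numpy form below is an illustration of the published minCUT relaxation, not the authors' training code.

```python
import numpy as np

def mincut_loss(A, S):
    # A: symmetric adjacency matrix (n x n); S: soft assignments (n x K).
    D = np.diag(A.sum(axis=1))
    cut = -np.trace(S.T @ A @ S) / np.trace(S.T @ D @ S)       # relaxed normalized minCUT term
    StS = S.T @ S
    K = S.shape[1]
    ortho = np.linalg.norm(StS / np.linalg.norm(StS) - np.eye(K) / np.sqrt(K))
    return cut + ortho
```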

Modern neural network training relies heavily on data augmentation for improved generalization. After the initial success of label-preserving augmentations, there has been a recent surge of interest in label-perturbing approaches, which combine features and labels across training samples to smooth the learned decision surface. In this paper, we propose a new augmentation method that leverages the first and second moments extracted and re-injected by feature normalization. We replace the moments of the learned features of one training image by those of another, and also interpolate the target labels. As our approach is fast, operates entirely in feature space, and mixes different signals than prior methods, one can effectively combine it with existing augmentation methods. We demonstrate its efficacy across benchmark data sets in computer vision, speech, and natural language processing, where it consistently improves the generalization performance of highly competitive baseline networks.
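
A minimal sketch of the moment-exchange idea described above: normalize one example's features, re-inject the first and second moments of another example, and interpolate the targets. The feature-map shapes and the mixing weight are illustrative assumptions.

```python
import numpy as np

def moment_exchange(x_a, x_b, y_a, y_b, lam=0.9, eps=1e-5):
    # x_a, x_b: feature maps of shape (channels, height, width); y_a, y_b: one-hot labels.
    mu_a, sigma_a = x_a.mean(axis=(1, 2), keepdims=True), x_a.std(axis=(1, 2), keepdims=True)
    mu_b, sigma_b = x_b.mean(axis=(1, 2), keepdims=True), x_b.std(axis=(1, 2), keepdims=True)
    x_mixed = (x_a - mu_a) / (sigma_a + eps) * sigma_b + mu_b   # swap first and second moments
    y_mixed = lam * y_a + (1 - lam) * y_b                        # interpolate the target labels
    return x_mixed, y_mixed
```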
