
This paper introduces a novel framework for dynamic classification in high dimensional spaces, addressing the evolving nature of class distributions over time or other index variables. Traditional discriminant analysis techniques are adapted to learn dynamic decision rules with respect to the index variable. In particular, we propose and study a new supervised dimension reduction method employing kernel smoothing to identify the optimal subspace, and provide a comprehensive examination of this approach for both linear discriminant analysis and quadratic discriminant analysis. We illustrate the effectiveness of the proposed methods through numerical simulations and real data examples. The results show considerable improvements in classification accuracy and computational efficiency. This work contributes to the field by offering a robust and adaptive solution to the challenges of scalability and non-staticity in high-dimensional data classification.
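As a rough illustration of what a decision rule that varies with an index variable can look like, the sketch below fits a two-class linear discriminant rule at a chosen index value `t0`, using Gaussian kernel weights over the index. It is not the supervised dimension reduction method of the paper. The function names, the Gaussian kernel, the bandwidth, the ridge term and the toy drifting data are all assumptions made for this example.

```python
import numpy as np

def gaussian_kernel_weights(t, t0, bandwidth):
    """Normalized Gaussian kernel weights centred at the index value t0."""
    w = np.exp(-0.5 * ((t - t0) / bandwidth) ** 2)
    return w / w.sum()

def local_lda_predict(X, y, t, x_new, t0, bandwidth):
    """Classify x_new (labels 0/1) with LDA estimates localized around t0."""
    w = gaussian_kernel_weights(t, t0, bandwidth)
    d = X.shape[1]
    pooled = 1e-6 * np.eye(d)          # small ridge (assumption) for invertibility
    means, priors = [], []
    for k in (0, 1):
        wk = w * (y == k)              # kernel weights restricted to class k
        priors.append(wk.sum())        # kernel-weighted class proportion
        mu = wk @ X / wk.sum()         # kernel-weighted class mean
        means.append(mu)
        centred = X - mu
        pooled += (wk[:, None] * centred).T @ centred  # weighted pooled covariance
    scores = []
    for k in (0, 1):
        a = np.linalg.solve(pooled, means[k])
        scores.append(x_new @ a - 0.5 * means[k] @ a + np.log(priors[k]))
    return int(np.argmax(scores))

# Toy data (hypothetical): the two class means swap sides as t goes from 0 to 1.
rng = np.random.default_rng(0)
n = 400
t = rng.uniform(0.0, 1.0, n)
y = rng.integers(0, 2, n)
centre = np.where(y == 0, -2.0 + 4.0 * t, 2.0 - 4.0 * t)
X = np.column_stack([centre, np.zeros(n)]) + 0.5 * rng.standard_normal((n, 2))

x_new = np.array([1.5, 0.0])
pred_early = local_lda_predict(X, y, t, x_new, t0=0.0, bandwidth=0.1)
pred_late = local_lda_predict(X, y, t, x_new, t0=1.0, bandwidth=0.1)
```

In this toy setup the same point `x_new` is assigned to different classes at the two ends of the index range, which a single static LDA fit could not do.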

Related content


Learning · AI · Networking · Deep learning · Random forest

December 17, 2024

Deep Learning Based Superconductivity: Prediction and Experimental Tests

Daniel Kaplan,Adam Zhang,Joanna Blawat,Rongying Jin,Robert J. Cava,Viktor Oudovenko,Gabriel Kotliar,Anirvan M. Sengupta,Weiwei Xie

from arxiv, 14 pages + 2 appendices + references. EPJ submission

The discovery of novel superconducting materials is a longstanding challenge in materials science, with a wealth of potential for applications in energy, transportation, and computing. Recent advances in artificial intelligence (AI) have expedited the search for new materials by efficiently utilizing vast materials databases. In this study, we developed an approach based on deep learning (DL) to predict new superconducting materials. We have synthesized a compound derived from our DL network and confirmed its superconducting properties in agreement with our prediction. Our approach is also compared to previous work based on random forests (RFs). In particular, RFs require knowledge of the chemical properties of the compound, while our neural net inputs depend solely on the chemical composition. With the help of hints from our network, we discover a new ternary compound $\textrm{Mo}_{20}\textrm{Re}_{6}\textrm{Si}_{4}$, which becomes superconducting below 5.4 K. We further discuss the existing limitations and challenges associated with using AI for prediction, along with potential future research directions.

Annotation · MoDELS · Performer · Machine Learning · Continuity

December 17, 2024

Sequential Harmful Shift Detection Without Labels

Salim I. Amoukou,Tom Bewley,Saumitra Mishra,Freddy Lecue,Daniele Magazzeni,Manuela Veloso

from arxiv, Accepted at the 38th Conference on Neural Information Processing Systems (NeurIPS 2024)

We introduce a novel approach for detecting distribution shifts that negatively impact the performance of machine learning models in continuous production environments, which requires no access to ground truth data labels. It builds upon the work of Podkopaev and Ramdas [2022], who address scenarios where labels are available for tracking model errors over time. Our solution extends this framework to work in the absence of labels, by employing a proxy for the true error. This proxy is derived using the predictions of a trained error estimator. Experiments show that our method has high power and false alarm control under various distribution shifts, including covariate and label shifts and natural shifts over geography and time.

Controller · Online · Full · Anomaly detection · False positive

December 16, 2024

FDR Control for Online Anomaly Detection

Etienne Krönert,Alain Célisse,Dalila Hattab

A new online multiple testing procedure is described in the context of anomaly detection, which controls the False Discovery Rate (FDR). An accurate anomaly detector must control the false positive rate at a prescribed level while keeping the false negative rate as low as possible. However, in the online context, such a constraint remains highly challenging due to the usual lack of FDR control: the online framework makes it impossible to use classical multiple testing approaches such as the Benjamini-Hochberg (BH) procedure, which would require knowing the entire time series. The developed strategy relies on exploiting the local control of the "modified FDR" (mFDR) criterion. It turns out that the local control of mFDR enables global control of the FDR over the full series, up to additional modifications of the multiple testing procedures. An important ingredient in this control is the cardinality of the calibration dataset used to compute the empirical p-values. A dedicated strategy for tuning this parameter is designed to achieve the prescribed FDR control over the entire time series. The statistical performance of the full strategy is supported by theoretical guarantees. Its practical behavior is assessed by several simulation experiments which support our conclusions.
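To make the two classical ingredients named in this abstract concrete, the following sketch computes empirical p-values from a calibration set and applies the offline Benjamini-Hochberg step-up rule to a finished batch. It is not the online procedure of the paper. The function names and the convention that larger scores are more anomalous are assumptions of this example.

```python
import numpy as np

def empirical_p_values(calibration_scores, test_scores):
    """p = (1 + #{calibration scores >= test score}) / (n_cal + 1).

    Assumes larger scores mean "more anomalous".
    """
    cal = np.sort(np.asarray(calibration_scores, dtype=float))
    test = np.asarray(test_scores, dtype=float)
    # searchsorted(..., side="left") counts calibration scores strictly below each test score
    n_ge = len(cal) - np.searchsorted(cal, test, side="left")
    return (1 + n_ge) / (len(cal) + 1)

def benjamini_hochberg(p_values, alpha):
    """Offline BH: reject the k smallest p-values, k = max{i : p_(i) <= alpha * i / m}."""
    p = np.asarray(p_values, dtype=float)
    m = len(p)
    order = np.argsort(p)
    below = p[order] <= alpha * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])   # index of the largest p-value under its threshold
        reject[order[: k + 1]] = True
    return reject

p_demo = empirical_p_values([1.0, 2.0, 3.0, 4.0], [5.0, 2.5, 0.0])
rejections = benjamini_hochberg([0.001, 0.03, 0.035, 0.9], alpha=0.05)
```

Two points follow from the code. The sort in `benjamini_hochberg` needs all `m` p-values at once, which is why the rule cannot be applied as observations arrive. The smallest attainable empirical p-value is `1 / (n_cal + 1)`, so the calibration set size limits how small a threshold can ever be met.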

Graph · Bagging · Minimum point · Extensibility · CASE

December 16, 2024

Rewriting Consistent Answers on Annotated Data

Phokion G. Kolaitis,Nina Pardal,Jonni Virtema

We embark on a study of the consistent answers of queries over databases annotated with values from a naturally ordered positive semiring. In this setting, the consistent answers of a query are defined as the minimum of the semiring values that the query takes over all repairs of an inconsistent database. The main focus is on self-join free conjunctive queries and key constraints, which is the most extensively studied case of consistent query answering over standard databases. We introduce a variant of first-order logic with a limited form of negation, define suitable semiring semantics, and then establish the main result of the paper: the consistent query answers of a self-join free conjunctive query under key constraints are rewritable in this logic if and only if the attack graph of the query contains no cycles. This result generalizes an analogous result of Koutris and Wijsen for ordinary databases, but also yields new results for a multitude of semirings, including the bag semiring, the tropical semiring, and the fuzzy semiring. We also show that there are self-join free conjunctive queries with a cyclic attack graph whose certain answers under bag semantics have no polynomial-time constant-approximation algorithm, unless P = NP.

Optimizer · Robustness · Lipschitz · MoDELS · Sample

December 15, 2024

Optimal Rates for Robust Stochastic Convex Optimization

Changyu Gao,Andrew Lowy,Xingyu Zhou,Stephen J. Wright

The sensitivity of machine learning algorithms to outliers, particularly in high-dimensional spaces, necessitates the development of robust methods. Within the framework of the $\epsilon$-contamination model, where the adversary can inspect and replace up to an $\epsilon$ fraction of the samples, a fundamental open question is determining the optimal rates for robust stochastic convex optimization (robust SCO) given samples under $\epsilon$-contamination. We develop novel algorithms that achieve minimax-optimal excess risk (up to logarithmic factors) under the $\epsilon$-contamination model. Our approach goes beyond existing algorithms, which are not only suboptimal but also constrained by stringent requirements, including Lipschitzness and smoothness conditions on sample functions. Our algorithms achieve optimal rates while removing these restrictive assumptions, and notably, remain effective for nonsmooth but Lipschitz population risks.

Robustness · Euclidean distance · anchor · Sparse · Sample

December 14, 2024

Structured Sampling for Robust Euclidean Distance Geometry

Chandra Kundu,Abiy Tasissa,HanQin Cai

This paper addresses the problem of estimating the positions of points from distance measurements corrupted by sparse outliers. Specifically, we consider a setting with two types of nodes: anchor nodes, for which exact distances to each other are known, and target nodes, for which complete but corrupted distance measurements to the anchors are available. To tackle this problem, we propose a novel algorithm powered by the Nyström method and robust principal component analysis. Our method is computationally efficient as it processes only a localized subset of the distance matrix and does not require distance measurements between target nodes. Empirical evaluations on synthetic datasets, designed to mimic sensor localization, and on molecular experiments, demonstrate that our algorithm achieves accurate recovery with a modest number of anchors, even in the presence of high levels of sparse outliers.
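The claim that target-to-target measurements are not needed can be illustrated with the generic Nyström approximation of a low-rank Gram matrix. The sketch below covers only that building block, in a noiseless setting with no outliers and no robust PCA step. The point cloud, the choice of the first five nodes as anchors and the `rcond` cutoff are assumptions of this example, not details from the paper.

```python
import numpy as np

def nystrom_approximation(C, W, rcond=1e-10):
    """Nyström approximation C @ pinv(W) @ C.T.

    C holds the Gram columns for the anchors (all nodes x anchors),
    W is the anchor-anchor block.
    """
    return C @ np.linalg.pinv(W, rcond=rcond) @ C.T

rng = np.random.default_rng(0)
points = rng.standard_normal((30, 3))      # 30 hypothetical nodes in 3-D
gram = points @ points.T                   # full Gram matrix, rank 3

anchors = np.arange(5)                     # first 5 nodes act as anchors
C = gram[:, anchors]                       # inner products of every node with the anchors
W = gram[np.ix_(anchors, anchors)]         # anchor-anchor block

approx = nystrom_approximation(C, W)       # never reads the target-target block
max_error = np.abs(approx - gram).max()
```

Because the anchors span the same 3-dimensional space as all the points, the approximation recovers the full Gram matrix up to rounding error, including the target-target entries that were never used.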

Optimizer · MoDELS · Regularization term · Forward · Operation

December 13, 2024

Stochastic Multiresolution Image Sketching for Inverse Imaging Problems

Alessandro Perelli,Carola-Bibiane Schonlieb,Matthias J. Ehrhardt

from arxiv, 26 pages, 11 figures, submitted to SIAM Journal on Imaging Sciences

A challenge in high-dimensional inverse problems is developing iterative solvers that find accurate solutions of regularized optimization problems at low computational cost. An important example is computed tomography (CT), where both image and data sizes are large and therefore the forward model is costly to evaluate. For several years, algorithms from stochastic optimization have been used for tomographic image reconstruction with great success by subsampling the data. Here we propose a novel way to use stochastic optimization to speed up image reconstruction by means of image domain sketching, such that at each iteration an image of a different resolution is used. Hence, we coin this algorithm ImaSk. By considering an associated saddle-point problem, we can formulate ImaSk as a gradient-based algorithm where the gradient is approximated in the same spirit as the stochastic average gradient amélioré (SAGA) and uses at each iteration one of these multiresolution operators at random. We prove that ImaSk converges linearly for linear forward models with strongly convex regularization functions. Numerical simulations on CT show that ImaSk is effective and that increasing the number of multiresolution operators reduces the computational time to reach the modeled solution.

Named entity disambiguation · entity · Unsupervised · Lecture notes · Processing (programming language)

December 13, 2024

Unsupervised Named Entity Disambiguation for Low Resource Domains

Debarghya Datta,Soumajit Pramanik

from arxiv, Accepted in EMNLP-2024

In the ever-evolving landscape of natural language processing and information retrieval, the need for robust and domain-specific entity linking algorithms has become increasingly apparent. Entity linking is crucial in a considerable number of fields, such as humanities, technical writing and biomedical sciences, to enrich texts with semantics and discover more knowledge. The use of Named Entity Disambiguation (NED) in such domains requires handling noisy texts, low resource settings and domain-specific knowledge bases (KBs). Existing approaches are mostly inappropriate for such scenarios, as they either depend on training data or are not flexible enough to work with domain-specific KBs. Thus, in this work, we present an unsupervised approach leveraging the concept of Group Steiner Trees (GST), which can identify the most relevant candidates for entity disambiguation using the contextual similarities across candidate entities for all the mentions present in a document. We outperform the state-of-the-art unsupervised methods by more than 40% (on average) in terms of Precision@1 across various domain-specific datasets.

Estimation/Estimator · contrastive · INFORMS · Mutual information · Representation learning

June 25, 2021

Decomposed Mutual Information Estimation for Contrastive Representation Learning

Alessandro Sordoni,Nouha Dziri,Hannes Schulz,Geoff Gordon,Phil Bachman,Remi Tachet

from arxiv, ICML 2021

Recent contrastive representation learning methods rely on estimating mutual information (MI) between multiple views of an underlying context. E.g., we can derive multiple views of a given image by applying data augmentation, or we can split a sequence into views comprising the past and future of some step in the sequence. Contrastive lower bounds on MI are easy to optimize, but have a strong underestimation bias when estimating large amounts of MI. We propose decomposing the full MI estimation problem into a sum of smaller estimation problems by splitting one of the views into progressively more informed subviews and by applying the chain rule on MI between the decomposed views. This expression contains a sum of unconditional and conditional MI terms, each measuring modest chunks of the total MI, which facilitates approximation via contrastive bounds. To maximize the sum, we formulate a contrastive lower bound on the conditional MI which can be approximated efficiently. We refer to our general approach as Decomposed Estimation of Mutual Information (DEMI). We show that DEMI can capture a larger amount of MI than standard non-decomposed contrastive bounds in a synthetic setting, and learns better representations in a vision domain and for dialogue generation.
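The decomposition described above rests on the chain rule for mutual information. As a sketch, assume the view $x$ is split into two subviews $x'$ and $x''$ (this notation is chosen here for illustration and is not taken from the abstract):

```latex
% Chain rule for mutual information, with x = (x', x'')
I(x; y) = I(x', x''; y) = I(x'; y) + I(x''; y \mid x')

% InfoNCE-style contrastive bounds computed from K samples
% (one positive, K-1 negatives) cannot exceed log K:
\hat{I}_{\mathrm{NCE}} \le \log K
```

Under these assumptions, each term on the right-hand side of the chain rule carries only part of the total MI, so each stays further below the $\log K$ ceiling than the undecomposed quantity would. This is one way to read the abstract's remark that modest chunks of MI are easier to approximate with contrastive bounds.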

Object detection · Model evaluation · Learned · Attention mechanism · Networking

April 15, 2019

Reverse Attention for Salient Object Detection

Shuhan Chen,Xiuli Tan,Ben Wang,Xuelong Hu

from arxiv, ECCV 2018

Benefiting from the rapid development of deep learning techniques, salient object detection has achieved remarkable progress recently. However, two major challenges still hinder its application in embedded devices: low resolution output and heavy model weight. To this end, this paper presents an accurate yet compact deep network for efficient salient object detection. More specifically, given a coarse saliency prediction in the deepest layer, we first employ residual learning to learn side-output residual features for saliency refinement, which can be achieved with very limited convolutional parameters while maintaining accuracy. Secondly, we further propose reverse attention to guide such side-output residual learning in a top-down manner. By erasing the current predicted salient regions from side-output features, the network can eventually explore the missing object parts and details, which results in high resolution and accuracy. Experiments on six benchmark datasets demonstrate that the proposed approach compares favorably against state-of-the-art methods, with advantages in terms of simplicity, efficiency (45 FPS) and model size (81 MB).
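The "erasing" step can be sketched as a simple reweighting: side-output features are multiplied by one minus the sigmoid of the current coarse prediction, so regions already predicted as salient are suppressed. The snippet below is a minimal NumPy illustration of that idea, not the network from the paper. The array shapes, the function names and the assumption that the coarse prediction is already upsampled to the feature resolution are choices made for this example.

```python
import numpy as np

def sigmoid(z):
    """Elementwise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))

def reverse_attention(side_features, coarse_logits):
    """Suppress feature locations that the coarse prediction already marks as salient.

    side_features: array of shape (channels, height, width)
    coarse_logits: array of shape (height, width), assumed already upsampled
    """
    weight = 1.0 - sigmoid(coarse_logits)        # near 0 where saliency is confident
    return side_features * weight[None, :, :]    # broadcast over the channel axis

features = np.ones((2, 2, 2))
logits = np.array([[10.0, -10.0],
                   [0.0, 0.0]])
erased = reverse_attention(features, logits)
```

In this toy input the top-left location (confidently salient) is almost zeroed out, the top-right location (confidently background) is kept almost unchanged, and the uncertain bottom row is halved.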