This paper proposes an unsupervised method that leverages topological characteristics of data manifolds to estimate the class separability of data without requiring labels. Experiments on several datasets demonstrate a clear correlation and consistency between the class separability estimated by the proposed method and supervised metrics such as the Fisher Discriminant Ratio (FDR) and cross-validation of a classifier, both of which require labels. This can enable learning paradigms that learn from both labeled and unlabeled data, such as semi-supervised and transductive learning. This would be particularly useful when labeled data is limited and a relatively large unlabeled dataset is available to enhance the learning process. The proposed method is applied to language model fine-tuning with an automated stopping criterion that monitors the class separability of the embedding-space manifold in an unsupervised setting. The methodology was first validated on synthetic data, where the results show clear consistency between the class separability estimated by the proposed method and that computed by FDR. The method was also applied to both public and internal data. The results show that, without the need for labels, the proposed method can effectively aid the decision on when to stop or continue fine-tuning a language model, and on which fine-tuning iteration is expected to achieve maximum classification performance, by quantifying the class separability of the embedding manifold.
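For orientation, the supervised baseline named in the abstract, the Fisher Discriminant Ratio, can be sketched for the simplest case of two classes and one feature. This is a minimal illustration under stated assumptions, not the paper's implementation: the function name `fisher_discriminant_ratio` and the toy samples are hypothetical, and the formula used is the common two-class form, squared mean gap divided by the sum of class variances.

```python
from statistics import fmean, pvariance


def fisher_discriminant_ratio(class_a, class_b):
    """Two-class, one-feature FDR: (mean_a - mean_b)^2 / (var_a + var_b).

    Assumption: population variance is used; other texts use the sample variance.
    """
    mean_gap = fmean(class_a) - fmean(class_b)
    spread = pvariance(class_a) + pvariance(class_b)
    return mean_gap ** 2 / spread


# Hypothetical toy data: the labels are needed to split the samples into classes,
# which is exactly the requirement the unsupervised method aims to avoid.
separated = fisher_discriminant_ratio([0.0, 1.0, 2.0], [10.0, 11.0, 12.0])
overlapping = fisher_discriminant_ratio([0.0, 1.0, 2.0], [1.0, 2.0, 3.0])
print(separated, overlapping)
```

A larger ratio indicates classes whose means are far apart relative to their spread, so the well-separated pair scores higher than the overlapping pair.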

Related content

November 15, 2023

Symmetry Lie Algebras of Varieties with Applications to Algebraic Statistics

Aida Maraj,Arpan Pal

from arxiv, 18 pages. Code attached. Comments welcome!

The motivation for this paper is to detect when an irreducible projective variety V is not toric. We do this by analyzing a Lie group and a Lie algebra associated to V. If the dimension of V is strictly less than the dimension of the above mentioned objects, then V is not a toric variety. We provide an algorithm to compute the Lie algebra of an irreducible variety and use it to provide examples of non-toric statistical models in algebraic statistics.

November 15, 2023

Contrastive Transformer Learning with Proximity Data Generation for Text-Based Person Search

Hefeng Wu,Weifeng Chen,Zhibin Liu,Tianshui Chen,Zhiguang Chen,Liang Lin

from arxiv, Accepted by IEEE T-CSVT

Given a descriptive text query, text-based person search (TBPS) aims to retrieve the best-matched target person from an image gallery. Such a cross-modal retrieval task is quite challenging due to the significant modality gap, fine-grained differences, and insufficiency of annotated data. To better align the two modalities, most existing works focus on introducing sophisticated network structures and auxiliary tasks, which are complex and hard to implement. In this paper, we propose a simple yet effective dual Transformer model for text-based person search. By exploiting a hardness-aware contrastive learning strategy, our model achieves state-of-the-art performance without any special design for local feature alignment or side information. Moreover, we propose a proximity data generation (PDG) module to automatically produce more diverse data for cross-modal training. The PDG module first introduces an automatic generation algorithm based on a text-to-image diffusion model, which generates new text-image pair samples in the proximity space of the original ones. It then combines approximate text generation and feature-level mixup during training to further strengthen data diversity. The PDG module can largely guarantee the reasonableness of the generated samples, which are used directly for training without any human inspection for noise rejection. It improves the performance of our model significantly, providing a feasible solution to the data insufficiency problem faced by such fine-grained visual-linguistic tasks. Extensive experiments on two popular TBPS datasets (i.e., CUHK-PEDES and ICFG-PEDES) show that the proposed approach clearly outperforms state-of-the-art approaches, e.g., improving Top1, Top5, and Top10 by 3.88%, 4.02%, and 2.92% on CUHK-PEDES. The code will be available at https://github.com/HCPLab-SYSU/PersonSearch-CTLG

November 14, 2023

Balancing Covariates in Randomized Experiments with the Gram-Schmidt Walk Design

Christopher Harshaw,Fredrik Sävje,Daniel Spielman,Peng Zhang

The design of experiments involves a compromise between covariate balance and robustness. This paper provides a formalization of this trade-off and describes an experimental design that allows experimenters to navigate it. The design is specified by a robustness parameter that bounds the worst-case mean squared error of an estimator of the average treatment effect. Subject to the experimenter's desired level of robustness, the design aims to simultaneously balance all linear functions of potentially many covariates. Less robustness allows for more balance. We show that the mean squared error of the estimator is bounded in finite samples by the minimum of the loss function of an implicit ridge regression of the potential outcomes on the covariates. Asymptotically, the design perfectly balances all linear functions of a growing number of covariates with a diminishing reduction in robustness, effectively allowing experimenters to escape the compromise between balance and robustness in large samples. Finally, we describe conditions that ensure asymptotic normality and provide a conservative variance estimator, which facilitate the construction of asymptotically valid confidence intervals.

November 13, 2023

Supply Chain Characteristics as Predictors of Cyber Risk: A Machine-Learning Assessment

Kevin Hu,Retsef Levi,Raphael Yahalom,El Ghali Zerhouni

This paper provides the first large-scale data-driven analysis to evaluate the predictive power of different attributes for assessing the risk of cyberattack data breaches. Furthermore, motivated by the rapid increase in third-party-enabled cyberattacks, the paper provides the first quantitative empirical evidence that digital supply-chain attributes are significant predictors of enterprise cyber risk. The paper leverages outside-in cyber risk scores that aim to capture the quality of the enterprise's internal cybersecurity management, and augments these with supply chain features inspired by observed third-party cyberattack scenarios, as well as concepts from network science research. The main quantitative result of the paper is that supply chain network features add significant detection power to predicting enterprise cyber risk, relative to using enterprise-only attributes. In particular, compared to a base model that relies only on internal enterprise features, the supply chain network features improve the out-of-sample AUC by 2.3%. Given that each cyber data breach is a low-probability, high-impact risk event, these improvements in prediction power have significant value. Additionally, the model highlights several cybersecurity risk drivers related to third-party cyberattack and breach mechanisms and provides important insights into which interventions might be effective in mitigating these risks.
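As a reminder of what the AUC figure in the abstract measures, the sketch below computes the area under the ROC curve as the probability that a randomly chosen positive example is scored above a randomly chosen negative one, with ties counted as one half. The function name `roc_auc`, the labels, and the scores are hypothetical illustrations and are not drawn from the paper's data or models.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores.

    Assumption: labels are 1 (breach) or 0 (no breach), and both classes
    are present; ties between a positive and a negative count as 0.5.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical risk scores for four enterprises.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
print(auc)
```

The quadratic pairwise loop is chosen for clarity; a rank-based computation gives the same value more efficiently on large samples.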

November 13, 2023

Optimal Estimation of Large-Dimensional Nonlinear Factor Models

Yingjie Feng

from arxiv, arXiv admin note: text overlap with arXiv:2008.13651

This paper studies optimal estimation of large-dimensional nonlinear factor models. The key challenge is that the observed variables are possibly nonlinear functions of some latent variables where the functional forms are left unspecified. A local principal component analysis method is proposed to estimate the factor structure and recover information on latent variables and latent functions, which combines $K$-nearest neighbors matching and principal component analysis. Large-sample properties are established, including a sharp bound on the matching discrepancy of nearest neighbors, sup-norm error bounds for estimated local factors and factor loadings, and the uniform convergence rate of the factor structure estimator. Under mild conditions our estimator of the latent factor structure can achieve the optimal rate of uniform convergence for nonparametric regression. The method is illustrated with a Monte Carlo experiment and an empirical application studying the effect of tax cuts on economic growth.

November 12, 2023

The Capacity Region of Information Theoretic Secure Aggregation with Uncoded Groupwise Keys

Kai Wan,Hua Sun,Mingyue Ji,Tiebin Mi,Giuseppe Caire

from arxiv, 37 pages, 3 figures

This paper considers the secure aggregation problem for federated learning under an information-theoretic cryptographic formulation, where distributed training nodes (referred to as users) train models based on their own local data and a curious-but-honest server aggregates the trained models without retrieving other information about users' local data. Secure aggregation generally contains two phases, namely a key sharing phase and a model aggregation phase. Because user dropouts are common in federated learning, the model aggregation phase should contain two rounds: in the first round the users transmit masked models, and in the second round, according to the identity of the users surviving the first round, these surviving users transmit further messages to help the server decrypt the sum of the users' trained models. The objective of the considered information-theoretic formulation is to characterize the capacity region of the communication rates in the two rounds from the users to the server in the model aggregation phase, assuming that key sharing has already been performed offline beforehand. In this context, Zhao and Sun completely characterized the capacity region under the assumption that the keys can be arbitrary random variables. More recently, an additional constraint, known as "uncoded groupwise keys," has been introduced. This constraint entails the presence of multiple independent keys within the system, with each key shared by precisely S users. The capacity region for the information-theoretic secure aggregation problem with uncoded groupwise keys was established in our recent work subject to the condition S > K - U, where K is the total number of users and U is the designed minimum number of surviving users. In this paper we fully characterize the capacity region for this problem by proposing a new converse bound and an achievable scheme.
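The masking idea behind the first round can be illustrated with a toy sketch. It is not the scheme proposed in the paper: it assumes keys shared by pairs of users (the S = 2 case), no user dropouts, and scalar inputs in a ring of integers modulo a hypothetical `MODULUS`; the helper names `make_pairwise_keys` and `mask_input` are likewise illustrative.

```python
import random

MODULUS = 2 ** 16  # hypothetical ring size for the toy example


def make_pairwise_keys(num_users, rng):
    """One independent key per unordered pair of users (i < j)."""
    return {
        (i, j): rng.randrange(MODULUS)
        for i in range(num_users)
        for j in range(i + 1, num_users)
    }


def mask_input(user, value, keys, num_users):
    """Add each shared key if this user has the lower index, subtract it otherwise."""
    masked = value
    for other in range(num_users):
        if other == user:
            continue
        key = keys[(min(user, other), max(user, other))]
        masked += key if user < other else -key
    return masked % MODULUS


rng = random.Random(0)
inputs = [5, 17, 42, 8]  # hypothetical scalar "models"
keys = make_pairwise_keys(len(inputs), rng)
masked = [mask_input(u, x, keys, len(inputs)) for u, x in enumerate(inputs)]

# The server only sees the masked values; the keys cancel in the sum.
aggregate = sum(masked) % MODULUS
print(aggregate)  # 72
```

Each key is added by one user and subtracted by the other, so the sum of the masked values equals the sum of the inputs modulo `MODULUS`. Handling dropouts is what requires the second round described in the abstract, which this sketch omits.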

November 12, 2023

Conversational Data Exploration: A Game-Changer for Designing Data Science Pipelines

Genoveva Vargas-Solar,Tania Cerquitelli,Javier A. Espinosa-Oviedo,François Cheval,Anthelme Buchaille,Luca Polgar

This paper proposes a conversational approach implemented by the system Chatin for driving an intuitive data exploration experience. Our work aims to unlock the full potential of data analytics and artificial intelligence with a new generation of data science solutions. Chatin is a cutting-edge tool that democratises access to AI-driven solutions, empowering non-technical users from various disciplines to explore data and extract knowledge from it.

November 10, 2023

On Capacity Optimality of OAMP: Beyond IID Sensing Matrices and Gaussian Signaling

Lei Liu,Shansuo Liang,Li Ping

from arxiv, Double columns, 17 pages, 9 figures

This paper investigates a large unitarily invariant system (LUIS) involving a unitarily invariant sensing matrix, an arbitrarily fixed signal distribution, and forward error control (FEC) coding. A universal Gram-Schmidt orthogonalization is considered for constructing orthogonal approximate message passing (OAMP), enabling its applicability to a wide range of prototypes without the constraint of differentiability. We develop two single-input-single-output variational transfer functions for OAMP with Lipschitz continuous local estimators, facilitating an analysis of achievable rates. Furthermore, when the state evolution of OAMP has a unique fixed point, we reveal that OAMP can achieve the constrained capacity predicted by the replica method of LUIS based on matched FEC coding, regardless of the signal distribution. The replica method is rigorously validated for LUIS with Gaussian signaling and certain sub-classes of LUIS with arbitrary signal distributions. Several area properties are established based on the variational transfer functions of OAMP. Meanwhile, we present a replica constrained capacity-achieving coding principle for LUIS. This principle serves as the basis for optimizing irregular low-density parity-check (LDPC) codes specifically tailored for binary signaling in our simulation results. The performance of OAMP with these optimized codes exhibits a remarkable improvement over the unoptimized codes and even surpasses the well-known Turbo-LMMSE algorithm. For quadrature phase-shift keying (QPSK) modulation, we observe bit error rate (BER) performance near the replica constrained capacity across diverse channel conditions.

May 10, 2022

Knowledge Augmented Machine Learning with Applications in Autonomous Driving: A Survey

Julian Wörmann,Daniel Bogdoll,Etienne Bührle,Han Chen,Evaristus Fuh Chuo,Kostadin Cvejoski,Ludger van Elst,Tobias Gleißner,Philip Gottschall,Stefan Griesche,Christian Hellert,Christian Hesels,Sebastian Houben,Tim Joseph,Niklas Keil,Johann Kelsch,Hendrik Königshof,Erwin Kraft,Leonie Kreuser,Kevin Krone,Tobias Latka,Denny Mattern,Stefan Matthes,Mohsin Munir,Moritz Nekolla,Adrian Paschke,Maximilian Alexander Pintz,Tianming Qiu,Faraz Qureishi,Syed Tahseen Raza Rizvi,Jörg Reichardt,Laura von Rueden,Stefan Rudolph,Alexander Sagel,Gerhard Schunk,Hao Shen,Hendrik Stapelbroek,Vera Stehr,Gurucharan Srinivas,Anh Tuan Tran,Abhishek Vivekanandan,Ya Wang,Florian Wasserrab,Tino Werner,Christian Wirth,Stefan Zwicklbauer

from arxiv, 93 pages

The existence of representative datasets is a prerequisite of many successful artificial intelligence and machine learning models. However, the subsequent application of these models often involves scenarios that are inadequately represented in the data used for training. The reasons for this are manifold and range from time and cost constraints to ethical considerations. As a consequence, the reliable use of these models, especially in safety-critical applications, is a huge challenge. Leveraging additional, already existing sources of knowledge is key to overcome the limitations of purely data-driven approaches, and eventually to increase the generalization capability of these models. Furthermore, predictions that conform with knowledge are crucial for making trustworthy and safe decisions even in underrepresented scenarios. This work provides an overview of existing techniques and methods in the literature that combine data-based models with existing knowledge. The identified approaches are structured according to the categories integration, extraction and conformity. Special attention is given to applications in the field of autonomous driving.

March 13, 2020

A Survey on Deep Learning for Named Entity Recognition

Jing Li,Aixin Sun,Jianglei Han,Chenliang Li

from arxiv, 20 pages, 12 figures, 3 tables. arXiv admin note: text overlap with arXiv:1702.02098, arXiv:1904.10503 by other authors

Named entity recognition (NER) is the task of identifying text spans that mention named entities and classifying them into predefined categories such as person, location, and organization. NER serves as the basis for a variety of natural language applications such as question answering, text summarization, and machine translation. Although early NER systems achieved decent recognition accuracy, they often required much human effort in carefully designing rules or features. In recent years, deep learning, empowered by continuous real-valued vector representations and semantic composition through nonlinear processing, has been employed in NER systems, yielding state-of-the-art performance. In this paper, we provide a comprehensive review of existing deep learning techniques for NER. We first introduce NER resources, including tagged NER corpora and off-the-shelf NER tools. Then, we systematically categorize existing works based on a taxonomy along three axes: distributed representations for input, context encoder, and tag decoder. Next, we survey the most representative methods for recently applied deep learning techniques in new NER problem settings and applications. Finally, we present readers with the challenges faced by NER systems and outline future directions in this area.