
Word embedding methods (WEMs) are extensively used for representing text data. The dimensionality of these embeddings varies across tasks and implementations. The effect of a change in dimensionality on the accuracy of downstream tasks is a well-explored question; how it affects the bias of word embeddings, however, still needs to be investigated. Using the English Wikipedia corpus, we study this effect for two static (Word2Vec and fastText) and two context-sensitive (ELMo and BERT) WEMs. We make two observations. First, the bias of word embeddings varies significantly with dimensionality. Second, there is no uniformity in how a change in dimensionality affects the bias. Both factors should be considered when selecting the dimensionality of word embeddings.
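As a minimal sketch of what "bias versus dimensionality" can look like in practice, the snippet below tracks a WEAT-style association score across pretrained GloVe vectors of increasing dimension from gensim-data, used here as a stand-in for retraining Word2Vec/fastText at several sizes. The target/attribute word lists and the score are illustrative, not the paper's exact metric.

```python
# Track a WEAT-style bias score across embedding dimensionalities.
import numpy as np
import gensim.downloader as api

X, Y = ["doctor", "engineer"], ["nurse", "teacher"]   # illustrative target words
A, B = ["he", "man"], ["she", "woman"]                # illustrative attribute words

def weat_style_score(kv):
    """Difference of mean target-attribute associations, in the spirit of WEAT."""
    assoc = lambda w: (np.mean([kv.similarity(w, a) for a in A])
                       - np.mean([kv.similarity(w, b) for b in B]))
    return np.mean([assoc(x) for x in X]) - np.mean([assoc(y) for y in Y])

for name in ["glove-wiki-gigaword-50", "glove-wiki-gigaword-100",
             "glove-wiki-gigaword-200", "glove-wiki-gigaword-300"]:
    kv = api.load(name)                               # downloads on first use
    print(name, round(float(weat_style_score(kv)), 4))
```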

Related content

Distributed representations encode language as dense, low-dimensional, continuous vectors. Researchers first observed that learned word embeddings exhibit analogy relations, e.g., apple − apples ≈ car − cars and man − woman ≈ king − queen. These methods can all be trained directly on large-scale unlabeled corpora. The quality of word embeddings also depends heavily on the choice of context window size: embeddings learned with a large context window tend to capture topical information, whereas those learned with a small context window better reflect a word's function and its contextual semantics.
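A minimal sketch of the window-size effect described above, using gensim's Word2Vec; the corpus file and the probe word are illustrative placeholders, and the exact neighbors will depend on the data.

```python
# Compare nearest neighbors under a small vs. a large context window.
from gensim.models import Word2Vec
from gensim.utils import simple_preprocess

sentences = [simple_preprocess(line) for line in open("corpus.txt")]  # hypothetical file

small = Word2Vec(sentences, vector_size=100, window=2, min_count=5, sg=1)   # functional/syntactic neighbors
large = Word2Vec(sentences, vector_size=100, window=15, min_count=5, sg=1)  # topical neighbors

print("window=2 :", small.wv.most_similar("bank", topn=5))
print("window=15:", large.wv.most_similar("bank", topn=5))
```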

Retinopathy of prematurity (ROP) is a severe condition affecting premature infants, leading to abnormal retinal blood vessel growth, retinal detachment, and potential blindness. While semi-automated systems have been used in the past to diagnose ROP-related plus disease by quantifying retinal vessel features, traditional machine learning (ML) models face challenges such as limited accuracy and overfitting. Recent advancements in deep learning (DL), especially convolutional neural networks (CNNs), have significantly improved ROP detection and classification. The i-ROP deep learning (i-ROP-DL) system also shows promise in detecting plus disease, offering the potential for reliable ROP diagnosis. This research comprehensively examines the contemporary progress and challenges associated with using retinal imaging and artificial intelligence (AI) to detect ROP, offering valuable insights that can guide further investigation in this domain. Based on 89 original studies in this field (out of 1,487 studies that were comprehensively reviewed), we conclude that traditional methods for ROP diagnosis suffer from subjectivity and manual analysis, leading to inconsistent clinical decisions. AI holds great promise for improving ROP management. This review explores AI's potential in ROP detection, classification, diagnosis, and prognosis.

Untargeted metabolomic profiling through liquid chromatography-mass spectrometry (LC-MS) measures a vast array of metabolites within biospecimens, advancing drug development, disease diagnosis, and risk prediction. However, the low throughput of LC-MS poses a major challenge for biomarker discovery, annotation, and experimental comparison, necessitating the merging of multiple datasets. Current data pooling methods encounter practical limitations due to their vulnerability to data variations and hyperparameter dependence. Here we introduce GromovMatcher, a flexible and user-friendly algorithm that automatically combines LC-MS datasets using optimal transport. By capitalizing on feature intensity correlation structures, GromovMatcher delivers superior alignment accuracy and robustness compared to existing approaches. The algorithm scales to thousands of features while requiring minimal hyperparameter tuning. Applying our method to experimental patient studies of liver and pancreatic cancer, we discover shared metabolic features related to patient alcohol intake, demonstrating how GromovMatcher facilitates the search for biomarkers associated with lifestyle risk factors linked to several cancer types.
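The sketch below illustrates the underlying idea of matching features across two LC-MS studies via Gromov-Wasserstein optimal transport on their feature correlation structures, using the POT library; it is an assumption-laden toy example with random data, not the GromovMatcher implementation itself.

```python
# Match features of two datasets by aligning their correlation structures.
import numpy as np
import ot  # POT: Python Optimal Transport

# Placeholder data: samples x features for two hypothetical studies.
rng = np.random.default_rng(0)
X1, X2 = rng.normal(size=(80, 40)), rng.normal(size=(60, 50))

# Within-dataset feature correlation structures act as the two metric spaces.
C1 = 1.0 - np.abs(np.corrcoef(X1, rowvar=False))
C2 = 1.0 - np.abs(np.corrcoef(X2, rowvar=False))

p = np.ones(C1.shape[0]) / C1.shape[0]   # uniform weights over features
q = np.ones(C2.shape[0]) / C2.shape[0]

# Coupling matrix: entry (i, j) is the mass matching feature i of study 1 to feature j of study 2.
coupling = ot.gromov.gromov_wasserstein(C1, C2, p, q, loss_fun="square_loss")
matches = coupling.argmax(axis=1)        # crude hard assignment for inspection
print(matches[:10])
```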

Following initial work by JaJa and Ahlswede/Cai, and inspired by a recent renewed surge in interest in deterministic identification via noisy channels, we consider the problem in its generality for memoryless channels with finite output, but arbitrary input alphabets. Such a channel is essentially given by (the closure of) the subset of its output distributions in the probability simplex. Our main findings are that the maximum number of messages thus identifiable scales super-exponentially as $2^{R\,n\log n}$ with the block length $n$, and that the optimal rate $R$ is upper and lower bounded in terms of the covering (aka Minkowski, or Kolmogorov, or entropy) dimension $d$ of the output set: $\frac14 d \leq R \leq d$. Leading up to the general case, we treat the important special case of the so-called Bernoulli channel with input alphabet $[0;1]$ and binary output, which has $d=1$, to gain intuition. Along the way, we show a certain Hypothesis Testing Lemma (generalising an earlier insight of Ahlswede regarding the intersection of typical sets) that implies that for the construction of a deterministic identification code, it is sufficient to ensure pairwise reliable distinguishability of the output distributions. These results are then shown to generalise directly to classical-quantum channels with finite-dimensional output quantum system (but arbitrary input alphabet), and in particular to quantum channels on finite-dimensional quantum systems under the constraint that the identification code can only use tensor product inputs.
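For readers unfamiliar with the covering dimension appearing in these bounds, the following is the standard box-counting (Minkowski) definition; the notation for the output set is an assumption made here for exposition, not taken from the paper.

```latex
% Box-counting (Minkowski/covering) dimension of the output set
% \mathcal{W} of channel output distributions in the probability simplex:
\[
  d \;=\; \dim_M(\mathcal{W}) \;=\; \lim_{\delta \to 0}
    \frac{\log N(\mathcal{W}, \delta)}{\log (1/\delta)},
\]
% where N(\mathcal{W}, \delta) is the minimum number of \delta-balls needed to
% cover \mathcal{W} (when the limit exists). In terms of this d, the abstract
% above states 2^{R\, n \log n} scaling with \frac14 d \le R \le d.
```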

Mesh-based Graph Neural Networks (GNNs) have recently shown the ability to simulate complex multiphysics problems with accelerated performance. However, mesh-based GNNs require a large number of message-passing (MP) steps and suffer from over-smoothing for problems involving very fine meshes. In this work, we develop a multiscale mesh-based GNN framework that mimics a conventional iterative multigrid solver, coupled with adaptive mesh refinement (AMR), to mitigate these challenges. We use the framework to accelerate phase field (PF) fracture problems involving coupled partial differential equations with a near-singular operator due to the near-zero modulus inside the crack. We define the initial graph representation using all mesh resolution levels. We perform a series of downsampling steps using Transformer MP GNNs to reach the coarsest graph, followed by upsampling steps to return to the original graph. We use skip connections from the embeddings generated during coarsening to prevent over-smoothing. We use transfer learning (TL) to significantly reduce the size of the training datasets needed to simulate different crack configurations and loading conditions. The trained framework showed accelerated simulation times while maintaining high accuracy in all cases compared to the physics-based PF fracture model. Finally, this work provides a new approach for accelerating a variety of mesh-based engineering multiphysics problems.
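A minimal sketch of the one-level encode-coarsen-process-refine pattern with a skip connection, in the spirit of the multiscale framework described above; the layer sizes, the TopK pooling operator, and the overall wiring are illustrative assumptions rather than the paper's architecture.

```python
# Two-scale GNN with Transformer message passing, pooling, and a skip connection.
import torch
from torch_geometric.nn import TransformerConv, TopKPooling

class TwoScaleGNN(torch.nn.Module):
    def __init__(self, in_dim, hidden=64, out_dim=1):
        super().__init__()
        self.encode = torch.nn.Linear(in_dim, hidden)
        self.fine_mp = TransformerConv(hidden, hidden)     # message passing on the fine graph
        self.pool = TopKPooling(hidden, ratio=0.25)        # downsample to a coarser graph
        self.coarse_mp = TransformerConv(hidden, hidden)   # message passing on the coarse graph
        self.decode = torch.nn.Linear(hidden, out_dim)

    def forward(self, x, edge_index):
        h = torch.relu(self.encode(x))
        h = torch.relu(self.fine_mp(h, edge_index))
        skip = h                                            # keep the fine-level embedding
        hc, ec, _, _, perm, _ = self.pool(h, edge_index)    # coarsen
        hc = torch.relu(self.coarse_mp(hc, ec))
        up = torch.zeros_like(h)                            # upsample back to the fine graph
        up[perm] = hc
        h = up + skip                                       # skip connection against over-smoothing
        return self.decode(h)

# Toy usage: 10 nodes on a ring, 3 input features per node.
x = torch.randn(10, 3)
edge_index = torch.tensor([[i for i in range(10)], [(i + 1) % 10 for i in range(10)]])
print(TwoScaleGNN(in_dim=3)(x, edge_index).shape)  # torch.Size([10, 1])
```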

Several popular language models represent local contexts in an input text x as bags of words. Such representations are naturally encoded by a sequence graph whose vertices are the distinct words occurring in x, with edges representing the (ordered) co-occurrence of two words within a sliding window of size w. However, this compressed representation is not generally bijective and may introduce some degree of ambiguity: some sequence graphs admit several realizations as a sequence, while others admit none. In this paper, we study the realizability and ambiguity of sequence graphs from a combinatorial and computational point of view. We consider the existence and enumeration of realizations of a sequence graph under multiple settings: window size w, presence/absence of graph orientation, and presence/absence of weights (multiplicities). When w = 2, we provide polynomial-time algorithms for realizability and enumeration in all cases except the undirected/weighted setting, where we show the #P-hardness of enumeration. For a window of size at least 3, we prove hardness of all variants, even when w is considered a constant, with the notable exception of the undirected/unweighted case, for which we propose an XP algorithm for both the realizability and enumeration problems, tight due to a corresponding W[1]-hardness result. We conclude with an integer programming formulation to solve the realizability problem, and with dynamic programming to solve the enumeration problem. This work leaves open the membership in NP of both problems, a non-trivial question due to the existence of minimum realizations having exponential size in the instance encoding.
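A minimal sketch of the sequence-graph construction described above (directed, weighted, window size w); the use of networkx and the toy sentence are illustrative choices, not the paper's code.

```python
# Build a directed, weighted sequence graph from a token sequence.
import networkx as nx

def sequence_graph(tokens, w=2, directed=True):
    G = nx.DiGraph() if directed else nx.Graph()
    G.add_nodes_from(set(tokens))
    for i, u in enumerate(tokens):
        for j in range(i + 1, min(i + w, len(tokens))):  # pairs within the sliding window
            v = tokens[j]
            if G.has_edge(u, v):
                G[u][v]["weight"] += 1                   # multiplicity of the co-occurrence
            else:
                G.add_edge(u, v, weight=1)
    return G

G = sequence_graph("the cat sat on the mat".split(), w=3)
print(sorted(G.edges(data="weight")))
```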

Hazard ratios are frequently reported in time-to-event and epidemiological studies to assess treatment effects. In observational studies, combining propensity score weights with the Cox proportional hazards model facilitates the estimation of the marginal hazard ratio (MHR). The methods for estimating the MHR are analogous to those employed for estimating common causal parameters, such as the average treatment effect. However, MHR estimation in the context of high-dimensional data remains unexplored. This paper seeks to address this gap through a simulation study that considers variable selection methods from causal inference combined with a recently proposed multiply robust approach for MHR estimation. Additionally, a case study utilizing stroke register data is conducted to demonstrate the application of these methods. The results of the simulation study indicate that the double selection method for covariates is preferable to several other strategies when estimating the MHR. Nevertheless, the estimate can be further improved by applying the multiply robust approach to the set of propensity score models obtained during the double selection process.
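For orientation, the snippet below is a minimal sketch of the basic MHR estimator that the paper builds on: inverse probability of treatment weights from a propensity model, plugged into a weighted Cox fit. The column names, the logistic propensity model, and the data file are illustrative; this is not the multiply robust or double selection procedure itself.

```python
# Marginal hazard ratio via stabilized IPT weights + weighted Cox model.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from lifelines import CoxPHFitter

df = pd.read_csv("cohort.csv")            # hypothetical columns: time, event, treated, x1..xp
covariates = [c for c in df.columns if c.startswith("x")]

# 1. Propensity scores and stabilized inverse-probability-of-treatment weights.
ps = LogisticRegression(max_iter=1000).fit(df[covariates], df["treated"]).predict_proba(df[covariates])[:, 1]
p_treat = df["treated"].mean()
df["w"] = df["treated"] * p_treat / ps + (1 - df["treated"]) * (1 - p_treat) / (1 - ps)

# 2. Weighted Cox model with treatment as the only regressor -> marginal HR.
cph = CoxPHFitter()
cph.fit(df[["time", "event", "treated", "w"]], duration_col="time",
        event_col="event", weights_col="w", robust=True)
print(cph.hazard_ratios_["treated"])
```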

This paper discusses the foundations of methods for accurately capturing interaction effects. Among the existing methods that capture interaction effects as terms, partial dependence (PD) and accumulated local effects (ALE) are well-known global model-agnostic methods in the interpretable machine learning (IML) field. Of the two, ALE can theoretically provide a functional decomposition of the prediction function, and this study focuses on functional decomposition. Specifically, we mathematically formalize the requirements that must always be met by a decomposition (interaction decomposition, hereafter ID) that splits the prediction function into main-effect and interaction-effect terms. We also present a theorem on how to produce a decomposition that meets these requirements. Furthermore, we confirm that while ALE is an ID, PD is not, and we present examples of decompositions that satisfy the ID requirements using methods other than the existing ones (i.e., new methods).
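To make the PD/ALE contrast concrete, here is a minimal sketch of the textbook one-feature estimators for a fitted model with a `predict` method: PD averages predictions with the feature forced to grid values (and so can evaluate the model off the data manifold), while first-order ALE accumulates average local prediction differences within feature bins. These are generic estimators, not the paper's interaction decomposition.

```python
# Textbook 1D partial dependence and first-order ALE.
import numpy as np
import pandas as pd

def partial_dependence_1d(model, X, feature, grid):
    """PD: average prediction with the feature forced to each grid value."""
    values = []
    for g in grid:
        Xg = X.copy()
        Xg[feature] = g
        values.append(model.predict(Xg).mean())
    return np.array(values)

def ale_1d(model, X, feature, bins=20):
    """ALE: accumulate average local prediction differences across feature bins."""
    x = X[feature].to_numpy()
    edges = np.unique(np.quantile(x, np.linspace(0, 1, bins + 1)))
    idx = np.clip(np.digitize(x, edges[1:-1]), 0, len(edges) - 2)
    effects = np.zeros(len(edges) - 1)
    for b in range(len(edges) - 1):
        mask = idx == b
        if mask.any():
            lo, hi = X[mask].copy(), X[mask].copy()
            lo[feature], hi[feature] = edges[b], edges[b + 1]
            effects[b] = (model.predict(hi) - model.predict(lo)).mean()
    ale = np.concatenate([[0.0], np.cumsum(effects)])
    return edges, ale - ale.mean()    # centered so the main effect averages to ~0
```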

In exploratory factor analysis, model parameters are usually estimated by the maximum likelihood method. The maximum likelihood estimate is obtained by solving a complicated multivariate algebraic equation. Since the solution to the equation is usually intractable, it is typically computed with continuous optimization methods, such as the Newton-Raphson method. With this procedure, however, the solution inevitably depends on the estimation algorithm and the initial value, since the log-likelihood function is highly non-concave. In particular, the estimates of the unique variances can turn out to be zero or negative, referred to as improper solutions; in this case, the maximum likelihood estimate can be severely unstable. To delve into the issue of the instability of the maximum likelihood estimate, we compute exact solutions to the multivariate algebraic equation using algebraic computation. We provide a computationally efficient algorithm based on algebraic computation, specifically optimized for maximum likelihood factor analysis. To be specific, Gr\"obner bases and cylindrical algebraic decomposition, powerful tools for solving multivariate algebraic equations, are employed. Our proposed procedure produces all exact solutions to the algebraic equation; therefore, these solutions are independent of the initial value and the estimation algorithm. We conduct Monte Carlo simulations to investigate the characteristics of the maximum likelihood solutions.
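As a minimal sketch of the algebraic machinery mentioned above, the snippet below computes a Gröbner basis and all exact solutions of a small polynomial system with SymPy; the toy system is an illustrative stand-in for the (much larger) likelihood equations of factor analysis.

```python
# Gröbner basis and exact solutions of a toy polynomial system with SymPy.
from sympy import symbols, groebner, solve_poly_system, Rational

x, y = symbols("x y")
eqs = [x**2 + y**2 - 1, x*y - Rational(1, 4)]   # toy stationarity-style equations

G = groebner(eqs, x, y, order="lex")            # lex basis triangularizes the system
print(G)

solutions = solve_poly_system(eqs, x, y)        # every exact (algebraic) solution
for s in solutions:
    print(s)
```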

The integration of large language models (LLMs) into the medical field has gained significant attention due to their promising accuracy in simulated clinical decision-making settings. However, real clinical decision-making is more complex than such simulations, because physicians' decisions are shaped by many factors, including cognitive bias. The degree to which LLMs are susceptible to the same cognitive biases that affect human clinicians remains unexplored. Our hypothesis posits that when LLMs are confronted with clinical questions containing cognitive biases, they will yield significantly less accurate responses than to the same questions presented without such biases. In this study, we developed BiasMedQA, a novel benchmark for evaluating cognitive biases in LLMs applied to medical tasks. Using BiasMedQA, we evaluated six LLMs: GPT-4, Mixtral-8x7B, GPT-3.5, PaLM-2, Llama 2 70B-chat, and the medically specialized PMC Llama 13B. We tested these models on 1,273 questions from the US Medical Licensing Exam (USMLE) Steps 1, 2, and 3, modified to replicate common clinically relevant cognitive biases. Our analysis revealed varying effects of these biases across the LLMs, with GPT-4 standing out for its resilience to bias, in contrast to Llama 2 70B-chat and PMC Llama 13B, which were disproportionately affected. Our findings highlight the critical need for bias mitigation in the development of medical LLMs, pointing towards safer and more reliable applications in healthcare.

Hashing has been widely used in approximate nearest neighbor search for large-scale database retrieval because of its computational and storage efficiency. Deep hashing, which devises convolutional neural network architectures to exploit and extract the semantic information or features of images, has received increasing attention recently. In this survey, several deep supervised hashing methods for image retrieval are evaluated, and I identify three main directions for deep supervised hashing methods; several comments are made at the end. Moreover, to break through the bottleneck of existing hashing methods, I propose a Shadow Recurrent Hashing (SRH) method as an attempt. Specifically, I devise a CNN architecture to extract the semantic features of images and design a loss function that encourages similar images to be projected close to each other. To this end, I propose a concept: the shadow of the CNN output. During the optimization process, the CNN output and its shadow guide each other so as to approach the optimal solution as closely as possible. Several experiments on the CIFAR-10 dataset show the satisfying performance of SRH.
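For context, the sketch below shows a generic pairwise similarity-preserving deep hashing setup in PyTorch: a small CNN maps images to relaxed binary codes, and a contrastive-style loss pulls codes of same-label pairs together while pushing different-label pairs apart. This is a common baseline formulation under assumed layer sizes and hash length, not the SRH method itself.

```python
# Generic pairwise similarity-preserving deep hashing baseline.
import torch
import torch.nn as nn

class DeepHashNet(nn.Module):
    def __init__(self, n_bits=48):
        super().__init__()
        self.backbone = nn.Sequential(                 # stand-in CNN feature extractor
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten()
        )
        self.hash_layer = nn.Linear(32 * 4 * 4, n_bits)

    def forward(self, x):
        return torch.tanh(self.hash_layer(self.backbone(x)))   # relaxed codes in (-1, 1)

def pairwise_hash_loss(codes, labels, margin=2.0):
    """Pull same-label code pairs together, push different-label pairs apart."""
    dist = torch.cdist(codes, codes)                            # Euclidean distances between codes
    same = (labels[:, None] == labels[None, :]).float()
    loss = same * dist.pow(2) + (1 - same) * (margin - dist).clamp(min=0).pow(2)
    return loss.mean()

# Toy usage: one batch of CIFAR-10-sized images with random labels.
images, labels = torch.randn(16, 3, 32, 32), torch.randint(0, 10, (16,))
model = DeepHashNet()
print(pairwise_hash_loss(model(images), labels))
```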
