爱琴海论坛视频播放三免费,动漫AV观看网站不卡无码

Kamalkumar Rathinasamy,Jayarama Nettar,Amit Kumar,Vishal Manchanda,Arun Vijayakumar,Ayush Kataria,Venkateshprasanna Manjunath,Chidambaram GS,Jaskirat Singh Sodhi,Shoeb Shaikh,Wasim Akhtar Khan,Prashant Singh,Tanishq Dattatray Ige,Vipin Tiwari,Rajab Ali Mondal,Harshini K,S Reka,Chetana Amancharla,Faiz ur Rahman,Harikrishnan P A,Indraneel Saha,Bhavya Tiwary,Navin Shankar Patel,Pradeep T S,Balaji A J, Priyapravas,Mohammed Rafee Tarafdar

Enterprises grapple with the significant challenge of managing proprietary unstructured data, hindering efficient information retrieval. This has led to the emergence of AI-driven information retrieval solutions, designed to adeptly extract relevant insights to address employee inquiries. These solutions often leverage pre-trained embedding models and generative models as foundational components. While pre-trained embeddings may exhibit proximity or disparity based on their original training objectives, they might not fully align with the unique characteristics of enterprise-specific data, leading to suboptimal alignment with the retrieval goals of enterprise environments. In this paper, we propose a comprehensive methodology for contextualizing pre-trained embedding models to enterprise environments, covering the entire process from data preparation to model fine-tuning and evaluation. By adapting the embeddings to better suit the retrieval tasks prevalent in enterprises, we aim to enhance the performance of information retrieval solutions. We discuss the process of fine-tuning, its effect on retrieval accuracy, and the potential benefits for enterprise information management. Our findings demonstrate the efficacy of fine-tuned embedding models in improving the precision and relevance of search results in enterprise settings.

相關內容

INFORMS

關注 10

《計算機信息》雜志發表高質量的論文，擴大了運籌學和計算的范圍，尋求有關理論、方法、實驗、系統和應用方面的原創研究論文、新穎的調查和教程論文，以及描述新的和有用的軟件工具的論文。官網鏈接： · Networking · 圖 · Extensibility · 變換 ·

2024 年 11 月 6 日

Topology-guided Hypergraph Transformer Network: Unveiling Structural Insights for Improved Representation

Khaled Mohammed Saifuddin,Mehmet Emin Aktas,Esra Akbas

from arxiv, 10 pages, 3 figures

Hypergraphs, with their capacity to depict high-order relationships, have emerged as a significant extension of traditional graphs. Although Graph Neural Networks (GNNs) have remarkable performance in graph representation learning, their extension to hypergraphs encounters challenges due to their intricate structures. Furthermore, current hypergraph transformers, a special variant of GNN, utilize semantic feature-based self-attention, ignoring topological attributes of nodes and hyperedges. To address these challenges, we propose a Topology-guided Hypergraph Transformer Network (THTN). In this model, we first formulate a hypergraph from a graph while retaining its structural essence to learn higher-order relations within the graph. Then, we design a simple yet effective structural and spatial encoding module to incorporate the topological and spatial information of the nodes into their representation. Further, we present a structure-aware self-attention mechanism that discovers the important nodes and hyperedges from both semantic and structural viewpoints. By leveraging these two modules, THTN crafts an improved node representation, capturing both local and global topological expressions. Extensive experiments conducted on node classification tasks demonstrate that the performance of the proposed model consistently exceeds that of the existing approaches.

穩健性 · 優化器 · MoDELS · Processing（編程語言） · Prompt ·

2024 年 11 月 6 日

ROBIN: Robust and Invisible Watermarks for Diffusion Models with Adversarial Optimization

Huayang Huang,Yu Wu,Qian Wang

from arxiv, Accept to NeurIPS 2024

Watermarking generative content serves as a vital tool for authentication, ownership protection, and mitigation of potential misuse. Existing watermarking methods face the challenge of balancing robustness and concealment. They empirically inject a watermark that is both invisible and robust and passively achieve concealment by limiting the strength of the watermark, thus reducing the robustness. In this paper, we propose to explicitly introduce a watermark hiding process to actively achieve concealment, thus allowing the embedding of stronger watermarks. To be specific, we implant a robust watermark in an intermediate diffusion state and then guide the model to hide the watermark in the final generated image. We employ an adversarial optimization algorithm to produce the optimal hiding prompt guiding signal for each watermark. The prompt embedding is optimized to minimize artifacts in the generated image, while the watermark is optimized to achieve maximum strength. The watermark can be verified by reversing the generation process. Experiments on various diffusion models demonstrate the watermark remains verifiable even under significant image tampering and shows superior invisibility compared to other state-of-the-art robust watermarking methods.

contrastive · 表示 · Performer · Subspace · MoDELS ·

2024 年 11 月 6 日

ShifCon: Enhancing Non-Dominant Language Capabilities with a Shift-based Contrastive Framework

Hengyuan Zhang,Chenming Shang,Sizhe Wang,Dongdong Zhang,Renliang Sun,Yiyao Yu,Yujiu Yang,Furu Wei

from arxiv, 23 pages, 11 figures

Although fine-tuning Large Language Models (LLMs) with multilingual data can rapidly enhance the multilingual capabilities of LLMs, they still exhibit a performance gap between the dominant language (e.g., English) and non-dominant ones due to the imbalance of training data across languages. To further enhance the performance of non-dominant languages, we propose ShifCon, a Shift-based Contrastive framework that aligns the internal forward process of other languages toward that of the dominant one. Specifically, it shifts the representations of non-dominant languages into the dominant language subspace, allowing them to access relatively rich information encoded in the model parameters. The enriched representations are then shifted back into their original language subspace before generation. Moreover, we introduce a subspace distance metric to pinpoint the optimal layer area for shifting representations and employ multilingual contrastive learning to further enhance the alignment of representations within this area. Experiments demonstrate that our ShifCon framework significantly enhances the performance of non-dominant languages, particularly for low-resource ones. Further analysis offers extra insights to verify the effectiveness of ShifCon and propel future research

Networking · 控制器 · UNet · 設計 · 穩健性 ·

2024 年 11 月 5 日

UNet: A Generic and Reliable Multi-UAV Communication and Networking Architecture for Heterogeneous Applications

Sanku Kumar Roy,Mohamed Samshad,Ketan Rajawat

from arxiv, 11 pages, 20 figures, Journal paper

The rapid growth of UAV applications necessitates a robust communication and networking architecture capable of addressing the diverse requirements of various applications concurrently, rather than relying on application-specific solutions. This paper proposes a generic and reliable multi-UAV communication and networking architecture designed to support the varying demands of heterogeneous applications, including short-range and long-range communication, star and mesh topologies, different data rates, and multiple wireless standards. Our architecture accommodates both adhoc and infrastructure networks, ensuring seamless connectivity throughout the network. Additionally, we present the design of a multi-protocol UAV gateway that enables interoperability among various communication protocols. Furthermore, we introduce a data processing and service layer framework with a graphical user interface of a ground control station that facilitates remote control and monitoring from any location at any time. We practically implemented the proposed architecture and evaluated its performance using different metrics, demonstrating its effectiveness.

穩健性 · 統計量 · Python · TOOLS · Analysis ·

2024 年 11 月 4 日

RobPy: a Python Package for Robust Statistical Methods

Sarah Leyder,Jakob Raymaekers,Peter J. Rousseeuw,Thomas Servotte,Tim Verdonck

Robust estimation provides essential tools for analyzing data that contain outliers, ensuring that statistical models remain reliable even in the presence of some anomalous data. While robust methods have long been available in R, users of Python have lacked a comprehensive package that offers these methods in a cohesive framework. RobPy addresses this gap by offering a wide range of robust methods in Python, built upon established libraries including NumPy, SciPy, and scikit-learn. This package includes tools for robust preprocessing, univariate estimation, covariance matrices, regression, and principal component analysis, which are able to detect outliers and to mitigate their effect. In addition, RobPy provides specialized diagnostic plots for visualizing casewise and cellwise outliers. This paper presents the structure of the RobPy package, demonstrates its functionality through examples, and compares its features to existing implementations in other statistical software. By bringing robust methods to Python, RobPy enables more users to perform robust data analysis in a modern and versatile programming language.

GPU · 優化器 · 匯聚 · 推斷 · Machine Learning ·

2024 年 11 月 4 日

FaaSTube: Optimizing GPU-oriented Data Transfer for Serverless Computing

Hao Wu,Junxiao Deng,Minchen Yu,Yue Yu,Yaochen Liu,Hao Fan,Song Wu,Wei Wang

Serverless computing has gained significant traction for machine learning inference applications, which are often deployed as serverless workflows consisting of multiple CPU and GPU functions with data dependency. However, existing data-passing solutions for serverless computing primarily reply on host memory for fast data transfer, mandating substantial data movement and resulting in salient I/O overhead. In this paper, we present FaaSTube, a GPU-efficient data passing system for serverless inference. FaaSTube manages intermediate data within a GPU memory pool to facilitate direct data exchange between GPU functions. It enables fine-grained bandwidth sharing over PCIe and NVLink, minimizing data-passing latency for both host-to-GPU and GPU-to-GPU while providing performance isolation between functions. Additionally, FaaSTube implements an elastic GPU memory pool that dynamically scales to accommodate varying data-passing demands. Evaluations on real-world applications show that FaaSTube reduces end-to-end latency by up to 90\% and achieves up to 12x higher throughput compared to the state-of-the-art.

邊緣化 · MoDELS · 推斷 · Learning · 數據集 ·

2024 年 11 月 2 日

Marginal Causal Flows for Validation and Inference

Daniel de Vassimon Manela,Laura Battaglia,Robin J. Evans

from arxiv, 23 pages, 10 figures, Accepted as a Poster at NeurIPS 2024

Investigating the marginal causal effect of an intervention on an outcome from complex data remains challenging due to the inflexibility of employed models and the lack of complexity in causal benchmark datasets, which often fail to reproduce intricate real-world data patterns. In this paper we introduce Frugal Flows, a novel likelihood-based machine learning model that uses normalising flows to flexibly learn the data-generating process, while also directly inferring the marginal causal quantities from observational data. We propose that these models are exceptionally well suited for generating synthetic data to validate causal methods. They can create synthetic datasets that closely resemble the empirical dataset, while automatically and exactly satisfying a user-defined average treatment effect. To our knowledge, Frugal Flows are the first generative model to both learn flexible data representations and also exactly parameterise quantities such as the average treatment effect and the degree of unobserved confounding. We demonstrate the above with experiments on both simulated and real-world datasets.

統計量 · 假設檢驗 · Integration · Reminders · 可辨認的 ·

2024 年 11 月 1 日

StatWhy: Formal Verification Tool for Statistical Hypothesis Testing Programs

Yusuke Kawamoto,Kentaro Kobayashi,Kohei Suenaga

Statistical methods have been widely misused and misinterpreted in various scientific fields, raising significant concerns about the integrity of scientific research. To mitigate this problem, we propose a new method for formally specifying and automatically verifying the correctness of statistical programs. In this method, programmers are required to annotate the source code of the statistical programs with the requirements for these methods. Through this annotation, they are reminded to check the requirements for statistical methods, including those that cannot be formally verified, such as the distribution of the unknown true population. Our software tool StatWhy automatically checks whether programmers have properly specified the requirements for the statistical methods, thereby identifying any missing requirements that need to be addressed. This tool is implemented using the Why3 platform to verify the correctness of OCaml programs that conduct statistical hypothesis testing. We demonstrate how StatWhy can be used to avoid common errors in various popular statistical hypothesis testing programs.

entity · Performer · 圖 · 知識圖譜 · MoDELS ·

2019 年 6 月 4 日

Learning Attention-based Embeddings for Relation Prediction in Knowledge Graphs

Deepak Nathani,Jatin Chauhan,Charu Sharma,Manohar Kaul

from arxiv, accepted as long paper in ACL 2019

The recent proliferation of knowledge graphs (KGs) coupled with incomplete or partial information, in the form of missing relations (links) between entities, has fueled a lot of research on knowledge base completion (also known as relation prediction). Several recent works suggest that convolutional neural network (CNN) based models generate richer and more expressive feature embeddings and hence also perform well on relation prediction. However, we observe that these KG embeddings treat triples independently and thus fail to cover the complex and hidden information that is inherently implicit in the local neighborhood surrounding a triple. To this effect, our paper proposes a novel attention based feature embedding that captures both entity and relation features in any given entity's neighborhood. Additionally, we also encapsulate relation clusters and multihop relations in our model. Our empirical study offers insights into the efficacy of our attention based model and we show marked performance gains in comparison to state of the art methods on all datasets.

Performer · 判別器 · 正例 · 假陽性 · 監督 ·

2018 年 5 月 24 日

DSGAN: Generative Adversarial Training for Distant Supervision Relation Extraction

Pengda Qin,Weiran Xu,William Yang Wang

Distant supervision can effectively label data for relation extraction, but suffers from the noise labeling problem. Recent works mainly perform soft bag-level noise reduction strategies to find the relatively better samples in a sentence bag, which is suboptimal compared with making a hard decision of false positive samples in sentence level. In this paper, we introduce an adversarial learning framework, which we named DSGAN, to learn a sentence-level true-positive generator. Inspired by Generative Adversarial Networks, we regard the positive samples generated by the generator as the negative samples to train the discriminator. The optimal generator is obtained until the discrimination ability of the discriminator has the greatest decline. We adopt the generator to filter distant supervision training dataset and redistribute the false positive instances into the negative set, in which way to provide a cleaned dataset for relation classification. The experimental results show that the proposed strategy significantly improves the performance of distant supervision relation extraction comparing to state-of-the-art systems.