
The rising popularity of Large Language Models (LLMs) has motivated exploring their use in code-related tasks. Code LLMs with millions to billions of parameters are trained on massive amounts of code in different Programming Languages (PLs). Such models are used for automating various Software Engineering (SE) tasks through prompt engineering. However, given the very large size of industry-scale project files, a major issue with these LLMs is their limited context window size, raising the question: "Can these LLMs process very large files, and can we effectively perform prompt engineering on them?" Code translation aims to convert source code from one PL to another. In this work, we assess the effect of method-level program decomposition on the context window of LLMs and investigate how this approach can enable the translation of very large files which otherwise could not be translated due to exceeding the context window. Our observations from 20 well-known Java projects and approximately 60K methods suggest that method-level program decomposition alleviates the limited context window problem of LLMs by 99.5%. Furthermore, our empirical analysis indicates that with method-level decomposition, each input fragment on average consumes only 5% of the context window, leaving more context space for prompt engineering and the output. Finally, we investigate the effectiveness of a Call Graph (CG) approach for translating very large files when performing method-level program decomposition.
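
To make the decomposition step concrete, the sketch below splits a Java source file into method-level fragments and estimates how much of a context window each fragment would consume. It is a minimal illustration, not the paper's pipeline: the brace-matching splitter, the whitespace token count (a crude proxy for a real tokenizer), and the 8K-token window size are all assumptions.

```python
# Minimal sketch of method-level decomposition with context-window
# accounting. Assumptions: whitespace tokens approximate model tokens,
# and the context window is 8,192 tokens (hypothetical).
import re

CONTEXT_WINDOW = 8192  # hypothetical context size

def split_into_methods(java_source: str) -> list[str]:
    """Rough method-level split via brace matching; a real pipeline
    would use a proper Java parser instead of this heuristic."""
    methods, depth, start = [], 0, None
    for i, ch in enumerate(java_source):
        if ch == '{':
            if depth == 1 and start is None:      # entering a method body
                start = java_source.rfind('\n', 0, i) + 1
            depth += 1
        elif ch == '}':
            depth -= 1
            if depth == 1 and start is not None:  # method body closed
                methods.append(java_source[start:i + 1])
                start = None
    return methods

def context_usage(fragment: str) -> float:
    tokens = len(re.findall(r"\S+", fragment))    # crude token proxy
    return tokens / CONTEXT_WINDOW

source = """
public class Example {
    int add(int a, int b) { return a + b; }
    int mul(int a, int b) { return a * b; }
}
"""
for m in split_into_methods(source):
    print(f"{context_usage(m):.2%} of the context window")
```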

Related Content

Community Question Answering (CQA) in different domains is growing at a large scale because of the availability of several platforms and the huge amount of shareable information among users. With the rapid growth of such online platforms, a massive amount of archived data makes it difficult for moderators to retrieve possible duplicates for a new question and to identify and confirm existing question pairs as duplicates at the right time. This problem is even more critical in CQAs corresponding to large software systems like askubuntu, where moderators need to be experts to recognize a post as a duplicate. The prime challenge on such CQA platforms is that the moderators are themselves experts, are therefore usually extremely busy, and their time is extraordinarily expensive. To facilitate the task of the moderators, in this work we tackle two significant issues for the askubuntu CQA platform: (1) retrieval of duplicate questions given a new question, and (2) prediction of duplicate question confirmation time. In the first task, we focus on retrieving duplicate questions from a question pool for a particular newly posted question. In the second task, we solve a regression problem to rank pairs of questions that could potentially take a long time to get confirmed as duplicates. For duplicate question retrieval, we propose a Siamese neural network based approach exploiting both text and network-based features, which outperforms several state-of-the-art baseline techniques. Our method outperforms DupPredictor and DUPE by 5% and 7%, respectively. For duplicate confirmation time prediction, we use both standard machine learning models and neural networks along with text and graph-based features. We obtain Spearman's rank correlations of 0.20 and 0.213 (statistically significant) for text and graph-based features, respectively.
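
As a rough illustration of the retrieval side, the sketch below implements a Siamese encoder over concatenated text and graph features with a standard contrastive pair loss. The layer sizes, feature dimensions, loss form, and margin are illustrative assumptions, not the authors' exact architecture.

```python
# Hedged sketch of a Siamese encoder for duplicate-question retrieval:
# one shared tower embeds both questions of a pair; a contrastive loss
# pulls duplicates together and pushes non-duplicates apart.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SiameseEncoder(nn.Module):
    def __init__(self, text_dim=300, graph_dim=32, hidden=128):
        super().__init__()
        # shared tower applied to both questions of a pair
        self.tower = nn.Sequential(
            nn.Linear(text_dim + graph_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
        )

    def forward(self, text_feats, graph_feats):
        return self.tower(torch.cat([text_feats, graph_feats], dim=-1))

def contrastive_loss(z1, z2, is_duplicate, margin=1.0):
    """Small distance for duplicate pairs, at least `margin` otherwise."""
    d = F.pairwise_distance(z1, z2)
    return (is_duplicate * d.pow(2)
            + (1 - is_duplicate) * F.relu(margin - d).pow(2)).mean()

model = SiameseEncoder()
t1, t2 = torch.randn(8, 300), torch.randn(8, 300)  # text features
g1, g2 = torch.randn(8, 32), torch.randn(8, 32)    # graph features
y = torch.randint(0, 2, (8,)).float()              # duplicate labels
loss = contrastive_loss(model(t1, g1), model(t2, g2), y)
loss.backward()
```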

In patent prosecution, timely and effective responses to Office Actions (OAs) are crucial for securing patents. However, past automation and artificial intelligence research has largely overlooked this aspect. To bridge this gap, our study introduces the Patent Office Action Response Intelligence System (PARIS) and its advanced version, the Large Language Model (LLM) Enhanced PARIS (LE-PARIS). These systems are designed to enhance the efficiency of patent attorneys in handling OA responses through collaboration with AI. The systems' key features include the construction of an OA Topics Database, the development of Response Templates, and the implementation of Recommender Systems and LLM-based Response Generation. To validate the effectiveness of the systems, we employ a multi-paradigm analysis using the USPTO Office Action database and longitudinal data on attorney interactions with our systems over six years. Through five studies, we examine the constructiveness of OA topics (studies 1 and 2) using topic modeling and our proposed Delphi process, the efficacy of our proposed hybrid LLM-based recommender system tailored for OA responses (study 3), the quality of generated responses (study 4), and the systems' practical value in real-world scenarios through user studies (study 5). The results indicate that both PARIS and LE-PARIS perform well on key metrics and have a positive impact on attorney performance.
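
For intuition about the hybrid recommendation step, here is a toy sketch that blends a TF-IDF content score between an OA text and candidate response templates with a popularity prior from past attorney selections. The template texts, usage counts, and blend weight `alpha` are hypothetical; this is not LE-PARIS's actual scoring function.

```python
# Toy hybrid recommender: content similarity blended with a
# popularity prior. All data and weights are illustrative.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
import numpy as np

templates = [
    "Argument traversing a 35 U.S.C. 103 obviousness rejection ...",
    "Claim amendment responding to a 35 U.S.C. 112(b) indefiniteness ...",
    "Response to a 35 U.S.C. 101 subject-matter eligibility rejection ...",
]
past_uses = np.array([40, 25, 10])  # hypothetical selection counts

def recommend(office_action_text, alpha=0.7):
    vec = TfidfVectorizer().fit(templates + [office_action_text])
    sims = cosine_similarity(
        vec.transform([office_action_text]), vec.transform(templates)
    ).ravel()
    popularity = past_uses / past_uses.sum()
    score = alpha * sims + (1 - alpha) * popularity  # hybrid blend
    return int(score.argmax())

idx = recommend("The claims are rejected under 35 U.S.C. 103 as obvious ...")
print(templates[idx])
```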

Quasi-2D Coulomb systems are of fundamental importance and have attracted much attention in many areas. Their reduced symmetry gives rise to interesting collective behaviors, but also brings great challenges for particle-based simulations. Here, we propose a novel algorithmic framework to address the $\mathcal O(N^2)$ simulation complexity associated with the long-range nature of Coulomb interactions. First, we introduce an efficient Sum-of-Exponentials (SOE) approximation for the long-range kernel associated with Ewald splitting, achieving uniform convergence in terms of inter-particle distance, which reduces the complexity to $\mathcal{O}(N^{7/5})$. We then introduce a random batch sampling method in the periodic dimensions; the stochastic approximation is proven to be unbiased and to have reduced variance via a tailored importance sampling strategy, further reducing the computational cost to $\mathcal{O}(N)$. The performance of our algorithm is demonstrated via various numerical examples. Notably, it achieves a speedup of $2\sim 3$ orders of magnitude compared with the Ewald2D method, enabling molecular dynamics (MD) simulations with up to $10^6$ particles on a single core. The present approach is therefore well-suited for large-scale particle-based simulations of Coulomb systems under confinement, making it possible to investigate the role of Coulomb interactions in many practical situations.
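
To fix ideas, a generic SOE approximation replaces a slowly decaying radial kernel by a short sum of exponentials; the kernel $\operatorname{erfc}(\alpha r)/r$, the number of terms $M$, the weights and exponents $w_l, s_l$, and the cutoff $r_{\min}$ below are illustrative notation introduced here, not taken from the paper:

$$\frac{\operatorname{erfc}(\alpha r)}{r} \;\approx\; \sum_{l=1}^{M} w_l\, e^{-s_l r}, \qquad \sup_{r \geq r_{\min}} \left| \frac{\operatorname{erfc}(\alpha r)}{r} - \sum_{l=1}^{M} w_l\, e^{-s_l r} \right| \leq \varepsilon.$$

The usual appeal of the exponential form is that $e^{-s_l |z_i - z_j|}$ satisfies a semigroup property along a sorted coordinate, so each term can be accumulated with a single linear sweep over the particles instead of a quadratic pairwise sum.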

Fine-tuning approaches in NLP often focus on exploitation rather than exploration, which may lead to suboptimal models. Given the vast search space of natural language, this limited exploration can restrict performance in complex, high-stakes domains, where accurate negation understanding and logical reasoning abilities are crucial. To address this issue, we leverage Reinforcement Learning from Logical Feedback (RLLF) to create an effective balance between exploration and exploitation in LLMs. Our approach employs an appropriate benchmark dataset for training and evaluation, highlighting the importance of exploration in enhancing negation understanding capabilities. We compare the performance of our RLLF-enhanced LLMs with baseline models trained without RLLF, demonstrating the value of this balanced approach. Furthermore, we showcase the potential of our method in legal AI applications by employing transfer learning and evaluating its impact on negation understanding. Our experimental results demonstrate the effectiveness of balancing exploration and exploitation with RLLF in improving LLMs' negation capabilities. This has implications for the development of more accurate, reliable, and logically consistent language models in high-stakes domains.
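
To illustrate what "logical feedback" can look like as a reward signal, here is a minimal sketch in which a trivial propositional check scores a model's truth judgement. The function name, the ±1 reward scale, and the XOR treatment of negation are our illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of a logic-derived RL reward: an external check stands
# in for a real logic engine and scores the model's answer.
def logical_reward(premise_holds: bool, negated: bool, model_says_true: bool) -> float:
    """Reward +1 when the model's truth judgement matches the logic,
    -1 otherwise; a negated statement flips the expected answer."""
    expected = premise_holds ^ negated   # XOR: negation inverts truth
    return 1.0 if model_says_true == expected else -1.0

# e.g. premise true, statement negated -> model should answer False
assert logical_reward(True, negated=True, model_says_true=False) == 1.0
```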

In a recent paper, Ling et al. investigated the over-parametrized Deep Equilibrium Model (DEQ) with ReLU activation. They proved that gradient descent converges to a globally optimal solution of the quadratic loss function at a linear convergence rate. This paper shows that this result still holds for DEQs with any general activation that is bounded and has bounded first and second derivatives. Since such an activation function is generally non-homogeneous, bounding the least eigenvalue of the Gram matrix of the equilibrium point is particularly challenging. To accomplish this task, we construct a novel population Gram matrix and develop a new form of dual activation with Hermite polynomial expansion.
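
For context, a standard DEQ formulation (written in our notation, which need not match the paper's) defines the prediction through an implicit fixed point:

$$z^\star(x) = \phi\bigl(W z^\star(x) + U x\bigr), \qquad f_\theta(x) = a^\top z^\star(x).$$

In analyses of this kind, gradient descent on the quadratic loss contracts schematically as $L(\theta_{t+1}) \leq (1 - \eta \lambda_0 / 2)\, L(\theta_t)$ provided the least eigenvalue of the Gram matrix at the equilibrium points stays bounded below by some $\lambda_0 > 0$; it is exactly this lower bound that becomes delicate when $\phi$ is non-homogeneous.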

Generative Artificial Intelligence (AI) models can propose solutions to scientific problems beyond human capability. To truly make conceptual contributions, researchers need to be capable of understanding the AI-generated structures and extracting the underlying concepts and ideas. When algorithms provide little explanatory reasoning alongside the output, scientists have to reverse-engineer the fundamental insights behind proposals based solely on examples. This task can be challenging, as the output is often highly complex and thus not immediately accessible to humans. In this work, we show how transferring part of the analysis process into an immersive Virtual Reality (VR) environment can assist researchers in developing an understanding of AI-generated solutions. We demonstrate the usefulness of VR in finding interpretable configurations of abstract graphs representing Quantum Optics experiments. In doing so, we manually discover new generalizations of AI discoveries as well as new understanding in experimental quantum optics. Furthermore, VR allows us to customize the search space in an informed way, as a human-in-the-loop, to achieve significantly faster subsequent discovery iterations. As concrete examples, with this technology we discover a new resource-efficient 3-dimensional entanglement swapping scheme as well as a 3-dimensional 4-particle Greenberger-Horne-Zeilinger-state analyzer. Our results show the potential of VR for increasing a human researcher's ability to derive knowledge from graph-based generative AI, graphs being a common abstract data representation used in diverse fields of science.

Graph Neural Networks (GNNs) are state-of-the-art models for performing prediction tasks on graphs. While existing GNNs have shown great performance on various tasks related to graphs, little attention has been paid to the scenario where out-of-distribution (OOD) nodes exist in the graph during training and inference. Borrowing the concept from CV and NLP, we define OOD nodes as nodes with labels unseen in the training set. Since many networks are automatically constructed by programs, real-world graphs are often noisy and may contain nodes from unknown distributions. In this work, we define the problem of graph learning with out-of-distribution nodes. Specifically, we aim to accomplish two tasks: 1) detect nodes which do not belong to the known distribution, and 2) classify the remaining nodes into one of the known classes. We demonstrate that the connection patterns in graphs are informative for outlier detection, and propose the Out-of-Distribution Graph Attention Network (OODGAT), a novel GNN model which explicitly models the interaction between different kinds of nodes and separates inliers from outliers during feature propagation. Extensive experiments show that OODGAT outperforms existing outlier detection methods by a large margin, while being better than or comparable to them in terms of in-distribution classification.
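
The gating idea can be illustrated with a simplified dense-adjacency layer: learn a per-node inlier score and damp messages across edges whose endpoints disagree on it. This is our reading of the mechanism for illustration, not the authors' exact OODGAT layer.

```python
# Simplified sketch: per-node inlier scores gate message passing so
# that inliers mostly exchange messages with inliers, and outliers
# with outliers, separating the two groups during propagation.
import torch
import torch.nn as nn

class GatedPropagation(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)
        self.inlier_head = nn.Linear(in_dim, 1)  # per-node inlier logit

    def forward(self, x, adj):
        s = torch.sigmoid(self.inlier_head(x))   # (N, 1) inlier score
        # edge weight is high only when both endpoints look alike:
        # both inliers (s_i * s_j) or both outliers ((1-s_i)(1-s_j))
        gate = s @ s.T + (1 - s) @ (1 - s).T     # (N, N)
        weights = adj * gate
        weights = weights / weights.sum(dim=1, keepdim=True).clamp(min=1e-9)
        return torch.relu(weights @ self.lin(x)), s.squeeze(-1)

layer = GatedPropagation(16, 32)
x, adj = torch.randn(5, 16), (torch.rand(5, 5) > 0.5).float()
h, inlier_scores = layer(x, adj)
```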

Contrastive loss has been increasingly used in learning representations from multiple modalities. In the limit, the nature of the contrastive loss encourages modalities to exactly match each other in the latent space. Yet it remains an open question how this modality alignment affects downstream task performance. In this paper, based on an information-theoretic argument, we first prove that exact modality alignment is sub-optimal in general for downstream prediction tasks. Hence we advocate that the key to better performance lies in meaningful latent modality structures rather than perfect modality alignment. To this end, we propose three general approaches to construct latent modality structures. Specifically, we design 1) a deep feature separation loss for intra-modality regularization; 2) a Brownian-bridge loss for inter-modality regularization; and 3) a geometric consistency loss for both intra- and inter-modality regularization. Extensive experiments are conducted on two popular multi-modal representation learning frameworks: the CLIP-based two-tower model and the ALBEF-based fusion model. We test our method on a variety of tasks, including zero/few-shot image classification, image-text retrieval, visual question answering, visual reasoning, and visual entailment. Our method achieves consistent improvements over existing methods, demonstrating the effectiveness and generalizability of our proposed approach to latent modality structure regularization.
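
As one concrete instance of these regularizers, the sketch below implements a feature separation loss of the kind named in 1): each embedding is split into a shared half and a modality-specific half, and cross-correlation between the two is penalized. The half-half split and the squared-cross-correlation form are illustrative assumptions, not necessarily the paper's exact loss.

```python
# Hedged sketch of a feature separation loss: push the shared and
# modality-specific parts of an embedding toward orthogonality.
import torch

def feature_separation_loss(shared: torch.Tensor, specific: torch.Tensor) -> torch.Tensor:
    """Penalize correlation between shared and modality-specific
    features (rows are a batch); zero when the subspaces decorrelate."""
    shared = shared - shared.mean(dim=0)
    specific = specific - specific.mean(dim=0)
    cross = shared.T @ specific / shared.shape[0]  # cross-correlation
    return (cross ** 2).sum()

z = torch.randn(32, 256)                           # batch of embeddings
loss = feature_separation_loss(z[:, :128], z[:, 128:])
```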

The deployment of Internet of Things (IoT) devices and Data Fusion techniques has gained popularity in public and government domains. This usually requires capturing and consolidating data from multiple sources. As datasets do not necessarily originate from identical sensors, fusing them is typically a complex data problem. Because the military is investigating how heterogeneous IoT devices can aid its processes and tasks, we investigate a multi-sensor approach. We propose a signal-to-image encoding approach that transforms and fuses signals from IoT wearable devices into an image that is invertible and easier to visualize, supporting decision making. Furthermore, we investigate the challenge of enabling intelligent identification and detection operations, and demonstrate the feasibility of the proposed Deep Learning and Anomaly Detection models in supporting future applications that utilize hand-gesture data from wearable devices.
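
A minimal invertible signal-to-image encoding in the spirit described above can be as simple as min-max normalization plus a 2-D fold, keeping the parameters needed to undo the mapping exactly. This reshape scheme is an illustration only, not the paper's encoder.

```python
# Minimal invertible signal-to-image encoding: normalize a 1-D sensor
# signal, fold it into a 2-D image, and retain inversion metadata.
import numpy as np

def encode(signal: np.ndarray, width: int):
    lo, hi = signal.min(), signal.max()
    norm = (signal - lo) / (hi - lo)      # map values to [0, 1]
    pad = (-len(norm)) % width            # pad to a full rectangle
    padded = np.pad(norm, (0, pad))
    image = padded.reshape(-1, width)     # fold into 2-D
    return image, (lo, hi, len(signal))   # keep inversion metadata

def decode(image, meta):
    lo, hi, n = meta
    flat = image.reshape(-1)[:n]          # drop padding
    return flat * (hi - lo) + lo          # undo normalization

sig = np.sin(np.linspace(0, 8 * np.pi, 100))
img, meta = encode(sig, width=16)
assert np.allclose(decode(img, meta), sig)  # round-trip is exact
```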

Deep Learning has implemented a wide range of applications and has become increasingly popular in recent years. The goal of multimodal deep learning is to create models that can process and link information using various modalities. Despite the extensive development made for unimodal learning, it still cannot cover all the aspects of human learning. Multimodal learning helps to understand and analyze better when various senses are engaged in the processing of information. This paper focuses on multiple types of modalities, i.e., image, video, text, audio, body gestures, facial expressions, and physiological signals. Detailed analysis of past and current baseline approaches and an in-depth study of recent advancements in multimodal deep learning applications has been provided. A fine-grained taxonomy of various multimodal deep learning applications is proposed, elaborating on different applications in more depth. Architectures and datasets used in these applications are also discussed, along with their evaluation metrics. Last, main issues are highlighted separately for each domain along with their possible future research directions.
