国产裸体美女永久免费无遮挡久久-免费一区二区日韩精品视频

The rapid development of large language models has revolutionized code intelligence in software development. However, the predominance of closed-source models has restricted extensive research and development. To address this, we introduce the DeepSeek-Coder series, a range of open-source code models with sizes from 1.3B to 33B, trained from scratch on 2 trillion tokens. These models are pre-trained on a high-quality project-level code corpus and employ a fill-in-the-blank task with a 16K window to enhance code generation and infilling. Our extensive evaluations demonstrate that DeepSeek-Coder not only achieves state-of-the-art performance among open-source code models across multiple benchmarks but also surpasses existing closed-source models like Codex and GPT-3.5. Furthermore, DeepSeek-Coder models are under a permissive license that allows for both research and unrestricted commercial use.

相關內容

MoDELS

關注 43

ACM/IEEE第23屆模型驅動工程語言和系統國際會議，是模型驅動軟件和系統工程的首要會議系列，由ACM-SIGSOFT和IEEE-TCSE支持組織。自1998年以來，模型涵蓋了建模的各個方面，從語言和方法到工具和應用程序。模特的參加者來自不同的背景，包括研究人員、學者、工程師和工業專業人士。MODELS 2019是一個論壇，參與者可以圍繞建模和模型驅動的軟件和系統交流前沿研究成果和創新實踐經驗。今年的版本將為建模社區提供進一步推進建模基礎的機會，并在網絡物理系統、嵌入式系統、社會技術系統、云計算、大數據、機器學習、安全、開源等新興領域提出建模的創新應用以及可持續性。官網鏈接： · 系統設計 · XAI · NVM · 設計 ·

2024 年 3 月 7 日

Explainable AI for Embedded Systems Design: A Case Study of Static Redundant NVM Memory Write Prediction

Abdoulaye Gamatié,Yuyang Wang

This paper investigates the application of eXplainable Artificial Intelligence (XAI) in the design of embedded systems using machine learning (ML). As a case study, it addresses the challenging problem of static silent store prediction. This involves identifying redundant memory writes based only on static program features. Eliminating such stores enhances performance and energy efficiency by reducing memory access and bus traffic, especially in the presence of emerging non-volatile memory technologies. To achieve this, we propose a methodology consisting of: 1) the development of relevant ML models for explaining silent store prediction, and 2) the application of XAI to explain these models. We employ two state-of-the-art model-agnostic XAI methods to analyze the causes of silent stores. Through the case study, we evaluate the effectiveness of the methods. We find that these methods provide explanations for silent store predictions, which are consistent with known causes of silent store occurrences from previous studies. Typically, this allows us to confirm the prevalence of silent stores in operations that write the zero constant into memory, or the absence of silent stores in operations involving loop induction variables. This suggests the potential relevance of XAI in analyzing ML models' decision in embedded system design. From the case study, we share some valuable insights and pitfalls we encountered. More generally, this study aims to lay the groundwork for future research in the emerging field of XAI for embedded system design.

INFORMS · 話題模型 · TF-IDF · Processing（編程語言） · 樣例 ·

2024 年 3 月 6 日

Does Documentation Matter? An Empirical Study of Practitioners' Perspective on Open-Source Software Adoption

Aaron Imani,Shiva Radmanesh,Iftekhar Ahmed,Mohammad Moshirpour

In recent years, open-source software (OSS) has become increasingly prevalent in developing software products. While OSS documentation is the primary source of information provided by the developers' community about a product, its role in the industry's adoption process has yet to be examined. We conducted semi-structured interviews and an online survey to provide insight into this area. Based on interviews and survey insights, we developed a topic model to collect relevant information from OSS documentation automatically. Additionally, according to our survey responses regarding challenges associated with OSS documentation, we propose a novel information augmentation approach, DocMentor, by combining OSS documentation corpus TF-IDF scores and ChatGPT. Through explaining technical terms and providing examples and references, our approach enhances the documentation context and improves practitioners' understanding. Our tool's effectiveness is assessed by surveying practitioners.

向量化 · MoDELS · SimPLe · 語言模型化 · 基 ·

2024 年 3 月 6 日

Chat Vector: A Simple Approach to Equip LLMs with Instruction Following and Model Alignment in New Languages

Shih-Cheng Huang,Pin-Zu Li,Yu-Chi Hsu,Kuang-Ming Chen,Yu Tung Lin,Shih-Kai Hsiao,Richard Tzong-Han Tsai,Hung-yi Lee

Recently, the development of open-source large language models (LLMs) has advanced rapidly. Nevertheless, due to data constraints, the capabilities of most open-source LLMs are primarily focused on English. To address this issue, we introduce the concept of chat vector to equip pre-trained language models with instruction following and human value alignment via simple model arithmetic. The chat vector is derived by subtracting the weights of a pre-trained base model (e.g. LLaMA2) from those of its corresponding chat model (e.g. LLaMA2-chat). By simply adding the chat vector to a continual pre-trained model's weights, we can endow the model with chat capabilities in new languages without the need for further training. Our empirical studies demonstrate the superior efficacy of the chat vector from three different aspects: instruction following, toxicity mitigation, and multi-turn dialogue. Moreover, to showcase the adaptability of our approach, we extend our experiments to encompass various languages, base models, and chat vectors. The results underscore the chat vector's simplicity, effectiveness, and wide applicability, making it a compelling solution for efficiently enabling conversational capabilities in pre-trained language models.

向量化 · INFORMS · Learning · entity · Integration ·

2024 年 3 月 6 日

Learning Category Trees for ID-Based Recommendation: Exploring the Power of Differentiable Vector Quantization

Qijiong Liu,Jiaren Xiao,Lu Fan,Jieming Zhu,Xiao-Ming Wu

from arxiv, TheWebConf 2024

Category information plays a crucial role in enhancing the quality and personalization of recommender systems. Nevertheless, the availability of item category information is not consistently present, particularly in the context of ID-based recommendations. In this work, we propose a novel approach to automatically learn and generate entity (i.e., user or item) category trees for ID-based recommendation. Specifically, we devise a differentiable vector quantization framework for automatic category tree generation, namely CAGE, which enables the simultaneous learning and refinement of categorical code representations and entity embeddings in an end-to-end manner, starting from the randomly initialized states. With its high adaptability, CAGE can be easily integrated into both sequential and non-sequential recommender systems. We validate the effectiveness of CAGE on various recommendation tasks including list completion, collaborative filtering, and click-through rate prediction, across different recommendation models. We release the code and data for others to reproduce the reported results.

Performer · 優化器 · 黑盒 · surge · 粒子群優化算法 ·

2024 年 3 月 6 日

Adversarial Infrared Geometry: Using Geometry to Perform Adversarial Attack against Infrared Pedestrian Detectors

Kalibinuer Tiliwalidi

from arxiv, arXiv admin note: text overlap with arXiv:2312.14217 by other authors

Currently, infrared imaging technology enjoys widespread usage, with infrared object detection technology experiencing a surge in prominence. While previous studies have delved into physical attacks on infrared object detectors, the implementation of these techniques remains complex. For instance, some approaches entail the use of bulb boards or infrared QR suits as perturbations to execute attacks, which entail costly optimization and cumbersome deployment processes. Other methodologies involve the utilization of irregular aerogel as physical perturbations for infrared attacks, albeit at the expense of optimization expenses and perceptibility issues. In this study, we propose a novel infrared physical attack termed Adversarial Infrared Geometry (\textbf{AdvIG}), which facilitates efficient black-box query attacks by modeling diverse geometric shapes (lines, triangles, ellipses) and optimizing their physical parameters using Particle Swarm Optimization (PSO). Extensive experiments are conducted to evaluate the effectiveness, stealthiness, and robustness of AdvIG. In digital attack experiments, line, triangle, and ellipse patterns achieve attack success rates of 93.1\%, 86.8\%, and 100.0\%, respectively, with average query times of 71.7, 113.1, and 2.57, respectively, thereby confirming the efficiency of AdvIG. Physical attack experiments are conducted to assess the attack success rate of AdvIG at different distances. On average, the line, triangle, and ellipse achieve attack success rates of 61.1\%, 61.2\%, and 96.2\%, respectively. Further experiments are conducted to comprehensively analyze AdvIG, including ablation experiments, transfer attack experiments, and adversarial defense mechanisms. Given the superior performance of our method as a simple and efficient black-box adversarial attack in both digital and physical environments, we advocate for widespread attention to AdvIG.

Learning · Processing（編程語言） · MoDELS · 樣例 · 數據集 ·

2024 年 3 月 4 日

From Displacements to Distributions: A Machine-Learning Enabled Framework for Quantifying Uncertainties in Parameters of Computational Models

Taylor Roper,Harri Hakula,Troy Butler

from arxiv, 35 pages

This work presents novel extensions for combining two frameworks for quantifying both aleatoric (i.e., irreducible) and epistemic (i.e., reducible) sources of uncertainties in the modeling of engineered systems. The data-consistent (DC) framework poses an inverse problem and solution for quantifying aleatoric uncertainties in terms of pullback and push-forward measures for a given Quantity of Interest (QoI) map. Unfortunately, a pre-specified QoI map is not always available a priori to the collection of data associated with system outputs. The data themselves are often polluted with measurement errors (i.e., epistemic uncertainties), which complicates the process of specifying a useful QoI. The Learning Uncertain Quantities (LUQ) framework defines a formal three-step machine-learning enabled process for transforming noisy datasets into samples of a learned QoI map to enable DC-based inversion. We develop a robust filtering step in LUQ that can learn the most useful quantitative information present in spatio-temporal datasets. The learned QoI map transforms simulated and observed datasets into distributions to perform DC-based inversion. We also develop a DC-based inversion scheme that iterates over time as new spatial datasets are obtained and utilizes quantitative diagnostics to identify both the quality and impact of inversion at each iteration. Reproducing Kernel Hilbert Space theory is leveraged to mathematically analyze the learned QoI map and develop a quantitative sufficiency test for evaluating the filtered data. An illustrative example is utilized throughout while the final two examples involve the manufacturing of shells of revolution to demonstrate various aspects of the presented frameworks.

Learning · Continuity · 穩健性 · 輸入分布 · state-of-the-art ·

2023 年 1 月 18 日

A Domain-Agnostic Approach for Characterization of Lifelong Learning Systems

Megan M. Baker,Alexander New,Mario Aguilar-Simon,Ziad Al-Halah,Sébastien M. R. Arnold,Ese Ben-Iwhiwhu,Andrew P. Brna,Ethan Brooks,Ryan C. Brown,Zachary Daniels,Anurag Daram,Fabien Delattre,Ryan Dellana,Eric Eaton,Haotian Fu,Kristen Grauman,Jesse Hostetler,Shariq Iqbal,Cassandra Kent,Nicholas Ketz,Soheil Kolouri,George Konidaris,Dhireesha Kudithipudi,Erik Learned-Miller,Seungwon Lee,Michael L. Littman,Sandeep Madireddy,Jorge A. Mendez,Eric Q. Nguyen,Christine D. Piatko,Praveen K. Pilly,Aswin Raghavan,Abrar Rahman,Santhosh Kumar Ramakrishnan,Neale Ratzlaff,Andrea Soltoggio,Peter Stone,Indranil Sur,Zhipeng Tang,Saket Tiwari,Kyle Vedder,Felix Wang,Zifan Xu,Angel Yanguas-Gil,Harel Yedidsion,Shangqun Yu,Gautam K. Vallabha

from arxiv, To appear in Neural Networks

Despite the advancement of machine learning techniques in recent years, state-of-the-art systems lack robustness to "real world" events, where the input distributions and tasks encountered by the deployed systems will not be limited to the original training context, and systems will instead need to adapt to novel distributions and tasks while deployed. This critical gap may be addressed through the development of "Lifelong Learning" systems that are capable of 1) Continuous Learning, 2) Transfer and Adaptation, and 3) Scalability. Unfortunately, efforts to improve these capabilities are typically treated as distinct areas of research that are assessed independently, without regard to the impact of each separate capability on other aspects of the system. We instead propose a holistic approach, using a suite of metrics and an evaluation framework to assess Lifelong Learning in a principled way that is agnostic to specific domains or system techniques. Through five case studies, we show that this suite of metrics can inform the development of varied and complex Lifelong Learning systems. We highlight how the proposed suite of metrics quantifies performance trade-offs present during Lifelong Learning system development - both the widely discussed Stability-Plasticity dilemma and the newly proposed relationship between Sample Efficient and Robust Learning. Further, we make recommendations for the formulation and use of metrics to guide the continuing development of Lifelong Learning systems and assess their progress in the future.

知識 (knowledge) · Processing（編程語言） · 圖 · NLP · 知識圖譜 ·

2022 年 9 月 30 日

A Decade of Knowledge Graphs in Natural Language Processing: A Survey

Phillip Schneider,Tim Schopf,Juraj Vladika,Mikhail Galkin,Elena Simperl,Florian Matthes

from arxiv, Accepted to AACL-IJCNLP 2022

In pace with developments in the research field of artificial intelligence, knowledge graphs (KGs) have attracted a surge of interest from both academia and industry. As a representation of semantic relations between entities, KGs have proven to be particularly relevant for natural language processing (NLP), experiencing a rapid spread and wide adoption within recent years. Given the increasing amount of research work in this area, several KG-related approaches have been surveyed in the NLP research community. However, a comprehensive study that categorizes established topics and reviews the maturity of individual research streams remains absent to this day. Contributing to closing this gap, we systematically analyzed 507 papers from the literature on KGs in NLP. Our survey encompasses a multifaceted review of tasks, research types, and contributions. As a result, we present a structured overview of the research landscape, provide a taxonomy of tasks, summarize our findings, and highlight directions for future work.

2022 年 9 月 21 日

Deep Learning for Medical Image Segmentation: Tricks, Challenges and Future Directions

Dong Zhang,Yi Lin,Hao Chen,Zhuotao Tian,Xin Yang,Jinhui Tang,Kwang Ting Cheng

from arxiv, Under consideration

Over the past few years, the rapid development of deep learning technologies for computer vision has greatly promoted the performance of medical image segmentation (MedISeg). However, the recent MedISeg publications usually focus on presentations of the major contributions (e.g., network architectures, training strategies, and loss functions) while unwittingly ignoring some marginal implementation details (also known as "tricks"), leading to a potential problem of the unfair experimental result comparisons. In this paper, we collect a series of MedISeg tricks for different model implementation phases (i.e., pre-training model, data pre-processing, data augmentation, model implementation, model inference, and result post-processing), and experimentally explore the effectiveness of these tricks on the consistent baseline models. Compared to paper-driven surveys that only blandly focus on the advantages and limitation analyses of segmentation models, our work provides a large number of solid experiments and is more technically operable. With the extensive experimental results on both the representative 2D and 3D medical image datasets, we explicitly clarify the effect of these tricks. Moreover, based on the surveyed tricks, we also open-sourced a strong MedISeg repository, where each of its components has the advantage of plug-and-play. We believe that this milestone work not only completes a comprehensive and complementary survey of the state-of-the-art MedISeg approaches, but also offers a practical guide for addressing the future medical image processing challenges including but not limited to small dataset learning, class imbalance learning, multi-modality learning, and domain adaptation. The code has been released at: //github.com/hust-linyi/MedISeg

MoDELS · 協同過濾 · INFORMS · Neural Networks · Networks ·

2021 年 4 月 27 日

A Survey on Neural Recommendation: From Collaborative Filtering to Content and Context Enriched Recommendation

Le Wu,Xiangnan He,Xiang Wang,Kun Zhang,Meng Wang

from arxiv, In submission

Influenced by the stunning success of deep learning in computer vision and language understanding, research in recommendation has shifted to inventing new recommender models based on neural networks. In recent years, we have witnessed significant progress in developing neural recommender models, which generalize and surpass traditional recommender models owing to the strong representation power of neural networks. In this survey paper, we conduct a systematic review on neural recommender models, aiming to summarize the field to facilitate future progress. Distinct from existing surveys that categorize existing methods based on the taxonomy of deep learning techniques, we instead summarize the field from the perspective of recommendation modeling, which could be more instructive to researchers and practitioners working on recommender systems. Specifically, we divide the work into three types based on the data they used for recommendation modeling: 1) collaborative filtering models, which leverage the key source of user-item interaction data; 2) content enriched models, which additionally utilize the side information associated with users and items, like user profile and item knowledge graph; and 3) context enriched models, which account for the contextual information associated with an interaction, such as time, location, and the past interactions. After reviewing representative works for each type, we finally discuss some promising directions in this field, including benchmarking recommender systems, graph reasoning based recommendation models, and explainable and fair recommendations for social good.