人人干人人摸人人操_亚洲自偷拍狠无码_国产精品免费在线一区二区_亚洲免费一区二区三区在线观看_无遮挡很黄很色很湿的视频_日韩国产欧美A级在线视频_综合欧美亚洲日本一区二区

This research idea paper proposes leveraging Large Language Models (LLMs) to enhance the productivity of Dafny developers. Although the use of verification-aware languages, such as Dafny, has increased considerably in the last decade, these are still not widely adopted. Often the cost of using such languages is too high, due to the level of expertise required from the developers and challenges that they often face when trying to prove a program correct. Even though Dafny automates a lot of the verification process, sometimes there are steps that are too complex for Dafny to perform on its own. One such case is that of missing lemmas, i.e. Dafny is unable to prove a result without being given further help in the form of a theorem that can assist it in the proof of the step. In this paper, we describe preliminary work on a new Dafny plugin that leverages LLMs to assist developers by generating suggestions for relevant lemmas that Dafny is unable to discover and use. Moreover, for the lemmas that cannot be proved automatically, the plugin also attempts to provide accompanying calculational proofs. We also discuss ideas for future work by describing a research agenda on using LLMs to increase the adoption of verification-aware languages in general, by increasing developers productivity and by reducing the level of expertise required for crafting formal specifications and proving program properties.

相關內容

語言模型化

關注 9

大語言模型 · 語言模型化 · MoDELS · 置換 · state-of-the-art ·

2024 年 2 月 16 日

Jailbreaking Proprietary Large Language Models using Word Substitution Cipher

Divij Handa,Advait Chirmule,Bimal Gajera,Chitta Baral

from arxiv, 15 pages

Large Language Models (LLMs) are aligned to moral and ethical guidelines but remain susceptible to creative prompts called Jailbreak that can bypass the alignment process. However, most jailbreaking prompts contain harmful questions in the natural language (mainly English), which can be detected by the LLM themselves. In this paper, we present jailbreaking prompts encoded using cryptographic techniques. We first present a pilot study on the state-of-the-art LLM, GPT-4, in decoding several safe sentences that have been encrypted using various cryptographic techniques and find that a straightforward word substitution cipher can be decoded most effectively. Motivated by this result, we use this encoding technique for writing jailbreaking prompts. We present a mapping of unsafe words with safe words and ask the unsafe question using these mapped words. Experimental results show an attack success rate (up to 59.42%) of our proposed jailbreaking approach on state-of-the-art proprietary models including ChatGPT, GPT-4, and Gemini-Pro. Additionally, we discuss the over-defensiveness of these models. We believe that our work will encourage further research in making these LLMs more robust while maintaining their decoding capabilities.

語音增強 · 多樣性 · MoDELS · 大學 · 樣本 ·

2024 年 2 月 16 日

Toward Universal Speech Enhancement for Diverse Input Conditions

Wangyou Zhang,Kohei Saijo,Zhong-Qiu Wang,Shinji Watanabe,Yanmin Qian

from arxiv, 6 pages, 3 figures, 5 tables, published in ASRU 2023 (corrected the results of noisy speech on CHiME-4 (Simu) in Table 4)

The past decade has witnessed substantial growth of data-driven speech enhancement (SE) techniques thanks to deep learning. While existing approaches have shown impressive performance in some common datasets, most of them are designed only for a single condition (e.g., single-channel, multi-channel, or a fixed sampling frequency) or only consider a single task (e.g., denoising or dereverberation). Currently, there is no universal SE approach that can effectively handle diverse input conditions with a single model. In this paper, we make the first attempt to investigate this line of research. First, we devise a single SE model that is independent of microphone channels, signal lengths, and sampling frequencies. Second, we design a universal SE benchmark by combining existing public corpora with multiple conditions. Our experiments on a wide range of datasets show that the proposed single model can successfully handle diverse conditions with strong performance.

Learning · Microsoft Windows · 強化學習 · 優化器 · 可約的 ·

2024 年 2 月 15 日

Reinforcement Learning for Solving Stochastic Vehicle Routing Problem with Time Windows

Zangir Iklassov,Ikboljon Sobirov,Ruben Solozabal,Martin Takac

This paper introduces a reinforcement learning approach to optimize the Stochastic Vehicle Routing Problem with Time Windows (SVRP), focusing on reducing travel costs in goods delivery. We develop a novel SVRP formulation that accounts for uncertain travel costs and demands, alongside specific customer time windows. An attention-based neural network trained through reinforcement learning is employed to minimize routing costs. Our approach addresses a gap in SVRP research, which traditionally relies on heuristic methods, by leveraging machine learning. The model outperforms the Ant-Colony Optimization algorithm, achieving a 1.73% reduction in travel costs. It uniquely integrates external information, demonstrating robustness in diverse environments, making it a valuable benchmark for future SVRP studies and industry application.

大語言模型 · MoDELS · 語言模型化 · Machine Translation · 穩健性 ·

2024 年 2 月 15 日

Simultaneous Machine Translation with Large Language Models

Minghan Wang,Jinming Zhao,Thuy-Trang Vu,Fatemeh Shiri,Ehsan Shareghi,Gholamreza Haffari

Real-world simultaneous machine translation (SimulMT) systems face more challenges than just the quality-latency trade-off. They also need to address issues related to robustness with noisy input, processing long contexts, and flexibility for knowledge injection. These challenges demand models with strong language understanding and generation capabilities which may not often equipped by dedicated MT models. In this paper, we investigate the possibility of applying Large Language Models (LLM) to SimulMT tasks by using existing incremental-decoding methods with a newly proposed RALCP algorithm for latency reduction. We conducted experiments using the \texttt{Llama2-7b-chat} model on nine different languages from the MUST-C dataset. The results show that LLM outperforms dedicated MT models in terms of BLEU and LAAL metrics. Further analysis indicates that LLM has advantages in terms of tuning efficiency and robustness. However, it is important to note that the computational cost of LLM remains a significant obstacle to its application in SimulMT.\footnote{We will release our code, weights, and data with publication.}

優化器 · Integration · 控制器 · FAST · 峰值 ·

2024 年 2 月 15 日

Stein Variational Guided Model Predictive Path Integral Control: Proposal and Experiments with Fast Maneuvering Vehicles

Kohei Honda,Naoki Akai,Kosuke Suzuki,Mizuho Aoki,Hirotaka Hosogaya,Hiroyuki Okuda,Tatsuya Suzuki

from arxiv, 7 pages, 5 figures

This paper presents a novel Stochastic Optimal Control (SOC) method based on Model Predictive Path Integral control (MPPI), named Stein Variational Guided MPPI (SVG-MPPI), designed to handle rapidly shifting multimodal optimal action distributions. While MPPI can find a Gaussian-approximated optimal action distribution in closed form, i.e., without iterative solution updates, it struggles with the multimodality of the optimal distributions. This is due to the less representative nature of the Gaussian. To overcome this limitation, our method aims to identify a target mode of the optimal distribution and guide the solution to converge to fit it. In the proposed method, the target mode is roughly estimated using a modified Stein Variational Gradient Descent (SVGD) method and embedded into the MPPI algorithm to find a closed-form ``mode-seeking'' solution that covers only the target mode, thus preserving the fast convergence property of MPPI. Our simulation and real-world experimental results demonstrate that SVG-MPPI outperforms both the original MPPI and other state-of-the-art sampling-based SOC algorithms in terms of path-tracking and obstacle-avoidance capabilities. Source code: //github.com/kohonda/proj-svg_mppi

優化器 · MAC · 通道 · Networking · 閾值 ·

2024 年 2 月 14 日

Dynamic Cooperative MAC Optimization in RSU-Enhanced VANETs: A Distributed Approach

Zhou Zhang,Saman Atapattu,Yizhu Wang,Sumei Sun,Kandeepan Sithamparanathan

from arxiv, 6 pages, 5 figures, IEEE ICC 2024

This paper presents an optimization approach for cooperative Medium Access Control (MAC) techniques in Vehicular Ad Hoc Networks (VANETs) equipped with Roadside Unit (RSU) to enhance network throughput. Our method employs a distributed cooperative MAC scheme based on Carrier Sense Multiple Access with Collision Avoidance (CSMA/CA) protocol, featuring selective RSU probing and adaptive transmission. It utilizes a dual timescale channel access framework, with a ``large-scale'' phase accounting for gradual changes in vehicle locations and a ``small-scale'' phase adapting to rapid channel fluctuations. We propose the RSU Probing and Cooperative Access (RPCA) strategy, a two-stage approach based on dynamic inter-vehicle distances from the RSU. Using optimal sequential planned decision theory, we rigorously prove its optimality in maximizing average system throughput per large-scale phase. For practical implementation in VANETs, we develop a distributed MAC algorithm with periodic location updates. It adjusts thresholds based on inter-vehicle and vehicle-RSU distances during the large-scale phase and accesses channels following the RPCA strategy with updated thresholds during the small-scale phase. Simulation results confirm the effectiveness and efficiency of our algorithm.

Automator · 大語言模型 · Instagram · 語言模型化 · Facebook ·

2024 年 2 月 14 日

Automated Unit Test Improvement using Large Language Models at Meta

Nadia Alshahwan,Jubin Chheda,Anastasia Finegenova,Beliz Gokkaya,Mark Harman,Inna Harper,Alexandru Marginean,Shubho Sengupta,Eddy Wang

from arxiv, 12 pages, 8 figures, 32nd ACM Symposium on the Foundations of Software Engineering (FSE 24)

This paper describes Meta's TestGen-LLM tool, which uses LLMs to automatically improve existing human-written tests. TestGen-LLM verifies that its generated test classes successfully clear a set of filters that assure measurable improvement over the original test suite, thereby eliminating problems due to LLM hallucination. We describe the deployment of TestGen-LLM at Meta test-a-thons for the Instagram and Facebook platforms. In an evaluation on Reels and Stories products for Instagram, 75% of TestGen-LLM's test cases built correctly, 57% passed reliably, and 25% increased coverage. During Meta's Instagram and Facebook test-a-thons, it improved 11.5% of all classes to which it was applied, with 73% of its recommendations being accepted for production deployment by Meta software engineers. We believe this is the first report on industrial scale deployment of LLM-generated code backed by such assurances of code improvement.

Med-PaLM 2 · Performer · 語言模型化 · MoDELS · 自動問答 ·

2023 年 5 月 16 日

Towards Expert-Level Medical Question Answering with Large Language Models

Karan Singhal,Tao Tu,Juraj Gottweis,Rory Sayres,Ellery Wulczyn,Le Hou,Kevin Clark,Stephen Pfohl,Heather Cole-Lewis,Darlene Neal,Mike Schaekermann,Amy Wang,Mohamed Amin,Sami Lachgar,Philip Mansfield,Sushant Prakash,Bradley Green,Ewa Dominowska,Blaise Aguera y Arcas,Nenad Tomasev,Yun Liu,Renee Wong,Christopher Semturs,S. Sara Mahdavi,Joelle Barral,Dale Webster,Greg S. Corrado,Yossi Matias,Shekoofeh Azizi,Alan Karthikesalingam,Vivek Natarajan

Recent artificial intelligence (AI) systems have reached milestones in "grand challenges" ranging from Go to protein-folding. The capability to retrieve medical knowledge, reason over it, and answer medical questions comparably to physicians has long been viewed as one such grand challenge. Large language models (LLMs) have catalyzed significant progress in medical question answering; Med-PaLM was the first model to exceed a "passing" score in US Medical Licensing Examination (USMLE) style questions with a score of 67.2% on the MedQA dataset. However, this and other prior work suggested significant room for improvement, especially when models' answers were compared to clinicians' answers. Here we present Med-PaLM 2, which bridges these gaps by leveraging a combination of base LLM improvements (PaLM 2), medical domain finetuning, and prompting strategies including a novel ensemble refinement approach. Med-PaLM 2 scored up to 86.5% on the MedQA dataset, improving upon Med-PaLM by over 19% and setting a new state-of-the-art. We also observed performance approaching or exceeding state-of-the-art across MedMCQA, PubMedQA, and MMLU clinical topics datasets. We performed detailed human evaluations on long-form questions along multiple axes relevant to clinical applications. In pairwise comparative ranking of 1066 consumer medical questions, physicians preferred Med-PaLM 2 answers to those produced by physicians on eight of nine axes pertaining to clinical utility (p < 0.001). We also observed significant improvements compared to Med-PaLM on every evaluation axis (p < 0.001) on newly introduced datasets of 240 long-form "adversarial" questions to probe LLM limitations. While further studies are necessary to validate the efficacy of these models in real-world settings, these results highlight rapid progress towards physician-level performance in medical question answering.

INFORMS · 圖 · 結構化學習 · Extensibility · 學成 ·

2021 年 12 月 16 日

Graph Structure Learning with Variational Information Bottleneck

Qingyun Sun,Jianxin Li,Hao Peng,Jia Wu,Xingcheng Fu,Cheng Ji,Philip S. Yu

from arxiv, Accepted by AAAI 2022, Preprint version with Appendix

Graph Neural Networks (GNNs) have shown promising results on a broad spectrum of applications. Most empirical studies of GNNs directly take the observed graph as input, assuming the observed structure perfectly depicts the accurate and complete relations between nodes. However, graphs in the real world are inevitably noisy or incomplete, which could even exacerbate the quality of graph representations. In this work, we propose a novel Variational Information Bottleneck guided Graph Structure Learning framework, namely VIB-GSL, in the perspective of information theory. VIB-GSL advances the Information Bottleneck (IB) principle for graph structure learning, providing a more elegant and universal framework for mining underlying task-relevant relations. VIB-GSL learns an informative and compressive graph structure to distill the actionable information for specific downstream tasks. VIB-GSL deduces a variational approximation for irregular graph data to form a tractable IB objective function, which facilitates training stability. Extensive experimental results demonstrate that the superior effectiveness and robustness of VIB-GSL.

BERT · Performer · Extensibility · 注意力機制 · MoDELS ·

2021 年 2 月 22 日

Using Prior Knowledge to Guide BERT's Attention in Semantic Textual Matching Tasks

Tingyu Xia,Yue Wang,Yuan Tian,Yi Chang

from arxiv, 10 pages, WWW'21, April19-23, 2021, Ljubljana, Slovenia

We study the problem of incorporating prior knowledge into a deep Transformer-based model,i.e.,Bidirectional Encoder Representations from Transformers (BERT), to enhance its performance on semantic textual matching tasks. By probing and analyzing what BERT has already known when solving this task, we obtain better understanding of what task-specific knowledge BERT needs the most and where it is most needed. The analysis further motivates us to take a different approach than most existing works. Instead of using prior knowledge to create a new training task for fine-tuning BERT, we directly inject knowledge into BERT's multi-head attention mechanism. This leads us to a simple yet effective approach that enjoys fast training stage as it saves the model from training on additional data or tasks other than the main task. Extensive experiments demonstrate that the proposed knowledge-enhanced BERT is able to consistently improve semantic textual matching performance over the original BERT model, and the performance benefit is most salient when training data is scarce.