Retrieval Augmented Generation (RAG) represents a significant advancement in artificial intelligence, combining a retrieval phase with a generative phase, the latter typically powered by large language models (LLMs). Current common practice in RAG is to use "instructed" LLMs, which are fine-tuned with supervised training to enhance their ability to follow instructions and aligned with human preferences using state-of-the-art techniques. Contrary to popular belief, our study demonstrates that base models outperform their instructed counterparts in RAG tasks by 20% on average under our experimental settings. This finding challenges the prevailing assumptions about the superiority of instructed LLMs in RAG applications. Further investigation reveals a more nuanced situation, questioning fundamental aspects of RAG and suggesting the need for broader discussion of the topic; or, as Fromm would have it, "Seldom is a glance at the statistics enough to understand the meaning of the figures".
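For concreteness, below is a minimal sketch of the retrieval-then-generation pipeline such comparisons rest on (the helper names `build_rag_prompt` and `rag_answer` and the toy stand-ins are ours, not the paper's code); the same prompt template can be fed to either a base or an instructed LLM.

```python
# Minimal RAG sketch (hypothetical helpers, not the authors' implementation).
# Retrieved passages are concatenated into the prompt; the generate callable
# stands in for either a base or an instructed LLM.

from typing import Callable, List

def build_rag_prompt(question: str, passages: List[str]) -> str:
    """Concatenate retrieved passages with the question into a single prompt."""
    context = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

def rag_answer(question: str,
               retrieve: Callable[[str, int], List[str]],
               generate: Callable[[str], str],
               k: int = 5) -> str:
    """Retrieval phase followed by a generative phase."""
    passages = retrieve(question, k)        # retrieval phase
    prompt = build_rag_prompt(question, passages)
    return generate(prompt)                 # generative phase (base or instructed LLM)

if __name__ == "__main__":
    # Toy stand-ins for the retriever and the LLM, just to show the data flow.
    toy_retrieve = lambda q, k: ["Paris is the capital of France."]
    toy_generate = lambda prompt: "Paris"
    print(rag_answer("What is the capital of France?", toy_retrieve, toy_generate))
```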
Annotations play a vital role in highlighting critical aspects of visualizations, aiding in data externalization and exploration, collaborative sensemaking, and visual storytelling. However, despite their widespread use, there is no established design space capturing common annotation practices. In this paper, we evaluated over 1,800 static annotated charts to understand how people annotate visualizations in practice. Through qualitative coding of these diverse real-world annotated charts, we explored three primary aspects of annotation usage patterns: analytic purposes for chart annotations (e.g., present, identify, summarize, or compare data features), mechanisms for chart annotations (e.g., types and combinations of annotations used, frequency of different annotation types across chart types, etc.), and the data source used to generate the annotations. We then synthesized our findings into a design space of annotations, highlighting key design choices for chart annotations. We presented three case studies illustrating our design space as a practical framework for chart annotations to enhance the communication of visualization insights. All supplemental materials are available at //shorturl.at/bAGM1.
Link Prediction (LP) is an essential task over Knowledge Graphs (KGs), traditionally focused on using and predicting the relations between entities. Textual entity descriptions have already been shown to be valuable, but models that incorporate numerical literals have shown only minor improvements on existing benchmark datasets. It is unclear whether a model is actually better at using numerical literals, or simply better at exploiting the graph structure. This raises doubts about the effectiveness of these methods and about the suitability of the existing benchmark datasets. We propose a methodology to evaluate LP models that incorporate numerical literals. We propose i) a new synthetic dataset to better understand how well these models use numerical literals and ii) dataset ablation strategies to investigate potential difficulties with the existing datasets. We identify a prevalent trend: many models underutilize literal information and potentially rely on additional parameters for performance gains. Our investigation highlights the need for more extensive evaluations when releasing new models and datasets.
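As an illustration of what a literal-focused ablation could look like (a hedged sketch; the paper's actual synthetic dataset and ablation strategies may differ), one can shuffle or randomize the numerical literals and check whether LP metrics degrade:

```python
# Hypothetical literal-ablation helper. If shuffling or randomizing the numerical
# literals barely changes LP performance, the model likely relies on graph structure
# rather than on the literal values themselves.

import numpy as np

def ablate_literals(literals: dict, mode: str = "shuffle", seed: int = 0) -> dict:
    """Return a copy of {entity: value} with the numerical literals perturbed."""
    rng = np.random.default_rng(seed)
    entities = list(literals.keys())
    values = np.array([literals[e] for e in entities], dtype=float)
    if mode == "shuffle":            # break the entity-value association
        rng.shuffle(values)
    elif mode == "random":           # replace values with noise of matching scale
        values = rng.normal(values.mean(), values.std() + 1e-9, size=len(values))
    else:
        raise ValueError(f"unknown mode: {mode}")
    return dict(zip(entities, values))

# Usage: evaluate the same trained model on original vs. ablated literals and compare
# ranking metrics (e.g., MRR); a negligible gap suggests the literals are underused.
```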
Exchangeability concerning a continuous exposure, X, may be assumed to identify average exposure effects of X, AEE(X). When X is measured with error (Xep), three challenges arise. First, exchangeability regarding Xep does not equal exchangeability regarding X. Second, the non-differential error assumption (NDEA) could be overly stringent in practice. Third, a definition of exchangeability that allows AEE(Xep) to differ from AEE(X) is lacking. To address these challenges, this article proposes unifying exchangeability and exposure/confounder measurement errors with three novel concepts. The first, Probabilistic Exchangeability (PE), is an exchangeability assumption that allows for a difference between AEE(Xep) and AEE(X). The second, Emergent Pseudo Confounding (EPC), describes the bias introduced by exposure measurement error through confounding-like mechanisms. The third, Emergent Confounding (EC), describes the bias that arises from confounder measurement error. PE requires adjustment for E(P)C, which can be performed in the same way as confounding adjustment. Under PE, the coefficient of determination (R2) in the regression of Xep against X may sometimes be sufficient to measure the difference between AEE(Xep) and AEE(X) on the risk difference and risk ratio scales. This paper provides comprehensive insight into when AEE(Xep) is a surrogate of AEE(X). Differential errors could be addressed and may not compromise causal inference.
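For intuition, the classical non-differential error setting gives one concrete case consistent with the R2 statement above (a hedged sketch under linearity assumptions; the paper's conditions may be broader):

```latex
% Classical non-differential error: X^{ep} = X + U with U independent of X.
% The R^2 of regressing X^{ep} on X equals the reliability ratio, which is also
% the attenuation factor of a linear exposure effect.
\[
  R^{2} \;=\; \operatorname{Corr}\!\left(X, X^{ep}\right)^{2}
        \;=\; \frac{\operatorname{Var}(X)}{\operatorname{Var}(X^{ep})} \;=\; \lambda,
  \qquad
  \mathrm{AEE}\!\left(X^{ep}\right) \;\approx\; \lambda \,\mathrm{AEE}(X)
  \quad \text{(linear model, risk-difference scale)}.
\]
```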
Federated Learning (FL) in Internet of Things (IoT) environments can enhance machine learning by utilising decentralised data, but it may also introduce significant privacy and security concerns due to the constrained nature of IoT devices. This represents a research challenge that we aim to address in this paper. We systematically analysed recent literature to identify privacy threats in FL within IoT environments and to evaluate the defensive measures that can be employed to mitigate these threats. Using a Systematic Literature Review (SLR) approach, we searched five publication databases (Scopus, IEEE Xplore, Wiley, ACM, and Science Direct), collating relevant papers published between 2017 and April 2024, a period spanning from the introduction of FL to the present. Guided by the PRISMA protocol, we selected 49 papers on which to focus our systematic review. We analysed these papers, paying special attention to the privacy threats and defensive measures -- specifically within the context of IoT -- using inclusion and exclusion criteria tailored to highlight recent advances and critical insights. We identified various privacy threats, including inference attacks, poisoning attacks, and eavesdropping, along with defensive measures such as Differential Privacy and Secure Multi-Party Computation. These defences were evaluated for their effectiveness in protecting privacy without compromising the functional integrity of FL in IoT settings. Our review underscores the necessity for robust and efficient privacy-preserving strategies tailored for IoT environments. Notably, there is a need for strategies against replay, evasion, and model stealing attacks. Exploring lightweight defensive measures and emerging technologies such as blockchain may help improve the privacy of FL in IoT, leading to the creation of FL models that can operate under variable network conditions.
The constructive approach within Neural Combinatorial Optimization (NCO) treats a combinatorial optimization problem as a finite Markov decision process, where solutions are built incrementally through a sequence of decisions guided by a neural policy network. To train the policy, recent research is shifting toward a 'self-improved' learning methodology that addresses the limitations of reinforcement learning and supervised approaches. Here, the policy is iteratively trained in a supervised manner, with solutions derived from the current policy serving as pseudo-labels. The way these solutions are obtained from the policy determines the quality of the pseudo-labels. In this paper, we present a simple and problem-independent sequence decoding method for self-improved learning based on sampling sequences without replacement. We incrementally follow the best solution found and repeat the sampling process from intermediate partial solutions. By modifying the policy to ignore previously sampled sequences, we force it to consider only unseen alternatives, thereby increasing solution diversity. Experimental results for the Traveling Salesman Problem and the Capacitated Vehicle Routing Problem demonstrate its strong performance. Furthermore, our method outperforms previous NCO approaches on the Job Shop Scheduling Problem.
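The decoding scheme might be sketched as follows (a simplified, hypothetical illustration: rejecting already-seen sequences stands in for the paper's policy modification, and the exact without-replacement sampling and prefix schedule may differ):

```python
# Simplified sketch of self-improved decoding: sample complete sequences from the policy,
# skip sequences already seen, keep the best (incumbent) solution, and restart sampling
# from a growing prefix of that incumbent. The returned incumbent serves as a pseudo-label.

import math
import random
from typing import Callable, List, Sequence, Tuple

def sample_unseen(policy: Callable[[Sequence[int]], List[float]],
                  n_actions: int,
                  prefix: Sequence[int],
                  seen: set,
                  max_tries: int = 100) -> Tuple[int, ...]:
    """Roll out one complete permutation from `prefix`, rejecting sequences in `seen`."""
    for _ in range(max_tries):
        seq = list(prefix)
        while len(seq) < n_actions:
            probs = policy(seq)                                   # scores over next actions
            remaining = [a for a in range(n_actions) if a not in seq]
            weights = [probs[a] for a in remaining]
            seq.append(random.choices(remaining, weights=weights)[0])
        seq = tuple(seq)
        if seq not in seen:
            seen.add(seq)
            return seq
    raise RuntimeError("no unseen sequence found from this prefix")

def self_improved_decode(policy, n_actions: int,
                         cost: Callable[[Tuple[int, ...]], float],
                         samples_per_round: int = 8, rounds: int = 4):
    """Return (best_sequence, best_cost) for use as a supervised pseudo-label."""
    seen: set = set()
    best, best_cost = None, math.inf
    prefix: Tuple[int, ...] = ()
    for r in range(rounds):
        for _ in range(samples_per_round):
            seq = sample_unseen(policy, n_actions, prefix, seen)
            c = cost(seq)
            if c < best_cost:
                best, best_cost = seq, c
        # Follow the incumbent: fix a longer prefix of the best solution each round.
        prefix = best[: (r + 1) * n_actions // (rounds + 1)]
    return best, best_cost

# Toy usage (uniform policy, TSP-like tour-length cost on a line):
# policy = lambda seq: [1.0] * 8
# best, c = self_improved_decode(policy, 8, lambda s: sum(abs(a - b) for a, b in zip(s, s[1:])))
```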
This short paper presents preliminary research on the Case-Enhanced Vision Transformer (CEViT), a similarity measurement method aimed at improving the explainability of similarity assessments for image data. Initial experimental results suggest that integrating CEViT into k-Nearest Neighbor (k-NN) classification yields classification accuracy comparable to state-of-the-art computer vision models, while adding the capability to illustrate differences between classes. CEViT explanations can be influenced by prior cases to illustrate aspects of similarity relevant to those cases.
Social engineering (SE) attacks remain a significant threat to both individuals and organizations. The advancement of Artificial Intelligence (AI), including diffusion models and large language models (LLMs), has potentially intensified these threats by enabling more personalized and convincing attacks. This survey paper categorizes SE attack mechanisms, analyzes their evolution, and explores methods for measuring these threats. It highlights the challenges in raising awareness about the risks of AI-enhanced SE attacks and offers insights into developing proactive and adaptable defense strategies. Additionally, we introduce a categorization of the evolving nature of AI-powered social engineering attacks into "3E phases": Enlarging, wherein the magnitude of attacks expands by leveraging digital media; Enriching, wherein novel attack vectors and techniques are introduced; and Emerging, signifying the advent of novel threats and methods. Moreover, we emphasize the necessity for a robust framework to assess the risk of AI-powered SE attacks. By identifying and addressing gaps in existing research, we aim to guide future studies and encourage the development of more effective defenses against the growing threat of AI-powered social engineering.
Quantum Machine Learning (QML) has garnered significant attention through approaches like Quantum Kernel Machines. While these methods hold considerable promise, their quantum nature presents inherent challenges. One major challenge is the limited resolution of estimated kernel values caused by the finite number of circuit runs performed on a quantum device. In this study, we propose a comprehensive system of rules and heuristics for estimating the required number of circuit runs in quantum kernel methods. We introduce two critical effects that necessitate increased measurement precision through additional circuit runs: the spread effect and the concentration effect. The effects are analyzed in the context of fidelity and projected quantum kernels. To address these phenomena, we develop an approach for estimating the desired precision of kernel values, which, in turn, is translated into the number of circuit runs. Our methodology is validated through extensive numerical simulations, focusing on the problem of exponential value concentration. We stress that quantum kernel methods should be considered not only from the machine learning performance perspective, but also in terms of resource consumption. The results provide insights into the possible benefits of quantum kernel methods, offering guidance for their application in quantum machine learning tasks.
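As a back-of-envelope illustration of how a precision target translates into circuit runs (our own binomial-variance heuristic for a fidelity kernel, not the paper's actual rules or heuristics):

```python
# A fidelity-kernel entry k is estimated as the fraction of all-zeros outcomes, so its
# standard error is roughly sqrt(k * (1 - k) / n_shots). Resolving kernel values that
# concentrate around k0 with a small spread requires the standard error to be a fraction
# of that spread, which drives the shot count up sharply.

import math

def shots_for_precision(k: float, target_std_error: float) -> int:
    """Shots so that the binomial standard error of the kernel estimate <= target."""
    k = min(max(k, 1e-12), 1 - 1e-12)   # guard against k in {0, 1}
    return math.ceil(k * (1 - k) / target_std_error ** 2)

def shots_for_concentration(k0: float, spread: float, resolution_factor: float = 0.1) -> int:
    """Shots needed to distinguish kernel values concentrated near k0 with std `spread`."""
    return shots_for_precision(k0, resolution_factor * spread)

# Example: a fixed absolute error of 1e-2 vs. resolving concentration of width 1e-2.
print(shots_for_precision(0.5, 1e-2))        # 2500 shots
print(shots_for_concentration(0.5, 1e-2))    # 250000 shots
```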
We present a new interaction mechanism of prediction and planning for end-to-end autonomous driving, called PPAD (Iterative Interaction of Prediction and Planning Autonomous Driving), which considers timestep-wise interaction to better integrate prediction and planning. The ego vehicle performs motion planning at each timestep based on the trajectory predictions of surrounding agents (e.g., vehicles and pedestrians) and the local road conditions. Unlike existing end-to-end autonomous driving frameworks, PPAD models the interactions among the ego vehicle, agents, and the dynamic environment in an autoregressive manner by interleaving the prediction and planning processes at every timestep, instead of a single sequential process of prediction followed by planning. Specifically, we design ego-to-agent, ego-to-map, and ego-to-BEV interaction mechanisms with hierarchical dynamic key-object attention to better model the interactions. Experiments on the nuScenes benchmark show that our approach outperforms state-of-the-art methods.
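At a high level, the interleaving can be pictured as follows (our own abstraction with hypothetical `predict_step`/`plan_step` callables; PPAD's actual attention modules and interfaces are not reproduced here):

```python
# Simplified sketch of an autoregressive prediction-planning rollout: instead of predicting
# all agent futures once and then planning, prediction and planning alternate at every
# timestep, each conditioning on the other's latest output.

from typing import Any, Callable, List, Tuple

def interleaved_rollout(ego_state: Any,
                        agent_states: List[Any],
                        bev_map: Any,
                        predict_step: Callable[[List[Any], Any, Any], List[Any]],
                        plan_step: Callable[[Any, List[Any], Any], Any],
                        horizon: int = 6) -> Tuple[List[Any], List[List[Any]]]:
    """Alternate one-step prediction and one-step planning for `horizon` timesteps."""
    ego_traj, agent_trajs = [], []
    for _ in range(horizon):
        # Prediction step: one-step agent forecasts, conditioned on the current ego state.
        agent_states = predict_step(agent_states, ego_state, bev_map)
        # Planning step: one-step ego motion, conditioned on the fresh forecasts and the
        # map/BEV context (ego-to-agent, ego-to-map, ego-to-BEV interactions live here).
        ego_state = plan_step(ego_state, agent_states, bev_map)
        ego_traj.append(ego_state)
        agent_trajs.append(agent_states)
    return ego_traj, agent_trajs
```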
This work identifies 18 foundational challenges in assuring the alignment and safety of large language models (LLMs). These challenges are organized into three categories: scientific understanding of LLMs, development and deployment methods, and sociotechnical challenges. Based on the identified challenges, we pose 200+ concrete research questions.