亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

The wide deployment of Large Language Models (LLMs) has given rise to strong demands for optimizing their inference performance. Today's techniques serving this purpose primarily focus on reducing latency and improving throughput through algorithmic and hardware enhancements, while largely overlooking their privacy side effects, particularly in a multi-user environment. In our research, for the first time, we discovered a set of new timing side channels in LLM systems, arising from shared caches and GPU memory allocations, which can be exploited to infer both confidential system prompts and those issued by other users. These vulnerabilities echo security challenges observed in traditional computing systems, highlighting an urgent need to address potential information leakage in LLM serving infrastructures. In this paper, we report novel attack strategies designed to exploit such timing side channels inherent in LLM deployments, specifically targeting the Key-Value (KV) cache and semantic cache widely used to enhance LLM inference performance. Our approach leverages timing measurements and classification models to detect cache hits, allowing an adversary to infer private prompts with high accuracy. We also propose a token-by-token search algorithm to efficiently recover shared prompt prefixes in the caches, showing the feasibility of stealing system prompts and those produced by peer users. Our experimental studies on black-box testing of popular online LLM services demonstrate that such privacy risks are completely realistic, with significant consequences. Our findings underscore the need for robust mitigation to protect LLM systems against such emerging threats.

相關內容

Convolutional Neural Networks (CNNs) are crucial in various applications, but their deployment on resource-constrained edge devices poses challenges. This study presents the Sum-of-Products (SOP) units for convolution, which utilize low-latency left-to-right bit-serial arithmetic to minimize response time and enhance overall performance. The study proposes a methodology for fusing multiple convolution layers to reduce off-chip memory communication and increase overall performance. An effective mechanism detects and skips inefficient convolutions after ReLU layers, minimizing power consumption without compromising accuracy. Furthermore, efficient tile movement guarantees uniform access to the fusion pyramid. An analysis demonstrates the utile stride strategy improves operational intensity. Two designs cater to varied demands: one focuses on minimal response time for mission-critical applications, and another focuses on resource-constrained devices with comparable latency. This approach notably reduced redundant computations, improving the efficiency of CNN deployment on edge devices.

The rapid evolution of Internet of Things (IoT) environments has created an urgent need for secure and trustworthy distributed computing systems, particularly when dealing with heterogeneous devices and applications where centralized trust cannot be assumed. This paper proposes TrustMesh, a novel blockchain-enabled framework that addresses these challenges through a unique three-layer architecture combining permissioned blockchain technology with a novel multi-phase Practical Byzantine Fault Tolerance (PBFT) consensus protocol. The key innovation lies in TrustMesh's ability to support non-deterministic scheduling algorithms while maintaining Byzantine fault tolerance - features traditionally considered mutually exclusive in blockchain systems. The framework supports a sophisticated resource management approach that enables flexible scheduling decisions while preserving the security guarantees of blockchain-based verification. Our experimental evaluation using a real-world cold chain monitoring scenario demonstrates that TrustMesh successfully maintains Byzantine fault tolerance with fault detection latencies under 150 milliseconds, while maintaining consistent framework overhead across varying computational workloads even with network scaling. These results establish TrustMesh's effectiveness in balancing security, performance, and flexibility requirements in trustless IoT environments, advancing the state-of-the-art in secure distributed computing frameworks.

We study the problem of statically optimizing select-project-join (SPJ) plans where unary key constraints are allowed. A natural measure of a plan, which we call the output degree and which has been studied previously, is the minimum degree of a polynomial bounding the plan's output relation, as a function of the input database's maximum relation size. This measure is, by definition, invariant under passing from a plan to another plan that is semantically equivalent to the first. In this article, we consider a plan measure which we call the intermediate degree; this measure is defined to be the minimum degree bounding the size of all intermediate relations computed during a plan's execution -- again, as a function of the input database's maximum relation size. We present an algorithm that, given an SPJ plan $q$ and a set $\Sigma$ of unary keys, computes an SPJ plan $q'$ that is semantically equivalent to $q$ (over databases satisfying $\Sigma$) and that has the minimum intermediate degree over all such semantically equivalent plans. For the types of plans considered, we thus obtain a complete and effective understanding of intermediate degree.

In the realm of computer systems, efficient utilisation of the CPU (Central Processing Unit) has always been a paramount concern. Researchers and engineers have long sought ways to optimise process execution on the CPU, leading to the emergence of CPU scheduling as a field of study. This research proposes a novel algorithm for batch processing that operates on a preemptive model, dynamically assigning priorities based on a robust ratio, employing a dynamic time slice, and utilising periodic sorting technique to achieve fairness. By engineering this responsive and fair model, the proposed algorithm strikes a delicate balance between efficiency and fairness, providing an optimised solution for batch scheduling while ensuring system responsiveness.

The domain of Natural Language Processing (NLP) has experienced notable progress in the evolution of Bangla Question Answering (QA) systems. This paper presents a comprehensive review of seven research articles that contribute to the progress in this domain. These research studies explore different aspects of creating question-answering systems for the Bangla language. They cover areas like collecting data, preparing it for analysis, designing models, conducting experiments, and interpreting results. The papers introduce innovative methods like using LSTM-based models with attention mechanisms, context-based QA systems, and deep learning techniques based on prior knowledge. However, despite the progress made, several challenges remain, including the lack of well-annotated data, the absence of high-quality reading comprehension datasets, and difficulties in understanding the meaning of words in context. Bangla QA models' precision and applicability are constrained by these challenges. This review emphasizes the significance of these research contributions by highlighting the developments achieved in creating Bangla QA systems as well as the ongoing effort required to get past roadblocks and improve the performance of these systems for actual language comprehension tasks.

Analysts in Security Operations Centers (SOCs) are often occupied with time-consuming investigations of alerts from Network Intrusion Detection Systems (NIDS). Many NIDS rules lack clear explanations and associations with attack techniques, complicating the alert triage and the generation of attack hypotheses. Large Language Models (LLMs) may be a promising technology to reduce the alert explainability gap by associating rules with attack techniques. In this paper, we investigate the ability of three prominent LLMs (ChatGPT, Claude, and Gemini) to reason about NIDS rules while labeling them with MITRE ATT&CK tactics and techniques. We discuss prompt design and present experiments performed with 973 Snort rules. Our results indicate that while LLMs provide explainable, scalable, and efficient initial mappings, traditional Machine Learning (ML) models consistently outperform them in accuracy, achieving higher precision, recall, and F1-scores. These results highlight the potential for hybrid LLM-ML approaches to enhance SOC operations and better address the evolving threat landscape.

Active imaging systems sample the Transient Light Transport Matrix (TLTM) for a scene by sequentially illuminating various positions in this scene using a controllable light source, and then measuring the resulting spatiotemporal light transport with time of flight (ToF) sensors. Time-resolved Non-line-of-sight (NLOS) imaging employs an active imaging system that measures part of the TLTM of an intermediary relay surface, and uses the indirect reflections of light encoded within this TLTM to "see around corners". Such imaging systems have applications in diverse areas such as disaster response, remote surveillance, and autonomous navigation. While existing NLOS imaging systems usually measure a subset of the full TLTM, development of customized gated Single Photon Avalanche Diode (SPAD) arrays \cite{riccardo_fast-gated_2022} has made it feasible to probe the full measurement space. In this work, we demonstrate that the full TLTM on the relay surface can be processed with efficient algorithms to computationally focus and detect our illumination in different parts of the hidden scene, turning the relay surface into a second-order active imaging system. These algorithms allow us to iterate on the measured, first-order TLTM, and extract a \textbf{second order TLTM for surfaces in the hidden scene}. We showcase three applications of TLTMs in NLOS imaging: (1) Scene Relighting with novel illumination, (2) Separation of direct and indirect components of light transport in the hidden scene, and (3) Dual Photography. Additionally, we empirically demonstrate that SPAD arrays enable parallel acquisition of photons, effectively mitigating long acquisition times.

Vision-Language Models (VLMs) have shown promising capabilities in handling various multimodal tasks, yet they struggle in long-context scenarios, particularly in tasks involving videos, high-resolution images, or lengthy image-text documents. In our work, we first conduct an empirical analysis of the long-context capabilities of VLMs using our augmented long-context multimodal datasets. Our findings reveal that directly applying the positional encoding mechanism used for textual tokens to visual tokens is suboptimal, and VLM performance degrades sharply when the position encoding exceeds the model's context window. To address this, we propose Variable Visual Position Encoding (V2PE), a novel positional encoding approach that employs variable and smaller increments for visual tokens, enabling more efficient management of long multimodal sequences. Our experiments demonstrate the effectiveness of V2PE to enhances VLMs' ability to effectively understand and reason over long multimodal contexts. We further integrate V2PE with our augmented long-context multimodal datasets to fine-tune the open-source VLM, InternVL2. The fine-tuned model achieves strong performance on both standard and long-context multimodal tasks. Notably, when the sequence length of the training dataset is increased to 256K tokens, the model is capable of processing multimodal sequences up to 1M tokens, highlighting its potential for real-world long-context applications.

Predicting roll call votes through modeling political actors has emerged as a focus in quantitative political science and computer science. Widely used embedding-based methods generate vectors for legislators from diverse data sets to predict legislative behaviors. However, these methods often contend with challenges such as the need for manually predefined features, reliance on extensive training data, and a lack of interpretability. Achieving more interpretable predictions under flexible conditions remains an unresolved issue. This paper introduces the Political Actor Agent (PAA), a novel agent-based framework that utilizes Large Language Models to overcome these limitations. By employing role-playing architectures and simulating legislative system, PAA provides a scalable and interpretable paradigm for predicting roll-call votes. Our approach not only enhances the accuracy of predictions but also offers multi-view, human-understandable decision reasoning, providing new insights into political actor behaviors. We conducted comprehensive experiments using voting records from the 117-118th U.S. House of Representatives, validating the superior performance and interpretability of PAA. This study not only demonstrates PAA's effectiveness but also its potential in political science research.

Graph Neural Networks (GNNs) have recently become increasingly popular due to their ability to learn complex systems of relations or interactions arising in a broad spectrum of problems ranging from biology and particle physics to social networks and recommendation systems. Despite the plethora of different models for deep learning on graphs, few approaches have been proposed thus far for dealing with graphs that present some sort of dynamic nature (e.g. evolving features or connectivity over time). In this paper, we present Temporal Graph Networks (TGNs), a generic, efficient framework for deep learning on dynamic graphs represented as sequences of timed events. Thanks to a novel combination of memory modules and graph-based operators, TGNs are able to significantly outperform previous approaches being at the same time more computationally efficient. We furthermore show that several previous models for learning on dynamic graphs can be cast as specific instances of our framework. We perform a detailed ablation study of different components of our framework and devise the best configuration that achieves state-of-the-art performance on several transductive and inductive prediction tasks for dynamic graphs.

北京阿比特科技有限公司