
Deep Learning (DL) is increasingly used in safety-critical applications, raising concerns about its reliability. DL suffers from a well-known lack of robustness, especially when faced with adversarial perturbations known as Adversarial Examples (AEs). Despite recent efforts to detect AEs using advanced attack and testing methods, these approaches often overlook the input distribution and the perceptual quality of the perturbations. As a result, the detected AEs may be irrelevant in practical applications or may appear unrealistic to human observers. This wastes testing resources on rare AEs that seldom occur during real-world use, limiting improvements in DL model dependability. In this paper, we propose a new robustness testing approach for detecting AEs that considers both the feature-level distribution and the pixel-level distribution, capturing the perceptual quality of adversarial perturbations. The two considerations are encoded by a novel hierarchical mechanism. First, we select test seeds based on the density of the feature-level distribution and on adversarial vulnerability; the vulnerability of a test seed is indicated by auxiliary information that is highly correlated with local robustness. Given a test seed, we then develop a novel genetic-algorithm-based local test case generation method, in which two fitness functions work alternately to control the perceptual quality of the detected AEs. Finally, extensive experiments confirm that our holistic approach, which considers hierarchical distributions, is superior to state-of-the-art methods that either disregard the input distribution or consider only a single (non-hierarchical) distribution, in terms of both detecting imperceptible AEs and improving the overall robustness of the DL model under test.
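To make the alternating-fitness idea concrete, below is a minimal sketch of a genetic-algorithm loop that switches between an attack-success fitness and a perceptual-quality fitness each generation. The perturbation model, fitness definitions, genetic operators, and the Keras-style `model.predict` interface are our illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def attack_fitness(model, x, delta, label):
    # Reward perturbations that reduce the model's confidence in the true label.
    probs = model.predict(np.clip(x + delta, 0.0, 1.0)[None])[0]
    return 1.0 - probs[label]

def quality_fitness(model, x, delta, label):
    # Reward perceptually small perturbations (negative L2 norm as a
    # stand-in for a perceptual-quality measure).
    return -float(np.linalg.norm(delta))

def generate_local_aes(model, seed_x, label, pop_size=32, gens=100, eps=0.03):
    rng = np.random.default_rng(0)
    population = rng.uniform(-eps, eps, size=(pop_size,) + seed_x.shape)
    for g in range(gens):
        fit = attack_fitness if g % 2 == 0 else quality_fitness  # alternate fitness
        scores = np.array([fit(model, seed_x, d, label) for d in population])
        parents = population[np.argsort(scores)[-pop_size // 2:]]  # keep fittest half
        # Crossover: average random parent pairs; mutation: small Gaussian noise.
        idx = rng.integers(0, len(parents), size=(pop_size, 2))
        children = (parents[idx[:, 0]] + parents[idx[:, 1]]) / 2.0
        population = np.clip(children + rng.normal(0.0, eps / 10, children.shape),
                             -eps, eps)
    return np.clip(seed_x + population, 0.0, 1.0)  # candidate AEs around the seed
```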

Related Content

Extensive system scales (i.e., thousands of GPUs/TPUs) and prolonged training periods (i.e., months of pretraining) significantly escalate the probability of failures when training large language models (LLMs). Efficient and reliable fault-tolerance methods are therefore urgently needed. Checkpointing is the primary fault-tolerance method: it periodically saves parameter snapshots from GPU memory to disk via CPU memory. In this paper, we identify that the frequency of existing checkpoint-based fault tolerance is significantly limited by storage I/O overheads, which results in hefty re-training costs when restarting from the nearest checkpoint. In response to this gap, we introduce an in-memory fault-tolerance framework for large-scale LLM pretraining. The framework boosts the efficiency and reliability of fault tolerance in three ways: (1) Reduced Data Transfer and I/O: by asynchronously caching parameters, i.e., sharded model parameters, optimizer states, and RNG states, to volatile CPU memory, our framework significantly reduces communication costs and bypasses checkpoint I/O. (2) Enhanced System Reliability: our framework protects parameters with a two-layer hierarchy: snapshot management processes (SMPs) safeguard against software failures, while Erasure Coding (EC) protects against node failures. This double-layered protection greatly improves the survival probability of the parameters compared to existing checkpointing methods. (3) Improved Snapshotting Frequency: our framework achieves more frequent snapshotting than asynchronous checkpointing optimizations under the same saving-time budget, which improves fault-tolerance efficiency. Empirical results demonstrate that our framework minimizes the fault-tolerance overhead of LLM pretraining by effectively leveraging redundant CPU resources.
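A minimal single-process sketch of the asynchronous in-memory snapshotting in (1) is shown below; the framework's dedicated snapshot management processes, GPU-to-CPU asynchronous copies, and cross-node erasure coding are all elided, and the `get_state` callable is an assumed interface.

```python
import copy
import threading

class InMemorySnapshotter:
    """Periodically deep-copy training state (model/optimizer/RNG shards)
    into volatile CPU memory on a background thread, bypassing disk I/O."""

    def __init__(self, get_state, interval_s=30.0):
        self.get_state = get_state      # assumed callable returning a state dict
        self.interval_s = interval_s
        self.latest = None              # most recent CPU-resident snapshot
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        while not self._stop.is_set():
            state = self.get_state()
            self.latest = copy.deepcopy(state)  # CPU copy; no checkpoint I/O
            self._stop.wait(self.interval_s)    # snapshot again after the interval

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()
```

On a failure, training would restart from `snapshotter.latest` rather than the last on-disk checkpoint, which is what shrinks the re-training window.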

Generative AI systems produce a range of risks. To ensure the safety of generative AI systems, these risks must be evaluated. In this paper, we make two main contributions toward establishing such evaluations. First, we propose a three-layered framework that takes a structured, sociotechnical approach to evaluating these risks. This framework encompasses capability evaluations, which are the main current approach to safety evaluation. It then reaches further by building on system safety principles, particularly the insight that context determines whether a given capability may cause harm. To account for relevant context, our framework adds human interaction and systemic impacts as additional layers of evaluation. Second, we survey the current state of safety evaluation of generative AI systems and create a repository of existing evaluations. Three salient evaluation gaps emerge from this analysis. We propose ways to close these gaps, outlining practical steps as well as roles and responsibilities for different actors. Sociotechnical safety evaluation is a tractable approach to the robust and comprehensive safety evaluation of generative AI systems.

Large Language Models (LLMs) have shown impressive capabilities in various applications, but they still face various inconsistency issues. Existing works primarily focus on inconsistency issues within a single LLM; we complementarily explore the inter-consistency among multiple LLMs in collaboration. To examine whether LLMs can collaborate effectively to reach a consensus on a shared goal, we focus on commonsense reasoning and introduce a formal debate framework (FORD) that conducts a three-stage debate among LLMs aligned with real-world scenarios: fair debate, mismatched debate, and roundtable debate. Extensive experiments on various datasets show that LLMs can effectively collaborate to reach a consensus despite noticeable inter-inconsistencies, but imbalances in their abilities can lead to domination by superior LLMs. Leveraging a more advanced LLM like GPT-4 as an authoritative judge can further boost collaboration performance. Our work contributes to understanding the inter-consistency among LLMs and lays the foundation for developing future collaboration methods. Code and data are available at //github.com/Waste-Wood/FORD
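The overall protocol can be pictured with the toy debate loop below (our simplification, with hypothetical `llm` and `judge` callables that map a prompt string to an answer string; the fair/mismatched/roundtable stages live in how the debaters are chosen; see the repository for the actual implementation).

```python
def debate(llms, question, judge=None, rounds=3):
    """Run a multi-round debate among LLMs and return a consensus answer."""
    stances = [llm(f"Q: {question}\nGive your answer with a short argument.")
               for llm in llms]
    for _ in range(rounds):
        new_stances = []
        for i, llm in enumerate(llms):
            others = "\n".join(s for j, s in enumerate(stances) if j != i)
            new_stances.append(llm(
                f"Q: {question}\nYour previous answer: {stances[i]}\n"
                f"The other debaters said:\n{others}\n"
                "Defend or revise your answer."))
        stances = new_stances
        if len(set(stances)) == 1:       # consensus reached early
            return stances[0]
    if judge is not None:                # e.g., a stronger LLM as the judge
        return judge(f"Q: {question}\nFinal answers:\n" + "\n".join(stances))
    return max(set(stances), key=stances.count)  # fall back to majority vote
```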

Guidance in conditional diffusion generation is of great importance for sample quality and controllability. However, existing guidance schemes leave much to be desired. On one hand, mainstream methods such as classifier guidance and classifier-free guidance both require extra training with labeled data, which is time-consuming and cannot adapt to new conditions. On the other hand, training-free methods such as universal guidance, though more flexible, have yet to demonstrate comparable performance. In this work, through a comprehensive investigation of the design space, we show that significant performance improvements over existing guidance schemes can be achieved by leveraging off-the-shelf classifiers in a training-free fashion, enjoying the best of both worlds. Employing calibration as a general guideline, we propose several pre-conditioning techniques to better exploit pretrained off-the-shelf classifiers for guiding diffusion generation. Extensive experiments on ImageNet validate the proposed method, showing that state-of-the-art diffusion models (DDPM, EDM, DiT) can be further improved (by up to 20%) using off-the-shelf classifiers with barely any extra computational cost. With the proliferation of publicly available pretrained classifiers, our approach has great potential and can be readily scaled up to text-to-image generation tasks. The code is available at //github.com/AlexMaOLS/EluCD/tree/main.
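For readers unfamiliar with the baseline recipe, the sketch below shows one common way to apply an off-the-shelf classifier in a training-free guidance step: score a one-step denoised estimate of the noisy sample and shift the noise prediction along the classifier gradient. This is our illustration of the generic mechanism only; the paper's calibration and pre-conditioning techniques, which are its actual contribution, are omitted.

```python
import torch
import torch.nn.functional as F

def guided_eps(eps_model, classifier, x_t, t, y, alpha_bar_t, scale=2.0):
    """One DDPM-style guided denoising step with an off-the-shelf classifier."""
    x_in = x_t.detach().requires_grad_(True)
    eps = eps_model(x_in, t)                       # pretrained noise predictor
    # One-step estimate of the clean image, so a classifier trained on clean
    # data can score it without any noise-aware fine-tuning.
    x0_hat = (x_in - (1 - alpha_bar_t) ** 0.5 * eps) / alpha_bar_t ** 0.5
    log_p = F.log_softmax(classifier(x0_hat), dim=-1)[torch.arange(len(y)), y]
    grad = torch.autograd.grad(log_p.sum(), x_in)[0]
    # Steer the noise prediction toward higher classifier likelihood of y.
    return (eps - (1 - alpha_bar_t) ** 0.5 * scale * grad).detach()
```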

The integrated sensing and communication (ISAC) problem with general state and channel distributions is investigated. General formulas for the capacity-distortion tradeoff of the ISAC problem under maximal and average distortion constraints are provided. The results cover existing communication models such as the general point-to-point channel and the Gel'fand-Pinsker channel. Further settings, including memoryless states and channels, mixed states and channels, and rate-limited CSI at one side, are also considered. Numerical results focus on the writing-on-dirty-paper model and channels with ergodic/non-ergodic fading.
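For orientation, the memoryless special case admits a well-known single-letter characterization (this is the classical mono-static result with sender-side generalized feedback $Z$, stated here from the literature as context; it is not the paper's general formula):

```latex
C(D) = \max_{P_X \in \mathcal{P}_D} I(X;Y),
\qquad
\mathcal{P}_D = \bigl\{ P_X : \mathbb{E}\bigl[ d\bigl(S, \hat{s}^*(X,Z)\bigr) \bigr] \le D \bigr\},
```

where the per-letter estimator $\hat{s}^*(x,z) = \arg\min_{s'} \sum_{s} P_{S \mid XZ}(s \mid x, z)\, d(s, s')$ is optimal, so the distortion constraint restricts only the input distribution.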

Data-driven predictions are often perceived as inaccurate in hindsight due to behavioral responses. We consider the role of interface design choices in how individuals respond to predictions presented on a shared information display in a strategic setting. We introduce a novel staged experimental design to investigate the effects of interface design features, such as the visualization of prediction uncertainty and prediction error, within a repeated congestion game. In this game, participants assume the role of taxi drivers and use a shared information display to decide where to search for their next ride. Our experimental design endows agents with varying level-$k$ depths of thinking, allowing some agents to possess greater sophistication in anticipating the decisions of others who use the same information display. Through several large pre-registered experiments, we identify trade-offs between displays that are optimal for individual decisions and those that best serve the collective social welfare of the system. We also find that the influence of display characteristics varies with an agent's strategic sophistication. We observe that design choices promoting individual-level decision-making can lead to suboptimal system outcomes, manifested as a lower realization of potential social welfare. However, this decline in social welfare is offset by a slight reduction in distribution shift, narrowing the gap between predicted and realized system outcomes, which may enhance the perceived reliability and trustworthiness of the information display post hoc. Our findings pave the way for new research questions concerning the design of effective prediction interfaces in strategic environments.
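As a toy model of the level-$k$ endowment (our illustration only; the actual experiments use human participants and a visual display, not simulated agents), a level-$k$ driver can be coded as best-responding to simulated level-$(k-1)$ choices of the other drivers:

```python
import numpy as np

def level_k_choice(costs, n_drivers, k, rng):
    """Pick a search zone in a toy congestion game. `costs` is the displayed
    predicted cost per zone; each co-located driver adds one unit of congestion."""
    if k == 0:
        return int(rng.integers(len(costs)))   # level-0: choose uniformly at random
    # Level-k: assume the other drivers reason at level k-1, simulate their
    # choices, and best-respond to the resulting congestion.
    others = [level_k_choice(costs, n_drivers, k - 1, rng)
              for _ in range(n_drivers - 1)]
    load = np.bincount(others, minlength=len(costs))
    return int(np.argmin(costs + load))        # congestion-adjusted best reply

rng = np.random.default_rng(7)
zone = level_k_choice(np.array([3.0, 4.0, 6.0]), n_drivers=10, k=2, rng=rng)
```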

Deploying Reinforcement Learning (RL) algorithms for real-world robotics applications requires ensuring the safety of the robot and its environment. Safe Robot RL (SRRL) is a crucial step towards achieving human-robot coexistence. In this paper, we envision a human-centered SRRL framework consisting of three stages: safe exploration, safety value alignment, and safe collaboration. We examine the research gaps in these areas and propose leveraging interactive behaviors for SRRL. Interactive behaviors enable bi-directional information transfer between humans and robots, for example through conversational agents such as ChatGPT. We argue that interactive behaviors deserve further attention from the SRRL community. We discuss four open challenges related to the robustness, efficiency, transparency, and adaptability of SRRL with interactive behaviors.

Large Language Models (LLMs) have shown excellent generalization capabilities, which has led to the development of numerous models. These models propose new architectures, tweak existing architectures with refined training strategies, increase context length, use high-quality training data, and increase training time to outperform baselines. Analyzing new developments is crucial for identifying changes that enhance training stability and improve generalization in LLMs. This survey paper comprehensively analyzes LLM architectures and their categorization, training strategies, training datasets, and performance evaluations, and discusses future research directions. Moreover, the paper discusses the basic building blocks and concepts behind LLMs, followed by a complete overview of LLMs, including their important features and functions. Finally, the paper summarizes significant findings from LLM research and consolidates essential architectural and training strategies for developing advanced LLMs. Given the continuous advancements in LLMs, we intend to update this paper regularly by incorporating new sections and featuring the latest LLM models.

Graph Neural Networks (GNNs) have demonstrated superior performance in many challenging applications, including few-shot learning tasks. Despite their powerful capacity to learn and generalize from few samples, GNNs usually suffer from severe over-fitting and over-smoothing as the model becomes deep, which limits model scalability. In this work, we propose a novel Attentive GNN to tackle these challenges by incorporating a triple-attention mechanism: node self-attention, neighborhood attention, and layer memory attention. We explain why the proposed attentive modules can improve GNNs for few-shot learning through theoretical analysis and illustrations. Extensive experiments show that the proposed Attentive GNN outperforms state-of-the-art GNN-based methods for few-shot learning on the mini-ImageNet and Tiered-ImageNet datasets, in both inductive and transductive settings.
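One way the triple-attention mechanism could be wired is sketched below; the dimensions, gating, and placement of the three modules are our assumptions for illustration, not the authors' architecture (e.g., the layer-memory module here is a simple gate over the layer input, intended to counter over-smoothing).

```python
import torch
import torch.nn as nn

class AttentiveGNNLayer(nn.Module):
    """Illustrative GNN layer with node self-attention, neighborhood
    attention, and a layer-memory gate. `adj` is a 0/1 (nodes, nodes)
    adjacency matrix assumed to include self-loops."""

    def __init__(self, dim):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(dim, num_heads=1, batch_first=True)
        self.neigh_score = nn.Linear(2 * dim, 1)  # scores each (i, j) edge
        self.memory_gate = nn.Linear(2 * dim, 1)  # mixes output with layer input

    def forward(self, h, adj):                    # h: (batch, nodes, dim)
        # (1) Node self-attention across all nodes.
        h_self, _ = self.self_attn(h, h, h)
        # (2) Neighborhood attention: weight edges before aggregation.
        n = h.size(1)
        pairs = torch.cat([h_self.unsqueeze(2).expand(-1, -1, n, -1),
                           h_self.unsqueeze(1).expand(-1, n, -1, -1)], dim=-1)
        logits = self.neigh_score(pairs).squeeze(-1)          # (batch, n, n)
        logits = logits.masked_fill(adj.unsqueeze(0) == 0, float('-inf'))
        h_neigh = torch.softmax(logits, dim=-1) @ h_self
        # (3) Layer-memory attention: gated residual to the layer input,
        # intended to mitigate over-smoothing in deep stacks.
        g = torch.sigmoid(self.memory_gate(torch.cat([h, h_neigh], dim=-1)))
        return g * h_neigh + (1 - g) * h
```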

Within the rapidly developing Internet of Things (IoT), numerous and diverse physical devices, Edge devices, Cloud infrastructure, and their quality-of-service (QoS) requirements need to be represented within a unified specification in order to enable rapid IoT application development, monitoring, and dynamic reconfiguration. However, heterogeneity among configuration knowledge representation models limits the acquisition, discovery, and curation of configuration knowledge for coordinated IoT applications. This paper proposes a unified data model to represent IoT resource configuration knowledge artifacts. It also proposes IoT-CANE (Context-Aware recommendatioN systEm) to facilitate incremental knowledge acquisition and declarative, context-driven knowledge recommendation.
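To illustrate what a unified configuration record might look like (the field names and structure below are hypothetical; the paper defines its own schema), consider:

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class QoSRequirement:
    metric: str        # e.g. "latency"
    operator: str      # e.g. "<="
    value: float
    unit: str          # e.g. "ms"

@dataclass
class IoTResource:
    """Hypothetical unified record for a device, Edge, or Cloud resource."""
    resource_id: str
    layer: str                                               # "device" | "edge" | "cloud"
    capabilities: Dict[str, str] = field(default_factory=dict)
    qos: List[QoSRequirement] = field(default_factory=list)
    context: Dict[str, str] = field(default_factory=dict)    # e.g. location

# Example: an edge gateway described in the unified schema.
gateway = IoTResource(
    resource_id="edge-gw-01",
    layer="edge",
    capabilities={"cpu_cores": "4", "protocol": "MQTT"},
    qos=[QoSRequirement("latency", "<=", 50.0, "ms")],
    context={"location": "building-3"},
)
```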
