亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

We revisit the classical problem of multiclass classification with bandit feedback (Kakade, Shalev-Shwartz and Tewari, 2008), where each input classifies to one of $K$ possible labels and feedback is restricted to whether the predicted label is correct or not. Our primary inquiry is with regard to the dependency on the number of labels $K$, and whether $T$-step regret bounds in this setting can be improved beyond the $\smash{\sqrt{KT}}$ dependence exhibited by existing algorithms. Our main contribution is in showing that the minimax regret of bandit multiclass is in fact more nuanced, and is of the form $\smash{\widetilde{\Theta}\left(\min \left\{|\mathcal{H}| + \sqrt{T}, \sqrt{KT \log |{\mathcal{H}|}} \right\} \right) }$, where $\mathcal{H}$ is the underlying (finite) hypothesis class. In particular, we present a new bandit classification algorithm that guarantees regret $\smash{\widetilde{O}(|\mathcal{H}|+\sqrt{T})}$, improving over classical algorithms for moderately-sized hypothesis classes, and give a matching lower bound establishing tightness of the upper bounds (up to log-factors) in all parameter regimes.

相關內容

DRAM read disturbance can break memory isolation, a fundamental property to ensure system robustness (i.e., reliability, security, safety). RowHammer and RowPress are two different DRAM read disturbance phenomena. RowHammer induces bitflips in physically adjacent victim DRAM rows by repeatedly opening and closing an aggressor DRAM row, while RowPress induces bitflips by keeping an aggressor DRAM row open for a long period of time. In this study, we characterize a DRAM access pattern that combines RowHammer and RowPress in 84 real DDR4 DRAM chips from all three major DRAM manufacturers. Our key results show that 1) this combined RowHammer and RowPress pattern takes significantly smaller amount of time (up to 46.1% faster) to induce the first bitflip compared to the state-of-the-art RowPress pattern, and 2) at the minimum aggressor row activation count to induce at least one bitflip, the bits that flip are different across RowHammer, RowPress, and the combined patterns. Based on our results, we provide a key hypothesis that the read disturbance effect caused by RowPress from one of the two aggressor rows in a double-sided pattern is much more significant than the other.

Cyber Essentials (CE) comprise a set of controls designed to protect organisations, irrespective of their size, against cyber attacks. The controls are firewalls, secure configuration, user access control, malware protection & security update management. In this work, we explore the extent to which CE remains robust against an ever-evolving threat landscape. To that end, we reconstruct 45 breaches mapped to MiTRE ATT&CK using an Incident Fault Tree ( IFT ) approach. Our method reveals the intersections where the placement of controls could have protected organisations. Then we identify appropriate Cyber Essential controls and/or Additional Controls for these vulnerable intersections. Our results show that CE controls can effectively protect against most attacks during the initial attack phase. However, they may need to be complemented with additional Controls if the attack proceeds further into organisational systems & networks. The Additional Controls (AC) we identify include back-ups, security awareness, logging and monitoring. Our analysis brings to the fore a foundational issue as to whether controls should exclude recovery and focus only on pre-emption. The latter makes the strong assumption that a prior identification of all controls in a dynamic threat landscape is indeed possible. Furthermore, any potential broadening of technical controls entails re-scoping the skills that are required for a Cyber Essentials (CE) assessor. To that end, we suggest human factors and security operations and incident management as two potential knowledge areas from Cyber Security Body of Knowledge (CyBOK) if there is any broadening of CE based on these findings.

Load balancing and auto scaling are at the core of scalable, contemporary systems, addressing dynamic resource allocation and service rate adjustments in response to workload changes. This paper introduces a novel model and algorithms for tuning load balancers coupled with auto scalers, considering bursty traffic arriving at finite queues. We begin by presenting the problem as a weakly coupled Markov Decision Processes (MDP), solvable via a linear program (LP). However, as the number of control variables of such LP grows combinatorially, we introduce a more tractable relaxed LP formulation, and extend it to tackle the problem of online parameter learning and policy optimization using a two-timescale algorithm based on the LP Lagrangian.

Language models (LMs) are known to suffer from forgetting of previously learned examples when fine-tuned, breaking stability of deployed LM systems. Despite efforts on mitigating forgetting, few have investigated whether, and how forgotten upstream examples are associated with newly learned tasks. Insights on such associations enable efficient and targeted mitigation of forgetting. In this paper, we empirically analyze forgetting that occurs in $N$ upstream examples while the model learns $M$ new tasks and visualize their associations with a $M \times N$ matrix. We empirically demonstrate that the degree of forgetting can often be approximated by simple multiplicative contributions of the upstream examples and newly learned tasks. We also reveal more complicated patterns where specific subsets of examples are forgotten with statistics and visualization. Following our analysis, we predict forgetting that happens on upstream examples when learning a new task with matrix completion over the empirical associations, outperforming prior approaches that rely on trainable LMs. Project website: //inklab.usc.edu/lm-forgetting-prediction/

Vision-language models (VLMs) seamlessly integrate visual and textual data to perform tasks such as image classification, caption generation, and visual question answering. However, adversarial images often struggle to deceive all prompts effectively in the context of cross-prompt migration attacks, as the probability distribution of the tokens in these images tends to favor the semantics of the original image rather than the target tokens. To address this challenge, we propose a Contextual-Injection Attack (CIA) that employs gradient-based perturbation to inject target tokens into both visual and textual contexts, thereby improving the probability distribution of the target tokens. By shifting the contextual semantics towards the target tokens instead of the original image semantics, CIA enhances the cross-prompt transferability of adversarial images.Extensive experiments on the BLIP2, InstructBLIP, and LLaVA models show that CIA outperforms existing methods in cross-prompt transferability, demonstrating its potential for more effective adversarial strategies in VLMs.

Cooperative Multi-Agent Reinforcement Learning (MARL) algorithms, trained only to optimize task reward, can lead to a concentration of power where the failure or adversarial intent of a single agent could decimate the reward of every agent in the system. In the context of teams of people, it is often useful to explicitly consider how power is distributed to ensure no person becomes a single point of failure. Here, we argue that explicitly regularizing the concentration of power in cooperative RL systems can result in systems which are more robust to single agent failure, adversarial attacks, and incentive changes of co-players. To this end, we define a practical pairwise measure of power that captures the ability of any co-player to influence the ego agent's reward, and then propose a power-regularized objective which balances task reward and power concentration. Given this new objective, we show that there always exists an equilibrium where every agent is playing a power-regularized best-response balancing power and task reward. Moreover, we present two algorithms for training agents towards this power-regularized objective: Sample Based Power Regularization (SBPR), which injects adversarial data during training; and Power Regularization via Intrinsic Motivation (PRIM), which adds an intrinsic motivation to regulate power to the training objective. Our experiments demonstrate that both algorithms successfully balance task reward and power, leading to lower power behavior than the baseline of task-only reward and avoid catastrophic events in case an agent in the system goes off-policy.

Bayesian inference has been used in the past to model visual perception (Kersten, Mamassian, & Yuille, 2004), accounting for the Helmholtz principle of perception as "unconscious inference" that is constrained by bottom-up sensory evidence (likelihood) while subject to top-down expectation, priming, or other contextual influences (prior bias); here "unconsciousness" merely relates to the "directness" of perception in the sense of Gibson. Here, we adopt the same Bayesian framework to model emotion process in accordance with Schachter-Singer's Two-Factor theory, which argues that emotion is the outcome of cognitive labeling or attribution of a diffuse pattern of autonomic arousal (Schachter & Singer, 1962). In analogous to visual perception, we conceptualize the emotion process as an instance of Bayesian inference, combining the contextual information with a person's physiological arousal patterns. Drift-diffusion models were constructed to simulate emotional processes, where the decision boundaries correspond to the emotional state experienced by the participants, and boundary-crossing constitutes "labeling" in Schachter-Singer's sense. Our model is tested against experimental data from the Schachter & Singer's study (1962) and the Ross et al. study (1969). Two model scenarios are investigated, in which arousal pattern as one factor is pitted against contextual interaction with an confederate (in Schachter-Singer case) or explicitly instructed mis-attribution (in Ross et al. case) as another factor, mapping onto the Bayesian prior (initial position of the drift) and the likelihood function (evidence accumulation or drift rate). We find that the first scenario (arousal as the prior and context as the likelihood) has a better fit with Schachter & Singer (1962) whereas the second scenario (context as the prior and arousal as the likelihood) has a better fit with Ross et al. (1969).

This paper addresses the task of modeling Deformable Linear Objects (DLOs), such as ropes and cables, during dynamic motion over long time horizons. This task presents significant challenges due to the complex dynamics of DLOs. To address these challenges, this paper proposes differentiable Discrete Elastic Rods For deformable linear Objects with Real-time Modeling (DEFORM), a novel framework that combines a differentiable physics-based model with a learning framework to model DLOs accurately and in real-time. The performance of DEFORM is evaluated in an experimental setup involving two industrial robots and a variety of sensors. A comprehensive series of experiments demonstrate the efficacy of DEFORM in terms of accuracy, computational speed, and generalizability when compared to state-of-the-art alternatives. To further demonstrate the utility of DEFORM, this paper integrates it into a perception pipeline and illustrates its superior performance when compared to the state-of-the-art methods while tracking a DLO even in the presence of occlusions. Finally, this paper illustrates the superior performance of DEFORM when compared to state-of-the-art methods when it is applied to perform autonomous planning and control of DLOs. Project page: //roahmlab.github.io/DEFORM/.

Retrieval-Augmented Generation (RAG) merges retrieval methods with deep learning advancements to address the static limitations of large language models (LLMs) by enabling the dynamic integration of up-to-date external information. This methodology, focusing primarily on the text domain, provides a cost-effective solution to the generation of plausible but incorrect responses by LLMs, thereby enhancing the accuracy and reliability of their outputs through the use of real-world data. As RAG grows in complexity and incorporates multiple concepts that can influence its performance, this paper organizes the RAG paradigm into four categories: pre-retrieval, retrieval, post-retrieval, and generation, offering a detailed perspective from the retrieval viewpoint. It outlines RAG's evolution and discusses the field's progression through the analysis of significant studies. Additionally, the paper introduces evaluation methods for RAG, addressing the challenges faced and proposing future research directions. By offering an organized framework and categorization, the study aims to consolidate existing research on RAG, clarify its technological underpinnings, and highlight its potential to broaden the adaptability and applications of LLMs.

This paper presents SimCLR: a simple framework for contrastive learning of visual representations. We simplify recently proposed contrastive self-supervised learning algorithms without requiring specialized architectures or a memory bank. In order to understand what enables the contrastive prediction tasks to learn useful representations, we systematically study the major components of our framework. We show that (1) composition of data augmentations plays a critical role in defining effective predictive tasks, (2) introducing a learnable nonlinear transformation between the representation and the contrastive loss substantially improves the quality of the learned representations, and (3) contrastive learning benefits from larger batch sizes and more training steps compared to supervised learning. By combining these findings, we are able to considerably outperform previous methods for self-supervised and semi-supervised learning on ImageNet. A linear classifier trained on self-supervised representations learned by SimCLR achieves 76.5% top-1 accuracy, which is a 7% relative improvement over previous state-of-the-art, matching the performance of a supervised ResNet-50. When fine-tuned on only 1% of the labels, we achieve 85.8% top-5 accuracy, outperforming AlexNet with 100X fewer labels.

北京阿比特科技有限公司