
Activation patching is a popular mechanistic interpretability technique, but it has many subtleties regarding how it is applied and how the results may be interpreted. We provide a summary of advice and best practices, based on our experience using this technique in practice. We include an overview of the different ways to apply activation patching and a discussion of how to interpret the results. We focus on what evidence patching experiments provide about circuits, and on the choice of metric and associated pitfalls.
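As a rough illustration of the operation this abstract describes, the sketch below patches a cached "clean" activation into a "corrupted" forward pass of a toy numpy MLP and compares a logit-difference metric. This is a minimal stand-in (toy model, illustrative names), not the paper's setup, which concerns transformer components.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-layer network; random weights stand in for a trained model.
W1, W2 = rng.normal(size=(4, 8)), rng.normal(size=(8, 2))

def forward(x, patch=None, units=slice(None)):
    """Run the network; optionally overwrite the chosen hidden units
    with cached activations (the core of activation patching)."""
    h = np.tanh(x @ W1)
    if patch is not None:
        h = h.copy()
        h[units] = patch[units]
    return h, h @ W2

clean_x, corrupt_x = rng.normal(size=4), rng.normal(size=4)
clean_h, clean_logits = forward(clean_x)
_, corrupt_logits = forward(corrupt_x)
# Patch only hidden units 0-3 of the corrupted run with clean values.
_, patched_logits = forward(corrupt_x, patch=clean_h, units=slice(0, 4))

metric = lambda l: float(l[0] - l[1])   # logit difference
print("clean:  ", metric(clean_logits))
print("corrupt:", metric(corrupt_logits))
print("patched:", metric(patched_logits))  # how much units 0-3 restore
```

How far the patched metric moves from the corrupted value toward the clean value is the usual reading of such an experiment; the abstract's point is that this reading depends heavily on the metric chosen.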

Related content

Recurrent neural networks (RNNs) notoriously struggle to learn long-term memories, primarily due to vanishing and exploding gradients. The recent success of state-space models (SSMs), a subclass of RNNs, in overcoming such difficulties challenges our theoretical understanding. In this paper, we delve into the optimization challenges of RNNs and discover that, as the memory of a network increases, changes in its parameters result in increasingly large output variations, making gradient-based learning highly sensitive, even without exploding gradients. Our analysis further reveals the importance of the element-wise recurrence design pattern combined with careful parametrizations in mitigating this effect. This feature is present in SSMs, as well as in other architectures, such as LSTMs. Overall, our insights provide a new explanation for some of the difficulties in gradient-based learning of RNNs and why some architectures perform better than others.
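The element-wise recurrence pattern the abstract highlights can be sketched in a few lines: each state channel evolves independently under a scalar decay, and a parametrization such as $a = \exp(-\exp(\nu))$ keeps the decay in $(0,1)$ so the state cannot explode. This is a generic illustration of the design pattern (variable names are mine), not the paper's model.

```python
import numpy as np

def diagonal_ssm(x, log_decay, b):
    """Element-wise linear recurrence h_t = a * h_{t-1} + b * x_t,
    with a = exp(-exp(log_decay)) guaranteed to lie in (0, 1).
    Each channel mixes only with itself, unlike a dense RNN."""
    a = np.exp(-np.exp(log_decay))
    h = np.zeros_like(b)
    states = []
    for x_t in x:
        h = a * h + b * x_t
        states.append(h.copy())
    return np.stack(states)

rng = np.random.default_rng(0)
x = rng.normal(size=100)                 # scalar input sequence
log_decay = np.array([-3.0, 0.0, 1.0])   # slow, medium, fast channels
b = np.ones(3)
states = diagonal_ssm(x, log_decay, b)
print(states.shape)                      # (100, 3)
```

Because each channel is a scalar geometric sum, small parameter changes perturb each channel's memory independently, which is one intuition for why this pattern behaves better under gradient descent than dense recurrences.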

Robotic exploration has long captivated researchers aiming to map complex environments efficiently. Techniques such as potential fields and frontier exploration have traditionally been employed in this pursuit, primarily focusing on solitary agents. Recent advancements have shifted towards optimizing exploration efficiency through multiagent systems. However, many existing approaches overlook critical real-world factors, such as broadcast range limitations, communication costs, and coverage overlap. This paper addresses these gaps by proposing a distributed maze exploration strategy (CU-LVP) that assumes constrained broadcast ranges and utilizes Voronoi diagrams for better area partitioning. By adapting traditional multiagent methods to distributed environments with limited broadcast ranges, this study evaluates their performance across diverse maze topologies, demonstrating the efficacy and practical applicability of the proposed method. The code and experimental results supporting this study are available in the following repository: https://github.com/manouslinard/multiagent-exploration/.
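A discrete Voronoi partition of the kind used for area division can be sketched directly: assign each free grid cell to its nearest agent. This is a simplified illustration (names are mine, not the CU-LVP implementation), ignoring broadcast constraints.

```python
import numpy as np

def voronoi_partition(free_cells, agent_positions):
    """Assign each free grid cell to its nearest agent: a discrete
    Voronoi partition, the basic building block of Voronoi-based
    area division for multiagent exploration."""
    cells = np.asarray(free_cells, dtype=float)        # (N, 2)
    agents = np.asarray(agent_positions, dtype=float)  # (K, 2)
    # Squared Euclidean distance from every cell to every agent.
    d2 = ((cells[:, None, :] - agents[None, :, :]) ** 2).sum(-1)
    return d2.argmin(axis=1)                           # owner per cell

free = [(x, y) for x in range(10) for y in range(10)]
owners = voronoi_partition(free, [(0, 0), (9, 9)])
print(np.bincount(owners))   # cells assigned to each agent
```

Each agent then explores only its own region, which directly reduces the coverage overlap the abstract identifies as a neglected cost.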

Within distributed learning, workers typically compute gradients on their assigned dataset chunks and send them to the parameter server (PS), which aggregates them to compute either an exact or approximate version of $\nabla L$ (gradient of the loss function $L$). However, in large-scale clusters, many workers are slower than their promised speed or even failure-prone. A gradient coding solution introduces redundancy within the assignment of chunks to the workers and uses coding theoretic ideas to allow the PS to recover $\nabla L$ (exactly or approximately), even in the presence of stragglers. Unfortunately, most existing gradient coding protocols are inefficient from a computation perspective as they coarsely classify workers as operational or failed; the potentially valuable work performed by slow workers (partial stragglers) is ignored. In this work, we present novel gradient coding protocols that judiciously leverage the work performed by partial stragglers. Our protocols are efficient from a computation and communication perspective and numerically stable. For an important class of chunk assignments, we present efficient algorithms for optimizing the relative ordering of chunks within the workers; this ordering affects the overall execution time. For exact gradient reconstruction, our protocol is around $2\times$ faster than the original class of protocols and for approximate gradient reconstruction, the mean-squared-error of our reconstructed gradient is several orders of magnitude better.
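To make the gradient coding idea concrete, here is the classic fractional-repetition scheme: workers are split into groups, every worker in a group computes the same chunk gradients, and the PS recovers $\nabla L$ exactly from one survivor per group. This is a standard baseline code for intuition, not the paper's partial-straggler protocol, and the names are illustrative.

```python
import numpy as np

def fractional_repetition(grads, n_workers, r, stragglers):
    """Exact gradient recovery under fractional-repetition coding:
    each group of r workers replicates the same chunks, so the PS
    needs only one survivor per group (tolerates any r-1 stragglers
    in the worst case)."""
    k = len(grads)                        # number of chunk gradients
    groups = n_workers // r
    chunks_per_group = k // groups
    total = np.zeros_like(grads[0])
    for g in range(groups):
        members = [g * r + j for j in range(r)]
        alive = [w for w in members if w not in stragglers]
        if not alive:
            raise RuntimeError("every worker in a group straggled")
        # Any surviving member's partial sum equals the group's sum.
        lo, hi = g * chunks_per_group, (g + 1) * chunks_per_group
        total += sum(grads[lo:hi])
    return total

rng = np.random.default_rng(0)
grads = [rng.normal(size=5) for _ in range(6)]   # 6 chunk gradients
full = sum(grads)
rec = fractional_repetition(grads, n_workers=6, r=2, stragglers={1, 2})
print(np.allclose(rec, full))   # True: exact despite 2 stragglers
```

The inefficiency the paper targets is visible here: a slow worker that finished half its chunks contributes nothing under this coarse alive/failed classification.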

Expander graphs, due to their mixing properties, are useful in many algorithms and combinatorial constructions. One can produce an expander graph with high probability by taking a random graph (e.g., the union of $d$ random bijections for a bipartite graph of degree $d$). This construction is much simpler than all known explicit constructions of expanders and gives graphs with good mixing properties (small second largest eigenvalue) with high probability. However, from the practical viewpoint, it uses too many random bits, so it is difficult to generate and store these bits for large graphs. The natural idea is to restrict the class of the bijections that we use. For example, if both sides are linear spaces $\mathbb{F}_q^k$ over a finite field $\mathbb{F}_q$, we may consider only \emph{linear} bijections, making the number of random bits polynomial in $k$ (and not $q^k$). In this paper we provide some experimental data showing that this approach preserves the mixing properties (the second eigenvalue) for several types of graphs (undirected regular and biregular bipartite graphs). We also prove some upper bounds for the second eigenvalue (though they are quite weak compared with the experimental results). Finally, we discuss the possibility of decreasing the number of random bits further by using Toeplitz matrices; our experiments show that this change makes the mixing properties only marginally worse while the number of random bits decreases significantly.
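A small experiment in this spirit can be run directly: build a bipartite graph on $\mathbb{F}_2^k$ from $d$ random bijections and compute the second singular value of the normalized biadjacency matrix. One caveat in this sketch: I use random *affine* bijections $x \mapsto Mx + b$ rather than purely linear ones, since every linear map fixes $0$ and would trivially pin the second singular value at 1; the paper's actual experimental setup may differ.

```python
import numpy as np

def random_gl2(k, rng):
    """Sample an invertible k x k matrix over GF(2) by rejection."""
    while True:
        M = rng.integers(0, 2, size=(k, k))
        A, ok = M.copy(), True
        for c in range(k):                      # elimination mod 2
            piv = next((r for r in range(c, k) if A[r, c]), None)
            if piv is None:
                ok = False
                break
            A[[c, piv]] = A[[piv, c]]
            for r in range(k):
                if r != c and A[r, c]:
                    A[r] ^= A[c]
        if ok:
            return M

def second_singular_value(k, d, rng):
    """Bipartite degree-d graph on F_2^k built from d random affine
    bijections x -> Mx + b; returns the second singular value of the
    normalized biadjacency matrix (small means good mixing)."""
    n = 2 ** k
    bits = (np.arange(n)[:, None] >> np.arange(k)) & 1
    B = np.zeros((n, n))
    for _ in range(d):
        M, b = random_gl2(k, rng), rng.integers(0, 2, size=k)
        img_bits = (bits @ M.T + b) % 2
        img = (img_bits << np.arange(k)).sum(axis=1)
        B[np.arange(n), img] += 1.0             # add one permutation
    return np.linalg.svd(B / d, compute_uv=False)[1]

rng = np.random.default_rng(0)
s2 = second_singular_value(k=6, d=4, rng=rng)
print(s2)
```

Each matrix costs only $O(k^2)$ random bits instead of the $O(2^k \log 2^k)$ bits of an arbitrary bijection, which is exactly the savings the abstract is after.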

The execution of Belief-Desire-Intention (BDI) agents in a Multi-Agent System (MAS) can be practically implemented on top of low-level concurrency mechanisms that impact efficiency, determinism, and reproducibility. We argue that developers should specify the MAS behaviour independently of the execution model, and choose or configure the concurrency model later on, according to the specific needs of their target domain, leaving the MAS specification unaffected. We identify patterns for mapping the agent execution over the underlying concurrency abstractions, and investigate which concurrency models are supported by some of the most commonly used BDI platforms. Although most frameworks support multiple concurrency models, we find that they mostly hide them under the hood, making them opaque to the developer, and actually limiting the possibility of fine-tuning the MAS.
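Two of the mapping patterns the abstract alludes to can be contrasted in a toy form: running all agents round-robin on one thread (deterministic, reproducible) versus one thread per agent (parallel, but with run-to-run interleaving). The agent here is a bare sense-plan-act stub, purely illustrative and unrelated to any real BDI platform's API.

```python
import threading, queue

def make_agent(name, log):
    beliefs = {"count": 0}
    def step():
        beliefs["count"] += 1              # update beliefs (sense/plan)
        log.put((name, beliefs["count"]))  # act
    return step

def run_round_robin(agents, steps):
    """All agents on one thread: deterministic, reproducible order."""
    for _ in range(steps):
        for step in agents:
            step()

def run_thread_per_agent(agents, steps):
    """One OS thread per agent: parallel execution, but the action
    interleaving varies between runs -- the reproducibility cost."""
    def loop(step):
        for _ in range(steps):
            step()
    ts = [threading.Thread(target=loop, args=(a,)) for a in agents]
    for t in ts: t.start()
    for t in ts: t.join()

log = queue.Queue()
agents = [make_agent(f"agent{i}", log) for i in range(3)]
run_round_robin(agents, steps=2)
print(list(log.queue))   # always the same round-robin order
```

The point of the abstract is that the agent code (`make_agent`) is identical under both runners; only the deployment-time choice of runner changes the system's concurrency properties.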

Covariate-adaptive randomization is widely employed to balance baseline covariates in interventional studies such as clinical trials and experiments in development economics. Recent years have witnessed substantial progress in inference under covariate-adaptive randomization with a fixed number of strata. However, concerns have been raised about the impact of a large number of strata on its design and analysis, which is a common scenario in practice, such as in multicenter randomized clinical trials. In this paper, we propose a general framework for inference under covariate-adaptive randomization, which extends the seminal works of Bugni et al. (2018, 2019) by allowing for a diverging number of strata. Furthermore, we introduce a novel weighted regression adjustment that ensures efficiency improvement. On top of establishing the asymptotic theory, practical algorithms for handling situations involving an extremely large number of strata are also developed. Moreover, by linking design balance and inference robustness, we highlight the advantages of stratified block randomization, which enforces better covariate balance within strata compared to simple randomization. This paper offers a comprehensive landscape of inference under covariate-adaptive randomization, spanning from fixed to diverging to extremely large numbers of strata.
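The design the abstract favors, stratified block randomization, can be illustrated in a few lines: treatment labels are assigned within each stratum in shuffled balanced blocks, so within-stratum imbalance is bounded by half the block size. A standard textbook design sketched with illustrative names, not the paper's code.

```python
import numpy as np

def stratified_block_randomize(strata, block_size=4, rng=None):
    """Permuted-block randomization within each stratum: labels are
    drawn in shuffled blocks containing block_size/2 treated and
    block_size/2 control units, so the treated/control imbalance in
    any stratum never exceeds block_size / 2."""
    rng = rng or np.random.default_rng()
    strata = np.asarray(strata)
    assign = np.empty(len(strata), dtype=int)
    for s in np.unique(strata):
        idx = np.flatnonzero(strata == s)
        labels = []
        while len(labels) < len(idx):
            block = [0] * (block_size // 2) + [1] * (block_size // 2)
            rng.shuffle(block)
            labels.extend(block)
        assign[idx] = labels[: len(idx)]
    return assign

rng = np.random.default_rng(0)
strata = rng.integers(0, 5, size=200)        # 5 strata
a = stratified_block_randomize(strata, rng=rng)
for s in range(5):
    m = strata == s
    print(s, m.sum(), a[m].mean())           # treated share near 0.5
```

Under simple randomization the within-stratum imbalance instead grows like $\sqrt{n}$, which is the design-balance contrast the paper links to inference robustness when the number of strata diverges.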

Characterized by an outer integral connected to an inner integral through a nonlinear function, nested integration is a challenging problem in various fields, such as engineering and mathematical finance. The available numerical methods for nested integration based on Monte Carlo (MC) methods can be prohibitively expensive owing to the error propagating from the inner to the outer integral. Attempts to enhance the efficiency of these approximations using the quasi-MC (QMC) or randomized QMC (rQMC) method have focused on either the inner or outer integral approximation. This work introduces a novel nested rQMC method that simultaneously addresses the approximation of the inner and outer integrals. This method leverages the unique nested integral structure to offer a more efficient approximation mechanism. By incorporating Owen's scrambling techniques, we address integrands exhibiting infinite variation in the Hardy--Krause sense, enabling theoretically sound error estimates. As the primary contribution, we derive asymptotic error bounds for the bias and variance of our estimator, along with the regularity conditions under which these bounds can be attained. In addition, we provide nearly optimal sample sizes for the rQMC approximations underlying the numerical implementation of the proposed method. Moreover, we derive a truncation scheme to make our estimator applicable in the context of expected information gain estimation and indicate how to use importance sampling to remedy the measure concentration arising in the inner integral. We verify the estimator quality through numerical experiments by comparing the computational efficiency of the nested rQMC method against standard nested MC integration for two case studies: one in thermomechanics and the other in pharmacokinetics. These examples highlight the computational savings and enhanced applicability of the proposed approach.
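The structure of the problem can be seen in the plain nested MC estimator that the paper's rQMC method improves upon: the inner expectation is itself estimated by sampling, so its error passes through the nonlinearity into the outer estimate. The integrands below are toy choices of mine with a known answer, not the paper's case studies.

```python
import numpy as np

def nested_mc(g, f, n_outer, n_inner, rng):
    """Plain nested Monte Carlo estimate of
    E_Y[ g( E_X[ f(X, Y) ] ) ]  with X, Y ~ U(0, 1).
    The inner mean is an MC estimate, so its variance (of order
    1/n_inner) biases the outer estimate through the nonlinearity g;
    rQMC replaces these uniform draws with scrambled low-discrepancy
    points at both levels."""
    y = rng.random(n_outer)
    x = rng.random((n_outer, n_inner))
    inner = f(x, y[:, None]).mean(axis=1)   # one inner estimate per y
    return g(inner).mean()

rng = np.random.default_rng(0)
est = nested_mc(g=lambda t: t**2, f=lambda x, y: x + y,
                n_outer=4000, n_inner=200, rng=rng)
print(est)   # exact value of the toy problem is 13/12
```

Here the exact value is $\int_0^1 (y + 1/2)^2\,dy = 13/12$, so the bias from the inner estimate ($\mathrm{Var}(X)/n_{\text{inner}}$, pushed through the square) is directly visible as $n_{\text{inner}}$ shrinks.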

Red-teaming, or identifying prompts that elicit harmful responses, is a critical step in ensuring the safe and responsible deployment of large language models (LLMs). Developing effective protection against many modes of attack prompts requires discovering diverse attacks. Automated red-teaming typically uses reinforcement learning to fine-tune an attacker language model to generate prompts that elicit undesirable responses from a target LLM, as measured, for example, by an auxiliary toxicity classifier. We show that even with explicit regularization to favor novelty and diversity, existing approaches suffer from mode collapse or fail to generate effective attacks. As a flexible and probabilistically principled alternative, we propose to use GFlowNet fine-tuning, followed by a secondary smoothing phase, to train the attacker model to generate diverse and effective attack prompts. We find that the attacks generated by our method are effective against a wide range of target LLMs, both with and without safety tuning, and transfer well between target LLMs. Finally, we demonstrate that models safety-tuned using a dataset of red-teaming prompts generated by our method are robust to attacks from other RL-based red-teaming approaches.

This paper does not describe a working system. Instead, it presents a single idea about representation which allows advances made by several different groups to be combined into an imaginary system called GLOM. The advances include transformers, neural fields, contrastive representation learning, distillation and capsules. GLOM answers the question: How can a neural network with a fixed architecture parse an image into a part-whole hierarchy which has a different structure for each image? The idea is simply to use islands of identical vectors to represent the nodes in the parse tree. If GLOM can be made to work, it should significantly improve the interpretability of the representations produced by transformer-like systems when applied to vision or language.

Although measuring held-out accuracy has been the primary approach to evaluate generalization, it often overestimates the performance of NLP models, while alternative approaches for evaluating models either focus on individual tasks or on specific behaviors. Inspired by principles of behavioral testing in software engineering, we introduce CheckList, a task-agnostic methodology for testing NLP models. CheckList includes a matrix of general linguistic capabilities and test types that facilitate comprehensive test ideation, as well as a software tool to generate a large and diverse number of test cases quickly. We illustrate the utility of CheckList with tests for three tasks, identifying critical failures in both commercial and state-of-the-art models. In a user study, a team responsible for a commercial sentiment analysis model found new and actionable bugs in an extensively tested model. In another user study, NLP practitioners with CheckList created twice as many tests, and found almost three times as many bugs as users without it.
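The template-plus-test-type idea can be sketched in miniature: expand a template into many concrete cases, then run an invariance (INV) test checking that a perturbation which should not matter (here, the person's name) does not change predictions. This is a minimal stand-in for CheckList's tooling, not its actual API, and the "model" is a toy keyword matcher.

```python
import re

def fill_template(template, **slots):
    """Expand a CheckList-style template into concrete test cases
    by taking the cross product of all slot fillers."""
    cases = [template]
    for key, values in slots.items():
        cases = [c.replace("{" + key + "}", v)
                 for c in cases for v in values]
    return cases

def toy_sentiment(text):
    """Stand-in 'model': positive iff a positive keyword appears."""
    return 1 if re.search(r"\b(great|good|love)\b", text) else 0

cases = fill_template("{name} thought the flight was {adj}.",
                      name=["Alice", "Omar"], adj=["great", "good"])
print(cases)

# Invariance (INV) test: swapping the name must not flip predictions.
preds = [toy_sentiment(c) for c in cases]
print("all positive:", all(p == 1 for p in preds))
```

Real CheckList pairs such templates with other test types as well, e.g. minimum functionality tests and directional expectation tests, across a matrix of linguistic capabilities.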
