18禁不卡无毒免费网站入口,黄色网站一级二级三级视频

We empirically study a simple layer-pruning strategy for popular families of open-weight pretrained LLMs, finding minimal degradation of performance on different question-answering benchmarks until after a large fraction (up to half) of the layers are removed. To prune these models, we identify the optimal block of layers to prune by considering similarity across layers; then, to "heal" the damage, we perform a small amount of finetuning. In particular, we use parameter-efficient finetuning (PEFT) methods, specifically quantization and Low Rank Adapters (QLoRA), such that each of our experiments can be performed on a single A100 GPU. From a practical perspective, these results suggest that layer pruning methods can complement other PEFT strategies to further reduce computational resources of finetuning on the one hand, and can improve the memory and latency of inference on the other hand. From a scientific perspective, the robustness of these LLMs to the deletion of layers implies either that current pretraining methods are not properly leveraging the parameters in the deeper layers of the network or that the shallow layers play a critical role in storing knowledge.

相關內容

層

關注 2

LastPass · 分解的 · Analysis · Integration · Cognition ·

2024 年 5 月 9 日

Human Factors in the LastPass Breach

Niroop Sugunaraj

This paper examines the complex nature of cyber attacks through an analysis of the LastPass breach. It argues for the integration of human-centric considerations into cybersecurity measures, focusing on mitigating factors such as goal-directed behavior, cognitive overload, human biases (e.g., optimism, anchoring), and risky behaviors. Findings from an analysis of this breach offers support to the perspective that addressing both the human and technical dimensions of cyber defense can significantly enhance the resilience of cyber systems against complex threats. This means maintaining a balanced approach while simultaneously simplifying user interactions, making users aware of biases, and discouraging risky practices are essential for preventing cyber incidents.

SLAM · 模型評估 · 穩健性 · 設計 · 數據集 ·

2024 年 5 月 9 日

Design and Evaluation of a Generic Visual SLAM Framework for Multi-Camera Systems

Pushyami Kaveti,Shankara Narayanan Vaidyanathan,Arvind Thamilchelvan,Hanumant Singh

Multi-camera systems have been shown to improve the accuracy and robustness of SLAM estimates, yet state-of-the-art SLAM systems predominantly support monocular or stereo setups. This paper presents a generic sparse visual SLAM framework capable of running on any number of cameras and in any arrangement. Our SLAM system uses the generalized camera model, which allows us to represent an arbitrary multi-camera system as a single imaging device. Additionally, it takes advantage of the overlapping fields of view (FoV) by extracting cross-matched features across cameras in the rig. This limits the linear rise in the number of features with the number of cameras and keeps the computational load in check while enabling an accurate representation of the scene. We evaluate our method in terms of accuracy, robustness, and run time on indoor and outdoor datasets that include challenging real-world scenarios such as narrow corridors, featureless spaces, and dynamic objects. We show that our system can adapt to different camera configurations and allows real-time execution for typical robotic applications. Finally, we benchmark the impact of the critical design parameters - the number of cameras and the overlap between their FoV that define the camera configuration for SLAM. All our software and datasets are freely available for further research.

相互獨立的 · Networking · MoDELS · Extensibility · INTERACT ·

2024 年 5 月 9 日

On the Potential of an Independent Avatar to Augment Metaverse Social Networks

Theofanis P. Raptis,Chiara Boldrini,Marco Conti,Andrea Passarella

from arxiv, Supported by the projects: Piano Nazionale di Ripresa e Resilienza IR0000013 - "SoBigData.it", Partenariato Esteso PE00000013 - "FAIR", Centro Nazionale CN00000013 - "ICSC"

We present a computational modelling approach which targets capturing the specifics on how to virtually augment a Metaverse user's available social time capacity via using an independent and autonomous version of her digital representation in the Metaverse. We motivate why this is a fundamental building block to model large-scale social networks in the Metaverse, and emerging properties herein. We envision a Metaverse-focused extension of the traditional avatar concept: An avatar can be as well programmed to operate independently when its user is not controlling it directly, thus turning it into an agent-based digital human representation. This way, we highlight how such an independent avatar could help its user to better navigate their social relationships and optimize their socializing time in the Metaverse by (partly) offloading some interactions to the avatar. We model the setting and identify the characteristic variables by using selected concepts from social sciences: ego networks, social presence, and social cues. Then, we formulate the problem of maximizing the user's non-avatar-mediated spare time as a linear optimization. Finally, we analyze the feasible region of the problem and we present some initial insights on the spare time that can be achieved for different parameter values of the avatar-mediated interactions.

類別 · MoDELS · 相互獨立的 · 操作 · 分離的 ·

2024 年 5 月 7 日

The Existential Theory of the Reals with Summation Operators

Markus Bl?ser,Julian D?rfler,Maciej Liskiewicz,Benito van der Zander

To characterize the computational complexity of satisfiability problems for probabilistic and causal reasoning within the Pearl's Causal Hierarchy, arXiv:2305.09508 [cs.AI] introduce a new natural class, named succ-$\exists$R. This class can be viewed as a succinct variant of the well-studied class $\exists$R based on the Existential Theory of the Reals (ETR). Analogously to $\exists$R, succ-$\exists$R is an intermediate class between NEXP and EXPSPACE, the exponential versions of NP and PSPACE. The main contributions of this work are threefold. Firstly, we characterize the class succ-$\exists$R in terms of nondeterministic real RAM machines and develop structural complexity theoretic results for real RAMs, including translation and hierarchy theorems. Notably, we demonstrate the separation of $\exists$R and succ-$\exists$R. Secondly, we examine the complexity of model checking and satisfiability of fragments of existential second-order logic and probabilistic independence logic. We show succ-$\exists$R- completeness of several of these problems, for which the best-known complexity lower and upper bounds were previously NEXP-hardness and EXPSPACE, respectively. Thirdly, while succ-$\exists$R is characterized in terms of ordinary (non-succinct) ETR instances enriched by exponential sums and a mechanism to index exponentially many variables, in this paper, we prove that when only exponential sums are added, the corresponding class $\exists$R^{\Sigma} is contained in PSPACE. We conjecture that this inclusion is strict, as this class is equivalent to adding a VNP-oracle to a polynomial time nondeterministic real RAM. Conversely, the addition of exponential products to ETR, yields PSPACE. Additionally, we study the satisfiability problem for probabilistic reasoning, with the additional requirement of a small model and prove that this problem is complete for $\exists$R^{\Sigma}.

近似 · 似然 · 推斷 · Integration · 高斯分布 ·

2024 年 5 月 7 日

Analytical Approximation of the ELBO Gradient in the Context of the Clutter Problem

Roumen Nikolaev Popov

from arxiv, 19 pages, 5 figures, supporting code available at //github.com/rpopov42/elbo_gaa

We propose an analytical solution for approximating the gradient of the Evidence Lower Bound (ELBO) in variational inference problems where the statistical model is a Bayesian network consisting of observations drawn from a mixture of a Gaussian distribution embedded in unrelated clutter, known as the clutter problem. The method employs the reparameterization trick to move the gradient operator inside the expectation and relies on the assumption that, because the likelihood factorizes over the observed data, the variational distribution is generally more compactly supported than the Gaussian distribution in the likelihood factors. This allows efficient local approximation of the individual likelihood factors, which leads to an analytical solution for the integral defining the gradient expectation. We integrate the proposed gradient approximation as the expectation step in an EM (Expectation Maximization) algorithm for maximizing ELBO and test against classical deterministic approaches in Bayesian inference, such as the Laplace approximation, Expectation Propagation and Mean-Field Variational Inference. The proposed method demonstrates good accuracy and rate of convergence together with linear computational complexity.

MoDELS · 圖 · Networking · CASE · 知識 (knowledge) ·

2024 年 5 月 7 日

Self-Stabilizing MIS Computation in the Beeping Model

George Giakkoupis,Volker Turau,Isabella Ziccardi

We consider self-stabilizing algorithms to compute a Maximal Independent Set (MIS) in the extremely weak beeping communication model. The model consists of an anonymous network with synchronous rounds. In each round, each vertex can optionally transmit a signal to all its neighbors (beep). After the transmission of a signal, each vertex can only differentiate between no signal received, or at least one signal received. We assume that vertices have some knowledge about the topology of the network. We revisit the not self-stabilizing algorithm proposed by Jeavons, Scott, and Xu (2013), which computes an MIS in the beeping model. We enhance this algorithm to be self-stabilizing, and explore two different variants, which differ in the knowledge about the topology available to the vertices. In the first variant, every vertex knows an upper bound on the maximum degree $\Delta$ of the graph. For this case, we prove that the proposed self-stabilizing version maintains the same run-time as the original algorithm, i.e. it stabilizes after $O(\log n)$ rounds w.h.p. on any $n$-vertex graph. In the second variant, each vertex only knows an upper bound on its own degree. For this case, we prove that the algorithm stabilizes after $O(\log n\cdot \log \log n)$ rounds on any $n$-vertex graph, w.h.p.

正則化項 · CASES · 有向 · 樣例 · 知識 (knowledge) ·

2024 年 5 月 7 日

On the Gelfand Problem and Viscosity Matrices for Two-Dimensional Hyperbolic Systems of Conservation Laws

Shaoshuai Chu,Igor Kliakhandler,Alexander Kurganov

We present counter-intuitive examples of a viscous regularizations of a two-dimensional strictly hyperbolic system of conservation laws. The regularizations are obtained using two different viscosity matrices. While for both of the constructed ``viscous'' systems waves propagating in either $x$- or $y$-directions are stable, oblique waves may be linearly unstable. Numerical simulations fully corroborate these analytical results. To the best of our knowledge, this is the first nontrivial result related to the multidimensional Gelfand problem. Our conjectures provide direct answer to Gelfand's problem both in one- and multi-dimensional cases.

MoDELS · Pivotal（公司） · 通用智能 · 語言模型化 · 多峰值 ·

2024 年 1 月 25 日

A Survey of Reasoning with Foundation Models

Jiankai Sun,Chuanyang Zheng,Enze Xie,Zhengying Liu,Ruihang Chu,Jianing Qiu,Jiaqi Xu,Mingyu Ding,Hongyang Li,Mengzhe Geng,Yue Wu,Wenhai Wang,Junsong Chen,Zhangyue Yin,Xiaozhe Ren,Jie Fu,Junxian He,Wu Yuan,Qi Liu,Xihui Liu,Yu Li,Hao Dong,Yu Cheng,Ming Zhang,Pheng Ann Heng,Jifeng Dai,Ping Luo,Jingdong Wang,Ji-Rong Wen,Xipeng Qiu,Yike Guo,Hui Xiong,Qun Liu,Zhenguo Li

from arxiv, 20 Figures, 160 Pages, 750+ References, Project Page //github.com/reasoning-survey/Awesome-Reasoning-Foundation-Models

Reasoning, a crucial ability for complex problem-solving, plays a pivotal role in various real-world settings such as negotiation, medical diagnosis, and criminal investigation. It serves as a fundamental methodology in the field of Artificial General Intelligence (AGI). With the ongoing development of foundation models, e.g., Large Language Models (LLMs), there is a growing interest in exploring their abilities in reasoning tasks. In this paper, we introduce seminal foundation models proposed or adaptable for reasoning, highlighting the latest advancements in various reasoning tasks, methods, and benchmarks. We then delve into the potential future directions behind the emergence of reasoning abilities within foundation models. We also discuss the relevance of multimodal learning, autonomous agents, and super alignment in the context of reasoning. By discussing these future research directions, we hope to inspire researchers in their exploration of this field, stimulate further advancements in reasoning with foundation models, and contribute to the development of AGI.

Networking · 殘差網絡 · 縮放 · Weight · 平滑 ·

2021 年 5 月 25 日

Scaling Properties of Deep Residual Networks

Alain-Sam Cohen,Rama Cont,Alain Rossier,Renyuan Xu

from arxiv, Published at ICML 2021

Residual networks (ResNets) have displayed impressive results in pattern recognition and, recently, have garnered considerable theoretical interest due to a perceived link with neural ordinary differential equations (neural ODEs). This link relies on the convergence of network weights to a smooth function as the number of layers increases. We investigate the properties of weights trained by stochastic gradient descent and their scaling with network depth through detailed numerical experiments. We observe the existence of scaling regimes markedly different from those assumed in neural ODE literature. Depending on certain features of the network architecture, such as the smoothness of the activation function, one may obtain an alternative ODE limit, a stochastic differential equation or neither of these. These findings cast doubts on the validity of the neural ODE model as an adequate asymptotic description of deep ResNets and point to an alternative class of differential equations as a better description of the deep network limit.

多峰值 · 情感分析 · MoDELS · AIM · Tumblr ·

2018 年 5 月 25 日

Multimodal Sentiment Analysis To Explore the Structure of Emotions

Anthony Hu,Seth Flaxman

from arxiv, Accepted as a conference paper at KDD 2018

We propose a novel approach to multimodal sentiment analysis using deep neural networks combining visual analysis and natural language processing. Our goal is different than the standard sentiment analysis goal of predicting whether a sentence expresses positive or negative sentiment; instead, we aim to infer the latent emotional state of the user. Thus, we focus on predicting the emotion word tags attached by users to their Tumblr posts, treating these as "self-reported emotions." We demonstrate that our multimodal model combining both text and image features outperforms separate models based solely on either images or text. Our model's results are interpretable, automatically yielding sensible word lists associated with emotions. We explore the structure of emotions implied by our model and compare it to what has been posited in the psychology literature, and validate our model on a set of images that have been used in psychology studies. Finally, our work also provides a useful tool for the growing academic study of images - both photographs and memes - on social networks.