Despite substantial efforts, neural network interpretability remains an elusive goal, with previous research failing to provide succinct explanations of most single neurons' impact on the network output. This limitation is due to the polysemantic nature of most neurons, whereby a given neuron is involved in multiple unrelated network states, complicating its interpretation. In this paper, we apply tools developed in neuroscience and information theory to propose both a novel practical approach to network interpretability and theoretical insights into polysemanticity and the density of codes. We infer levels of redundancy in the network's code by inspecting the eigenspectrum of the activations' covariance matrix. Furthermore, we show how random projections can reveal whether a network exhibits a smooth or non-differentiable code and hence how interpretable that code is. The same framework explains the advantages of polysemantic neurons for learning performance and accounts for trends found in recent results by Elhage et al.~(2022). Our approach advances the pursuit of interpretability in neural networks, providing insights into their underlying structure and suggesting new avenues for circuit-level interpretability.
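As a hedged, minimal sketch of the covariance-eigenspectrum diagnostic described above (not the paper's implementation; the activation matrix here is synthetic and the layer it would be recorded from is hypothetical), one can collect activations over a probe set, inspect the eigenvalue decay of their covariance, and project onto random directions:

```python
import numpy as np

# Hypothetical activation matrix: n_samples x n_neurons, e.g. recorded from
# one hidden layer over a probe dataset (synthetic data for illustration).
rng = np.random.default_rng(0)
acts = rng.standard_normal((4096, 512)) @ rng.standard_normal((512, 512))

# Eigenspectrum of the activations' covariance matrix.
cov = np.cov(acts, rowvar=False)
eigvals = np.linalg.eigvalsh(cov)[::-1]                      # descending order
explained = np.cumsum(eigvals) / eigvals.sum()
effective_dim = (eigvals.sum() ** 2) / (eigvals ** 2).sum()  # participation ratio

# Random projections: look at activations along random directions, a crude
# stand-in for probing how smoothly the code varies.
proj = acts @ rng.standard_normal((acts.shape[1], 16))

print(f"effective dimensionality ~ {effective_dim:.1f}")
print("variance captured by top 10 eigenvalues:", explained[9].round(3))
```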
This article investigates the effect of explicitly adding auxiliary algebraic trajectory information to neural networks for dynamical systems. We draw inspiration from the fields of differential-algebraic equations and differential equations on manifolds and implement related methods in residual neural networks, despite some fundamental differences between the settings. Constraint or auxiliary information is incorporated through stabilization as well as projection methods, and we show when to use which method based on experiments involving simulations of multi-body pendulums and molecular dynamics scenarios. Several of our methods are easy to implement in existing code and have limited impact on training performance while giving significant improvements at inference time.
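A minimal sketch of the projection variant (illustrative only, assuming a single-pendulum fixed-length constraint and a generic residual block; not the paper's exact implementation): after each residual update of the state, the prediction is projected back onto the constraint manifold.

```python
import torch

def project_to_rod_length(q, length=1.0):
    # Enforce the algebraic constraint |q| = length by projecting the
    # predicted position back onto the circle (single-pendulum toy case).
    return length * q / q.norm(dim=-1, keepdim=True)

class ConstrainedResidualStep(torch.nn.Module):
    def __init__(self, dim=2, hidden=64, length=1.0):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(dim, hidden), torch.nn.Tanh(),
            torch.nn.Linear(hidden, dim))
        self.length = length

    def forward(self, q):
        q_next = q + self.net(q)                 # unconstrained residual update
        return project_to_rod_length(q_next, self.length)   # projection step

q0 = torch.tensor([[0.6, -0.8]])                 # a point on the unit circle
step = ConstrainedResidualStep()
print(step(q0).norm(dim=-1))                     # remains at the rod length
```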
This study proposes an interpretable neural network-based non-proportional odds model (N$^3$POM) for ordinal regression. N$^3$POM differs from conventional approaches to ordinal regression with non-proportional models in several ways: (1) it is defined for both continuous and discrete responses, whereas standard methods typically treat ordered continuous variables as if they were discrete; (2) instead of estimating response-dependent finite-dimensional coefficients of linear models from discrete responses, as is done in conventional approaches, we train a non-linear neural network to serve as a coefficient function. Thanks to the neural network, N$^3$POM offers flexibility while preserving the interpretability of conventional ordinal regression. We establish a sufficient condition under which the predicted conditional cumulative probability locally satisfies the monotonicity constraint over a user-specified region of the covariate space. Additionally, we provide a monotonicity-preserving stochastic (MPS) algorithm for effectively training the neural network. We apply N$^3$POM to several real-world datasets.
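One plausible reading of the coefficient-function idea, sketched below under the non-proportional cumulative-logit form $\mathrm{logit}\, P(Y \le y \mid x) = b(y) + x^\top w(y)$ (the architecture and all names are illustrative; the MPS training procedure and the monotonicity condition are omitted):

```python
import torch

class CoefficientFunctionNet(torch.nn.Module):
    """Maps a response value y to an intercept b(y) and coefficients w(y)."""
    def __init__(self, n_covariates, hidden=32):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(1, hidden), torch.nn.Tanh(),
            torch.nn.Linear(hidden, n_covariates + 1))

    def forward(self, y):
        out = self.net(y.unsqueeze(-1))
        return out[..., :1], out[..., 1:]        # b(y), w(y)

def cumulative_prob(model, x, y):
    # logit P(Y <= y | x) = b(y) + x^T w(y); a sigmoid gives the probability.
    b, w = model(y)
    return torch.sigmoid(b.squeeze(-1) + (x * w).sum(-1))

model = CoefficientFunctionNet(n_covariates=3)
x = torch.randn(5, 3)
y = torch.linspace(0.0, 1.0, 5)
print(cumulative_prob(model, x, y))
```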
We propose an alternative approach to neural network training using the monotone vector field, an idea inspired by the seminal work of Juditsky and Nemirovski [Juditsky & Nemirovski, 2019], developed originally to solve parameter estimation problems for generalized linear models (GLM) by reducing the original non-convex problem to a convex problem of solving a monotone variational inequality (VI). Our approach leads to computationally efficient procedures that converge quickly and offer guarantees in some special cases, such as training a single-layer neural network or fine-tuning the last layer of a pre-trained model. It can be used for more efficient fine-tuning of a pre-trained model while freezing the bottom layers, an essential step for deploying many machine learning models such as large language models (LLM). We demonstrate its applicability in training fully-connected (FC) neural networks, graph neural networks (GNN), and convolutional neural networks (CNN), and show the competitive or better performance of our approach compared with stochastic gradient descent methods on both synthetic and real network-data prediction tasks across various performance metrics.
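As a hedged illustration of the variational-inequality viewpoint (a sketch under simplified assumptions rather than the authors' procedure): for a single-layer model $y \approx \sigma(Xw)$ with a monotone activation $\sigma$, the vector field $F(w) = X^\top(\sigma(Xw) - y)/n$ is monotone even though the associated squared loss is non-convex, and a simple iteration on this field recovers the parameters.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 500, 10
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d) / np.sqrt(d)
sigma = np.tanh                                  # monotone activation
y = sigma(X @ w_true) + 0.05 * rng.standard_normal(n)

def monotone_field(w):
    # F(w) = X^T (sigma(Xw) - y) / n is a monotone vector field in w.
    return X.T @ (sigma(X @ w) - y) / n

w = np.zeros(d)
eta = 0.5
for _ in range(500):
    w -= eta * monotone_field(w)                 # simple VI-style iteration

print("parameter error:", np.linalg.norm(w - w_true))
```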
Most businesses impose a supervisory hierarchy on employees to facilitate management, decision-making, and collaboration, yet routine inter-employee communication patterns within workplaces tend to emerge more naturally as a consequence of both supervisory relationships and the needs of the organization. What then is the relationship between a formal organizational structure and the emergent communications between its employees? Understanding the nature of this relationship is critical for the successful management of an organization. While scholars of organizational management have proposed theories relating organizational trees to communication dynamics, and separately, network scientists have studied the topological structure of communication patterns in different types of organizations, existing empirical analyses are both lacking in representativeness and limited in size. In fact, much of the methodology used to study the relationship between organizational hierarchy and communication patterns comes from analyses of the Enron email corpus, reflecting a uniquely dysfunctional corporate environment. In this paper, we develop new methodology for assessing the relationship between organizational hierarchy and communication dynamics and apply it to Microsoft Corporation, currently the highest valued company in the world, consisting of approximately 200,000 employees divided into 88 teams. This reveals distinct communication network structures within and between teams. We then characterize the relationship of routine employee communication patterns to these team supervisory hierarchies, while empirically evaluating several theories of organizational management and performance. To do so, we propose new measures of communication reciprocity and new shortest-path distances for trees to track the frequency of messages passed up, down, and across the organizational hierarchy.
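A toy sketch of the tree-based message classification underlying such measures (the employee IDs and supervisory tree are hypothetical, and the paper's actual reciprocity and shortest-path measures are more refined than this up/down/across count):

```python
# Classify a message between two employees by how many steps it travels
# up and down the supervisory tree (toy example, child -> manager map).
manager = {"b": "a", "c": "a", "d": "b", "e": "b", "f": "c"}

def chain_to_root(node):
    chain = [node]
    while chain[-1] in manager:
        chain.append(manager[chain[-1]])
    return chain

def up_down_steps(sender, receiver):
    up = chain_to_root(sender)
    down = chain_to_root(receiver)
    common = next(x for x in up if x in set(down))   # lowest common ancestor
    return up.index(common), down.index(common)      # (steps up, steps down)

for s, r in [("d", "b"), ("a", "f"), ("d", "e"), ("d", "f")]:
    u, d = up_down_steps(s, r)
    kind = "up" if d == 0 else "down" if u == 0 else "across"
    print(f"{s} -> {r}: {u} up, {d} down ({kind})")
```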
In the present study, the efficiency of preconditioners for solving linear systems associated with the discretized variable-density incompressible Navier-Stokes equations, with semi-implicit second-order accuracy in time and spectral accuracy in space, is investigated. The method, in which the inverse operator for the constant-density flow system acts as the preconditioner, is implemented for three iterative solvers: the Generalized Minimal Residual (GMRES), the Conjugate Gradient, and the Richardson Minimal Residual methods. We first discuss the method in the context of the one-dimensional flow case, where a top-hat-like profile for the density is used. Numerical evidence shows that convergence is significantly improved due to the notable decrease in the condition number of the operators. Most importantly, we then validate the robustness and convergence properties of the method on two more realistic problems: the two-dimensional Rayleigh-Taylor instability problem and the three-dimensional variable-density swirling jet.
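The preconditioning idea can be sketched in a toy one-dimensional setting with SciPy's iterative solvers (a deliberately simplified stand-in for the spectral, variable-density discretization above; the operator, density profile, and solver settings are illustrative):

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import LinearOperator, gmres, splu

n = 256
h = 1.0 / (n + 1)
x = np.linspace(h, 1.0 - h, n)
rho = np.where(np.abs(x - 0.5) < 0.15, 3.0, 1.0)     # top-hat-like density

# Constant-density operator (the preconditioner) and a variable-density operator.
L = diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n)) / h**2
A = diags(1.0 / rho) @ L
b = np.sin(np.pi * x)

L_lu = splu(L.tocsc())                               # factor the preconditioner once
M = LinearOperator((n, n), matvec=lambda v: L_lu.solve(rho * np.ravel(v)))

iters = {"plain": 0, "preconditioned": 0}
def count(key):
    def cb(_):
        iters[key] += 1
    return cb

gmres(A, b, restart=n, maxiter=n, callback=count("plain"), callback_type="pr_norm")
gmres(A, b, M=M, restart=n, maxiter=n, callback=count("preconditioned"),
      callback_type="pr_norm")
print(iters)  # the preconditioned solve should need far fewer iterations
```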
In decision-making, maxitive functions are used for worst-case and best-case evaluations. Maxitivity gives rise to a rich structure that is well-studied in the context of the pointwise order. In this article, we investigate maxitivity with respect to general preorders and provide a representation theorem for such functionals. The results are illustrated for different stochastic orders in the literature, including the usual stochastic order, the increasing convex/concave order, and the dispersive order.
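For reference, maxitivity with respect to the pointwise order, which the article generalizes to arbitrary preorders, is commonly stated as (a standard formulation rather than the article's exact definition)
\[
\phi(f \vee g) \;=\; \max\{\phi(f),\, \phi(g)\} \qquad \text{for all } f, g,
\]
where $f \vee g$ denotes the pointwise maximum; the representation theorem replaces the pointwise order with a general preorder $\preccurlyeq$, covering stochastic orders such as those listed above.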
Treatment-covariate interaction tests are commonly applied by researchers to examine whether the treatment effect varies across patient subgroups defined by baseline characteristics. The objective of this study is to explore treatment-covariate interaction tests under covariate-adaptive randomization. Without assuming a parametric data-generating model, we investigate the usual interaction tests and observe that they tend to be conservative: specifically, their limiting rejection probabilities under the null hypothesis do not exceed the nominal level and are typically strictly lower than it. To address this problem, we propose modifications to the usual tests to obtain corresponding valid tests. Moreover, we introduce a novel class of stratified-adjusted interaction tests that are simple, more powerful than the usual and modified tests, and broadly applicable to most covariate-adaptive randomization methods. Our results are general, encompassing two types of interaction tests: one involving stratification covariates and the other involving additional covariates that are not used for randomization. This study clarifies the application of interaction tests in clinical trials and offers valuable tools for revealing treatment heterogeneity, which is crucial for advancing personalized medicine.
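For orientation, the kind of conventional interaction test discussed above can be sketched as a Wald test of the interaction coefficient (a simplified illustration on simulated data under simple randomization; the covariate-adaptive designs and the proposed stratified adjustments are not implemented here, and all variable names are hypothetical):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame({
    "treat": rng.integers(0, 2, n),            # treatment arm (simple randomization)
    "covar": rng.standard_normal(n),           # baseline covariate
})
# Outcome with a main treatment effect but no interaction (null hypothesis holds).
df["y"] = 1.0 * df["treat"] + 0.5 * df["covar"] + rng.standard_normal(n)

fit = smf.ols("y ~ treat * covar", data=df).fit()
# The usual Wald test of the treatment-covariate interaction coefficient:
print("interaction estimate:", fit.params["treat:covar"])
print("interaction p-value:", fit.pvalues["treat:covar"])
```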
We hypothesize that, due to the greedy nature of learning in multi-modal deep neural networks, these models tend to rely on just one modality while under-fitting the other modalities. Such behavior is counter-intuitive and hurts the models' generalization, as we observe empirically. To estimate a model's dependence on each modality, we compute the gain in accuracy when the model has access to that modality in addition to another one. We refer to this gain as the conditional utilization rate. In our experiments, we consistently observe an imbalance in conditional utilization rates between modalities, across multiple tasks and architectures. Since the conditional utilization rate cannot be computed efficiently during training, we introduce a proxy for it based on the pace at which the model learns from each modality, which we refer to as the conditional learning speed. We propose an algorithm to balance the conditional learning speeds between modalities during training and demonstrate that it indeed addresses the issue of greedy learning. The proposed algorithm improves the model's generalization on three datasets: Colored MNIST, Princeton ModelNet40, and NVIDIA Dynamic Hand Gesture.
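A minimal sketch of the conditional utilization rate as defined above (the evaluation function, modality names, and accuracy values below are hypothetical placeholders, not the paper's code or results):

```python
# u(m1 | m2): the accuracy gained by giving the model modality m1 in
# addition to modality m2.

def conditional_utilization_rate(evaluate, m1, m2):
    """u(m1 | m2) = acc(model with m1 and m2) - acc(model with m2 only)."""
    return evaluate(modalities={m1, m2}) - evaluate(modalities={m2})

def toy_evaluate(modalities):
    # Stand-in accuracies for illustration only.
    table = {frozenset({"rgb"}): 0.71,
             frozenset({"depth"}): 0.55,
             frozenset({"rgb", "depth"}): 0.74}
    return table[frozenset(modalities)]

# An imbalance between the two rates signals greedy reliance on one modality.
print("u(depth | rgb) =", conditional_utilization_rate(toy_evaluate, "depth", "rgb"))
print("u(rgb | depth) =", conditional_utilization_rate(toy_evaluate, "rgb", "depth"))
```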
Deep learning is usually described as an experiment-driven field under continual criticism for lacking theoretical foundations. This problem has been partially addressed by a large volume of literature which has so far not been well organized. This paper reviews and organizes the recent advances in deep learning theory. The literature is categorized into six groups: (1) complexity- and capacity-based approaches for analyzing the generalizability of deep learning; (2) stochastic differential equations and their dynamical systems for modelling stochastic gradient descent and its variants, which characterize the optimization and generalization of deep learning, partially inspired by Bayesian inference; (3) the geometrical structures of the loss landscape that drive the trajectories of the dynamical systems; (4) the roles of over-parameterization of deep neural networks from both positive and negative perspectives; (5) theoretical foundations of several special structures in network architectures; and (6) the increasingly intense concerns about ethics and security and their relationships with generalizability.
When and why can a neural network be successfully trained? This article provides an overview of optimization algorithms and theory for training neural networks. First, we discuss the issue of gradient explosion/vanishing and the more general issue of an undesirable spectrum, and then discuss practical solutions, including careful initialization and normalization methods. Second, we review generic optimization methods used in training neural networks, such as SGD, adaptive gradient methods, and distributed methods, along with theoretical results for these algorithms. Third, we review existing research on the global issues of neural network training, including results on bad local minima, mode connectivity, the lottery ticket hypothesis, and infinite-width analysis.