
Understanding how neural systems efficiently process information through distributed representations is a fundamental challenge at the interface of neuroscience and machine learning. Recent approaches analyze the statistical and geometrical attributes of neural representations as population-level mechanistic descriptors of task implementation. In particular, manifold capacity has emerged as a promising framework linking population geometry to the separability of neural manifolds. However, this metric has been limited to linear readouts. Here, we propose a theoretical framework that overcomes this limitation by leveraging contextual input information. We derive an exact formula for the context-dependent capacity that depends on manifold geometry and context correlations, and validate it on synthetic and real data. Our framework's increased expressivity captures representation untangling in deep networks at early stages of the layer hierarchy, previously inaccessible to analysis. As context-dependent nonlinearity is ubiquitous in neural systems, our data-driven and theoretically grounded approach promises to elucidate context-dependent computation across scales, datasets, and models.
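As a concrete point of reference for the linear-readout limitation this abstract describes, the sketch below numerically estimates classical (linear) manifold capacity: it draws random point-cloud manifolds, assigns one random binary label per manifold, and measures how often a linear readout can separate them as the load P/N grows. The point-cloud manifold model, all sizes, and the helper name `separable_fraction` are illustrative assumptions, not the paper's context-dependent formula.

```python
import numpy as np
from sklearn.svm import LinearSVC

def separable_fraction(P, N, M=10, r=0.2, trials=20, seed=0):
    """Fraction of random manifold dichotomies a linear readout can
    realize: P point-cloud manifolds (M points, radius r) in N dims."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(trials):
        centers = rng.standard_normal((P, N))
        clouds = centers[:, None, :] + r * rng.standard_normal((P, M, N))
        X = clouds.reshape(P * M, N)
        y = np.repeat(rng.choice([-1, 1], size=P), M)  # one label per manifold
        clf = LinearSVC(C=1e6, max_iter=20000).fit(X, y)
        hits += clf.score(X, y) == 1.0                 # perfectly separated?
    return hits / trials

N = 50
for load in [0.5, 1.0, 2.0]:  # manifold load alpha = P/N
    print(f"P/N = {load}: separable fraction = {separable_fraction(int(load * N), N):.2f}")
```

Capacity is the load at which this fraction crosses 1/2; the paper's contribution is extending that notion, in closed form, to context-dependent readouts.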

Related content

Confidence assessment of semantic segmentation algorithms is important in remote sensing: it is a desirable property of a model to know a priori whether it is likely to produce an incorrect output. Evaluating the confidence assigned to model estimates for classification tasks in Earth Observation (EO) is crucial, as it can be used to improve semantic segmentation performance and to prevent high error rates during inference and deployment. The model we develop, the Confidence Assessments of classification algorithms for Semantic segmentation (CAS) model, performs confidence evaluations at both the segment and pixel levels and outputs both labels and confidence. The main application of this work is the evaluation of EO Foundation Models on semantic segmentation downstream tasks, in particular land cover classification using satellite Copernicus Sentinel-2 data. Our evaluation shows that the proposed model is effective and outperforms alternative baseline models.
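For intuition about what segment- and pixel-level confidence outputs look like, here is a minimal sketch (not the authors' CAS implementation) that derives both from the softmax scores of an arbitrary segmentation model: the max softmax probability serves as the pixel confidence and is averaged within segments. The array shapes and the helper name `confidence_maps` are assumptions.

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def confidence_maps(logits, segments):
    """logits: (H, W, C) class scores; segments: (H, W) integer segment ids.
    Returns per-pixel labels and confidence, plus mean confidence per segment."""
    probs = softmax(logits)
    labels = probs.argmax(-1)
    pixel_conf = probs.max(-1)  # max softmax probability per pixel
    segment_conf = {int(s): float(pixel_conf[segments == s].mean())
                    for s in np.unique(segments)}
    return labels, pixel_conf, segment_conf

# toy usage: a 4x4 "image" with 3 classes split into 2 segments
rng = np.random.default_rng(0)
logits = rng.standard_normal((4, 4, 3))
segments = np.repeat([[0, 0, 1, 1]], 4, axis=0)
labels, pixel_conf, segment_conf = confidence_maps(logits, segments)
print(segment_conf)
```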

Current physics-informed (standard or deep operator) neural networks still rely on accurately learning the initial and/or boundary conditions of the system of differential equations they are solving. In contrast, standard numerical methods involve such conditions in computations without needing to learn them. In this study, we propose to improve current physics-informed deep learning strategies such that initial and/or boundary conditions do not need to be learned and are represented exactly in the predicted solution. Moreover, this method guarantees that when a deep operator network is applied multiple times to time-step a solution of an initial value problem, the resulting function is at least continuous.
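The standard way to represent an initial condition exactly, in the spirit of what this abstract proposes, is a hard-constraint ansatz that multiplies the network output by a factor vanishing at t = 0. A minimal sketch, with a hypothetical stand-in for the trained network:

```python
import numpy as np

def constrained_solution(t, x, network, u0):
    """Hard-constraint ansatz u(t, x) = u0(x) + t * network(t, x):
    at t = 0 the initial condition is reproduced exactly, whatever
    the network outputs, so it never has to be learned via the loss."""
    return u0(x) + t * network(t, x)

network = lambda t, x: np.sin(x) * np.exp(-t)  # hypothetical trained network
u0 = lambda x: np.cos(np.pi * x)               # initial condition

x = np.linspace(0.0, 1.0, 5)
print(constrained_solution(0.0, x, network, u0) - u0(x))  # exactly zero
```

The same device applies to boundary conditions, using a factor that vanishes on the spatial boundary; the continuity guarantee under repeated time-stepping follows because each application starts exactly from the previous step's output.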

Understanding the mechanisms through which neural networks extract statistics from input-label pairs through feature learning is one of the most important unsolved problems in supervised learning. Prior works demonstrated that the Gram matrices of the weights (the neural feature matrices, NFM) and the average gradient outer products (AGOP) become correlated during training, a statement known as the neural feature ansatz (NFA). Via the NFA, these works propose the AGOP map as a general mechanism for neural feature learning; however, they do not provide a theoretical explanation for this correlation or its origins. In this work, we further clarify the nature of this correlation and explain its emergence. We show that this correlation is equivalent to alignment between the left singular structure of the weight matrices and the newly defined pre-activation tangent features at each layer. We further establish that the alignment is driven by the interaction of weight changes induced by SGD with the pre-activation features, and analyze the resulting dynamics analytically at early times in terms of simple statistics of the inputs and labels. Finally, motivated by the observation that the NFA is driven by this centered correlation, we introduce a simple optimization rule that dramatically increases the NFA correlations at any given layer and improves the quality of the learned features.
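To make the measured quantity concrete, the following sketch computes the NFA correlation for a random two-layer ReLU network: the cosine similarity between the first layer's neural feature matrix W^T W and the AGOP of the network function over a batch of inputs. The architecture and sizes are illustrative; the paper's analysis concerns how this correlation evolves under SGD (and a centered variant), which the sketch does not reproduce.

```python
import numpy as np

rng = np.random.default_rng(0)
d, h, n = 10, 64, 500
W = rng.standard_normal((h, d)) / np.sqrt(d)   # first-layer weights
a = rng.standard_normal(h) / np.sqrt(h)        # readout weights
X = rng.standard_normal((n, d))                # batch of inputs

# input gradients of f(x) = a^T relu(W x): grad f = W^T (relu'(W x) * a)
pre = X @ W.T                                  # pre-activations, (n, h)
grads = ((pre > 0) * a) @ W                    # per-sample gradients, (n, d)

NFM = W.T @ W                                  # neural feature matrix
AGOP = grads.T @ grads / n                     # average gradient outer product

cos = (NFM * AGOP).sum() / (np.linalg.norm(NFM) * np.linalg.norm(AGOP))
print(f"NFA correlation (cosine similarity): {cos:.3f}")
```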

Effective clustering of biomedical data is crucial in precision medicine, enabling accurate stratification of patients or samples. However, the growing availability of high-dimensional categorical data, including 'omics data, necessitates computationally efficient clustering algorithms. We present VICatMix, a variational Bayesian finite mixture model designed for the clustering of categorical data. The use of variational inference (VI) in its training allows the model to outperform competitors in terms of efficiency, while maintaining high accuracy. VICatMix furthermore performs variable selection, enhancing its performance on high-dimensional, noisy data. The proposed model incorporates summarisation and model averaging to mitigate poor local optima in VI, allowing for improved estimation of the true number of clusters simultaneously with feature saliency. We demonstrate the performance of VICatMix on both simulated and real-world data, including applications to datasets from The Cancer Genome Atlas (TCGA), showing its use in cancer subtyping and driver gene discovery. We also demonstrate VICatMix's utility in integrative cluster analysis with different 'omics datasets, enabling the discovery of novel subtypes. Availability: VICatMix is freely available as an R package, incorporating C++ for faster computation, at //github.com/j-ackierao/VICatMix.
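VICatMix itself is an R package; as a rough Python illustration of the underlying model class, the sketch below fits a finite mixture of independent categorical variables with EM, a simplified maximum-likelihood analogue of the variational treatment (no variable selection, summarisation, or model averaging). All names and sizes are assumptions.

```python
import numpy as np

def categorical_mixture_em(X, K, n_cats, iters=100, seed=0):
    """EM for a finite mixture of independent categorical variables.
    X: (n, d) integer array with entries in {0, ..., n_cats - 1}."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    pi = np.full(K, 1.0 / K)                         # mixture weights
    theta = rng.dirichlet(np.ones(n_cats), (K, d))   # (K, d, n_cats)
    for _ in range(iters):
        # E-step: responsibilities, computed in the log domain
        logp = np.log(pi) + np.stack(
            [np.log(theta[k, np.arange(d), X]).sum(1) for k in range(K)], axis=1)
        logp -= logp.max(1, keepdims=True)
        r = np.exp(logp)
        r /= r.sum(1, keepdims=True)
        # M-step: weights and per-cluster category probabilities
        pi = r.mean(0)
        for k in range(K):
            counts = np.stack([(r[:, k, None] * (X == c)).sum(0)
                               for c in range(n_cats)], axis=1) + 1e-6
            theta[k] = counts / counts.sum(1, keepdims=True)
    return pi, theta, r

# toy usage: two clusters over five binary variables
rng = np.random.default_rng(1)
z = rng.integers(0, 2, 200)
X = (rng.random((200, 5)) < np.where(z[:, None] == 1, 0.8, 0.2)).astype(int)
pi, theta, r = categorical_mixture_em(X, K=2, n_cats=2)
print(pi.round(2))
```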

Various methods have emerged for conducting mediation analyses with multiple correlated mediators, each with distinct strengths and limitations. However, a comparative evaluation of these methods is lacking, which provides the motivation for this paper. This study examines six mediation analysis methods for multiple correlated mediators that provide insights into the contributors to health disparities. We assessed the performance of each method in identifying joint or path-specific mediation effects in the context of binary outcome variables, varying the mediator types and the levels of residual correlation between mediators. Through comprehensive simulations, the six methods' estimates of joint and/or path-specific mediation effects were rigorously assessed using a variety of metrics, including bias, mean squared error, and the coverage and width of 95% confidence intervals. Subsequently, these methods were applied to the REasons for Geographic And Racial Differences in Stroke (REGARDS) study, where differing conclusions were obtained depending on the mediation method employed. This evaluation provides valuable guidance for researchers grappling with complex multi-mediator scenarios, enabling them to select an optimal mediation method for their research question and dataset.
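As a minimal illustration of estimating a joint mediation effect with two correlated mediators and a binary outcome, the sketch below uses Monte Carlo g-computation with simple working models. It is a generic textbook-style approach, not one of the six methods compared in the paper, and all simulation parameters are assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(0)
n = 5000
A = rng.binomial(1, 0.5, n)                              # binary treatment
eps = rng.multivariate_normal([0, 0], [[1, 0.5], [0.5, 1]], n)
M1 = 0.8 * A + eps[:, 0]                                 # two correlated mediators
M2 = 0.5 * A + eps[:, 1]
p = 1 / (1 + np.exp(-(-0.5 + 0.4 * A + 0.6 * M1 + 0.3 * M2)))
Y = rng.binomial(1, p)                                   # binary outcome

out = LogisticRegression().fit(np.c_[A, M1, M2], Y)      # outcome model
med = LinearRegression().fit(A[:, None], np.c_[M1, M2])  # joint mediator model
resid = np.c_[M1, M2] - med.predict(A[:, None])
cov = np.cov(resid.T)                                    # keeps mediator correlation

def mean_outcome(a_out, a_med):
    """E[Y(a_out, M(a_med))] by Monte Carlo g-computation."""
    M = med.predict(np.full((n, 1), a_med)) + rng.multivariate_normal([0, 0], cov, n)
    return out.predict_proba(np.c_[np.full(n, a_out), M])[:, 1].mean()

nie = mean_outcome(1, 1) - mean_outcome(1, 0)   # joint natural indirect effect
nde = mean_outcome(1, 0) - mean_outcome(0, 0)   # natural direct effect
print(f"joint NIE = {nie:.3f}, NDE = {nde:.3f} (risk differences)")
```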

Identifying low-dimensional structure in high-dimensional probability measures is an essential pre-processing step for efficient sampling. We introduce a method for identifying and approximating a target measure $\pi$ as a perturbation of a given reference measure $\mu$ along a few significant directions of $\mathbb{R}^{d}$. The reference measure can be a Gaussian or a nonlinear transformation of a Gaussian, as commonly arising in generative modeling. Our method extends prior work on minimizing majorizations of the Kullback–Leibler divergence to identify optimal approximations within this class of measures. Our main contribution unveils a connection between the dimensional logarithmic Sobolev inequality (LSI) and approximations with this ansatz. Specifically, when the target and reference are both Gaussian, we show that minimizing the dimensional LSI is equivalent to minimizing the KL divergence restricted to this ansatz. For general non-Gaussian measures, the dimensional LSI produces majorants that uniformly improve on previous majorants for gradient-based dimension reduction. We further demonstrate the applicability of this analysis to the squared Hellinger distance, where analogous reasoning shows that the dimensional Poincaré inequality offers improved bounds.
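In the Gaussian case the stated equivalence has a hands-on consequence: the KL divergence decomposes exactly across eigen-directions of the target covariance, so the optimal rank-r perturbation of the reference keeps the r highest-scoring directions. A minimal sketch under the assumptions $\pi = N(0, \Sigma)$ and $\mu = N(0, I)$, where the per-direction score $(\lambda - 1 - \log\lambda)/2$ is the exact KL carried by an eigenvalue $\lambda$:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 20
B = rng.standard_normal((d, 3))
Sigma = np.eye(d) + B @ B.T / 3          # pi = N(0, Sigma): rank-3 perturbation of mu = N(0, I)

lam, U = np.linalg.eigh(Sigma)
score = 0.5 * (lam - 1.0 - np.log(lam))  # exact KL carried by each eigen-direction
order = np.argsort(score)[::-1]

for r in [0, 1, 2, 3, 5]:
    # nu_r perturbs mu only along the r highest-scoring directions,
    # so the leftover scores sum to KL(pi || nu_r)
    print(f"r = {r}: KL(pi || nu_r) = {score[order[r:]].sum():.4f}")
```

Since Sigma differs from the identity in only three directions here, the divergence hits zero at r = 3; for non-Gaussian targets the paper replaces this exact decomposition with dimensional-LSI majorants.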

We propose a test for the identification of causal effects in mediation and dynamic treatment models that is based on two sets of observed variables, namely covariates to be controlled for and suspected instruments, building on the test by Huber and Kueck (2022) for single treatment models. We consider models with a sequential assignment of a treatment and a mediator to assess the direct treatment effect (net of the mediator), the indirect treatment effect (via the mediator), or the joint effect of both treatment and mediator. We establish testable conditions for identifying such effects in observational data. These conditions jointly imply (1) the exogeneity of the treatment and the mediator conditional on covariates and (2) the validity of distinct instruments for the treatment and the mediator, meaning that the instruments do not directly affect the outcome (other than through the treatment or mediator) and are unconfounded given the covariates. Our framework extends to post-treatment sample selection or attrition problems when replacing the mediator by a selection indicator for observing the outcome, enabling joint testing of the selectivity of treatment and attrition. We propose a machine learning-based test to control for covariates in a data-driven manner and analyze its finite sample performance in a simulation study. Additionally, we apply our method to Slovak labor market data and find that our testable implications are not rejected for a sequence of training programs typically considered in dynamic treatment evaluations.
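To sketch the flavor of such a machine-learning-based test, here is a stylized version of one testable implication in the single-treatment setting of Huber and Kueck (2022), not the paper's sequential test: if the covariates suffice for exogeneity and the instrument is valid, the instrument should carry no information about the outcome beyond the covariates, which can be checked by a cross-fitted residual-on-residual regression. Model choices and the data-generating process are assumptions.

```python
import numpy as np
from scipy import stats
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_predict

def residual_test(Y, Z, X, seed=0):
    """Cross-fitted residual-on-residual check: under the joint null,
    instrument Z carries no information about outcome Y beyond the
    covariates X, so the slope of e_Y on e_Z should be ~ 0."""
    rf = lambda: RandomForestRegressor(n_estimators=200, random_state=seed)
    eY = Y - cross_val_predict(rf(), X, Y, cv=5)
    eZ = Z - cross_val_predict(rf(), X, Z, cv=5)
    n = len(Y)
    theta = (eZ * eY).mean() / (eZ ** 2).mean()          # OLS of eY on eZ
    u = eY - theta * eZ
    se = np.sqrt(((eZ * u) ** 2).mean() / (eZ ** 2).mean() ** 2 / n)
    pval = 2 * (1 - stats.norm.cdf(abs(theta / se)))
    return theta, pval

# toy data in which the null holds by construction
rng = np.random.default_rng(0)
n = 2000
X = rng.standard_normal((n, 5))
Z = X[:, 0] + rng.standard_normal(n)        # "instrument" driven by covariates only
Y = X @ rng.standard_normal(5) + rng.standard_normal(n)
print(residual_test(Y, Z, X))
```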

Difference-in-differences (DID) is a popular approach to identify the causal effects of treatments and policies in the presence of unmeasured confounding. DID identifies the sample average treatment effect in the treated (SATT). However, a goal of such research is often to inform decision-making in target populations outside the treated sample. Transportability methods have been developed to extend inferences from study samples to external target populations; these methods have primarily been developed and applied in settings where identification is based on conditional independence between the treatment and potential outcomes, such as in a randomized trial. We present a novel approach to identifying and estimating effects in a target population, based on DID conducted in a study sample that differs from the target population. We present a range of assumptions under which one may identify causal effects in the target population and employ causal diagrams to illustrate these assumptions. In most realistic settings, results depend critically on the assumption that any unmeasured confounders are not effect measure modifiers on the scale of the effect of interest (e.g., risk difference, odds ratio). We develop several estimators of transported effects, including g-computation, inverse odds weighting, and a doubly robust estimator based on the efficient influence function. Simulation results support theoretical properties of the proposed estimators. As an example, we apply our approach to study the effects of a 2018 US federal smoke-free public housing law on air quality in public housing across the US, using data from a DID study conducted in New York City alone.
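Of the estimators mentioned, inverse odds weighting is the simplest to sketch: study units are reweighted by the odds of target-population membership given measured effect modifiers, and the DID contrast is recomputed under those weights. In the toy data below, parallel trends hold by construction and the effect modifier X is measured; all parameters are assumptions, not the paper's application.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 20000
S = rng.binomial(1, 0.5, n)                # 1 = study sample, 0 = target population
X = rng.normal(1.0 - 0.8 * S, 1.0, n)      # measured effect modifier, shifted across populations
A = rng.binomial(1, 0.5, n)                # treated vs. comparison group
Ypre = X + rng.standard_normal(n)
Ypost = Ypre + 0.3 + A * (1.0 + 0.5 * X) + rng.standard_normal(n)  # parallel trends hold

study, target = S == 1, S == 0
delta, a = (Ypost - Ypre)[study], A[study]

# inverse odds of study membership, fitted on the measured modifier
ps = LogisticRegression().fit(X.reshape(-1, 1), S).predict_proba(
    X[study].reshape(-1, 1))[:, 1]
w = (1 - ps) / ps

satt = delta[a == 1].mean() - delta[a == 0].mean()
transported = (np.average(delta[a == 1], weights=w[a == 1])
               - np.average(delta[a == 0], weights=w[a == 0]))
print(f"SATT = {satt:.2f}, transported = {transported:.2f}, "
      f"target truth = {(1.0 + 0.5 * X[target]).mean():.2f}")
```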

When neural networks are trained from data to simulate the dynamics of physical systems, they encounter a persistent challenge: the long-time dynamics they produce are often unphysical or unstable. We analyze the origin of such instabilities when learning linear dynamical systems, focusing on the training dynamics. We make several analytical findings that, empirical observations suggest, extend to nonlinear dynamical systems. First, the rate of convergence of the training dynamics is uneven and depends on the distribution of energy in the data. As a special case, the dynamics in directions where the data have no energy cannot be learned. Second, in the unlearnable directions, the dynamics produced by the neural network depend on the weight initialization, and common weight initialization schemes can produce unstable dynamics. Third, injecting synthetic noise into the data during training adds damping to the training dynamics and can stabilize the learned simulator, though doing so undesirably biases the learned dynamics. For each contributor to instability, we suggest mitigative strategies. We also highlight important differences between learning discrete-time and continuous-time dynamics, and discuss extensions to nonlinear systems.
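A minimal sketch of the first two findings for a linear system: gradient descent on the one-step prediction error leaves the model untouched in directions where the training data have no energy, so the corresponding column of the learned matrix stays at its initialization, and a large initialization yields a spectral radius above one, i.e., an unstable simulator. The system and training setup are illustrative assumptions (and the initialization is chosen deliberately to make the outcome deterministic).

```python
import numpy as np

rng = np.random.default_rng(0)
d = 3
A_true = np.diag([0.9, 0.5, 0.7])    # hypothetical stable linear system x_{t+1} = A x_t
X = rng.standard_normal((200, d))
X[:, 2] = 0.0                        # training data carry no energy in coordinate 3
Y = X @ A_true.T                     # one-step targets

def train(A_init, lr=0.1, steps=3000):
    """Gradient descent on the mean squared one-step prediction error."""
    A = A_init.copy()
    for _ in range(steps):
        A -= lr * (X @ A.T - Y).T @ X / len(X)
    return A

for scale in [0.1, 1.5]:
    A_hat = train(scale * np.ones((d, d)))
    rho = max(abs(np.linalg.eigvals(A_hat)))
    print(f"init scale {scale}: unlearned column = {A_hat[:, 2].round(2)}, "
          f"spectral radius = {rho:.2f}")
```

The third column of the learned matrix is exactly the initialization's, and the large init gives spectral radius 1.5 > 1: an unstable simulator despite a stable true system.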

Accurate representation of procedures in restricted scenarios, such as non-standardized scientific experiments, requires precise depiction of constraints. Unfortunately, domain-specific languages (DSLs), though an effective tool for expressing constraints structurally, typically require case-by-case hand-crafting, necessitating customized, labor-intensive effort. To overcome this challenge, we introduce the AutoDSL framework to automate DSL-based constraint design across various domains. Using domain-specific experimental protocol corpora, AutoDSL optimizes syntactic constraints and abstracts semantic constraints. Quantitative and qualitative analyses of the DSLs designed by AutoDSL across five distinct domains highlight its potential as an auxiliary module for language models, improving procedural planning and execution.
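To make "DSL-based constraint design" concrete, here is a hypothetical, hand-simplified example of the kind of artifact such a framework might induce: an opcode table carrying typed-parameter (syntactic) and range (semantic) constraints, plus a validator for protocol steps. The operations, parameters, and bounds are invented for illustration and are not AutoDSL's output.

```python
# Hypothetical, hand-simplified DSL spec of the kind a framework like
# AutoDSL would induce from a protocol corpus: each opcode carries
# typed-parameter (syntactic) and range (semantic) constraints.
SPEC = {
    "centrifuge": {"speed_rpm": (int, 100, 15000), "minutes": (float, 0.1, 120)},
    "incubate":   {"temp_c": (float, -80, 100), "minutes": (float, 0.1, 10000)},
}

def validate(op, **params):
    """Check one protocol step against the DSL's constraints."""
    if op not in SPEC:
        return f"unknown operation: {op}"
    for name, (ty, lo, hi) in SPEC[op].items():
        if name not in params:
            return f"{op}: missing parameter '{name}'"
        v = params[name]
        if not isinstance(v, ty) or not (lo <= v <= hi):
            return f"{op}: '{name}'={v!r} violates {ty.__name__} in [{lo}, {hi}]"
    return "ok"

print(validate("centrifuge", speed_rpm=5000, minutes=10.0))   # ok
print(validate("incubate", temp_c=250.0, minutes=30.0))       # semantic violation
```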
