Modern high-throughput sequencing assays efficiently capture not only gene expression and different levels of gene regulation but also a multitude of genome variants. Focused analysis of alternative alleles of variable sites at homologous chromosomes of the human genome reveals allele-specific gene expression and allele-specific gene regulation by assessing allelic imbalance of read counts at individual sites. Here we formally describe MIXALIME, an advanced statistical framework for detecting allelic imbalance in allelic read counts at single-nucleotide variants detected in diverse omics studies (ChIP-Seq, ATAC-Seq, DNase-Seq, CAGE-Seq, and others). MIXALIME accounts for copy-number variants and aneuploidy as well as reference read mapping bias, and provides several scoring models to balance sensitivity and specificity when scoring data with varying levels of overdispersion caused by experimental noise.
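To make the idea of allelic imbalance in read counts concrete, here is a minimal sketch of the simplest possible baseline: an exact two-sided binomial test on the reference and alternative read counts at one variant. It is not the MIXALIME scoring model; a plain binomial ignores the overdispersion the abstract mentions. The function names and the `p_ref` parameter (the expected reference-allele fraction, which could be shifted away from 0.5 to mimic mapping bias or unequal allelic copy number) are assumptions made for illustration.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of k successes in n Bernoulli(p) trials."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def allelic_imbalance_pvalue(ref_count, alt_count, p_ref=0.5):
    """Exact two-sided binomial p-value for allelic imbalance at one site.

    p_ref is the expected fraction of reference-allele reads under the
    null hypothesis of no imbalance (hypothetical parameter, 0.5 for a
    balanced diploid site with no mapping bias).
    """
    n = ref_count + alt_count
    if n == 0:
        return 1.0  # no reads, no evidence either way
    pmf = [binom_pmf(k, n, p_ref) for k in range(n + 1)]
    observed = pmf[ref_count]
    # Sum the probabilities of all outcomes that are at most as likely
    # as the observed one; the small tolerance guards against rounding.
    p_value = sum(q for q in pmf if q <= observed * (1.0 + 1e-9))
    return min(1.0, p_value)
```

For example, `allelic_imbalance_pvalue(30, 10)` gives a small p-value for a balanced null, while passing `p_ref=2/3` would test the same counts against a hypothetical 2:1 allelic copy ratio.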

Related content

Estimation/Estimator

Optimizer · PDE · MoDELS · Design · Approximation

October 27, 2023

Optimal design of large-scale nonlinear Bayesian inverse problems under model uncertainty

Alen Alexanderian, Ruanui Nicholson, Noemi Petra

from arXiv, 31 pages; revised version

We consider optimal experimental design (OED) for Bayesian nonlinear inverse problems governed by partial differential equations (PDEs) under model uncertainty. Specifically, we consider inverse problems in which, in addition to the inversion parameters, the governing PDEs include secondary uncertain parameters. We focus on problems with infinite-dimensional inversion and secondary parameters and present a scalable computational framework for optimal design of such problems. The proposed approach enables Bayesian inversion and OED under uncertainty within a unified framework. We build on the Bayesian approximation error (BAE) approach to incorporate modeling uncertainties in the Bayesian inverse problem, and on methods for A-optimal design of infinite-dimensional Bayesian nonlinear inverse problems. Specifically, a Gaussian approximation to the posterior at the maximum a posteriori probability point is used to define an uncertainty-aware OED objective that is tractable to evaluate and optimize. In particular, the OED objective can be computed at a cost, in the number of PDE solves, that does not grow with the dimension of the discretized inversion and secondary parameters. The OED problem is formulated as a binary bilevel PDE-constrained optimization problem, and a greedy algorithm, which provides a pragmatic approach, is used to find optimal designs. We demonstrate the effectiveness of the proposed approach for a model inverse problem governed by an elliptic PDE on a three-dimensional domain. Our computational results also highlight the pitfalls of ignoring modeling uncertainties in the OED and/or inference stages.
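The greedy selection of a binary design under an A-optimality criterion can be illustrated on a much simpler problem than the one in the abstract. The sketch below assumes a finite-dimensional linear Gaussian model with forward matrix `F` (one row per candidate sensor), independent noise of variance `noise_var`, and prior precision `prior_prec`; these names and the linear setting are assumptions for illustration only, and the sketch includes neither the PDE constraint nor the BAE treatment of secondary parameters.

```python
import numpy as np


def posterior_trace(F, selected, noise_var, prior_prec):
    """Trace of the posterior covariance when only `selected` sensors are used."""
    Fs = F[selected]
    # Posterior precision of a linear Gaussian model: data term plus prior term.
    H = Fs.T @ Fs / noise_var + prior_prec
    return np.trace(np.linalg.inv(H))


def greedy_a_optimal(F, k, noise_var=1.0, prior_prec=None):
    """Greedily pick k sensors, each time adding the one that most reduces
    the A-optimal criterion (trace of the posterior covariance)."""
    n_sensors, n_params = F.shape
    if prior_prec is None:
        prior_prec = np.eye(n_params)  # assumed standard normal prior
    selected = []
    for _ in range(k):
        remaining = [i for i in range(n_sensors) if i not in selected]
        best = min(
            remaining,
            key=lambda i: posterior_trace(F, selected + [i], noise_var, prior_prec),
        )
        selected.append(best)
    return selected
```

Each greedy step here costs one small matrix inverse per remaining candidate, which is only workable for toy sizes; the abstract's point is precisely that its objective avoids a cost that grows with the discretized parameter dimension.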

Automator · Analysis · Statistic · Integration · BASIC

October 26, 2023

AutoCT: Automated CT registration, segmentation, and quantification

Zhe Bai, Abdelilah Essiari, Talita Perciano, Kristofer E. Bouchard

The processing and analysis of computed tomography (CT) imaging is important for both basic scientific development and clinical applications. In AutoCT, we provide a comprehensive pipeline that integrates end-to-end automatic preprocessing, registration, segmentation, and quantitative analysis of 3D CT scans. The engineered pipeline enables atlas-based CT segmentation and quantification leveraging diffeomorphic transformations through efficient forward and inverse mappings. The localized features extracted from the deformation field allow for downstream statistical learning that may facilitate medical diagnostics. On a lightweight and portable software platform, AutoCT provides a new toolkit for the CT imaging community to underpin the deployment of artificial intelligence-driven applications.

Speech enhancement · MoDELS · Variance · Generative model · Autoencoder

October 26, 2023

A weighted-variance variational autoencoder model for speech enhancement

Ali Golmakani, Mostafa Sadeghi, Xavier Alameda-Pineda, Romain Serizel

We address speech enhancement based on variational autoencoders, which involves learning a speech prior distribution in the time-frequency (TF) domain. A zero-mean complex-valued Gaussian distribution is usually assumed for the generative model, where the speech information is encoded in the variance as a function of a latent variable. In contrast to this commonly used approach, we propose a weighted variance generative model, where the contribution of each spectrogram time-frame in parameter learning is weighted. We impose a Gamma prior distribution on the weights, which would effectively lead to a Student's t-distribution instead of Gaussian for speech generative modeling. We develop efficient training and speech enhancement algorithms based on the proposed generative model. Our experimental results on spectrogram auto-encoding and speech enhancement demonstrate the effectiveness and robustness of the proposed approach compared to the standard unweighted variance model.
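The step from a Gamma prior on the weights to a Student's t-distribution can be written out for a single time-frequency coefficient. The derivation below is a sketch under assumed notation that does not appear in the abstract: $x$ is one complex coefficient, $\sigma^2$ is the variance produced by the decoder, $w$ is the weight dividing that variance, and the Gamma prior uses a shape-rate parameterization $(\alpha, \beta)$.

```latex
\begin{align*}
p(x \mid w) &= \frac{w}{\pi\sigma^2}\exp\!\Big(-\frac{w\,|x|^2}{\sigma^2}\Big),
\qquad
p(w) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, w^{\alpha-1} e^{-\beta w}, \\
p(x) &= \int_0^\infty p(x \mid w)\, p(w)\, dw
      = \frac{\beta^{\alpha}}{\pi\sigma^2\,\Gamma(\alpha)}
        \int_0^\infty w^{\alpha}\, e^{-w\left(\beta + |x|^2/\sigma^2\right)}\, dw \\
     &= \frac{\beta^{\alpha}\,\Gamma(\alpha+1)}{\pi\sigma^2\,\Gamma(\alpha)}
        \Big(\beta + \frac{|x|^2}{\sigma^2}\Big)^{-(\alpha+1)}
      = \frac{\alpha}{\pi\beta\sigma^2}
        \Big(1 + \frac{|x|^2}{\beta\sigma^2}\Big)^{-(\alpha+1)}.
\end{align*}
```

The last expression decays polynomially rather than exponentially in $|x|^2$, which is the heavy-tailed behaviour of a complex Student's t-distribution. When one weight is shared by all frequency bins of a frame, the same integral yields a multivariate version.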

Analysis · Manifold · Smoothing · TOOLS · Data analysis

October 26, 2023

Riemannian geometry for efficient analysis of protein dynamics data

Willem Diepeveen, Carlos Esteve-Yagüe, Jan Lellmann, Ozan Öktem, Carola-Bibiane Schönlieb

An increasingly common viewpoint is that protein dynamics data sets reside in a non-linear subspace of low conformational energy. Ideal data analysis tools for such data sets should therefore account for such non-linear geometry. The Riemannian geometry setting can be suitable for a variety of reasons. First, it comes with a rich structure to account for a wide range of geometries that can be modelled after an energy landscape. Second, many standard data analysis tools initially developed for data in Euclidean space can also be generalised to data on a Riemannian manifold. In the context of protein dynamics, a conceptual challenge comes from the lack of a suitable smooth manifold and the lack of guidelines for constructing a smooth Riemannian structure based on an energy landscape. In addition, computational feasibility in computing geodesics and related mappings poses a major challenge. This work considers these challenges. The first part of the paper develops a novel local approximation technique for computing geodesics and related mappings on Riemannian manifolds in a computationally feasible manner. The second part constructs a smooth manifold of point clouds modulo rigid body group actions and a Riemannian structure that is based on an energy landscape for protein conformations. The resulting Riemannian geometry is tested on several data analysis tasks relevant for protein dynamics data. It performs exceptionally well on coarse-grained molecular dynamics simulated data. In particular, the geodesics with given start- and end-points approximately recover corresponding molecular dynamics trajectories for proteins that undergo relatively ordered transitions with medium sized deformations. The Riemannian protein geometry also gives physically realistic summary statistics and retrieves the underlying dimension even for large-sized deformations within seconds on a laptop.

LDPC · Color · Code · Scaling · Invariant

October 25, 2023

Non-Clifford and parallelizable fault-tolerant logical gates on constant and almost-constant rate homological quantum LDPC codes via higher symmetries

Guanyu Zhu, Shehryar Sikander, Elia Portnoy, Andrew W. Cross, Benjamin J. Brown

from arXiv, 40 pages, 31 figures

We study parallel fault-tolerant quantum computing for families of homological quantum low-density parity-check (LDPC) codes defined on 3-manifolds with constant or almost-constant encoding rate. We derive a generic formula for a transversal $T$ gate of color codes on general 3-manifolds, which acts as collective non-Clifford logical CCZ gates on any triplet of logical qubits with their logical-$X$ membranes having a $\mathbb{Z}_2$ triple intersection at a single point. The triple intersection number is a topological invariant, which also arises in the path integral of the emergent higher symmetry operator in a topological quantum field theory: the $\mathbb{Z}_2^3$ gauge theory. Moreover, the transversal $S$ gate of the color code corresponds to a higher-form symmetry supported on a codimension-1 submanifold, giving rise to exponentially many addressable and parallelizable logical CZ gates. We have developed a generic formalism to compute the triple intersection invariants for 3-manifolds and also study the scaling of the Betti number and systoles with volume for various 3-manifolds, which translates to the encoding rate and distance. We further develop three types of LDPC codes supporting such logical gates: (1) A quasi-hyperbolic code from the product of a 2D hyperbolic surface and a circle, with almost-constant rate $k/n=O(1/\log(n))$ and $O(\log(n))$ distance; (2) A homological fibre bundle code with $O(1/\log^{\frac{1}{2}}(n))$ rate and $O(\log^{\frac{1}{2}}(n))$ distance; (3) A specific family of 3D hyperbolic codes: the Torelli mapping torus code, constructed from mapping tori of a pseudo-Anosov element in the Torelli subgroup, which has constant rate while the distance scaling is currently unknown. We then show a generic constant-overhead scheme for applying a parallelizable universal gate set with the aid of logical-$X$ measurements.

Functional · Approximation · Sample · Optimizer · Orthonormal

October 25, 2023

Optimal approximation of infinite-dimensional holomorphic functions II: recovery from i.i.d. pointwise samples

Ben Adcock, Nick Dexter, Sebastian Moraga

Infinite-dimensional, holomorphic functions have been studied in detail over the last several decades, due to their relevance to parametric differential equations and computational uncertainty quantification. The approximation of such functions from finitely many samples is of particular interest, due to the practical importance of constructing surrogate models for complex mathematical models of physical processes. In previous work [5], we studied the approximation of so-called Banach-valued, $(\boldsymbol{b},\varepsilon)$-holomorphic functions on the infinite-dimensional hypercube $[-1,1]^{\mathbb{N}}$ from $m$ (potentially adaptive) samples. In particular, we derived lower bounds for the adaptive $m$-widths for classes of such functions, which showed that certain algebraic rates of the form $m^{1/2-1/p}$ are the best possible regardless of the sampling-recovery pair. In this work, we continue this investigation by focusing on the practical case where the samples are pointwise evaluations drawn identically and independently from a probability measure. Specifically, for Hilbert-valued $(\boldsymbol{b},\varepsilon)$-holomorphic functions, we show that the same rates can be achieved (up to a small polylogarithmic or algebraic factor) for essentially arbitrary tensor-product Jacobi (ultraspherical) measures. Our reconstruction maps are based on least squares and compressed sensing procedures using the corresponding orthonormal Jacobi polynomials. In doing so, we strengthen and generalize past work that has derived weaker nonuniform guarantees for the uniform and Chebyshev measures (and corresponding polynomials) only. We also extend various best $s$-term polynomial approximation error bounds to arbitrary Jacobi polynomial expansions. Overall, we demonstrate that i.i.d. pointwise samples are near-optimal for the recovery of infinite-dimensional, holomorphic functions.
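A one-dimensional toy version of least-squares recovery from i.i.d. pointwise samples may help fix ideas. The sketch below assumes the uniform probability measure on $[-1,1]$, for which the orthonormal polynomials are Legendre polynomials scaled by $\sqrt{2j+1}$ (the Jacobi case with both parameters zero). The function names are hypothetical, and nothing here reproduces the infinite-dimensional setting or the compressed sensing procedures of the abstract.

```python
import numpy as np
from numpy.polynomial import legendre


def orthonormal_legendre_matrix(y, degree):
    """Evaluate the first degree+1 orthonormal Legendre polynomials at y.

    legvander returns P_0..P_degree; multiplying column j by sqrt(2j+1)
    makes them orthonormal for the uniform probability measure on [-1, 1].
    """
    V = legendre.legvander(y, degree)
    return V * np.sqrt(2 * np.arange(degree + 1) + 1)


def least_squares_fit(f, m, degree, rng):
    """Fit polynomial coefficients from m i.i.d. uniform samples of f."""
    y = rng.uniform(-1.0, 1.0, size=m)
    # The 1/sqrt(m) scaling is the usual normalization of the sampling matrix.
    A = orthonormal_legendre_matrix(y, degree) / np.sqrt(m)
    b = f(y) / np.sqrt(m)
    coeffs, *_ = np.linalg.lstsq(A, b, rcond=None)
    return coeffs


def evaluate(coeffs, y):
    """Evaluate the fitted expansion at points y."""
    return orthonormal_legendre_matrix(y, len(coeffs) - 1) @ coeffs
```

A typical call would be `least_squares_fit(lambda y: 1.0 / (1.0 + 0.5 * y), m=200, degree=10, rng=np.random.default_rng(0))`, followed by `evaluate` on a test grid to inspect the error.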

Statistic · Inference · Processing (programming language) · MoDELS · Graph

October 25, 2023

Statistical inference for Gaussian Whittle-Matérn fields on metric graphs

David Bolin, Alexandre Simas, Jonas Wallin

from arXiv, 45 pages, 5 figures. arXiv admin note: text overlap with arXiv:2205.06163

Whittle-Matérn fields are a recently introduced class of Gaussian processes on metric graphs, which are specified as solutions to a fractional-order stochastic differential equation. Unlike earlier covariance-based approaches for specifying Gaussian fields on metric graphs, the Whittle-Matérn fields are well-defined for any compact metric graph and can provide Gaussian processes with differentiable sample paths. We derive the main statistical properties of the model class, particularly the consistency and asymptotic normality of maximum likelihood estimators of model parameters and the necessary and sufficient conditions for asymptotic optimality properties of linear prediction based on the model with misspecified parameters. The covariance function of the Whittle-Matérn fields is generally unavailable in closed form, and they have therefore been challenging to use for statistical inference. However, we show that for specific values of the fractional exponent, when the fields have Markov properties, likelihood-based inference and spatial prediction can be performed exactly and computationally efficiently. This facilitates using the Whittle-Matérn fields in statistical applications involving big datasets without the need for any approximations. The methods are illustrated via an application to modeling of traffic data, where allowing for differentiable processes dramatically improves the results.

Latent · Processing (programming language) · MoDELS · Continuity · INTERACT

October 25, 2023

Latent event history models for quasi-reaction systems

Matteo Framba, Veronica Vinciotti, Ernst C. Wit

Various processes can be modelled as quasi-reaction systems of stochastic differential equations, such as cell differentiation and disease spreading. Since the underlying data of particle interactions, such as reactions between proteins or contacts between people, are typically unobserved, statistical inference of the parameters driving these systems is developed from concentration data measuring each unit in the system over time. While observing the continuous time process at a time scale as fine as possible should in theory help with parameter estimation, the existing Local Linear Approximation (LLA) methods fail in this case, due to numerical instability caused by small changes of the system at successive time points. On the other hand, one may be able to reconstruct the underlying unobserved interactions from the observed count data. Motivated by this, we first formalise the latent event history model underlying the observed count process. We then propose a computationally efficient Expectation-Maximization algorithm for parameter estimation, with an extended Kalman filtering procedure for the prediction of the latent states. A simulation study shows the performance of the proposed method and highlights the settings where it is particularly advantageous compared to the existing LLA approaches. Finally, we present an illustration of the methodology on the spreading of the COVID-19 pandemic in Italy.

Finite difference · Divergence · Reducible · SimPLe · Discretization

October 24, 2023

A high order accurate bound-preserving compact finite difference scheme for two-dimensional incompressible flow

Hao Li, Xiangxiong Zhang

For solving two-dimensional incompressible flow in the vorticity form by the fourth-order compact finite difference scheme and explicit strong stability preserving (SSP) temporal discretizations, we show that the simple bound-preserving limiter in [Li H., Xie S., Zhang X., SIAM J. Numer. Anal., 56 (2018)] can enforce the strict bounds of the vorticity, if the velocity field satisfies a discrete divergence-free constraint. For reducing oscillations, a modified TVB limiter adapted from [Cockburn B., Shu CW., SIAM J. Numer. Anal., 31 (1994)] is constructed without affecting the bound-preserving property. This bound-preserving finite difference method can be used for any passive convection equation with a divergence-free velocity field.

MoDELS · Identifiable · Mutually independent · state-of-the-art · Fidelity

October 24, 2023

Context-aware feature attribution through argumentation

Jinfeng Zhong, Elsa Negre

Feature attribution is a fundamental task in both machine learning and data analysis, which involves determining the contribution of individual features or variables to a model's output. This process helps identify the most important features for predicting an outcome. The history of feature attribution methods can be traced back to Generalized Additive Models (GAMs), which extend linear regression models by incorporating non-linear relationships between dependent and independent variables. In recent years, gradient-based methods and surrogate models have been applied to unravel complex Artificial Intelligence (AI) systems, but these methods have limitations. GAMs tend to achieve lower accuracy, gradient-based methods can be difficult to interpret, and surrogate models often suffer from stability and fidelity issues. Furthermore, most existing methods do not consider users' contexts, which can significantly influence their preferences. To address these limitations and advance the current state of the art, we define a novel feature attribution framework called Context-Aware Feature Attribution Through Argumentation (CA-FATA). Our framework harnesses the power of argumentation by treating each feature as an argument that can either support, attack or neutralize a prediction. Additionally, CA-FATA formulates feature attribution as an argumentation procedure, and each computation has explicit semantics, which makes it inherently interpretable. CA-FATA also easily integrates side information, such as users' contexts, resulting in more accurate predictions.