Modeling the complex relationships between multiple categorical response variables as a function of predictors is a fundamental task in the analysis of categorical data. However, existing methods can be difficult to interpret and may lack flexibility. To address these challenges, we introduce a penalized likelihood method for multivariate categorical response regression that relies on a novel subspace decomposition to parameterize interpretable association structures. Our approach models the relationships between categorical responses by identifying mutual, joint, and conditionally independent associations, which yields a linear problem within a tensor product space. We establish theoretical guarantees for our estimator, including error bounds in high-dimensional settings, and demonstrate the method's interpretability and prediction accuracy through comprehensive simulation studies.
Splitting methods are widely used for solving initial value problems (IVPs) due to their ability to simplify complicated evolutions into more manageable subproblems that can be solved efficiently and accurately. Traditionally, these methods are derived using analytic and algebraic techniques from numerical analysis, including truncated Taylor series and their Lie-algebraic analogue, the Baker--Campbell--Hausdorff formula. These tools enable the development of high-order numerical methods that provide exceptional accuracy for small timesteps. Moreover, these methods often (nearly) conserve important physical invariants, such as mass, unitarity, and energy. However, in many practical applications computational resources are limited, so it is crucial to identify methods that achieve the best accuracy within a fixed computational budget, which may require taking relatively large timesteps. In this regime, high-order methods derived with traditional techniques often exhibit large errors, since they are designed only to be asymptotically optimal. Machine learning techniques offer a potential solution, since they can be trained to solve a given IVP efficiently with fewer computational resources. However, they are often purely data-driven, come with limited convergence guarantees in the small-timestep regime, and do not necessarily conserve physical invariants. In this work, we propose a framework for finding machine-learned splitting methods that are computationally efficient for large timesteps and have provable convergence and conservation guarantees in the small-timestep limit. We demonstrate numerically that the learned methods, which by construction converge quadratically in the timestep size, can be significantly more efficient than established methods for the Schr\"{o}dinger equation when the computational budget is limited.
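As a concrete illustration of the classical (non-learned) baseline, here is a minimal Strang-splitting sketch for a linear system $u' = (A+B)u$. The matrices A and B below are hypothetical stand-ins, chosen nilpotent so that each subflow $e^{hA} = I + hA$ is exact and cheap; together they generate the harmonic oscillator, for which the exact solution is known.

```python
import numpy as np

# Illustrative nilpotent splitting of u' = [[0,1],[-1,0]] u (harmonic oscillator):
# A^2 = B^2 = 0, so each subflow exp(hA) = I + hA, exp(hB) = I + hB is exact.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0, 0.0], [-1.0, 0.0]])
I = np.eye(2)

def strang_solve(u0, T, n):
    """Integrate u' = (A+B)u over [0, T] with n Strang-splitting steps."""
    h = T / n
    # Strang step: half-step in A, full step in B, half-step in A
    step = (I + 0.5 * h * A) @ (I + h * B) @ (I + 0.5 * h * A)
    u = u0.copy()
    for _ in range(n):
        u = step @ u
    return u
```

Halving the timestep should reduce the error roughly fourfold, consistent with the quadratic convergence stated above.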
Accurately characterizing migration velocity models is crucial for a wide range of geophysical applications, from hydrocarbon exploration to monitoring of CO2 sequestration projects. Traditional velocity model building methods such as Full-Waveform Inversion (FWI) are powerful but often struggle with the inherent complexities of the inverse problem, including noise, limited bandwidth, restricted receiver aperture, and computational constraints. To address these challenges, we propose a scalable methodology that integrates generative modeling, in the form of diffusion networks, with physics-informed summary statistics, making it suitable for complicated imaging problems including field datasets. By defining these summary statistics in terms of subsurface-offset image volumes for poor initial velocity models, our approach allows for computationally efficient generation of Bayesian posterior samples for migration velocity models that offer a useful assessment of uncertainty. To validate our approach, we introduce a battery of tests that measure the quality of the inferred velocity models, as well as the quality of the inferred uncertainties. With modern synthetic datasets, we reconfirm gains from using subsurface-image gathers as the conditioning observable. For complex velocity model building involving salt, we propose a new iterative workflow that refines amortized posterior approximations with salt flooding, and we demonstrate how the uncertainty in the velocity model can be propagated to the final reverse-time migrated images. Finally, we present a proof of concept on field datasets to show that our method can scale to industry-sized problems.
Solving multiscale diffusion problems is often computationally expensive due to the spatial and temporal discretization challenges arising from high-contrast coefficients. To address this issue, a partially explicit temporal splitting scheme is proposed. By appropriately constructing multiscale spaces, the spatial multiscale property is effectively managed, and it has been demonstrated that the temporal step size is independent of the contrast. To enhance simulation speed, we propose a parallel algorithm for the multiscale flow problem that leverages the partially explicit temporal splitting scheme. The idea is first to evolve the partially explicit system using a coarse time step size, then correct the solution on each coarse time interval with a fine propagator, for which we consider both a sequential solver and an all-at-once solver. This procedure is performed iteratively until convergence. We analyze the stability and convergence of the proposed algorithm. The numerical experiments demonstrate that the proposed algorithm achieves high numerical accuracy for high-contrast problems and converges in a relatively small number of iterations. The number of iterations remains stable as the number of coarse intervals increases, so parallel processing yields a significant improvement in computational efficiency.
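The coarse-then-correct iteration described above follows the familiar parareal pattern. A minimal scalar sketch of that pattern (not the paper's multiscale scheme; the decay rate `lam`, the implicit Euler propagators, and the step counts are illustrative assumptions):

```python
import numpy as np

lam = 5.0  # hypothetical decay rate, a stand-in for a stiff diffusion mode

def coarse(u, dt):
    """Coarse propagator: a single implicit Euler step for u' = -lam*u."""
    return u / (1.0 + lam * dt)

def fine(u, dt, m=50):
    """Fine propagator: m implicit Euler substeps over the same interval."""
    h = dt / m
    for _ in range(m):
        u = u / (1.0 + lam * h)
    return u

def parareal(u0, T, N, K):
    """K parareal corrections over N coarse intervals on [0, T]."""
    dt = T / N
    U = np.empty(N + 1); U[0] = u0
    for n in range(N):                     # initial coarse sweep
        U[n + 1] = coarse(U[n], dt)
    for _ in range(K):                     # correction iterations
        Fv = [fine(U[n], dt) for n in range(N)]   # parallel in practice
        Unew = np.empty_like(U); Unew[0] = u0
        for n in range(N):
            Unew[n + 1] = coarse(Unew[n], dt) + Fv[n] - coarse(U[n], dt)
        U = Unew
    return U
```

The fine solves over the N intervals are independent within each iteration, which is where the parallel speedup comes from; after at most N corrections the iteration reproduces the serial fine solution exactly.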
We consider the problem of sampling a multimodal distribution with a Markov chain given a small number of samples from the stationary measure. Although mixing can be arbitrarily slow, we show that if the Markov chain has a $k$th order spectral gap, initialization from a set of $\tilde O(k/\varepsilon^2)$ samples from the stationary distribution will, with high probability over the samples, efficiently generate a sample whose conditional law is $\varepsilon$-close in TV distance to the stationary measure. In particular, this applies to mixtures of $k$ distributions satisfying a Poincar\'e inequality, with faster convergence when they satisfy a log-Sobolev inequality. Our bounds are stable to perturbations to the Markov chain, and in particular work for Langevin diffusion over $\mathbb R^d$ with score estimation error, as well as Glauber dynamics combined with approximation error from pseudolikelihood estimation. This justifies the success of data-based initialization for score matching methods despite slow mixing for the data distribution, and improves and generalizes the results of Koehler and Vuong (2023) to have linear, rather than exponential, dependence on $k$ and apply to arbitrary semigroups. As a consequence of our results, we show for the first time that a natural class of low-complexity Ising measures can be efficiently learned from samples.
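A minimal sketch of the data-based initialization idea for unadjusted Langevin dynamics (ULA), assuming a hypothetical well-separated equal mixture of N(-4, 1) and N(4, 1): chains seeded from stationary samples keep both modes represented, even though crossing between modes is practically impossible on this time scale.

```python
import numpy as np

rng = np.random.default_rng(0)

def score(x):
    """grad log density of the equal mixture of N(-4,1) and N(4,1)."""
    a = np.exp(-0.5 * (x - 4.0) ** 2)
    b = np.exp(-0.5 * (x + 4.0) ** 2)
    return (-(x - 4.0) * a - (x + 4.0) * b) / (a + b)

def ula(x0, step=0.01, n_steps=200):
    """Unadjusted Langevin: x <- x + step*score(x) + sqrt(2*step)*noise."""
    x = x0.copy()
    for _ in range(n_steps):
        x = x + step * score(x) + np.sqrt(2 * step) * rng.standard_normal(x.shape)
    return x

# seed half the chains from each mode, mimicking samples from the
# stationary measure; a single-point initialization would cover only one mode
n = 2000
init = np.where(rng.random(n) < 0.5, -4.0, 4.0) + rng.standard_normal(n)
out = ula(init)
frac_right = np.mean(out > 0)   # both modes remain represented, roughly half each
```

This is only a cartoon of the phenomenon the abstract describes, with the exact mixture score in place of a learned score estimate.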
Learning causal effects of a binary exposure on time-to-event endpoints can be challenging because survival times may be partially observed due to censoring and systematically biased due to truncation. In this work, we present debiased machine learning-based nonparametric estimators of the joint distribution of a counterfactual survival time and baseline covariates for use when the observed data are subject to covariate-dependent left truncation and right censoring and when baseline covariates suffice to deconfound the relationship between exposure and survival time. Our inferential procedures explicitly allow the integration of flexible machine learning tools for nuisance estimation, and enjoy certain robustness properties. The approach we propose can be directly used to make pointwise or uniform inference on smooth summaries of the joint counterfactual survival time and covariate distribution, and can be valuable even in the absence of interventions, when summaries of a marginal survival distribution are of interest. We showcase how our procedures can be used to learn a variety of inferential targets and illustrate their performance in simulation studies.
We address the problem of identifying functional interactions among stochastic neurons with variable-length memory from their spiking activity. The neuronal network is modeled by a stochastic system of interacting point processes with variable-length memory, in which each process describes the activity of a single neuron, indicating whether it spikes at a given time. One neuron's influence on another can be either excitatory or inhibitory. To identify the existence and nature of an interaction between a neuron and its postsynaptic counterpart, we propose a model selection procedure based on observing the spiking activity of a finite set of neurons over a finite time window. The procedure builds on the maximum likelihood estimator of the synaptic weight matrix of the network model: we prove the consistency of this estimator and then the consistency of the neighborhood interaction estimation procedure. The effectiveness of the proposed model selection procedure is demonstrated using simulated data, which validates the underlying theory. The method is also applied to spike train data recorded from hippocampal neurons in rats during a visual attention task, where a computational model reconstructs the spiking activity and the results reveal biologically relevant information.
Robust and stable high-order numerical methods for solving partial differential equations are attractive because they are efficient on modern and next-generation hardware architectures. However, the design of provably stable numerical methods for nonlinear hyperbolic conservation laws poses a significant challenge. We present the dual-pairing (DP) and upwind summation-by-parts (SBP) finite difference (FD) framework for accurate and robust numerical approximations of nonlinear conservation laws. The framework has an inbuilt "limiter" whose goal is to detect and effectively resolve regions where the solution is poorly resolved and/or contains discontinuities. The DP SBP FD operators are a dual pair of backward and forward FD stencils which together preserve the SBP property. In addition, the DP SBP FD operators are designed to be upwind, that is, they come with some innate dissipation everywhere, as opposed to traditional SBP and collocated discontinuous Galerkin spectral element methods, which can induce dissipation only through numerical fluxes acting at element interfaces. We combine the DP SBP operators with skew-symmetric and upwind flux splittings of nonlinear hyperbolic conservation laws. The resulting semi-discrete approximation is provably entropy-stable for arbitrary nonlinear hyperbolic conservation laws. The framework is high-order accurate, provably entropy-stable, convergent, and avoids several pitfalls of current state-of-the-art high-order methods. We give specific examples using the inviscid Burgers' equation, the nonlinear shallow water equations, and the compressible Euler equations of gas dynamics. Numerical experiments verify the accuracy and demonstrate the robustness of our numerical framework.
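The energy mechanism behind the skew-symmetric split can be sketched with a plain periodic central-difference operator standing in for the DP upwind SBP pair (an illustrative simplification, not the authors' operators): for an antisymmetric difference matrix D, the split form of Burgers' flux conserves the discrete energy exactly.

```python
import numpy as np

# Periodic central-difference operator D on a uniform grid; D is antisymmetric.
n = 64
x = np.linspace(0, 2 * np.pi, n, endpoint=False)
h = x[1] - x[0]
D = (np.diag(np.ones(n - 1), 1) - np.diag(np.ones(n - 1), -1)) / (2 * h)
D[0, -1] = -1 / (2 * h)
D[-1, 0] = 1 / (2 * h)   # periodic wrap-around

def rhs(u):
    """Skew-symmetric split of Burgers' flux: u_t = -(1/3)[D(u^2) + u*(D u)]."""
    return -(D @ u**2 + u * (D @ u)) / 3.0

u = np.sin(x)
# Because D is antisymmetric, u^T rhs(u) = 0: the semi-discrete energy
# (1/2) sum u_i^2 is conserved to machine precision.
energy_rate = abs(u @ rhs(u))
```

This shows only the entropy-conservative core; the upwind DP operators described above additionally supply dissipation where the solution is under-resolved.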
We perform a quantitative assessment of different strategies to compute the contribution due to surface tension in incompressible two-phase flows using a conservative level set (CLS) method. More specifically, we compare classical approaches, such as the direct computation of the curvature from the level set or the Laplace-Beltrami operator, with an evolution equation for the mean curvature recently proposed in the literature. We consider the test case of a static bubble, for which an exact solution for the pressure jump across the interface is available, and that of an oscillating bubble, highlighting the pros and cons of the different approaches.
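A minimal sketch of the first of these strategies, direct curvature from the level set via $\kappa = \nabla \cdot (\nabla\phi / |\nabla\phi|)$, on a hypothetical test geometry (a circle of radius R, for which the exact interface curvature is 1/R):

```python
import numpy as np

# Grid and signed-distance level set of a circle of radius R
n, L, R = 256, 2.0, 0.5
x = np.linspace(-L / 2, L / 2, n)
X, Y = np.meshgrid(x, x, indexing="ij")
h = x[1] - x[0]
phi = np.sqrt(X**2 + Y**2) - R

# kappa = div( grad(phi) / |grad(phi)| ), with central differences
gx, gy = np.gradient(phi, h)
norm = np.sqrt(gx**2 + gy**2) + 1e-12      # guard against division by zero
nx, ny = gx / norm, gy / norm
kappa = np.gradient(nx, h, axis=0) + np.gradient(ny, h, axis=1)

# evaluate near the interface, where |phi| is smallest
interface = np.abs(phi) < h
kappa_est = kappa[interface].mean()        # should be close to 1/R
```

In a CLS setting the normalized gradient is typically built from the regularized level set instead of a signed distance, which is precisely where the accuracy differences compared above arise.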
Recently, general fractional calculus was introduced by Kochubei (2011) and Luchko (2021) as a further generalisation of fractional calculus in which the derivative and integral operators admit an arbitrary kernel. Such a formalism has many potential applications in physics and engineering, since the kernel is no longer restricted. We first extend the work of Al-Refai and Luchko (2023) on the finite interval to arbitrary orders. We then develop an efficient Petrov--Galerkin scheme by introducing Jacobi convolution polynomials as basis functions. A notable property of these basis functions is that the general fractional derivative of a Jacobi convolution polynomial is a shifted Jacobi polynomial. Thus, with a suitable choice of test functions, the scheme yields a diagonal stiffness matrix and is therefore efficient to implement. Furthermore, our method is constructed for any arbitrary kernel, including that of the classical fractional operator, which is a special case of the general fractional operator.
The susceptibility of timestepping algorithms to numerical instabilities is an important consideration when simulating partial differential equations (PDEs). Here we identify and analyze a pernicious numerical instability arising in pseudospectral simulations of nonlinear wave propagation resulting in finite-time blow-up. The blow-up time scale is independent of the spatial resolution and spectral basis but sensitive to the timestepping scheme and the timestep size. The instability appears in multi-step and multi-stage implicit-explicit (IMEX) timestepping schemes of different orders of accuracy and has been found to manifest in simulations of soliton solutions of the Korteweg-de Vries (KdV) equation and traveling wave solutions of a nonlinear generalized Klein-Gordon equation. Focusing on the case of KdV solitons, we show that modal predictions from linear stability theory are unable to explain the instability because the spurious growth from linear dispersion is small and nonlinear sources of error growth converge too slowly in the limit of small timestep size. We then develop a novel multi-scale asymptotic framework that captures the slow, nonlinear accumulation of timestepping errors. The framework allows the solution to vary with respect to multiple time scales related to the timestep size and thus recovers the instability as a function of a slow time scale dictated by the order of accuracy of the timestepping scheme. We show that this approach correctly describes our simulations of solitons by making accurate predictions of the blow-up time scale and transient features of the instability. Our work demonstrates that studies of long-time simulations of nonlinear waves should exercise caution when validating their timestepping schemes.
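For concreteness, here is a first-order IMEX pseudospectral step for KdV of the kind studied here, treating the linear dispersive term implicitly in Fourier space and the nonlinearity explicitly (the grid size, timestep, and initial data are illustrative choices, not those of the simulations above):

```python
import numpy as np

# Periodic pseudospectral setup for u_t + u u_x + u_xxx = 0
n, L = 128, 2 * np.pi
x = np.linspace(0, L, n, endpoint=False)
k = np.fft.fftfreq(n, d=L / n) * 2 * np.pi   # angular wavenumbers
dt = 1e-4

def imex_step(u):
    """One IMEX Euler step: explicit nonlinearity, implicit dispersion."""
    # explicit nonlinear term -u u_x, with a pseudospectral derivative
    nonlin = -u * np.real(np.fft.ifft(1j * k * np.fft.fft(u)))
    uhat = np.fft.fft(u + dt * nonlin)
    # implicit linear solve in Fourier space: -u_xxx has symbol 1j*k^3,
    # so (I - dt*L)^{-1} is a pointwise division over modes
    uhat /= (1.0 - dt * 1j * k**3)
    return np.real(np.fft.ifft(uhat))

u = 0.1 * np.sin(x)
mass0 = u.mean()
for _ in range(100):
    u = imex_step(u)
# the k = 0 mode is untouched by the implicit solve, so mass is conserved
```

The multi-scale analysis described above targets exactly the slow error accumulation that such a step-by-step linear stability check of this scheme cannot see.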