When making treatment selection decisions, it is essential to include a causal effect estimation analysis that compares potential outcomes under the different treatments or controls and thereby supports an optimal choice. However, merely estimating individual treatment effects may not suffice for truly optimal decisions. Our study addresses this issue by incorporating additional criteria, such as the uncertainty of the estimates, measured by the conditional value-at-risk commonly used in portfolio and insurance management. For continuous outcomes observable before and after treatment, we incorporate a further condition: we prioritize treatments that yield favorable estimated treatment effects and are also predicted to lead to post-treatment outcomes more desirable than the pretreatment level, calling the latter condition the prediction criterion. With these considerations, we propose a comprehensive methodology for multitreatment selection. Our approach ensures satisfaction of the overlap assumption, which is crucial for comparing outcomes of treated and control groups, by training propensity score models as a preliminary step before employing traditional causal models. To illustrate a practical application of our methodology, we apply it to the credit card limit adjustment problem. Analyzing a fintech company's historical data, we found that relying solely on counterfactual predictions was inadequate for appropriate credit line modifications, and that incorporating our proposed additional criteria significantly enhanced policy performance.
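As a rough illustration of how the risk and prediction criteria can be combined with estimated effects, the following Python sketch screens candidate treatments by the conditional value-at-risk of their estimated effect distribution and by the probability that the post-treatment outcome exceeds the pretreatment level; the draws, thresholds, and treatment names are hypothetical and not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def cvar_lower_tail(samples, alpha=0.05):
    """Average of the worst alpha-fraction of sampled treatment effects:
    one common way to quantify downside risk via the conditional
    value-at-risk (the paper's exact definition may differ)."""
    q = np.quantile(samples, alpha)
    return samples[samples <= q].mean()

# Hypothetical posterior/bootstrap draws of the individual treatment effect
# for each candidate credit-line action (all names and numbers illustrative).
effect_draws = {
    "keep_limit":     rng.normal(0.00, 0.5, 5000),
    "small_increase": rng.normal(0.20, 0.6, 5000),
    "large_increase": rng.normal(0.35, 1.2, 5000),
}
pre_outcome = 1.0
post_outcome_draws = {k: pre_outcome + v for k, v in effect_draws.items()}

admissible = []
for t, draws in effect_draws.items():
    risk = cvar_lower_tail(draws)
    # Prediction criterion: the post-treatment outcome should exceed the
    # pretreatment level with high probability (threshold is illustrative).
    prediction_ok = (post_outcome_draws[t] > pre_outcome).mean() >= 0.6
    if risk > -1.5 and prediction_ok:  # risk tolerance is illustrative
        admissible.append((t, draws.mean()))

best = max(admissible, key=lambda kv: kv[1]) if admissible else ("keep_limit", 0.0)
print("selected treatment:", best[0])
```

In this toy example the action with the largest mean effect is excluded by the risk criterion and the do-nothing action by the prediction criterion, so a treatment with a slightly smaller mean effect but acceptable downside is selected.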
Deep neural networks have achieved remarkable success in diverse applications, prompting the need for a solid theoretical foundation. Recent research has identified the minimal width $\max\{2,d_x,d_y\}$ required for neural networks with input dimension $d_x$ and output dimension $d_y$ that use leaky ReLU activations to universally approximate $L^p(\mathbb{R}^{d_x},\mathbb{R}^{d_y})$ on compacta. Here, we present an alternative proof for the minimal width of such neural networks by directly constructing approximating networks using a coding scheme that leverages the properties of leaky ReLUs and standard $L^p$ results. The resulting construction has a minimal interior dimension of $1$, independent of the input and output dimensions, which allows us to show that autoencoders with leaky ReLU activations are universal approximators of $L^p$ functions. Furthermore, we demonstrate that the normalizing flow LU-Net serves as a distributional universal approximator. We broaden our results to show that smooth invertible neural networks can approximate $L^p(\mathbb{R}^{d},\mathbb{R}^{d})$ on compacta when the dimension $d\geq 2$, which provides a constructive proof of a classical theorem of Brenier and Gangbo. In addition, we use a topological argument to establish that, for feedforward neural networks (FNNs) with monotone Lipschitz continuous activations, $d_x+1$ is a lower bound on the minimal width required for the uniform universal approximation of continuous functions $C^0(\mathbb{R}^{d_x},\mathbb{R}^{d_y})$ on compacta when $d_x\geq d_y$.
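Schematically, the coding-scheme construction compresses through a one-dimensional interior code,
\[
  f \;\approx\; \psi \circ \varphi, \qquad \varphi : \mathbb{R}^{d_x} \to \mathbb{R}, \qquad \psi : \mathbb{R} \to \mathbb{R}^{d_y},
\]
with both factors realized by leaky ReLU layers of width at most $\max\{2,d_x,d_y\}$, so that the minimal width for $L^p$ approximation is $\max\{2,d_x,d_y\}$, while for uniform approximation of $C^0(\mathbb{R}^{d_x},\mathbb{R}^{d_y})$ with monotone Lipschitz continuous activations and $d_x\geq d_y$ the width is bounded below by $d_x+1$ (a restatement of the stated bounds for orientation, not a proof sketch).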
Ongoing advances in microbiome profiling have allowed unprecedented insights into the molecular activities of microbial communities. This has fueled a strong scientific interest in understanding the critical role the microbiome plays in governing human health, by identifying microbial features associated with clinical outcomes of interest. Several aspects of microbiome data limit the applicability of existing variable selection approaches. In particular, microbiome data are high-dimensional, extremely sparse, and compositional. Importantly, many of the observed features, although categorized as different taxa, may play related functional roles. To address these challenges, we propose a novel compositional regression approach that leverages the data-adaptive clustering and variable selection properties of the spiked Dirichlet process to identify taxa that exhibit similar functional roles. Our proposed method, Bayesian Regression with Agglomerated Compositional Effects using a dirichLET process (BRACElet), enables the identification of a sparse set of features with shared impacts on the outcome, facilitating dimension reduction and model interpretation. We demonstrate that BRACElet outperforms existing approaches for microbiome variable selection through simulation studies and an application elucidating the impact of oral microbiome composition on insulin resistance.
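For orientation only, one common way to write a compositional (log-contrast) regression with a spiked Dirichlet process prior, which clusters coefficients and permits exact zeros, is
\[
  y_i = \sum_{j=1}^{p} \beta_j \log x_{ij} + \varepsilon_i, \qquad \sum_{j=1}^{p}\beta_j = 0,
\]
\[
  \beta_j \mid G \sim G, \qquad G \sim \mathrm{DP}(\alpha_0, G_0), \qquad G_0 = \pi_0\,\delta_0 + (1-\pi_0)\,\mathcal{N}(0,\tau^2),
\]
where the $x_{ij}$ are relative abundances; this is a generic formulation given for illustration, and BRACElet's exact model specification may differ.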
A reasonable description of the degradation process is essential for credible reliability assessment in accelerated degradation testing. Existing methods usually use Markovian stochastic processes to describe the degradation process. However, the degradation processes of some products are non-Markovian due to interaction with their environments, and misinterpretation of the degradation pattern may lead to biased reliability evaluations. In addition, owing to differences in materials and manufacturing processes, products from the same population exhibit diverse degradation paths, further increasing the difficulty of accurate reliability estimation. To address these issues, this paper proposes an accelerated degradation model incorporating memory effects and unit-to-unit variability. The memory effect in the degradation process is captured by fractional Brownian motion, which reflects the non-Markovian characteristic of degradation. The unit-to-unit variability is considered in the acceleration model to describe diverse degradation paths. Lifetime and reliability under normal operating conditions are then derived. Furthermore, to give an accurate estimation of the memory effect, a new statistical analysis method based on the expectation-maximization algorithm is devised. The effectiveness of the proposed method is verified by a simulation case and a real-world tuner reliability analysis case. The simulation case shows that the estimate of the memory effect obtained by the proposed statistical analysis method is much more accurate than that of the traditional approach. Moreover, ignoring unit-to-unit variability can lead to highly biased estimates of the memory effect and reliability.
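To make the memory effect concrete, the following sketch simulates fractional Brownian motion from its standard covariance via a Cholesky factorization; the Hurst exponent controls the strength of the memory. This is only a generic illustration of the driving process, not the paper's degradation model or estimation procedure.

```python
import numpy as np

def fbm_paths(n_steps, hurst, t_max=1.0, n_paths=3, seed=0):
    """Simulate fractional Brownian motion on a regular grid via the
    Cholesky factor of its covariance
        Cov(B_H(t), B_H(s)) = 0.5 * (t^{2H} + s^{2H} - |t - s|^{2H}).
    H > 0.5 gives persistent (long-memory) increments, H = 0.5 recovers
    ordinary Brownian motion, H < 0.5 gives anti-persistent increments."""
    t = np.linspace(t_max / n_steps, t_max, n_steps)
    tt, ss = np.meshgrid(t, t, indexing="ij")
    cov = 0.5 * (tt**(2 * hurst) + ss**(2 * hurst) - np.abs(tt - ss)**(2 * hurst))
    L = np.linalg.cholesky(cov + 1e-12 * np.eye(n_steps))  # jitter for stability
    z = np.random.default_rng(seed).standard_normal((n_steps, n_paths))
    return t, L @ z

# Persistent memory (H = 0.8) versus the Markovian case (H = 0.5).
t, persistent = fbm_paths(200, hurst=0.8)
_, markovian = fbm_paths(200, hurst=0.5)
print(persistent[-1, :], markovian[-1, :])
```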
We study the semistability of quiver representations from an algorithmic perspective. We present efficient algorithms for several fundamental computational problems on the semistability of quiver representations: deciding the semistability and $\sigma$-semistability, finding the maximizers of King's criterion, and computing the Harder--Narasimhan filtration. We also investigate a class of polyhedral cones defined by the linear system in King's criterion, which we refer to as King cones. For rank-one representations, we demonstrate that these King cones can be encoded by submodular flow polytopes, enabling us to decide the $\sigma$-semistability in strongly polynomial time. Our approach employs submodularity in quiver representations, which may be of independent interest.
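For reference, King's criterion for a weight $\sigma$ can be stated (up to the sign convention, which varies in the literature) as: a representation $M$ is $\sigma$-semistable if
\[
  \sigma \cdot \dim M = 0 \quad \text{and} \quad \sigma \cdot \dim M' \le 0 \ \text{for every subrepresentation } M' \subseteq M,
\]
and the King cones referred to above are the polyhedral cones defined by this linear system.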
The direct parametrisation method for invariant manifolds is adjusted to account for a varying parameter. More specifically, the case of systems experiencing a Hopf bifurcation in the parameter range of interest is investigated, and the ability to predict the amplitudes of the limit cycle oscillations after the bifurcation is demonstrated. The Ziegler pendulum and Beck's column, both of which are subjected to a follower force, are considered as applications. By comparison with the eigenvalue trajectories in the conservative case, it is advocated that using two master modes to derive the reduced-order model (ROM), instead of considering only the unstable one, should give more accurate results. In the specific case where an exceptional bifurcation point is met, a numerical strategy enforcing the presence of Jordan blocks in the Jacobian matrix during the procedure is devised. ROMs are constructed for the Ziegler pendulum with two and three degrees of freedom, and Beck's column is then investigated, with a finite element procedure used to discretize the problem in space. The numerical results show the ability of the ROMs to correctly predict the amplitude of the limit cycles up to a certain range, and computing the ROM after the Hopf bifurcation is shown to give the most satisfactory results. This feature is analyzed in terms of phase space representations, and the two proposed adjustments are shown to improve the validity range of the ROMs.
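As background for the amplitude prediction, recall the generic normal form of the reduced dynamics on a two-dimensional master manifold near a Hopf bifurcation, written in polar coordinates,
\[
  \dot r = \mu\, r + \ell_1\, r^3, \qquad \dot\theta = \omega + \mathcal{O}(r^2),
\]
so that, for a supercritical bifurcation ($\ell_1 < 0$) and $\mu > 0$ past the bifurcation point, the limit cycle amplitude scales as $r_{\mathrm{LC}} = \sqrt{-\mu/\ell_1}$ (the standard normal-form picture, recalled here only for orientation).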
Gaussian processes are a widely used statistical tool for conducting non-parametric inference in the applied sciences, with many computational packages available to fit them to data and predict future observations. We study the use of the Greta software for Bayesian inference to apply Gaussian process regression to spatio-temporal data on infectious disease outbreaks and predict future disease spread. Greta builds on TensorFlow, making it comparatively easy to take advantage of the significant gains in speed offered by GPUs. For these complex spatio-temporal models, we show a reduction of up to 70\% in computational time relative to fitting the same models on CPUs. We show how the choice of covariance kernel affects the ability to infer spread and to extrapolate to unobserved spatial and temporal units. The inference pipeline is applied to weekly incidence data on tuberculosis in the East and West Midlands regions of England over a period of two years.
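Greta itself is an R interface, so as a language-neutral illustration the following numpy sketch performs Gaussian process regression with a separable space-time squared-exponential covariance; the lengthscales, toy data, and kernel choice are placeholders rather than the paper's configuration.

```python
import numpy as np

def sqexp(d2, lengthscale):
    """Squared-exponential kernel evaluated on squared distances d2."""
    return np.exp(-0.5 * d2 / lengthscale**2)

def spacetime_kernel(X1, X2, ls_space=50.0, ls_time=4.0, variance=1.0):
    """Separable space-time covariance: k = var * k_space * k_time.
    Columns of X are (easting, northing, week); lengthscales are illustrative."""
    ds2 = ((X1[:, None, :2] - X2[None, :, :2])**2).sum(-1)
    dt2 = (X1[:, None, 2] - X2[None, :, 2])**2
    return variance * sqexp(ds2, ls_space) * sqexp(dt2, ls_time)

def gp_predict(X_train, y_train, X_new, noise=0.1):
    """Standard GP regression predictive mean and (noise-free) variance."""
    K = spacetime_kernel(X_train, X_train) + noise**2 * np.eye(len(X_train))
    Ks = spacetime_kernel(X_new, X_train)
    Kss = spacetime_kernel(X_new, X_new)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = Ks @ alpha
    v = np.linalg.solve(L, Ks.T)
    var = np.diag(Kss) - (v**2).sum(0)
    return mean, var

# Toy weekly incidence at random locations, predicted one week ahead.
rng = np.random.default_rng(1)
X = rng.uniform([0, 0, 0], [100, 100, 52], size=(80, 3))
y = np.sin(X[:, 2] / 8.0) + 0.1 * rng.standard_normal(80)
X_new = np.column_stack([np.full(5, 50.0), np.full(5, 50.0), np.arange(53, 58)])
mu, var = gp_predict(X, y, X_new)
print(mu, var)
```

Swapping the kernel (e.g. a less smooth or non-separable one) changes how confidently the model extrapolates to unobserved weeks and locations, which is the effect studied in the paper.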
Splitting methods are widely used for solving initial value problems (IVPs) due to their ability to simplify complicated evolutions into more manageable subproblems which can be solved efficiently and accurately. Traditionally, these methods are derived using analytic and algebraic techniques from numerical analysis, including truncated Taylor series and their Lie algebraic analogue, the Baker--Campbell--Hausdorff formula. These tools enable the development of high-order numerical methods that provide exceptional accuracy for small timesteps. Moreover, these methods often (nearly) conserve important physical invariants, such as mass, unitarity, and energy. However, in many practical applications computational resources are limited, so it is crucial to identify methods that achieve the best accuracy within a fixed computational budget, which might require taking relatively large timesteps. In this regime, high-order methods derived in the traditional way often exhibit large errors, since they are designed only to be asymptotically optimal. Machine learning techniques offer a potential solution, since they can be trained to solve a given IVP efficiently with fewer computational resources. However, they are often purely data-driven, come with limited convergence guarantees in the small-timestep regime, and do not necessarily conserve physical invariants. In this work, we propose a framework for finding machine-learned splitting methods that are computationally efficient for large timesteps and have provable convergence and conservation guarantees in the small-timestep limit. We demonstrate numerically that the learned methods, which by construction converge quadratically in the timestep size, can be significantly more efficient than established methods for the Schr\"{o}dinger equation when the computational budget is limited.
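As a point of reference for the class of methods being learned, here is a standard second-order (Strang) splitting step for the one-dimensional Schr\"{o}dinger equation with a potential, advanced with an FFT; the learned coefficients of the proposed framework are not reproduced here.

```python
import numpy as np

def strang_step(psi, V, dx, dt):
    """One Strang splitting step for i d(psi)/dt = -0.5 * psi_xx + V * psi:
    half a potential step, a full kinetic step in Fourier space, then
    another half potential step. Converges with second order in dt."""
    n = psi.size
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=dx)            # Fourier wavenumbers
    psi = np.exp(-0.5j * dt * V) * psi                    # half step: potential
    psi = np.fft.ifft(np.exp(-0.5j * dt * k**2) * np.fft.fft(psi))  # kinetic
    psi = np.exp(-0.5j * dt * V) * psi                    # half step: potential
    return psi

# Harmonic trap, Gaussian initial state; the norm is conserved up to round-off.
x = np.linspace(-10, 10, 512, endpoint=False)
dx = x[1] - x[0]
V = 0.5 * x**2
psi = np.exp(-x**2) / np.sqrt(np.sqrt(np.pi / 2))
for _ in range(1000):
    psi = strang_step(psi, V, dx, dt=1e-3)
print("norm:", np.sum(np.abs(psi)**2) * dx)
```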
Domain generalisation in computational histopathology is challenging because the images are substantially affected by differences among hospitals due to factors such as tissue fixation, staining, and imaging equipment. We hypothesise that focusing on nuclei can improve out-of-domain (OOD) generalisation in cancer detection. We propose a simple approach to improve OOD generalisation for cancer detection by focusing on nuclear morphology and organisation, as these are domain-invariant features critical to cancer detection. Our approach integrates original images with nuclear segmentation masks during training, encouraging the model to prioritise nuclei and their spatial arrangement. Going beyond mere data augmentation, we introduce a regularisation technique that aligns the representations of masks and original images. We show, using multiple datasets, that our method improves OOD generalisation and also leads to increased robustness to image corruptions and adversarial attacks. The source code is available at //github.com/undercutspiky/SFL/
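A minimal sketch of the general idea, assuming a generic encoder/classifier pair: the original image and a mask-restricted view share an encoder, and an auxiliary term pulls their representations together. The masking scheme, alignment loss, and weighting here are illustrative assumptions, not the released implementation.

```python
import torch
import torch.nn.functional as F

def training_step(encoder, classifier, image, nuclei_mask, label, align_weight=0.1):
    """One illustrative step: classify the original image, and pull its
    feature representation towards that of the mask-restricted view so the
    model attends to nuclear morphology and arrangement."""
    masked_view = image * nuclei_mask                 # keep only nuclear regions
    feat_img = encoder(image)
    feat_mask = encoder(masked_view)
    logits = classifier(feat_img)
    task_loss = F.cross_entropy(logits, label)
    align_loss = F.mse_loss(feat_img, feat_mask)      # representation alignment
    return task_loss + align_weight * align_loss

# Example wiring with toy shapes: a tiny CNN encoder and a linear classifier.
encoder = torch.nn.Sequential(
    torch.nn.Conv2d(3, 8, 3, padding=1), torch.nn.ReLU(),
    torch.nn.AdaptiveAvgPool2d(1), torch.nn.Flatten())
classifier = torch.nn.Linear(8, 2)
image = torch.rand(4, 3, 64, 64)
mask = (torch.rand(4, 1, 64, 64) > 0.5).float()
label = torch.randint(0, 2, (4,))
loss = training_step(encoder, classifier, image, mask, label)
loss.backward()
```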
The heterogeneity of treatment effect (HTE) lies at the heart of precision medicine. Randomized controlled trials are the gold standard for treatment effect estimation but are typically underpowered for detecting heterogeneous effects. In contrast, large observational studies have high predictive power but are often confounded due to the lack of randomization of treatment. We show that an observational study, even one subject to hidden confounding, may be used to empower trials in estimating the HTE using the notion of a confounding function. The confounding function summarizes the impact of unmeasured confounders on the difference between the observed treatment effect and the causal treatment effect, given the observed covariates, and is unidentifiable based on the observational study alone. Coupling the trial and the observational study, we show that the HTE and the confounding function are identifiable. We then derive the semiparametric efficient scores and the integrative estimators of the HTE and the confounding function. We clarify the conditions under which the integrative estimator of the HTE is strictly more efficient than the trial estimator. Finally, we illustrate the integrative estimators via simulation and an application.
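Schematically, writing $\tau(X)$ for the HTE, $A$ for the treatment, and $Y$ for the outcome, the confounding function described above can be expressed in the observational study as
\[
  \lambda(X) \;=\; \bigl\{\mathbb{E}[Y \mid A=1, X] - \mathbb{E}[Y \mid A=0, X]\bigr\} \;-\; \tau(X),
\]
which vanishes under no unmeasured confounding, is not identifiable from the observational data alone, and becomes identifiable, together with $\tau$, once the trial data are coupled in.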
We consider the stochastic heat equation driven by a multiplicative Gaussian noise that is white in time and homogeneous in space. Assuming that the spatial correlation function is given by a Riesz kernel of order $\alpha \in (0,1)$, we prove a central limit theorem for power variations and other related functionals of the solution. To our surprise, there is no asymptotic bias despite the low regularity of the noise coefficient in the multiplicative case. We trace this circumstance back to cancellation effects between the error terms arising naturally in second-order limit theorems for power variations.
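Schematically, for increments $\Delta_{i,n} u$ of the solution along a refining partition (temporal or spatial, as specified in the paper), the $p$-th power variation is
\[
  V_n^{(p)} \;=\; \sum_{i=1}^{n} \bigl|\Delta_{i,n} u\bigr|^{p},
\]
and the central limit theorem asserts that a suitably centred and rescaled version $a_n\bigl(V_n^{(p)} - \mathbb{E}\,V_n^{(p)}\bigr)$ converges in law to a centred Gaussian limit, with no additional asymptotic bias term.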