Neuro-evolutionary methods have proven effective in addressing a wide range of tasks. However, the study of the robustness and generalisability of evolved artificial neural networks (ANNs) has remained limited. This has immense implications in fields like robotics, where such controllers are used in control tasks: unexpected morphological or environmental changes during operation risk failure if the ANN controllers cannot handle these changes. This paper proposes an algorithm that aims to enhance the robustness and generalisability of such controllers by introducing morphological variations during the evolutionary process. As a result, it is possible to discover generalist controllers that handle a wide range of morphological variations sufficiently well, without requiring information about the morphologies or adaptation of their parameters. We perform an extensive experimental analysis in simulation that demonstrates the trade-off between specialist and generalist controllers. The results show that generalists are able to control a range of morphological variations at the cost of underperforming on any specific morphology relative to a specialist. This research contributes to the field by addressing the limited understanding of robustness and generalisability in neuro-evolutionary methods and proposes a method by which to improve these properties.
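A minimal sketch of the core idea as we read this abstract (not the authors' implementation): candidate controllers, represented here as flat parameter vectors, are scored by their average return across a set of morphological variants, so selection pressure favours generalists. The `rollout` simulator hook and all hyperparameters are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def generalist_fitness(controller, morphologies, rollout):
    """Average episodic return of one controller across morphological variants."""
    return np.mean([rollout(controller, m) for m in morphologies])

def evolve(population, morphologies, rollout, generations=100, elite_frac=0.2, sigma=0.1):
    """Simple elitist evolutionary loop with Gaussian mutation; fitness is
    aggregated over morphologies, which is what drives generalist behaviour."""
    n_elite = max(1, int(elite_frac * len(population)))
    for _ in range(generations):
        scores = np.array([generalist_fitness(c, morphologies, rollout) for c in population])
        elite = [population[i] for i in np.argsort(scores)[-n_elite:]]
        children = [elite[rng.integers(n_elite)] + sigma * rng.standard_normal(elite[0].shape)
                    for _ in range(len(population) - n_elite)]
        population = elite + children
    return max(population, key=lambda c: generalist_fitness(c, morphologies, rollout))
```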
In prediction settings where data are collected over time, it is often of interest to understand both the importance of variables for predicting the response at each time point and their importance summarized over the time series. Building on recent advances in estimation and inference for variable importance measures, we define summaries of variable importance trajectories. These measures can be estimated, and the same inferential approaches applied, regardless of the choice of algorithm(s) used to estimate the prediction function. We propose a nonparametric efficient estimation and inference procedure, as well as a null hypothesis testing procedure, that are valid even when complex machine learning tools are used for prediction. Through simulations, we demonstrate that our proposed procedures have good operating characteristics, and we illustrate their use by investigating the longitudinal importance of risk factors for suicide attempt.
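A naive plug-in sketch of a variable importance trajectory (the paper's efficient estimator additionally uses debiasing and sample splitting, which are omitted here): at each time point, importance is measured as the drop in predictiveness (here in-sample $R^2$) when the feature is removed, and the trajectory is then summarized, e.g. by its mean.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score

def vim_trajectory(X_by_time, y_by_time, feature_j):
    """Plug-in importance of feature_j at each time point: predictiveness
    of the full model minus that of a model refit without the feature."""
    traj = []
    for X, y in zip(X_by_time, y_by_time):
        full = RandomForestRegressor(random_state=0).fit(X, y)
        X_red = np.delete(X, feature_j, axis=1)
        reduced = RandomForestRegressor(random_state=0).fit(X_red, y)
        traj.append(r2_score(y, full.predict(X)) - r2_score(y, reduced.predict(X_red)))
    return np.array(traj)

# Summaries of the trajectory, e.g. its overall mean:
# traj = vim_trajectory(X_by_time, y_by_time, feature_j=3); traj.mean()
```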
We consider the estimation of the cumulative hazard function, and equivalently the distribution function, with censored data under a setup that preserves the privacy of the survival database. This is done through an $\alpha$-locally differentially private mechanism for the failure indicators and by proposing a non-parametric kernel estimator for the cumulative hazard function that remains consistent under the privatization. Under mild conditions, we also prove lower bounds for the minimax rates of convergence and show that the estimator is minimax optimal under a well-chosen bandwidth.
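An illustrative sketch of one standard $\alpha$-LDP mechanism for binary indicators, randomized response, together with the unbiased debiasing transform that a downstream kernel estimator could consume; the paper's actual mechanism and estimator may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

def privatize(delta, alpha):
    """Randomized response: each failure indicator is kept with
    probability p = e^alpha / (1 + e^alpha) and flipped otherwise,
    which satisfies alpha-local differential privacy."""
    p = np.exp(alpha) / (1.0 + np.exp(alpha))
    keep = rng.random(delta.shape) < p
    return np.where(keep, delta, 1 - delta), p

def debias(z, p):
    """Unbiased reconstruction: E[(Z - (1 - p)) / (2p - 1)] = delta, so
    Nelson-Aalen-type sums built from these values remain consistent."""
    return (z - (1.0 - p)) / (2.0 * p - 1.0)
```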
We propose a new method for the construction of layer-adapted meshes for singularly perturbed differential equations (SPDEs), based on mesh partial differential equations (MPDEs) that incorporate \emph{a posteriori} solution information. There are numerous studies on the development of parameter-robust numerical methods for SPDEs that depend on the layer-adapted mesh of Bakhvalov. In~\citep{HiMa2021}, a novel MPDE-based approach for constructing a generalisation of these meshes was proposed. As with most layer-adapted mesh methods, the algorithms in that article depended on detailed derivations of \emph{a priori} bounds on the SPDE's solution and its derivatives. In this work we extend that approach so that it instead uses \emph{a posteriori} computed estimates of the solution. We present detailed algorithms for the efficient implementation of the method, and numerical results for the robust solution of two-parameter reaction-convection-diffusion problems in one and two dimensions. We also provide full FEniCS code for a one-dimensional example.
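For intuition, here is a much simpler relative of the \emph{a posteriori} idea, not the authors' MPDE method: de Boor's equidistribution iteration, which redistributes mesh points so that a monitor function computed from a discrete solution is equidistributed over the cells.

```python
import numpy as np

def equidistribute(x, u, iters=10):
    """Redistribute mesh points x so that the arc-length monitor
    M = sqrt(1 + u'(x)^2), computed a posteriori from a discrete
    solution u, has equal integral over every cell (de Boor)."""
    for _ in range(iters):
        du = np.diff(u) / np.diff(x)
        M = np.sqrt(1.0 + du**2)                      # monitor on cells
        cell = M * np.diff(x)                         # monitor integral per cell
        F = np.concatenate([[0.0], np.cumsum(cell)])  # cumulative monitor
        targets = np.linspace(0.0, F[-1], len(x))
        x = np.interp(targets, F, x)                  # invert cumulative map
        # In a full solver, u would be recomputed on the new mesh here.
    return x
```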
We study the numerical approximation of a coupled hyperbolic-parabolic system by a family of discontinuous Galerkin space-time finite element methods. The model is rewritten as a first-order evolutionary problem that is treated by the unified abstract solution theory of R.\ Picard. To preserve the mathematical structure of the evolutionary equation on the fully discrete level, suitable generalizations of the distributional gradient and divergence operators are defined on the broken polynomial spaces on which the discontinuous Galerkin approach is built. Well-posedness of the fully discrete problem and error estimates for the discontinuous Galerkin approximation in space and time are proved.
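For orientation, Picard's solution theory concerns abstract evolutionary equations of the form (our summary of the standard setting; the paper's notation may differ)
\[
  \bigl(\partial_t M_0 + M_1 + A\bigr)\,U = F,
\]
where $M_0 = M_0^\ast \geq 0$ and $M_1$ are bounded material-law operators on a Hilbert space and $A$ is typically skew-self-adjoint, e.g. $A = \begin{pmatrix} 0 & \operatorname{div} \\ \nabla & 0 \end{pmatrix}$; well-posedness follows from a positive-definiteness condition on $\partial_t M_0 + M_1$ in exponentially weighted spaces.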
In this work, we present new proofs of convergence for Plug-and-Play (PnP) algorithms. PnP methods are efficient iterative algorithms for solving image inverse problems in which regularization is performed by plugging a pre-trained denoiser into a proximal algorithm, such as Proximal Gradient Descent (PGD) or Douglas-Rachford Splitting (DRS). Recent research has explored convergence by incorporating a denoiser that can be written exactly as a proximal operator. However, the corresponding PnP algorithm must then be run with a stepsize equal to $1$. The stepsize condition for nonconvex convergence of the proximal algorithm in use then translates into restrictive conditions on the regularization parameter of the inverse problem, which can severely degrade the restoration capacity of the algorithm. In this paper, we present two remedies for this limitation. First, we provide a novel convergence proof for PnP-DRS that does not impose any restrictions on the regularization parameter. Second, we examine a relaxed version of the PGD algorithm that converges across a broader range of regularization parameters. Our experimental study, conducted on deblurring and super-resolution tasks, demonstrates that both solutions enhance the accuracy of image restoration.
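A schematic of a relaxed PnP-PGD iteration, assuming a user-supplied data-fidelity gradient `grad_f` and a pre-trained `denoiser`; the exact relaxation and stepsize rules analysed in the paper may differ from this sketch.

```python
def pnp_pgd_relaxed(x0, grad_f, denoiser, step, lam=0.5, iters=100):
    """Relaxed Plug-and-Play proximal gradient descent: a gradient step
    on the data-fidelity term f, the plugged-in denoiser D in place of
    the proximal map, then a convex combination (relaxation) between
    the current and denoised iterates."""
    x = x0
    for _ in range(iters):
        z = denoiser(x - step * grad_f(x))  # PnP: denoiser replaces prox
        x = (1.0 - lam) * x + lam * z       # relaxation step
    return x
```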
In this work, a new class of vector-valued phase field models is presented, in which the values of the phase parameters are constrained by a convex set. The generated phase fields feature a partition of the domain into patches of distinct phases, separated by thin interfaces. The configuration and dynamics of the phases depend directly on the geometry and topology of the convex constraint set, which makes it possible to engineer models of this type that exhibit desired interactions and patterns. An efficient proximal gradient solver is introduced to numerically study their $L^2$-gradient flow, i.e.~the associated Allen-Cahn-type equation. Applying the solver together with various choices of the convex constraint set yields numerical results that feature a number of patterns observed in nature and engineering, such as multiphase grains in metal alloys, traveling waves in reaction-diffusion systems, and vortices in magnetic materials.
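A minimal sketch of a proximal (here: projected) gradient step for such a constrained $L^2$-gradient flow, taking the probability (Gibbs) simplex as one concrete choice of convex constraint set; `grad_energy` is an assumed user-supplied gradient of the smooth part of the phase-field energy.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of each row of v onto the probability simplex
    (the Gibbs simplex), via the standard sorting-based algorithm."""
    u = np.sort(v, axis=-1)[..., ::-1]
    css = np.cumsum(u, axis=-1) - 1.0
    k = np.arange(1, v.shape[-1] + 1)
    rho = (u - css / k > 0).sum(axis=-1, keepdims=True)
    theta = np.take_along_axis(css, rho - 1, axis=-1) / rho
    return np.maximum(v - theta, 0.0)

def proximal_gradient_flow(u, grad_energy, tau=1e-2, steps=1000):
    """Constrained L2-gradient flow (Allen-Cahn type): explicit gradient
    step on the smooth energy, then projection onto the constraint set."""
    for _ in range(steps):
        u = project_simplex(u - tau * grad_energy(u))
    return u
```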
Stochastic filtering is a vibrant area of research in both control theory and statistics, with broad applications in many scientific fields. Despite its extensive historical development, an effective method for joint parameter-state estimation in stochastic differential equations (SDEs) is still lacking. The state-of-the-art particle filtering methods suffer from either sample degeneracy or information loss, with both issues stemming from the dynamics of the particles generated to represent system parameters. This paper provides a novel and effective approach for joint parameter-state estimation in SDEs via Rao-Blackwellization and modularization. Our method operates in two layers: the first layer estimates the system states using a bootstrap particle filter, and the second layer marginalizes out the system parameters explicitly. This strategy circumvents the need to generate particles representing system parameters, thereby mitigating the associated problems of sample degeneracy and information loss. Moreover, our method employs a modularization approach when integrating out the parameters, which significantly reduces the computational complexity. Together, these designs account for the superior performance of our method. Finally, a numerical example is presented to illustrate that our method outperforms existing approaches by a large margin.
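A schematic of one generic Rao-Blackwellized particle-filter step, in which parameters are never sampled: each particle instead carries sufficient statistics of a conjugate parameter posterior. The hooks `propagate`, `loglik_marginal`, and `update_stats` are assumptions, and the paper's modularization strategy is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

def rbpf_step(particles, stats, y, propagate, loglik_marginal, update_stats):
    """One Rao-Blackwellized step: bootstrap proposal for the states,
    weights computed with parameters integrated out analytically, then
    resampling and a conjugate update of per-particle statistics."""
    particles = propagate(particles)                    # state transition
    logw = loglik_marginal(y, particles, stats)         # params marginalized
    w = np.exp(logw - logw.max()); w /= w.sum()
    idx = rng.choice(len(particles), size=len(particles), p=w)  # resample
    particles, stats = particles[idx], stats[idx]
    stats = update_stats(stats, y, particles)           # conjugate update
    return particles, stats
```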
In sampling-based Bayesian models of brain function, neural activities are assumed to be samples from probability distributions that the brain uses for probabilistic computation. However, a comprehensive understanding of how mechanistic models of neural dynamics can sample from arbitrary distributions is still lacking. We use tools from functional analysis and stochastic differential equations to explore the minimum architectural requirements for $\textit{recurrent}$ neural circuits to sample from complex distributions. We first consider the traditional sampling model consisting of a network of neurons whose outputs directly represent the samples (a sampler-only network), and argue that the synaptic-current and firing-rate dynamics in this model have limited capacity to sample from a complex probability distribution. In contrast, we show that the firing-rate dynamics of a recurrent neural circuit with a separate set of output units can sample from an arbitrary probability distribution. We call such circuits reservoir-sampler networks (RSNs). We propose an efficient training procedure based on denoising score matching that finds recurrent and output weights such that the RSN implements Langevin sampling. We empirically demonstrate our model's ability to sample from several complex data distributions using the proposed neural dynamics and discuss its applicability to developing the next generation of sampling-based brain models.
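A minimal sketch of the Langevin dynamics such a circuit is trained to implement, assuming a learned score function `score(x)` approximating $\nabla \log p(x)$ (e.g. fit by denoising score matching); the network realization through recurrent and output weights is abstracted away here.

```python
import numpy as np

rng = np.random.default_rng(0)

def langevin_sample(score, x0, eta=1e-3, steps=5000):
    """Unadjusted Langevin dynamics: drift along the learned score plus
    isotropic noise; for small eta the stationary distribution is
    approximately the target p."""
    x = x0
    for _ in range(steps):
        x = x + eta * score(x) + np.sqrt(2 * eta) * rng.standard_normal(x.shape)
    return x
```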
The study of time-inhomogeneous Markov jump processes is a traditional topic within probability theory that has recently attracted substantial attention in various applications. However, their flexibility also incurs a substantial mathematical burden, which is usually circumvented by using well-known generic distributional approximations or simulations. This article provides a novel approximation method that tailors the dynamics of a time-homogeneous Markov jump process to match those of its time-inhomogeneous counterpart on an increasingly fine Poisson grid. Strong convergence of the processes in terms of the Skorokhod $J_1$ metric is established, and convergence rates are provided. Under traditional regularity assumptions, distributional convergence to the same limit is established for unconditional proxies. Special attention is devoted to the case where the target process has one absorbing state and all remaining states transient, for which the absorption times also converge. Some applications are outlined, such as univariate hazard-rate density estimation, ruin probabilities, and multivariate phase-type density evaluation.
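A simulation sketch of the construction as we read the abstract: the generator of the approximating process is frozen at each point of a Poisson grid, making the process time-homogeneous between grid points. `Q_of_t` is an assumed function returning the generator matrix at a given time; all details are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def approx_inhomogeneous_mjp(Q_of_t, x0, T, grid_rate):
    """Simulate a jump process whose generator Q(t) is frozen at each
    point of a Poisson grid of rate grid_rate until the next point."""
    grid = [0.0]
    while grid[-1] < T:
        grid.append(grid[-1] + rng.exponential(1.0 / grid_rate))
    x, path = x0, [(0.0, x0)]
    for g0, g1 in zip(grid[:-1], grid[1:]):
        Q, t = Q_of_t(g0), g0               # frozen generator on [g0, g1)
        while True:
            rate = -Q[x, x]
            t_next = t + (rng.exponential(1.0 / rate) if rate > 0 else np.inf)
            if t_next >= min(g1, T):
                break                        # no jump before the next grid point
            p = np.maximum(Q[x], 0.0); p[x] = 0.0; p /= p.sum()
            x = rng.choice(len(p), p=p)      # jump to a new state
            path.append((t_next, x)); t = t_next
    return path
```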
We hypothesize that, due to the greedy nature of learning in multi-modal deep neural networks, these models tend to rely on just one modality while under-fitting the other modalities. Such behavior is counter-intuitive and hurts the models' generalization, as we observe empirically. To estimate the model's dependence on each modality, we compute the gain in accuracy when the model has access to that modality in addition to another one. We refer to this gain as the conditional utilization rate. In our experiments, we consistently observe an imbalance in conditional utilization rates between modalities, across multiple tasks and architectures. Since the conditional utilization rate cannot be computed efficiently during training, we introduce a proxy based on the pace at which the model learns from each modality, which we refer to as the conditional learning speed. We propose an algorithm that balances the conditional learning speeds between modalities during training and demonstrate that it indeed addresses the issue of greedy learning. The proposed algorithm improves the model's generalization on three datasets: Colored MNIST, Princeton ModelNet40, and NVIDIA Dynamic Hand Gesture.
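A sketch of how the conditional utilization rate could be measured for a two-modality classifier, using zero-masking to withhold a modality at evaluation time (the paper's exact protocol, e.g. retraining without the modality, may differ); the two-input `model(x1, x2)` signature is an assumption.

```python
import torch

@torch.no_grad()
def accuracy(model, loader, mask=None):
    """Top-1 accuracy; `mask` optionally zeroes out one modality so the
    model is evaluated with access to the remaining one only."""
    correct = total = 0
    for (x1, x2), y in loader:
        if mask == "m1": x1 = torch.zeros_like(x1)
        if mask == "m2": x2 = torch.zeros_like(x2)
        pred = model(x1, x2).argmax(dim=-1)
        correct += (pred == y).sum().item(); total += y.numel()
    return correct / total

def conditional_utilization_rate(model, loader):
    """u(m1 | m2) = acc(m1, m2) - acc(m2): the accuracy gained when
    modality m1 is made available on top of modality m2 alone."""
    return accuracy(model, loader) - accuracy(model, loader, mask="m1")
```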