Point source localisation is generally modelled as a Lasso-type problem on measures. However, optimisation methods in non-Hilbert spaces, such as the space of Radon measures, are much less developed than in Hilbert spaces. Most numerical algorithms for point source localisation are based on the Frank-Wolfe conditional gradient method, for which ad hoc convergence theory is developed. We develop extensions of proximal-type methods to spaces of measures. This includes forward-backward splitting, its inertial version, and primal-dual proximal splitting. Their convergence proofs follow standard patterns. We demonstrate their numerical efficacy.
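As context for this extension, the following is a minimal sketch of classical forward-backward splitting (ISTA) for a finite-dimensional Lasso problem; the operator `A`, data `b`, and weight `lam` are illustrative placeholders, not objects from the paper, which works in the space of Radon measures.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal map of t * ||.||_1 (the 'backward' step)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def forward_backward_lasso(A, b, lam, n_iter=500):
    """Classical ISTA for min_x 0.5*||Ax - b||^2 + lam*||x||_1.

    Shown only in the familiar Hilbert-space setting; the paper extends
    this forward-backward pattern to measures.
    """
    step = 1.0 / np.linalg.norm(A, 2) ** 2        # 1/L with L = ||A||_2^2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - b)                  # forward (gradient) step
        x = soft_threshold(x - step * grad, step * lam)  # backward (prox) step
    return x

# Hypothetical usage on random data:
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 200))
x_true = np.zeros(200); x_true[:5] = 1.0
b = A @ x_true + 0.01 * rng.standard_normal(50)
x_hat = forward_backward_lasso(A, b, lam=0.1)
```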
Quantum computing has recently emerged as a transformative technology. Yet, its promised advantages rely on efficiently translating quantum operations into viable physical realizations. In this work, we use generative machine learning models, specifically denoising diffusion models (DMs), to facilitate this transformation. Leveraging text-conditioning, we steer the model to produce desired quantum operations within gate-based quantum circuits. Notably, DMs make it possible to sidestep, during training, the exponential overhead inherent in the classical simulation of quantum dynamics -- a persistent bottleneck in preceding ML techniques. We demonstrate the model's capabilities across two tasks: entanglement generation and unitary compilation. The model excels at generating new circuits and supports typical DM extensions such as masking and editing to, for instance, align the circuit generation to the constraints of the targeted quantum device. Given their flexibility and generalization abilities, we envision DMs as pivotal in quantum circuit synthesis, enhancing not only practical applications but also insights into theoretical quantum computation.
Interior point methods (IPMs) that handle nonconvex constraints such as IPOPT, KNITRO and LOQO have had enormous practical success. We consider IPMs in the setting where the objective and constraints are thrice differentiable, and have Lipschitz first and second derivatives on the feasible region. We provide an IPM that, starting from a strictly feasible point, finds a $\mu$-approximate Fritz John point by solving $\mathcal{O}( \mu^{-7/4})$ trust-region subproblems. For IPMs that handle nonlinear constraints, this result represents the first iteration bound with a polynomial dependence on $1/\mu$. We also show how to use our method to find scaled-KKT points starting from an infeasible solution and improve on existing complexity bounds.
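For orientation, one common way to state approximate Fritz John conditions for a problem of the form $\min f(x)$ subject to $c(x) \ge 0$ is sketched below; the precise scaling and norm used in the paper may differ.
\[
\exists\,(\lambda_0,\lambda) \ge 0,\ (\lambda_0,\lambda) \neq 0:\qquad
\Big\|\lambda_0 \nabla f(x) - \textstyle\sum_i \lambda_i \nabla c_i(x)\Big\| \le \mu,
\qquad \lambda_i\, c_i(x) \le \mu \ \text{ for all } i,
\qquad c(x) \ge 0 .
\]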
We study two classic variants of block-structured integer programming. Two-stage stochastic programs are integer programs of the form $\{A_i \mathbf{x} + D_i \mathbf{y}_i = \mathbf{b}_i\textrm{ for all }i=1,\ldots,n\}$, where $A_i$ and $D_i$ are bounded-size matrices. On the other hand, $n$-fold programs are integer programs of the form $\{{\sum_{i=1}^n C_i\mathbf{y}_i=\mathbf{a}} \textrm{ and } D_i\mathbf{y}_i=\mathbf{b}_i\textrm{ for all }i=1,\ldots,n\}$, where again $C_i$ and $D_i$ are bounded-size matrices. It is known that solving these kinds of programs is fixed-parameter tractable when parameterized by the maximum dimension among the relevant matrices $A_i,C_i,D_i$ and the maximum absolute value of any entry appearing in the constraint matrix. We show that the parameterized tractability results for two-stage stochastic and $n$-fold programs persist even when one allows large entries in the global part of the program. More precisely, we prove that:
- The feasibility problem for two-stage stochastic programs is fixed-parameter tractable when parameterized by the dimensions of matrices $A_i,D_i$ and by the maximum absolute value of the entries of matrices $D_i$. That is, we allow matrices $A_i$ to have arbitrarily large entries.
- The linear optimization problem for $n$-fold integer programs that are uniform -- all matrices $C_i$ are equal -- is fixed-parameter tractable when parameterized by the dimensions of matrices $C_i$ and $D_i$ and by the maximum absolute value of the entries of matrices $D_i$. That is, we require that $C_i=C$ for all $i=1,\ldots,n$, but we allow $C$ to have arbitrarily large entries.
In the second result, the uniformity assumption is necessary; otherwise the problem is $\mathsf{NP}$-hard already when the parameters take constant values. Both our algorithms are weakly polynomial: the running time is polynomial in the total bitsize of the input.
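For readability, the two constraint structures described above correspond (up to row and column permutations) to constraint matrices of the following block shapes, with variables $(\mathbf{x},\mathbf{y}_1,\ldots,\mathbf{y}_n)$ in the first case and $(\mathbf{y}_1,\ldots,\mathbf{y}_n)$ in the second:
\[
\text{two-stage stochastic:}\quad
\begin{pmatrix}
A_1 & D_1 &     &        &     \\
A_2 &     & D_2 &        &     \\
\vdots &  &     & \ddots &     \\
A_n &     &     &        & D_n
\end{pmatrix},
\qquad
n\text{-fold:}\quad
\begin{pmatrix}
C_1 & C_2 & \cdots & C_n \\
D_1 &     &        &     \\
    & D_2 &        &     \\
    &     & \ddots &     \\
    &     &        & D_n
\end{pmatrix}.
\]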
In prediction settings where data are collected over time, it is often of interest to understand both the importance of variables for predicting the response at each time point and the importance summarized over the time series. Building on recent advances in estimation and inference for variable importance measures, we define summaries of variable importance trajectories. These measures can be estimated, and the same inference approaches can be applied, regardless of the choice of algorithm(s) used to estimate the prediction function. We propose a nonparametric efficient estimation and inference procedure as well as a null hypothesis testing procedure that are valid even when complex machine learning tools are used for prediction. Through simulations, we demonstrate that our proposed procedures have good operating characteristics, and we illustrate their use by investigating the longitudinal importance of risk factors for suicide attempt.
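As one illustration of such a summary (the notation is illustrative and not taken from the paper), a trajectory of pointwise importances can be averaged over the time points:
\[
\bar\psi_j \;=\; \frac{1}{T}\sum_{t=1}^{T}\psi_j(t),
\]
where $\psi_j(t)$ denotes the importance of variable $j$ for predicting the response at time $t$ and $\bar\psi_j$ summarizes its importance over the series.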
This work is concerned with the analysis of a space-time finite element discontinuous Galerkin method on polytopal meshes (XT-PolydG) for the numerical discretization of wave propagation in coupled poroelastic-elastic media. The mathematical model consists of Biot's low-frequency equations in the poroelastic medium and the elastodynamics equation in the elastic one. To realize the coupling, suitable transmission conditions on the interface between the two domains are (weakly) embedded in the formulation. The proposed PolydG discretization in space is then coupled with a dG time integration scheme, resulting in a full space-time dG discretization. We present the stability analysis for both the continuous and the semidiscrete formulations, and we derive error estimates for the semidiscrete formulation in a suitable energy norm. The method is applied to a wide set of numerical test cases to verify the theoretical bounds. Examples of physical interest are also presented to investigate the capability of the proposed method in relevant geophysical scenarios.
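For reference, one widely used displacement-based form of the low-frequency Biot system, in terms of the solid displacement $\mathbf{u}$, the filtration displacement $\mathbf{w}$ and the pore pressure $p$, reads roughly as follows; the coefficients and the exact formulation adopted in the paper may differ.
\[
\rho\,\partial_{tt}\mathbf{u} + \rho_f\,\partial_{tt}\mathbf{w} - \nabla\cdot\boldsymbol{\sigma}(\mathbf{u},p) = \mathbf{f},
\qquad
\rho_f\,\partial_{tt}\mathbf{u} + \rho_w\,\partial_{tt}\mathbf{w} + \frac{\eta}{k}\,\partial_t\mathbf{w} + \nabla p = \mathbf{g},
\]
\[
\boldsymbol{\sigma}(\mathbf{u},p) = \boldsymbol{\sigma}_e(\mathbf{u}) - \alpha\,p\,\mathbf{I},
\qquad
p = -M\left(\alpha\,\nabla\cdot\mathbf{u} + \nabla\cdot\mathbf{w}\right),
\]
where $\boldsymbol{\sigma}_e$ is the elastic stress, $\alpha$ and $M$ are the Biot-Willis and Biot moduli, $\eta/k$ models viscous dissipation, and $\rho_w$ is an effective fluid density; in the elastic subdomain the standard elastodynamics equation $\rho\,\partial_{tt}\mathbf{u} - \nabla\cdot\boldsymbol{\sigma}_e(\mathbf{u}) = \mathbf{f}$ is used.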
We study Whitney-type estimates for the approximation of convex functions in the uniform norm on various convex multivariate domains, paying particular attention to the dependence of the involved constants on the dimension and the geometry of the domain.
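As a point of reference, the classical univariate Whitney inequality (stated here in one standard form; normalisations vary across the literature) bounds the error of best polynomial approximation by a modulus of smoothness:
\[
E_{r-1}(f)_{C[a,b]} \;:=\; \inf_{\deg P < r}\,\|f - P\|_{C[a,b]} \;\le\; C(r)\,\omega_r(f,\,b-a)_{[a,b]},
\]
and the question studied here is how the constants in multivariate analogues for convex functions depend on the dimension and on the geometry of the convex domain.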
In sampling-based Bayesian models of brain function, neural activities are assumed to be samples from probability distributions that the brain uses for probabilistic computation. However, a comprehensive understanding of how mechanistic models of neural dynamics can sample from arbitrary distributions is still lacking. We use tools from functional analysis and stochastic differential equations to explore the minimum architectural requirements for $\textit{recurrent}$ neural circuits to sample from complex distributions. We first consider the traditional sampling model consisting of a network of neurons whose outputs directly represent the samples (sampler-only network). We argue that synaptic current and firing-rate dynamics in the traditional model have limited capacity to sample from a complex probability distribution. We show that the firing-rate dynamics of a recurrent neural circuit with a separate set of output units can sample from an arbitrary probability distribution. We call such circuits reservoir-sampler networks (RSNs). We propose an efficient training procedure based on denoising score matching that finds recurrent and output weights such that the RSN implements Langevin sampling. We empirically demonstrate our model's ability to sample from several complex data distributions using the proposed neural dynamics and discuss its applicability to developing the next generation of sampling-based brain models.
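A minimal sketch of the Langevin sampling mechanism referred to above, assuming a score network already trained by denoising score matching; the network interface, step size and step count are placeholder assumptions rather than the paper's architecture.

```python
import torch

def langevin_sample(score_net, n_samples, dim, n_steps=1000, step=1e-3):
    """Unadjusted Langevin dynamics driven by a learned score.

    Update: x <- x + step * score(x) + sqrt(2 * step) * noise,
    where score(x) approximates grad_x log p(x).
    """
    x = torch.randn(n_samples, dim)          # initialise from a broad Gaussian
    for _ in range(n_steps):
        with torch.no_grad():
            drift = score_net(x)             # learned approximation of the score
        noise = torch.randn_like(x)
        x = x + step * drift + (2.0 * step) ** 0.5 * noise
    return x

# Hypothetical usage, given some trained network `score_net`:
# samples = langevin_sample(score_net, n_samples=512, dim=2)
```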
Generalized linear models (GLMs) are routinely used for modeling relationships between a response variable and a set of covariates. The simple form of a GLM comes with easy interpretability, but also leads to concerns about model misspecification impacting inferential conclusions. A popular semi-parametric solution adopted in the frequentist literature is quasi-likelihood, which improves robustness by only requiring correct specification of the first two moments. We develop a robust approach to Bayesian inference in GLMs through quasi-posterior distributions. We show that quasi-posteriors provide a coherent generalized Bayes inference method, while also approximating so-called coarsened posteriors. In so doing, we obtain new insights into the choice of coarsening parameter. Asymptotically, the quasi-posterior converges in total variation to a normal distribution and has important connections with the loss-likelihood bootstrap posterior. We demonstrate that it is also well-calibrated in terms of frequentist coverage. Moreover, the loss-scale parameter has a clear interpretation as a dispersion, and this leads to a consolidated method of moments estimator.
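To fix notation, a generalized-Bayes quasi-posterior of the kind considered here can be written (in one common parameterisation; the paper's scaling may differ) as
\[
\pi_\omega(\beta \mid y) \;\propto\; \pi(\beta)\,\exp\{\omega\,Q(\beta; y)\},
\qquad
\frac{\partial Q}{\partial \mu_i} \;=\; \frac{y_i - \mu_i}{\phi\,V(\mu_i)},
\]
where $Q$ is the quasi-log-likelihood implied by the assumed mean-variance relationship $\mathrm{Var}(y_i) = \phi\,V(\mu_i)$, $\pi(\beta)$ is the prior, and $\omega > 0$ is the loss-scale parameter whose calibration and dispersion interpretation are discussed above.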
We propose a new variable selection procedure for a functional linear model with multiple scalar responses and multiple functional predictors. The method is based on basis expansions of the functional predictors and coefficients, which lead to a multivariate linear regression model. We then introduce a criterion by means of which the variable selection problem reduces to that of estimating a suitable set, and we estimate this set through appropriate penalizations of estimates of the criterion, which yields our proposal. A simulation study is provided to investigate the effectiveness of the proposed approach and to compare it with existing methods.
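For concreteness (with illustrative notation rather than the paper's), a functional linear model with $q$ scalar responses and $p$ functional predictors, together with the basis expansion of its coefficient functions, can be written as
\[
Y_k \;=\; \alpha_k + \sum_{j=1}^{p}\int_{\mathcal{T}} X_j(t)\,\beta_{jk}(t)\,dt + \varepsilon_k,
\qquad
\beta_{jk}(t) \;\approx\; \sum_{m=1}^{M} b_{jkm}\,\phi_m(t),
\qquad k=1,\ldots,q,
\]
so that expanding the predictors in a basis as well reduces the model to a multivariate linear regression in the coefficients $b_{jkm}$, on which the selection criterion and its penalized estimates operate.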
Novel categories are commonly defined as those unobserved during training but present during testing. However, partially labelled training datasets can contain unlabelled training samples that belong to novel categories, meaning these categories can be present in both training and testing. This research is the first to generalise between what we call observed-novel and unobserved-novel categories within a new learning policy called open-set learning with augmented category by exploiting unlabelled data, or Open-LACU. After surveying existing learning policies, we introduce Open-LACU as a unified policy encompassing positive and unlabelled learning, semi-supervised learning and open-set recognition. Subsequently, we develop the first Open-LACU model using an algorithmic training process drawn from these related research fields. The proposed Open-LACU classifier achieves state-of-the-art and first-of-its-kind results.