Targeted Maximum Likelihood Estimation (TMLE) is increasingly used for doubly robust causal inference, but how missing data should be handled when using TMLE with data-adaptive approaches is unclear. Based on the Victorian Adolescent Health Cohort Study, we conducted a simulation study to evaluate eight missing data methods in this context: complete-case analysis, extended TMLE incorporating an outcome-missingness model, the missing covariate missing indicator method, and five multiple imputation (MI) approaches using parametric or machine-learning models. Six scenarios were considered, varying in exposure/outcome generation models (presence of confounder-confounder interactions) and missingness mechanisms (whether the outcome influenced missingness in other variables and presence of interaction/non-linear terms in missingness models). Complete-case analysis and extended TMLE had small biases when the outcome did not influence missingness in other variables. Parametric MI without interactions had large bias when exposure/outcome generation models included interactions. Parametric MI including interactions performed best in bias and variance reduction across all settings, except when missingness models included a non-linear term. When choosing a method to handle missing data in the context of TMLE, researchers must consider the missingness mechanism and, for MI, compatibility with the analysis method. In many settings, a parametric MI approach that incorporates interactions and non-linearities is expected to perform well.
Estimating matrices in the symmetric positive-definite (SPD) cone is of interest for many applications ranging from computer vision to graph learning. While there exist various convex optimization-based estimators, they remain limited in expressivity due to their model-based approach. The success of deep learning has thus led many to use neural networks to learn to estimate SPD matrices in a data-driven fashion. For learning structured outputs, one promising strategy involves architectures designed by unrolling iterative algorithms, which potentially benefit from inductive bias properties. However, designing correct unrolled architectures for SPD learning is difficult: they either do not guarantee that their output has all the desired properties, rely on heavy computations, or are overly restricted to specific matrices, which hinders their expressivity. In this paper, we propose a novel and generic learning module with guaranteed SPD outputs, called SpodNet, that also enables learning a larger class of functions than existing approaches. Notably, it solves the challenging task of jointly learning SPD and sparse matrices. Our experiments demonstrate the versatility of SpodNet layers.
The Spatial AutoRegressive model (SAR) is commonly used in studies involving spatial and network data to estimate the spatial or network peer influence and the effects of covariates on the response, taking into account the spatial or network dependence. While the model can be efficiently estimated with a quasi-maximum likelihood approach (QMLE), the detrimental effect of covariate measurement error on the QMLE and how to remedy it are currently unknown. If covariates are measured with error, then the QMLE may not attain the $\sqrt{n}$ convergence rate and may be inconsistent even when a node is influenced by only a limited number of other nodes or spatial units. We develop a measurement error-corrected ML estimator (ME-QMLE) for the parameters of the SAR model when covariates are measured with error. The ME-QMLE possesses statistical consistency and asymptotic normality properties. We consider two types of applications. The first is when the true covariate cannot be measured directly, and a proxy is observed instead. The second one involves including latent homophily factors estimated with error from the network for estimating peer influence. Our numerical results verify the bias correction property of the estimator and the accuracy of the standard error estimates in finite samples. We illustrate the method on a real dataset related to county-level death rates from the COVID-19 pandemic.
A Bayesian data assimilation scheme is formulated for advection-dominated advective and diffusive evolutionary problems, based upon the Dynamic Likelihood (DLF) approach to filtering. The DLF was developed specifically for hyperbolic problems (waves), and in this paper it is extended, via a split-step formulation, to handle advection-diffusion problems. In the dynamic likelihood approach, observations and their statistics are used to propagate probabilities along characteristics, evolving the likelihood in time. The estimated posterior thus inherits phase information. For advection-diffusion, the advective part of the time evolution is handled on the basis of observations alone, while the diffusive part is informed through the model as well as observations. We expect, and indeed show here, that in advection-dominated problems the DLF approach produces better estimates than other assimilation approaches, particularly when the observations are sparse and have low uncertainty. The added computational expense of the method is cubic in the total number of observations over time, which is on the same order of magnitude as a standard Kalman filter and can be mitigated by bounding the number of forward propagated observations, discarding the least informative data.
Digital MemComputing machines (DMMs), which employ nonlinear dynamical systems with memory (time non-locality), have proven to be a robust and scalable unconventional computing approach for solving a wide variety of combinatorial optimization problems. However, most of the research so far has focused on the numerical simulations of the equations of motion of DMMs. This inevitably subjects time to discretization, which brings its own (numerical) issues that would be otherwise absent in actual physical systems operating in continuous time. Although hardware realizations of DMMs have been previously suggested, their implementation would require materials and devices that are not so easy to integrate with traditional electronics. Addressing this, our study introduces a novel hardware design for DMMs, utilizing readily available electronic components. This approach not only significantly boosts computational speed compared to current models but also exhibits remarkable robustness against additive noise. Crucially, it circumvents the limitations imposed by numerical noise, ensuring enhanced stability and reliability during extended operations. This paves a new path for tackling increasingly complex problems, leveraging the inherent advantages of DMMs in a more practical and accessible framework.
The literature shows the possible existence of a collinearity problem in both the Nelson-Siegel and Nelson-Siegel-Svensson models due to the relationship between the slope and curvature components. When this problem is present and both models are estimated by Ordinary Least Squares, the coefficient estimates may be unstable, among other consequences. However, these estimates are used to make monetary policy decisions. For this reason, it is important to try to mitigate this collinearity problem. Consequently, some authors propose traditional procedures for the treatment of collinearity, such as non-linear optimisation, fixing the shape parameter, or ridge regression. Nevertheless, all these procedures have their disadvantages. Alternatively, a new method with good properties called raise regression is proposed in this paper. Finally, the methodologies are illustrated with an empirical comparison on Euribor Overnight Index Swap and Euribor Interest Rates Swap data between 2011 and 2021.
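The link between the slope and curvature components can be made concrete with a small numerical sketch. It assumes the common Nelson-Siegel parametrisation with loadings $(1-e^{-\tau/\lambda})/(\tau/\lambda)$ for the slope and that same quantity minus $e^{-\tau/\lambda}$ for the curvature; the maturity grid and the values of the shape parameter `lam` are hypothetical and are not taken from the paper's data.

```python
import math

def ns_loadings(tau, lam):
    """Slope and curvature loadings at maturity tau > 0 (assumed parametrisation)."""
    x = tau / lam
    slope = -math.expm1(-x) / x        # (1 - exp(-x)) / x, computed stably
    curvature = slope - math.exp(-x)
    return slope, curvature

def pearson(a, b):
    """Plain Pearson correlation between two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)

def loading_correlation(maturities, lam):
    """Correlation between the slope and curvature regressors on a maturity grid."""
    pairs = [ns_loadings(t, lam) for t in maturities]
    return pearson([p[0] for p in pairs], [p[1] for p in pairs])

# Hypothetical maturity grid in years and hypothetical shape parameters.
maturities = [0.25, 0.5, 1, 2, 3, 5, 7, 10, 15, 20, 30]
for lam in (0.5, 1.5, 5.0):
    print(lam, round(loading_correlation(maturities, lam), 3))
```

A correlation close to 1 in absolute value for a given `lam` indicates that the two regressors are nearly collinear on that grid, which is the situation the mitigation methods discussed above are meant to address.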
Universal adversarial perturbation (UAP), also known as image-agnostic perturbation, is a fixed perturbation map that can fool the classifier with high probability on arbitrary images, making it more practical for attacking deep models in the real world. Previous UAP methods generate a scale-fixed and texture-fixed perturbation map for all images, which ignores the multi-scale objects in images and usually results in a low fooling ratio. Since the widely used convolutional neural networks tend to classify objects according to semantic information stored in local textures, it seems reasonable and intuitive to improve the UAP by utilizing local contents effectively. In this work, we find that the fooling ratios significantly increase when we add a constraint to encourage a small-scale UAP map and repeat it vertically and horizontally to fill the whole image domain. To this end, we propose texture scale-constrained UAP (TSC-UAP), a simple yet effective UAP enhancement method that automatically generates UAPs with category-specific local textures that can fool deep models more easily. Through a low-cost operation that restricts the texture scale, TSC-UAP achieves a considerable improvement in the fooling ratio and attack transferability for both data-dependent and data-free UAP methods. Experiments conducted on two state-of-the-art UAP methods, eight popular CNN models and four classical datasets show the remarkable performance of TSC-UAP.
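The tiling step described above (repeating a small-scale map vertically and horizontally to fill the image domain) can be sketched as follows. This is only an illustration of the tiling operation on nested lists; the function name, the patch values and the image size are hypothetical, and the optimisation that actually produces the small-scale map is not shown.

```python
def tile_perturbation(patch, height, width):
    """Repeat a small 2-D patch in both directions and crop to (height, width)."""
    patch_h, patch_w = len(patch), len(patch[0])
    return [
        [patch[i % patch_h][j % patch_w] for j in range(width)]
        for i in range(height)
    ]

# Hypothetical 2x2 small-scale map, tiled over a 5x6 image domain.
patch = [[0.5, -0.5],
         [-0.5, 0.5]]
uap = tile_perturbation(patch, 5, 6)
for row in uap:
    print(row)
```

Because the full-size map is a deterministic function of the small patch, only the patch entries need to be learned, which is one way to read the "low-cost operation" mentioned in the abstract.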
We consider a convex constrained Gaussian sequence model and characterize necessary and sufficient conditions for the least squares estimator (LSE) to be optimal in a minimax sense. For a closed convex set $K\subset \mathbb{R}^n$ we observe $Y=\mu+\xi$ for $\xi\sim N(0,\sigma^2\mathbb{I}_n)$ and $\mu\in K$ and aim to estimate $\mu$. We characterize the worst case risk of the LSE in multiple ways by analyzing the behavior of the local Gaussian width on $K$. We demonstrate that optimality is equivalent to a Lipschitz property of the local Gaussian width mapping. We also provide theoretical algorithms that search for the worst case risk. We then provide examples showing optimality or suboptimality of the LSE on various sets, including $\ell_p$ balls for $p\in[1,2]$, pyramids, solids of revolution, and multivariate isotonic regression, among others.
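For intuition, the LSE in this model is the Euclidean projection of $Y$ onto $K$. The sketch below takes $K$ to be an $\ell_2$ ball (the $p=2$ case of the $\ell_p$ examples), where the projection has a closed form, and estimates the risk at a fixed $\mu$ by simulation. The radius, noise level, number of trials and function names are hypothetical choices for illustration; this is not one of the paper's algorithms for finding the worst case risk.

```python
import math
import random

def lse_l2_ball(y, radius):
    """Least squares estimator for K = {v : ||v||_2 <= radius}: project y onto K."""
    norm = math.sqrt(sum(v * v for v in y))
    if norm <= radius:
        return list(y)
    scale = radius / norm
    return [scale * v for v in y]

def monte_carlo_risk(mu, radius, sigma, trials, rng):
    """Monte Carlo estimate of E||LSE(Y) - mu||^2 for Y = mu + sigma * xi."""
    total = 0.0
    for _ in range(trials):
        y = [m + rng.gauss(0.0, sigma) for m in mu]
        estimate = lse_l2_ball(y, radius)
        total += sum((e - m) ** 2 for e, m in zip(estimate, mu))
    return total / trials

rng = random.Random(0)          # fixed seed, for reproducibility only
mu = [0.0] * 10                 # hypothetical true mean, inside the unit ball
print(monte_carlo_risk(mu, radius=1.0, sigma=1.0, trials=2000, rng=rng))
```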
There has been a resurgence of interest in the asymptotic normality of incomplete U-statistics that only sum over roughly as many kernel evaluations as there are data samples, due to their computational efficiency and usefulness in quantifying the uncertainty for ensemble-based predictions. In this paper, we focus on the normal convergence of one such construction, the incomplete U-statistic with Bernoulli sampling, based on a raw sample of size $n$ and a computational budget $N$ of the same order as $n$. Under a minimalistic third moment assumption on the kernel, we offer an accompanying Berry-Esseen bound of the natural rate $1/\sqrt{\min(N, n)}$ that characterizes the normal approximating accuracy involved. Our key techniques include Stein's method specialized for the so-called Studentized nonlinear statistics, and an exponential lower tail bound for non-negative kernel U-statistics.
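A minimal sketch of the Bernoulli-sampling construction, under stated assumptions: the kernel has order 2, each pair is kept independently with probability $p = N / \binom{n}{2}$, and the sum is normalised by the realised number of selected pairs (some formulations divide by $N$ instead). The function name and the variance kernel used in the example are illustrative choices, not the paper's notation.

```python
import itertools
import math
import random

def incomplete_u_stat(sample, kernel, budget, rng):
    """Incomplete U-statistic of order 2 with Bernoulli sampling of pairs."""
    n = len(sample)
    p = min(1.0, budget / math.comb(n, 2))   # inclusion probability per pair
    total, count = 0.0, 0
    for x, y in itertools.combinations(sample, 2):
        if rng.random() < p:                  # keep this pair with probability p
            total += kernel(x, y)
            count += 1
    if count == 0:
        raise ValueError("no pairs were selected; increase the budget")
    return total / count                      # assumption: normalise by realised count

def variance_kernel(x, y):
    """Order-2 kernel whose complete U-statistic is the unbiased sample variance."""
    return 0.5 * (x - y) ** 2

rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(200)]
# Budget N of the same order as n: about n kernel evaluations instead of n(n-1)/2.
print(incomplete_u_stat(data, variance_kernel, budget=len(data), rng=rng))
```

Setting the budget to $\binom{n}{2}$ makes $p = 1$, in which case the function reduces to the complete U-statistic.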
Physical Unclonable Functions (PUFs) are emerging as promising security primitives for IoT devices, providing device fingerprints based on physical characteristics. Despite their strengths, PUFs are vulnerable to machine learning (ML) attacks, including conventional and reliability-based attacks. Conventional ML attacks have been effective in revealing vulnerabilities of many PUFs, and reliability-based ML attacks are more powerful tools that have detected vulnerabilities of some PUFs that are resistant to conventional ML attacks. Since reliability-based ML attacks leverage information about PUFs' unreliability, we examined the feasibility of building a defense using reliability-enhancing techniques, and discovered that majority voting with a reasonably high number of repeats provides an effective defense against existing reliability-based ML attack methods. Because majority voting is known to reduce but not eliminate unreliability, we were motivated to investigate whether new attack methods exist that can capture the low unreliability of highly but not perfectly reliable PUFs. This led to the development of a new reliability representation and a new representation-enabled attack method that has experimentally cracked PUFs enhanced with majority voting of high repetitions.
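The statement that majority voting reduces but does not eliminate unreliability can be illustrated with a simplified model. The sketch assumes that each evaluation of a response bit flips independently with a fixed probability `flip_prob` and that the number of repeats is odd; real PUF noise depends on the challenge and the device, so both the model and the function names are hypothetical.

```python
import math

def majority_vote(bits):
    """Return the majority bit of an odd-length list of 0/1 values."""
    return 1 if 2 * sum(bits) > len(bits) else 0

def vote_error_probability(flip_prob, repeats):
    """Probability that the majority of `repeats` noisy readings is wrong.

    Assumes independent flips with probability flip_prob and odd repeats.
    """
    if repeats % 2 == 0:
        raise ValueError("use an odd number of repeats to avoid ties")
    return sum(
        math.comb(repeats, k) * flip_prob ** k * (1 - flip_prob) ** (repeats - k)
        for k in range(repeats // 2 + 1, repeats + 1)
    )

# Residual unreliability shrinks with more repeats but stays strictly positive.
for repeats in (1, 3, 11, 51):
    print(repeats, vote_error_probability(0.1, repeats))
```

Under this model the residual error probability is small but non-zero for any finite number of repeats, which is the kind of low unreliability the new attack method described above aims to capture.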
Spiking Neural Networks (SNNs) are characterised by their unique temporal dynamics, but the properties and advantages of such computations are still not well understood. To provide answers, in this work we demonstrate how spiking neurons can enable temporal feature extraction in feed-forward neural networks without the need for recurrent synapses, and how recurrent SNNs can achieve results comparable to LSTM with a smaller number of parameters. This shows how their bio-inspired computing principles can be successfully exploited beyond energy efficiency gains and evidences their differences with respect to conventional artificial neural networks. These results are obtained through a new task, DVS-Gesture-Chain (DVS-GC), which allows, for the first time, the evaluation of the perception of temporal dependencies in a real event-based action recognition dataset. Our study shows how the widely used DVS Gesture benchmark can be solved by networks without temporal feature extraction when its events are accumulated in frames, unlike the new DVS-GC, which demands an understanding of the order in which events happen. Furthermore, this setup allowed us to reveal the role of the leakage rate in spiking neurons for temporal processing tasks and to demonstrate the benefits of "hard reset" mechanisms. Additionally, we show how time-dependent weights and normalization can lead to understanding order by means of temporal attention.
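The roles of the leakage rate and of the reset mechanism can be seen in a minimal discrete-time leaky integrate-and-fire neuron. This is a generic textbook-style sketch, not the neuron model or training setup of the paper: the update rule, the parameter names `leak` and `threshold`, and the input sequences are assumptions made for illustration.

```python
def lif_run(inputs, leak, threshold=1.0, hard_reset=True):
    """Simulate one leaky integrate-and-fire neuron and return its spike train.

    leak in [0, 1] scales the membrane potential at each step (1.0 = no leak).
    hard_reset=True sets the potential to 0 after a spike; otherwise the
    threshold is subtracted (a "soft reset").
    """
    potential = 0.0
    spikes = []
    for current in inputs:
        potential = leak * potential + current
        if potential >= threshold:
            spikes.append(1)
            potential = 0.0 if hard_reset else potential - threshold
        else:
            spikes.append(0)
    return spikes

# Without leak the two inputs add up to a spike; with strong leak they do not.
print(lif_run([0.6, 0.6, 0.0, 0.0], leak=1.0))
print(lif_run([0.6, 0.6, 0.0, 0.0], leak=0.5))
# Hard and soft reset differ in how much past input carries over after a spike.
print(lif_run([0.6, 0.6, 0.9], leak=1.0, hard_reset=True))
print(lif_run([0.6, 0.6, 0.9], leak=1.0, hard_reset=False))
```

Because the membrane potential carries information from earlier time steps, the output depends on when inputs arrive, even though the neuron has no recurrent connections.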