Generalized cross-validation (GCV) is a widely used method for estimating the squared out-of-sample prediction risk that applies a scalar degrees-of-freedom adjustment (in a multiplicative sense) to the squared training error. In this paper, we examine the consistency of GCV for estimating the prediction risk of arbitrary ensembles of penalized least-squares estimators. We show that GCV is inconsistent for any finite ensemble of size greater than one. Towards repairing this shortcoming, we identify a correction that involves an additional scalar adjustment (in an additive sense) based on degrees-of-freedom-adjusted training errors from each ensemble component. The proposed estimator (termed CGCV) maintains the computational advantages of GCV and requires no sample splitting, model refitting, or out-of-bag risk estimation. The estimator stems from a finer inspection of the ensemble risk decomposition and two intermediate risk estimators for the components in this decomposition. We provide a non-asymptotic analysis of CGCV and the two intermediate risk estimators for ensembles of convex penalized estimators under Gaussian features and a linear response model. Furthermore, in the special case of ridge regression, we extend the analysis to general feature and response distributions using random matrix theory, which establishes model-free uniform consistency of CGCV.
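As a point of reference (notation assumed here, not taken from the paper): for a linear smoother with fitted values $\hat y = L y$, GCV inflates the training error by a single multiplicative degrees-of-freedom factor,
$$ \mathrm{GCV} = \frac{\tfrac{1}{n}\|y - \hat y\|_2^2}{\bigl(1 - \mathrm{df}/n\bigr)^2}, \qquad \mathrm{df} = \operatorname{tr}(L). $$
The corrected estimator CGCV augments this with an additive scalar term built from the degrees-of-freedom-adjusted training errors of the individual ensemble components; its precise form is given in the paper.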
Multivariate Cryptography is one of the main candidates for Post-quantum Cryptography. Multivariate schemes are usually constructed by applying two secret affine invertible transformations $\mathcal S,\mathcal T$ to a set of multivariate polynomials $\mathcal{F}$ (often quadratic). The secret polynomials $\mathcal{F}$ possess a trapdoor that allows the legitimate user to find a solution of the corresponding system, while the public polynomials $\mathcal G=\mathcal S\circ\mathcal F\circ\mathcal T$ look like random polynomials. The polynomials $\mathcal G$ and $\mathcal F$ are said to be affine equivalent. In this article, we present a more general way of constructing a multivariate scheme by considering CCZ equivalence, which has been introduced and studied in the context of vectorial Boolean functions.
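For context (the standard definition, with notation assumed here): two functions $F, G:\mathbb{F}_2^n\to\mathbb{F}_2^m$ are CCZ-equivalent if some affine permutation $\mathcal A$ of $\mathbb{F}_2^n\times\mathbb{F}_2^m$ maps the graph of $F$ onto the graph of $G$,
$$ \mathcal A\bigl(\{(x, F(x)) : x\in\mathbb{F}_2^n\}\bigr) = \{(x, G(x)) : x\in\mathbb{F}_2^n\}. $$
Affine equivalence $\mathcal G=\mathcal S\circ\mathcal F\circ\mathcal T$ corresponds to the special case in which $\mathcal A$ does not mix input and output coordinates, so CCZ equivalence is in general more permissive.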
Studies intended to estimate the effect of a treatment, such as randomized trials, may not be sampled from the desired target population. To correct for this discrepancy, estimates can be transported to the target population. Methods for transporting between populations are often premised on a positivity assumption, such that all relevant covariate patterns in one population are also present in the other. However, eligibility criteria, particularly in the case of trials, can result in violations of positivity when transporting to external populations. To address nonpositivity, a synthesis of statistical and mathematical models can be considered. This approach integrates multiple data sources (e.g., trial, observational, and pharmacokinetic studies) to estimate treatment effects, leveraging mathematical models to handle positivity violations. This approach was previously demonstrated for positivity violations induced by a single binary covariate. Here, we extend the synthesis approach to positivity violations involving a continuous covariate. For estimation, two novel augmented inverse probability weighting estimators are proposed. Both estimators are contrasted with other common approaches for addressing nonpositivity. Empirical performance is compared via Monte Carlo simulation. Finally, the competing approaches are illustrated with an example of the effect of two-drug versus one-drug antiretroviral therapy on CD4 T cell counts among women with HIV.
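For orientation, here is a minimal sketch of a generic augmented inverse probability weighting (AIPW) transport estimator under full positivity (a standard construction; the paper's two novel synthesis estimators, which handle nonpositivity, differ):

```python
# A minimal sketch of a generic AIPW transport estimator (not the paper's
# novel synthesis estimators). Assumes trial data (S=1) with randomized
# treatment A, outcome Y, covariates X, and a target sample (S=0) with X only.
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

def aipw_transport(X_trial, A, Y, X_target, p_treat=0.5):
    """Transported average treatment effect via augmented IPW."""
    n1, n0 = len(X_trial), len(X_target)
    # Sampling model: P(S=1 | X), fit on the stacked samples.
    X_all = np.vstack([X_trial, X_target])
    S = np.r_[np.ones(n1), np.zeros(n0)]
    ps = LogisticRegression().fit(X_all, S).predict_proba(X_trial)[:, 1]
    odds = (1 - ps) / ps                       # P(S=0|X)/P(S=1|X) weights
    # Outcome models fit within each trial arm.
    m1 = LinearRegression().fit(X_trial[A == 1], Y[A == 1])
    m0 = LinearRegression().fit(X_trial[A == 0], Y[A == 0])
    # Augmentation: inverse-probability-weighted residuals from the trial.
    resid = (A * (Y - m1.predict(X_trial)) / p_treat
             - (1 - A) * (Y - m0.predict(X_trial)) / (1 - p_treat))
    # Outcome-model term evaluated in the target sample.
    plug_in = m1.predict(X_target) - m0.predict(X_target)
    return (plug_in.sum() + (odds * resid).sum()) / n0

# Synthetic check: true effect is 1.5.
rng = np.random.default_rng(0)
X1 = rng.normal(size=(500, 2)); A = rng.integers(0, 2, 500)
Y = X1[:, 0] + 1.5 * A + rng.normal(size=500)
X0 = rng.normal(loc=0.3, size=(300, 2))
print(aipw_transport(X1, A, Y, X0))            # should be near 1.5
```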
This study explores the impact of adversarial perturbations on Convolutional Neural Networks (CNNs) with the aim of enhancing the understanding of their underlying mechanisms. Despite numerous defense methods proposed in the literature, this phenomenon is still incompletely understood. Instead of treating the entire model as vulnerable, we propose that specific feature maps learned during training contribute to the overall vulnerability. To investigate how the hidden representations learned by a CNN affect its vulnerability, we introduce the Adversarial Intervention framework. Experiments were conducted on models trained on three well-known computer vision datasets, subjecting them to attacks of different natures. We focus on the effects that adversarial perturbations to a model's initial layer have on the overall behavior of the model. Empirical results revealed compelling insights: a) perturbing selected channel combinations in shallow layers causes significant disruptions; b) the channel combinations most responsible for the disruptions are common among different types of attacks; c) despite shared vulnerable combinations of channels, different attacks affect hidden representations with varying magnitudes; d) there exists a positive correlation between a kernel's magnitude and its vulnerability. In conclusion, this work introduces a novel framework to study the vulnerability of a CNN model to adversarial perturbations, revealing insights that contribute to a deeper understanding of the phenomenon. The identified properties pave the way for the development of efficient ad hoc defense mechanisms in future applications.
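To illustrate the kind of channel-level intervention described, here is a minimal sketch using a forward hook (illustrative only, with hypothetical channel indices and noise scale; not the authors' Adversarial Intervention implementation):

```python
# A minimal sketch of perturbing selected channels of a CNN's first layer
# via a forward hook (hypothetical channel subset and noise scale).
import torch
import torchvision.models as models

model = models.resnet18(weights=None).eval()   # stand-in, untrained model
channels_to_perturb = [3, 7, 21]               # hypothetical channel subset
noise_scale = 0.5                              # hypothetical perturbation size

def intervene(module, inputs, output):
    # Perturb only the selected feature maps produced by the first layer.
    out = output.clone()
    out[:, channels_to_perturb] += noise_scale * torch.randn_like(
        out[:, channels_to_perturb])
    return out

handle = model.conv1.register_forward_hook(intervene)
x = torch.randn(1, 3, 224, 224)                # stand-in input batch
with torch.no_grad():
    logits = model(x)                          # compare against a clean run
handle.remove()
```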
The problems of optimal recovery of univariate functions and their derivatives are studied. To solve these problems, two variants of the truncation method are constructed, which are order-optimal both in the sense of accuracy and in terms of the amount of Galerkin information involved. For numerical summation, we establish how the parameters characterizing the problem being solved affect its stability.
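Schematically, the classical truncation idea recovers a derivative from noisy Fourier (Galerkin) coefficients by discarding high, noise-dominated modes; a rough sketch follows (the paper's order-optimal variants and parameter choices are more refined):

```python
# Schematic sketch of the truncation idea for numerical differentiation
# from noisy Fourier coefficients c_k, |k| <= K, keeping only |k| <= N.
import numpy as np

def truncated_derivative(coeffs_noisy, N, x):
    """Approximate f'(x) from noisy coefficients ordered k = -K..K."""
    K = (len(coeffs_noisy) - 1) // 2
    ks = np.arange(-K, K + 1)
    keep = np.abs(ks) <= N                 # truncate noise-dominated modes
    return np.real(np.sum(1j * ks[keep] * coeffs_noisy[keep]
                          * np.exp(1j * np.outer(x, ks[keep])), axis=1))

K = 8
c = np.zeros(2 * K + 1, complex)
c[K + 1], c[K - 1] = -0.5j, 0.5j           # Fourier coefficients of sin(x)
c += 1e-3 * np.random.default_rng(0).normal(size=c.shape)   # noisy data
print(truncated_derivative(c, N=2, x=np.array([0.0])))      # ≈ cos(0) = 1
```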
Semi-implicit spectral deferred correction (SDC) methods provide a systematic approach to construct time integration methods of arbitrarily high order for nonlinear evolution equations including conservation laws. They converge towards $A$- or even $L$-stable collocation methods, but are often not sufficiently robust themselves. In this paper, a family of SDC methods inspired by an implicit formulation of the Lax-Wendroff method is developed. Compared to fully implicit approaches, the methods have the advantage that they only require the solution of positive definite or semi-definite linear systems. Numerical evidence suggests that the proposed semi-implicit SDC methods with Radau points are $L$-stable up to order 11 and require very little diffusion for orders 13 and 15. The excellent stability and accuracy of these methods is confirmed by numerical experiments with 1D conservation problems, including the convection-diffusion, Burgers, Euler and Navier-Stokes equations.
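To fix ideas, here is a minimal sketch of a plain explicit SDC sweep for $u' = f(t, u)$ over one step (illustrative only; the semi-implicit, Lax-Wendroff-inspired methods of the paper replace the explicit Euler substeps with positive (semi-)definite linear solves):

```python
# Minimal explicit SDC sweep for u' = f(t, u) on one time step.
import numpy as np
from scipy.interpolate import lagrange

def sdc_step(f, t0, u0, dt, nodes, sweeps):
    """nodes: collocation points in [0, 1] with nodes[0] == 0."""
    t = t0 + dt * np.asarray(nodes)
    M = len(t)
    # Spectral integration matrix: S[m, j] = integral of l_j over [t_m, t_{m+1}].
    S = np.zeros((M - 1, M))
    for j in range(M):
        e = np.zeros(M); e[j] = 1.0
        P = lagrange(t, e).integ()         # antiderivative of Lagrange basis
        S[:, j] = P(t[1:]) - P(t[:-1])
    u = np.full(M, u0, dtype=float)        # constant provisional solution
    for _ in range(sweeps):
        F = np.array([f(tm, um) for tm, um in zip(t, u)])
        unew = u.copy()
        for m in range(M - 1):             # explicit Euler correction sweep
            unew[m + 1] = (unew[m]
                           + (t[m + 1] - t[m]) * (f(t[m], unew[m]) - F[m])
                           + S[m] @ F)
        u = unew
    return u[-1]

# u' = -u over one unit step: sweeps converge to the 3-node collocation
# value, close to exp(-1) ≈ 0.3679.
print(sdc_step(lambda t, u: -u, 0.0, 1.0, 1.0, [0.0, 0.5, 1.0], sweeps=8))
```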
Given a finite set of matrices with integer entries, the matrix mortality problem asks if there exists a product of these matrices equal to the zero matrix. We consider a special case of this problem where all entries of the matrices are nonnegative. This case is equivalent to the NFA mortality problem, which, given an NFA, asks for a word $w$ such that the image of every state under $w$ is the empty set. The size of the alphabet of the NFA is then equal to the number of matrices in the set. We study the length of shortest such words as a function of the alphabet size. We show that for an NFA with $n$ states this length can be at least $2^n - 1$, $2^{(n - 4)/2}$, and $2^{(n - 2)/3}$ when the alphabet size is, respectively, $n$, three, and two.
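As a concrete illustration of the search problem, here is a brute-force sketch that finds a shortest mortal word by breadth-first search over subsets of states (exponential in the number of states; names hypothetical):

```python
# Shortest mortal word of an NFA by BFS over state subsets.
# Transitions: for each letter, a dict mapping state -> set of successors.
from collections import deque

def shortest_mortal_word(n_states, letters):
    """letters: dict mapping letter -> {state: set_of_successors}."""
    start = frozenset(range(n_states))
    seen, queue = {start}, deque([(start, "")])
    while queue:
        subset, word = queue.popleft()
        if not subset:
            return word                    # empty image: a mortal word
        for a, delta in letters.items():
            image = frozenset(q for s in subset for q in delta.get(s, ()))
            if image not in seen:
                seen.add(image)
                queue.append((image, word + a))
    return None                            # no mortal word exists

# Tiny 2-state example: 'a' kills state 1, 'b' kills state 0.
letters = {"a": {0: {0}}, "b": {1: {1}}}
print(shortest_mortal_word(2, letters))    # prints "ab"
```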
The execution of Belief-Desire-Intention (BDI) agents in a Multi-Agent System (MAS) can be practically implemented on top of low-level concurrency mechanisms that impact efficiency, determinism, and reproducibility. We argue that developers should specify the MAS behaviour independently of the execution model, and choose or configure the concurrency model later on, according to the specific needs of their target domain, leaving the MAS specification unaffected. We identify patterns for mapping the agent execution over the underlying concurrency abstractions, and investigate which concurrency models are supported by some of the most commonly used BDI platforms. Although most frameworks support multiple concurrency models, we find that they mostly hide them under the hood, making them opaque to the developer and effectively limiting the possibility of fine-tuning the MAS.
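A rough sketch of two such mappings, contrasting their determinism (illustrative only; real platforms such as Jason make this choice internally and mostly hide it):

```python
# Two concurrency mappings for agent reasoning cycles.
import asyncio
import threading

def reasoning_cycle(agent, step):
    print(f"{agent}: cycle {step}")        # stand-in for sense-deliberate-act

# Pattern A: one preemptive thread per agent (nondeterministic interleaving).
def run_threaded(agents, cycles):
    def loop(agent):
        for step in range(cycles):
            reasoning_cycle(agent, step)
    threads = [threading.Thread(target=loop, args=(a,)) for a in agents]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

# Pattern B: cooperative round-robin on one event loop (deterministic,
# reproducible interleaving of the same reasoning cycles).
async def run_cooperative(agents, cycles):
    for step in range(cycles):
        for agent in agents:               # fixed, repeatable order
            reasoning_cycle(agent, step)
            await asyncio.sleep(0)         # yield control between cycles

run_threaded(["alice", "bob"], 2)
asyncio.run(run_cooperative(["alice", "bob"], 2))
```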
Covariate-adaptive randomization is widely employed to balance baseline covariates in interventional studies such as clinical trials and experiments in development economics. Recent years have witnessed substantial progress in inference under covariate-adaptive randomization with a fixed number of strata. However, concerns have been raised about the impact of a large number of strata on its design and analysis, a scenario that is common in practice, for example in multicenter randomized clinical trials. In this paper, we propose a general framework for inference under covariate-adaptive randomization, which extends the seminal works of Bugni et al. (2018, 2019) by allowing for a diverging number of strata. Furthermore, we introduce a novel weighted regression adjustment that ensures efficiency improvement. Beyond establishing the asymptotic theory, we also develop practical algorithms for handling situations involving an extremely large number of strata. Moreover, by linking design balance and inference robustness, we highlight the advantages of stratified block randomization, which enforces better covariate balance within strata compared to simple randomization. This paper offers a comprehensive landscape of inference under covariate-adaptive randomization, spanning from fixed to diverging to extremely large numbers of strata.
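For concreteness, a minimal sketch of stratified permuted-block randomization, the design contrasted above with simple randomization within strata (parameter names are illustrative):

```python
# Stratified permuted-block randomization: treatments balanced within
# each stratum, block by block (block_size assumed even).
import numpy as np

def stratified_block_assign(strata, block_size=4, seed=0):
    """Assign treatment 0/1 within each stratum using permuted blocks."""
    rng = np.random.default_rng(seed)
    strata = np.asarray(strata)
    assign = np.empty(len(strata), dtype=int)
    for s in np.unique(strata):
        idx = np.flatnonzero(strata == s)
        n_blocks = -(-len(idx) // block_size)          # ceiling division
        blocks = [rng.permutation([0, 1] * (block_size // 2))
                  for _ in range(n_blocks)]
        assign[idx] = np.concatenate(blocks)[:len(idx)]
    return assign

strata = [0, 0, 0, 1, 1, 2, 2, 2, 2]       # hypothetical stratum labels
print(stratified_block_assign(strata))     # near-balanced arms per stratum
```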
Characterized by an outer integral connected to an inner integral through a nonlinear function, nested integration is a challenging problem in various fields, such as engineering and mathematical finance. The available numerical methods for nested integration based on Monte Carlo (MC) methods can be prohibitively expensive owing to the error propagating from the inner to the outer integral. Attempts to enhance the efficiency of these approximations using the quasi-MC (QMC) or randomized QMC (rQMC) method have focused on either the inner or outer integral approximation. This work introduces a novel nested rQMC method that simultaneously addresses the approximation of the inner and outer integrals. This method leverages the unique nested integral structure to offer a more efficient approximation mechanism. By incorporating Owen's scrambling techniques, we address integrands exhibiting infinite variation in the Hardy--Krause sense, enabling theoretically sound error estimates. As the primary contribution, we derive asymptotic error bounds for the bias and variance of our estimator, along with the regularity conditions under which these bounds can be attained. In addition, we provide nearly optimal sample sizes for the rQMC approximations underlying the numerical implementation of the proposed method. Moreover, we derive a truncation scheme to make our estimator applicable in the context of expected information gain estimation and indicate how to use importance sampling to remedy the measure concentration arising in the inner integral. We verify the estimator quality through numerical experiments by comparing the computational efficiency of the nested rQMC method against standard nested MC integration for two case studies: one in thermomechanics and the other in pharmacokinetics. These examples highlight the computational savings and enhanced applicability of the proposed approach.
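To make the setting concrete, here is a minimal sketch of a nested randomized-QMC estimator of $E_X[\,g(E_Y[f(X, Y)])\,]$ using scrambled Sobol' points at both levels (illustrative only; the paper's method, error bounds, and sample-size choices are more refined):

```python
# Nested rQMC: scrambled Sobol' points for the outer and inner integrals,
# with a fresh scrambling of the inner points for each outer point.
import numpy as np
from scipy.stats import qmc, norm

def nested_rqmc(f, g, d_outer, d_inner, n_outer, n_inner, seed=0):
    outer = qmc.Sobol(d_outer, scramble=True, seed=seed).random(n_outer)
    x = norm.ppf(outer)                    # map to Gaussian outer variables
    vals = np.empty(n_outer)
    for i in range(n_outer):
        inner = qmc.Sobol(d_inner, scramble=True, seed=seed + 1 + i)
        y = norm.ppf(inner.random(n_inner))
        vals[i] = g(np.mean(f(x[i], y)))   # inner rQMC estimate, then g
    return vals.mean()

# Toy example: E_X[ log E_Y[ exp(X + Y) ] ] with X, Y standard normal;
# the exact value is 1/2.
est = nested_rqmc(lambda x, y: np.exp(x[0] + y[:, 0]), np.log,
                  d_outer=1, d_inner=1, n_outer=256, n_inner=256)
print(est)
```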
Temporal reasoning with conditionals is more complex than both classical temporal reasoning and reasoning with timeless conditionals, and can lead to some rather counter-intuitive conclusions. For instance, Aristotle's famous "Sea Battle Tomorrow" puzzle leads to a fatalistic conclusion: whether or not there will be a sea battle tomorrow, that fact is already necessarily the case now. We propose a branching-time logic LTC to formalise reasoning about temporal conditionals and provide that logic with adequate formal semantics. The logic LTC extends the Nexttime fragment of CTL* with operators for model updates, which restrict the domain to only those future moments where the antecedent can still be satisfied. We provide formal semantics for these operators that implements the restrictor interpretation of antecedents of temporalized conditionals, by suitably restricting the domain of discourse. As a motivating example, we demonstrate that a natural formalisation of the `Sea Battle' argument in our logic renders it unsound, thereby resolving the fatalist conclusion it entails: the underlying reasoning-by-cases argument no longer applies when the cases are treated not as material implications but as temporal conditionals. On the technical side, we analyze the semantics of LTC and provide a series of reductions of LTC-formulae, first recursively eliminating the dynamic update operators and then the path quantifiers in such formulae. Using these reductions we obtain a sound and complete axiomatization for LTC, and reduce its decision problem to that of the modal logic KD.
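Schematically (our notation, not the syntax of LTC): writing $p$ for "there is a sea battle tomorrow", $\mathsf{X}$ for the next-time operator, and $\Box$ for historical necessity, the fatalist argument runs by cases:
$$ \frac{\mathsf{X}p \lor \mathsf{X}\neg p \qquad \mathsf{X}p \to \Box\mathsf{X}p \qquad \mathsf{X}\neg p \to \Box\mathsf{X}\neg p}{\Box\mathsf{X}p \lor \Box\mathsf{X}\neg p} $$
With the conditional premises read as material implications, the conclusion follows by disjunction elimination; under the restrictor reading formalised in LTC, that reasoning-by-cases step is blocked.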