A reduced-rank mixed effects model is developed for robust modeling of sparsely observed paired functional data. In this model, the curves for each functional variable are summarized by a few functional principal components, and the association between the two functional variables is modeled through the association of the principal component scores. Multivariate scale mixtures of normal distributions are used to model the principal component scores and the measurement errors in order to handle outlying observations and achieve robust inference. The mean functions and principal component functions are modeled using splines, and roughness penalties are applied to avoid overfitting. An EM algorithm is developed for model fitting and prediction. A simulation study shows that the proposed method outperforms an existing method that is not designed for robust estimation. The effectiveness of the proposed method is illustrated in an application to fitting multi-band light curves of Type Ia supernovae.
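For orientation, the following is a minimal sketch of a reduced-rank representation of the kind such a model builds on; the notation is illustrative and not taken from the paper.
% Illustrative reduced-rank representation for one functional variable (notation is hypothetical).
\[
  Y_{ij} \;=\; \mu(t_{ij}) + \sum_{k=1}^{K} \xi_{ik}\,\phi_k(t_{ij}) + \varepsilon_{ij},
  \qquad j = 1,\dots,n_i,
\]
where $\mu$ is a spline-modeled mean function, $\phi_1,\dots,\phi_K$ are a small number of spline-modeled principal component functions, and robustness can be obtained by giving the score vector $\xi_i=(\xi_{i1},\dots,\xi_{iK})^\top$ and the errors $\varepsilon_{ij}$ scale mixtures of normal distributions, e.g.\ $\xi_i \mid w_i \sim N_K(\mathbf{0},\Lambda/w_i)$ with a positive mixing variable $w_i$; in the paired setting, the scores of the two functional variables are coupled through their joint distribution.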
In the area of query complexity of Boolean functions, the most widely studied cost measure of an algorithm is the worst-case number of queries it makes on an input. Motivated by the most natural cost measure studied in online algorithms, the competitive ratio, we consider a different cost measure for query algorithms for Boolean functions: the ratio of the cost of the algorithm to the cost of an optimal algorithm that knows the input in advance. The cost of an algorithm is the largest value of this ratio over all inputs. Grossman, Komargodski and Naor [ITCS'20] introduced this measure for Boolean functions and dubbed it instance complexity. Among other results, Grossman et al. showed that the monotone Boolean functions with instance complexity 1 are precisely those that depend on one or two variables. We complement this result by completely characterizing the instance complexity of symmetric Boolean functions. As a corollary, we conclude that the only symmetric Boolean functions with instance complexity 1 are the Parity function and its complement. We also study the instance complexity of some graph properties, such as Connectivity and k-clique containment. For all the Boolean functions we study, as well as those studied by Grossman et al., the instance complexity turns out to equal the ratio of the query complexity to the minimum certificate complexity. It is natural to ask whether this is the correct bound for all Boolean functions. We answer this question in the negative, in a very strong sense, by analyzing the instance complexity of the Greater-Than and Odd-Max-Bit functions. We show that the above-mentioned ratio is linear in the input size for both of these functions, whereas we exhibit algorithms for which the instance complexity is a constant.
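To fix notation for the measure discussed above, a hedged sketch (our notation, not necessarily the paper's):
% Schematic definition of the per-input cost ratio and its worst case.
\[
  \mathrm{cost}(A) \;=\; \max_{x}\,\frac{\mathrm{cost}(A,x)}{\mathrm{cost}(\mathrm{OPT},x)},
  \qquad
  \mathrm{IC}(f) \;=\; \min_{A}\,\mathrm{cost}(A),
\]
where $\mathrm{cost}(A,x)$ is the number of queries $A$ makes on input $x$ and $\mathrm{cost}(\mathrm{OPT},x)$ is the cost of an optimal algorithm that knows $x$ in advance. The benchmark ratio referred to in the text is $D(f)/C_{\min}(f)$, the worst-case query complexity divided by the minimum certificate complexity $C_{\min}(f)=\min_x C(f,x)$.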
A non-intrusive proper generalized decomposition (PGD) strategy, coupled with an overlapping domain decomposition (DD) method, is proposed to efficiently construct surrogate models of parametric linear elliptic problems. A parametric multi-domain formulation is presented, with local subproblems featuring arbitrary Dirichlet interface conditions represented through the traces of the finite element functions used for spatial discretization at the subdomain level, with no need for additional auxiliary basis functions. The linearity of the operator is exploited to devise low-dimensional problems with only a few active boundary parameters. An overlapping Schwarz method is used to glue the local surrogate models together, solving a linear system for the nodal values of the parametric solution at the interfaces without introducing Lagrange multipliers to enforce continuity in the overlapping region. The proposed DD-PGD methodology relies on a fully algebraic formulation that allows for real-time computation based on efficient interpolation of the local surrogate models in the parametric space, with no additional problems to be solved during the execution of the Schwarz algorithm. Numerical results for parametric diffusion and convection-diffusion problems showcase the accuracy of the DD-PGD approach, its robustness in different regimes, and its superior performance with respect to standard high-fidelity DD methods.
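As a rough illustration of the separated representations involved, one may think of each local surrogate in a generic PGD form (the notation here is ours and purely schematic):
% Schematic separated (PGD) form of a local surrogate on subdomain $\Omega_i$.
\[
  u^{(i)}\big(\boldsymbol{x},\boldsymbol{\mu},\boldsymbol{\theta}^{(i)}\big)
  \;\approx\;
  \sum_{m=1}^{M} F_m^{(i)}(\boldsymbol{x})
  \prod_{p} G_{m,p}^{(i)}(\mu_p)
  \prod_{q} H_{m,q}^{(i)}\big(\theta_q^{(i)}\big),
\]
where $\boldsymbol{\mu}$ collects the physical parameters and $\boldsymbol{\theta}^{(i)}$ the few active boundary parameters encoding the Dirichlet interface data; during the Schwarz iterations only this separated expression needs to be evaluated, with no new boundary value problems to solve.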
Feature selection is one of the most important steps in building a statistical learning model. Existing algorithms usually establish a criterion to select the most influential variables, discarding those that do not contribute any relevant information to the model. This approach makes sense in a static situation, in which the joint distribution of the data does not vary over time. However, when dealing with real data, it is common to encounter dataset shift and, in particular, changes in the relationships between variables (concept shift). In that case, the influence of a variable cannot be the only indicator of its quality as a regressor, since the relationship learned in the training phase may not correspond to the current situation. To tackle this problem, our approach establishes a direct relationship between Shapley values and prediction errors, operating at a more local level to effectively detect the individual biases introduced by each variable. The proposed methodology is evaluated on various examples, including synthetic scenarios mimicking sudden and incremental shifts, as well as two real-world cases characterized by concept shift. Additionally, we analyze three standard situations to assess the algorithm's robustness in the absence of shift. The results demonstrate that the proposed algorithm significantly outperforms state-of-the-art feature selection methods in concept shift scenarios, while matching the performance of existing methodologies in static situations.
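The connection between Shapley values and prediction errors can be read off the local accuracy (efficiency) property of Shapley explanations; the error attribution sketched here is illustrative rather than the paper's exact rule.
% Local accuracy of Shapley explanations and the induced per-observation error decomposition.
\[
  \hat f(x_i) \;=\; \phi_{i0} + \sum_{j=1}^{p} \phi_{ij}
  \qquad\Longrightarrow\qquad
  y_i - \hat f(x_i) \;=\; y_i - \phi_{i0} - \sum_{j=1}^{p} \phi_{ij},
\]
so each feature's contribution $\phi_{ij}$ can be confronted with the individual prediction error, making it possible to flag, observation by observation, variables whose learned contribution no longer matches the current relationship between predictors and response.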
Multi-level modeling is an important approach for analyzing complex survey data collected under multi-stage sampling. However, estimating multi-level models can be challenging when several datasets with distinct hierarchies and sampling weights must be combined. This paper presents a method for combining multiple datasets whose hierarchical structures differ because of distinct informative sampling designs for the same survey. To develop an approach with complete generality, we propose to define a pseudo-cluster, a cluster containing only a single observation, to unify the data structure and thereby enable estimation of multi-level models that incorporate sampling weights across the combined sample. We justify incorporating sampling weights at each level of the hierarchical model and, in doing so, define a pseudo-likelihood estimation procedure. Simulation studies illustrate the effect of incorporating the sampling designs in this challenging multi-level modeling scenario and demonstrate that a linear mixed model with sampling weights provides unbiased estimates of the model parameters and improves estimation of the variance components of the random effects. The proposed method is illustrated with a novel application to the National Survey of Healthcare Organizations and Systems, which sought to determine which organizational characteristics or traits, as measured in the surveys, have the strongest average relationship to the percentage of depression and anxiety diagnoses in physician practices in the US.
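A schematic two-level weighted pseudo-log-likelihood of the kind referred to above (generic notation; the pseudo-cluster device simply corresponds to clusters with a single observation):
% Illustrative weighted pseudo-likelihood for a two-level model with random effect $u_j$.
\[
  \ell_w(\boldsymbol{\theta})
  \;=\;
  \sum_{j} w_j \log \int \Bigg[\prod_{i=1}^{n_j} f\big(y_{ij}\mid u_j;\boldsymbol{\theta}\big)^{\,w_{i\mid j}}\Bigg]
  \varphi\big(u_j;\boldsymbol{\theta}\big)\,\mathrm{d}u_j,
\]
where $w_j$ and $w_{i\mid j}$ are the cluster-level and within-cluster sampling weights; a pseudo-cluster contributes a single factor with $n_j=1$, so records coming from designs without a genuine second level enter the same likelihood.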
We investigate the combinatorics of max-pooling layers, which are functions that downsample input arrays by taking the maximum over shifted windows of input coordinates and which are commonly used in convolutional neural networks. We obtain results on the number of linearity regions of these functions by equivalently counting the number of vertices of certain Minkowski sums of simplices. We characterize the faces of such polytopes and obtain generating functions and closed formulas for the numbers of vertices and facets of a 1D max-pooling layer, depending on the size of the pooling windows and the stride, and for the number of vertices in a special case of 2D max-pooling.
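For concreteness, a 1D max-pooling layer with window size $k$ and stride $s$ acting on $x\in\mathbb{R}^d$ (no padding; this indexing is one possible convention) is
% One standard definition of 1D max-pooling with window size $k$ and stride $s$.
\[
  \mathrm{MaxPool}_{k,s}(x)_i
  \;=\;
  \max\big\{x_{(i-1)s+1},\,x_{(i-1)s+2},\,\dots,\,x_{(i-1)s+k}\big\},
  \qquad
  i = 1,\dots,\Big\lfloor\tfrac{d-k}{s}\Big\rfloor + 1,
\]
a piecewise linear map whose linearity regions are what the vertex counts above enumerate, with one simplex in the Minkowski sum for each pooling window.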
We propose a new class of models for variable clustering called Asymptotic Independent block (AI-block) models, which define population-level clusters through the independence, across clusters, of the maxima of a multivariate stationary mixing random process. This class of models is identifiable, in the sense that there exists a maximal element under a partial order on partitions, which allows for statistical inference. We also present an algorithm that recovers the clusters of variables without specifying the number of clusters \emph{a priori}. Our work provides theoretical insight into the consistency of this algorithm, showing that under certain conditions it can effectively identify the clusters in the data with a computational complexity that is polynomial in the dimension. This implies that groups can be learned nonparametrically in settings where the block maxima of a dependent process are only sub-asymptotic. To further illustrate the significance of our work, we apply our method to real datasets from neuroscience and environmental science. These applications highlight the potential and versatility of the proposed approach.
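In illustrative notation, the defining condition can be pictured as a factorization of the limiting distribution of normalized componentwise maxima across the blocks of a partition $\{O_1,\dots,O_G\}$ of the variable indices:
% Schematic block-factorization of the limiting extreme-value distribution (our notation).
\[
  H(\boldsymbol{z}) \;=\; \prod_{g=1}^{G} H_g\big(\boldsymbol{z}_{O_g}\big),
\]
so that maxima are asymptotically independent between clusters while the dependence within each cluster is left unrestricted.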
The $K$-function is arguably the most important functional summary statistic for spatial point processes. It is used extensively for goodness-of-fit testing and in connection with minimum contrast estimation for parametric spatial point process models. It is thus pertinent to understand the asymptotic properties of estimators of the $K$-function. In this paper we derive the functional asymptotic distribution of the $K$-function estimator. In contrast to previous work on functional convergence, we consider the case of an inhomogeneous intensity function. We moreover handle the fact that practical $K$-function estimators rely on plugging in an estimate of the intensity function. This removes two serious limitations of the existing literature.
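For orientation, a standard plug-in form of the inhomogeneous $K$-function estimator (edge-correction details vary across implementations) is
% Plug-in estimator with estimated intensity $\widehat\lambda$ and edge-correction factor $e(u,v)$.
\[
  \widehat K(r)
  \;=\;
  \frac{1}{|W|}
  \sum_{\substack{u,v\in X\cap W \\ u\neq v}}
  \frac{\mathbf{1}\{\|u-v\|\le r\}}{\widehat\lambda(u)\,\widehat\lambda(v)\,e(u,v)},
\]
and it is precisely the plugged-in intensity estimate $\widehat\lambda$ that makes the asymptotic analysis more delicate than in the known-intensity case.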
Discovering causal relationships from observational data is a fundamental yet challenging task. In some applications, it may suffice to learn the causal features of a given response variable, instead of learning the entire underlying causal structure. Invariant causal prediction (ICP, Peters et al., 2016) is a method for causal feature selection that requires data from heterogeneous settings. ICP assumes that the mechanism generating the response from its direct causes is the same in all settings and exploits this invariance to output a subset of the causal features. The framework of ICP has been extended to general additive noise models and to nonparametric settings using conditional independence testing. However, nonparametric conditional independence testing often suffers from low power (or poor type I error control), and the aforementioned parametric models are not suitable for applications in which the response is not measured on a continuous scale but instead reflects categories or counts. To bridge this gap, we develop ICP in the context of transformation models (TRAMs), allowing for continuous, categorical, count-type, and uninformatively censored responses (we show that, in general, these model classes do not allow for identifiability when there is no exogenous heterogeneity). We propose TRAM-GCM, a test for invariance of a subset of covariates that is based on the expected conditional covariance between environments and score residuals and comes with uniform asymptotic level guarantees. For the special case of linear shift TRAMs, we propose an additional invariance test, TRAM-Wald, based on the Wald statistic. We implement both proposed tests in the open-source R package "tramicp" and show in simulations that, under correct model specification, our approach empirically yields higher power than nonparametric ICP based on conditional independence testing.
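Schematically, and in our notation rather than the paper's, the invariance hypothesis targeted by a GCM-type test can be written as
% Null hypothesis of invariance for a candidate covariate set $S$.
\[
  H_{0,S}:\quad
  \mathbb{E}\big[\operatorname{Cov}\big(E,\,R_S \mid X_S\big)\big] \;=\; 0,
\]
where $E$ encodes the environment, $X_S$ is the candidate subset of covariates and $R_S$ is the score residual of the transformation model fitted on $X_S$; the test statistic is then a normalized empirical version of this expected conditional covariance.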
P-time event graphs are discrete event systems suitable for modeling processes in which tasks must be executed within predefined time windows. Their dynamics can be represented by max-plus linear-dual inequalities (LDIs), i.e., systems of linear dynamical inequalities in the primal and dual operations of the max-plus algebra. We define a new class of models called switched LDIs (SLDIs), which allow switching between different modes of operation, each corresponding to a set of LDIs, according to a sequence of modes called the schedule. In this paper, we focus on the analysis of SLDIs when the schedule is fixed and either periodic or intermittently periodic. We show that SLDIs can model a wide range of applications, including single-robot multi-product processing networks in which every product has different processing requirements and corresponds to a specific mode of operation. Based on the analysis of SLDIs, we propose algorithms to compute: (i) the minimum and maximum cycle times for these processes, improving on the time complexity of existing approaches; (ii) a complete trajectory of the robot, including start-up and shut-down transients.
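The generic shape of the inequalities involved, with illustrative index conventions that may differ from the paper's, is
% Max-plus/min-plus linear-dual inequalities; $(A\otimes x)_i=\max_j(A_{ij}+x_j)$ and $(B\otimes' x)_i=\min_j(B_{ij}+x_j)$.
\[
  A \otimes x(k-1) \;\preceq\; x(k) \;\preceq\; B \otimes' x(k-1),
\]
where $x(k)$ collects the firing times at event step $k$ and the two inequalities encode the lower and upper bounds of the time windows; in an SLDI the pair $(A,B)$ changes with the mode prescribed by the schedule.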
A robust nonconforming mixed finite element method is developed for a strain gradient elasticity (SGE) model. In both two and three dimensions, a lower-order $C^0$-continuous $H^2$-nonconforming finite element is constructed for the displacement field by enriching the quadratic Lagrange element with bubble functions. This element, together with the linear Lagrange element, is used to discretize a mixed formulation of the SGE model. A robust discrete inf-sup condition is established. Sharp error estimates, uniform with respect to both the small size parameter and the Lam\'{e} coefficient, are achieved and verified by numerical results. In addition, uniform regularity of the SGE model is derived under two reasonable assumptions.
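One commonly studied simplified strain gradient elasticity model of this type is the Aifantis-type system below; the precise formulation in the paper may differ in its details.
% Aifantis-type strain gradient elasticity; $\iota$ is the size parameter, $\lambda,\mu$ the Lam\'e coefficients.
\[
  -\operatorname{div}\boldsymbol{\sigma}(\boldsymbol{u})
  + \iota^2\,\Delta\operatorname{div}\boldsymbol{\sigma}(\boldsymbol{u})
  \;=\;
  \boldsymbol{f}
  \quad\text{in }\Omega,
  \qquad
  \boldsymbol{\sigma}(\boldsymbol{u}) \;=\; 2\mu\,\boldsymbol{\varepsilon}(\boldsymbol{u})
  + \lambda\,(\operatorname{div}\boldsymbol{u})\,\boldsymbol{I},
\]
whose fourth-order character is what calls for an $H^2$-nonconforming displacement element, with robustness meaning estimates that do not deteriorate as $\iota\to 0$ or as $\lambda\to\infty$.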