We consider the parallel complexity of submodular function minimization (SFM). We provide a pair of methods that obtain two new query-versus-depth trade-offs for minimizing a submodular function defined on subsets of $n$ elements that has integer values between $-M$ and $M$. The first method has depth $2$ and query complexity $n^{O(M)}$, and the second method has depth $\widetilde{O}(n^{1/3} M^{2/3})$ and query complexity $O(\mathrm{poly}(n, M))$. Despite a line of work on improved parallel lower bounds for SFM, prior to our work the only known algorithms for parallel SFM either followed from more general methods for sequential SFM or from highly-parallel minimization of convex $\ell_2$-Lipschitz functions. Interestingly, to obtain our second result we provide the first highly-parallel algorithm for minimizing an $\ell_\infty$-Lipschitz function over the hypercube that achieves near-optimal depth for obtaining constant accuracy.
Sparse high-dimensional functions have arisen as a rich framework for studying the behavior of gradient-descent methods using shallow neural networks, showcasing their ability to perform feature learning beyond linear models. Amongst those functions, the simplest are single-index models $f(x) = \phi( x \cdot \theta^*)$, where the labels are generated by an arbitrary non-linear scalar link function $\phi$ applied to an unknown one-dimensional projection $\theta^*$ of the input data. By focusing on Gaussian data, several recent works have built a remarkable picture, where the so-called information exponent (related to the regularity of the link function) controls the required sample complexity. In essence, these tools exploit the stability and spherical symmetry of Gaussian distributions. In this work, building on the framework of \cite{arous2020online}, we explore extensions of this picture beyond the Gaussian setting, where either stability or symmetry might be violated. Focusing on the planted setting where $\phi$ is known, our main results establish that Stochastic Gradient Descent can efficiently recover the unknown direction $\theta^*$ in the high-dimensional regime, under assumptions that extend previous works \cite{yehudai2020learning,wu2022learning}.
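As a concrete illustration of the type of procedure analyzed in this line of work, the following is a minimal sketch of online SGD constrained to the sphere for a planted single-index model with a known link; the quadratic link, Gaussian data, step size, and iteration count are illustrative assumptions rather than the paper's exact setting.

```python
import numpy as np

# Minimal sketch: online spherical SGD for a planted single-index model
# y = phi(x . theta_star) with a known link. The link phi, the Gaussian data,
# and the step size below are illustrative assumptions.

rng = np.random.default_rng(0)
d = 500
phi = lambda z: z ** 2          # example link (information exponent 2)
dphi = lambda z: 2 * z          # its derivative

theta_star = rng.normal(size=d)
theta_star /= np.linalg.norm(theta_star)
theta = rng.normal(size=d)
theta /= np.linalg.norm(theta)

eta = 0.05 / d                  # heuristic step size
for _ in range(100_000):
    x = rng.normal(size=d)                    # one fresh sample per step
    y = phi(x @ theta_star)
    r = phi(x @ theta) - y
    theta -= eta * r * dphi(x @ theta) * x    # gradient of 0.5 * r**2 in theta
    theta /= np.linalg.norm(theta)            # project back to the unit sphere

print("overlap |<theta, theta*>| =", abs(theta @ theta_star))
```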
Mediation analysis aims to separate the indirect effect through mediators from the direct effect of the exposure on the outcome. It is challenging to perform mediation analysis with neuroimaging data, which involve high dimensionality, complex spatial correlations, sparse activation patterns, and a relatively low signal-to-noise ratio. To address these issues, we develop a new spatially varying coefficient structural equation model for Bayesian Image Mediation Analysis (BIMA). We define spatially varying mediation effects within the potential outcome framework, employing the soft-thresholded Gaussian process prior for functional parameters. We establish posterior consistency for the spatially varying mediation effects, along with selection consistency for the important regions that contribute to the mediation effects. We develop an efficient posterior computation algorithm scalable to the analysis of large-scale imaging data. Through extensive simulations, we show that BIMA improves estimation accuracy and computational efficiency for high-dimensional mediation analysis over existing methods. We apply BIMA to analyze the behavioral and fMRI data in the Adolescent Brain Cognitive Development (ABCD) study, focusing on inferring how the effects of parental education level on children's general cognitive ability are mediated through working-memory brain activities.
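For intuition about the soft-thresholded Gaussian process prior, the sketch below draws a Gaussian process on a one-dimensional grid and applies the soft-thresholding operator, which produces the kind of sparse, spatially varying coefficient the prior is designed to encode; the kernel, grid, and threshold are illustrative assumptions.

```python
import numpy as np

def soft_threshold(g, lam):
    """Soft-thresholding: zero out small values, shrink the rest toward zero."""
    return np.sign(g) * np.maximum(np.abs(g) - lam, 0.0)

rng = np.random.default_rng(0)
s = np.linspace(0.0, 1.0, 200)                                   # spatial locations
K = np.exp(-0.5 * ((s[:, None] - s[None, :]) / 0.1) ** 2)        # squared-exponential kernel
g = rng.multivariate_normal(np.zeros_like(s), K + 1e-8 * np.eye(len(s)))
beta = soft_threshold(g, lam=0.5)          # sparse, spatially varying coefficient
print("fraction of active locations:", np.mean(beta != 0))
```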
Koopman representations aim to learn features of nonlinear dynamical systems (NLDS) that lead to linear dynamics in the latent space. Theoretically, such features can be used to simplify many problems in the modeling and control of NLDS. In this work we study autoencoder formulations of this problem and the different ways they can be used to model dynamics, specifically for future state prediction over long horizons. We identify several limitations of predicting future states in the latent space and propose an inference-time mechanism, which we refer to as Periodic Reencoding, for faithfully capturing long-term dynamics. We justify this method both analytically and empirically via experiments in low- and high-dimensional NLDS.
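The rollout mechanism can be sketched as follows: propagate in the latent space with a linear operator, but periodically decode and re-encode to refresh the latent state. The function names, the toy latent map, and the period are hypothetical, and a trained encoder/decoder pair is assumed.

```python
import numpy as np

def rollout_with_periodic_reencoding(x0, encode, decode, K, steps, period):
    """Roll out a Koopman-style autoencoder model from state x0.

    Every `period` steps the latent state is refreshed by decoding to the
    state space and re-encoding, instead of propagating purely in the latent
    space. Sketch only: encode, decode, and K are assumed to be trained.
    """
    z = encode(x0)
    states = [x0]
    for t in range(1, steps + 1):
        z = K @ z                    # linear latent dynamics
        x = decode(z)
        states.append(x)
        if t % period == 0:
            z = encode(x)            # periodic reencoding
    return np.stack(states)

# Toy usage: identity encoder/decoder and a rotation as the latent map.
theta = 0.1
K = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
traj = rollout_with_periodic_reencoding(
    np.array([1.0, 0.0]), encode=lambda x: x, decode=lambda z: z,
    K=K, steps=100, period=10)
print(traj.shape)
```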
We introduce the class of P-finite automata. These are a generalisation of weighted automata in which the weights of transitions can depend polynomially on the length of the input word. P-finite automata can also be viewed as simple tail-recursive programs in which the arguments of recursive calls can non-linearly refer to a variable that counts the number of recursive calls. The nomenclature is motivated by the fact that over a unary alphabet P-finite automata compute so-called P-finite sequences, that is, sequences that satisfy a linear recurrence with polynomial coefficients. Our main result shows that P-finite automata can be learned in polynomial time in Angluin's MAT exact learning model. This generalises the classical results that deterministic finite automata and weighted automata over a field are each polynomial-time learnable in the MAT model.
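For concreteness, here is a small sketch of evaluating such an automaton under one natural encoding (our own, for illustration): transition weights are matrices whose entries are polynomials evaluated at a counter of letters read so far, and the unary example reproduces the P-finite sequence $n!$.

```python
import numpy as np

def evaluate(word, init, final, transitions):
    """Evaluate a sketch of a P-finite automaton on `word`.

    `transitions[a]` maps a letter `a` to a function t -> matrix, i.e. the
    transition matrix whose entries are polynomials evaluated at the counter t
    of letters read so far. This encoding is an illustrative assumption.
    """
    v = np.array(init, dtype=float)
    for t, a in enumerate(word, start=1):
        v = v @ transitions[a](t)
    return float(v @ np.array(final, dtype=float))

# Unary example: the values n! satisfy the P-finite recurrence u(n) = n * u(n-1).
transitions = {"a": lambda t: np.array([[t]])}   # 1x1 matrix with entry t
for n in range(6):
    print(n, evaluate("a" * n, init=[1.0], final=[1.0], transitions=transitions))
```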
Estimating weights in the synthetic control method involves an optimization procedure that simultaneously selects and aligns control units to closely match the treated unit, typically resulting in sparse weights in which only a few control units receive non-zero weight. However, this simultaneous selection and alignment of control units may lead to a loss of efficiency. Another concern arising from this procedure is its susceptibility to under-fitting due to imperfect pre-treatment fit: it is not uncommon for the linear combination, using nonnegative weights, of pre-treatment period outcomes for the control units to inadequately approximate the pre-treatment outcomes for the treated unit. To address both of these issues, this paper proposes a simple and effective method called Synthetic Regressing Control (SRC). The SRC method begins by performing univariate linear regressions to appropriately align the pre-treatment outcomes of each control unit with those of the treated unit. The SRC estimator is then obtained by synthesizing (taking a weighted average of) the fitted controls. To determine the weights in the synthesis procedure, we propose an approach based on an unbiased risk estimator criterion. Theoretically, we show that the synthesis procedure is asymptotically optimal in the sense of achieving the lowest possible squared error. Extensive numerical experiments highlight the advantages of the SRC method.
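The two stages can be sketched as follows, with the caveat that the weight-selection step below uses a normalized non-negative least squares fit as a simplified stand-in for the paper's unbiased-risk-estimator criterion.

```python
import numpy as np
from scipy.optimize import nnls

def src_sketch(y_treated_pre, Y_controls_pre):
    """Sketch of the two stages of Synthetic Regressing Control (SRC).

    Stage 1: univariate OLS of the treated unit's pre-treatment outcomes on
    each control unit's pre-treatment outcomes, giving fitted controls.
    Stage 2: a weighted average of the fitted controls. The normalized
    non-negative least squares below is a simplified stand-in for the paper's
    unbiased-risk-estimator criterion.
    """
    T0, J = Y_controls_pre.shape
    fitted_pre = np.empty((T0, J))
    for j in range(J):
        X = np.column_stack([np.ones(T0), Y_controls_pre[:, j]])
        beta, *_ = np.linalg.lstsq(X, y_treated_pre, rcond=None)
        fitted_pre[:, j] = X @ beta
    w, _ = nnls(fitted_pre, y_treated_pre)
    w = np.full(J, 1.0 / J) if w.sum() == 0 else w / w.sum()
    return w, fitted_pre @ w

# Toy usage with simulated pre-treatment data.
rng = np.random.default_rng(0)
Y_controls_pre = rng.normal(size=(30, 8))
y_treated_pre = 0.8 * Y_controls_pre[:, 0] + 0.2 + 0.1 * rng.normal(size=30)
w, synthetic = src_sketch(y_treated_pre, Y_controls_pre)
print(np.round(w, 3))
```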
Large-scale matrix-valued data have been widely observed and continuously studied in various fields in recent years. Accounting for the multilevel factor structure and utilizing the matrix structure, we propose a multilevel matrix factor model with both global and local factors. The global factors can affect all matrix time series, whereas the local factors are only allowed to affect their own specific matrix time series. The proposed estimation procedures consistently estimate the factor loadings and determine the numbers of factors. We establish the asymptotic properties of the estimators. Simulations are presented to illustrate the performance of the proposed estimation method. We utilize the model to analyze eight indicators across 200 stocks from ten distinct industries, demonstrating the empirical utility of our proposed approach.
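To fix ideas, the following sketch simulates a two-level matrix factor structure in which every matrix series loads on a shared global factor and on its own local factor; all dimensions, factor counts, and loading distributions are illustrative assumptions.

```python
import numpy as np

# Illustrative simulation of a multilevel matrix factor structure: each matrix
# series m at time t is
#   X_{m,t} = R_m^g F_t^g (C_m^g)' + R_m^l F_{m,t}^l (C_m^l)' + E_{m,t},
# where F_t^g is a global matrix factor shared by all series and F_{m,t}^l is
# a local matrix factor specific to series m.

rng = np.random.default_rng(1)
M, T = 3, 100          # number of matrix series and time points
p, q = 8, 6            # row and column dimensions of each matrix observation
kg, kl = 2, 1          # numbers of global and local factors (rows = columns here)

Fg = rng.normal(size=(T, kg, kg))                    # global matrix factors
series = []
for m in range(M):
    Rg, Cg = rng.normal(size=(p, kg)), rng.normal(size=(q, kg))
    Rl, Cl = rng.normal(size=(p, kl)), rng.normal(size=(q, kl))
    Fl = rng.normal(size=(T, kl, kl))                # local matrix factors
    X = np.stack([Rg @ Fg[t] @ Cg.T + Rl @ Fl[t] @ Cl.T
                  + 0.1 * rng.normal(size=(p, q)) for t in range(T)])
    series.append(X)
print(series[0].shape)   # (T, p, q)
```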
The problem of finding a constant bound on a term given a set of assumptions has wide applications in optimization as well as program analysis. However, in many contexts the objective term may be unbounded. Still, some sort of symbolic bound may be useful. In this paper we introduce the optimal symbolic-bound synthesis problem, and a technique that tackles this problem for non-linear arithmetic with function symbols. This allows us to automatically produce symbolic bounds on complex arithmetic expressions from a set of both equality and inequality assumptions. Our solution employs a novel combination of powerful mathematical objects -- Gr\"obner bases together with polyhedral cones -- to represent an infinite set of implied inequalities. We obtain a sound symbolic bound by reducing the objective term by this infinite set. We implemented our method in a tool, AutoBound, which we tested on problems originating from real Solidity programs. We find that AutoBound yields relevant bounds in each case, matching or nearly matching upper bounds produced by a human analyst on the same set of programs.
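The reduction step at the heart of this approach can be illustrated with SymPy: compute a Gröbner basis of the equality assumptions and reduce the objective term modulo it. The polyhedral-cone treatment of inequality assumptions is omitted here, and the polynomials are our own toy example rather than one drawn from a Solidity program.

```python
import sympy as sp

# Sketch of the reduction step only: rewrite an objective term modulo a
# Groebner basis of the equality assumptions. The equalities and objective
# below are an illustrative example.

x, y, z = sp.symbols('x y z')
equalities = [x**2 + y - z, x*y - 1]       # assumed to equal zero
objective = x**3 + x*y**2 + y

G = sp.groebner(equalities, x, y, z, order='lex')
quotients, remainder = G.reduce(objective)
print("Groebner basis:", list(G))
print("objective reduces to:", sp.expand(remainder))
```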
Differential Privacy (DP) is a well-established framework for quantifying the privacy loss incurred by any algorithm. Traditional formulations impose a uniform privacy requirement for all users, which is often inconsistent with real-world scenarios in which users dictate their privacy preferences individually. This work considers the problem of mean estimation where each user can impose their own distinct privacy level. The algorithm we propose is shown to be minimax optimal and has a near-linear run-time. Our results reveal an interesting saturation phenomenon: the privacy requirements of the most stringent users dictate the overall error rates. As a consequence, users with less stringent but differing privacy requirements are all given more privacy than they require, in equal amounts. In other words, these privacy-indifferent users are given a nontrivial degree of privacy for free, without any sacrifice in the performance of the estimator.
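For orientation only, the sketch below shows a simple baseline in which each user adds Laplace noise calibrated to their own privacy level and the server forms an inverse-variance weighted average; this local-noise baseline is an assumption for illustration and is not the minimax-optimal estimator proposed in the work.

```python
import numpy as np

def heterogeneous_dp_mean(x, eps, clip=(0.0, 1.0), rng=None):
    """Baseline sketch: mean estimation when user i demands eps[i]-DP.

    Each user releases a clipped value plus Laplace noise calibrated to their
    own privacy level; the server takes an inverse-variance weighted average.
    This is an illustrative local-noise baseline, not the paper's estimator.
    """
    rng = np.random.default_rng() if rng is None else rng
    lo, hi = clip
    sens = hi - lo                                   # per-user sensitivity
    x = np.clip(np.asarray(x, dtype=float), lo, hi)
    eps = np.asarray(eps, dtype=float)
    noisy = x + rng.laplace(scale=sens / eps)        # eps[i]-DP release per user
    weights = 1.0 / (2.0 * (sens / eps) ** 2 + 0.25) # heuristic: noise var + data var bound
    return float(np.average(noisy, weights=weights))

rng = np.random.default_rng(0)
x = rng.uniform(size=1000)
eps = rng.choice([0.1, 1.0, 10.0], size=1000)        # heterogeneous privacy levels
print(heterogeneous_dp_mean(x, eps, rng=rng))
```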
Recently, a considerable literature has grown up around the theme of Graph Convolutional Networks (GCNs). How to effectively leverage the rich structural information in complex graphs, such as knowledge graphs with heterogeneous types of entities and relations, is a primary open challenge in the field. Most GCN methods are either restricted to graphs with a homogeneous type of edges (e.g., citation links only), or focus on representation learning for nodes only instead of jointly propagating and updating the embeddings of both nodes and edges for target-driven objectives. This paper addresses these limitations by proposing a novel framework, namely the Knowledge Embedding based Graph Convolutional Network (KE-GCN), which combines the power of GCNs in graph-based belief propagation with the strengths of advanced knowledge embedding (a.k.a. knowledge graph embedding) methods, and goes beyond them. Our theoretical analysis shows that KE-GCN offers an elegant unification of several well-known GCN methods as specific cases, with a new perspective on graph convolution. Experimental results on benchmark datasets show the advantageous performance of KE-GCN over strong baseline methods on the tasks of knowledge graph alignment and entity classification.
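A minimal sketch of the joint propagation idea, one message-passing layer that updates both entity and relation embeddings, is given below; the specific update rules are illustrative assumptions rather than the actual KE-GCN layer.

```python
import torch

def layer(ent, rel, triples, W_e, W_r):
    """One illustrative layer updating entity and relation embeddings jointly.

    ent: (n_e, d), rel: (n_r, d), triples: (m, 3) of (head, relation, tail).
    """
    h, r, t = triples[:, 0], triples[:, 1], triples[:, 2]
    msg = ent[h] + rel[r]                        # translation-style message
    new_ent = ent.clone()
    new_ent.index_add_(0, t, msg)                # aggregate messages at tail entities
    new_rel = rel.clone()
    new_rel.index_add_(0, r, ent[h] - ent[t])    # relations are also updated
    return torch.relu(new_ent @ W_e), torch.relu(new_rel @ W_r)

n_e, n_r, d = 50, 5, 16
ent, rel = torch.randn(n_e, d), torch.randn(n_r, d)
W_e, W_r = torch.randn(d, d) / d ** 0.5, torch.randn(d, d) / d ** 0.5
triples = torch.randint(0, n_e, (200, 3))
triples[:, 1] = torch.randint(0, n_r, (200,))
new_ent, new_rel = layer(ent, rel, triples, W_e, W_r)
print(new_ent.shape, new_rel.shape)
```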
Embedding models for deterministic Knowledge Graphs (KG) have been extensively studied, with the purpose of capturing latent semantic relations between entities and incorporating the structured knowledge into machine learning. However, many KGs model uncertain knowledge, typically representing the inherent uncertainty of relation facts with a confidence score, and embedding such uncertain knowledge remains an unresolved challenge. Capturing uncertain knowledge will benefit many knowledge-driven applications such as question answering and semantic search by providing a more natural characterization of the knowledge. In this paper, we propose a novel uncertain KG embedding model, UKGE, which aims to preserve both the structural and the uncertainty information of relation facts in the embedding space. Unlike previous models that characterize relation facts with binary classification techniques, UKGE learns embeddings according to the confidence scores of uncertain relation facts. To further enhance the precision of UKGE, we also introduce probabilistic soft logic to infer confidence scores for unseen relation facts during training. We propose and evaluate two variants of UKGE based on different learning objectives. Experiments are conducted on three real-world uncertain KGs via three tasks, i.e., confidence prediction, relation fact ranking, and relation fact classification. UKGE shows effectiveness in capturing uncertain knowledge, achieving promising results and consistently outperforming baselines on these tasks.
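The core training idea can be sketched as fitting a bounded plausibility score to the observed confidences; the DistMult-style scoring function, logistic squashing, and optimizer below are simplifications, and the probabilistic-soft-logic inference for unseen facts is omitted.

```python
import torch

# Sketch: score (head, relation, tail) with a DistMult-style plausibility,
# squash it to [0, 1], and regress it onto observed confidence scores.
# Scoring function, squashing, and optimizer are illustrative simplifications.

n_entities, n_relations, dim = 100, 10, 32
E = torch.nn.Embedding(n_entities, dim)
R = torch.nn.Embedding(n_relations, dim)
opt = torch.optim.Adam(list(E.parameters()) + list(R.parameters()), lr=1e-2)

# Toy uncertain triples: (head, relation, tail, confidence in [0, 1]).
heads = torch.randint(0, n_entities, (512,))
rels = torch.randint(0, n_relations, (512,))
tails = torch.randint(0, n_entities, (512,))
conf = torch.rand(512)

for _ in range(200):
    score = (E(heads) * R(rels) * E(tails)).sum(dim=-1)   # DistMult plausibility
    pred = torch.sigmoid(score)                            # map to [0, 1]
    loss = torch.nn.functional.mse_loss(pred, conf)        # fit the confidences
    opt.zero_grad()
    loss.backward()
    opt.step()
print(float(loss))
```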