In \cite[Serra, Vena, Extremal families for the Kruskal-Katona theorem]{sv21}, the authors have shown a characterization of the extremal families for the Kruskal-Katona Theorem. We further develop some of the arguments given in \cite{sv21} and give additional properties of these extremal families. F\"uredi-Griggs/M\"ors theorem from 1986/85 \cite{furgri86,mors85} claims that, for some cardinalities, the initial segment of the colexicographical is the unique extremal family; we extend their result as follows: the number of (non-isomorphic) extremal families strictly grows with the gap between the last two coefficients of the $k$-binomial decomposition. We also show that every family is an induced subfamily of an extremal family, and that, somewhat going in the opposite direction, every extremal family is close to being the inital segment of the colex order; namely, if the family is extremal, then after performing $t$ lower shadows, with $t=O(\log(\log n))$, we obtain the initial segment of the colexicographical order. We also give a ``fast'' algorithm to determine whether, for a given $t$ and $m$, there exists an extremal family of size $m$ for which its $t$-th lower shadow is not yet the initial segment in the colexicographical order. As a byproduct of these arguments, we give yet another characterization of the families of $k$-sets satisfying equality in the Kruskal--Katona theorem. Such characterization is, at first glance, less appealing than the one in \cite{sv21}, since the additional information that it provides is indirect. However, the arguments used to prove such characterization provide additional insight on the structure of the extremal families themselves.
Online computation is a concept to model uncertainty where not all information on a problem instance is known in advance. An online algorithm receives requests which reveal the instance piecewise and has to respond with irrevocable decisions. Often, an adversary is assumed that constructs the instance knowing the deterministic behavior of the algorithm. Thus, the adversary is able to tailor the input to any online algorithm. From a game theoretical point of view, the adversary and the online algorithm are players in an asymmetric two-player game. To overcome this asymmetry, the online algorithm is equipped with an isomorphic copy of the graph, which is referred to as unlabeled map. By applying the game theoretical perspective on online graph problems, where the solution is a subset of the vertices, we analyze the complexity of these online vertex subset games. For this, we introduce a framework for reducing online vertex subset games from TQBF. This framework is based on gadget reductions from 3-SATISFIABILITY to the corresponding offline problem. We further identify a set of rules for extending the 3-SATISFIABILITY-reduction and provide schemes for additional gadgets which assure that these rules are fulfilled. By extending the gadget reduction of the vertex subset problem with these additional gadgets, we obtain a reduction for the corresponding online vertex subset game. At last, we provide example reductions for online vertex subset games based on VERTEX COVER, INDEPENDENT SET, and DOMINATING SET, proving that they are PSPACE-complete. Thus, this paper establishes that the online version with a map of NP-complete vertex subset problems form a large class of PSPACE-complete problems.
In this work we connect two notions: That of the nonparametric mode of a probability measure, defined by asymptotic small ball probabilities, and that of the Onsager-Machlup functional, a generalized density also defined via asymptotic small ball probabilities. We show that in a separable Hilbert space setting and under mild conditions on the likelihood, modes of a Bayesian posterior distribution based upon a Gaussian prior exist and agree with the minimizers of its Onsager-Machlup functional and thus also with weak posterior modes. We apply this result to inverse problems and derive conditions on the forward mapping under which this variational characterization of posterior modes holds. Our results show rigorously that in the limit case of infinite-dimensional data corrupted by additive Gaussian or Laplacian noise, nonparametric maximum a posteriori estimation is equivalent to Tikhonov-Phillips regularization. In comparison with the work of Dashti, Law, Stuart, and Voss (2013), the assumptions on the likelihood are relaxed so that they cover in particular the important case of white Gaussian process noise. We illustrate our results by applying them to a severely ill-posed linear problem with Laplacian noise, where we express the maximum a posteriori estimator analytically and study its rate of convergence in the small noise limit.
We investigate online classification with paid stochastic experts. Here, before making their prediction, each expert must be paid. The amount that we pay each expert directly influences the accuracy of their prediction through some unknown Lipschitz "productivity" function. In each round, the learner must decide how much to pay each expert and then make a prediction. They incur a cost equal to a weighted sum of the prediction error and upfront payments for all experts. We introduce an online learning algorithm whose total cost after $T$ rounds exceeds that of a predictor which knows the productivity of all experts in advance by at most $\mathcal{O}(K^2(\log T)\sqrt{T})$ where $K$ is the number of experts. In order to achieve this result, we combine Lipschitz bandits and online classification with surrogate losses. These tools allow us to improve upon the bound of order $T^{2/3}$ one would obtain in the standard Lipschitz bandit setting. Our algorithm is empirically evaluated on synthetic data
We study first-order logic over unordered structures whose elements carry a finite number of data values from an infinite domain. Data values can be compared wrt.\ equality. As the satisfiability problem for this logic is undecidable in general, we introduce a family of local fragments. They restrict quantification to the neighbourhood of a given reference point that is bounded by some radius. Our first main result establishes decidability of the satisfiability problem for the local radius-1 fragment in presence of one ``diagonal relation''. On the other hand, extending the radius leads to undecidability. In a second part, we provide the precise decidability and complexity landscape of the satisfiability problem for the existential fragments of local logic, which are parameterized by the number of data values carried by each element and the radius of the considered neighbourhoods. Altogether, we draw a landscape of formalisms that are suitable for the specification of systems with data and open up new avenues for future research.
The chain graph model admits both undirected and directed edges in one graph, where symmetric conditional dependencies are encoded via undirected edges and asymmetric causal relations are encoded via directed edges. Though frequently encountered in practice, the chain graph model has been largely under investigated in literature, possibly due to the lack of identifiability conditions between undirected and directed edges. In this paper, we first establish a set of novel identifiability conditions for the Gaussian chain graph model, exploiting a low rank plus sparse decomposition of the precision matrix. Further, an efficient learning algorithm is built upon the identifiability conditions to fully recover the chain graph structure. Theoretical analysis on the proposed method is conducted, assuring its asymptotic consistency in recovering the exact chain graph structure. The advantage of the proposed method is also supported by numerical experiments on both simulated examples and a real application on the Standard & Poor 500 index data.
Game theory on graphs is a basic tool in computer science. In this paper, we propose a new game-theoretic framework for studying the privacy protection of a user who interactively uses a software service. Our framework is based on the idea that an objective of a user using software services should not be known to an adversary because the objective is often closely related to personal information of the user. We propose two new notions, O-indistinguishable strategy (O-IS) and objective-indistinguishability equilibrium (OIE). For a given game and a subset O of winning objectives (or objectives in short), a strategy of a player is O-indistinguishable if an adversary cannot shrink O by excluding any objective from O as an impossible objective. A strategy profile, which is a tuple of strategies of all players, is an OIE if the profile is locally maximal in the sense that no player can expand her set of objectives indistinguishable from her real objective from the viewpoint of an adversary. We show that for a given multiplayer game with Muller objectives, both of the existence of an O-IS and that of OIE are decidable.
The CONGEST and CONGEST-CLIQUE models have been carefully studied to represent situations where the communication bandwidth between processors in a network is severely limited. Messages of only $O(log(n))$ bits of information each may be sent between processors in each round. The quantum versions of these models allow the processors instead to communicate and compute with quantum bits under the same bandwidth limitations. This leads to the following natural research question: What problems can be solved more efficiently in these quantum models than in the classical ones? Building on existing work, we contribute to this question in two ways. Firstly, we present two algorithms in the Quantum CONGEST-CLIQUE model of distributed computation that succeed with high probability; one for producing an approximately optimal Steiner Tree, and one for producing an exact directed minimum spanning tree, each of which uses $\tilde{O}(n^{1/4})$ rounds of communication and $\tilde{O}(n^{9/4})$ messages, where $n$ is the number of nodes in the network. The algorithms thus achieve a lower asymptotic round and message complexity than any known algorithms in the classical CONGEST-CLIQUE model. At a high level, we achieve these results by combining classical algorithmic frameworks with quantum subroutines. An existing framework for using distributed version of Grover's search algorithm to accelerate triangle finding lies at the core of the asymptotic speedup. Secondly, we carefully characterize the constants and logarithmic factors involved in our algorithms as well as related algorithms, otherwise commonly obscured by $\tilde{O}$ notation. The analysis shows that some improvements are needed to render both our and existing related quantum and classical algorithms practical, as their asymptotic speedups only help for very large values of $n$.
We consider the problem of discovering $K$ related Gaussian directed acyclic graphs (DAGs), where the involved graph structures share a consistent causal order and sparse unions of supports. Under the multi-task learning setting, we propose a $l_1/l_2$-regularized maximum likelihood estimator (MLE) for learning $K$ linear structural equation models. We theoretically show that the joint estimator, by leveraging data across related tasks, can achieve a better sample complexity for recovering the causal order (or topological order) than separate estimations. Moreover, the joint estimator is able to recover non-identifiable DAGs, by estimating them together with some identifiable DAGs. Lastly, our analysis also shows the consistency of union support recovery of the structures. To allow practical implementation, we design a continuous optimization problem whose optimizer is the same as the joint estimator and can be approximated efficiently by an iterative algorithm. We validate the theoretical analysis and the effectiveness of the joint estimator in experiments.
This book develops an effective theory approach to understanding deep neural networks of practical relevance. Beginning from a first-principles component-level picture of networks, we explain how to determine an accurate description of the output of trained networks by solving layer-to-layer iteration equations and nonlinear learning dynamics. A main result is that the predictions of networks are described by nearly-Gaussian distributions, with the depth-to-width aspect ratio of the network controlling the deviations from the infinite-width Gaussian description. We explain how these effectively-deep networks learn nontrivial representations from training and more broadly analyze the mechanism of representation learning for nonlinear models. From a nearly-kernel-methods perspective, we find that the dependence of such models' predictions on the underlying learning algorithm can be expressed in a simple and universal way. To obtain these results, we develop the notion of representation group flow (RG flow) to characterize the propagation of signals through the network. By tuning networks to criticality, we give a practical solution to the exploding and vanishing gradient problem. We further explain how RG flow leads to near-universal behavior and lets us categorize networks built from different activation functions into universality classes. Altogether, we show that the depth-to-width ratio governs the effective model complexity of the ensemble of trained networks. By using information-theoretic techniques, we estimate the optimal aspect ratio at which we expect the network to be practically most useful and show how residual connections can be used to push this scale to arbitrary depths. With these tools, we can learn in detail about the inductive bias of architectures, hyperparameters, and optimizers.
The previous work for event extraction has mainly focused on the predictions for event triggers and argument roles, treating entity mentions as being provided by human annotators. This is unrealistic as entity mentions are usually predicted by some existing toolkits whose errors might be propagated to the event trigger and argument role recognition. Few of the recent work has addressed this problem by jointly predicting entity mentions, event triggers and arguments. However, such work is limited to using discrete engineering features to represent contextual information for the individual tasks and their interactions. In this work, we propose a novel model to jointly perform predictions for entity mentions, event triggers and arguments based on the shared hidden representations from deep learning. The experiments demonstrate the benefits of the proposed method, leading to the state-of-the-art performance for event extraction.