Many popular models from the networks literature can be viewed through a common lens of contingency tables on network dyads, resulting in \emph{log-linear ERGMs}: exponential family models for random graphs whose sufficient statistics are linear on the dyads. We propose a new model in this family, the \emph{$p_1$-SBM}, which combines node and group effects common in network formation mechanisms. In particular, it is a generalization of several well-known ERGMs including the stochastic blockmodel for undirected graphs, the degree-corrected version of it, and the directed $p_1$ model without group structure. We frame the problem of testing model fit for the log-linear ERGM class through an exact conditional test whose $p$-value can be approximated efficiently in networks of both small and moderately large sizes. The sampling methods we build rely on a dynamic adaptation of Markov bases. We use quick estimation algorithms adapted from the contingency table literature and effective sampling methods rooted in graph theory and algebraic statistics. The performance and scalability of the method is demonstrated on two data sets from biology: the connectome of \emph{C. elegans} and the interactome of \emph{Arabidopsis thaliana}. These two networks -- a neuronal network and a protein-protein interaction network -- have been popular examples in the network science literature. Our work provides a model-based approach to studying them.
An NP-complete graph decision problem, the "Multi-stage graph Simple Path" (abbr. MSP) problem, is introduced. The main contribution of this paper is a poly-time algorithm named the ZH algorithm for the problem together with the proof of its correctness, which implies NP=P. (1) A crucial structural property is discovered, whereby all MSP instances are arranged into the sequence $G_{0}$,$G_{1}$,$G_{2}$,... ($G_{k}$ essentially stands for a group of graphs for all $k\geq 0$). For each $G_{j}(j>0)$ in the sequence, there is a graph $G_{i}(0\leq i<j)$ "mathematically homomorphic" to $G_{j}$ which keeps completely accordant with $G_{j}$ on the existence of global solutions. This naturally provides a chance of applying mathematical induction for the proof of an algorithm. In previous attempts, algorithms used for making global decisions were mostly guided by heuristics and intuition. Rather, the ZH algorithm is dedicatedly designed to comply with the proposed proving framework of mathematical induction. (2) Although the ZH algorithm deals with paths, it always regards paths as a collection of edge sets. This is the key to the avoidance of exponential complexity. (3) Any poly-time algorithm that seeks global information can barely avoid the error caused by localized computation. In the ZH algorithm, the proposed reachable-path edge-set $R(e)$ and the computed information for it carry all necessary contextual information, which can be utilized to summarize the "history" and to detect the "future" for searching global solutions. (4) The relation between local strategies and global strategies is discovered and established, wherein preceding decisions can pose constraints to subsequent decisions (and vice versa). This interplay resembles the paradigm of dynamic programming, while much more convoluted. Nevertheless, the computation is always strait forward and decreases monotonically.
Recently developed reduced-order modeling techniques aim to approximate nonlinear dynamical systems on low-dimensional manifolds learned from data. This is an effective approach for modeling dynamics in a post-transient regime where the effects of initial conditions and other disturbances have decayed. However, modeling transient dynamics near an underlying manifold, as needed for real-time control and forecasting applications, is complicated by the effects of fast dynamics and nonnormal sensitivity mechanisms. To begin to address these issues, we introduce a parametric class of nonlinear projections described by constrained autoencoder neural networks in which both the manifold and the projection fibers are learned from data. Our architecture uses invertible activation functions and biorthogonal weight matrices to ensure that the encoder is a left inverse of the decoder. We also introduce new dynamics-aware cost functions that promote learning of oblique projection fibers that account for fast dynamics and nonnormality. To demonstrate these methods and the specific challenges they address, we provide a detailed case study of a three-state model of vortex shedding in the wake of a bluff body immersed in a fluid, which has a two-dimensional slow manifold that can be computed analytically. In anticipation of future applications to high-dimensional systems, we also propose several techniques for constructing computationally efficient reduced-order models using our proposed nonlinear projection framework. This includes a novel sparsity-promoting penalty for the encoder that avoids detrimental weight matrix shrinkage via computation on the Grassmann manifold.
Decentralized architecture offers a robust and flexible structure for online platforms, since centralized moderation and computation can be easy to disrupt with targeted attacks. However, a platform offering a decentralized architecture does not guarantee that users will use it in a decentralized way, and measuring the centralization of socio-technical networks is not an easy task. In this paper we introduce a method of characterizing community influence in terms of how many edges between communities would be disrupted by a community's removal. Our approach provides a careful definition of "centralization" appropriate in bipartite user-community socio-technical networks, and demonstrates the inadequacy of more trivial methods for interrogating centralization such as examining the distribution of community sizes. We use this method to compare the structure of multiple socio-technical platforms -- Mastodon, git code hosting servers, BitChute, Usenet, and Voat -- and find a range of structures, from interconnected but decentralized git servers to an effectively centralized use of Mastodon servers, as well as multiscale hybrid network structures of disconnected Voat subverses. As the ecosystem of socio-technical platforms diversifies, it becomes critical to not solely focus on the underlying technologies but also consider the structure of how users interact through the technical infrastructure.
Latent linear dynamical systems with Bernoulli observations provide a powerful modeling framework for identifying the temporal dynamics underlying binary time series data, which arise in a variety of contexts such as binary decision-making and discrete stochastic processes (e.g., binned neural spike trains). Here we develop a spectral learning method for fast, efficient fitting of probit-Bernoulli latent linear dynamical system (LDS) models. Our approach extends traditional subspace identification methods to the Bernoulli setting via a transformation of the first and second sample moments. This results in a robust, fixed-cost estimator that avoids the hazards of local optima and the long computation time of iterative fitting procedures like the expectation-maximization (EM) algorithm. In regimes where data is limited or assumptions about the statistical structure of the data are not met, we demonstrate that the spectral estimate provides a good initialization for Laplace-EM fitting. Finally, we show that the estimator provides substantial benefits to real world settings by analyzing data from mice performing a sensory decision-making task.
Despite there being significant work on developing spectral, and metric embedding based approximation algorithms for hypergraph generalizations of conductance, little is known regarding the approximability of hypergraph partitioning objectives beyond this. This work proposes algorithms for a general model of hypergraph partitioning that unifies both undirected and directed versions of many well-studied partitioning objectives. The first contribution of this paper introduces polymatroidal cut functions, a large class of cut functions amenable to approximation algorithms via metric embeddings and routing multicommodity flows. We demonstrate an $O(\sqrt{\log n})$-approximation, where $n$ is the number of vertices in the hypergraph, for these problems by rounding relaxations to metrics of negative-type. The second contribution of this paper generalizes the cut-matching game framework of Khandekar et. al. to tackle polymatroidal cut functions. This yields the first almost-linear time $O(\log n)$-approximation algorithm for standard versions of undirected and directed hypergraph partitioning. A technical consequence of our construction is that a cut-matching game which greatly relaxes the set of allowed actions for both players can be used to partition hypergraphs with negligible impact on the approximation ratio. We believe this to be of independent interest.
Diffusion models can be parameterised in terms of either a score or an energy function. The energy parameterisation has better theoretical properties, mainly that it enables an extended sampling procedure with a Metropolis--Hastings correction step, based on the change in total energy in the proposed samples. However, it seems to yield slightly worse performance, and more importantly, due to the widespread popularity of score-based diffusion, there are limited availability of off-the-shelf pre-trained energy-based ones. This limitation undermines the purpose of model composition, which aims to combine pre-trained models to sample from new distributions. Our proposal, however, suggests retaining the score parameterization and instead computing the energy-based acceptance probability through line integration of the score function. This allows us to re-use existing diffusion models and still combine the reverse process with various Markov-Chain Monte Carlo (MCMC) methods. We evaluate our method on a 2D experiment and find that it achieve similar or arguably better performance than the energy parameterisation.
In this paper, we propose a dynamic grouping negotiation model for climate mitigation based on real-world business and political negotiation protocols. Within the AI4GCC competition framework, we develop a three-stage process: group formation and updates, intra-group negotiation, and inter-group negotiation. Our model promotes efficient and effective cooperation between various stakeholders to achieve global climate change objectives. By implementing a group-forming method and group updating strategy, we address the complexities and imbalances in multi-region climate negotiations. Intra-group negotiations ensure that all members contribute to mitigation efforts, while inter-group negotiations use the proposal-evaluation framework to set mitigation and savings rates. We demonstrate our negotiation model within the RICE-N framework, illustrating a promising approach for facilitating international cooperation on climate change mitigation.
We study the consistent k-center clustering problem. In this problem, the goal is to maintain a constant factor approximate $k$-center solution during a sequence of $n$ point insertions and deletions while minimizing the recourse, i.e., the number of changes made to the set of centers after each point insertion or deletion. Previous works by Lattanzi and Vassilvitskii [ICML '12] and Fichtenberger, Lattanzi, Norouzi-Fard, and Svensson [SODA '21] showed that in the incremental setting, where deletions are not allowed, one can obtain $k \cdot \textrm{polylog}(n) / n$ amortized recourse for both $k$-center and $k$-median, and demonstrated a matching lower bound. However, no algorithm for the fully dynamic setting achieves less than the trivial $O(k)$ changes per update, which can be obtained by simply reclustering the full dataset after every update. In this work, we give the first algorithm for consistent $k$-center clustering for the fully dynamic setting, i.e., when both point insertions and deletions are allowed, and improves upon a trivial $O(k)$ recourse bound. Specifically, our algorithm maintains a constant factor approximate solution while ensuring worst-case constant recourse per update, which is optimal in the fully dynamic setting. Moreover, our algorithm is deterministic and is therefore correct even if an adaptive adversary chooses the insertions and deletions.
Classic algorithms and machine learning systems like neural networks are both abundant in everyday life. While classic computer science algorithms are suitable for precise execution of exactly defined tasks such as finding the shortest path in a large graph, neural networks allow learning from data to predict the most likely answer in more complex tasks such as image classification, which cannot be reduced to an exact algorithm. To get the best of both worlds, this thesis explores combining both concepts leading to more robust, better performing, more interpretable, more computationally efficient, and more data efficient architectures. The thesis formalizes the idea of algorithmic supervision, which allows a neural network to learn from or in conjunction with an algorithm. When integrating an algorithm into a neural architecture, it is important that the algorithm is differentiable such that the architecture can be trained end-to-end and gradients can be propagated back through the algorithm in a meaningful way. To make algorithms differentiable, this thesis proposes a general method for continuously relaxing algorithms by perturbing variables and approximating the expectation value in closed form, i.e., without sampling. In addition, this thesis proposes differentiable algorithms, such as differentiable sorting networks, differentiable renderers, and differentiable logic gate networks. Finally, this thesis presents alternative training strategies for learning with algorithms.
Unsupervised domain adaptation has recently emerged as an effective paradigm for generalizing deep neural networks to new target domains. However, there is still enormous potential to be tapped to reach the fully supervised performance. In this paper, we present a novel active learning strategy to assist knowledge transfer in the target domain, dubbed active domain adaptation. We start from an observation that energy-based models exhibit free energy biases when training (source) and test (target) data come from different distributions. Inspired by this inherent mechanism, we empirically reveal that a simple yet efficient energy-based sampling strategy sheds light on selecting the most valuable target samples than existing approaches requiring particular architectures or computation of the distances. Our algorithm, Energy-based Active Domain Adaptation (EADA), queries groups of targe data that incorporate both domain characteristic and instance uncertainty into every selection round. Meanwhile, by aligning the free energy of target data compact around the source domain via a regularization term, domain gap can be implicitly diminished. Through extensive experiments, we show that EADA surpasses state-of-the-art methods on well-known challenging benchmarks with substantial improvements, making it a useful option in the open world. Code is available at //github.com/BIT-DA/EADA.