Over three decades of scientific endeavors to realize programmable matter, a substance that can change its physical properties based on user input or responses to its environment, there have been many advances in both the engineering of modular robotic systems and the corresponding algorithmic theory of collective behavior. However, while the design of modular robots routinely addresses the challenges of realistic three-dimensional (3D) space, algorithmic theory remains largely focused on 2D abstractions such as planes and planar graphs. In this work, we formalize the 3D geometric space variant for the canonical amoebot model of programmable matter, using the face-centered cubic (FCC) lattice to represent space and define local spatial orientations. We then give a distributed algorithm for leader election in connected, contractible 2D or 3D geometric amoebot systems that deterministically elects exactly one leader in $\mathcal{O}(n)$ rounds under an unfair sequential adversary, where $n$ is the number of amoebots in the system. We then demonstrate how this algorithm can be transformed using the concurrency control framework for amoebot algorithms (DISC 2021) to obtain the first known amoebot algorithm, both in 2D and 3D space, to solve leader election under an unfair asynchronous adversary.
This paper deals with the derivation of Non-Intrusive Reduced Basis (NIRB) techniques for sensitivity analysis, more specifically the direct and adjoint state methods. For highly complex parametric problems, these two approaches may become too costly. To reduce computational times, Proper Orthogonal Decomposition (POD) and Reduced Basis Methods (RBMs) have already been investigated. The majority of these algorithms are however intrusive in the sense that the High-Fidelity (HF) code must be modified. To address this issue, non-intrusive strategies are employed. The NIRB two-grid method uses the HF code solely as a ``black-box'', requiring no code modification. Like other RBMs, it is based on an offline-online decomposition. The offline stage is time-consuming, but it is only executed once, whereas the online stage is significantly less expensive than an HF evaluation. In this paper, we propose new NIRB two-grid algorithms for both the direct and adjoint state methods. On the direct method, we prove on a classical model problem, the heat equation, that HF evaluations of sensitivities reach an optimal convergence rate in $L^{\infty}(0,T;H^1(\Omega))$, and then establish that these rates are recovered by the proposed NIRB approximation. These results are supported by numerical simulations. We then numerically demonstrate that a Gaussian process regression can be used to approximate the projection coefficients of the NIRB two-grid method. This further reduces the computational costs of the online step while only computing a coarse solution of the initial problem. All numerical results are run with the model problem as well as a more complex problem, namely the Brusselator system.
We improve the theoretical and empirical performance of neural-network(NN)-based active learning algorithms for the non-parametric streaming setting. In particular, we introduce two regret metrics by minimizing the population loss that are more suitable in active learning than the one used in state-of-the-art (SOTA) related work. Then, the proposed algorithm leverages the powerful representation of NNs for both exploitation and exploration, has the query decision-maker tailored for $k$-class classification problems with the performance guarantee, utilizes the full feedback, and updates parameters in a more practical and efficient manner. These careful designs lead to an instance-dependent regret upper bound, roughly improving by a multiplicative factor $O(\log T)$ and removing the curse of input dimensionality. Furthermore, we show that the algorithm can achieve the same performance as the Bayes-optimal classifier in the long run under the hard-margin setting in classification problems. In the end, we use extensive experiments to evaluate the proposed algorithm and SOTA baselines, to show the improved empirical performance.
Energy system modellers typically choose a low spatial resolution for their models based on administrative boundaries such as countries, which eases data collection and reduces computation times. However, a low spatial resolution can lead to sub-optimal investment decisions for wind and solar generation. Ignoring power grid bottlenecks within regions tends to underestimate system costs, while combining locations with different wind and solar capacity factors in the same resource class tends to overestimate costs. We investigate these two competing effects in a capacity expansion model for Europe's power system with a high share of renewables, taking advantage of newly-available high-resolution datasets as well as computational advances. We vary the number of nodes, interpolating between a 37-node model based on country and synchronous zone boundaries, and a 1024-node model based on the location of electricity substations. If we focus on the effect of renewable resource resolution and ignore network restrictions, we find that a higher resolution allows the optimal solution to concentrate wind and solar capacity at sites with better capacity factors and thus reduces system costs by up to 10% compared to a low resolution model. This results in a big swing from offshore to onshore wind investment. However, if we introduce grid bottlenecks by raising the network resolution, costs increase by up to 23% as generation has to be sourced more locally at sites with worse capacity factors. These effects are most pronounced in scenarios where grid expansion is limited, for example, by low local acceptance. We show that allowing grid expansion mitigates some of the effects of the low grid resolution, and lowers overall costs by around 16%.
Threshold guards are a basic primitive of many fault-tolerant algorithms that solve classical problems in distributed computing, such as reliable broadcast, two-phase commit, and consensus. Moreover, threshold guards can be found in recent blockchain algorithms such as, e.g., Tendermint consensus. In this article, we give an overview of techniques for automated verification of threshold-guarded fault-tolerant distributed algorithms, implemented in the Byzantine Model Checker (ByMC). These threshold-guarded algorithms have the following features: (1) up to $t$ of processes may crash or behave Byzantine; (2) the correct processes count messages and make progress when they receive sufficiently many messages, e.g., at least $t+1$; (3) the number $n$ of processes in the system is a parameter, as well as the number $t$ of faults; and (4) the parameters are restricted by a resilience condition, e.g., $n > 3t$. Traditionally, these algorithms were implemented in distributed systems with up to ten participating processes. Nowadays, they are implemented in distributed systems that involve hundreds or thousands of processes. To make sure that these algorithms are still correct for that scale, it is imperative to verify them for all possible values of the parameters.
In computational practice, most attention is paid to rational approximations of functions and approximations by the sum of exponents. We consider a wide enough class of nonlinear approximations characterized by a set of two required parameters. The approximating function is linear in the first parameter; these parameters are assumed to be positive. The individual terms of the approximating function represent a fixed function that depends nonlinearly on the second parameter. A numerical approximation minimizes the residual functional by approximating function values at individual points. The second parameter's value is set on a more extensive set of points of the interval of permissible values. The proposed approach's key feature consists in determining the first parameter on each separate iteration of the classical non-negative least squares method. The computational algorithm is used to rational approximate the function $x^{-\alpha}, \ 0 < \alpha < 1, \ x \geq 1$. The second example concerns the approximation of the stretching exponential function $\exp(- x^{\alpha} ), \ \ \quad 0 < \alpha < 1$ at $ x \geq 0$ by the sum of exponents.
This paper provides a framework to show the concentration of solutions $Y^*$ to convex minimizing problem where the objective function $\phi(X)(Y)$ depends on some random vector $X$ satisfying concentration of measure hypotheses. More precisely, the convex problem translates into a contractive fixed point equation that ensure the transmission of the concentration from $X$ to $Y^*$. This result is of central interest to characterize many machine learning algorithms which are defined through implicit equations (e.g., logistic regression, lasso, boosting, etc.). Based on our framework, we provide precise estimations for the first moments of the solution $Y^*$, when $X= (x_1,\ldots, x_n)$ is a data matrix of independent columns and $\phi(X)(y)$ writes as a sum $\frac{1}{n}\sum_{i=1}^n h_i(x_i^TY)$. That allows to describe the behavior and performance (e.g., generalization error) of a wide variety of machine learning classifiers.
Composition is a key feature of differential privacy. Well-known advanced composition theorems allow one to query a private database quadratically more times than basic privacy composition would permit. However, these results require that the privacy parameters of all algorithms be fixed before interacting with the data. To address this, Rogers et al. introduced fully adaptive composition, wherein both algorithms and their privacy parameters can be selected adaptively. The authors introduce two probabilistic objects to measure privacy in adaptive composition: privacy filters, which provide differential privacy guarantees for composed interactions, and privacy odometers, time-uniform bounds on privacy loss. There are substantial gaps between advanced composition and existing filters and odometers. First, existing filters place stronger assumptions on the algorithms being composed. Second, these odometers and filters suffer from large constants, making them impractical. We construct filters that match the tightness of advanced composition, including constants, despite allowing for adaptively chosen privacy parameters. En route we also derive a privacy filter for approximate zCDP and approximate RDP. We also construct several general families of odometers. These odometers can match the tightness of advanced composition at an arbitrary, preselected point in time, or at all points in time simultaneously, up to a doubly-logarithmic factor. We obtain our results by leveraging recent advances in time-uniform martingale concentration. In sum, we show that fully adaptive privacy is obtainable at almost no loss, and conjecture that our results are essentially unimprovable (even in constants) in general.
We consider in discrete time, a general class of sequential stochastic dynamic games with asymmetric information with the following features. The underlying system has Markovian dynamics controlled by the agents' joint actions. Each agent's instantaneous utility depends on the current system state and the agents' joint actions. At each time instant each agent makes a private noisy observation of the current system state and the agents' actions in the previous time instant. In addition, at each time instant all agents have a common noisy observation of the current system state and their actions in the previous time instant. Each agent's actions are part of his private information. The objective is to determine Bayesian Nash Equilibrium (BNE) strategy profiles that are based on a compressed version of the agents' information and can be sequentially computed; such BNE strategy profiles may not always exist. We present an approach/methodology that achieves the above-stated objective, along with an instance of a game where BNE strategy profiles with the above-mentioned characteristics exist. We show that the methodology also works for the case where the agents have no common observations.
Unsupervised domain adaptation has recently emerged as an effective paradigm for generalizing deep neural networks to new target domains. However, there is still enormous potential to be tapped to reach the fully supervised performance. In this paper, we present a novel active learning strategy to assist knowledge transfer in the target domain, dubbed active domain adaptation. We start from an observation that energy-based models exhibit free energy biases when training (source) and test (target) data come from different distributions. Inspired by this inherent mechanism, we empirically reveal that a simple yet efficient energy-based sampling strategy sheds light on selecting the most valuable target samples than existing approaches requiring particular architectures or computation of the distances. Our algorithm, Energy-based Active Domain Adaptation (EADA), queries groups of targe data that incorporate both domain characteristic and instance uncertainty into every selection round. Meanwhile, by aligning the free energy of target data compact around the source domain via a regularization term, domain gap can be implicitly diminished. Through extensive experiments, we show that EADA surpasses state-of-the-art methods on well-known challenging benchmarks with substantial improvements, making it a useful option in the open world. Code is available at //github.com/BIT-DA/EADA.
The remarkable practical success of deep learning has revealed some major surprises from a theoretical perspective. In particular, simple gradient methods easily find near-optimal solutions to non-convex optimization problems, and despite giving a near-perfect fit to training data without any explicit effort to control model complexity, these methods exhibit excellent predictive accuracy. We conjecture that specific principles underlie these phenomena: that overparametrization allows gradient methods to find interpolating solutions, that these methods implicitly impose regularization, and that overparametrization leads to benign overfitting. We survey recent theoretical progress that provides examples illustrating these principles in simpler settings. We first review classical uniform convergence results and why they fall short of explaining aspects of the behavior of deep learning methods. We give examples of implicit regularization in simple settings, where gradient methods lead to minimal norm functions that perfectly fit the training data. Then we review prediction methods that exhibit benign overfitting, focusing on regression problems with quadratic loss. For these methods, we can decompose the prediction rule into a simple component that is useful for prediction and a spiky component that is useful for overfitting but, in a favorable setting, does not harm prediction accuracy. We focus specifically on the linear regime for neural networks, where the network can be approximated by a linear model. In this regime, we demonstrate the success of gradient flow, and we consider benign overfitting with two-layer networks, giving an exact asymptotic analysis that precisely demonstrates the impact of overparametrization. We conclude by highlighting the key challenges that arise in extending these insights to realistic deep learning settings.