We study the problem of learning unknown parameters in stochastic interacting particle systems with polynomial drift, interaction and diffusion functions from the path of one single particle in the system. Our estimator is obtained by solving a linear system which is constructed by imposing appropriate conditions on the moments of the invariant distribution of the mean field limit and on the quadratic variation of the process. Our approach is easy to implement as it only requires the approximation of the moments via the ergodic theorem and the solution of a low-dimensional linear system. Moreover, we prove that our estimator is asymptotically unbiased in the limits of infinite data and infinite number of particles (mean field limit). In addition, we present several numerical experiments that validate the theoretical analysis and show the effectiveness of our methodology to accurately infer parameters in systems of interacting particles.
Multi-robot cooperative control has gained extensive research interest due to its wide applications in civil, security, and military domains. This paper proposes a cooperative control algorithm for multi-robot systems with general linear dynamics. The algorithm is based on distributed cooperative optimisation and output regulation, and it achieves global optimum by utilising only information shared among neighbouring robots. Technically, a high-level distributed optimisation algorithm for multi-robot systems is presented, which will serve as an optimal reference generator for each individual agent. Then, based on the distributed optimisation algorithm, an output regulation method is utilised to solve the optimal coordination problem for general linear dynamic systems. The convergence of the proposed algorithm is theoretically proved. Both numerical simulations and real-time physical robot experiments are conducted to validate the effectiveness of the proposed cooperative control algorithms.
A treatment policy defines when and what treatments are applied to affect some outcome of interest. Data-driven decision-making requires the ability to predict what happens if a policy is changed. Existing methods that predict how the outcome evolves under different scenarios assume that the tentative sequences of future treatments are fixed in advance, while in practice the treatments are determined stochastically by a policy and may depend for example on the efficiency of previous treatments. Therefore, the current methods are not applicable if the treatment policy is unknown or a counterfactual analysis is needed. To handle these limitations, we model the treatments and outcomes jointly in continuous time, by combining Gaussian processes and point processes. Our model enables the estimation of a treatment policy from observational sequences of treatments and outcomes, and it can predict the interventional and counterfactual progression of the outcome after an intervention on the treatment policy (in contrast with the causal effect of a single treatment). We show with real-world and semi-synthetic data on blood glucose progression that our method can answer causal queries more accurately than existing alternatives.
In the present paper, we study a Crouzeix-Raviart approximation of the obstacle problem, which imposes the obstacle constraint in the midpoints (i.e., barycenters) of the elements of a triangulation. We establish a priori error estimates imposing natural regularity assumptions, which are optimal, and the reliability and efficiency of a primal-dual type a posteriori error estimator for general obstacles and involving data oscillation terms stemming only from the right-hand side. The theoretical findings are supported by numerical experiments.
A complex system with cluttered observations may be a coupled mixture of multiple simple sub-systems corresponding to latent entities. Such sub-systems may hold distinct dynamics in the continuous-time domain; therein, complicated interactions between sub-systems also evolve over time. This setting is fairly common in the real world but has been less considered. In this paper, we propose a sequential learning approach under this setting by decoupling a complex system for handling irregularly sampled and cluttered sequential observations. Such decoupling brings about not only subsystems describing the dynamics of each latent entity but also a meta-system capturing the interaction between entities over time. Specifically, we argue that the meta-system evolving within a simplex is governed by projected differential equations (ProjDEs). We further analyze and provide neural-friendly projection operators in the context of Bregman divergence. Experimental results on synthetic and real-world datasets show the advantages of our approach when facing complex and cluttered sequential data compared to the state-of-the-art.
This work introduces a general framework for establishing the long time accuracy for approximations of Markovian dynamical systems on separable Banach spaces. Our results illuminate the role that a certain uniformity in Wasserstein contraction rates for the approximating dynamics bears on long time accuracy estimates. In particular, our approach yields weak consistency bounds on $\mathbb{R}^+$ while providing a means to sidestepping a commonly occurring situation where certain higher order moment bounds are unavailable for the approximating dynamics. Additionally, to facilitate the analytical core of our approach, we develop a refinement of certain `weak Harris theorems'. This extension expands the scope of applicability of such Wasserstein contraction estimates to a variety of interesting SPDE examples involving weaker dissipation or stronger nonlinearity than would be covered by the existing literature. As a guiding and paradigmatic example, we apply our formalism to the stochastic 2D Navier-Stokes equations and to a semi-implicit in time and spectral Galerkin in space numerical approximation of this system. In the case of a numerical approximation, we establish quantitative estimates on the approximation of invariant measures as well as prove weak consistency on $\mathbb{R}^+$. To develop these numerical analysis results, we provide a refinement of $L^2_x$ accuracy bounds in comparison to the existing literature which are results of independent interest.
In neural networks, continual learning results in gradient interference among sequential tasks, leading to catastrophic forgetting of old tasks while learning new ones. This issue is addressed in recent methods by storing the important gradient spaces for old tasks and updating the model orthogonally during new tasks. However, such restrictive orthogonal gradient updates hamper the learning capability of the new tasks resulting in sub-optimal performance. To improve new learning while minimizing forgetting, in this paper we propose a Scaled Gradient Projection (SGP) method, where we combine the orthogonal gradient projections with scaled gradient steps along the important gradient spaces for the past tasks. The degree of gradient scaling along these spaces depends on the importance of the bases spanning them. We propose an efficient method for computing and accumulating importance of these bases using the singular value decomposition of the input representations for each task. We conduct extensive experiments ranging from continual image classification to reinforcement learning tasks and report better performance with less training overhead than the state-of-the-art approaches.
Nonparametric estimation for semilinear SPDEs, namely stochastic reaction-diffusion equations in one space dimension, is studied. We consider observations of the solution field on a discrete grid in time and space with infill asymptotics in both coordinates. Firstly, we derive a nonparametric estimator for the reaction function of the underlying equation. The estimate is chosen from a finite-dimensional function space based on a least squares criterion. Oracle inequalities provide conditions for the estimator to achieve the usual nonparametric rate of convergence. Adaptivity is provided via model selection. Secondly, we show that the asymptotic properties of realized quadratic variation based estimators for the diffusivity and volatility carry over from linear SPDEs. In particular, we obtain a rate-optimal joint estimator of the two parameters. The result relies on our precise analysis of the H\"older regularity of the solution process and its nonlinear component, which may be of its own interest. Both steps of the calibration can be carried out simultaneously without prior knowledge of the parameters.
In the Colored Clustering problem, one is asked to cluster edge-colored (hyper-)graphs whose colors represent interaction types. More specifically, the goal is to select as many edges as possible without choosing two edges that share an endpoint and are colored differently. Equivalently, the goal can also be described as assigning colors to the vertices in a way that fits the edge-coloring as well as possible. As this problem is NP-hard, we build on previous work by studying its parameterized complexity. We give a $2^{\mathcal O(k)} \cdot n^{\mathcal O(1)}$-time algorithm where $k$ is the number of edges to be selected and $n$ the number of vertices. We also prove the existence of a problem kernel of size $\mathcal O(k^{5/2} )$, resolving an open problem posed in the literature. We consider parameters that are smaller than $k$, the number of edges to be selected, and $r$, the number of edges that can be deleted. Such smaller parameters are obtained by considering the difference between $k$ or $r$ and some lower bound on these values. We give both algorithms and lower bounds for Colored Clustering with such parameterizations. Finally, we settle the parameterized complexity of Colored Clustering with respect to structural graph parameters by showing that it is $W[1]$-hard with respect to both vertex cover number and tree-cut width, but fixed-parameter tractable with respect to slim tree-cut width.
In recent years, several swarm intelligence optimization algorithms have been proposed to be applied for solving a variety of optimization problems. However, the values of several hyperparameters should be determined. For instance, although Particle Swarm Optimization (PSO) has been applied for several applications with higher optimization performance, the weights of inertial velocity, the particle's best known position and the swarm's best known position should be determined. Therefore, this study proposes an analytic framework to analyze the optimized average-fitness-function-value (AFFV) based on mathematical models for a variety of fitness functions. Furthermore, the optimized hyperparameter values could be determined with a lower AFFV for minimum cases. Experimental results show that the hyperparameter values from the proposed method can obtain higher efficiency convergences and lower AFFVs.
This paper focuses on the expected difference in borrower's repayment when there is a change in the lender's credit decisions. Classical estimators overlook the confounding effects and hence the estimation error can be magnificent. As such, we propose another approach to construct the estimators such that the error can be greatly reduced. The proposed estimators are shown to be unbiased, consistent, and robust through a combination of theoretical analysis and numerical testing. Moreover, we compare the power of estimating the causal quantities between the classical estimators and the proposed estimators. The comparison is tested across a wide range of models, including linear regression models, tree-based models, and neural network-based models, under different simulated datasets that exhibit different levels of causality, different degrees of nonlinearity, and different distributional properties. Most importantly, we apply our approaches to a large observational dataset provided by a global technology firm that operates in both the e-commerce and the lending business. We find that the relative reduction of estimation error is strikingly substantial if the causal effects are accounted for correctly.