In Coevolving Latent Space Networks with Attractors (CLSNA) models, nodes in a latent space represent social actors, and edges indicate their dynamic interactions. Attractors are added at the latent level to capture the notion of attractive and repulsive forces between nodes, borrowing from dynamical systems theory. However, CLSNA's reliance on MCMC estimation makes scaling difficult, and the requirement that nodes be present throughout the study period limits practical applications. We address these issues by (i) introducing a stochastic gradient descent (SGD) parameter estimation method, (ii) developing a novel approach for uncertainty quantification using SGD, and (iii) extending the model to allow nodes to join and leave over time. Simulation results show that our extensions lose little accuracy compared to MCMC but scale to much larger networks. We apply our approach to the longitudinal social networks of members of the US Congress on the social media platform X. Accounting for node dynamics overcomes selection bias in the network and uncovers uniquely and increasingly repulsive forces within the Republican Party.
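To make the SGD idea concrete, here is a minimal sketch of stochastic gradient ascent for a static latent-distance network model; the CLSNA attractor terms, temporal dynamics, and uncertainty quantification are all omitted, and the logistic link, intercept, and step size below are our assumptions, not the paper's specification.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy latent-distance model: P(edge i~j) = sigmoid(1 - ||z_i - z_j||^2).
n, d, steps, lr = 30, 2, 20000, 0.05
Z_true = rng.normal(size=(n, d))
logits = 1.0 - ((Z_true[:, None, :] - Z_true[None, :, :]) ** 2).sum(-1)
A = (rng.uniform(size=(n, n)) < 1.0 / (1.0 + np.exp(-logits))).astype(float)
np.fill_diagonal(A, 0.0)  # kept directed for simplicity of the toy example

Z = rng.normal(scale=0.1, size=(n, d))  # latent positions to estimate
for _ in range(steps):
    i, j = rng.integers(n), rng.integers(n)
    if i == j:
        continue
    diff = Z[i] - Z[j]
    p = 1.0 / (1.0 + np.exp(-(1.0 - diff @ diff)))
    g = (A[i, j] - p) * (-2.0 * diff)  # d log-likelihood / d Z[i] for this dyad
    Z[i] += lr * g                     # one SGD (ascent) step per sampled dyad
    Z[j] -= lr * g                     # symmetric update for the other endpoint
```

Each step touches a single dyad, which is why this style of estimator scales where a full-likelihood MCMC sweep does not.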
Reinforcement learning (RL) with continuous state and action spaces remains one of the most challenging problems in the field. Most current learning methods focus on integral identities, such as value functions, to derive an optimal strategy for the learning agent. In this paper, we instead study the dual form of the original RL formulation and propose the first differential RL framework that can handle settings with limited training samples and short-length episodes. Our approach introduces Differential Policy Optimization (DPO), a pointwise and stage-wise iteration method that optimizes policies encoded by local-movement operators. We prove a pointwise convergence estimate for DPO and provide a regret bound comparable with those in current theoretical work. This pointwise estimate ensures that the learned policy matches the optimal path uniformly across different steps. We then apply DPO to a class of practical RL problems that search for optimal configurations with Lagrangian rewards. DPO is easy to implement and scalable, and it shows competitive results in benchmarking experiments against several popular RL methods.
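The "pointwise and stage-wise" iteration can be caricatured as follows; this is a generic local-movement update on a toy Lagrangian-style reward of our own choosing, not the paper's DPO algorithm, and the reward, step size, and horizon are all assumptions.

```python
import numpy as np

def reward(x):                   # toy Lagrangian-style reward (our assumption)
    return -np.sum((x - 1.0) ** 2)

def grad_reward(x):
    return -2.0 * (x - 1.0)

# Schematic stage-wise loop: the "policy" is the local-movement operator
# x -> x + eta * grad_reward(x), applied pointwise at each stage.
T, eta = 50, 0.1
x = np.zeros(3)                  # initial configuration
for t in range(T):
    x = x + eta * grad_reward(x)
print(x)                         # approaches the optimal configuration [1, 1, 1]
```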
Unmanned Surface Vehicles (USVs) play a pivotal role in various applications, including surface rescue, commercial transactions, scientific exploration, and military operations. Effective control of high-speed USVs is a critical aspect of the overall USV system, particularly in challenging environments marked by complex surface obstacles and dynamic conditions such as time-varying surges, non-directional forces, and unpredictable winds. In this paper, we propose a data-driven control method based on Koopman theory, which constructs a high-dimensional linear model by mapping a low-dimensional nonlinear model to a higher-dimensional linear space through data identification. The observable USV dynamical system is then reconstructed online using error learning. To enhance tracking control accuracy, we employ a Control Lyapunov Function (CLF)-Control Barrier Function (CBF)-Quadratic Programming (QP) approach to regulate the identified high-dimensional linear dynamical system. This approach facilitates error compensation, thereby achieving more precise tracking control.
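As a minimal illustration of the Koopman lifting step (our simplification; the paper's full identification, online error learning, and CLF-CBF-QP pipeline are not reproduced), the sketch below fits a linear model in a lifted observable space by least squares, EDMD-style; the dictionary and toy dynamics are our choices.

```python
import numpy as np

def psi(x):  # polynomial observable dictionary (an assumption)
    return np.array([1.0, x[0], x[1], x[0] ** 2, x[0] * x[1], x[1] ** 2])

f = lambda x: np.array([x[1], -0.5 * x[0] - 0.1 * x[1] ** 3])  # toy dynamics

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))               # sampled states
Y = X + 0.01 * np.array([f(x) for x in X])  # one Euler step ahead

# Fit K by least squares so Psi_X @ K ~ Psi_Y, i.e. psi(x_next) ~ K.T @ psi(x):
Psi_X = np.array([psi(x) for x in X])
Psi_Y = np.array([psi(y) for y in Y])
K, *_ = np.linalg.lstsq(Psi_X, Psi_Y, rcond=None)  # lifted linear dynamics
```

Once a linear lifted model is available, standard linear tools (such as the QP-based safety filters mentioned above) can be applied to the high-dimensional system.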
We argue that the current proof-of-work (PoW) based consensus algorithm of the Bitcoin network suffers from a fundamental economic discrepancy between the real-world transaction (txn) costs incurred by miners and the wealth being transacted. Put simply, whether one transacts 1 satoshi or 1 bitcoin, the same amount of electricity is needed to include the txn in a block. The notorious problems of the Bitcoin blockchain, such as its high energy usage per txn and its scalability issues, are, either partially or fully, consequences of this fundamental economic inconsistency. We propose making the computational cost of securing a txn proportional to the wealth being transferred, at least temporarily. First, we present a simple incentive-based model of Bitcoin's security. Then, guided by this model, we augment each txn with two parameters, one controlling the time spent securing the txn and the other determining the fraction of the network used to accomplish this. Current Bitcoin txns embed naturally into this parametrized space. We then introduce a sequence of hierarchical block structures (HBSs) containing these parametrized txns. The first of these HBSs exploits only a single degree of freedom of the extended txn, namely the time investment, but it already allows for txns with a variable level of trust together with aligned network fees and energy usage. In principle, the last HBS should scale to tens of thousands of timely txns per second while preserving what the previous HBSs achieved. We also propose a simple homotopy-based transition mechanism that enables us to introduce new HBSs into the existing blockchain relatively safely and continuously. Our approach is constructive and as rigorous as possible, and we attempt to analyze all aspects of these developments, at least at a conceptual level. The process is supported by an evaluation on recent transaction data.
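The two extra txn parameters can be pictured with a small, purely hypothetical data structure; the field names and default values below are ours for illustration and are not part of any Bitcoin protocol specification.

```python
from dataclasses import dataclass

@dataclass
class ParametrizedTxn:
    amount_sat: int          # wealth being transferred, in satoshis
    time_param: float        # time spent securing this txn (degree of freedom 1)
    network_fraction: float  # fraction of the network securing it (degree of freedom 2)

# A current Bitcoin txn embeds naturally as the point where both parameters
# take their full default values.
legacy = ParametrizedTxn(amount_sat=100_000, time_param=1.0, network_fraction=1.0)
```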
Many analyses of multivariate data focus on evaluating the dependence between two sets of variables, rather than the dependence among individual variables within each set. Canonical correlation analysis (CCA) is a classical data analysis technique that estimates parameters describing the dependence between such sets. However, inference procedures based on traditional CCA rely on the assumption that all variables are jointly normally distributed. We present a semiparametric approach to CCA in which the multivariate margins of each variable set may be arbitrary, but the dependence between variable sets is described by a parametric model that provides low-dimensional summaries of dependence. While maximum likelihood estimation in the proposed model is intractable, we propose two estimation strategies: one using a pseudolikelihood for the model and one using a Markov chain Monte Carlo (MCMC) algorithm that provides Bayesian estimates and confidence regions for the between-set dependence parameters. The MCMC algorithm is derived from a multirank likelihood function, which uses only part of the information in the observed data in exchange for being free of assumptions about the multivariate margins. We apply the proposed Bayesian inference procedure to Brazilian climate data and monthly stock returns from the materials and communications market sectors.
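For reference, classical CCA itself (whose joint-normality assumption the semiparametric model above removes) reduces to computing the singular values of the product of orthonormal bases for the two centered variable sets; a minimal sketch with synthetic data sharing a common factor follows.

```python
import numpy as np

def cca(X, Y):
    """Canonical correlations via QR + SVD: singular values of Qx.T @ Qy."""
    X = X - X.mean(0)
    Y = Y - Y.mean(0)
    qx, _ = np.linalg.qr(X)
    qy, _ = np.linalg.qr(Y)
    return np.linalg.svd(qx.T @ qy, compute_uv=False)

rng = np.random.default_rng(1)
Z = rng.normal(size=(500, 2))                           # shared latent factor
X = Z @ rng.normal(size=(2, 4)) + rng.normal(size=(500, 4))
Y = Z @ rng.normal(size=(2, 3)) + rng.normal(size=(500, 3))
print(cca(X, Y))  # leading correlations reflect the shared factor Z
```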
We provide full theoretical guarantees for the convergence behaviour of diffusion-based generative models under the assumption of strongly log-concave data distributions, while our approximating class of functions used for score estimation consists of Lipschitz continuous functions. We demonstrate the power of our approach via a motivating example: sampling from a Gaussian distribution with unknown mean. In this case, explicit estimates are provided for the associated optimization problem, i.e., score approximation, and these are combined with the corresponding sampling estimates. As a result, we obtain the best known upper bounds, in terms of key quantities of interest such as the dimension and the rate of convergence, for the Wasserstein-2 distance between the data distribution (Gaussian with unknown mean) and the output of our sampling algorithm. Beyond the motivating example, and in order to allow for the use of a diverse range of stochastic optimizers, we present our results under an $L^2$-accurate score estimation assumption, which crucially is formed under an expectation with respect to the stochastic optimizer and our novel auxiliary process that uses only known information. This approach yields the best known convergence rate for our sampling algorithm.
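For intuition on the motivating example: the score of $N(\mu, I)$ is $s(x) = -(x - \mu)$, so score estimation reduces to estimating the mean. The sketch below (our simplification, using unadjusted Langevin dynamics rather than the paper's diffusion sampler, with step size and iteration count chosen arbitrarily) samples with the estimated score.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 5, 1000
mu = rng.normal(size=d)
data = mu + rng.normal(size=(n, d))
mu_hat = data.mean(0)            # estimating mu estimates the score s(x) = -(x - mu)

# Unadjusted Langevin dynamics: x <- x + h * s(x) + sqrt(2h) * noise
x, h = np.zeros(d), 0.1
for _ in range(500):
    x = x + h * (-(x - mu_hat)) + np.sqrt(2 * h) * rng.normal(size=d)
print(np.linalg.norm(x - mu))    # x is approximately a draw from N(mu_hat, I)
```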
The use of ML models to predict a user's cognitive state from behavioral data has been studied for various applications, including predicting the intent to perform selections in VR. We developed a novel technique that uses gaze-based intent models to adapt dwell-time thresholds and thereby aid gaze-only selection. A dataset of users performing selections in arithmetic tasks was used to develop intent prediction models (F1 = 0.94). We developed GazeIntent to adapt selection dwell times based on intent model outputs and conducted an end-user study with returning and new users performing additional tasks with varied selection frequencies. Personalized models for returning users effectively accounted for prior experience and were preferred by 63% of users. Our work provides the field with methods to adapt dwell-based selection to users, account for experience over time, and consider tasks that vary by selection frequency.
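One plausible form of the adaptation (our illustration; the abstract does not specify GazeIntent's exact mapping, and the threshold bounds below are invented) is to interpolate the dwell threshold from the intent model's output probability.

```python
def adapted_dwell_ms(p_intent: float, min_ms: float = 200, max_ms: float = 1200) -> float:
    """Interpolate the dwell threshold from the intent model's probability:
    high predicted intent shortens the dwell needed to confirm a selection."""
    assert 0.0 <= p_intent <= 1.0
    return max_ms - p_intent * (max_ms - min_ms)

print(adapted_dwell_ms(0.9))  # confident intent -> short dwell (300 ms)
print(adapted_dwell_ms(0.1))  # low intent -> long dwell (1100 ms)
```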
Random utility maximisation (RUM) models are one of the cornerstones of discrete choice modelling. However, specifying the utility function of RUM models is not straightforward and has a considerable impact on the resulting interpretable outcomes and welfare measures. In this paper, we propose a new discrete choice model based on artificial neural networks (ANNs), named the "Alternative-Specific and Shared weights Neural Network (ASS-NN)", which balances flexible utility approximation from the data with consistency with two assumptions: RUM theory and the fungibility of money (i.e., "one euro is one euro"). The ASS-NN can therefore derive economically consistent outcomes, such as marginal utilities or willingness to pay, without explicitly specifying the utility functional form. Using a Monte Carlo experiment and empirical data from the Swissmetro dataset, we show that the ASS-NN outperforms (in terms of goodness of fit) conventional multinomial logit (MNL) models under different utility specifications. Furthermore, we show how the ASS-NN can be used to derive marginal utility and willingness-to-pay measures.
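A schematic of the alternative-specific / shared weights idea, as we read it from the abstract (not the authors' exact architecture; layer sizes are arbitrary): each alternative gets its own subnetwork for its attributes, while cost enters through a single shared linear term so that "one euro is one euro" across alternatives.

```python
import torch
import torch.nn as nn

class ASSNNSketch(nn.Module):
    def __init__(self, n_alts: int, n_attrs: int):
        super().__init__()
        # Alternative-specific subnetworks for non-monetary attributes.
        self.specific = nn.ModuleList(
            nn.Sequential(nn.Linear(n_attrs, 8), nn.ReLU(), nn.Linear(8, 1))
            for _ in range(n_alts)
        )
        self.beta_cost = nn.Parameter(torch.tensor(-1.0))  # shared cost weight

    def forward(self, attrs, cost):
        # attrs: (batch, n_alts, n_attrs); cost: (batch, n_alts)
        v = torch.cat([net(attrs[:, j]) for j, net in enumerate(self.specific)], dim=1)
        return v + self.beta_cost * cost  # utilities; softmax gives choice probs

model = ASSNNSketch(n_alts=3, n_attrs=4)
probs = torch.softmax(model(torch.randn(2, 3, 4), torch.rand(2, 3)), dim=1)
```

With a shared cost coefficient, willingness-to-pay ratios follow directly by dividing marginal utilities of attributes by `beta_cost`.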
In a Jacobi--Davidson (JD) type method for singular value decomposition (SVD) problems, called JDSVD, a large symmetric and generally indefinite correction equation is approximately solved iteratively at each outer iteration; these inner iterations dominate the overall efficiency of JDSVD. In this paper, a convergence analysis is made of the minimal residual (MINRES) method for the correction equation. Motivated by the results obtained, a preconditioned correction equation is derived that extracts useful information from the current searching subspaces to construct effective preconditioners, and it is proven to retain the same outer-iteration convergence as JDSVD. The resulting method is called the inner-preconditioned JDSVD (IPJDSVD) method. The convergence results show that MINRES on the preconditioned correction equation converges much faster when there is a cluster of singular values closest to a given target, so that IPJDSVD is more efficient than JDSVD. A new thick-restart IPJDSVD algorithm with deflation and purgation is proposed that simultaneously accelerates the outer and inner convergence of the standard thick-restart JDSVD and computes several singular triplets of a large matrix. Numerical experiments justify the theory and illustrate the considerable superiority of IPJDSVD over JDSVD.
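As a toy illustration of the inner solves discussed above, the sketch below applies SciPy's MINRES to a random symmetric indefinite system standing in for the JDSVD correction equation; the matrix, its spectrum, and the problem size are our choices, and no JD-style projection or preconditioning is included.

```python
import numpy as np
from scipy.sparse.linalg import minres

rng = np.random.default_rng(0)
n = 200
Q, _ = np.linalg.qr(rng.normal(size=(n, n)))
A = Q @ np.diag(np.linspace(-1.0, 1.0, n)) @ Q.T  # symmetric, indefinite
b = rng.normal(size=n)

x, info = minres(A, b)                 # MINRES handles indefinite symmetric A
print(info, np.linalg.norm(A @ x - b)) # info == 0 signals convergence
```

The convergence speed of this inner solve depends on the spectrum of the (preconditioned) operator, which is exactly what the clustering argument above exploits.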
Graph Neural Networks (GNNs) have achieved state-of-the-art performance on graph classification tasks. However, most GNN architectures aggregate information from all nodes and edges in a graph, regardless of their relevance to the task at hand, thus hindering the interpretability of their predictions. In contrast to prior work, in this paper we propose a GNN \emph{training} approach that jointly i) finds the most predictive subgraph by removing edges and/or nodes -- \emph{without making assumptions about the subgraph structure} -- while ii) optimizing the performance of the graph classification task. To that end, we rely on reinforcement learning to solve the resulting bi-level optimization, with a reward function based on conformal predictions to account for the current in-training uncertainty of the classifier. Our empirical results on nine different graph classification datasets show that our method competes in performance with baselines while relying on significantly sparser subgraphs, leading to more interpretable GNN-based predictions.
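One plausible conformal-based reward (our illustration, not the paper's exact function): with split-conformal nonconformity scores $s(x, y) = 1 - p_y$ and a calibration quantile $\hat{q}$, reward the subgraph policy when the classifier's prediction set on the pruned graph is small yet still covers the true label.

```python
import numpy as np

def conformal_reward(probs: np.ndarray, y_true: int, qhat: float) -> float:
    """probs: class probabilities on the pruned graph; qhat: conformal quantile
    of nonconformity scores (1 - p_true) from a held-out calibration split."""
    pred_set = np.flatnonzero(1.0 - probs <= qhat)  # conformal prediction set
    if y_true not in pred_set:
        return 0.0
    return 1.0 / len(pred_set)  # sharper (smaller) covering sets earn more reward

print(conformal_reward(np.array([0.7, 0.2, 0.1]), y_true=0, qhat=0.5))  # -> 1.0
```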
Quantized tensor trains (QTTs) have recently emerged as a framework for the numerical discretization of continuous functions, with the potential for widespread applications in numerical analysis. However, the theory of QTT approximation is not fully understood. In this work, we advance this theory from the point of view of multiscale polynomial interpolation. This perspective clarifies why QTT ranks decay with increasing depth, quantitatively controls QTT rank in terms of smoothness of the target function, and explains why certain functions with sharp features and poor quantitative smoothness can still be well approximated by QTTs. The perspective also motivates new practical and efficient algorithms for the construction of QTTs from function evaluations on multiresolution grids.
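A minimal QTT construction for intuition (the standard TT-SVD applied to a quantized 1-D function; the grid size, truncation tolerance, and test function are our choices, and the multiresolution-grid algorithms mentioned above are not shown): sample $f$ on a dyadic grid of $2^d$ points, reshape into a $(2, 2, \ldots, 2)$ tensor, and compress with successive truncated SVDs.

```python
import numpy as np

d, tol = 10, 1e-10
x = np.linspace(0.0, 1.0, 2 ** d, endpoint=False)
tensor = np.cos(2 * np.pi * x).reshape((2,) * d)  # quantized representation

# TT-SVD: peel off one binary index at a time with a truncated SVD.
cores, ranks, mat, r = [], [1], tensor.reshape(1, -1), 1
for k in range(d - 1):
    mat = mat.reshape(r * 2, -1)
    u, s, vt = np.linalg.svd(mat, full_matrices=False)
    keep = max(1, int((s > tol * s[0]).sum()))       # truncated QTT rank
    cores.append(u[:, :keep].reshape(r, 2, keep))
    mat = s[:keep, None] * vt[:keep]
    r = keep
    ranks.append(r)
cores.append(mat.reshape(r, 2, 1))
print(ranks)  # low, depth-uniform ranks reflect the smoothness of cos
```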