An appropriate level of arousal induces positive emotions, whereas a high arousal potential may provoke negative emotions. To explain the effect of arousal on emotional valence, we propose a novel mathematical framework of arousal potential variations in the dual process of human cognition: automatic and controlled. A suitable mathematical formulation explaining emotions in the dual process has so far been absent. Our model associates free energy with arousal potential and uses its variations to explain emotional valence: decreasing and increasing free energy induce positive and negative emotions, respectively. We formalize the transition from the automatic to the controlled process as a change of Bayesian prior. Further, we model emotional valence using the free energy increase (FI) when one tries to change one's Bayesian prior and the free energy reduction (FR) when one succeeds in recognizing the same stimuli with the changed prior, and we define three emotions, "interest," "confusion," and "boredom," using these variations. The results of our mathematical analysis comparing various Gaussian model parameters reveal the following: 1) prediction error (PR) increases FR (representing "interest") when the first prior variance is greater than the second prior variance, 2) PR decreases FR when the first prior variance is less than the second prior variance, and 3) the distance between the priors' means always increases FR. We also discuss the association of these outcomes with emotions in the controlled process. The proposed mathematical model provides a general framework for predicting and controlling emotional valence in the dual process, which varies with viewpoint and stimuli, and for understanding the contradictory effects of arousal on valence.
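To make these quantities concrete, the following is a minimal sketch, assuming a conjugate Gaussian model in which free energy is taken as the negative log evidence of a stimulus. The parameter values and the simplified FI/FR readout are illustrative stand-ins, not the paper's exact definitions.

```python
import numpy as np

def gaussian_free_energy(x, mu_prior, var_prior, var_lik):
    """Free energy taken as the negative log evidence of stimulus x under
    a conjugate Gaussian model: prior N(mu_prior, var_prior), likelihood
    N(x; theta, var_lik), so the evidence is N(x; mu_prior, var_prior + var_lik)."""
    var = var_prior + var_lik
    return 0.5 * np.log(2 * np.pi * var) + (x - mu_prior) ** 2 / (2 * var)

# Illustrative (hypothetical) parameters: a broad first prior and a
# narrower second prior closer to the stimulus.
x = 1.0
f_first = gaussian_free_energy(x, mu_prior=0.0, var_prior=2.0, var_lik=1.0)
f_second = gaussian_free_energy(x, mu_prior=1.0, var_prior=0.5, var_lik=1.0)
FI = max(f_second - f_first, 0.0)  # free energy increase when trying the new prior
FR = max(f_first - f_second, 0.0)  # free energy reduction after re-recognition
print(f"FI = {FI:.3f}, FR = {FR:.3f}")
```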
Reconfigurable Intelligent Surfaces (RIS) are a new paradigm that, with judicious deployment and alignment, can enable more favorable propagation environments and better wireless network design. As such, they offer a number of potential benefits for next-generation wireless systems, including improved coverage, better interference management, and even improved security. In this paper, we consider an uplink next-generation wireless system where each user is assisted by an RIS. We study the uplink power control problem in this distributed RIS-assisted wireless network. Specifically, we aim to minimize the total uplink transmit power of all users, subject to each user's reliable communication requirement at the base station, through a joint design of the transmit powers, receiver filters, and RIS phase matrices. We propose an iterative power control algorithm, combined with a successive convex approximation technique, to solve the problem with its non-convex phase constraints. Numerical results illustrate that distributed RIS assistance leads to uplink power savings when the direct links are weak.
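As a rough illustration of the power control component only, here is a minimal fixed-point sketch in the classic Foschini-Miljanic style; the receiver-filter and SCA-based RIS phase updates are omitted, and the channel gains and SINR targets are hypothetical, not taken from the paper.

```python
import numpy as np

def iterative_power_control(G, gamma_tgt, noise, iters=100):
    """Fixed-point uplink power update p_k <- (gamma_tgt_k / SINR_k) * p_k.
    G[k, j] is the effective gain from user j into user k's receive filter,
    held fixed here; in the paper, receiver filters and RIS phases would be
    re-optimized (via SCA for the non-convex phases) between power updates."""
    K = G.shape[0]
    p = np.ones(K)
    for _ in range(iters):
        sinr = np.array([
            G[k, k] * p[k] /
            (noise + sum(G[k, j] * p[j] for j in range(K) if j != k))
            for k in range(K)
        ])
        p = gamma_tgt / sinr * p
    return p

# Toy 2-user example with hypothetical gains and SINR targets.
G = np.array([[1.0, 0.1], [0.2, 0.8]])
print(iterative_power_control(G, gamma_tgt=np.array([2.0, 2.0]), noise=0.1))
```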
Recent advances in Federated Learning (FL) have paved the way for novel strategies that solve multiple learning tasks simultaneously by leveraging cooperation among networked devices. Multi-Task Learning (MTL) exploits relevant commonalities across tasks to improve efficiency compared with traditional transfer learning approaches, and learning multiple tasks jointly can significantly reduce energy footprints. This article provides a first look into the energy costs of MTL processes driven by the Model-Agnostic Meta-Learning (MAML) paradigm and implemented in distributed wireless networks. The paper targets a clustered multi-task network setup where autonomous agents learn different but related tasks. The MTL process is carried out in two stages: the optimization of a meta-model that can be quickly adapted to learn new tasks, and a task-specific adaptation stage where the learned meta-model is transferred to the agents and tailored to a specific task. This work analyzes the main factors that influence the MTL energy balance by considering a multi-task Reinforcement Learning (RL) setup in a robotized environment. Results show that the MAML method can reduce the energy bill by at least a factor of two compared with traditional approaches without inductive transfer. Moreover, it is shown that the optimal energy balance in wireless networks depends on uplink/downlink and sidelink communication efficiencies.
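The two-stage structure can be sketched on a toy problem. Below is a minimal first-stage MAML meta-update for a scalar parameter with quadratic per-task losses; this toy setup is entirely ours and stands in for the paper's RL environment, where gradients would come from policy rollouts.

```python
def maml_step(theta, tasks, alpha=0.1, beta=0.01):
    """One meta-update of first-stage MAML on toy quadratic tasks with
    loss_t(theta) = (theta - t)^2. The second stage (deployment) would
    run only the inner adaptation line on a new task."""
    meta_grad = 0.0
    for t in tasks:
        grad = 2 * (theta - t)           # inner-loop gradient on task t
        theta_t = theta - alpha * grad   # task-specific adaptation
        # exact gradient of the post-adaptation loss w.r.t. the meta-parameter
        meta_grad += 2 * (theta_t - t) * (1 - 2 * alpha)
    return theta - beta * meta_grad / len(tasks)

theta = 0.0
for _ in range(500):
    theta = maml_step(theta, tasks=[1.0, 2.0, 3.0])
print(theta)  # a meta-model that adapts quickly to any task near [1, 3]
```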
An evaluation criterion for safe and trustworthy deep learning is how well the invariances captured by representations of deep neural networks (DNNs) are shared with humans. We identify challenges in measuring these invariances. Prior works used gradient-based methods to generate identically represented inputs (IRIs), i.e., inputs that have identical representations (at a given layer) of a neural network and thus capture the invariances of that network. One necessary criterion for a network's invariances to align with human perception is for its IRIs to look 'similar' to humans. Prior works, however, offer mixed takeaways: some argue that later layers of DNNs do not learn human-like invariances (\cite{jenelle2019metamers}), while others seem to indicate otherwise (\cite{mahendran2014understanding}). We argue that the loss function used to generate IRIs can heavily affect conclusions about a network's invariances and is the primary reason for these conflicting findings. We propose an adversarial regularizer on the IRI generation loss that finds IRIs that make any model appear to have very little shared invariance with humans. Based on this evidence, we argue that there is scope for improving models to have human-like invariances, and further, that to make meaningful comparisons between models one should use IRIs generated with the regularizer-free loss. We then conduct an in-depth investigation of how different components (e.g., architectures, training losses, data augmentations) of the deep learning pipeline contribute to learning models that align well with humans. We find that architectures with residual connections trained using a (self-supervised) contrastive loss with $\ell_p$-ball adversarial data augmentation tend to learn invariances that are most aligned with humans. Code: \url{github.com/nvedant07/Human-NN-Alignment}.
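To fix ideas, here is a minimal PyTorch-style sketch of gradient-based IRI generation with a pluggable regularizer; the paper's actual adversarial regularizer, layer choice, and optimization details may differ.

```python
import torch

def generate_iri(model, x0, steps=500, lr=0.1, regularizer=None, lam=1.0):
    """Gradient-based IRI generation: find an input x whose representation
    (under `model`, mapping inputs to the chosen layer's features) matches
    that of x0. `regularizer` is a pluggable callable standing in for the
    paper's adversarial term; omit it for the regularizer-free loss."""
    with torch.no_grad():
        target = model(x0)
    x = torch.randn_like(x0, requires_grad=True)  # random seed input
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        loss = ((model(x) - target) ** 2).mean()  # representation-matching term
        if regularizer is not None:
            loss = loss + lam * regularizer(x)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return x.detach()
```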
In the present paper we initiate the challenging task of building a mathematically sound theory for Adaptive Virtual Element Methods (AVEMs). Within the realm of polygonal meshes, we restrict our analysis to triangular meshes with hanging nodes in 2D -- the simplest meshes with a systematic refinement procedure that preserves shape regularity and optimal complexity. A major challenge in the a posteriori error analysis of AVEMs is the presence of the stabilization term, which is of the same order as the residual-type error estimator but prevents the equivalence of the latter with the energy error. Under the assumption that any chain of recursively created hanging nodes has uniformly bounded length, we show that the stabilization term can be made arbitrarily small relative to the error estimator, provided the stabilization parameter of the scheme is sufficiently large. This quantitative estimate leads to stabilization-free upper and lower a posteriori bounds for the energy error. This novel and crucial property of VEMs hinges on the largest subspace of continuous piecewise linear functions and on the delicate interplay between its coarser scales and the finer ones of the VEM space. An important consequence for piecewise constant data is a contraction property between consecutive loops of AVEMs, which we also prove. Our results apply to $H^1$-conforming (lowest-order) VEMs of any kind, including the classical and enhanced VEMs.
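For readers unfamiliar with adaptive loops, the following schematic shows the generic module sequence of one AVEM iteration and the standard form such a contraction takes; the quasi-error and constants below are generic AFEM-style placeholders, not the paper's precise statement.

```latex
% Generic adaptive loop and the standard form of a contraction between
% consecutive iterations (schematic placeholders, not the paper's result):
\[
  \mathsf{SOLVE} \longrightarrow \mathsf{ESTIMATE} \longrightarrow
  \mathsf{MARK} \longrightarrow \mathsf{REFINE}
\]
\[
  |\!|\!| u - u_{k+1} |\!|\!|^2 + \gamma\, \eta_{k+1}^2
  \le \rho \,\bigl( |\!|\!| u - u_k |\!|\!|^2 + \gamma\, \eta_k^2 \bigr),
  \qquad \rho \in (0,1).
\]
```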
We consider a simple one-way averaging protocol on graphs. Initially, every node of the graph has a value. A node $u$ is chosen uniformly at random, and $u$ samples $k$ neighbours $v_1, v_2, \ldots, v_k \in N(u)$ uniformly at random. Then, $u$ averages its value with those of the sampled neighbours as follows: $\xi_u(t+1) = \alpha \xi_u(t) + \frac{1-\alpha}{k} \sum_{i=1}^k \xi_{v_i}(t)$ for some $\alpha \in (0,1)$, where $\xi_u(t)$ is the value of node $u$ at time $t$. Note that, in contrast to neighbourhood value balancing, only $u$ changes its value. Hence, the sum (and the average) of the values of all nodes changes over time. Our results are twofold. First, we show a bound on the convergence time (the time it takes until all values are roughly the same) that is asymptotically tight for some initial assignments of values to the nodes. Our second set of results concerns the ability of this protocol to approximate the initial average of all values well: we bound the probability that the final outcome deviates significantly from the initial average. Interestingly, the variance of the outcome does not depend on the graph structure. The proof introduces an interesting generalisation of the duality between coalescing random walks and the voter model.
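The update rule is straightforward to simulate; below is a minimal sketch. Whether neighbours are sampled with or without replacement is our assumption (we use with replacement); the paper may specify otherwise.

```python
import random

def one_way_averaging(G, xi, alpha=0.5, k=1, steps=10_000):
    """Simulate the one-way protocol: a uniformly random node u averages
    its own value with those of k uniformly sampled neighbours; only u
    updates, so the global sum is not conserved.
    G: dict mapping each node to its neighbour list."""
    xi = dict(xi)
    nodes = list(G)
    for _ in range(steps):
        u = random.choice(nodes)
        sampled = random.choices(G[u], k=k)  # with replacement (our assumption)
        xi[u] = alpha * xi[u] + (1 - alpha) * sum(xi[v] for v in sampled) / k
    return xi

# Toy example: a 4-cycle with initial values 0, 1, 2, 3.
G = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
print(one_way_averaging(G, {i: float(i) for i in range(4)}))
```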
Optimal transport (OT) is a framework that can guide the design of efficient resource allocation strategies in a network of multiple sources and targets. To ease the computational complexity of large-scale transport design, we first develop a distributed algorithm based on the alternating direction method of multipliers (ADMM). However, such a distributed algorithm is vulnerable to sensitive information leakage if an attacker intercepts the transport decisions communicated between nodes during the distributed ADMM updates. To this end, we propose a privacy-preserving distributed mechanism based on output variable perturbation, which adds appropriate randomness to each node's decision before it is shared with the corresponding nodes at each update instance. We show that the developed scheme is differentially private, which prevents an adversary from inferring a node's confidential information even with knowledge of the transport decisions. Finally, we corroborate the effectiveness of the devised algorithm through case studies.
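A minimal sketch of output perturbation follows, assuming Laplace noise calibrated to an $\ell_1$-sensitivity; the paper's exact noise mechanism and calibration may differ.

```python
import numpy as np

def perturb_decision(x, sensitivity, epsilon, rng=None):
    """Output perturbation: add Laplace noise calibrated to the decision's
    (assumed) l1-sensitivity before sharing, giving epsilon-differential
    privacy for a single release; composition over the ADMM iterations
    must still be accounted for."""
    rng = rng or np.random.default_rng()
    return x + rng.laplace(0.0, sensitivity / epsilon, size=np.shape(x))

# e.g. perturb one node's transport-plan row before broadcasting it
shared = perturb_decision(np.array([0.2, 0.5, 0.3]), sensitivity=0.1, epsilon=1.0)
print(shared)
```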
Process mining is a methodology for the derivation and analysis of process models based on event logs. When process mining is employed to analyze business processes, the process discovery, conformance checking, and enhancement steps are repeated. If a user wants to analyze a process from multiple perspectives (such as the activity, originator, and time perspectives), this procedure, inconveniently, has to be repeated over and over again. Although past studies involving process mining have applied detailed stepwise methodologies, no attempt has been made to integrate and optimize multi-perspective process mining procedures. This paper contributes a solution approach to this problem. First, we propose an automatic discovery framework for multi-perspective process models based on deep Q-learning. Our Dual Experience Replay with Experience Distribution (DERED) approach can automatically perform the process discovery, conformance checking, and enhancement steps. Second, we propose a new method that further optimizes experience replay (ER), one of the key components of deep Q-learning, to improve the learning performance of reinforcement learning agents. Finally, we validate our approach using six real-world event datasets collected in port logistics, steel manufacturing, finance, IT, and government administration. We show that our DERED approach can provide users with multi-perspective, high-quality process models that can be employed more conveniently for multi-perspective process mining.
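The abstract does not spell out DERED's sampling rule; as a rough illustration of the dual-replay idea only, here is a minimal two-pool buffer sketch with a tunable mixing proportion. This schematic is entirely ours, not the paper's algorithm.

```python
import random
from collections import deque

class DualReplayBuffer:
    """A minimal two-pool replay sketch: keep two experience pools and mix
    them in each training batch with a tunable proportion."""
    def __init__(self, capacity=10_000):
        self.recent = deque(maxlen=capacity)   # ordinary transitions
        self.rare = deque(maxlen=capacity)     # e.g. high-error transitions

    def add(self, transition, rare=False):
        (self.rare if rare else self.recent).append(transition)

    def sample(self, batch_size, rare_frac=0.25):
        n_rare = min(int(batch_size * rare_frac), len(self.rare))
        batch = random.sample(list(self.rare), n_rare)
        batch += random.sample(list(self.recent),
                               min(batch_size - n_rare, len(self.recent)))
        return batch
```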
The rapid changes in the finance industry due to the increasing amount of data have revolutionized techniques for data processing and data analysis and have brought new theoretical and computational challenges. In contrast to classical stochastic control theory and other analytical approaches to financial decision-making problems, which heavily rely on model assumptions, new developments in reinforcement learning (RL) are able to make full use of large amounts of financial data with fewer model assumptions and to improve decisions in complex financial environments. This survey paper aims to review recent developments and uses of RL approaches in finance. We give an introduction to Markov decision processes, which form the setting for many of the commonly used RL approaches. Various algorithms are then introduced, with a focus on value- and policy-based methods that do not require any model assumptions. Connections are made with neural networks to extend the framework to encompass deep RL algorithms. Our survey concludes by discussing the application of these RL algorithms in a variety of decision-making problems in finance, including optimal execution, portfolio optimization, option pricing and hedging, market making, smart order routing, and robo-advising.
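As a concrete instance of the value-based, model-free methods such a survey covers, here is a minimal tabular Q-learning sketch, assuming a classic Gym-style environment API (reset() returns a state; step() returns state, reward, done, info).

```python
import numpy as np

def q_learning(env, n_states, n_actions, episodes=500,
               alpha=0.1, gamma=0.99, eps=0.1):
    """Tabular Q-learning: the simplest value-based, model-free method.
    Assumes the classic Gym-style reset/step API described above."""
    Q = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            # epsilon-greedy exploration
            a = np.random.randint(n_actions) if np.random.rand() < eps else int(Q[s].argmax())
            s2, r, done, _ = env.step(a)
            # one-step temporal-difference update
            Q[s, a] += alpha * (r + gamma * (not done) * Q[s2].max() - Q[s, a])
            s = s2
    return Q
```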
Large-scale pre-trained models (PTMs) such as BERT and GPT have recently achieved great success and become a milestone in the field of artificial intelligence (AI). Owing to sophisticated pre-training objectives and huge model parameters, large-scale PTMs can effectively capture knowledge from massive labeled and unlabeled data. By storing knowledge in huge parameters and fine-tuning on specific tasks, the rich knowledge implicitly encoded in those parameters can benefit a variety of downstream tasks, as has been extensively demonstrated through experimental verification and empirical analysis. It is now the consensus of the AI community to adopt PTMs as the backbone for downstream tasks rather than learning models from scratch. In this paper, we take a deep look into the history of pre-training, especially its special relation with transfer learning and self-supervised learning, to reveal the crucial position of PTMs in the AI development spectrum. Further, we comprehensively review the latest breakthroughs of PTMs. These breakthroughs are driven by the surge of computational power and the increasing availability of data, toward four important directions: designing effective architectures, utilizing rich contexts, improving computational efficiency, and conducting interpretation and theoretical analysis. Finally, we discuss a series of open problems and research directions for PTMs, and we hope our view can inspire and advance the future study of PTMs.
Deep learning is usually described as an experiment-driven field and is under continual criticism for lacking theoretical foundations. This problem has been partially addressed by a large volume of literature, which has so far not been well organized. This paper reviews and organizes the recent advances in deep learning theory. The literature is categorized into six groups: (1) complexity- and capacity-based approaches for analyzing the generalizability of deep learning; (2) stochastic differential equations and their associated dynamical systems for modelling stochastic gradient descent and its variants, which characterize the optimization and generalization of deep learning and are partially inspired by Bayesian inference; (3) the geometrical structures of the loss landscape that drive the trajectories of these dynamical systems; (4) the roles of over-parameterization of deep neural networks from both positive and negative perspectives; (5) theoretical foundations of several special structures in network architectures; and (6) the increasingly intensive concerns regarding ethics and security and their relationship with generalizability.