From health to education, income impacts a huge range of life choices. Many papers have leveraged data from online social networks to study precisely this. In this paper, we ask the opposite question: do different levels of income result in different online behaviors? We demonstrate it does. We present the first large-scale study of Nextdoor, a popular location-based social network. We collect 2.6 Million posts from 64,283 neighborhoods in the United States and 3,325 neighborhoods in the United Kingdom, to examine whether online discourse reflects the income and income inequality of a neighborhood. We show that posts from neighborhoods with different income indeed differ, e.g. richer neighborhoods have a more positive sentiment and discuss crimes more, even though their actual crime rates are much lower. We then show that user-generated content can predict both income and inequality. We train multiple machine learning models and predict both income (R-Square=0.841) and inequality (R-Square=0.77).
We develop a boundary integral equation-based numerical method to solve for the electrostatic potential in two dimensions, inside a medium with piecewise constant conductivity, where the boundary condition is given by the complete electrode model (CEM). The CEM is seen as the most accurate model of the physical setting where electrodes are placed on the surface of an electrically conductive body, and currents are injected through the electrodes and the resulting voltages are measured again on these same electrodes. The integral equation formulation is based on expressing the electrostatic potential as the solution to a finite number of Laplace equations which are coupled through boundary matching conditions. This allows us to re-express the solution in terms of single layer potentials; the problem is thus re-cast as a system of integral equations on a finite number of smooth curves. We discuss an adaptive method for the solution of the resulting system of mildly singular integral equations. This solver is both fast and accurate. We then present a numerical inverse solver for electrical impedance tomography (EIT) which uses our forward solver at its core. To demonstrate the applicability of our results we test our numerical methods on an open electrical impedance tomography data set provided by the Finnish Inverse Problems Society.
Modern data analysis frequently involves large-scale hypothesis testing, which naturally gives rise to the problem of maintaining control of a suitable type I error rate, such as the false discovery rate (FDR). In many biomedical and technological applications, an additional complexity is that hypotheses are tested in an online manner, one-by-one over time. However, traditional procedures that control the FDR, such as the Benjamini-Hochberg procedure, assume that all p-values are available to be tested at a single time point. To address these challenges, a new field of methodology has developed over the past 15 years showing how to control error rates for online multiple hypothesis testing. In this framework, hypotheses arrive in a stream, and at each time point the analyst decides whether to reject the current hypothesis based both on the evidence against it, and on the previous rejection decisions. In this paper, we present a comprehensive exposition of the literature on online error rate control, with a review of key theory as well as a focus on applied examples. We also provide simulation results comparing different online testing algorithms and an up-to-date overview of the many methodological extensions that have been proposed.
Mobility systems often suffer from a high price of anarchy due to the uncontrolled behavior of selfish users. This may result in societal costs that are significantly higher compared to what could be achieved by a centralized system-optimal controller. Monetary tolling schemes can effectively align the behavior of selfish users with the system-optimum. Yet, they inevitably discriminate the population in terms of income. Artificial currencies were recently presented as an effective alternative that can achieve the same performance, whilst guaranteeing fairness among the population. However, those studies were based on behavioral models that may differ from practical implementations. This paper presents a data-driven approach to automatically adapt artificial-currency tolls within repetitive-game settings. We first consider a parallel-arc setting whereby users commute on a daily basis from an individual origin to an individual destination, choosing a route in exchange of an artificial-currency price or reward, while accounting for the impact of the choices of the other users on travel discomfort. Second, we devise a model-based reinforcement learning controller that autonomously learns the optimal pricing policy by interacting with the proposed framework considering the closeness of the observed aggregate flows to a desired system-optimal distribution as a reward function. Our numerical results show that the proposed data-driven pricing scheme can effectively align the users' flows with the system optimum, significantly reducing the societal costs with respect to the uncontrolled flows (by about 15% and 25% depending on the scenario), and respond to environmental changes in a robust and efficient manner.
Outcome phenotype measurement error is rarely corrected in comparative effect estimation studies in observational pharmacoepidemiology. Quantitative bias analysis (QBA) is a misclassification correction method that algebraically adjusts person counts in exposure-outcome contingency tables to reflect the magnitude of misclassification. The extent QBA minimizes bias is unclear because few systematic evaluations have been reported. We empirically evaluated QBA impact on odds ratios (OR) in several comparative effect estimation scenarios. We estimated non-differential and differential phenotype errors with internal validation studies using a probabilistic reference. Further, we synthesized an analytic space defined by outcome incidence, uncorrected ORs, and phenotype errors to identify which combinations produce invalid results indicative of input errors. We evaluated impact with relative bias [(OR-ORQBA)]/OR*100%]. Results were considered invalid if any contingency table cell was corrected to a negative number. Empirical bias correction was greatest in lower incidence scenarios where uncorrected ORs were larger. Similarly, synthetic bias correction was greater in lower incidence settings with larger uncorrected estimates. The invalid proportion of synthetic scenarios increased as uncorrected estimates increased. Results were invalid in common, low incidence scenarios indicating problematic inputs. This demonstrates the importance of accurately and precisely estimating phenotype errors before implementing QBA in comparative effect estimation studies.
The time and effort involved in hand-designing deep neural networks is immense. This has prompted the development of Neural Architecture Search (NAS) techniques to automate this design. However, NAS algorithms tend to be slow and expensive; they need to train vast numbers of candidate networks to inform the search process. This could be alleviated if we could partially predict a network's trained accuracy from its initial state. In this work, we examine the overlap of activations between datapoints in untrained networks and motivate how this can give a measure which is usefully indicative of a network's trained performance. We incorporate this measure into a simple algorithm that allows us to search for powerful networks without any training in a matter of seconds on a single GPU, and verify its effectiveness on NAS-Bench-101, NAS-Bench-201, NATS-Bench, and Network Design Spaces. Our approach can be readily combined with more expensive search methods; we examine a simple adaptation of regularised evolutionary search. Code for reproducing our experiments is available at //github.com/BayesWatch/nas-without-training.
Recent years have witnessed significant advances in technologies and services in modern network applications, including smart grid management, wireless communication, cybersecurity as well as multi-agent autonomous systems. Considering the heterogeneous nature of networked entities, emerging network applications call for game-theoretic models and learning-based approaches in order to create distributed network intelligence that responds to uncertainties and disruptions in a dynamic or an adversarial environment. This paper articulates the confluence of networks, games and learning, which establishes a theoretical underpinning for understanding multi-agent decision-making over networks. We provide an selective overview of game-theoretic learning algorithms within the framework of stochastic approximation theory, and associated applications in some representative contexts of modern network systems, such as the next generation wireless communication networks, the smart grid and distributed machine learning. In addition to existing research works on game-theoretic learning over networks, we highlight several new angles and research endeavors on learning in games that are related to recent developments in artificial intelligence. Some of the new angles extrapolate from our own research interests. The overall objective of the paper is to provide the reader a clear picture of the strengths and challenges of adopting game-theoretic learning methods within the context of network systems, and further to identify fruitful future research directions on both theoretical and applied studies.
Co-evolving time series appears in a multitude of applications such as environmental monitoring, financial analysis, and smart transportation. This paper aims to address the following challenges, including (C1) how to incorporate explicit relationship networks of the time series; (C2) how to model the implicit relationship of the temporal dynamics. We propose a novel model called Network of Tensor Time Series, which is comprised of two modules, including Tensor Graph Convolutional Network (TGCN) and Tensor Recurrent Neural Network (TRNN). TGCN tackles the first challenge by generalizing Graph Convolutional Network (GCN) for flat graphs to tensor graphs, which captures the synergy between multiple graphs associated with the tensors. TRNN leverages tensor decomposition to model the implicit relationships among co-evolving time series. The experimental results on five real-world datasets demonstrate the efficacy of the proposed method.
In recent years, Graph Neural Networks (GNNs), which can naturally integrate node information and topological structure, have been demonstrated to be powerful in learning on graph data. These advantages of GNNs provide great potential to advance social recommendation since data in social recommender systems can be represented as user-user social graph and user-item graph; and learning latent factors of users and items is the key. However, building social recommender systems based on GNNs faces challenges. For example, the user-item graph encodes both interactions and their associated opinions; social relations have heterogeneous strengths; users involve in two graphs (e.g., the user-user social graph and the user-item graph). To address the three aforementioned challenges simultaneously, in this paper, we present a novel graph neural network framework (GraphRec) for social recommendations. In particular, we provide a principled approach to jointly capture interactions and opinions in the user-item graph and propose the framework GraphRec, which coherently models two graphs and heterogeneous strengths. Extensive experiments on two real-world datasets demonstrate the effectiveness of the proposed framework GraphRec. Our code is available at \url{//github.com/wenqifan03/GraphRec-WWW19}
Graphs, which describe pairwise relations between objects, are essential representations of many real-world data such as social networks. In recent years, graph neural networks, which extend the neural network models to graph data, have attracted increasing attention. Graph neural networks have been applied to advance many different graph related tasks such as reasoning dynamics of the physical system, graph classification, and node classification. Most of the existing graph neural network models have been designed for static graphs, while many real-world graphs are inherently dynamic. For example, social networks are naturally evolving as new users joining and new relations being created. Current graph neural network models cannot utilize the dynamic information in dynamic graphs. However, the dynamic information has been proven to enhance the performance of many graph analytical tasks such as community detection and link prediction. Hence, it is necessary to design dedicated graph neural networks for dynamic graphs. In this paper, we propose DGNN, a new {\bf D}ynamic {\bf G}raph {\bf N}eural {\bf N}etwork model, which can model the dynamic information as the graph evolving. In particular, the proposed framework can keep updating node information by capturing the sequential information of edges, the time intervals between edges and information propagation coherently. Experimental results on various dynamic graphs demonstrate the effectiveness of the proposed framework.
Chatbot has become an important solution to rapidly increasing customer care demands on social media in recent years. However, current work on chatbot for customer care ignores a key to impact user experience - tones. In this work, we create a novel tone-aware chatbot that generates toned responses to user requests on social media. We first conduct a formative research, in which the effects of tones are studied. Significant and various influences of different tones on user experience are uncovered in the study. With the knowledge of effects of tones, we design a deep learning based chatbot that takes tone information into account. We train our system on over 1.5 million real customer care conversations collected from Twitter. The evaluation reveals that our tone-aware chatbot generates as appropriate responses to user requests as human agents. More importantly, our chatbot is perceived to be even more empathetic than human agents.