There has been a recent surge in statistical methods for handling the lack of adequate positivity when using inverse probability weighting. Alongside these nascent developments, a number of questions have been raised about the goals and intent of these methods: when the aim is to infer causality, what are they really estimating, and what are their target populations? Because causal inference is inherently a missing data problem, the assignment mechanism -- how participants are represented in their respective treatment groups and how they receive their treatments -- plays an important role in assessing causality. In this paper, we contribute to the discussion by highlighting specific characteristics of the equipoise estimators, i.e., the overlap weights (OW), matching weights (MW), and entropy weights (EW) methods, which help answer these questions, and we contrast them with the behavior of the inverse probability weights (IPW) method. We discuss three distinct questions that motivate weighting under the lack of adequate positivity when estimating causal effects: (1) What separates OW, MW, and EW from IPW trimming or truncation? (2) What fundamentally distinguishes the estimand of IPW, i.e., the average treatment effect (ATE), from the OW, MW, and EW estimands (resp. the average treatment effect on the overlap (ATO), the matching (ATM), and entropy (ATEN) estimands)? (3) When should we expect similar results for these estimands, even when the treatment effect is heterogeneous? Our findings are illustrated through a number of Monte Carlo simulation studies and a data example on healthcare expenditure.
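For intuition, the sketch below contrasts these weighting schemes via their tilting functions, assuming estimated propensity scores are available; the tilting-function forms used here (h(e)=1 for IPW/ATE, e(1-e) for OW/ATO, min(e, 1-e) for MW/ATM, and the binary entropy of e for EW/ATEN) follow the balancing-weights literature, and the variable names are illustrative.

```python
import numpy as np

def balancing_weights(e, z, kind="ow"):
    """Weights h(e)/e for treated (z=1) and h(e)/(1-e) for controls (z=0),
    where h is the tilting function that defines the target population."""
    e, z = np.asarray(e, dtype=float), np.asarray(z)
    if kind == "ipw":      # ATE
        h = np.ones_like(e)
    elif kind == "ow":     # ATO
        h = e * (1.0 - e)
    elif kind == "mw":     # ATM
        h = np.minimum(e, 1.0 - e)
    elif kind == "ew":     # ATEN
        h = -(e * np.log(e) + (1.0 - e) * np.log(1.0 - e))
    else:
        raise ValueError(f"unknown kind: {kind}")
    return np.where(z == 1, h / e, h / (1.0 - e))

# Near positivity violations (e close to 0 or 1), IPW weights explode while the
# equipoise weights (OW, MW, EW) smoothly down-weight those observations.
e = np.array([0.02, 0.30, 0.50, 0.70, 0.98])
z = np.array([1, 1, 0, 0, 1])
for kind in ("ipw", "ow", "mw", "ew"):
    print(kind, np.round(balancing_weights(e, z, kind), 3))
```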
Implicit fields have proven very effective for representing and learning 3D shapes accurately. Signed distance fields and occupancy fields are the preferred representations, both with well-studied properties, despite their restriction to closed surfaces. Several other variations and training principles have been proposed with the goal of representing all classes of shapes. In this paper, we develop a novel yet fundamental representation by considering the unit vector field defined on 3D space: at each point in $\mathbb{R}^3$, the vector points to the closest point on the surface. We theoretically demonstrate that this vector field can be easily transformed into a surface density by applying the vector field divergence. Unlike other standard representations, it directly encodes an important physical property of the surface, namely the surface normal. We further show the advantages of our vector field representation, specifically in learning general (open, closed, or multi-layered) surfaces as well as piecewise planar surfaces. We compare our method on several datasets, including ShapeNet, where the proposed neural implicit field shows superior accuracy in representing any type of shape, outperforming other standard methods. The code will be released at //github.com/edomel/ImplicitVF
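As a toy illustration of the divergence claim (not the learned network), consider the analytic vector field of a sphere: the unit vector at each point aims at the nearest surface point, and a finite-difference divergence spikes exactly at the surface, acting like a surface density. The names, the analytic sphere, and the finite-difference scheme below are illustrative assumptions.

```python
import numpy as np

def sphere_vf(p, radius=0.5):
    """Unit vector field pointing from p toward the closest point on a sphere
    of given radius centered at the origin (analytic ground truth, no network)."""
    norm = np.linalg.norm(p, axis=-1, keepdims=True)
    return np.sign(radius - norm) * p / np.maximum(norm, 1e-9)

def divergence(vf, p, eps=1e-3):
    """Central finite-difference approximation of div(vf) at points p."""
    div = np.zeros(p.shape[0])
    for k in range(3):
        dp = np.zeros(3); dp[k] = eps
        div += (vf(p + dp)[:, k] - vf(p - dp)[:, k]) / (2 * eps)
    return div

# Probe points along a ray: the (negative) divergence concentrates near the
# surface at radius 0.5, which is the surface-density behavior described above.
ts = np.linspace(0.1, 0.9, 9)
pts = np.stack([ts, np.zeros_like(ts), np.zeros_like(ts)], axis=-1)
print(np.round(divergence(sphere_vf, pts), 2))
```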
To meet the high safety and reliability requirements encountered in practice, state of health (SOH) estimation of lithium-ion batteries (LIBs), which is closely related to degradation performance, has been studied extensively alongside the widespread adoption of battery-powered electronics. Conventional digital-twin-based SOH estimation approaches are end-of-cycle methods that require the completion of a full charge/discharge cycle to observe the maximum available capacity. However, under dynamic operating conditions with only partially discharged data, such approaches cannot provide accurate real-time SOH estimates for LIBs. To bridge this research gap, we put forward a digital twin framework that senses the battery's SOH on the fly and updates the physical battery model accordingly. The proposed digital twin solution consists of three core components that enable real-time SOH estimation without requiring a complete discharge. First, to handle variable training cycling data, an energy-discrepancy-aware cycle synchronization is proposed to align cycling data while guaranteeing a consistent data structure. Second, to exploit the temporal importance of different training sampling times, a time-attention SOH estimation model is developed with data encoding to capture the degradation behavior over cycles while excluding the adverse influence of unimportant samples. Finally, for online implementation, a similarity-analysis-based data reconstruction is put forward to provide real-time SOH estimation without requiring a full discharge cycle. In experiments on a widely used benchmark, the proposed method yields real-time SOH estimates with errors below 1% for most sampling times in ongoing cycles.
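A minimal sketch of the attention-over-sampling-times idea is given below, assuming each partial-cycle measurement has already been encoded into a feature vector; the scoring function, feature sizes, and regression head are illustrative assumptions rather than the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def time_attention_pool(x, w_score):
    """Softmax attention over sampling times: x has shape (T, d) for one partial
    cycle; w_score maps each encoded sample to a scalar importance score."""
    scores = x @ w_score                 # (T,) importance score per sampling time
    a = np.exp(scores - scores.max())
    a = a / a.sum()                      # attention weights over sampling times
    return a @ x, a                      # pooled features (d,), weights (T,)

# Illustrative partial-cycle data: T sampling times, d encoded features each.
T, d = 20, 8
x = rng.normal(size=(T, d))
w_score = rng.normal(size=d)
pooled, attn = time_attention_pool(x, w_score)

# A simple linear head on the pooled representation then regresses SOH;
# unimportant sampling times receive small attention weights.
w_head, b = rng.normal(size=d), 0.9
soh_hat = pooled @ w_head + b
print(np.round(attn, 3), round(float(soh_hat), 3))
```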
In the past few years, large high-resolution displays (LHRDs) have attracted considerable attention from researchers, industries, and application areas that increasingly rely on data-driven decision-making. An up-to-date survey on the use of LHRDs for interactive data visualization seems warranted to summarize how new solutions meet the characteristics and requirements of LHRDs and take advantage of their unique benefits. In this survey, we start by defining LHRDs and outlining the consequences of LHRD environments for interactive visualizations in terms of more pixels, space, users, and devices. Then, we review the related literature along the four axes of visualization, interaction, evaluation studies, and applications. With these four axes, our survey provides a unique perspective and covers a broad range of aspects relevant to developing interactive visual data analysis solutions for LHRDs. We conclude this survey by reflecting on a number of opportunities for future research to help the community take up the still-open challenges of interactive visualization on LHRDs.
Mixtures of regression are a powerful class of models for regression learning with respect to a highly uncertain and heterogeneous response variable of interest. In addition to being a rich predictive model for the response given some covariates, the parameters in this model class provide useful information about the heterogeneity of the data population, which is represented by the conditional distributions of the response given the covariates associated with a number of distinct but latent subpopulations. In this paper, we investigate conditions for strong identifiability, rates of convergence for conditional density and parameter estimation, and the Bayesian posterior contraction behavior arising in finite mixtures of regression models, under exact-fitted and over-fitted settings and when the number of components is unknown. This theory is applicable to common choices of link functions and families of conditional distributions employed by practitioners. We provide simulation studies and data illustrations, which shed some light on the parameter learning behavior found in several popular regression mixture models reported in the literature.
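For concreteness, a finite mixture of regressions with $K$ components can be written generically as
$$ p(y \mid x) \;=\; \sum_{k=1}^{K} \pi_k \, f\bigl(y \mid g(x; \theta_k)\bigr), \qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1, $$
where $g(\cdot;\theta_k)$ is the link/mean function of the $k$-th latent subpopulation and $f$ is the chosen conditional family (e.g., Gaussian). In this notation, exact-fitted analysis takes $K$ equal to the true number of components, while over-fitted analysis allows $K$ to exceed it; the display is only meant to fix ideas and is not a restatement of the paper's assumptions.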
Results on the spectral behavior of random matrices as the dimension increases are applied to the problem of detecting the number of sources impinging on an array of sensors. A common strategy to solve this problem is to estimate the multiplicity of the smallest eigenvalue of the spatial covariance matrix $R$ of the sensed data from the sample covariance matrix $\widehat{R}$. Existing approaches, such as those based on information theoretic criteria, rely on the closeness of the noise eigenvalues of $\widehat{R}$ to each other and, therefore, the sample size has to be quite large when the number of sources is large in order to obtain a good estimate. The analysis presented in this report focuses on the splitting of the spectrum of $\widehat{R}$ into noise and signal eigenvalues. It is shown that, when the number of sensors is large, the number of signals can be estimated with a sample size considerably smaller than that required by previous approaches. The practical significance of the main result is that detection can be achieved with a number of samples comparable to the number of sensors in large dimensional array processing.
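As a rough, self-contained illustration of the spectrum-splitting idea (not the exact detector developed in this report), one can count how many sample eigenvalues of $\widehat{R}$ exceed the Marchenko-Pastur noise edge $\sigma^2(1+\sqrt{p/n})^2$; the simulation setup, variable names, and threshold below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def estimate_num_sources(X, sigma2):
    """Count sample eigenvalues above the Marchenko-Pastur noise edge.
    X: (n, p) array of n snapshots from p sensors; sigma2: known noise power."""
    n, p = X.shape
    R_hat = X.conj().T @ X / n
    evals = np.linalg.eigvalsh(R_hat)
    edge = sigma2 * (1.0 + np.sqrt(p / n)) ** 2
    return int(np.sum(evals > edge))

# Toy scenario: p sensors, k strong sources, sample size comparable to p.
p, n, k, sigma2 = 100, 200, 3, 1.0
A = rng.normal(size=(p, k)) / np.sqrt(p)     # array steering-like mixing
S = rng.normal(size=(n, k)) * np.sqrt(50.0)  # source signals
N = rng.normal(size=(n, p)) * np.sqrt(sigma2)
X = S @ A.T + N
print(estimate_num_sources(X, sigma2))       # expected: 3 (small fluctuations possible)
```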
Conditional effect estimation has great scientific and policy importance because interventions may affect subjects differently depending on their characteristics. Previous work has focused primarily on estimating the conditional average treatment effect (CATE), which considers the difference between counterfactual mean outcomes under interventions in which all subjects receive treatment and all subjects receive control. However, these interventions may be unrealistic in certain policy scenarios. Furthermore, identification of the CATE requires that all subjects have a non-zero probability of receiving treatment (i.e., positivity), which may be unrealistic in practice. In this paper, we propose conditional effects based on incremental propensity score interventions, which are stochastic interventions under which the odds of treatment are multiplied by some user-specified factor. These effects do not require positivity for identification and can be better suited to modeling real-world policies in which people cannot be forced into treatment. We develop a projection estimator, the "Projection-Learner", and a flexible nonparametric estimator, the "I-DR-Learner", each of which can estimate all the conditional effects we propose. We derive model-agnostic error guarantees for both estimators and show that both satisfy a form of double robustness, whereby the Projection-Learner attains parametric efficiency and the I-DR-Learner attains oracle efficiency under weak convergence conditions on the nuisance function estimators. We then propose a summary of treatment effect heterogeneity, the variance of a conditional derivative, and derive a nonparametric estimator of this effect that also satisfies a form of double robustness. Finally, we demonstrate our estimators with an analysis of the effect of ICU admission on mortality using a dataset from the (SPOT)light prospective cohort study.
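The intervention itself is simple to state: if $\pi(x)$ denotes the propensity score and $\delta$ the user-specified factor, multiplying the odds of treatment by $\delta$ yields the shifted treatment probability $\delta\pi(x)/(\delta\pi(x) + 1 - \pi(x))$. The sketch below illustrates only this shift (the Projection-Learner and I-DR-Learner machinery is not shown), and the function and variable names are illustrative.

```python
import numpy as np

def incremental_propensity(pi, delta):
    """Incremental intervention: multiply the odds of treatment by delta.
    pi: propensity scores P(A=1 | X); returns the shifted treatment probabilities."""
    pi = np.asarray(pi, dtype=float)
    return delta * pi / (delta * pi + 1.0 - pi)

# Example: delta = 2 doubles each subject's odds of treatment. Subjects with
# pi = 0 or pi = 1 keep their observed treatment, which is why positivity is
# not required for identification.
pi = np.array([0.0, 0.1, 0.5, 0.9, 1.0])
print(np.round(incremental_propensity(pi, delta=2.0), 3))
# -> [0.    0.182 0.667 0.947 1.   ]
```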
Various types of Multi-Agent Reinforcement Learning (MARL) methods have been developed under the assumption that agents' policies are based on true states. Recent works have improved the robustness of MARL under uncertainties in the reward, the transition probabilities, or other agents' policies. However, in real-world multi-agent systems, state estimates may be perturbed by sensor measurement noise or even by adversaries. Policies trained only on true state information will deviate from optimal solutions when facing adversarial state perturbations during execution, yet MARL under adversarial state perturbations has received limited study. Hence, in this work, we propose a State-Adversarial Markov Game (SAMG) and make the first attempt to study the fundamental properties of MARL under state uncertainties. We prove that an optimal agent policy and a robust Nash equilibrium do not always exist for an SAMG. Instead, we define a new solution concept, the robust agent policy, for the proposed SAMG under adversarial state perturbations, in which agents aim to maximize the worst-case expected state value. We then design a gradient descent ascent-based robust MARL algorithm to learn robust policies for the MARL agents. Our experiments show that adversarial state perturbations decrease agents' rewards for several baselines from the existing literature, while our algorithm outperforms these baselines under state perturbations and significantly improves the robustness of MARL policies under state uncertainties.
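The gradient descent ascent principle can be illustrated on a toy saddle-point objective: the agent parameter is updated by gradient ascent on the value (maximizing the worst case) while the adversary parameter, standing in for the state perturbation, is updated by gradient descent on the same value. The objective, learning rate, and parameters below are purely illustrative and do not reproduce the multi-agent algorithm.

```python
# Toy saddle-point objective f(theta, phi) = -(theta - 1)^2 + theta*phi + phi^2:
# concave in the agent parameter theta, convex in the adversary parameter phi.
theta, phi, lr = 0.0, 0.0, 0.05
for _ in range(2000):
    grad_theta = -2.0 * (theta - 1.0) + phi  # d f / d theta
    grad_phi = theta + 2.0 * phi             # d f / d phi
    theta += lr * grad_theta                 # ascent step (agent maximizes)
    phi -= lr * grad_phi                     # descent step (adversary minimizes)

print(round(theta, 3), round(phi, 3))        # converges to the saddle point (0.8, -0.4)
```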
Transparency of machine learning models used for decision support in various industries is becoming essential for ensuring their ethical use. To that end, feature attribution methods such as SHAP (SHapley Additive exPlanations) are widely used to explain the predictions of black-box machine learning models to customers and developers. However, a parallel trend has been to train machine learning models in collaboration with other data holders without accessing their data. Such models, trained over horizontally or vertically partitioned data, present a challenge for explainable AI because the explaining party may have a biased view of the background data or only a partial view of the feature space. As a result, explanations obtained from different participants of distributed machine learning might not be consistent with one another, undermining trust in the product. This paper presents an Explainable Data Collaboration Framework based on a model-agnostic additive feature attribution algorithm (KernelSHAP) and the Data Collaboration method of privacy-preserving distributed machine learning. In particular, we present three algorithms for different scenarios of explainability in Data Collaboration and verify their consistency with experiments on open-access datasets. Our results demonstrate a significant decrease (by a factor of at least 1.75) in feature attribution discrepancies among the users of distributed machine learning.
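The inconsistency targeted by the framework is easy to reproduce: KernelSHAP attributions depend on the background (reference) data, so parties holding different partitions can obtain different explanations of the same model. The sketch below assumes the open-source shap and scikit-learn packages; the data, model, and party split are synthetic placeholders.

```python
import numpy as np
import shap
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# One shared model, but each party holds a different background (reference) dataset.
X = rng.normal(size=(200, 4))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
model = LogisticRegression().fit(X, y)
f = lambda data: model.predict_proba(data)[:, 1]  # single-output prediction function

bg_party_a = X[:50]             # party A's view of the background data
bg_party_b = X[50:100] + 1.0    # party B's (shifted, biased) background view
x_explain = X[:5]

phi_a = shap.KernelExplainer(f, bg_party_a).shap_values(x_explain)
phi_b = shap.KernelExplainer(f, bg_party_b).shap_values(x_explain)

# Different backgrounds yield different attributions for the same model and inputs.
print(np.abs(np.asarray(phi_a) - np.asarray(phi_b)).mean())
```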
Accurate and interpretable prediction of future events in time-series data often requires capturing representative patterns (referred to as states) that underpin the observed data. To this end, most existing studies focus on the representation and recognition of states but ignore the changing transitional relations among them. In this paper, we present the evolutionary state graph, a dynamic graph structure designed to systematically represent the evolving relations (edges) among states (nodes) along time. We analyze the dynamic graphs constructed from time-series data and show that changes in the graph structure (e.g., edges connecting certain state nodes) can signal the occurrence of events (i.e., time-series fluctuations). Inspired by this, we propose a novel graph neural network model, the Evolutionary State Graph Network (EvoNet), which encodes the evolutionary state graph for accurate and interpretable time-series event prediction. Specifically, EvoNet models both node-level (state-to-state) and graph-level (segment-to-segment) propagation, and captures node-graph (state-to-segment) interactions over time. Experimental results on five real-world datasets show that our approach not only achieves clear improvements over 11 baselines but also provides more insight for explaining the results of event predictions.
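A minimal sketch of the underlying data structure is shown below: windows of a series are assigned discrete states (here via k-means), and each segment contributes one adjacency matrix recording state-to-state transitions; changes between consecutive segment graphs are the kind of signal EvoNet learns from. The segmentation scheme, clustering choice, and sizes are illustrative assumptions and do not reproduce the model's recognition or GNN components.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Toy time series split into fixed-length segments; each window inside a segment
# is assigned a discrete state, and transitions between consecutive states are
# accumulated into one adjacency matrix (graph snapshot) per segment.
series = np.sin(np.linspace(0, 40, 2000)) + 0.3 * rng.normal(size=2000)
win, seg_len, n_states = 20, 200, 4

windows = series[: len(series) // win * win].reshape(-1, win)
states = KMeans(n_clusters=n_states, n_init=10, random_state=0).fit_predict(windows)

wins_per_seg = seg_len // win
graphs = []
for s in range(len(states) // wins_per_seg):
    seg_states = states[s * wins_per_seg : (s + 1) * wins_per_seg]
    adj = np.zeros((n_states, n_states), dtype=int)
    for a, b in zip(seg_states[:-1], seg_states[1:]):
        adj[a, b] += 1                 # evolving relation: state a -> state b
    graphs.append(adj)

# Changes between consecutive segment graphs can signal upcoming events.
print(len(graphs), np.abs(graphs[1] - graphs[0]).sum())
```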
Deep learning models on graphs have achieved remarkable performance in various graph analysis tasks, e.g., node classification, link prediction, and graph clustering. However, they exhibit uncertainty and unreliability when faced with carefully designed inputs, i.e., adversarial examples. Accordingly, various studies on both attacks and defenses have emerged for different graph analysis tasks, leading to an arms race in graph adversarial learning. For instance, attackers employ poisoning and evasion attacks, while the defense side correspondingly relies on preprocessing-based and adversarial-based methods. Despite this booming body of work, a unified problem definition and a comprehensive review are still lacking. To bridge this gap, we systematically investigate and summarize the existing works on graph adversarial learning tasks. Specifically, we survey and unify the existing works w.r.t. attacks and defenses in graph analysis tasks, and provide proper definitions and taxonomies at the same time. In addition, we emphasize the importance of related evaluation metrics and investigate and summarize them comprehensively. We hope our work can serve as a reference for relevant researchers and assist their studies. More details of our work are available at //github.com/gitgiter/Graph-Adversarial-Learning.