In practice, we frequently encounter data sets in which one or more explanatory variables generate a set of ordinal responses. These responses may correspond to an underlying continuous latent variable that is linearly related to the covariates; the response takes a particular ordinal label according to which interval, delimited by a pair of unknown cut-offs, the latent variable falls into. The asymptotically most efficient way of estimating the unknown parameters (i.e., the regression coefficients and the cut-offs) is the method of maximum likelihood (ML). However, contamination in the data set, whether through misspecified ordinal responses or unbounded covariate values, can destabilize the likelihood function to such an extent that the ML-based methodology leads to completely unreliable inferences. In this paper, we explore a minimum distance estimation procedure based on the popular density power divergence (DPD) that yields robust parameter estimates for the ordinal response model. We highlight how the resulting estimator, the minimum DPD estimator (MDPDE), can serve as a practical robust alternative to classical ML-based procedures. We rigorously develop several theoretical properties of this estimator and provide extensive simulations to substantiate the theory.
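To make the estimator concrete, the sketch below fits an ordinal probit model by minimising the empirical DPD objective, (1/n) Σ_i [ Σ_j p_j(x_i)^{1+α} − (1 + 1/α) p_{y_i}(x_i)^α ], which recovers the ML objective as α → 0. The probit link, single covariate, three categories, and all variable names are illustrative assumptions, not the paper's exact setup.

```python
# Minimal sketch of minimum density power divergence (DPD) estimation for an
# ordinal probit model; data-generating step and names are illustrative.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)
n, beta_true, cuts_true = 500, 1.5, np.array([-0.5, 0.8])
x = rng.normal(size=n)
latent = beta_true * x + rng.normal(size=n)       # latent continuous response
y = np.searchsorted(cuts_true, latent)            # observed ordinal labels 0,1,2

def cell_probs(theta, x):
    beta, c1, log_gap = theta
    cuts = np.array([c1, c1 + np.exp(log_gap)])   # enforce increasing cut-offs
    cdf = norm.cdf(cuts[None, :] - beta * x[:, None])
    cdf = np.hstack([np.zeros((len(x), 1)), cdf, np.ones((len(x), 1))])
    return np.diff(cdf, axis=1)                   # n x 3 category probabilities

def dpd_loss(theta, alpha=0.5):
    # Empirical DPD objective; tends to the negative log-likelihood as alpha->0.
    p = np.clip(cell_probs(theta, x), 1e-12, 1.0)
    fitted = p[np.arange(n), y]
    return np.mean(np.sum(p ** (1 + alpha), axis=1)
                   - (1 + 1 / alpha) * fitted ** alpha)

est = minimize(dpd_loss, x0=np.array([1.0, 0.0, 0.0]), method="Nelder-Mead")
print("MDPDE:", est.x[0], est.x[1], est.x[1] + np.exp(est.x[2]))
```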
We propose a new joint mean and correlation regression model for correlated multivariate discrete responses that simultaneously regresses the mean of each response against a set of covariates and the correlations between responses against a set of similarity/distance measures. A set of joint estimating equations is formulated to construct an estimator of both the mean regression coefficients and the correlation regression parameters. Under a general setting where the number of responses can tend to infinity, the joint estimator is shown to be consistent and asymptotically normally distributed, with differing rates of convergence because the mean regression coefficients are heterogeneous across responses. An iterative estimation procedure is developed to obtain parameter estimates in the required constrained parameter space. We apply the proposed model to a multivariate abundance dataset comprising overdispersed counts of 38 Carabidae ground beetle species sampled throughout Scotland, along with information about the environmental conditions of each site and the traits of each species. Results show, in particular, that the relationships between the mean abundances of the various beetle species and the environmental covariates differ across species, and that beetle total length has a statistically important effect in driving the correlations between species. Simulations demonstrate the strong finite-sample performance of the proposed estimator in terms of both point estimation and inference.
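The following is a deliberately simplified Gaussian caricature of the joint mean/correlation regression idea: per-response mean regressions, followed by a regression of residual correlations on a pairwise trait distance. The exponential correlation kernel and all names are assumptions made here for illustration; the paper's estimator solves joint estimating equations for overdispersed counts.

```python
# Gaussian caricature: estimate mean coefficients per response, then regress
# log residual correlations on trait distance to recover the kernel parameter.
import numpy as np

rng = np.random.default_rng(1)
n, p = 300, 6
X = np.column_stack([np.ones(n), rng.normal(size=n)])   # shared covariates
trait = rng.uniform(0.0, 2.0, size=p)                   # one trait per response
d = np.abs(trait[:, None] - trait[None, :])             # pairwise trait distance
gamma_true = 0.5
R = np.exp(-gamma_true * d)               # exponential kernel: valid correlation
B_true = rng.normal(size=(2, p))
L = np.linalg.cholesky(R)
Y = X @ B_true + rng.normal(size=(n, p)) @ L.T          # correlated errors

# Step 1: mean regression coefficients, response by response (plain OLS here).
B_hat = np.linalg.lstsq(X, Y, rcond=None)[0]

# Step 2: regress log residual correlations on distance
# (no intercept, since log corr = -gamma * d under the kernel above).
resid = Y - X @ B_hat
r = np.corrcoef(resid, rowvar=False)
iu = np.triu_indices(p, k=1)
logr = np.log(np.clip(r[iu], 1e-3, None))
gamma_hat = -np.sum(d[iu] * logr) / np.sum(d[iu] ** 2)
print("max mean-coefficient error:", np.abs(B_hat - B_true).max())
print("gamma_hat:", gamma_hat, "(truth 0.5)")
```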
Agents built with large language models (LLMs) have shown great potential across a wide range of domains. However, in complex decision-making tasks, pure LLM-based agents tend to exhibit an intrinsic bias in their choice of actions, inherited from the model's training data, which results in suboptimal performance. To develop strategic language agents, i.e., agents that generate flexible language actions and possess strong decision-making abilities, we propose a novel framework that powers LLM-based agents with reinforcement learning (RL). We consider Werewolf, a popular social deduction game, as a challenging testbed that emphasizes versatile communication and strategic gameplay. To mitigate the intrinsic bias in language actions, our agents use an LLM to perform deductive reasoning and generate a diverse set of action candidates. An RL policy trained to optimize decision-making then chooses one of the candidates to play in the game. Extensive experiments show that our agents overcome the intrinsic bias and outperform existing LLM-based agents in the Werewolf game. We also conduct human-agent experiments and find that our agents achieve human-level performance and demonstrate strong strategic play.
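A minimal skeleton of this "LLM proposes, RL policy chooses" loop might look as follows. The LLM call is stubbed with a fixed candidate generator, and the policy is a tiny linear-softmax scorer trained by REINFORCE; both are illustrative stand-ins, not the paper's architecture.

```python
# Skeleton of the candidate-generation / RL-selection loop; all names,
# features, and rewards are toy placeholders.
import numpy as np

rng = np.random.default_rng(0)

def llm_candidates(state):
    # Stand-in for prompting an LLM to reason deductively and return a
    # diverse set of candidate actions for the current game state.
    return [f"vote player {i}" for i in range(5)]

def featurize(state, action):
    v = np.zeros(5)                        # toy feature: one-hot of the target
    v[int(action.split()[-1])] = 1.0
    return v

theta = np.zeros(5)                        # policy parameters

def select(state, actions):
    logits = np.array([featurize(state, a) @ theta for a in actions])
    probs = np.exp(logits - logits.max()); probs /= probs.sum()
    return rng.choice(len(actions), p=probs), probs

for episode in range(2000):               # REINFORCE on episodic reward
    state = "night 1"                      # placeholder game state
    actions = llm_candidates(state)
    idx, probs = select(state, actions)
    reward = 1.0 if idx == 3 else 0.0      # pretend player 3 is the werewolf
    feats = np.array([featurize(state, a) for a in actions])
    grad = feats[idx] - probs @ feats      # grad of log softmax policy
    theta += 0.1 * reward * grad           # policy-gradient step

print("learned action preference:", np.round(theta, 2))
```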
Motivated by real-world applications such as the allocation of public housing, we examine the problem of assigning a group of agents to the vertices (e.g., spatial locations) of a network so that the diversity level is maximized. Specifically, agents are of two types (characterized by their features), and we measure diversity by the number of agents who have at least one neighbor of a different type. This problem is known to be NP-hard, and we focus on developing approximation algorithms with provable performance guarantees. We first present a local-improvement algorithm for general graphs that provides an approximation factor of 1/2. For the special case where the sizes of the agent subgroups are similar, we present a randomized approach based on semidefinite programming that yields an approximation factor better than 1/2. Further, we show that the problem can be solved efficiently when the underlying graph has bounded treewidth, and we obtain a polynomial time approximation scheme (PTAS) for the problem on planar graphs. Lastly, we conduct experiments to evaluate the performance of the proposed algorithms on synthetic and real-world networks.
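As a sketch of the local-improvement idea under stated assumptions (an arbitrary starting assignment and pairwise swaps as the only moves), the following keeps swapping two differently typed agents whenever the swap increases the diversity objective, stopping at a local optimum:

```python
# Local-improvement sketch for the two-type placement problem.
import itertools
import networkx as nx

def diversity(G, assign):
    # Number of agents with at least one neighbor of the other type.
    return sum(any(assign[u] != assign[v] for u in G[v]) for v in G)

def local_improve(G, assign):
    improved = True
    while improved:
        improved = False
        for u, v in itertools.combinations(list(G), 2):
            if assign[u] == assign[v]:
                continue                                     # same type: no-op
            before = diversity(G, assign)
            assign[u], assign[v] = assign[v], assign[u]      # tentative swap
            if diversity(G, assign) > before:
                improved = True                              # keep the swap
            else:
                assign[u], assign[v] = assign[v], assign[u]  # revert
    return assign                                            # local optimum

G = nx.cycle_graph(8)
assign = {v: int(v < 4) for v in G}   # clustered start: four agents of each type
print(diversity(G, assign), "->", diversity(G, local_improve(G, assign)))
```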
In this paper, we give a simpler proof that the relocation problem in Ricochet Robots (Lunar Lockout with fixed geometry) is PSPACE-complete, via a reduction from Finite Function Generation (FFG). Although this result was originally proven in 2003, the FFG-based reduction is simpler, and we place the result in the context of recent publications showing that relocation is also PSPACE-complete in related models.
In this paper, we propose an over-the-air (OTA)-based approach for distributed matrix-vector multiplication in the context of distributed machine learning (DML). Thanks to OTA computation, column-wise partitioning of a large matrix enables efficient workload distribution among workers (i.e., local computing nodes) according to their computing capabilities. Moreover, the system remains scalable without requiring additional bandwidth as the number of workers increases, which helps mitigate the impact of slow workers, known as stragglers. Nevertheless, some workers may still experience deep fading and become stragglers unable to transmit their results. By analyzing the mean squared error (MSE), we demonstrate that incorporating more workers into the OTA-based approach reduces the MSE without the need for additional radio resources. Furthermore, we introduce an analog coding scheme to further enhance performance and compare it with conventional coded multiplication (CM) schemes. Simulations show that the OTA-based approach achieves performance comparable to CM schemes while potentially requiring fewer radio resources.
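A toy simulation of the aggregation step may help fix ideas: the matrix is split column-wise across K workers, each worker transmits its partial product simultaneously, and a worker in a deep fade stays silent. The Rayleigh fading model, truncated channel inversion, and all parameters below are illustrative assumptions.

```python
# Toy OTA matrix-vector multiplication: more workers -> smaller lost partials
# when a deep fade silences one worker -> lower MSE.
import numpy as np

rng = np.random.default_rng(2)
m, n = 32, 64
A = rng.normal(size=(m, n))
x = rng.normal(size=n)
truth = A @ x

def ota_mse(K, fade_threshold=0.2, noise_std=0.1, trials=500):
    cols = np.array_split(np.arange(n), K)          # column-wise partitioning
    partials = [A[:, c] @ x[c] for c in cols]       # each worker's local result
    err = 0.0
    for _ in range(trials):
        y = noise_std * rng.normal(size=m)          # receiver noise
        for part in partials:
            h = np.abs(rng.normal() + 1j * rng.normal()) / np.sqrt(2)  # Rayleigh
            if h > fade_threshold:
                y += part      # truncated channel inversion cancels the fade
            # else: deep fade -- this worker stays silent (a straggler)
        err += np.mean((y - truth) ** 2)
    return err / trials

for K in (2, 4, 8, 16, 32):
    print(f"{K:2d} workers: MSE = {ota_mse(K):.4f}")
```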
In this paper, we consider the problem of distributed optimisation of a separable convex cost function over a graph, where every edge and node in the graph can carry linear equality and/or inequality constraints. We show how to modify the primal-dual method of multipliers (PDMM), originally designed for linear equality constraints, so that it can handle inequality constraints as well. The proposed algorithm does not need any slack variables; in this respect it is similar to the recent work [1], which extends the alternating direction method of multipliers (ADMM) to decomposable optimisation with linear equality and inequality constraints. Using convex analysis, monotone operator theory, and fixed-point theory, we derive the update equations of the modified PDMM algorithm by applying Peaceman-Rachford splitting to the monotone inclusion related to the lifted dual problem. To incorporate the inequality constraints, we impose a non-negativity constraint on the associated dual variables. This additional constraint results in a reflection operator modelling the data exchange in the network, instead of the permutation operator derived for equality-constrained PDMM. Convergence is established for both synchronous and stochastic update schemes of PDMM; the latter include asynchronous update schemes and update schemes with transmission losses. Experiments show that PDMM converges notably faster than the extended ADMM of [1].
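The core twist, keeping the dual variables of inequality constraints non-negative, can be seen on a two-node toy problem. The snippet below is plain projected dual ascent, a deliberately simplified stand-in rather than PDMM itself, whose updates instead follow from Peaceman-Rachford splitting on the lifted dual:

```python
# minimise (x1 - 4)^2 + (x2 - 0)^2  subject to  x1 - x2 <= 1
a, b, c = 4.0, 0.0, 1.0
lam, step = 0.0, 0.4          # lam: dual variable of the inequality constraint
for _ in range(200):
    x1 = a - lam / 2          # node 1: argmin of its local Lagrangian
    x2 = b + lam / 2          # node 2: argmin of its local Lagrangian
    lam = max(0.0, lam + step * (x1 - x2 - c))   # dual step + non-negativity
print(x1, x2, lam)            # expect x1=2.5, x2=1.5, lam=3 (constraint active)
```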
Synthetic control (SC) methods have recently gained rapid popularity in economics, where they have been applied to infer the effects of treatments on standard continuous outcomes under assumed linear input-output relations. In medical applications, conversely, survival outcomes are often of primary interest, a setting in which both the commonly assumed data-generating processes (DGPs) and the target parameters differ. In this paper, we therefore investigate whether and when SCs can serve as an alternative to matching methods in survival analyses. We find that, because SCs rely on a linearity assumption, they will generally be biased for the true expected survival time in commonly assumed survival DGPs -- even when taking into account the possibility of linearity on another scale, as in accelerated failure time models. Additionally, we find that, because SC units follow distributions with lower variance than real control units, summaries of their distributions, such as survival curves, will be biased for the parameters of interest in many survival analyses. Nonetheless, we also highlight that using SCs can still improve upon matching whenever the biases described above are outweighed by the extrapolation biases exhibited by imperfect matches, and we investigate the use of regularization to trade off the shortcomings of the two approaches.
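The variance-shrinkage point lends itself to a small numerical illustration: a synthetic control is a convex combination of donor units, so its outcome distribution is less dispersed than any single donor's. The Gaussian outcomes and the SLSQP weight solver below are illustrative assumptions.

```python
# Fit simplex-constrained SC weights on pre-period data, then compare the
# dispersion of the SC combination with that of a single donor.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)
n_donors, n_pre = 20, 10
X_donors = rng.normal(size=(n_donors, n_pre))   # donors' pre-period outcomes
x_treated = rng.normal(size=n_pre)              # treated unit, pre-period

def sc_loss(w):
    return np.sum((x_treated - w @ X_donors) ** 2)

cons = ({"type": "eq", "fun": lambda w: w.sum() - 1.0},)
res = minimize(sc_loss, np.full(n_donors, 1 / n_donors),
               bounds=[(0, 1)] * n_donors, constraints=cons, method="SLSQP")
w = res.x

post = rng.normal(size=(n_donors, 5000))        # i.i.d. post-period draws
print("variance of one donor:      ", post[0].var())
print("variance of SC combination: ", (w @ post).var())  # strictly smaller
```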
Semantic segmentation plays a crucial role in various computer vision applications, yet its efficacy is often hindered by the lack of high-quality labeled data. To address this challenge, a common strategy is to leverage models trained on data from different populations, such as publicly available datasets. This approach, however, leads to the distribution shift problem, resulting in reduced performance on the population of interest. In scenarios where model errors can have significant consequences, selective prediction methods offer a means to mitigate risks and reduce reliance on expert supervision. This paper investigates selective prediction for semantic segmentation in low-resource settings, focusing on post-hoc confidence estimators applied to pre-trained models operating under distribution shift. We propose a novel image-level confidence measure tailored to semantic segmentation and demonstrate its effectiveness through experiments on three medical imaging tasks. Our findings show that post-hoc confidence estimators offer a cost-effective approach to reducing the impact of distribution shift.
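A minimal post-hoc sketch of image-level selective prediction for segmentation is shown below: per-pixel softmax confidences are aggregated into one score per image, and images scoring below a threshold are referred to an expert. The mean-confidence aggregator is a generic baseline, not the specific measure proposed in the paper, and the random logits are placeholders for a real model's output.

```python
# Post-hoc image-level selective prediction from per-pixel softmax confidences.
import numpy as np

rng = np.random.default_rng(4)
scale = rng.uniform(0.5, 3.0, size=(8, 1, 1, 1))    # per-image "certainty" level
logits = rng.normal(size=(8, 3, 64, 64)) * scale    # batch, classes, H, W

def softmax(z, axis):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

probs = softmax(logits, axis=1)
pixel_conf = probs.max(axis=1)                      # per-pixel max probability
image_conf = pixel_conf.mean(axis=(1, 2))           # image-level aggregation

threshold = 0.6
accept = image_conf >= threshold                    # predict vs. refer to expert
print("image scores:", np.round(image_conf, 3))
print("accepted for automatic prediction:", accept.sum(), "of", len(accept))
```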
In this paper, we study the inference accuracy of Resistive Random Access Memory (ReRAM) neuromorphic circuits under stuck-at faults (stuck-on, stuck-off, and stuck at a certain resistive value). A Python-based simulation framework is used to perform supervised machine learning (a neural network with one input layer, 3 hidden layers, and one output layer) on handwritten digits and to construct a corresponding fully analog neuromorphic circuit (4 synaptic arrays) simulated in Spectre, using a generic 45nm Process Development Kit (PDK). We study the difference in inference accuracy degradation caused by stuck-on and stuck-off defects, considering various defect patterns including circular, ring, row, column, and circular-complement defects. We find that stuck-on and stuck-off defects have a similar effect on inference accuracy; however, if there is spatial defect variation across the columns, the inference accuracy may degrade significantly. We also propose a machine learning (ML) strategy to recover the inference accuracy lost to stuck-at faults, improving the inference accuracy of a defective neuromorphic circuit from 48% to 85%.
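A software-level sketch of the fault-injection experiment, under strong simplifying assumptions (synthetic data, a single-layer classifier, no circuit simulation), is given below: train a classifier, clamp a random subset of weights to the maximum (stuck-on) or minimum (stuck-off) value, and re-measure accuracy.

```python
# Stuck-at fault injection into a trained weight matrix; the paper instead
# simulates a full multi-layer analog circuit in Spectre.
import numpy as np

rng = np.random.default_rng(5)
n, d, classes = 2000, 20, 4
centers = rng.normal(scale=3.0, size=(classes, d))
y = rng.integers(classes, size=n)
X = centers[y] + rng.normal(size=(n, d))            # synthetic "digit" features

W = np.zeros((d, classes))                          # one-layer softmax classifier
onehot = np.eye(classes)[y]
for _ in range(300):                                # gradient-descent training
    z = X @ W
    z -= z.max(axis=1, keepdims=True)
    p = np.exp(z); p /= p.sum(axis=1, keepdims=True)
    W -= 0.1 / n * X.T @ (p - onehot)

def accuracy(W):
    return np.mean((X @ W).argmax(axis=1) == y)

def inject(W, frac, mode):
    W = W.copy()
    mask = rng.random(W.shape) < frac               # randomly faulted cells
    W[mask] = W.max() if mode == "on" else W.min()  # stuck-at weight value
    return W

print("fault-free accuracy:", accuracy(W))
for frac in (0.05, 0.2, 0.5):
    print(frac, "stuck-on:", accuracy(inject(W, frac, "on")),
          "stuck-off:", accuracy(inject(W, frac, "off")))
```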
Machine learning techniques have become deeply rooted in our everyday life. However, since achieving good learning performance is knowledge- and labor-intensive, human experts are heavily involved in every aspect of machine learning. To make machine learning techniques easier to apply and to reduce the demand for experienced human experts, automated machine learning (AutoML) has emerged as a hot topic of both industrial and academic interest. In this paper, we provide an up-to-date survey of AutoML. First, we introduce and define the AutoML problem, drawing inspiration from both automation and machine learning. Then, we propose a general AutoML framework that not only covers most existing approaches to date but can also guide the design of new methods. Subsequently, we categorize and review existing work along two dimensions: the problem setup and the employed techniques. Finally, we provide a detailed analysis of AutoML approaches and explain the reasons underlying their successful applications. We hope this survey serves not only as an insightful guideline for AutoML beginners but also as an inspiration for future research.
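The search-evaluate-keep loop at the heart of such frameworks can be sketched in a few lines. The random search over two scikit-learn model families below is a minimal illustration of that loop, not any specific AutoML system.

```python
# Minimal AutoML-style loop: sample a configuration, evaluate it by
# cross-validation, and keep the best-scoring candidate.
import random
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

random.seed(0)
X, y = make_classification(n_samples=400, n_features=20, random_state=0)

def sample_config():
    # Configuration space: model family plus its hyperparameters.
    if random.random() < 0.5:
        return RandomForestClassifier(
            n_estimators=random.choice([50, 100, 200]),
            max_depth=random.choice([3, 5, None]), random_state=0)
    return LogisticRegression(C=10 ** random.uniform(-2, 2), max_iter=500)

best_score, best_model = -1.0, None
for _ in range(20):                                     # search budget
    model = sample_config()
    score = cross_val_score(model, X, y, cv=3).mean()   # evaluation
    if score > best_score:
        best_score, best_model = score, model           # keep the incumbent
print(best_score, best_model)
```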