In this paper, we aim to establish an approximation theory and a learning theory for distribution regression via a fully connected neural network (FNN). In contrast to classical regression methods, the input variables of distribution regression are probability measures, so a second-stage sampling process is often needed to approximate the information carried by each input distribution. On the other hand, the classical neural network structure requires the input variable to be a vector. When the input samples are probability distributions, traditional deep neural network methods cannot be applied directly, which creates the main difficulty for distribution regression. A well-defined neural network structure for distribution inputs is therefore highly desirable, yet there has been no mathematical model or theoretical analysis of neural network realizations of distribution regression. To overcome these technical difficulties and address this gap, we establish a novel fully connected neural network framework that realizes an approximation theory for functionals defined on the space of Borel probability measures. Furthermore, based on these functional approximation results, we derive almost optimal learning rates (up to logarithmic terms) for the proposed distribution regression model in the hypothesis space induced by the novel FNN structure with distribution inputs, via a novel two-stage error decomposition technique.
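To make the setup concrete, here is a minimal sketch, assuming a PyTorch environment, of feeding second-stage samples drawn from an input distribution into a fully connected network after a fixed-length featurization; the moment-feature map and the `DistributionFNN` class are illustrative assumptions for exposition, not the construction analyzed in the paper.

```python
import numpy as np
import torch
import torch.nn as nn

# Illustrative only: summarize a second-stage sample {x_1, ..., x_m} ~ mu by
# empirical moment features, then apply a fully connected network. Both the
# featurization and the architecture are assumptions, not the paper's method.
def moment_features(samples: np.ndarray, max_order: int = 3) -> np.ndarray:
    """Empirical moments of a 1-D sample, used as a fixed-length surrogate for mu."""
    return np.array([np.mean(samples ** k) for k in range(1, max_order + 1)])

class DistributionFNN(nn.Module):
    def __init__(self, in_dim: int = 3, width: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, width), nn.ReLU(),
            nn.Linear(width, width), nn.ReLU(),
            nn.Linear(width, 1),
        )

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        return self.net(features)

# Example: estimate a functional F(mu) from m second-stage samples.
rng = np.random.default_rng(0)
samples = rng.normal(loc=0.5, scale=1.0, size=200)        # draws from an unknown mu
phi = torch.tensor(moment_features(samples), dtype=torch.float32)
model = DistributionFNN(in_dim=3)
prediction = model(phi)                                   # estimate of F(mu)
```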
Due to mutual occlusion, severe scale variation, and complex spatial distributions, current multi-person mesh recovery methods cannot produce accurate absolute body poses and shapes in large-scale crowded scenes. To address these obstacles, we fully exploit crowd features for reconstructing groups of people from a monocular image. A novel hypergraph relational reasoning network is proposed to model the complex, high-order relational correlations among individuals and groups in the crowd. We first extract compact human features and location information from the original high-resolution image. By conducting relational reasoning on the extracted individual features, the underlying crowd collectiveness and interaction relationships provide additional group information for the reconstruction. Finally, the updated individual features and the localization information are used to regress human meshes in camera coordinates. To facilitate network training, we further build pseudo ground truth on two crowd datasets, which may also promote future research on pose estimation and human behavior understanding in crowded scenes. The experimental results show that our approach outperforms other baseline methods in both crowded and common scenarios. The code and datasets are publicly available at //github.com/boycehbz/GroupRec.
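As a rough illustration of relational reasoning over a hypergraph in which individuals are nodes and groups are hyperedges, the layer below performs generic node-to-hyperedge and hyperedge-to-node message passing; the `HypergraphLayer` module and the incidence-matrix encoding are assumptions for exposition, not the network proposed in the paper.

```python
import torch
import torch.nn as nn

# Illustrative hypergraph message passing: individuals are nodes, groups are
# hyperedges encoded by an incidence matrix H (N x E). This generic layer is an
# assumption for exposition, not the architecture proposed in the paper.
class HypergraphLayer(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.node_to_edge = nn.Linear(dim, dim)
        self.edge_to_node = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor, H: torch.Tensor) -> torch.Tensor:
        # x: (N, dim) individual features; H: (N, E) binary incidence matrix.
        deg_e = H.sum(dim=0).clamp(min=1.0)                 # hyperedge degrees
        deg_v = H.sum(dim=1).clamp(min=1.0)                 # node degrees
        edge_feat = (H.t() @ self.node_to_edge(x)) / deg_e.unsqueeze(1)
        agg = (H @ self.edge_to_node(edge_feat)) / deg_v.unsqueeze(1)
        return torch.relu(x + agg)                          # residual update

# Example: 6 people grouped into 2 hyperedges.
x = torch.randn(6, 32)
H = torch.tensor([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]],
                 dtype=torch.float32)
updated = HypergraphLayer(32)(x, H)
```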
Crossed random effects structures arise in many scientific contexts. They raise severe computational problems, with likelihood and Bayesian computations scaling like $N^{3/2}$ or worse for $N$ data points. In this paper we develop a composite likelihood approach for crossed random effects probit models. For data arranged in rows and columns, one likelihood uses marginal distributions of the responses as if they were independent, another uses a hierarchical model capturing all within-row dependence as if the rows were independent, and a third reverses the roles of rows and columns. We find that this method has a cost that grows as $\mathrm{O}(N)$ in crossed random effects settings where the Laplace approximation has a cost that grows superlinearly. We show how to obtain consistent estimates of the probit slope and variance components by maximizing those three likelihoods. The algorithm scales readily to a data set of five million observations from Stitch Fix.
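To fix notation, with responses $y_{ij}$ for row $i$ and column $j$, one hedged way to write the three component log-likelihoods for a fully observed $R \times C$ layout is
$$
\ell_{\mathrm{marg}}(\theta)=\sum_{i,j}\log \Pr_\theta(Y_{ij}=y_{ij}),\qquad
\ell_{\mathrm{row}}(\theta)=\sum_{i=1}^{R}\log \Pr_\theta(Y_{i1}=y_{i1},\ldots,Y_{iC}=y_{iC}),\qquad
\ell_{\mathrm{col}}(\theta)=\sum_{j=1}^{C}\log \Pr_\theta(Y_{1j}=y_{1j},\ldots,Y_{Rj}=y_{Rj}),
$$
where $\theta$ collects the probit slope and the row and column variance components. This sketch only indicates the structure of the three likelihoods; the paper's precise composite construction, any weighting, and which parameters each component identifies may differ.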
In this paper, we describe research on how perceptual load can affect programming performance in people with symptoms of Attention Deficit / Hyperactivity Disorder (ADHD). We asked developers to complete the Barkley Deficits in Executive Functioning Scale, which indicates the presence and severity of ADHD symptoms. Participants then solved mentally active programming tasks (coding) and monotonous ones (debugging) in an integrated development environment, in a high perceptual load mode (visually noisy) and a low perceptual load mode (visually clear). The development environment was augmented with a plugin we wrote to track efficiency metrics, i.e., time, speed, and activity. We found that perceptual load does affect programmers' efficiency. For mentally active tasks, the time to insert the first character was shorter and the overall speed was higher in the low perceptual load mode. For monotonous tasks, the total solution time was shorter in the low perceptual load mode. We also found that the effect of perceptual load on programmers' efficiency differs between those with and without ADHD symptoms. This effect is specific: depending on the efficiency measure and the ADHD symptoms, either level of perceptual load may be beneficial. Our findings support the idea of behavioral assessment of users to provide appropriate accommodations for a workforce with special needs.
In this paper, we study the classic submodular maximization problem subject to a group equality constraint under both non-adaptive and adaptive settings. It has been shown that the utility functions of many machine learning applications, including data summarization, influence maximization in social networks, and personalized recommendation, satisfy the property of submodularity. Hence, maximizing a submodular function subject to various constraints lies at the heart of many of those applications. At a high level, submodular maximization aims to select a set of the most representative items (e.g., data points). However, the design of most existing algorithms does not incorporate a fairness constraint, leading to under- or over-representation of some particular groups. This motivates us to study the submodular maximization problem with group equality, where we aim to select a set of items to maximize a (possibly non-monotone) submodular utility function subject to a group equality constraint. To this end, we develop the first constant-factor approximation algorithm for this problem. The design of our algorithm is robust enough to be extended to the submodular maximization problem under a more complicated adaptive setting. Moreover, we further extend our study to incorporate a global cardinality constraint and other fairness notions.
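To illustrate what a group equality constraint looks like in practice, the sketch below runs a plain greedy selection with a per-group cardinality cap. It is a simple heuristic for exposition only; it is not the constant-factor algorithm developed in the paper, which also handles non-monotone objectives and the adaptive setting.

```python
# Illustrative greedy under a group-equality constraint: pick at most
# `per_group_cap` items from each group while chasing marginal gains.
def greedy_group_equal(items, group_of, f, per_group_cap):
    selected, counts, candidates = [], {}, set(items)
    while candidates:
        best, best_gain = None, 0.0
        for v in candidates:
            if counts.get(group_of[v], 0) >= per_group_cap:
                continue
            gain = f(selected + [v]) - f(selected)       # marginal gain of v
            if gain > best_gain:
                best, best_gain = v, gain
        if best is None:
            break
        selected.append(best)
        counts[group_of[best]] = counts.get(group_of[best], 0) + 1
        candidates.remove(best)
    return selected

# Toy example: coverage-style submodular objective over two groups A and B.
universe = {0: {1, 2}, 1: {2, 3}, 2: {3, 4}, 3: {4, 5}}
group_of = {0: "A", 1: "A", 2: "B", 3: "B"}
f = lambda S: len(set().union(*[universe[v] for v in S])) if S else 0
print(greedy_group_equal(list(universe), group_of, f, per_group_cap=1))
```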
In this paper we provide an algorithmic framework based on Langevin diffusion (LD) and its corresponding discretizations that allows us to simultaneously obtain: i) an algorithm for sampling from the exponential mechanism, whose privacy analysis does not depend on convexity and which can be stopped at any time without compromising privacy, and ii) tight uniform stability guarantees for the exponential mechanism. As a direct consequence, we obtain optimal excess empirical and population risk guarantees for (strongly) convex losses under both pure and approximate differential privacy (DP). The framework also allows us to design a DP uniform sampler from the Rashomon set. Rashomon sets are widely used in interpretable and robust machine learning, understanding variable importance, and characterizing fairness.
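For intuition, when the exponential mechanism's target density is proportional to $\exp(-\beta L(\theta))$ for a loss $L$ and an inverse temperature $\beta$ set by the privacy budget, the standard unadjusted Langevin discretization takes the form
$$
\theta_{k+1} = \theta_k - \eta\,\beta\,\nabla L(\theta_k) + \sqrt{2\eta}\,\xi_k, \qquad \xi_k \sim \mathcal{N}(0, I_d),
$$
whose continuous-time limit has stationary density proportional to $\exp(-\beta L(\theta))$. This is only a textbook sketch of the discretization; the step-size conditions, stopping rules, and privacy accounting developed in the paper are omitted here.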
In this paper, we investigate the impact of imbalanced data on the convergence of distributed dual coordinate ascent in a tree network for solving an empirical loss minimization problem in distributed machine learning. To address this issue, we propose a method called delayed generalized distributed dual coordinate ascent that takes information about the imbalanced data into account, and we provide an analysis of the proposed algorithm. Numerical experiments confirm the effectiveness of the proposed method in improving the convergence speed of distributed dual coordinate ascent in a tree network.
In this research, we propose a novel technique for visualizing nonstationarity in geostatistics, particularly when confronted with a single realization of data at irregularly spaced locations. Our method hinges on formulating a statistic that tracks a stable microergodic parameter of the exponential covariance function, allowing us to address the intricate challenges of nonstationary processes that lack repeated measurements. We implement the fused lasso technique to elucidate nonstationary patterns at various resolutions. For prediction purposes, we segment the spatial domain into stationary subregions via Voronoi tessellations. Additionally, we devise a robust test for stationarity based on contrasting the sample means of our proposed statistics between two selected Voronoi subregions. The effectiveness of our method is demonstrated through simulation studies and through its application to a precipitation dataset in Colorado.
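For context, recall the standard fact for the isotropic exponential covariance model
$$
C(h) = \sigma^2 \exp(-h/\rho), \qquad h \ge 0,
$$
under fixed-domain (infill) asymptotics: only the ratio $\sigma^2/\rho$, the microergodic parameter, is consistently estimable, while $\sigma^2$ and $\rho$ are not individually identifiable. The specific statistic that tracks this parameter in the proposed method is not reproduced here.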
In this paper, we develop a geometric framework for generating non-slip quadrupedal two-beat gaits. We consider a four-bar mechanism as a surrogate model for a contact state and develop geometric tools to aid in gait generation: a shape-change basis, the local connection as the matrix equation of motion, and stratified panels to model net locomotion, in line with previous work \cite{prasad2023contactswitch}. Standard two-beat gaits in quadrupedal systems, such as the trot, divide the shape space into two equal, decoupled subspaces. The subgaits generated in each subspace are designed independently and, when combined with appropriate phasing, generate a two-beat gait in which the displacements add up due to the geometric nature of the system. By adding ``scaling'' and ``sliding'' control knobs to subgaits defined as flows over the shape-change basis, we continuously steer an arbitrary planar quadrupedal system, which exhibits translational anisotropy when modulated using the scaling inputs. To characterize the steering induced by the sliding inputs, we define an average path curvature function analytically and show that steering gaits can be generated using a geometric non-slip contact modeling framework.
In this paper, we study the application of deep reinforcement learning (DRL) algorithms in the context of local navigation problems, in which a robot moves towards a goal location in unknown and cluttered workspaces equipped only with limited-range exteroceptive sensors, such as LiDAR. Collision avoidance policies based on DRL present some advantages, but they are quite susceptible to local minima, because their capacity to learn suitable actions is limited by the sensor range. Since most robots perform tasks in unstructured environments, it is of great interest to seek generalized local navigation policies capable of avoiding local minima, especially in untrained scenarios. To this end, we propose a novel reward function that incorporates map information gained in the training stage, increasing the agent's capacity to deliberate about the best course of action. We also use the Soft Actor-Critic (SAC) algorithm to train our artificial neural network (ANN), which proves more effective than other algorithms in the state-of-the-art literature. A set of sim-to-sim and sim-to-real experiments shows that our proposed reward combined with SAC outperforms the compared methods in terms of local-minimum avoidance and collision avoidance.
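To convey the general idea of map-informed reward shaping, the sketch below rewards progress measured on a geodesic distance field computed from a training map and penalizes collisions. This is purely illustrative: the paper's actual reward terms, weights, and use of map information are not reproduced here.

```python
import numpy as np

# Purely illustrative reward shaping with map information: D is a geodesic
# distance-to-goal field computed on the training map, so "progress" accounts
# for obstacles rather than straight-line distance. Weights are assumptions.
def shaped_reward(prev_cell, cell, D, collided,
                  w_progress=1.0, w_collision=-10.0, w_step=-0.01):
    progress = D[prev_cell] - D[cell]            # decrease in map-aware distance to goal
    reward = w_progress * progress + w_step      # small per-step cost discourages dithering
    if collided:
        reward += w_collision
    return reward

# Toy 1-D corridor: D[i] is the geodesic distance from cell i to the goal at cell 0.
D = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
print(shaped_reward(prev_cell=3, cell=2, D=D, collided=False))   # positive: moved closer
```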
In this paper, we propose a hybrid model combining a genetic algorithm and a hill climbing algorithm for optimizing Convolutional Neural Networks (CNNs) on the CIFAR-100 dataset. The proposed model maintains a population of chromosomes that represent the hyperparameters of the CNN model. The genetic algorithm selects and breeds the fittest chromosomes to generate new offspring, and the hill climbing algorithm is then applied to the offspring to further optimize their hyperparameters. A mutation operation is introduced to diversify the population and to prevent the algorithm from getting stuck in local optima. In short, the genetic algorithm performs global search and exploration of the search space, while hill climbing performs local optimization of promising solutions. The objective function is the accuracy of the trained neural network on the CIFAR-100 test set. The performance of the hybrid model is evaluated by comparing it with the standard genetic algorithm and the hill climbing algorithm. The experimental results demonstrate that the proposed hybrid model achieves better accuracy with fewer generations than the standard algorithms. Therefore, the proposed hybrid model can be a promising approach for optimizing CNN models on large datasets.
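The following is a schematic sketch of such a hybrid loop over CNN hyperparameters. The fitness evaluation (training a CNN on CIFAR-100 and measuring accuracy) is stubbed out, and the search space, rates, and operators are illustrative assumptions rather than the exact configuration used in the paper.

```python
import random

# Schematic GA + hill climbing hybrid over CNN hyperparameters (illustrative).
SPACE = {"lr": [1e-4, 1e-3, 1e-2], "filters": [32, 64, 128], "dropout": [0.2, 0.3, 0.5]}

def random_chromosome():
    return {k: random.choice(v) for k, v in SPACE.items()}

def fitness(chrom):
    # Placeholder: train the CNN described by `chrom` and return its accuracy.
    return random.random()

def crossover(a, b):
    return {k: random.choice([a[k], b[k]]) for k in SPACE}

def mutate(chrom, rate=0.2):
    return {k: (random.choice(SPACE[k]) if random.random() < rate else v)
            for k, v in chrom.items()}

def hill_climb(chrom, steps=3):
    best, best_fit = chrom, fitness(chrom)
    for _ in range(steps):
        neighbor = mutate(best, rate=0.5)          # perturb one or more genes
        fit = fitness(neighbor)
        if fit > best_fit:
            best, best_fit = neighbor, fit
    return best

population = [random_chromosome() for _ in range(10)]
for generation in range(5):
    ranked = sorted(population, key=fitness, reverse=True)
    parents = ranked[: len(ranked) // 2]           # selection of the fittest half
    offspring = [mutate(crossover(*random.sample(parents, 2))) for _ in parents]
    population = parents + [hill_climb(c) for c in offspring]   # local refinement
best = max(population, key=fitness)
```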