The fairness of clustering algorithms has gained widespread attention across various areas, including machine learning, In this paper, we study fair $k$-means clustering in Euclidean space. Given a dataset comprising several groups, the fairness constraint requires that each cluster should contain a proportion of points from each group within specified lower and upper bounds. Due to these fairness constraints, determining the optimal locations of $k$ centers is a quite challenging task. We propose a novel ``Relax and Merge'' framework that returns a $(1+4\rho + O(\epsilon))$-approximate solution, where $\rho$ is the approximate ratio of an off-the-shelf vanilla $k$-means algorithm and $O(\epsilon)$ can be an arbitrarily small positive number. If equipped with a PTAS of $k$-means, our solution can achieve an approximation ratio of $(5+O(\epsilon))$ with only a slight violation of the fairness constraints, which improves the current state-of-the-art approximation guarantee. Furthermore, using our framework, we can also obtain a $(1+4\rho +O(\epsilon))$-approximate solution for the $k$-sparse Wasserstein Barycenter problem, which is a fundamental optimization problem in the field of optimal transport, and a $(2+6\rho)$-approximate solution for the strictly fair $k$-means clustering with no violation, both of which are better than the current state-of-the-art methods. In addition, the empirical results demonstrate that our proposed algorithm can significantly outperform baseline approaches in terms of clustering cost.
The rapid adaptation of data driven AI models, such as deep learning inference, training, Vision Transformers (ViTs), and other HPC applications, drives a strong need for runtime precision configurable different non linear activation functions (AF) hardware support. Existing solutions support diverse precision or runtime AF reconfigurability but fail to address both simultaneously. This work proposes a flexible and SIMD multiprecision processing element (FlexPE), which supports diverse runtime configurable AFs, including sigmoid, tanh, ReLU and softmax, and MAC operation. The proposed design achieves an improved throughput of up to 16X FxP4, 8X FxP8, 4X FxP16 and 1X FxP32 in pipeline mode with 100% time multiplexed hardware. This work proposes an area efficient multiprecision iterative mode in the SIMD systolic arrays for edge AI use cases. The design delivers superior performance with up to 62X and 371X reductions in DMA reads for input feature maps and weight filters in VGG16, with an energy efficiency of 8.42 GOPS / W within the accuracy loss of 2%. The proposed architecture supports emerging 4-bit computations for DL inference while enhancing throughput in FxP8/16 modes for transformers and other HPC applications. The proposed approach enables future energy-efficient AI accelerators in edge and cloud environments.
Predicting future dynamics is crucial for applications like autonomous driving and robotics, where understanding the environment is key. Existing pixel-level methods are computationally expensive and often focus on irrelevant details. To address these challenges, we introduce $\texttt{DINO-Foresight}$, a novel framework that operates in the semantic feature space of pretrained Vision Foundation Models (VFMs). Our approach trains a masked feature transformer in a self-supervised manner to predict the evolution of VFM features over time. By forecasting these features, we can apply off-the-shelf, task-specific heads for various scene understanding tasks. In this framework, VFM features are treated as a latent space, to which different heads attach to perform specific tasks for future-frame analysis. Extensive experiments show that our framework outperforms existing methods, demonstrating its robustness and scalability. Additionally, we highlight how intermediate transformer representations in $\texttt{DINO-Foresight}$ improve downstream task performance, offering a promising path for the self-supervised enhancement of VFM features. We provide the implementation code at //github.com/Sta8is/DINO-Foresight .
Efficient algorithms for solving the Smallest Enclosing Sphere (SES) problem, such as Welzl's algorithm, often fail to handle degenerate subsets of points in 3D space. Degeneracies and ill-posed configurations present significant challenges, leading to failures in convergence, inaccuracies or increased computational cost in such cases. Existing improvements to these algorithms, while addressing some of these issues, are either computationally expensive or only partially effective. In this paper, we propose a hybrid algorithm designed to mitigate degeneracy while maintaining an overall computational complexity of $O(N)$. By combining robust preprocessing steps with efficient core computations, our approach avoids the pitfalls of degeneracy without sacrificing scalability. The proposed method is validated through theoretical analysis and experimental results, demonstrating its efficacy in addressing degenerate configurations and achieving high efficiency in practice.
Deep reinforcement learning (DRL) has emerged as an innovative solution for controlling legged robots in challenging environments using minimalist architectures. Traditional control methods for legged robots, such as inverse dynamics, either directly manage joint torques or use proportional-derivative (PD) controllers to regulate joint positions at a higher level. In case of DRL, direct torque control presents significant challenges, leading to a preference for joint position control. However, this approach necessitates careful adjustment of joint PD gains, which can limit both adaptability and efficiency. In this paper, we propose GainAdaptor, an adaptive gain control framework that autonomously tunes joint PD gains to enhance terrain adaptability and energy efficiency. The framework employs a dual-actor algorithm to dynamically adjust the PD gains based on varying ground conditions. By utilizing a divided action space, GainAdaptor efficiently learns stable and energy-efficient locomotion. We validate the effectiveness of the proposed method through experiments conducted on a Unitree Go1 robot, demonstrating improved locomotion performance across diverse terrains.
Since classical machine learning has become a powerful tool for developing data-driven algorithms, quantum machine learning is expected to similarly impact the development of quantum algorithms. The literature reflects a mutually beneficial relationship between machine learning and quantum computing, where progress in one field frequently drives improvements in the other. Motivated by the fertile connection between machine learning and quantum computing enabled by parameterized quantum circuits, we use a resource-efficient and scalable Single-Qubit Quantum Neural Network (SQQNN) for both regression and classification tasks. The SQQNN leverages parameterized single-qubit unitary operators and quantum measurements to achieve efficient learning. To train the model, we use gradient descent for regression tasks. For classification, we introduce a novel training method inspired by the Taylor series, which can efficiently find a global minimum in a single step. This approach significantly accelerates training compared to iterative methods. Evaluated across various applications, the SQQNN exhibits virtually error-free and strong performance in regression and classification tasks, including the MNIST dataset. These results demonstrate the versatility, scalability, and suitability of the SQQNN for deployment on near-term quantum devices.
Deep reinforcement learning (DRL) has revolutionised quadruped robot locomotion, but existing control frameworks struggle to generalise beyond their training-induced observational scope, resulting in limited adaptability. In contrast, animals achieve exceptional adaptability through gait transition strategies, diverse gait utilisation, and seamless adjustment to immediate environmental demands. Inspired by these capabilities, we present a novel DRL framework that incorporates key attributes of animal locomotion: gait transition strategies, pseudo gait procedural memory, and adaptive motion adjustments. This approach enables our framework to achieve unparalleled adaptability, demonstrated through blind zero-shot deployment on complex terrains and recovery from critically unstable states. Our findings offer valuable insights into the biomechanics of animal locomotion, paving the way for robust, adaptable robotic systems.
The choice of data representation is a key factor in the success of deep learning in geometric tasks. For instance, DUSt3R has recently introduced the concept of viewpoint-invariant point maps, generalizing depth prediction, and showing that one can reduce all the key problems in the 3D reconstruction of static scenes to predicting such point maps. In this paper, we develop an analogous concept for a very different problem, namely, the reconstruction of the 3D shape and pose of deformable objects. To this end, we introduce the Dual Point Maps (DualPM), where a pair of point maps is extracted from the same image, one associating pixels to their 3D locations on the object, and the other to a canonical version of the object at rest pose. We also extend point maps to amodal reconstruction, seeing through self-occlusions to obtain the complete shape of the object. We show that 3D reconstruction and 3D pose estimation reduce to the prediction of the DualPMs. We demonstrate empirically that this representation is a good target for a deep network to predict; specifically, we consider modeling horses, showing that DualPMs can be trained purely on 3D synthetic data, consisting of a single model of a horse, while generalizing very well to real images. With this, we improve by a large margin previous methods for the 3D analysis and reconstruction of this type of objects.
Recent advances of data-driven machine learning have revolutionized fields like computer vision, reinforcement learning, and many scientific and engineering domains. In many real-world and scientific problems, systems that generate data are governed by physical laws. Recent work shows that it provides potential benefits for machine learning models by incorporating the physical prior and collected data, which makes the intersection of machine learning and physics become a prevailing paradigm. In this survey, we present this learning paradigm called Physics-Informed Machine Learning (PIML) which is to build a model that leverages empirical data and available physical prior knowledge to improve performance on a set of tasks that involve a physical mechanism. We systematically review the recent development of physics-informed machine learning from three perspectives of machine learning tasks, representation of physical prior, and methods for incorporating physical prior. We also propose several important open research problems based on the current trends in the field. We argue that encoding different forms of physical prior into model architectures, optimizers, inference algorithms, and significant domain-specific applications like inverse engineering design and robotic control is far from fully being explored in the field of physics-informed machine learning. We believe that this study will encourage researchers in the machine learning community to actively participate in the interdisciplinary research of physics-informed machine learning.
The incredible development of federated learning (FL) has benefited various tasks in the domains of computer vision and natural language processing, and the existing frameworks such as TFF and FATE has made the deployment easy in real-world applications. However, federated graph learning (FGL), even though graph data are prevalent, has not been well supported due to its unique characteristics and requirements. The lack of FGL-related framework increases the efforts for accomplishing reproducible research and deploying in real-world applications. Motivated by such strong demand, in this paper, we first discuss the challenges in creating an easy-to-use FGL package and accordingly present our implemented package FederatedScope-GNN (FS-G), which provides (1) a unified view for modularizing and expressing FGL algorithms; (2) comprehensive DataZoo and ModelZoo for out-of-the-box FGL capability; (3) an efficient model auto-tuning component; and (4) off-the-shelf privacy attack and defense abilities. We validate the effectiveness of FS-G by conducting extensive experiments, which simultaneously gains many valuable insights about FGL for the community. Moreover, we employ FS-G to serve the FGL application in real-world E-commerce scenarios, where the attained improvements indicate great potential business benefits. We publicly release FS-G, as submodules of FederatedScope, at //github.com/alibaba/FederatedScope to promote FGL's research and enable broad applications that would otherwise be infeasible due to the lack of a dedicated package.
With the advances of data-driven machine learning research, a wide variety of prediction problems have been tackled. It has become critical to explore how machine learning and specifically deep learning methods can be exploited to analyse healthcare data. A major limitation of existing methods has been the focus on grid-like data; however, the structure of physiological recordings are often irregular and unordered which makes it difficult to conceptualise them as a matrix. As such, graph neural networks have attracted significant attention by exploiting implicit information that resides in a biological system, with interactive nodes connected by edges whose weights can be either temporal associations or anatomical junctions. In this survey, we thoroughly review the different types of graph architectures and their applications in healthcare. We provide an overview of these methods in a systematic manner, organized by their domain of application including functional connectivity, anatomical structure and electrical-based analysis. We also outline the limitations of existing techniques and discuss potential directions for future research.