If the probability model is correctly specified, then we can estimate the covariance matrix of the asymptotic maximum likelihood estimate distribution using either the first or second derivatives of the likelihood function. Therefore, if the determinants of these two different covariance matrix estimation formulas differ this indicates model misspecification. This misspecification detection strategy is the basis of the Determinant Information Matrix Test ($GIMT_{Det}$). To investigate the performance of the $GIMT_{Det}$, a Deterministic Input Noisy And gate (DINA) Cognitive Diagnostic Model (CDM) was fit to the Fraction-Subtraction dataset. Next, various misspecified versions of the original DINA CDM were fit to bootstrap data sets generated by sampling from the original fitted DINA CDM. The $GIMT_{Det}$ showed good discrimination performance for larger levels of misspecification. In addition, the $GIMT_{Det}$ did not detect model misspecification when model misspecification was not present and additionally did not detect model misspecification when the level of misspecification was very low. However, the $GIMT_{Det}$ discrimation performance was highly variable across different misspecification strategies when the misspecification level was moderately sized. The proposed new misspecification detection methodology is promising but additional empirical studies are required to further characterize its strengths and limitations.
Despite their remarkable success, large language models (LLMs) have shown limited ability on applied tasks such as vulnerability detection. We investigate various prompting strategies for vulnerability detection and, as part of this exploration, propose a prompting strategy that integrates natural language descriptions of vulnerabilities with a contrastive chain-of-thought reasoning approach, augmented using contrastive samples from a synthetic dataset. Our study highlights the potential of LLMs to detect vulnerabilities by integrating natural language descriptions, contrastive reasoning, and synthetic examples into a comprehensive prompting framework. Our results show that this approach can enhance LLM understanding of vulnerabilities. On a high-quality vulnerability detection dataset such as SVEN, our prompting strategies can improve accuracies, F1-scores, and pairwise accuracies by 23%, 11%, and 14%, respectively.
A scaled conjugate gradient method that accelerates existing adaptive methods utilizing stochastic gradients is proposed for solving nonconvex optimization problems with deep neural networks. It is shown theoretically that, whether with constant or diminishing learning rates, the proposed method can obtain a stationary point of the problem. Additionally, its rate of convergence with diminishing learning rates is verified to be superior to that of the conjugate gradient method. The proposed method is shown to minimize training loss functions faster than the existing adaptive methods in practical applications of image and text classification. Furthermore, in the training of generative adversarial networks, one version of the proposed method achieved the lowest Frechet inception distance score among those of the adaptive methods.
The study of biomechanics during locomotion provides valuable insights into the effects of varying conditions on specific movement patterns. This research focuses on examining the influence of different shoe parameters on walking biomechanics, aiming to understand their impact on gait patterns. To achieve this, various methodologies are explored to estimate human body biomechanics, including computer vision techniques and wearable devices equipped with advanced sensors. Given privacy considerations and the need for robust, accurate measurements, this study employs wearable devices with Inertial Measurement Unit (IMU) sensors. These devices offer a non-invasive, precise, and high-resolution approach to capturing biomechanical data during locomotion. Raw sensor data collected from wearable devices is processed using an Extended Kalman Filter to reduce noise and extract meaningful information. This includes calculating joint angles throughout the gait cycle, enabling a detailed analysis of movement dynamics. The analysis identifies correlations between shoe parameters and key gait characteristics, such as stability, mobility, step time, and propulsion forces. The findings provide deeper insights into how footwear design influences walking efficiency and biomechanics. This study paves the way for advancements in footwear technology and contributes to the development of personalized solutions for enhancing gait performance and mobility.
Objective: Configuring a prosthetic leg is an integral part of the fitting process, but the personalization of a multi-modal powered knee-ankle prosthesis is often too complex to realize in a clinical environment. This paper develops both the technical means to individualize a hybrid kinematic-impedance controller for variable-incline walking and sit-stand transitions, and an intuitive Clinical Tuning Interface (CTI) that allows prosthetists to directly modify the controller behavior. Methods: Utilizing an established method for predicting kinematic gait individuality alongside a new parallel approach for kinetic individuality, we applied tuned characteristics exclusively from level-ground walking to personalize continuous-phase/task models of joint kinematics and impedance. To take advantage of this method, we developed a CTI that translates common clinical tuning parameters into model adjustments. We then conducted a case study involving an above-knee amputee participant where a prosthetist iteratively tuned the prosthesis in a simulated clinical session involving walking and sit-stand transitions. Results: The prosthetist fully tuned the multi-activity prosthesis controller in under 20 min. Each iteration of tuning (i.e., observation, parameter adjustment, and model reprocessing) took 2 min on average for walking and 1 min on average for sit-stand. The tuned behavior changes were appropriately manifested in the commanded prosthesis torques, both at the tuned tasks and across untuned tasks (inclines). Conclusion: The CTI leveraged able-bodied trends to efficiently personalize a wide array of walking tasks and sit-stand transitions. A case-study validated the CTI tuning method and demonstrated the efficiency necessary for powered knee-ankle prostheses to become clinically viable.
Graph neural networks (GNNs) have been demonstrated to be a powerful algorithmic model in broad application fields for their effectiveness in learning over graphs. To scale GNN training up for large-scale and ever-growing graphs, the most promising solution is distributed training which distributes the workload of training across multiple computing nodes. However, the workflows, computational patterns, communication patterns, and optimization techniques of distributed GNN training remain preliminarily understood. In this paper, we provide a comprehensive survey of distributed GNN training by investigating various optimization techniques used in distributed GNN training. First, distributed GNN training is classified into several categories according to their workflows. In addition, their computational patterns and communication patterns, as well as the optimization techniques proposed by recent work are introduced. Second, the software frameworks and hardware platforms of distributed GNN training are also introduced for a deeper understanding. Third, distributed GNN training is compared with distributed training of deep neural networks, emphasizing the uniqueness of distributed GNN training. Finally, interesting issues and opportunities in this field are discussed.
As soon as abstract mathematical computations were adapted to computation on digital computers, the problem of efficient representation, manipulation, and communication of the numerical values in those computations arose. Strongly related to the problem of numerical representation is the problem of quantization: in what manner should a set of continuous real-valued numbers be distributed over a fixed discrete set of numbers to minimize the number of bits required and also to maximize the accuracy of the attendant computations? This perennial problem of quantization is particularly relevant whenever memory and/or computational resources are severely restricted, and it has come to the forefront in recent years due to the remarkable performance of Neural Network models in computer vision, natural language processing, and related areas. Moving from floating-point representations to low-precision fixed integer values represented in four bits or less holds the potential to reduce the memory footprint and latency by a factor of 16x; and, in fact, reductions of 4x to 8x are often realized in practice in these applications. Thus, it is not surprising that quantization has emerged recently as an important and very active sub-area of research in the efficient implementation of computations associated with Neural Networks. In this article, we survey approaches to the problem of quantizing the numerical values in deep Neural Network computations, covering the advantages/disadvantages of current methods. With this survey and its organization, we hope to have presented a useful snapshot of the current research in quantization for Neural Networks and to have given an intelligent organization to ease the evaluation of future research in this area.
We consider the problem of explaining the predictions of graph neural networks (GNNs), which otherwise are considered as black boxes. Existing methods invariably focus on explaining the importance of graph nodes or edges but ignore the substructures of graphs, which are more intuitive and human-intelligible. In this work, we propose a novel method, known as SubgraphX, to explain GNNs by identifying important subgraphs. Given a trained GNN model and an input graph, our SubgraphX explains its predictions by efficiently exploring different subgraphs with Monte Carlo tree search. To make the tree search more effective, we propose to use Shapley values as a measure of subgraph importance, which can also capture the interactions among different subgraphs. To expedite computations, we propose efficient approximation schemes to compute Shapley values for graph data. Our work represents the first attempt to explain GNNs via identifying subgraphs explicitly and directly. Experimental results show that our SubgraphX achieves significantly improved explanations, while keeping computations at a reasonable level.
Deep neural networks have revolutionized many machine learning tasks in power systems, ranging from pattern recognition to signal processing. The data in these tasks is typically represented in Euclidean domains. Nevertheless, there is an increasing number of applications in power systems, where data are collected from non-Euclidean domains and represented as the graph-structured data with high dimensional features and interdependency among nodes. The complexity of graph-structured data has brought significant challenges to the existing deep neural networks defined in Euclidean domains. Recently, many studies on extending deep neural networks for graph-structured data in power systems have emerged. In this paper, a comprehensive overview of graph neural networks (GNNs) in power systems is proposed. Specifically, several classical paradigms of GNNs structures (e.g., graph convolutional networks, graph recurrent neural networks, graph attention networks, graph generative networks, spatial-temporal graph convolutional networks, and hybrid forms of GNNs) are summarized, and key applications in power systems such as fault diagnosis, power prediction, power flow calculation, and data generation are reviewed in detail. Furthermore, main issues and some research trends about the applications of GNNs in power systems are discussed.
Object detection typically assumes that training and test data are drawn from an identical distribution, which, however, does not always hold in practice. Such a distribution mismatch will lead to a significant performance drop. In this work, we aim to improve the cross-domain robustness of object detection. We tackle the domain shift on two levels: 1) the image-level shift, such as image style, illumination, etc, and 2) the instance-level shift, such as object appearance, size, etc. We build our approach based on the recent state-of-the-art Faster R-CNN model, and design two domain adaptation components, on image level and instance level, to reduce the domain discrepancy. The two domain adaptation components are based on H-divergence theory, and are implemented by learning a domain classifier in adversarial training manner. The domain classifiers on different levels are further reinforced with a consistency regularization to learn a domain-invariant region proposal network (RPN) in the Faster R-CNN model. We evaluate our newly proposed approach using multiple datasets including Cityscapes, KITTI, SIM10K, etc. The results demonstrate the effectiveness of our proposed approach for robust object detection in various domain shift scenarios.
While it is nearly effortless for humans to quickly assess the perceptual similarity between two images, the underlying processes are thought to be quite complex. Despite this, the most widely used perceptual metrics today, such as PSNR and SSIM, are simple, shallow functions, and fail to account for many nuances of human perception. Recently, the deep learning community has found that features of the VGG network trained on the ImageNet classification task has been remarkably useful as a training loss for image synthesis. But how perceptual are these so-called "perceptual losses"? What elements are critical for their success? To answer these questions, we introduce a new Full Reference Image Quality Assessment (FR-IQA) dataset of perceptual human judgments, orders of magnitude larger than previous datasets. We systematically evaluate deep features across different architectures and tasks and compare them with classic metrics. We find that deep features outperform all previous metrics by huge margins. More surprisingly, this result is not restricted to ImageNet-trained VGG features, but holds across different deep architectures and levels of supervision (supervised, self-supervised, or even unsupervised). Our results suggest that perceptual similarity is an emergent property shared across deep visual representations.