[18F]-Fluorodeoxyglucose (FDG) positron emission tomography - computed tomography (PET-CT) has become the imaging modality of choice for diagnosing many cancers. Co-learning complementary PET-CT imaging features is a fundamental requirement for automatic tumor segmentation and for developing computer-aided cancer diagnosis systems. In this study, we propose a hyper-connected transformer (HCT) network that integrates a transformer network (TN) with hyper-connected fusion for multi-modality PET-CT images. The TN was leveraged for its ability to model global dependencies in image feature learning, achieved by using image patch embeddings with a self-attention mechanism to capture image-wide contextual information. We extended the single-modality definition of TN with multiple TN-based branches to separately extract image features. We also introduced hyper-connected fusion to fuse the contextual and complementary image features across the transformer branches in an iterative manner. Our results on two clinical datasets show that HCT achieved higher segmentation accuracy than existing methods.
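A minimal PyTorch sketch of the multi-branch idea, with a plain linear fusion standing in for the paper's hyper-connected fusion; all module names and sizes here are illustrative assumptions, not the HCT architecture itself:

```python
import torch
import torch.nn as nn

class TwoBranchFusion(nn.Module):
    """Illustrative two-branch transformer with iterative cross-branch fusion
    (a simplified stand-in for the paper's hyper-connected fusion)."""
    def __init__(self, dim=64, depth=2, heads=4):
        super().__init__()
        layer = lambda: nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.pet_layers = nn.ModuleList([layer() for _ in range(depth)])
        self.ct_layers = nn.ModuleList([layer() for _ in range(depth)])
        self.fuse = nn.ModuleList([nn.Linear(2 * dim, dim) for _ in range(depth)])

    def forward(self, pet_tokens, ct_tokens):
        # pet_tokens, ct_tokens: (batch, num_patches, dim) patch embeddings
        fused = torch.zeros_like(pet_tokens)
        for pet_l, ct_l, fuse_l in zip(self.pet_layers, self.ct_layers, self.fuse):
            pet_tokens = pet_l(pet_tokens + fused)  # inject fused context back
            ct_tokens = ct_l(ct_tokens + fused)     # into each modality branch
            fused = fuse_l(torch.cat([pet_tokens, ct_tokens], dim=-1))
        return fused  # image-wide, cross-modality features for a seg head

pet = torch.randn(1, 196, 64)  # e.g., 14x14 patches embedded to 64-d
ct = torch.randn(1, 196, 64)
print(TwoBranchFusion()(pet, ct).shape)  # torch.Size([1, 196, 64])
```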
Knowledge distillation (KD) is a challenging yet promising technique for compressing deep learning models, in which extensive learned representations are transferred from proficient but computationally intensive teacher models to compact student models. However, only a handful of studies have attempted to compress models for single image super-resolution (SISR) through KD, and their effect on student models has remained marginal. In this paper, we put forth an approach from the perspective of efficient data utilization, namely Data Upcycling Knowledge Distillation (DUKD), which transfers the teacher's prior knowledge to the student via upcycled in-domain data derived from the training inputs. This upcycling is realized through two efficient image-zooming operations and invertible data augmentations, which introduce label-consistency regularization to KD for SISR and substantially boost the student model's generalization. Owing to its versatility, DUKD can be applied across a broad spectrum of teacher-student architectures. Comprehensive experiments on diverse benchmarks demonstrate that DUKD significantly outperforms prior art, with gains of up to 0.5 dB in PSNR over baseline methods and an RCAN student with 67% fewer parameters performing on par with its RCAN teacher.
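As an illustration of the label-consistency idea, here is a minimal sketch using one invertible augmentation (horizontal flip, which commutes with super-resolution); the toy networks and the exact upcycling/zooming operations are placeholders, not the paper's:

```python
import torch

def flip(x):  # horizontal flip: an invertible augmentation that commutes with SR
    return torch.flip(x, dims=[-1])

def dukd_consistency_loss(student, teacher, lr_img):
    """Illustrative label-consistency term in the spirit of DUKD: the student,
    fed the transformed input, should match the transformed teacher label."""
    with torch.no_grad():
        target = flip(teacher(lr_img))   # teacher label, then transform
    pred = student(flip(lr_img))         # student sees the transformed input
    return torch.nn.functional.l1_loss(pred, target)

# Toy 2x super-resolution networks just to make the sketch runnable.
student = torch.nn.Sequential(torch.nn.Upsample(scale_factor=2),
                              torch.nn.Conv2d(3, 3, 3, padding=1))
teacher = torch.nn.Sequential(torch.nn.Upsample(scale_factor=2, mode='bilinear'),
                              torch.nn.Conv2d(3, 3, 3, padding=1))
lr = torch.rand(4, 3, 32, 32)
print(dukd_consistency_loss(student, teacher, lr))
```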
Privacy-preserving price e-negotiation (3PEN) is an important topic in secure multi-party computation (SMC) for electronic commerce, and the key point of its security is to guarantee the privacy of the seller's and buyer's prices. In this study, a novel and efficient quantum solution to the 3PEN problem is proposed, where an oracle operation and a qubit comparator are utilized to obtain the comparison results of the buyer's and seller's prices, and quantum counting is then executed to tally the total number of products that meet the trading conditions. Analysis shows that our solution not only guarantees the correctness and privacy of 3PEN but also has lower communication complexity than classical solutions.
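For reference, the (non-private) functionality the protocol computes can be stated in a few lines of Python; the trading condition shown here (bid at least the ask) is an assumption for illustration, and the quantum protocol obtains the comparisons and the count without revealing individual prices:

```python
# Ideal functionality of 3PEN: compare each buyer bid with the seller ask and
# count how many products satisfy the trading condition. In the quantum
# protocol, the qubit comparator yields the comparisons and quantum counting
# yields the total, both without disclosing the prices themselves.
def tradable_count(buyer_bids, seller_asks):
    return sum(b >= s for b, s in zip(buyer_bids, seller_asks))

print(tradable_count([10, 5, 8], [7, 6, 8]))  # 2 products meet the condition
```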
The performance of Hamiltonian Monte Carlo depends crucially on its parameters, in particular the integration timestep and the number of integration steps. We present an adaptive general-purpose framework that automatically tunes these parameters based on a loss function promoting fast exploration of phase space. To this end, we use a fully differentiable setup and optimize via backpropagation. An attention-like loss allows gradient-driven learning of the distribution of integration steps. We also highlight the importance of jittering for a smooth loss surface. Our approach is demonstrated on the one-dimensional harmonic oscillator and on alanine dipeptide, a small peptide commonly used as a test case for simulation methods. We find good correspondence between our loss and the autocorrelation times, resulting in well-tuned parameters for Hamiltonian Monte Carlo.
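A stripped-down sketch of the differentiable tuning loop for the 1D harmonic oscillator, assuming an acceptance-weighted expected squared jump distance as the exploration loss (a common stand-in; the paper's attention-like loss over the number of integration steps is more elaborate):

```python
import torch

def leapfrog(q, p, eps, n_steps, grad_U):
    # Differentiable leapfrog integrator: eps carries gradients, so the
    # timestep can be tuned by backpropagating through the trajectory.
    p = p - 0.5 * eps * grad_U(q)
    for _ in range(n_steps - 1):
        q = q + eps * p
        p = p - eps * grad_U(q)
    q = q + eps * p
    p = p - 0.5 * eps * grad_U(q)
    return q, p

grad_U = lambda q: q                    # harmonic oscillator, U(q) = q^2 / 2
log_eps = torch.tensor(-1.0, requires_grad=True)
opt = torch.optim.Adam([log_eps], lr=0.05)
for _ in range(300):
    eps = log_eps.exp() * (0.9 + 0.2 * torch.rand(()))  # jitter smooths the loss
    q0, p0 = torch.randn(512), torch.randn(512)         # a batch of chains
    q1, p1 = leapfrog(q0, p0, eps, n_steps=5, grad_U=grad_U)
    dH = 0.5 * (q1**2 + p1**2) - 0.5 * (q0**2 + p0**2)  # energy error
    accept = torch.exp(-torch.clamp(dH, min=0.0))       # MH acceptance probability
    loss = -(accept * (q1 - q0) ** 2).mean()            # acceptance-weighted jump
    opt.zero_grad(); loss.backward(); opt.step()
print(float(log_eps.exp()))             # tuned timestep
```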
We introduce an efficient stochastic interacting particle-field (SIPF) algorithm, with no history dependence, for computing aggregation patterns and near-singular solutions of the parabolic-parabolic Keller-Segel (KS) chemotaxis system in three space dimensions (3D). The KS solutions are approximated as empirical measures of particles coupled with a smoother field variable (the chemo-attractant concentration) computed by a spectral method. Instead of using heat kernels, which cause history dependence and high memory cost, we leverage an implicit Euler discretization to derive a one-step recursion in time for the stochastic particle positions and the field variable, based on the explicit Green's function of an elliptic operator of the form Laplacian minus a positive constant. In numerical experiments, we observe that the resulting SIPF algorithm is convergent and self-adaptive to the high-gradient parts of solutions. Despite the lack of analytical knowledge of the blowup (e.g., a self-similar ansatz), the SIPF algorithm provides a low-cost way to study the emergence of finite-time blowup in 3D with only dozens of Fourier modes, by varying the amount of initial mass and tracking the evolution of the field variable. Notably, the algorithm handles with ease multi-modal initial data and the subsequent complex evolution involving the merging of particle clusters and the formation of a finite-time singularity.
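A much-simplified NumPy sketch of one time step, to show the one-step recursion: nearest-node deposition and a numerical gradient are used here for brevity, whereas the paper works with the elliptic Green's function in closed form for the particle update as well:

```python
import numpy as np

# One implicit-Euler step of a simplified SIPF-type scheme: particles carry
# the cell density, the chemo-attractant field c lives on a Fourier grid.
N, M, L, dt, chi = 2000, 16, 2 * np.pi, 1e-3, 1.0
X = np.random.randn(N, 3) * 0.3 % L             # particle positions in [0, L)^3
c = np.zeros((M, M, M))
k = 2 * np.pi * np.fft.fftfreq(M, d=L / M)      # angular wavenumbers
k2 = k[:, None, None]**2 + k[None, :, None]**2 + k[None, None, :]**2

def step(X, c):
    # Deposit particles on the grid (nearest-node empirical measure).
    idx = np.floor(X / L * M).astype(int) % M
    rho = np.zeros((M, M, M))
    np.add.at(rho, (idx[:, 0], idx[:, 1], idx[:, 2]), (1.0 / N) * M**3 / L**3)
    # Implicit Euler for the field: (1/dt + 1 - Laplacian) c_new = c/dt + rho,
    # inverted exactly in Fourier space (the elliptic Green's function).
    c_new = np.fft.ifftn(np.fft.fftn(c / dt + rho) / (1 / dt + 1 + k2)).real
    # Drift particles up the chemo-attractant gradient plus Brownian noise.
    grad = np.stack(np.gradient(c_new, L / M), axis=-1)
    gX = grad[idx[:, 0], idx[:, 1], idx[:, 2]]
    X_new = (X + dt * chi * gX + np.sqrt(2 * dt) * np.random.randn(N, 3)) % L
    return X_new, c_new

X, c = step(X, c)
```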
The all-to-all collective communication primitive is widely used in machine learning (ML) and high-performance computing (HPC) workloads, and optimizing its performance is of interest to both communities. All-to-all is a particularly challenging workload that can severely strain the underlying interconnect bandwidth at scale, mainly because the number of messages that must be serviced simultaneously scales quadratically, compounded by large message sizes. This paper takes a holistic approach to optimizing all-to-all collective communication on supercomputer-scale direct-connect interconnects. We address several algorithmic and practical challenges: developing efficient, bandwidth-optimal all-to-all schedules for any topology; lowering the schedules to various backends and fabrics that may or may not expose additional forwarding bandwidth; establishing an upper bound on all-to-all throughput; and exploring novel topologies that deliver near-optimal all-to-all performance.
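For context, the classic linear-shift schedule below is the standard contention-free baseline on p fully connected ranks; the paper's contribution is bandwidth-optimal schedules for arbitrary direct-connect topologies, which this sketch does not reproduce:

```python
# Linear-shift all-to-all on p ranks: in step k, rank i sends its chunk for
# rank (i + k) mod p and receives from rank (i - k) mod p, spreading all
# p*(p-1) messages over p-1 steps with no two messages sharing an endpoint.
def linear_shift_schedule(p):
    return [[(i, (i + k) % p) for i in range(p)] for k in range(1, p)]

for step, pairs in enumerate(linear_shift_schedule(4), 1):
    print(f"step {step}: {pairs}")  # (sender, receiver) pairs per step
```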
Nonlinear model predictive control (NMPC) is typically restricted to short, finite horizons to limit the computational burden of online optimization. This makes a global planner necessary to avoid local minima when using NMPC for navigation in complex environments, so the performance of NMPC approaches is often limited by that of the global planner. While control policies trained with reinforcement learning (RL) can in principle learn to avoid such local minima, they usually cannot guarantee enforcement of general state constraints. In this paper, we augment a sampling-based stochastic NMPC (SNMPC) approach with an RL-trained, perception-informed value function. This allows the system to avoid observable local minima in the environment by reasoning about perception information beyond the finite planning horizon. By using Probably Approximately Correct NMPC (PAC-NMPC) as our base controller, we are also able to generate statistical guarantees of performance and safety. We demonstrate our approach in simulation and on hardware using a 1/10th-scale rally car with lidar.
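A toy sketch of the core idea: a sampling-based MPC step whose rollout cost is augmented with a learned terminal value, so a short horizon can account for what lies beyond it. The dynamics, cost, and value function below are placeholders, and PAC-NMPC's statistical guarantees are not reproduced:

```python
import numpy as np

def sampled_mpc_action(state, dynamics, cost, value_fn, horizon=10, n_samples=256):
    """Pick the first action of the best sampled rollout, where each rollout's
    cost includes a terminal value estimate beyond the planning horizon."""
    best_cost, best_action = np.inf, None
    for _ in range(n_samples):
        actions = np.random.uniform(-1.0, 1.0, size=horizon)
        s, total = state, 0.0
        for a in actions:
            total += cost(s, a)
            s = dynamics(s, a)
        total -= value_fn(s)          # terminal value reaches past the horizon
        if total < best_cost:
            best_cost, best_action = total, actions[0]
    return best_action

# Toy 1D example: drive the state to the origin.
dynamics = lambda s, a: s + 0.1 * a
cost = lambda s, a: s**2 + 0.01 * a**2
value_fn = lambda s: -(s**2)          # stand-in for the RL-trained value
print(sampled_mpc_action(1.0, dynamics, cost, value_fn))
```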
Graph-based collaborative filtering has emerged as a powerful paradigm for delivering personalized recommendations. Despite their demonstrated effectiveness, these methods often neglect the underlying intents of users, which constitute a pivotal facet of comprehensive user interests. Consequently, a series of approaches have arisen to tackle this limitation by introducing independent intent representations. However, these approaches fail to capture the intricate relationships between the intents of different users and the compatibility between user intents and item properties. To remedy these issues, we propose a novel method, named uniformly co-clustered intent modeling. Specifically, we devise a uniformly contrastive intent-modeling module that brings together the embeddings of users with similar intents and of items with similar properties. This module models the nuanced relations between the intents of different users and the properties of different items, especially those unreachable from each other on the user-item graph. To model the compatibility between user intents and item properties, we design a user-item co-clustering module that maximizes the mutual information of co-clusters of users and items. We substantiate this design theoretically, establishing its efficacy in modeling compatibility so as to enhance the mutual information between user and item representations. Comprehensive experiments on various real-world datasets verify the effectiveness of the proposed framework.
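One way to picture the contrastive intent module is an InfoNCE-style loss over intent embeddings, pulling together users with similar intents; how similar-intent pairs are mined, and the co-clustering objective itself, follow the paper rather than this sketch:

```python
import torch
import torch.nn.functional as F

def intent_contrastive_loss(intents, pos_index, temperature=0.2):
    """Pull each user's intent embedding toward a designated similar-intent
    user (pos_index) and push it away from the rest of the batch."""
    z = F.normalize(intents, dim=-1)
    logits = z @ z.t() / temperature                 # cosine similarities
    eye = torch.eye(len(z), dtype=torch.bool)
    logits = logits.masked_fill(eye, float('-inf'))  # exclude self-pairs
    return F.cross_entropy(logits, pos_index)

intents = torch.randn(8, 32, requires_grad=True)  # 8 users, 32-d intents
pos_index = (torch.arange(8) + 1) % 8             # hypothetical similar partners
print(intent_contrastive_loss(intents, pos_index))
```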
To alleviate expensive human labeling, semi-supervised semantic segmentation employs a few labeled images and an abundance of unlabeled images to predict pixel-level label maps of the same size as the input. Previous methods often adopt co-training with two convolutional networks of the same architecture but different initialization, which fails to capture sufficiently diverse features. This motivates us to use tri-training and develop a triple-view encoder that employs encoders with different architectures to derive diverse features, and to exploit knowledge distillation to learn the complementary semantics among these encoders. Moreover, existing methods simply concatenate the features from encoder and decoder, resulting in redundant features that incur a large memory cost. This inspires us to devise a dual-frequency decoder that selects important features by projecting features from the spatial domain to the frequency domain, where a dual-frequency channel attention mechanism models feature importance. We therefore propose a Triple-view Knowledge Distillation framework, termed TriKD, for semi-supervised semantic segmentation, comprising the triple-view encoder and the dual-frequency decoder. Extensive experiments on two benchmarks, \ie, Pascal VOC 2012 and Cityscapes, verify the superiority of the proposed method, with a good tradeoff between precision and inference speed.
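A rough sketch of the dual-frequency idea: project feature maps to the frequency domain, summarize per-channel energy in a low and a high band, and gate channels accordingly. The band split and gating below are illustrative assumptions, not the TriKD decoder's exact design:

```python
import torch
import torch.nn as nn

class DualFrequencyAttention(nn.Module):
    """Frequency-domain channel selection: weight channels by their energy
    in low- and high-frequency bands of the feature spectrum."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        def gate():
            return nn.Sequential(nn.Linear(channels, channels // reduction),
                                 nn.ReLU(),
                                 nn.Linear(channels // reduction, channels),
                                 nn.Sigmoid())
        self.low_gate, self.high_gate = gate(), gate()

    def forward(self, x):                        # x: (B, C, H, W)
        spec = torch.fft.rfft2(x, norm='ortho')  # spatial -> frequency domain
        h = spec.shape[-2] // 4                  # crude low/high band split
        low = spec.abs()[..., :h, :].mean(dim=(-2, -1))   # per-channel energy
        high = spec.abs()[..., h:, :].mean(dim=(-2, -1))
        w = self.low_gate(low) + self.high_gate(high)     # dual-band weights
        return x * w.unsqueeze(-1).unsqueeze(-1)

x = torch.randn(2, 64, 32, 32)
print(DualFrequencyAttention(64)(x).shape)  # torch.Size([2, 64, 32, 32])
```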
Few-shot Knowledge Graph (KG) completion is a focus of current research, where each task aims to query unseen facts of a relation given its few-shot reference entity pairs. Recent attempts solve this problem by learning static representations of entities and references, ignoring their dynamic properties, i.e., entities may exhibit diverse roles within task relations, and references may make different contributions to queries. This work proposes an adaptive attentional network for few-shot KG completion that learns adaptive entity and reference representations. Specifically, entities are modeled by an adaptive neighbor encoder to discern their task-oriented roles, while references are modeled by an adaptive query-aware aggregator to differentiate their contributions. Through the attention mechanism, both entities and references capture their fine-grained semantic meanings and thus render more expressive representations, which are more predictive for knowledge acquisition in the few-shot scenario. Evaluation in link prediction on two public datasets shows that our approach achieves new state-of-the-art results across different few-shot sizes.
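The query-aware aggregation can be pictured as attention over the reference pairs, so each query induces its own weighting of the references instead of a static average; the embedding sizes below are arbitrary:

```python
import torch
import torch.nn.functional as F

def query_aware_aggregate(query, references):
    """Weight each few-shot reference pair by its attention score against
    the current query, yielding a per-query reference representation."""
    # query: (d,), references: (K, d) embeddings of the K reference pairs
    scores = references @ query / query.shape[0] ** 0.5
    weights = F.softmax(scores, dim=0)          # per-query reference weights
    return weights @ references                 # adaptive aggregation

refs = torch.randn(5, 100)                      # 5-shot references, 100-d
q = torch.randn(100)
print(query_aware_aggregate(q, refs).shape)     # torch.Size([100])
```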
Graph Neural Networks (GNNs) have demonstrated superior performance in many challenging applications, including few-shot learning tasks. Despite their powerful capacity to learn and generalize from few samples, GNNs usually suffer from severe over-fitting and over-smoothing as the model becomes deep, which limits scalability. In this work, we propose a novel Attentive GNN to tackle these challenges by incorporating a triple-attention mechanism, \ie, node self-attention, neighborhood attention, and layer memory attention. We explain why the proposed attentive modules improve GNN for few-shot learning with theoretical analysis and illustrations. Extensive experiments show that the proposed Attentive GNN outperforms state-of-the-art GNN-based few-shot learning methods on the mini-ImageNet and Tiered-ImageNet datasets, under both inductive and transductive settings.
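A sketch of the node self-attention component, with attention computed across node features followed by a plain graph aggregation; the neighborhood and layer-memory attention modules are omitted, and all shapes are illustrative assumptions:

```python
import torch
import torch.nn as nn

class NodeSelfAttention(nn.Module):
    """Attend across node features before aggregating over the graph
    (one of the three attention modules in the Attentive GNN)."""
    def __init__(self, dim):
        super().__init__()
        self.qkv = nn.Linear(dim, 3 * dim)

    def forward(self, h, adj):                  # h: (N, d), adj: (N, N)
        q, k, v = self.qkv(h).chunk(3, dim=-1)
        att = torch.softmax(q @ k.t() / q.shape[-1] ** 0.5, dim=-1)
        h = att @ v                             # attention across nodes
        return adj @ h                          # then graph aggregation

h = torch.randn(10, 16)                         # 10 nodes, 16-d features
adj = torch.eye(10)                             # placeholder adjacency
print(NodeSelfAttention(16)(h, adj).shape)      # torch.Size([10, 16])
```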