Protein structure-based property prediction has emerged as a promising approach for various biological tasks, such as protein function prediction and sub-cellular location estimation. The existing methods highly rely on experimental protein structure data and fail in scenarios where these data are unavailable. Predicted protein structures from AI tools (e.g., AlphaFold2) were utilized as alternatives. However, we observed that current practices, which simply employ accurately predicted structures during inference, suffer from notable degradation in prediction accuracy. While similar phenomena have been extensively studied in general fields (e.g., Computer Vision) as model robustness, their impact on protein property prediction remains unexplored. In this paper, we first investigate the reason behind the performance decrease when utilizing predicted structures, attributing it to the structure embedding bias from the perspective of structure representation learning. To study this problem, we identify a Protein 3D Graph Structure Learning Problem for Robust Protein Property Prediction (PGSL-RP3), collect benchmark datasets, and present a protein Structure embedding Alignment Optimization framework (SAO) to mitigate the problem of structure embedding bias between the predicted and experimental protein structures. Extensive experiments have shown that our framework is model-agnostic and effective in improving the property prediction of both predicted structures and experimental structures. The benchmark datasets and codes will be released to benefit the community.
Semi-supervised object detection is crucial for 3D scene understanding, efficiently addressing the limitation of acquiring large-scale 3D bounding box annotations. Existing methods typically employ a teacher-student framework with pseudo-labeling to leverage unlabeled point clouds. However, producing reliable pseudo-labels in a diverse 3D space still remains challenging. In this work, we propose Diffusion-SS3D, a new perspective of enhancing the quality of pseudo-labels via the diffusion model for semi-supervised 3D object detection. Specifically, we include noises to produce corrupted 3D object size and class label distributions, and then utilize the diffusion model as a denoising process to obtain bounding box outputs. Moreover, we integrate the diffusion model into the teacher-student framework, so that the denoised bounding boxes can be used to improve pseudo-label generation, as well as the entire semi-supervised learning process. We conduct experiments on the ScanNet and SUN RGB-D benchmark datasets to demonstrate that our approach achieves state-of-the-art performance against existing methods. We also present extensive analysis to understand how our diffusion model design affects performance in semi-supervised learning.
Horizontal gene transfer inference approaches are usually based on gene sequences: parametric methods search for patterns that deviate from a particular genomic signature, while phylogenetic methods use sequences to reconstruct the gene and species trees. However, it is well-known that sequences have difficulty identifying ancient transfers since mutations have enough time to erase all evidence of such events. In this work, we ask whether character-based methods can predict gene transfers. Their advantage over sequences is that homologous genes can have low DNA similarity, but still have retained enough important common motifs that allow them to have common character traits, for instance the same functional or expression profile. A phylogeny that has two separate clades that acquired the same character independently might indicate the presence of a transfer even in the absence of sequence similarity. We introduce perfect transfer networks, which are phylogenetic networks that can explain the character diversity of a set of taxa under the assumption that characters have unique births, and that once a character is gained it is rarely lost. Examples of such traits include transposable elements, biochemical markers and emergence of organelles, just to name a few. We study the differences between our model and two similar models: perfect phylogenetic networks and ancestral recombination networks. Our goals are to initiate a study on the structural and algorithmic properties of perfect transfer networks. We then show that in polynomial time, one can decide whether a given network is a valid explanation for a set of taxa, and show how, for a given tree, one can add transfer edges to it so that it explains a set of taxa. We finally provide lower and upper bounds on the number of transfers required to explain a set of taxa, in the worst case.
Knowledge distillation (KD) emerges as a promising yet challenging technique for compressing deep neural networks, aiming to transfer extensive learning representations from proficient and computationally intensive teacher models to compact student models. However, current KD methods for super-resolution (SR) models have limited performance and restricted applications, since the characteristics of SR tasks are overlooked. In this paper, we put forth an approach from the perspective of effective data utilization, namely, the Data Upcycling Knowledge Distillation (DUKD), which facilitates the student model by the prior knowledge the teacher provided through the upcycled in-domain data derived from the input images. Besides, for the first time, we realize the label consistency regularization in KD for SR models, which is implemented by the paired invertible data augmentations. It constrains the training process of KD and leads to better generalization capability of the student model. The DUKD, due to its versatility, can be applied across a broad spectrum of teacher-student architectures (e.g., CNN and Transformer models) and SR tasks, such as single image SR, real-world SR, and SR quantization, and is in parallel with other compression techniques. Comprehensive experiments on diverse benchmarks demonstrate that the DUKD method significantly outperforms previous art.
Recent work has shown the utility of developing machine learning models that respect the structure and symmetries of eigenvectors. These works promote sign invariance, since for any eigenvector v the negation -v is also an eigenvector. However, we show that sign invariance is theoretically limited for tasks such as building orthogonally equivariant models and learning node positional encodings for link prediction in graphs. In this work, we demonstrate the benefits of sign equivariance for these tasks. To obtain these benefits, we develop novel sign equivariant neural network architectures. Our models are based on a new analytic characterization of sign equivariant polynomials and thus inherit provable expressiveness properties. Controlled synthetic experiments show that our networks can achieve the theoretically predicted benefits of sign equivariant models. Code is available at //github.com/cptq/Sign-Equivariant-Nets.
The integration of multi-omics data has emerged as a promising approach for gaining comprehensive insights into complex diseases such as cancer. This paper proposes a novel approach to identify cancer subtypes through the integration of multi-omics data for clustering. The proposed method, named LIDAF utilises affinity matrices based on linear relationships between and within different omics datasets (Linear Inter and Intra Dataset Affinity Fusion (LIDAF)). Canonical Correlation Analysis is in this paper employed to create distance matrices based on Euclidean distances between canonical variates. The distance matrices are converted to affinity matrices and those are fused in a three-step process. The proposed LIDAF addresses the limitations of the existing method resulting in improvement of clustering performance as measured by the Adjusted Rand Index and the Normalized Mutual Information score. Moreover, our proposed LIDAF approach demonstrates a notable enhancement in 50% of the log10 rank p-values obtained from Cox survival analysis, surpassing the performance of the best reported method, highlighting its potential of identifying distinct cancer subtypes.
Semantic communication, recognized as a promising technology for future intelligent applications, has received widespread research attention. Despite the potential of semantic communication to enhance transmission reliability, especially in low signal-to-noise (SNR) environments, the critical issue of resource allocation and compatibility in the dynamic wireless environment remains largely unexplored. In this paper, we propose an adaptive semantic resource allocation paradigm with semantic-bit quantization (SBQ) compatibly for existing wireless communications, where the inaccurate environment perception introduced by the additional mapping relationship between semantic metrics and transmission metrics is solved. In order to investigate the performance of semantic communication networks, the quality of service for semantic communication (SC-QoS), including the semantic quantization efficiency (SQE) and transmission latency, is proposed for the first time. A problem of maximizing the overall effective SC-QoS is formulated by jointly optimizing the transmit beamforming of the base station, the bits for semantic representation, the subchannel assignment, and the bandwidth resource allocation. To address the non-convex formulated problem, an intelligent resource allocation scheme is proposed based on a hybrid deep reinforcement learning (DRL) algorithm, where the intelligent agent can perceive both semantic tasks and dynamic wireless environments. Simulation results demonstrate that our design can effectively combat semantic noise and achieve superior performance in wireless communications compared to several benchmark schemes. Furthermore, compared to mapping-guided paradigm based resource allocation schemes, our proposed adaptive scheme can achieve up to 13% performance improvement in terms of SC-QoS.
Magnetic resonance imaging (MRI) is commonly used for brain tumor segmentation, which is critical for patient evaluation and treatment planning. To reduce the labor and expertise required for labeling, weakly-supervised semantic segmentation (WSSS) methods with class activation mapping (CAM) have been proposed. However, existing CAM methods suffer from low resolution due to strided convolution and pooling layers, resulting in inaccurate predictions. In this study, we propose a novel CAM method, Attentive Multiple-Exit CAM (AME-CAM), that extracts activation maps from multiple resolutions to hierarchically aggregate and improve prediction accuracy. We evaluate our method on the BraTS 2021 dataset and show that it outperforms state-of-the-art methods.
Motivated by Tucker tensor decomposition, this paper imposes low-rank structures to the column and row spaces of coefficient matrices in a multivariate infinite-order vector autoregression (VAR), which leads to a supervised factor model with two factor modelings being conducted to responses and predictors simultaneously. Interestingly, the stationarity condition implies an intrinsic weak group sparsity mechanism of infinite-order VAR, and hence a rank-constrained group Lasso estimation is considered for high-dimensional linear time series. Its non-asymptotic properties are discussed thoughtfully by balancing the estimation, approximation and truncation errors. Moreover, an alternating gradient descent algorithm with thresholding is designed to search for high-dimensional estimates, and its theoretical justifications, including statistical and convergence analysis, are also provided. Theoretical and computational properties of the proposed methodology are verified by simulation experiments, and the advantages over existing methods are demonstrated by two real examples.
We examine the linear regression problem in a challenging high-dimensional setting with correlated predictors to explain and predict relevant quantities, with explicitly allowing the regression coefficient to vary from sparse to dense. Most classical high-dimensional regression estimators require some degree of sparsity. We discuss the more recent concepts of variable screening and random projection as computationally fast dimension reduction tools, and propose a new random projection matrix tailored to the linear regression problem with a theoretical bound on the gain in expected prediction error over conventional random projections. Around this new random projection, we built the Sparse Projected Averaged Regression (SPAR) method combining probabilistic variable screening steps with the random projection steps to obtain an ensemble of small linear models. In difference to existing methods, we introduce a thresholding parameter to obtain some degree of sparsity. In extensive simulations and two real data applications we guide through the elements of this method and compare prediction and variable selection performance to various competitors. For prediction, our method performs at least as good as the best competitors in most settings with a high number of truly active variables, while variable selection remains a hard task for all methods in high dimensions.
Most existing knowledge graphs suffer from incompleteness, which can be alleviated by inferring missing links based on known facts. One popular way to accomplish this is to generate low-dimensional embeddings of entities and relations, and use these to make inferences. ConvE, a recently proposed approach, applies convolutional filters on 2D reshapings of entity and relation embeddings in order to capture rich interactions between their components. However, the number of interactions that ConvE can capture is limited. In this paper, we analyze how increasing the number of these interactions affects link prediction performance, and utilize our observations to propose InteractE. InteractE is based on three key ideas -- feature permutation, a novel feature reshaping, and circular convolution. Through extensive experiments, we find that InteractE outperforms state-of-the-art convolutional link prediction baselines on FB15k-237. Further, InteractE achieves an MRR score that is 9%, 7.5%, and 23% better than ConvE on the FB15k-237, WN18RR and YAGO3-10 datasets respectively. The results validate our central hypothesis -- that increasing feature interaction is beneficial to link prediction performance. We make the source code of InteractE available to encourage reproducible research.