End-to-End (E2E) unrolled optimization frameworks show promise for Magnetic Resonance (MR) image recovery, but suffer from high memory usage during training. In addition, these deterministic approaches do not offer opportunities for sampling from the posterior distribution. In this paper, we introduce a memory-efficient approach for E2E learning of the posterior distribution. We represent this distribution as the combination of a data-consistency-induced likelihood term and an energy model for the prior, parameterized by a Convolutional Neural Network (CNN). The CNN weights are learned from training data in an E2E fashion using maximum likelihood optimization. The learned model enables the recovery of images from undersampled measurements using Maximum A Posteriori (MAP) optimization. In addition, the posterior model can be sampled to derive uncertainty maps for the reconstruction. Experiments on parallel MR image reconstruction show that our approach performs comparably to the memory-intensive E2E unrolled algorithm, outperforms its memory-efficient counterpart, and can provide uncertainty maps. Our framework paves the way towards MR image reconstruction in 3D and higher dimensions.
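As a hedged illustration of the MAP step described above (not the authors' implementation), the sketch below minimizes a data-consistency term plus a learned CNN energy by gradient descent; the forward operator A, its adjoint A.H, and the scalar-valued energy_net are assumed placeholders.

import torch

# Minimal sketch: MAP recovery with a learned CNN energy prior E_theta.
# A is a hypothetical linear forward operator (e.g., coil-weighted
# undersampled FFT) exposing an adjoint A.H; energy_net returns a scalar energy.
def map_reconstruction(y, A, energy_net, sigma=1.0, steps=200, lr=1e-2):
    x = A.H(y)                         # zero-filled initialization (assumption)
    x = x.clone().requires_grad_(True)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        data_fit = torch.sum(torch.abs(A(x) - y) ** 2) / (2 * sigma ** 2)
        prior = energy_net(x).sum()    # learned prior energy
        loss = data_fit + prior
        loss.backward()
        opt.step()
    return x.detach()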
Convolutional neural networks (CNNs) represent one of the most widely used neural network architectures, showcasing state-of-the-art performance in computer vision tasks. Although larger CNNs generally exhibit higher accuracy, their size can be effectively reduced by "tensorization" while maintaining accuracy. Tensorization consists of replacing the convolution kernels with compact decompositions such as Tucker or Canonical Polyadic decompositions, or quantum-inspired decompositions such as matrix product states, and directly training the factors in the decompositions to bias the learning towards low-rank decompositions. But why doesn't tensorization seem to impact the accuracy adversely? We explore this by assessing how truncating the convolution kernels of dense (untensorized) CNNs impacts their accuracy. Specifically, we truncated the kernels of (i) a vanilla four-layer CNN and (ii) ResNet-50 pre-trained for image classification on the CIFAR-10 and CIFAR-100 datasets. We found that kernels (especially those inside deeper layers) could often be truncated along several cuts, resulting in a significant loss in kernel norm but not in classification accuracy. This suggests that such ``correlation compression'' (underlying tensorization) is an intrinsic feature of how information is encoded in dense CNNs. We also found that aggressively truncated models could often recover the pre-truncation accuracy after only a few epochs of re-training, suggesting that compressing the internal correlations of convolution layers rarely moves the model to a worse minimum. Our results can be applied to tensorize and compress CNN models more effectively.
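The following sketch illustrates one way to truncate a kernel along a single cut (an assumption about the procedure, not the paper's exact protocol): unfold the convolution weight into a matrix, keep the leading singular values, and record how much kernel norm is lost.

import torch

# Truncate a conv kernel along the cut separating output channels from the
# rest via low-rank SVD, and report the relative norm loss.
def truncate_kernel(weight, rank):
    # weight: (out_ch, in_ch, kH, kW)
    out_ch = weight.shape[0]
    mat = weight.reshape(out_ch, -1)
    U, S, Vh = torch.linalg.svd(mat, full_matrices=False)
    approx = U[:, :rank] @ torch.diag(S[:rank]) @ Vh[:rank, :]
    norm_loss = 1 - approx.norm() / mat.norm()
    return approx.reshape(weight.shape), norm_loss.item()

# Example usage: truncate one layer in place, then re-evaluate accuracy.
# w_new, loss = truncate_kernel(model.conv1.weight.data, rank=8)
# model.conv1.weight.data.copy_(w_new)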
This paper introduces CUQIpy, a versatile open-source Python package for computational uncertainty quantification (UQ) in inverse problems, presented as Part I of a two-part series. CUQIpy employs a Bayesian framework, integrating prior knowledge with observed data to produce posterior probability distributions that characterize the uncertainty in computed solutions to inverse problems. The package offers a high-level modeling framework with concise syntax, allowing users to easily specify their inverse problems, prior information, and statistical assumptions. CUQIpy supports a range of efficient sampling strategies and is designed to handle large-scale problems. Notably, the automatic sampler selection feature analyzes the problem structure and chooses a suitable sampler without user intervention, streamlining the process. With a selection of probability distributions, test problems, computational methods, and visualization tools, CUQIpy serves as a powerful, flexible, and adaptable tool for UQ in a wide range of inverse problems. Part II of the series focuses on the use of CUQIpy for UQ in inverse problems with partial differential equations (PDEs).
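To make the Bayesian setup concrete, the snippet below works out the simplest case in plain NumPy (deliberately not the CUQIpy API): a linear forward model with Gaussian prior and Gaussian noise, for which the posterior is available in closed form and its diagonal gives pointwise uncertainty.

import numpy as np

# For y = A x + e with prior x ~ N(0, d^2 I) and noise e ~ N(0, s^2 I),
# the posterior over x is Gaussian with the mean and covariance below.
def gaussian_posterior(A, y, d=1.0, s=0.1):
    n = A.shape[1]
    prec = A.T @ A / s**2 + np.eye(n) / d**2   # posterior precision
    cov = np.linalg.inv(prec)                  # posterior covariance
    mean = cov @ (A.T @ y) / s**2              # posterior mean
    return mean, cov

# Pointwise uncertainty from the posterior covariance:
# mean, cov = gaussian_posterior(A, y); std = np.sqrt(np.diag(cov))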
On the Boolean domain, there is a class of symmetric signatures called ``Fibonacci gates'' for which a beautiful P-time combinatorial algorithm has been designed for the corresponding $\operatorname{Holant}^*$ problems. In this work, we give a combinatorial view of $\operatorname{Holant}^*(\mathcal{F})$ problems on a domain of size 3, where $\mathcal{F}$ is a set of arity-3 functions whose inputs take values in the domain of size 3 and which share some common properties. The combinatorial view can also be extended to the domain of size 4. Specifically, we extend the definition of ``Fibonacci gates'' to the domains of size 3 and 4. Moreover, we give the corresponding combinatorial algorithms.
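For context, the counting problem in question follows the standard Holant sum-of-products form (the general definition, recalled here rather than taken from this abstract):
\[
\operatorname{Holant}(\Omega) \;=\; \sum_{\sigma\colon E \to [q]} \; \prod_{v \in V} f_v\bigl(\sigma|_{E(v)}\bigr),
\]
where the signature grid $\Omega$ assigns to each vertex $v$ of a graph $G=(V,E)$ a function $f_v \in \mathcal{F}$ whose inputs are the edges incident to $v$, $q$ is the domain size (3 or 4 above), and $\operatorname{Holant}^*$ denotes the variant in which all unary signatures are additionally available for free.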
In Coevolving Latent Space Networks with Attractors (CLSNA) models, nodes in a latent space represent social actors, and edges indicate their dynamic interactions. Attractors are added at the latent level to capture the notion of attractive and repulsive forces between nodes, borrowing from dynamical systems theory. However, the reliance of CLSNA on MCMC estimation makes scaling difficult, and the requirement that nodes be present throughout the study period limits practical applications. We address these issues by (i) introducing a stochastic gradient descent (SGD) parameter estimation method, (ii) developing a novel approach for uncertainty quantification using SGD, and (iii) extending the model to allow nodes to join and leave over time. Simulation results show that our extensions result in little loss of accuracy compared to MCMC, but can scale to much larger networks. We apply our approach to the longitudinal social networks of members of the US Congress on the social media platform X. Accounting for node dynamics overcomes selection bias in the network and uncovers uniquely and increasingly repulsive forces within the Republican Party.
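A minimal sketch of the core computation that such SGD estimation builds on (assumed for illustration; attractors and temporal coevolution are omitted): fit latent positions of a single-snapshot latent-space network model with a distance-based logistic link by stochastic gradient descent.

import torch

# adj: (n, n) 0/1 adjacency tensor; returns latent positions and intercept.
def fit_latent_positions(adj, dim=2, steps=500, lr=0.05):
    n = adj.shape[0]
    z = torch.randn(n, dim, requires_grad=True)
    alpha = torch.zeros(1, requires_grad=True)
    opt = torch.optim.Adam([z, alpha], lr=lr)
    mask = ~torch.eye(n, dtype=torch.bool)
    for _ in range(steps):
        opt.zero_grad()
        diff = z.unsqueeze(1) - z.unsqueeze(0)            # (n, n, dim)
        dist = diff.pow(2).sum(-1).clamp_min(1e-9).sqrt()  # pairwise distances
        logits = alpha - dist                              # closer => more likely edge
        loss = torch.nn.functional.binary_cross_entropy_with_logits(
            logits[mask], adj[mask].float())
        loss.backward()
        opt.step()
    return z.detach(), alpha.item()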
Lattices are architected metamaterials whose properties strongly depend on their geometrical design. The analogy between lattices and graphs enables the use of graph neural networks (GNNs) as a faster surrogate model compared to traditional methods such as finite element modelling. In this work, we generate a large dataset of structure-property relationships for strut-based lattices. The dataset is made available to the community and can fuel the development of methods anchored in physical principles for the fitting of fourth-order tensors. In addition, we present a higher-order GNN model trained on this dataset. The key features of the model are (i) SE(3) equivariance, and (ii) consistency with the thermodynamic law of conservation of energy. We compare the model to non-equivariant models based on a number of error metrics and demonstrate its benefits in terms of predictive performance and reduced training requirements. Finally, we demonstrate an example application of the model to an architected-material design task. The methods we developed are applicable to fourth-order tensors beyond elasticity, such as the piezo-optical tensor.
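To illustrate the lattice-as-graph analogy (a hypothetical schema, not the released dataset's format), a strut-based unit cell can be encoded with joints as nodes, struts as edges, and the target elasticity as the 21 independent components of the symmetric 6x6 (Voigt) stiffness matrix.

import torch
from torch_geometric.data import Data

# Tetrahedron-like toy cell: 4 joints, 3 struts, placeholder stiffness target.
pos = torch.tensor([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.], [0., 0., 1.]])
edge_index = torch.tensor([[0, 0, 0], [1, 2, 3]])   # struts as (source, target)
edge_attr = torch.full((3, 1), 0.05)                # strut radii
C_voigt = torch.randn(21)                           # placeholder stiffness components
lattice = Data(pos=pos, edge_index=edge_index, edge_attr=edge_attr, y=C_voigt)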
We present a novel framework for the development of fourth-order lattice Boltzmann schemes to tackle multidimensional nonlinear systems of conservation laws. Our numerical schemes preserve two fundamental characteristics inherent in classical lattice Boltzmann methods: a local relaxation phase and a transport phase composed of elementary shifts on a Cartesian grid. Fourth-order accuracy is achieved by composing second-order time-symmetric basic schemes with rational weights, which enables the representation of the transport phase in terms of elementary shifts. Introducing local variations in the relaxation parameter during each relaxation stage ensures the entropic nature of the schemes; this not only enhances stability in the long-time limit but also maintains fourth-order accuracy. To validate our approach, we conduct comprehensive tests on scalar equations and systems in both one and two spatial dimensions.
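For reference, the general composition principle (not this paper's specific rational-weight construction) reads: a fourth-order scheme is obtained from a time-symmetric second-order scheme $\Psi_{\Delta t}$ by a composition whose weights satisfy the classical order conditions
\[
\Phi_{\Delta t} \;=\; \Psi_{\gamma_s \Delta t} \circ \cdots \circ \Psi_{\gamma_1 \Delta t},
\qquad \sum_{i=1}^{s} \gamma_i = 1, \qquad \sum_{i=1}^{s} \gamma_i^{3} = 0;
\]
the schemes above select rational weights $\gamma_i$ so that the composed transport phase still reduces to elementary shifts on the Cartesian grid.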
In sound event detection (SED), convolutional neural networks (CNNs) are widely used to extract time-frequency patterns from the input spectrogram. However, features extracted by a CNN can be insensitive to shifts of time-frequency patterns along the frequency axis. To address this issue, frequency dynamic convolution (FDY) has been proposed, which applies different kernels to different frequency components. Compared to the vanilla CNN, however, FDY requires several times more parameters. In this paper, a more efficient solution named frequency-aware convolution (FAC) is proposed. In FAC, frequency-positional information is encoded in a vector and added to the input spectrogram. To match the amplitude of the input, the encoding vector is scaled adaptively and channel-independently. Experiments are carried out in the context of DCASE 2022 Task 4, and the results demonstrate that FAC can achieve performance comparable to that of FDY with only 515 additional parameters, whereas FDY requires 8.02 million additional parameters. The ablation study shows that scaling the encoding vector adaptively and channel-independently is critical to the performance of FAC.
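A hedged sketch of the FAC idea as described above (layer sizes and parameter names are assumptions): a learnable frequency-positional vector is scaled per channel, matched to the input's amplitude, and added to the spectrogram before an ordinary convolution.

import torch
import torch.nn as nn

class FAC(nn.Module):
    def __init__(self, channels, n_freq):
        super().__init__()
        self.freq_enc = nn.Parameter(torch.randn(1, 1, n_freq, 1))  # per-bin code
        self.scale = nn.Parameter(torch.ones(1, channels, 1, 1))    # per-channel scale
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, x):                                 # x: (batch, channels, freq, time)
        amp = x.abs().mean(dim=(2, 3), keepdim=True)      # adaptive amplitude matching
        x = x + self.scale * amp * self.freq_enc          # inject frequency position
        return self.conv(x)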
The ability to simulate realistic networks based on empirical data is an important task across scientific disciplines, from epidemiology to computer science. Simulation approaches often involve selecting a suitable network generative model such as Erd\"os-R\'enyi or small-world. However, few tools are available to quantify whether a particular generative model is suitable for capturing a given network structure or organization. We utilize advances in interpretable machine learning to classify simulated networks by their generative models based on various network attributes, using both primary features and their interactions. Our study underscores the significance of specific network features and their interactions in distinguishing generative models, comprehending complex network structures, and forming real-world networks.
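The sketch below shows the kind of pipeline this implies (the two generative models and four features are illustrative choices, not the study's exact setup): simulate networks, summarize each by simple attributes, and train an interpretable classifier whose feature importances indicate which attributes distinguish the models.

import networkx as nx
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def features(G):
    degs = [d for _, d in G.degree()]
    return [nx.density(G), nx.transitivity(G),
            nx.average_clustering(G), np.std(degs)]

X, y = [], []
for _ in range(200):
    X.append(features(nx.erdos_renyi_graph(100, 0.05))); y.append(0)       # Erdos-Renyi
    X.append(features(nx.watts_strogatz_graph(100, 4, 0.1))); y.append(1)  # small-world

clf = RandomForestClassifier(n_estimators=100).fit(X, y)
print(dict(zip(["density", "transitivity", "clustering", "deg_std"],
               clf.feature_importances_)))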
We define various monoid versions of the R. Thompson group $V$, and prove connections with monoids of acyclic digital circuits. We show that the monoid $M_{2,1}$ (based on partial functions) is not embeddable into Thompson's monoid ${\sf tot}M_{2,1}$, but that ${\sf tot}M_{2,1}$ has a submonoid that maps homomorphically onto $M_{2,1}$. This leads to an efficient completion algorithm for partial functions and partial circuits. We show that the union of partial circuits with disjoint domains is an element of $M_{2,1}$, and conversely, every element of $M_{2,1}$ can be decomposed efficiently into a union of partial circuits with disjoint domains.
Hashing has been widely used in approximate nearest neighbor search for large-scale database retrieval due to its computation and storage efficiency. Deep hashing, which devises convolutional neural network architectures to exploit and extract the semantic information or features of images, has received increasing attention recently. In this survey, several deep supervised hashing methods for image retrieval are evaluated, and I identify three main directions for deep supervised hashing methods. Several comments are made at the end. Moreover, to break through the bottleneck of existing hashing methods, I propose a Shadow Recurrent Hashing (SRH) method as an attempt. Specifically, I devise a CNN architecture to extract the semantic features of images and design a loss function that encourages similar images to be projected close to each other. To this end, I propose a concept: the shadow of the CNN output. During the optimization process, the CNN output and its shadow guide each other so as to approach the optimal solution as closely as possible. Several experiments on the CIFAR-10 dataset show the satisfactory performance of SRH.
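For illustration, a standard pairwise deep-hashing objective of the kind such loss functions build on (a generic sketch, not the SRH loss itself) pulls codes of same-class images together, pushes different-class codes apart, and penalizes deviation from binary values.

import torch

def pairwise_hash_loss(codes, labels, margin=2.0, quant_weight=0.1):
    # codes: (batch, bits) real-valued CNN outputs; labels: (batch,) class ids
    diff = codes.unsqueeze(1) - codes.unsqueeze(0)
    dist = diff.pow(2).sum(-1).clamp_min(1e-12).sqrt()    # pairwise distances
    sim = (labels[:, None] == labels[None, :]).float()    # 1 if same class
    pull = sim * dist.pow(2)                              # attract similar pairs
    push = (1 - sim) * torch.clamp(margin - dist, min=0).pow(2)  # repel dissimilar pairs
    quant = (codes.abs() - 1).pow(2).mean()               # drive outputs toward +/-1
    return (pull + push).mean() + quant_weight * quant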