Selecting the best regularization parameter in inverse problems is a classical and yet challenging problem. Recently, data-driven approaches have become popular to tackle this challenge. These approaches are appealing since they do require less a priori knowledge, but their theoretical analysis is limited. In this paper, we propose and study a statistical machine learning approach, based on empirical risk minimization. Our main contribution is a theoretical analysis, showing that, provided with enough data, this approach can reach sharp rates while being essentially adaptive to the noise and smoothness of the problem. Numerical simulations corroborate and illustrate the theoretical findings. Our results are a step towards grounding theoretically data-driven approaches to inverse problems.
Quasiperiodic systems are important space-filling ordered structures, without decay and translational invariance. How to solve quasiperiodic systems accurately and efficiently is of great challenge. A useful approach, the projection method (PM) [J. Comput. Phys., 256: 428, 2014], has been proposed to compute quasiperiodic systems. Various studies have demonstrated that the PM is an accurate and efficient method to solve quasiperiodic systems. However, there is a lack of theoretical analysis of PM. In this paper, we present a rigorous convergence analysis of the PM by establishing a mathematical framework of quasiperiodic functions and their high-dimensional periodic functions. We also give a theoretical analysis of quasiperiodic spectral method (QSM) based on this framework. Results demonstrate that PM and QSM both have exponential decay, and the QSM (PM) is a generalization of the periodic Fourier spectral (pseudo-spectral) method. Then we analyze the computational complexity of PM and QSM in calculating quasiperiodic systems. The PM can use fast Fourier transform, while the QSM cannot. Moreover, we investigate the accuracy and efficiency of PM, QSM and periodic approximation method in solving the linear time-dependent quasiperiodic Schr\"{o}dinger equation.
Strategies for partially observable Markov decision processes (POMDP) typically require memory. One way to represent this memory is via automata. We present a method to learn an automaton representation of a strategy using the L*-algorithm. Compared to the tabular representation of a strategy, the resulting automaton is dramatically smaller and thus also more explainable. Moreover, in the learning process, our heuristics may even improve the strategy's performance. In contrast to approaches that synthesize an automaton directly from the POMDP thereby solving it, our approach is incomparably more scalable.
Spatial transcriptomics (ST) technologies have revolutionized the study of gene expression patterns in tissues by providing multimodality data in transcriptomic, spatial, and morphological, offering opportunities for understanding tissue biology beyond transcriptomics. However, we identify the modality bias phenomenon in ST data species, i.e., the inconsistent contribution of different modalities to the labels leads to a tendency for the analysis methods to retain the information of the dominant modality. How to mitigate the adverse effects of modality bias to satisfy various downstream tasks remains a fundamental challenge. This paper introduces Multiple-modality Structure Transformation, named MuST, a novel methodology to tackle the challenge. MuST integrates the multi-modality information contained in the ST data effectively into a uniform latent space to provide a foundation for all the downstream tasks. It learns intrinsic local structures by topology discovery strategy and topology fusion loss function to solve the inconsistencies among different modalities. Thus, these topology-based and deep learning techniques provide a solid foundation for a variety of analytical tasks while coordinating different modalities. The effectiveness of MuST is assessed by performance metrics and biological significance. The results show that it outperforms existing state-of-the-art methods with clear advantages in the precision of identifying and preserving structures of tissues and biomarkers. MuST offers a versatile toolkit for the intricate analysis of complex biological systems.
In the era of large AI models, the complex architecture and vast parameters present substantial challenges for effective AI quality management (AIQM), e.g. large language model (LLM). This paper focuses on investigating the quality assurance of a specific LLM-based AI product--a ChatGPT-based sentiment analysis system. The study delves into stability issues related to both the operation and robustness of the expansive AI model on which ChatGPT is based. Experimental analysis is conducted using benchmark datasets for sentiment analysis. The results reveal that the constructed ChatGPT-based sentiment analysis system exhibits uncertainty, which is attributed to various operational factors. It demonstrated that the system also exhibits stability issues in handling conventional small text attacks involving robustness.
We provide a quantitative assessment of welfare in the classical model of risk-sharing and exchange under uncertainty. We prove three kinds of results. First, that in an equilibrium allocation, the scope for improving individual welfare by a given margin (an $\ve$-improvement) vanishes as the number of states increases. Second, that the scope for a change in aggregate resources that may be distributed to enhance individual welfare by a given margin also vanishes. Equivalently: in an inefficient allocation, for a given level of resource sub-optimality (as measured by the coefficient of resource under-utilization), the possibilities for enhancing welfare by perturbing aggregate resources decrease exponentially to zero with the number of states. Finally, we consider efficient risk-sharing in standard models of uncertainty aversion with multiple priors, and show that, in an inefficient allocation, certain sets of priors shrink with the size of the state space.
The search for a logic capturing PTIME is a long standing open problem in finite model theory. One of the most promising candidate logics for this is Choiceless Polynomial Time with counting (CPT). Abstractly speaking, CPT is an isomorphism-invariant computation model working with hereditarily finite sets as data structures. While it is easy to check that the evaluation of CPT-sentences is possible in polynomial time, the converse has been open for more than 20 years: Can every PTIME-decidable property of finite structures be expressed in CPT? We attempt to make progress towards a negative answer and show that Choiceless Polynomial Time cannot compute a preorder with colour classes of logarithmic size in every hypercube. The reason is that such preorders have super-polynomially many automorphic images, which makes it impossible for CPT to define them. While the computation of such a preorder is not a decision problem that would immediately separate P and CPT, it is significant for the following reason: The so-called Cai-F\"urer-Immerman (CFI) problem is one of the standard benchmarks for logics and maybe best known for separating fixed-point logic with counting (FPC) from P. Hence, it is natural to consider this also a potential candidate for the separation of CPT and P. The strongest known positive result in this regard says that CPT is able to solve CFI if a preorder with logarithmically sized colour classes is present in the input structure. Our result implies that this approach cannot be generalised to unordered inputs. In other words, CFI on unordered hypercubes is a PTIME-problem which provably cannot be tackled with the state-of-the-art choiceless algorithmic techniques.
We address the problem of checking the satisfiability of a set of constrained Horn clauses (CHCs) possibly including more than one query. We propose a transformation technique that takes as input a set of CHCs, including a set of queries, and returns as output a new set of CHCs, such that the transformed CHCs are satisfiable if and only if so are the original ones, and the transformed CHCs incorporate in each new query suitable information coming from the other ones so that the CHC satisfiability algorithm is able to exploit the relationships among all queries. We show that our proposed technique is effective on a non trivial benchmark of sets of CHCs that encode many verification problems for programs manipulating algebraic data types such as lists and trees.
Residual networks (ResNets) have displayed impressive results in pattern recognition and, recently, have garnered considerable theoretical interest due to a perceived link with neural ordinary differential equations (neural ODEs). This link relies on the convergence of network weights to a smooth function as the number of layers increases. We investigate the properties of weights trained by stochastic gradient descent and their scaling with network depth through detailed numerical experiments. We observe the existence of scaling regimes markedly different from those assumed in neural ODE literature. Depending on certain features of the network architecture, such as the smoothness of the activation function, one may obtain an alternative ODE limit, a stochastic differential equation or neither of these. These findings cast doubts on the validity of the neural ODE model as an adequate asymptotic description of deep ResNets and point to an alternative class of differential equations as a better description of the deep network limit.
High spectral dimensionality and the shortage of annotations make hyperspectral image (HSI) classification a challenging problem. Recent studies suggest that convolutional neural networks can learn discriminative spatial features, which play a paramount role in HSI interpretation. However, most of these methods ignore the distinctive spectral-spatial characteristic of hyperspectral data. In addition, a large amount of unlabeled data remains an unexploited gold mine for efficient data use. Therefore, we proposed an integration of generative adversarial networks (GANs) and probabilistic graphical models for HSI classification. Specifically, we used a spectral-spatial generator and a discriminator to identify land cover categories of hyperspectral cubes. Moreover, to take advantage of a large amount of unlabeled data, we adopted a conditional random field to refine the preliminary classification results generated by GANs. Experimental results obtained using two commonly studied datasets demonstrate that the proposed framework achieved encouraging classification accuracy using a small number of data for training.
While it is nearly effortless for humans to quickly assess the perceptual similarity between two images, the underlying processes are thought to be quite complex. Despite this, the most widely used perceptual metrics today, such as PSNR and SSIM, are simple, shallow functions, and fail to account for many nuances of human perception. Recently, the deep learning community has found that features of the VGG network trained on the ImageNet classification task has been remarkably useful as a training loss for image synthesis. But how perceptual are these so-called "perceptual losses"? What elements are critical for their success? To answer these questions, we introduce a new Full Reference Image Quality Assessment (FR-IQA) dataset of perceptual human judgments, orders of magnitude larger than previous datasets. We systematically evaluate deep features across different architectures and tasks and compare them with classic metrics. We find that deep features outperform all previous metrics by huge margins. More surprisingly, this result is not restricted to ImageNet-trained VGG features, but holds across different deep architectures and levels of supervision (supervised, self-supervised, or even unsupervised). Our results suggest that perceptual similarity is an emergent property shared across deep visual representations.