We formulate and solve data-driven aerodynamic shape design problems with distributionally robust optimization (DRO) approaches. Building on the findings of the work \cite{gotoh2018robust}, we study the connections between a class of DRO and the Taguchi method in the context of robust design optimization. Our preliminary computational experiments on aerodynamic shape optimization in transonic turbulent flow show promising design results.
In many practical control applications, the performance level of a closed-loop system degrades over time due to the change of plant characteristics. Thus, there is a strong need for redesigning a controller without going through the system modeling process, which is often difficult for closed-loop systems. Reinforcement learning (RL) is one of the promising approaches that enable model-free redesign of optimal controllers for nonlinear dynamical systems based only on the measurement of the closed-loop system. However, the learning process of RL usually requires a considerable number of trial-and-error experiments using the poorly controlled system that may accumulate wear on the plant. To overcome this limitation, we propose a model-free two-step design approach that improves the transient learning performance of RL in an optimal regulator redesign problem for unknown nonlinear systems. Specifically, we first design a linear control law that attains some degree of control performance in a model-free manner, and then, train the nonlinear optimal control law with online RL by using the designed linear control law in parallel. We introduce an offline RL algorithm for the design of the linear control law and theoretically guarantee its convergence to the LQR controller under mild assumptions. Numerical simulations show that the proposed approach improves the transient learning performance and efficiency in hyperparameter tuning of RL.
Model sparsification in deep learning promotes simpler, more interpretable models with fewer parameters. This not only reduces the model's memory footprint and computational needs but also shortens inference time. This work focuses on creating sparse models optimized for multiple tasks with fewer parameters. These parsimonious models also possess the potential to match or outperform dense models in terms of performance. In this work, we introduce channel-wise l1/l2 group sparsity in the shared convolutional layers parameters (or weights) of the multi-task learning model. This approach facilitates the removal of extraneous groups i.e., channels (due to l1 regularization) and also imposes a penalty on the weights, further enhancing the learning efficiency for all tasks (due to l2 regularization). We analyzed the results of group sparsity in both single-task and multi-task settings on two widely-used Multi-Task Learning (MTL) datasets: NYU-v2 and CelebAMask-HQ. On both datasets, which consist of three different computer vision tasks each, multi-task models with approximately 70% sparsity outperform their dense equivalents. We also investigate how changing the degree of sparsification influences the model's performance, the overall sparsity percentage, the patterns of sparsity, and the inference time.
Many approaches have been proposed to use diffusion models to augment training datasets for downstream tasks, such as classification. However, diffusion models are themselves trained on large datasets, often with noisy annotations, and it remains an open question to which extent these models contribute to downstream classification performance. In particular, it remains unclear if they generalize enough to improve over directly using the additional data of their pre-training process for augmentation. We systematically evaluate a range of existing methods to generate images from diffusion models and study new extensions to assess their benefit for data augmentation. Personalizing diffusion models towards the target data outperforms simpler prompting strategies. However, using the pre-training data of the diffusion model alone, via a simple nearest-neighbor retrieval procedure, leads to even stronger downstream performance. Our study explores the potential of diffusion models in generating new training data, and surprisingly finds that these sophisticated models are not yet able to beat a simple and strong image retrieval baseline on simple downstream vision tasks.
High-dimensional, higher-order tensor data are gaining prominence in a variety of fields, including but not limited to computer vision and network analysis. Tensor factor models, induced from noisy versions of tensor decomposition or factorization, are natural potent instruments to study a collection of tensor-variate objects that may be dependent or independent. However, it is still in the early stage of developing statistical inferential theories for estimation of various low-rank structures, which are customary to play the role of signals of tensor factor models. In this paper, starting from tensor matricization, we aim to ``decode" estimation of a higher-order tensor factor model in the sense that, we recast it into mode-wise traditional high-dimensional vector/fiber factor models so as to deploy the conventional estimation of principle components analysis (PCA). Demonstrated by the Tucker tensor factor model (TuTFaM), which is induced from most popular Tucker decomposition, we summarize that estimations on signal components are essentially mode-wise PCA techniques, and the involvement of projection and iteration will enhance the signal-to-noise ratio to various extend. We establish the inferential theory of the proposed estimations and conduct rich simulation experiments under TuTFaM, and illustrate how the proposed estimations can work in tensor reconstruction, clustering for video and economic datasets, respectively.
Embedding graphs in continous spaces is a key factor in designing and developing algorithms for automatic information extraction to be applied in diverse tasks (e.g., learning, inferring, predicting). The reliability of graph embeddings directly depends on how much the geometry of the continuous space matches the graph structure. Manifolds are mathematical structure that can enable to incorporate in their topological spaces the graph characteristics, and in particular nodes distances. State-of-the-art of manifold-based graph embedding algorithms take advantage of the assumption that the projection on a tangential space of each point in the manifold (corresponding to a node in the graph) would locally resemble a Euclidean space. Although this condition helps in achieving efficient analytical solutions to the embedding problem, it does not represent an adequate set-up to work with modern real life graphs, that are characterized by weighted connections across nodes often computed over sparse datasets with missing records. In this work, we introduce a new class of manifold, named soft manifold, that can solve this situation. In particular, soft manifolds are mathematical structures with spherical symmetry where the tangent spaces to each point are hypocycloids whose shape is defined according to the velocity of information propagation across the data points. Using soft manifolds for graph embedding, we can provide continuous spaces to pursue any task in data analysis over complex datasets. Experimental results on reconstruction tasks on synthetic and real datasets show how the proposed approach enable more accurate and reliable characterization of graphs in continuous spaces with respect to the state-of-the-art.
Mean-field molecular dynamics based on path integrals is used to approximate canonical quantum observables for particle systems consisting of nuclei and electrons. A computational bottleneck is the sampling from the Gibbs density of the electron operator, which due to the fermion sign problem has a computational complexity that scales exponentially with the number of electrons. In this work we construct an algorithm that approximates the mean-field Hamiltonian by path integrals for fermions. The algorithm is based on the determinant of a matrix with components based on Brownian bridges connecting permuted electron coordinates. The computational work for $n$ electrons is $\mathcal O(n^3)$, which reduces the computational complexity associated with the fermion sign problem. We analyze a bias resulting from this approximation and provide a computational error indicator. It remains to rigorously explain the surprisingly high accuracy.
We prove discrete-to-continuum convergence for dynamical optimal transport on $\mathbb{Z}^d$-periodic graphs with energy density having linear growth at infinity. This result provides an answer to a problem left open by Gladbach, Kopfer, Maas, and Portinale (Calc Var Partial Differential Equations 62(5), 2023), where the convergence behaviour of discrete boundary-value dynamical transport problems is proved under the stronger assumption of superlinear growth. Our result extends the known literature to some important classes of examples, such as scaling limits of 1-Wasserstein transport problems. Similarly to what happens in the quadratic case, the geometry of the graph plays a crucial role in the structure of the limit cost function, as we discuss in the final part of this work, which includes some visual representations.
Nonparametric estimation of nonlocal interaction kernels is crucial in various applications involving interacting particle systems. The inference challenge, situated at the nexus of statistical learning and inverse problems, comes from the nonlocal dependency. A central question is whether the optimal minimax rate of convergence for this problem aligns with the rate of $M^{-\frac{2\beta}{2\beta+1}}$ in classical nonparametric regression, where $M$ is the sample size and $\beta$ represents the smoothness exponent of the radial kernel. Our study confirms this alignment for systems with a finite number of particles. We introduce a tamed least squares estimator (tLSE) that attains the optimal convergence rate for a broad class of exchangeable distributions. The tLSE bridges the smallest eigenvalue of random matrices and Sobolev embedding. This estimator relies on nonasymptotic estimates for the left tail probability of the smallest eigenvalue of the normal matrix. The lower minimax rate is derived using the Fano-Tsybakov hypothesis testing method. Our findings reveal that provided the inverse problem in the large sample limit satisfies a coercivity condition, the left tail probability does not alter the bias-variance tradeoff, and the optimal minimax rate remains intact. Our tLSE method offers a straightforward approach for establishing the optimal minimax rate for models with either local or nonlocal dependency.
Optical computing systems provide high-speed and low-energy data processing but face deficiencies in computationally demanding training and simulation-to-reality gaps. We propose a model-free optimization (MFO) method based on a score gradient estimation algorithm for computationally efficient in situ training of optical computing systems. This approach treats an optical computing system as a black box and back-propagates the loss directly to the optical computing weights' probability distributions, circumventing the need for a computationally heavy and biased system simulation. Our experiments on a single-layer diffractive optical computing system show that MFO outperforms hybrid training on the MNIST and FMNIST datasets. Furthermore, we demonstrate image-free and high-speed classification of cells from their phase maps. Our method's model-free and high-performance nature, combined with its low demand for computational resources, expedites the transition of optical computing from laboratory demonstrations to real-world applications.
This study evaluates four fracture simulation methods, comparing their computational expenses and implementation complexities within the Finite Element (FE) framework when employed on multiphase materials. Fracture methods considered encompass the Cohesive Zone Model (CZM) using zero-thickness cohesive interface elements (CIEs), the Standard Phase-Field Fracture (SPFM) approach, the Cohesive Phase-Field fracture (CPFM) approach, and an innovative hybrid model. The hybrid approach combines the CPFM fracture method with the CZM, specifically applying the CZM within the interface zone. The finite element model studied is characterized by three specific phases: Inclusions, matrix, and interface zone. The thorough assessment of these modeling techniques indicates that the CPFM approach stands out as the most effective computational model provided that the thickness of the interface zone is not significantly smaller than that of the other phases. In materials like concrete the interface thickness is notably small when compared to other phases. This leads to the hybrid model standing as the most authentic finite element model, utilizing CIEs within the interface to simulate interface debonding. A significant finding from this investigation is that the CPFM method is in agreement with the hybrid model when the interface zone thickness is not excessively small. This implies that the CPFM fracture methodology may serve as a unified fracture approach for multiphase materials, provided the interface zone's thickness is comparable to that of the other phases. In addition, this research provides valuable insights that can advance efforts to fine-tune material microstructures. An investigation of the influence of the interface material properties, morphological features and spatial arrangement of inclusions showes a pronounced effect of these parameters on the fracture toughness of the material.