We consider the flow of a fluid whose response characteristics change due the value of the norm of the symmetric part of the velocity gradient, behaving as an Euler fluid below a critical value and as a Navier-Stokes fluid at and above the critical value, the norm being determined by the external stimuli. We show that such a fluid, while flowing past a bluff body, develops boundary layers which are practically identical to those that one encounters within the context of the classical boundary layer theory propounded by Prandtl. Unlike the classical boundary layer theory that arises as an approximation within the context of the Navier-Stokes theory, here the development of boundary layers is due to a change in the response characteristics of the constitutive relation. We study the flow of such a fluid past an airfoil and compare the same against the solution of the Navier-Stokes equations. We find that the results are in excellent agreement with regard to the velocity and vorticity fields for the two cases.
We consider the problem of solving a family of parametric mixed-integer linear optimization problems where some entries in the input data change. We introduce the concept of cutting-plane layer (CPL), i.e., a differentiable cutting-plane generator mapping the problem data and previous iterates to cutting planes. We propose a CPL implementation to generate split cuts, and by combining several CPLs, we devise a differentiable cutting-plane algorithm that exploits the repeated nature of parametric instances. In an offline phase, we train our algorithm by updating the internal parameters controlling the CPLs, thus altering cut generation. Once trained, our algorithm computes, with predictable execution times and a fixed number of cuts, solutions with low integrality gaps. Preliminary computational tests show that our algorithm generalizes on unseen instances and captures underlying parametric structures.
Decentralized stochastic gradient descent (D-SGD) allows collaborative learning on massive devices simultaneously without the control of a central server. However, existing theories claim that decentralization invariably undermines generalization. In this paper, we challenge the conventional belief and present a completely new perspective for understanding decentralized learning. We prove that D-SGD implicitly minimizes the loss function of an average-direction Sharpness-aware minimization (SAM) algorithm under general non-convex non-$\beta$-smooth settings. This surprising asymptotic equivalence reveals an intrinsic regularization-optimization trade-off and three advantages of decentralization: (1) there exists a free uncertainty evaluation mechanism in D-SGD to improve posterior estimation; (2) D-SGD exhibits a gradient smoothing effect; and (3) the sharpness regularization effect of D-SGD does not decrease as total batch size increases, which justifies the potential generalization benefit of D-SGD over centralized SGD (C-SGD) in large-batch scenarios. The code is available at //github.com/Raiden-Zhu/ICML-2023-DSGD-and-SAM.
This paper establishes the nearly optimal rate of approximation for deep neural networks (DNNs) when applied to Korobov functions, effectively overcoming the curse of dimensionality. The approximation results presented in this paper are measured with respect to $L_p$ norms and $H^1$ norms. Our achieved approximation rate demonstrates a remarkable "super-convergence" rate, outperforming traditional methods and any continuous function approximator. These results are non-asymptotic, providing error bounds that consider both the width and depth of the networks simultaneously.
Distributed optimization has experienced a significant surge in interest due to its wide-ranging applications in distributed learning and adaptation. While various scenarios, such as shared-memory, local-memory, and consensus-based approaches, have been extensively studied in isolation, there remains a need for further exploration of their interconnections. This paper specifically concentrates on a scenario where agents collaborate toward a unified mission while potentially having distinct tasks. Each agent's actions can potentially impact other agents through interactions. Within this context, the objective for the agents is to optimize their local parameters based on the aggregate of local reward functions, where only local zeroth-order oracles are available. Notably, the learning process is asynchronous, meaning that agents update and query their zeroth-order oracles asynchronously while communicating with other agents subject to bounded but possibly random communication delays. This paper presents theoretical convergence analyses and establishes a convergence rate for the proposed approach. Furthermore, it addresses the relevant issue of deep learning-based resource allocation in communication networks and conducts numerical experiments in which agents, acting as transmitters, collaboratively train their individual (possibly unique) policies to maximize a common performance metric.
The fundamental diagram serves as the foundation of traffic flow modeling for almost a century. With the increasing availability of road sensor data, deterministic parametric models have proved inadequate in describing the variability of real-world data, especially in congested area of the density-flow diagram. In this paper we estimate the stochastic density-flow relation introducing a nonparametric method called convex quantile regression. The proposed method does not depend on any prior functional form assumptions, but thanks to the concavity constraints, the estimated function satisfies the theoretical properties of the density-flow curve. The second contribution is to develop the new convex quantile regression with bags (CQRb) approach to facilitate practical implementation of CQR to the real-world data. We illustrate the CQRb estimation process using the road sensor data from Finland in years 2016-2018. Our third contribution is to demonstrate the excellent out-of-sample predictive power of the proposed CQRb method in comparison to the standard parametric deterministic approach.
Understanding crop diversity is crucial for resilience in farming, ecosystem services, and effective agro-environmental policies. We utilize a novel EU-wide satellite product (2018, 10 m resolution) to assess crop diversity across different scales. We define local crop diversity ($\alpha$-diversity) at 1 km scale, which in the EU is proportional to the area covered by large farms or clusters of small-to-medium sized farms. We also compute $\gamma$-diversity, covering landscape, regional, and national levels crop diversity. $\beta$-diversity ($\gamma$/$\alpha$) provides a measure of between agroecosystems diversity. National $\alpha$, $\gamma$, and $\beta$ diversity varies greatly ($\alpha$: 2.1-3.9, $\gamma$: 3.5-7.5, $\beta$: 1.22-2.27). EU-wide $\gamma$-diversity increases logarithmically with spatial aggregation (1 km: 2.85, 100 km: 4.27). We categorize EU Member States (MS) into four groups for crop diversification policy recommendations. Compared to the USA, the EU exhibits higher diversity related to differences in farm structure and practices. High local $\alpha$-diversity is only found for MS with small farms (<25 ha), but their presence doesn't always guarantee high local diversity. This study aids CAP implementation in the EU, with potential for annual continental Copernicus crop type maps and ecosystem co-variates exploration for a deeper understanding of agro-ecosystem services.
In the realm of the Internet of Things (IoT), deploying deep learning models to process data generated or collected by IoT devices is a critical challenge. However, direct data transmission can cause network congestion and inefficient execution, given that IoT devices typically lack computation and communication capabilities. Centralized data processing in data centers is also no longer feasible due to concerns over data privacy and security. To address these challenges, we present an innovative Edge-assisted U-Shaped Split Federated Learning (EUSFL) framework, which harnesses the high-performance capabilities of edge servers to assist IoT devices in model training and optimization process. In this framework, we leverage Federated Learning (FL) to enable data holders to collaboratively train models without sharing their data, thereby enhancing data privacy protection by transmitting only model parameters. Additionally, inspired by Split Learning (SL), we split the neural network into three parts using U-shaped splitting for local training on IoT devices. By exploiting the greater computation capability of edge servers, our framework effectively reduces overall training time and allows IoT devices with varying capabilities to perform training tasks efficiently. Furthermore, we proposed a novel noise mechanism called LabelDP to ensure that data features and labels can securely resist reconstruction attacks, eliminating the risk of privacy leakage. Our theoretical analysis and experimental results demonstrate that EUSFL can be integrated with various aggregation algorithms, maintaining good performance across different computing capabilities of IoT devices, and significantly reducing training time and local computation overhead.
Altermagnetism, a new magnetic phase, has been theoretically proposed and experimentally verified to be distinct from ferromagnetism and antiferromagnetism. Although altermagnets have been found to possess many exotic physical properties, the very limited availability of known altermagnetic materials~(e.g., 14 confirmed materials) hinders the study of such properties. Hence, discovering more types of altermagnetic materials is crucial for a comprehensive understanding of altermagnetism and thus facilitating new applications in the next generation information technologies, e.g., storage devices and high-sensitivity sensors. Here, we report 25 new altermagnetic materials that cover metals, semiconductors, and insulators, discovered by an AI search engine unifying symmetry analysis, graph neural network pre-training, optimal transport theory, and first-principles electronic structure calculation. The wide range of electronic structural characteristics reveals that various innovative physical properties manifest in these newly discovered altermagnetic materials, e.g., anomalous Hall effect, anomalous Kerr effect, and topological property. Noteworthy, we discovered 8 $i$-wave altermagnetic materials for the first time. Overall, the AI search engine performs much better than human experts and suggests a set of new altermagnetic materials with unique properties, outlining its potential for accelerated discovery of altermagnetic materials.
We consider the problem of obtaining effective representations for the solutions of linear, vector-valued stochastic differential equations (SDEs) driven by non-Gaussian pure-jump L\'evy processes, and we show how such representations lead to efficient simulation methods. The processes considered constitute a broad class of models that find application across the physical and biological sciences, mathematics, finance and engineering. Motivated by important relevant problems in statistical inference, we derive new, generalised shot-noise simulation methods whenever a normal variance-mean (NVM) mixture representation exists for the driving L\'evy process, including the generalised hyperbolic, normal-Gamma, and normal tempered stable cases. Simple, explicit conditions are identified for the convergence of the residual of a truncated shot-noise representation to a Brownian motion in the case of the pure L\'evy process, and to a Brownian-driven SDE in the case of the L\'evy-driven SDE. These results provide Gaussian approximations to the small jumps of the process under the NVM representation. The resulting representations are of particular importance in state inference and parameter estimation for L\'evy-driven SDE models, since the resulting conditionally Gaussian structures can be readily incorporated into latent variable inference methods such as Markov chain Monte Carlo (MCMC), Expectation-Maximisation (EM), and sequential Monte Carlo.
In the pursuit of accurate experimental and computational data while minimizing effort, there is a constant need for high-fidelity results. However, achieving such results often requires significant computational resources. To address this challenge, this paper proposes a deep operator learning-based framework that requires a limited high-fidelity dataset for training. We introduce a novel physics-guided, bi-fidelity, Fourier-featured Deep Operator Network (DeepONet) framework that effectively combines low and high-fidelity datasets, leveraging the strengths of each. In our methodology, we began by designing a physics-guided Fourier-featured DeepONet, drawing inspiration from the intrinsic physical behavior of the target solution. Subsequently, we train this network to primarily learn the low-fidelity solution, utilizing an extensive dataset. This process ensures a comprehensive grasp of the foundational solution patterns. Following this foundational learning, the low-fidelity deep operator network's output is enhanced using a physics-guided Fourier-featured residual deep operator network. This network refines the initial low-fidelity output, achieving the high-fidelity solution by employing a small high-fidelity dataset for training. Notably, in our framework, we employ the Fourier feature network as the Trunk network for the DeepONets, given its proficiency in capturing and learning the oscillatory nature of the target solution with high precision. We validate our approach using a well-known 2D benchmark cylinder problem, which aims to predict the time trajectories of lift and drag coefficients. The results highlight that the physics-guided Fourier-featured deep operator network, serving as a foundational building block of our framework, possesses superior predictive capability for the lift and drag coefficients compared to its data-driven counterparts.