Modeling self-gravitating gas flows is essential to answering many fundamental questions in astrophysics, spanning topics from planet-forming disks and star-forming clouds to galaxy formation and the development of large-scale structure in the Universe. However, the nonlinear interaction between gravity and fluid dynamics poses a formidable challenge to solving the resulting time-dependent partial differential equations (PDEs) in three dimensions (3D). By leveraging the universal approximation capabilities of a neural network within a mesh-free framework, physics-informed neural networks (PINNs) offer a new way of addressing this challenge. We introduce the gravity-informed neural network (GRINN), a PINN-based code, to simulate 3D self-gravitating hydrodynamic systems. Here, we specifically study gravitational instability and wave propagation in an isothermal gas. Our results match a linear analytic solution to within 1\% in the linear regime and a conventional grid code solution to within 5\% as the disturbance grows into the nonlinear regime. We find that the GRINN computation time does not scale with the number of dimensions, in contrast to the grid-based code, whose hydrodynamic and self-gravity calculations become more expensive as the number of dimensions increases. The GRINN computation time is longer than that of the grid code in one- and two-dimensional calculations but is an order of magnitude shorter in 3D at similar accuracy. Physics-informed neural networks like GRINN thus show promise for advancing our ability to model 3D astrophysical flows.
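For context, the standard linear analytic result for gravitational instability in a uniform isothermal self-gravitating medium (a textbook result, not specific to GRINN; whether it is the exact benchmark used above is an assumption) is the Jeans dispersion relation:

```latex
% Plane-wave perturbations \propto e^{i(kx-\omega t)} about a uniform
% background of density \rho_0 with isothermal sound speed c_s obey
\omega^2 = c_s^2 k^2 - 4\pi G \rho_0 .
% Modes with k < k_J = \sqrt{4\pi G \rho_0}\,/\,c_s have \omega^2 < 0 and
% grow exponentially (gravitational instability); shorter-wavelength modes
% propagate as stable sound waves.
```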
The fundamental problem in the toxicity detection task is that toxicity itself is ill-defined. This forces models to be trained on subjective and vague data, which yields non-robust and inaccurate results: garbage in, garbage out. This work proposes a new, stress-level-based definition of toxicity designed to be objective and context-aware. Alongside it, we describe possible ways of applying this new definition to dataset creation and model training.
We present a priori error estimates for a multirate time-stepping scheme for coupled differential equations. The discretization is based on Galerkin methods in time using two different time meshes for the two parts of the problem. We aim at surface-coupled multiphysics problems such as two-phase flows. Special focus is on the handling of the interface coupling to guarantee a coercive formulation as the key to optimal-order error estimates. In a sequence of increasing complexity, we begin with the coupling of two ordinary differential equations, proceed to coupled heat conduction equations, and finally consider a coupled Stokes problem. For the latter, we show optimal multirate estimates for the velocity and a suboptimal result for the pressure. The a priori estimates prove that the multirate method decouples the two subproblems exactly. This is the basis for adaptive methods that can choose optimal time meshes for the respective subproblems.
High-quality samples generated with score-based reverse diffusion algorithms provide evidence that deep neural networks (DNNs) trained for denoising can learn high-dimensional densities, despite the curse of dimensionality. However, recent reports of memorization of the training set raise the question of whether these networks are learning the "true" continuous density of the data. Here, we show that two denoising DNNs trained on non-overlapping subsets of a dataset learn nearly the same score function, and thus the same density, with a surprisingly small number of training images. This strong generalization demonstrates an alignment of powerful inductive biases in the DNN architecture and/or training algorithm with properties of the data distribution. We analyze these inductive biases, demonstrating that the denoiser performs a shrinkage operation in a basis adapted to the underlying image. Examination of these bases reveals oscillating harmonic structures along contours and in homogeneous image regions. We show that trained denoisers are inductively biased towards these geometry-adaptive harmonic representations by demonstrating that they arise even when the network is trained on image classes, such as those supported on low-dimensional manifolds, for which the harmonic basis is suboptimal. Additionally, we show that the denoising performance of the networks is near-optimal when trained on regular image classes for which the optimal basis is known to be geometry-adaptive and harmonic.
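The shrinkage picture above has a simple closed form in the linear-Gaussian case: for a Gaussian prior, the MMSE denoiser shrinks each coefficient of the noisy input in the prior's eigenbasis by $\lambda/(\lambda+\sigma^2)$. A minimal NumPy sketch of this classical baseline (the paper's point is that trained DNN denoisers perform an analogous shrinkage in a basis that adapts to each image, which this fixed-basis sketch does not capture):

```python
import numpy as np

def linear_mmse_denoiser(y, C, sigma):
    # For x ~ N(0, C) and y = x + n with n ~ N(0, sigma^2 I), the MMSE
    # estimate is x_hat = C (C + sigma^2 I)^{-1} y, i.e. shrinkage by
    # lam / (lam + sigma^2) in the eigenbasis of the prior covariance C.
    lam, U = np.linalg.eigh(C)            # eigen-decomposition of C
    shrink = lam / (lam + sigma ** 2)     # per-coefficient shrinkage factors
    return U @ (shrink * (U.T @ y))       # transform, shrink, transform back
```

Coefficients along high-variance (signal-dominated) eigendirections are kept nearly intact, while low-variance (noise-dominated) ones are suppressed.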
Directional fluid flow in perivascular spaces surrounding cerebral arteries is hypothesized to play a key role in brain solute transport and clearance. While various drivers of pulsatile flow, such as cardiac or respiratory pulsations, are well quantified, the question remains as to which mechanisms could induce directional flow within physiological regimes. To address this question, we develop theoretical and numerical reduced-order models to quantify the directional (net) flow inducible by peristaltic pumping in periarterial networks. Each periarterial element is modeled as a slender annular space bounded internally by a circular tube supporting a periodic traveling (peristaltic) wave. Under the reasonable assumptions of small Reynolds number, small radii, and small-amplitude peristaltic waves, we use lubrication theory and regular perturbation methods to derive theoretical expressions for the directional net flow and pressure distribution in the perivascular network. The reduced model is used to derive closed-form analytical expressions for the net flow for simple network configurations of interest, including single elements, two elements in tandem, and a three-element bifurcation, with results compared against numerical predictions. In particular, we provide a computable theoretical estimate of the net flow induced by peristaltic motion in perivascular networks as a function of physiological parameters, notably wavelength, frequency, amplitude, and perivascular dimensions. Quantifying the maximal net flow for specific physiological regimes, we find that vasomotion may induce net pial periarterial flow velocities on the order of a few to tens of $\mu$m/s and that sleep-related changes in vasomotion pulsatility may drive a threefold flow increase.
Large language models (LLMs) have shown excellent performance on various tasks, but their astronomical model size raises the hardware barrier for serving (memory size) and slows down token generation (memory bandwidth). In this paper, we propose Activation-aware Weight Quantization (AWQ), a hardware-friendly approach for LLM low-bit weight-only quantization. Our method is based on the observation that weights are not equally important: protecting only 1% of salient weights can greatly reduce quantization error. We then propose to search for the optimal per-channel scaling that protects the salient weights by observing the activations, not the weights. AWQ does not rely on any backpropagation or reconstruction, so it preserves LLMs' generalization ability well across different domains and modalities, without overfitting to the calibration set. AWQ outperforms existing work on various language modeling and domain-specific benchmarks. Thanks to better generalization, it achieves excellent quantization performance for instruction-tuned LMs and, for the first time, multi-modal LMs. Alongside AWQ, we implement an efficient and flexible inference framework tailored for LLMs on the edge, offering more than 3x speedup over the Hugging Face FP16 implementation on both desktop and mobile GPUs. It also democratizes the deployment of the 70B Llama-2 model on a mobile GPU (NVIDIA Jetson Orin 64GB).
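The core idea, scaling up salient input channels (identified from activation magnitudes) before round-to-nearest quantization and undoing the scale afterwards, can be illustrated in a few lines. This is a simplified NumPy sketch, not the authors' implementation; the function names and the small grid search over the exponent `alpha` are assumptions made for the sketch:

```python
import numpy as np

def quantize_rtn(w, n_bits=4):
    # Per-output-channel (per-row) round-to-nearest weight quantization.
    scale = np.abs(w).max(axis=1, keepdims=True) / (2 ** (n_bits - 1) - 1)
    return np.round(w / scale) * scale

def awq_style_scale(w, x, n_bits=4, grid=20):
    # Search a per-input-channel scale s = mean|x|^alpha that minimizes the
    # output error after quantization (AWQ-flavored, heavily simplified).
    # w: (out, in) weight matrix; x: (tokens, in) calibration activations.
    act_mag = np.abs(x).mean(axis=0)           # per-channel activation magnitude
    best_err, best_s = np.inf, np.ones_like(act_mag)
    for alpha in np.linspace(0.0, 1.0, grid):
        s = np.clip(act_mag ** alpha, 1e-4, None)
        wq = quantize_rtn(w * s, n_bits) / s   # scale up, quantize, undo scale
        err = np.linalg.norm(x @ wq.T - x @ w.T)
        if err < best_err:
            best_err, best_s = err, s
    return best_s, best_err
```

Since `alpha = 0` recovers plain round-to-nearest, the searched error can never be worse than the unscaled baseline on the calibration activations.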
We show a relation, based on parallel repetition of the Magic Square game, that can be solved, with probability exponentially close to $1$ (worst-case input), by $1D$ (uniform) depth-$2$, geometrically-local, noisy (noise below a threshold), fan-in-$4$ quantum circuits. We show that the same relation cannot be solved, except with exponentially small success probability (averaged over inputs drawn uniformly), by $1D$ (non-uniform) geometrically-local, sub-linear-depth classical circuits consisting of fan-in-$2$ NAND gates. Quantum and classical circuits are allowed to use input-independent (geometrically-non-local) resource states, that is, entanglement and randomness, respectively. To the best of our knowledge, the previous best (analogous) depth separation for a task between quantum and classical circuits was constant versus sub-logarithmic, although for general (geometrically non-local) circuits. Our hardness result for classical circuits is based on a direct product theorem about classical communication protocols from Jain and Kundu [JK22]. As an application, we propose a protocol that can potentially demonstrate verifiable quantum advantage in the NISQ era. We also provide generalizations of our result to higher-dimensional circuits as well as a wider class of Bell games.
Generalized cross-validation (GCV) is a widely used method for estimating the squared out-of-sample prediction risk that applies a scalar degrees-of-freedom adjustment (in a multiplicative sense) to the squared training error. In this paper, we examine the consistency of GCV for estimating the prediction risk of arbitrary ensembles of penalized least squares estimators. We show that GCV is inconsistent for any finite ensemble of size greater than one. Towards repairing this shortcoming, we identify a correction that involves an additional scalar term (in an additive sense) based on degrees-of-freedom-adjusted training errors from each ensemble component. The proposed estimator (termed CGCV) maintains the computational advantages of GCV and requires no sample splitting, model refitting, or out-of-bag risk estimation. The estimator stems from a finer inspection of the ensemble risk decomposition and two intermediate risk estimators for the components in this decomposition. We provide a non-asymptotic analysis of CGCV and the two intermediate risk estimators for ensembles of convex penalized estimators under Gaussian features and a linear response model. In the special case of ridge regression, we extend the analysis to general feature and response distributions using random matrix theory, which establishes model-free uniform consistency of CGCV.
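For concreteness, the classical GCV estimator for a single ridge estimator (the uncorrected baseline that the ensemble correction builds on) divides the average training error by a squared degrees-of-freedom factor. A minimal NumPy sketch, with `ridge_gcv` a hypothetical helper name:

```python
import numpy as np

def ridge_gcv(X, y, lam):
    # Classical GCV for ridge regression:
    #   GCV(lam) = (||y - X beta_hat||^2 / n) / (1 - df/n)^2,
    # where df = tr(X (X'X + lam I)^{-1} X') is the effective
    # degrees of freedom of the ridge smoother.
    n, p = X.shape
    G = X.T @ X + lam * np.eye(p)
    beta = np.linalg.solve(G, X.T @ y)            # ridge coefficients
    resid = y - X @ beta                          # training residuals
    df = np.trace(X @ np.linalg.solve(G, X.T))    # effective degrees of freedom
    return (resid @ resid / n) / (1.0 - df / n) ** 2
```

The multiplicative factor $(1 - \mathrm{df}/n)^{-2}$ inflates the training error to approximate out-of-sample risk; the paper's point is that this single scalar adjustment no longer suffices for ensembles.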
We present an optimal transport approach for mesh adaptivity and shock capturing of compressible flows. Shock capturing is based on a viscosity regularization of the governing equations, introducing an artificial viscosity field as the solution of a Helmholtz equation. Mesh adaptation is based on optimal transport theory, formulating the mesh mapping as the solution of a Monge-Ampère equation. The marriage of optimal transport and viscosity regularization for compressible flows leads to a coupled system of the compressible Euler/Navier-Stokes equations, the Helmholtz equation, and the Monge-Ampère equation. We propose an iterative procedure to solve the coupled system in a sequential fashion using homotopy continuation to minimize the amount of artificial viscosity while enforcing positivity-preserving and smoothness constraints on the numerical solution. We explore various mesh monitor functions for computing r-adaptive meshes in order to reduce the amount of artificial dissipation and improve the accuracy of the numerical solution. The hybridizable discontinuous Galerkin method is used for the spatial discretization of the governing equations to obtain high-order accurate solutions. Extensive numerical results are presented to demonstrate the optimal transport approach on transonic, supersonic, and hypersonic flows in two dimensions. The approach is found to yield accurate, sharp yet smooth solutions within a few mesh adaptation iterations.
It has been well known since the 1960s that, by exploiting the tensor-product structure of the discrete Laplacian on Cartesian meshes, one can develop a simple direct Poisson solver with $\mathcal O(N^{\frac{d+1}{d}})$ complexity in $d$ dimensions. GPU acceleration of numerical PDE solvers was explored successfully around fifteen years ago and has become increasingly popular over the past decade, driven by significant advances in both hardware and software technologies, especially in recent years. We present in this paper a simple but extremely fast MATLAB implementation on a modern GPU, which can be easily reproduced, for solving 3D Poisson-type equations using a spectral-element method. In particular, it takes less than one second on an NVIDIA A100 to solve a Poisson equation with one billion degrees of freedom.
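The tensor-product idea can be sketched in a few lines: the 1D discrete Dirichlet Laplacian is diagonalized by known sine eigenvectors, so the multi-dimensional problem reduces to two dense transforms and an elementwise division. A minimal 2D NumPy sketch using second-order finite differences (the paper itself uses a spectral-element method in 3D, but the solver structure is the same):

```python
import numpy as np

def poisson_direct_2d(f, h):
    # Solve -Laplace(u) = f on the unit square with homogeneous Dirichlet
    # conditions, 5-point finite differences, grid spacing h, f of shape (n, n)
    # holding the interior right-hand side.
    n = f.shape[0]
    k = np.arange(1, n + 1)
    # Eigenvalues of the 1D operator tridiag(-1, 2, -1)/h^2 ...
    lam = (2.0 - 2.0 * np.cos(k * np.pi / (n + 1))) / h ** 2
    # ... and its (orthogonal, symmetric) sine eigenvector matrix.
    S = np.sqrt(2.0 / (n + 1)) * np.sin(np.outer(k, k) * np.pi / (n + 1))
    fhat = S.T @ f @ S                         # transform to eigenbasis
    uhat = fhat / (lam[:, None] + lam[None, :])  # divide by eigenvalue sums
    return S @ uhat @ S.T                      # transform back
```

The cost is dominated by the dense matrix products, giving the $\mathcal O(N^{3/2})$ complexity in 2D quoted above for $N = n^2$ unknowns; in practice the sine transforms are done with FFTs.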
Meta-analysis aims to combine effect measures from several studies. For continuous outcomes, the most popular effect measures use simple or standardized differences in sample means. However, a number of applications focus on the absolute values of these effect measures (i.e., unsigned magnitude effects). We provide statistical methods for meta-analysis of magnitude effects based on standardized mean differences. We propose a suitable statistical model for random-effects meta-analysis of absolute standardized mean differences (ASMD), investigate a number of statistical methods for point and interval estimation, and provide practical recommendations for choosing among them.