We consider the 2D acoustic system with a Gaussian pulse as the initial data. This case was proposed at the first Workshop on Benchmark Problems in Computational Aeroacoustics, and it is commonly used for the verification of numerical methods. We construct an efficient algorithm to evaluate the exact solution for a given time $t$ and distance $r$. For a prescribed accuracy $\varepsilon$, it takes $c\ln(1/\varepsilon)$ operations (the evaluation of a Bessel function counts as one operation), where $c$ does not depend on $t$ and $r$. This is made possible by using three different integral representations and an asymptotic series, selected depending on $t$ and $r$.
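For orientation, a standard integral representation of this exact solution (for an initial pressure pulse $\exp(-\alpha r^2)$ with zero initial velocity) can be evaluated by direct quadrature. The sketch below is a slow reference baseline for that representation, not the paper's $c\ln(1/\varepsilon)$ algorithm; the benchmark value of $\alpha$ is an assumption.

```python
# Reference evaluation of the 2D Gaussian-pulse acoustic pressure via direct
# quadrature of the standard integral representation
#   p(r, t) = (1 / (2 alpha)) * int_0^inf exp(-xi^2 / (4 alpha)) cos(xi t) J0(xi r) xi dxi.
# This is a brute-force baseline for validating a fast evaluator; it is NOT the
# paper's c*ln(1/eps) algorithm.
import numpy as np
from scipy.integrate import quad
from scipy.special import j0

def pressure(r: float, t: float, alpha: float = np.log(2) / 9) -> float:
    """Pressure at distance r and time t (direct-quadrature baseline)."""
    integrand = lambda xi: np.exp(-xi**2 / (4 * alpha)) * np.cos(xi * t) * j0(xi * r) * xi
    val, _ = quad(integrand, 0.0, np.inf, limit=200)
    return val / (2 * alpha)

print(pressure(r=5.0, t=10.0))
```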
Machine Listening focuses on developing technologies to extract relevant information from audio signals. A critical aspect of such projects is the acquisition and labeling of contextualized data, which is inherently complex and requires specific resources and strategies. Although some audio datasets are available, many are unsuitable for commercial applications. This paper argues for Active Learning (AL) with expert labelers over crowdsourcing, which often lacks detailed insight into the structure of the dataset. AL is an iterative process that combines human labelers and AI models to optimize the labeling budget by intelligently selecting which samples to send for human review. This approach addresses the challenge of handling large, constantly growing datasets that exceed the available computational resources and memory. The paper presents a comprehensive data-centric framework for Machine Listening projects, detailing the configuration of recording nodes, the database structure, and labeling-budget optimization in resource-constrained scenarios. Applied to an industrial port in Valencia, Spain, the framework labeled 6540 ten-second audio samples over five months with a small team, demonstrating its effectiveness and its adaptability to different levels of resource availability.
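As a concrete illustration of the AL loop described above, the following minimal sketch runs least-confidence uncertainty sampling over a pool of synthetic stand-in audio embeddings; the classifier, embedding dimension, and per-round budget are illustrative assumptions, not the configuration used in the paper.

```python
# Minimal active-learning loop with least-confidence uncertainty sampling.
# Embeddings and labels are synthetic stand-ins for audio features and expert
# annotations; in practice the "labels" come from the human labelers each round.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_pool = rng.normal(size=(5000, 128))        # stand-in for audio embeddings
y_pool = (X_pool[:, 0] > 0).astype(int)      # stand-in for expert labels

n_pool = len(X_pool)
labeled = list(rng.choice(n_pool, size=50, replace=False))   # seed set
labeled_set = set(labeled)
unlabeled = [i for i in range(n_pool) if i not in labeled_set]

for round_ in range(10):                     # budget: 10 rounds x 20 clips
    clf = LogisticRegression(max_iter=1000).fit(X_pool[labeled], y_pool[labeled])
    proba = clf.predict_proba(X_pool[unlabeled])
    uncertainty = 1.0 - proba.max(axis=1)    # least-confidence criterion
    picks = np.argsort(uncertainty)[-20:]    # most uncertain clips go to experts
    for p in sorted(picks, reverse=True):
        labeled.append(unlabeled.pop(p))
```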
The aim of the present work is to design, theoretically analyze, and numerically test a generalized Dryja-Smith-Widlund (GDSW) preconditioner for composite Discontinuous Galerkin discretizations of multicompartment parabolic reaction-diffusion equations, where the solution can exhibit natural discontinuities across the domain. We prove a scalable and quasi-optimal upper bound for the condition number of the preconditioned operator associated with the discrete system solved at each time step. The GDSW preconditioner is then applied to the EMI (Extracellular-Membrane-Intracellular) reaction-diffusion system, recently proposed as a microscopic model of the spatiotemporal evolution of cardiac bioelectrical potentials. Numerical tests validate the scalability and quasi-optimality of the EMI-GDSW preconditioner and investigate its robustness with respect to the time-step size as well as to jumps in the diffusion coefficients.
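For context, the classical GDSW bound for scalar elliptic problems has the form below, with $H$ the subdomain diameter, $h$ the mesh size, $\delta$ the overlap width, and $C$ independent of these quantities; the precise bound proven in the composite DG, multicompartment parabolic setting of this paper may differ in its constants and dependencies.
\[
\kappa\left(M_{\mathrm{GDSW}}^{-1} A\right) \le C \left(1 + \frac{H}{\delta}\right)\left(1 + \log\frac{H}{h}\right)^{2}.
\]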
Entropy estimation plays a crucial role in various fields, such as information theory, statistical data science, and machine learning. However, traditional entropy estimation methods often struggle with complex data distributions. Mixture-based entropy estimation has recently been proposed and has gained attention due to its ease of use and accuracy. This paper presents a novel approach to quantifying the uncertainty associated with this mixture-based entropy estimator using the weighted likelihood bootstrap. Unlike standard methods, our approach leverages the underlying mixture structure by assigning random weights to the observations in a weighted likelihood bootstrap procedure, leading to more accurate uncertainty estimation. The generation of the weights is also investigated, leading to the proposal of weights drawn from a Dirichlet distribution with parameter $\alpha = 0.8137$ instead of the usual $\alpha = 1$. Furthermore, centered percentile intervals emerge as the preferred choice to ensure empirical coverage close to the nominal level. Extensive simulation studies comparing different resampling strategies are presented and their results discussed. The proposed approach is illustrated by analyzing the log-returns of daily gold prices at COMEX for the years 2014--2022, and the Net Rating scores, an advanced statistic used in basketball analytics, for NBA teams in the 2022/23 regular season.
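A minimal sketch of the weighted likelihood bootstrap step is given below, using a single Gaussian as a stand-in for the fitted mixture so that the weighted MLE and the entropy have closed forms; in the paper, the weighted fit is of a mixture model and the entropy comes from the mixture-based estimator.

```python
# Weighted likelihood bootstrap for entropy uncertainty, illustrated with a
# single Gaussian (closed-form weighted MLE and entropy) instead of a mixture.
# alpha = 0.8137 follows the Dirichlet weighting proposed in the paper.
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(loc=0.0, scale=2.0, size=500)   # observed data (illustrative)
n, B, alpha = len(x), 2000, 0.8137

ests = np.empty(B)
for b in range(B):
    w = rng.dirichlet(np.full(n, alpha))        # random weights, summing to 1
    mu = np.sum(w * x)                          # weighted MLE of the mean
    var = np.sum(w * (x - mu) ** 2)             # weighted MLE of the variance
    ests[b] = 0.5 * np.log(2 * np.pi * np.e * var)  # Gaussian entropy

point = ests.mean()
lo, hi = np.percentile(ests, [2.5, 97.5])       # equal-tailed ("centered") 95% interval
print(f"entropy ~ {point:.3f}, 95% CI [{lo:.3f}, {hi:.3f}]")
```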
Model evaluations are central to understanding the safety, risks, and societal impacts of AI systems. While most real-world AI applications involve human-AI interaction, most current evaluations of AI models (e.g., common benchmarks) do not. Instead, they incorporate human factors only in limited ways, assessing the safety of models in isolation and thereby falling short of capturing the complexity of human-model interactions. In this paper, we discuss and operationalize a definition of an emerging category of evaluations -- "human interaction evaluations" (HIEs) -- which focus on assessing human-model interactions, that is, the process and outcomes of humans using models. First, we argue that HIEs can be used to increase the validity of safety evaluations, assess direct human impact and interaction-specific harms, and guide future assessments of models' societal impact. Second, we propose a safety-focused HIE design framework -- containing a human-LLM interaction taxonomy -- with three stages: (1) identifying the risk or harm area, (2) characterizing the use context, and (3) choosing the evaluation parameters. Third, we apply our framework to two potential evaluations targeting overreliance and persuasion risks. Finally, we conclude with tangible recommendations for addressing concerns over the cost, replicability, and unrepresentativeness of HIEs.
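One hypothetical way to make the three-stage framework concrete is to encode an evaluation specification as a small data structure; all field names below are illustrative, not the paper's terminology.

```python
# Hypothetical encoding of the three-stage HIE design framework as a config
# object. Field names and example values are illustrative assumptions.
from dataclasses import dataclass, field

@dataclass
class HIESpec:
    risk_area: str                        # stage 1: risk or harm area
    use_context: str                      # stage 2: task, users, interaction mode
    interaction_type: str                 # drawn from the human-LLM interaction taxonomy
    metrics: list[str] = field(default_factory=list)  # stage 3: evaluation parameters

overreliance_eval = HIESpec(
    risk_area="overreliance",
    use_context="consumer advice-seeking chat",
    interaction_type="open-ended dialogue",
    metrics=["task accuracy with/without model", "verification rate"],
)
```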
We propose an enhancement to inductive types and records in a dependent type theory, namely (co)conditions. With a primitive interval type, conditions generalize the cubical syntax of higher inductive types in homotopy type theory, while coconditions generalize the cubical path type. (Co)conditions are also useful without an interval type. The duality between conditions and coconditions manifests in their elimination principles: the elimination principles of inductive types with conditions can be internalized by records with coconditions, and vice versa. We do not develop the metatheory of conditions and coconditions in this paper; instead, we present only the type checking.
Lifted samplers form a class of Markov chain Monte Carlo methods that has drawn a lot of attention in recent years due to superior performance in challenging Bayesian applications. A canonical example of such a sampler is the one derived from a random walk Metropolis algorithm on a totally-ordered state space, such as the integers or the real numbers. The lifted sampler is obtained by splitting the proposal distribution in two: one part proposing moves in the increasing direction, the other in the decreasing direction. The sampler keeps following one direction until a rejection occurs, upon which it flips the direction. In terms of asymptotic variances, it outperforms the random walk Metropolis algorithm, regardless of the target distribution, at no additional computational cost. Other studies show, however, that beyond this simple case, lifted samplers do not always outperform their Metropolis counterparts. In this paper, we leverage the celebrated work of Tierney (1998) to provide an analysis in a general framework encompassing a broad class of lifted samplers. Our finding is that, essentially, the asymptotic variances cannot increase by a factor of more than 2, regardless of the target distribution, the way the directions are induced, and the type of algorithm from which the lifted sampler is derived (be it a Metropolis--Hastings algorithm, a reversible jump algorithm, etc.). This result indicates that, while there is potentially a lot to gain from lifting a sampler, there is not much to lose.
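The canonical lifted sampler described above is easy to state in code. The sketch below is a Gustafson-style guided walk on the integers under an illustrative discretized-Gaussian target: proposals are made in the current direction, which is kept after each acceptance and flipped after each rejection.

```python
# Lifted random-walk Metropolis on the integers: the proposal is split into an
# increasing and a decreasing part, and the direction flips only on rejection.
# The target (a discretized Gaussian) and step distribution are illustrative.
import numpy as np

rng = np.random.default_rng(2)
log_pi = lambda x: -0.5 * (x / 20.0) ** 2      # unnormalized log-target on Z

x, d = 0, 1                                     # state and direction (+1 or -1)
samples = []
for _ in range(100_000):
    y = x + d * rng.integers(1, 6)              # propose in the current direction
    if np.log(rng.random()) < log_pi(y) - log_pi(x):
        x = y                                   # accept: keep the direction
    else:
        d = -d                                  # reject: flip the direction
    samples.append(x)

print(np.mean(samples), np.std(samples))        # should be ~0 and ~20
```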
Randomized controlled trials often suffer from interference, a violation of the Stable Unit Treatment Values Assumption (SUTVA) in which a unit's treatment assignment affects the outcomes of its neighbors. This interference causes bias in naive estimators of the average treatment effect (ATE). A popular method to achieve unbiasedness is to pair the Horvitz-Thompson estimator of the ATE with a known exposure mapping: a function that identifies which units in a given randomization are not subject to interference. For example, an exposure mapping can specify that any unit with at least an $h$-fraction of its neighbors having the same treatment status does not experience interference. However, this threshold $h$ is difficult to elicit from domain experts, and a misspecified threshold can induce bias. In this work, we propose a data-adaptive method to select the $h$-fraction threshold that minimizes the mean squared error of the Horvitz-Thompson estimator. Our method estimates the bias and variance of the Horvitz-Thompson estimator under different thresholds using a linear dose-response model of the potential outcomes. We present simulations illustrating that our method improves upon non-adaptive choices of the threshold. We further illustrate the performance of our estimator by running experiments on a publicly available Amazon product similarity graph. Finally, we demonstrate that our method is robust to deviations from the linear potential outcomes model.
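To fix ideas, the sketch below implements the $h$-fraction exposure mapping and the corresponding Horvitz-Thompson estimator under Bernoulli randomization, with exposure probabilities estimated by Monte Carlo over re-randomizations; the graph, outcome model, and constants are illustrative and not those of the paper.

```python
# h-fraction exposure mapping and Horvitz-Thompson ATE estimator under
# Bernoulli(p) randomization, with Monte Carlo exposure probabilities.
import numpy as np

rng = np.random.default_rng(3)
n, p, h = 200, 0.5, 0.75
A = (rng.random((n, n)) < 0.03).astype(int)     # random undirected graph
A = np.triu(A, 1)
A = A + A.T
deg = np.maximum(A.sum(axis=1), 1)

def exposed(z):
    """Units whose fraction of same-treated neighbors is at least h."""
    frac_same = np.where(z == 1, A @ z, A @ (1 - z)) / deg
    return frac_same >= h

z = (rng.random(n) < p).astype(int)             # one realized assignment
y = 1.0 + z + 0.5 * (A @ z) / deg + rng.normal(0, 0.1, n)   # toy outcomes

# Monte Carlo estimates of the exposure probabilities under the design.
draws = (rng.random((5000, n)) < p).astype(int)
pi1 = np.mean([(d == 1) & exposed(d) for d in draws], axis=0)
pi0 = np.mean([(d == 0) & exposed(d) for d in draws], axis=0)

e = exposed(z)
ht = np.mean(np.where((z == 1) & e, y / np.clip(pi1, 1e-9, None), 0.0)
             - np.where((z == 0) & e, y / np.clip(pi0, 1e-9, None), 0.0))
print(ht)
```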
This work concerns the enrichment of Discontinuous Galerkin (DG) bases so that the resulting scheme provides a much better approximation of steady solutions to hyperbolic systems of balance laws. The basis enrichment leverages a prior, an approximation of the steady solution, which we propose to compute using a Physics-Informed Neural Network (PINN). To that end, after presenting the classical DG scheme, we show how to enrich its basis with a prior. Convergence results and error estimates follow, in which we prove that the enriched basis does not change the order of convergence and that the error constant is improved. To construct the prior, we elect to use parametric PINNs, which we introduce, along with the algorithms used to construct a prior from them. We finally perform several validation experiments on four different hyperbolic balance laws to highlight the properties of the scheme. Namely, we show that the DG scheme with prior is much more accurate on steady solutions than the DG scheme without prior, while retaining the same approximation quality on unsteady solutions.
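A common form of such an enrichment, given here for concreteness, multiplies each local basis function by the prior; whether the paper uses this multiplicative form or an additive one is an assumption of this sketch. Writing $u_\theta$ for the PINN prior and $\varphi_{i,k}$ for the DG basis on cell $\Omega_k$:
\[
\widetilde{\varphi}_{i,k}(x) = u_\theta(x)\,\varphi_{i,k}(x),
\qquad
u_h\big|_{\Omega_k}(x) = \sum_{i} c_{i,k}\,\widetilde{\varphi}_{i,k}(x).
\]
If the original basis contains the constants, the prior itself is then exactly representable in the enriched space, which is consistent with an improved error constant near steady states.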
We consider robust estimation of linear regression coefficients. In this note, we focus on the case where the covariates are sampled from an $L$-subGaussian distribution with unknown covariance, the noise is sampled from a distribution with a bounded absolute moment, and both covariates and noise may be contaminated by an adversary. We derive an estimation error bound that depends on the stable rank and the condition number of the covariance matrix of the covariates, for an estimator with polynomial computational complexity.
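For reference, and assuming the usual definitions are intended, the two covariance functionals appearing in the bound are, with $\Sigma$ the covariance matrix of the covariates,
\[
\operatorname{srank}(\Sigma) = \frac{\operatorname{Tr}(\Sigma)}{\|\Sigma\|_{\mathrm{op}}},
\qquad
\kappa(\Sigma) = \frac{\lambda_{\max}(\Sigma)}{\lambda_{\min}(\Sigma)},
\]
so that the stable rank is always at most the usual rank, and the bound degrades as $\Sigma$ becomes ill-conditioned.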
Sensorized insoles provide a tool for gait studies and health monitoring during daily life. For users to accept such insoles, they need to be comfortable and lightweight. Previous work has already demonstrated that estimation of ground reaction forces (GRFs) is possible with insoles. However, these are often assemblies of commercial components, restricting design freedom and customization. In this work, we investigate using four 3D-printed soft foam-like sensors to sensorize an insole. These sensors were combined with system identification of Hammerstein-Wiener models to estimate the 3D GRFs, which were compared to values from an instrumented treadmill as the gold standard. The four sensors behaved in line with the expected change in pressure distribution during the gait cycle. In addition, the identified (personalized) Hammerstein-Wiener models showed the best estimation performance (on average, RMS error 9.3%, $R^2 = 0.85$, and mean absolute error (MAE) 7%) for the vertical, mediolateral, and anteroposterior GRFs, showing that these sensors can estimate the resulting 3D force reasonably well. These results for nine participants were comparable to or outperformed other works that used commercial FSRs with machine learning. The estimation performance of the identified models did decrease over time, but remained at, on average, 11.35% RMS error and 8.6% MAE after a week, with the Hammerstein-Wiener models appearing consistent between days two and seven. These results show that 3D-printed soft piezoresistive foam-like sensors combined with system identification are a promising approach for applications that require softness, light weight, and customization, such as wearable (force) sensors.
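For readers unfamiliar with the model class, the standard Hammerstein-Wiener structure chains a static input nonlinearity, a linear time-invariant block, and a static output nonlinearity, here mapping raw sensor signals $u(t)$ to estimated GRFs $y(t)$:
\[
w(t) = f\big(u(t)\big), \qquad x(t) = \frac{B(q)}{F(q)}\, w(t), \qquad y(t) = g\big(x(t)\big),
\]
where $q$ is the forward shift operator and $f$, $g$ are the estimated static nonlinearities. The personalized models referred to above correspond to identifying $f$, $g$, $B$, and $F$ per participant.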