Grid sentence is commonly used for studying the Lombard effect and Normal-to-Lombard conversion. However, it's unclear if Normal-to-Lombard models trained on grid sentences are sufficient for improving natural speech intelligibility in real-world applications. This paper presents the recording of a parallel Lombard corpus (called Lombard Chinese TIMIT, LCT) extracting natural sentences from Chinese TIMIT. Then We compare natural and grid sentences in terms of Lombard effect and Normal-to-Lombard conversion using LCT and Enhanced MAndarin Lombard Grid corpus (EMALG). Through a parametric analysis of the Lombard effect, We find that as the noise level increases, both natural sentences and grid sentences exhibit similar changes in parameters, but in terms of the increase of the alpha ratio, grid sentences show a greater increase. Following a subjective intelligibility assessment across genders and Signal-to-Noise Ratios, the StarGAN model trained on EMALG consistently outperforms the model trained on LCT in terms of improving intelligibility. This superior performance may be attributed to EMALG's larger alpha ratio increase from normal to Lombard speech.
A comprehensive mathematical model of the multiphysics flow of blood and Cerebrospinal Fluid (CSF) in the brain can be expressed as the coupling of a poromechanics system and Stokes' equations: the first describes fluids filtration through the cerebral tissue and the tissue's elastic response, while the latter models the flow of the CSF in the brain ventricles. This model describes the functioning of the brain's waste clearance mechanism, which has been recently discovered to play an essential role in the progress of neurodegenerative diseases. To model the interactions between different scales in the porous medium, we propose a physically consistent coupling between Multi-compartment Poroelasticity (MPE) equations and Stokes' equations. In this work, we introduce a numerical scheme for the discretization of such coupled MPE-Stokes system, employing a high-order discontinuous Galerkin method on polytopal grids to efficiently account for the geometric complexity of the domain. We analyze the stability and convergence of the space semidiscretized formulation, we prove a-priori error estimates, and we present a temporal discretization based on a combination of Newmark's $\beta$-method for the elastic wave equation and the $\theta$-method for the other equations of the model. Numerical simulations carried out on test cases with manufactured solutions validate the theoretical error estimates. We also present numerical results on a two-dimensional slice of a patient-specific brain geometry reconstructed from diagnostic images, to test in practice the advantages of the proposed approach.
In this work we extend the shifted Laplacian approach to the elastic Helmholtz equation. The shifted Laplacian multigrid method is a common preconditioning approach for the discretized acoustic Helmholtz equation. In some cases, like geophysical seismic imaging, one needs to consider the elastic Helmholtz equation, which is harder to solve: it is three times larger and contains a nullity-rich grad-div term. These properties make the solution of the equation more difficult for multigrid solvers. The key idea in this work is combining the shifted Laplacian with approaches for linear elasticity. We provide local Fourier analysis and numerical evidence that the convergence rate of our method is independent of the Poisson's ratio. Moreover, to better handle the problem size, we complement our multigrid method with the domain decomposition approach, which works in synergy with the local nature of the shifted Laplacian, so we enjoy the advantages of both methods without sacrificing performance. We demonstrate the efficiency of our solver on 2D and 3D problems in heterogeneous media.
In this work, we present new proofs of convergence for Plug-and-Play (PnP) algorithms. PnP methods are efficient iterative algorithms for solving image inverse problems where regularization is performed by plugging a pre-trained denoiser in a proximal algorithm, such as Proximal Gradient Descent (PGD) or Douglas-Rachford Splitting (DRS). Recent research has explored convergence by incorporating a denoiser that writes exactly as a proximal operator. However, the corresponding PnP algorithm has then to be run with stepsize equal to $1$. The stepsize condition for nonconvex convergence of the proximal algorithm in use then translates to restrictive conditions on the regularization parameter of the inverse problem. This can severely degrade the restoration capacity of the algorithm. In this paper, we present two remedies for this limitation. First, we provide a novel convergence proof for PnP-DRS that does not impose any restrictions on the regularization parameter. Second, we examine a relaxed version of the PGD algorithm that converges across a broader range of regularization parameters. Our experimental study, conducted on deblurring and super-resolution experiments, demonstrate that both of these solutions enhance the accuracy of image restoration.
This study proposes a novel self-calibration method for eye tracking in a virtual reality (VR) headset. The proposed method is based on the assumptions that the user's viewpoint can freely move and that the points of regard (PoRs) from different viewpoints are distributed within a small area on an object surface during visual fixation. In the method, fixations are first detected from the time-series data of uncalibrated gaze directions using an extension of the I-VDT (velocity and dispersion threshold identification) algorithm to a three-dimensional (3D) scene. Then, the calibration parameters are optimized by minimizing the sum of a dispersion metrics of the PoRs. The proposed method can potentially identify the optimal calibration parameters representing the user-dependent offset from the optical axis to the visual axis without explicit user calibration, image processing, or marker-substitute objects. For the gaze data of 18 participants walking in two VR environments with many occlusions, the proposed method achieved an accuracy of 2.1$^\circ$, which was significantly lower than the average offset. Our method is the first self-calibration method with an average error lower than 3$^\circ$ in 3D environments. Further, the accuracy of the proposed method can be improved by up to 1.2$^\circ$ by refining the fixation detection or optimization algorithm.
We establish a general convergence theory of the Rayleigh--Ritz method and the refined Rayleigh--Ritz method for computing some simple eigenpair ($\lambda_{*},x_{*}$) of a given analytic nonlinear eigenvalue problem (NEP). In terms of the deviation $\varepsilon$ of $x_{*}$ from a given subspace $\mathcal{W}$, we establish a priori convergence results on the Ritz value, the Ritz vector and the refined Ritz vector, and present sufficient convergence conditions for them. The results show that, as $\varepsilon\rightarrow 0$, there is a Ritz value that unconditionally converges to $\lambda_*$ and the corresponding refined Ritz vector does so too but the Ritz vector may fail to converge and even may not be unique. We also present an error bound for the approximate eigenvector in terms of the computable residual norm of a given approximate eigenpair, and give lower and upper bounds for the error of the refined Ritz vector and the Ritz vector as well as for that of the corresponding residual norms. These results nontrivially extend some convergence results on these two methods for the linear eigenvalue problem to the NEP. Examples are constructed to illustrate some of the results.
In this work, we propose two criteria for linear codes obtained from the Plotkin sum construction being symplectic self-orthogonal (SO) and linear complementary dual (LCD). As specific constructions, several classes of symplectic SO codes with good parameters including symplectic maximum distance separable codes are derived via $\ell$-intersection pairs of linear codes and generalized Reed-Muller codes. Also symplectic LCD codes are constructed from general linear codes. Furthermore, we obtain some binary symplectic LCD codes, which are equivalent to quaternary trace Hermitian additive complementary dual codes that outperform best-known quaternary Hermitian LCD codes reported in the literature. In addition, we prove that symplectic SO and LCD codes obtained in these ways are asymptotically good.
Positive and unlabelled learning is an important problem which arises naturally in many applications. The significant limitation of almost all existing methods lies in assuming that the propensity score function is constant (SCAR assumption), which is unrealistic in many practical situations. Avoiding this assumption, we consider parametric approach to the problem of joint estimation of posterior probability and propensity score functions. We show that under mild assumptions when both functions have the same parametric form (e.g. logistic with different parameters) the corresponding parameters are identifiable. Motivated by this, we propose two approaches to their estimation: joint maximum likelihood method and the second approach based on alternating maximization of two Fisher consistent expressions. Our experimental results show that the proposed methods are comparable or better than the existing methods based on Expectation-Maximisation scheme.
Knowledge Graphs (KGs) have emerged as fundamental platforms for powering intelligent decision-making and a wide range of Artificial Intelligence (AI) services across major corporations such as Google, Walmart, and AirBnb. KGs complement Machine Learning (ML) algorithms by providing data context and semantics, thereby enabling further inference and question-answering capabilities. The integration of KGs with neuronal learning (e.g., Large Language Models (LLMs)) is currently a topic of active research, commonly named neuro-symbolic AI. Despite the numerous benefits that can be accomplished with KG-based AI, its growing ubiquity within online services may result in the loss of self-determination for citizens as a fundamental societal issue. The more we rely on these technologies, which are often centralised, the less citizens will be able to determine their own destinies. To counter this threat, AI regulation, such as the European Union (EU) AI Act, is being proposed in certain regions. The regulation sets what technologists need to do, leading to questions concerning: How can the output of AI systems be trusted? What is needed to ensure that the data fuelling and the inner workings of these artefacts are transparent? How can AI be made accountable for its decision-making? This paper conceptualises the foundational topics and research pillars to support KG-based AI for self-determination. Drawing upon this conceptual framework, challenges and opportunities for citizen self-determination are illustrated and analysed in a real-world scenario. As a result, we propose a research agenda aimed at accomplishing the recommended objectives.
In this paper we discuss a deterministic form of ensemble Kalman inversion as a regularization method for linear inverse problems. By interpreting ensemble Kalman inversion as a low-rank approximation of Tikhonov regularization, we are able to introduce a new sampling scheme based on the Nystr\"om method that improves practical performance. Furthermore, we formulate an adaptive version of ensemble Kalman inversion where the sample size is coupled with the regularization parameter. We prove that the proposed scheme yields an order optimal regularization method under standard assumptions if the discrepancy principle is used as a stopping criterion. The paper concludes with a numerical comparison of the discussed methods for an inverse problem of the Radon transform.
Within the rapidly developing Internet of Things (IoT), numerous and diverse physical devices, Edge devices, Cloud infrastructure, and their quality of service requirements (QoS), need to be represented within a unified specification in order to enable rapid IoT application development, monitoring, and dynamic reconfiguration. But heterogeneities among different configuration knowledge representation models pose limitations for acquisition, discovery and curation of configuration knowledge for coordinated IoT applications. This paper proposes a unified data model to represent IoT resource configuration knowledge artifacts. It also proposes IoT-CANE (Context-Aware recommendatioN systEm) to facilitate incremental knowledge acquisition and declarative context driven knowledge recommendation.