This paper introduces a novel paradigm for constructing linearly implicit, high-order, unconditionally energy-stable schemes for general gradient flows, combining the scalar auxiliary variable (SAV) approach with additive Runge-Kutta (ARK) methods. We provide rigorous proofs of energy stability, unique solvability, and convergence. The proposed schemes generalize some recently developed high-order, energy-stable schemes and address their shortcomings. On the one hand, the proposed schemes can incorporate existing SAV-RK-type methods after judiciously selecting the Butcher tableaux of the ARK methods \cite{sav_li,sav_nlsw}. The order of a SAV-RKPC method can thus be confirmed theoretically via the order conditions of the corresponding ARK method. Several new schemes constructed within our framework prove to be more stable than existing SAV-RK-type methods. On the other hand, the proposed schemes are not limited to a specific form of the nonlinear part of the free energy and can achieve high order with fewer intermediate stages than the convex splitting ARK methods \cite{csrk}. Numerical experiments demonstrate the stability and efficiency of the proposed schemes.
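For readers unfamiliar with the SAV approach, a minimal sketch of the standard reformulation such schemes build on (generic notation, not taken from this abstract): for a free energy $E(\phi)=\frac{1}{2}(\phi,\mathcal{L}\phi)+E_1(\phi)$ with $E_1$ bounded from below, one introduces a scalar auxiliary variable $r(t)$ and rewrites the gradient flow $\phi_t=-\mathcal{G}\mu$ as

```latex
% generic SAV reformulation of the gradient flow \phi_t = -\mathcal{G}\mu
r(t) = \sqrt{E_1(\phi) + C_0}, \qquad U(\phi) := \frac{\delta E_1}{\delta \phi},
\begin{align*}
\phi_t &= -\mathcal{G}\mu, \\
\mu    &= \mathcal{L}\phi + \frac{r}{\sqrt{E_1(\phi)+C_0}}\, U(\phi), \\
r_t    &= \frac{1}{2\sqrt{E_1(\phi)+C_0}} \int_\Omega U(\phi)\,\phi_t \,\mathrm{d}x.
\end{align*}
```

Treating $U(\phi)$ explicitly while keeping $\mathcal{L}\phi$ and $r$ implicit yields linearly implicit schemes that unconditionally dissipate the modified energy $\frac{1}{2}(\phi,\mathcal{L}\phi)+r^2$.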
Non-autoregressive approaches aim to improve the inference speed of translation models, particularly those that generate output in a single forward pass. However, these approaches often suffer from a significant drop in translation quality compared to autoregressive models. This paper introduces a series of techniques to enhance the translation quality of Non-Autoregressive Translation (NAT) models while maintaining a substantial acceleration in inference speed. We propose fine-tuning Pretrained Multilingual Language Models (PMLMs) with the CTC loss to train NAT models effectively. Furthermore, we adopt a MASK-insertion scheme for up-sampling instead of token duplication, and we present an embedding distillation method to further enhance performance. In our experiments, our model outperforms the baseline autoregressive model (Transformer \textit{base}) on multiple datasets, including WMT'14 DE$\leftrightarrow$EN, WMT'16 RO$\leftrightarrow$EN, and IWSLT'14 DE$\leftrightarrow$EN. Notably, our model surpasses the baseline autoregressive model on the IWSLT'14 DE$\leftrightarrow$EN and WMT'16 RO$\leftrightarrow$EN datasets even without using distillation data during training. On the IWSLT'14 DE$\rightarrow$EN dataset, our model achieves a BLEU score of 39.59, setting a new state of the art. Additionally, our model attains a 16.35-fold speedup over the autoregressive model.
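To illustrate the difference between the two up-sampling strategies mentioned above, here is a minimal sketch; the `[MASK]` placement, up-sampling ratio, and function names are illustrative assumptions, and the paper's exact scheme may differ:

```python
def duplicate_upsample(tokens, ratio=2):
    # baseline up-sampling: repeat each source token `ratio` times
    return [t for t in tokens for _ in range(ratio)]

def mask_insertion_upsample(tokens, ratio=2, mask="[MASK]"):
    # MASK-insertion up-sampling: keep one copy of each token and fill the
    # remaining slots with mask placeholders, so the CTC decoder predicts
    # target tokens from mostly-masked positions
    out = []
    for t in tokens:
        out.append(t)
        out.extend([mask] * (ratio - 1))
    return out

print(duplicate_upsample(["wie", "gehts"]))       # ['wie', 'wie', 'gehts', 'gehts']
print(mask_insertion_upsample(["wie", "gehts"]))  # ['wie', '[MASK]', 'gehts', '[MASK]']
```

Both produce a decoder input longer than the source, as CTC requires; the MASK variant arguably matches the masked-prediction objective a pretrained multilingual LM was trained with.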
This paper is concerned with the numerical approximation of quantities of interest associated with solutions to parametric elliptic partial differential equations (PDEs). The key novelty of this work is in its focus on the quantities of interest represented by continuously G\^ateaux differentiable nonlinear functionals. We consider a class of parametric elliptic PDEs where the underlying differential operator has affine dependence on a countably infinite number of uncertain parameters. We design a goal-oriented adaptive algorithm for approximating nonlinear functionals of solutions to this class of parametric PDEs. In the algorithm, the approximations of parametric solutions to the primal and dual problems are computed using the multilevel stochastic Galerkin finite element method (SGFEM), and the adaptive refinement process is guided by reliable spatial and parametric error indicators that identify the dominant sources of error. We prove that the proposed algorithm generates multilevel SGFEM approximations for which the estimates of the error in the goal functional converge to zero. Numerical experiments for a selection of test problems and nonlinear quantities of interest demonstrate that the proposed goal-oriented adaptive strategy yields optimal convergence rates (for both the error estimates and the reference errors in the quantities of interest) with respect to the overall dimension of the underlying multilevel approximation spaces.
In this paper we present a hybridizable discontinuous Galerkin method for the time-dependent Navier-Stokes equations coupled to the quasi-static poroelasticity equations via interface conditions. We determine a bound on the data that guarantees stability and well-posedness of the fully discrete problem and prove a priori error estimates. A numerical example confirms our analysis.
The absence of annotated sign language datasets has hindered the development of sign language recognition and translation technologies. In this paper, we introduce Bornil, a crowdsource-friendly, multilingual platform for sign language data collection, annotation, and validation. Bornil allows users to record sign language gestures and lets annotators perform sentence- and gloss-level annotation. It also allows validators to verify the quality of both the recorded videos and the annotations through manual validation, enabling the development of high-quality datasets for deep learning-based Automatic Sign Language Recognition. To demonstrate the system's efficacy, we collected the largest dataset for the Bangladeshi Sign Language dialect, performed deep learning-based Sign Language Recognition modeling, and report the benchmark performance. The Bornil platform, the BornilDB v1.0 dataset, and the codebases are available at //bornil.bengali.ai
We propose and analyse an explicit boundary-preserving scheme for the strong approximations of some SDEs with non-globally Lipschitz drift and diffusion coefficients whose state-space is bounded. The scheme consists of a Lamperti transform followed by a Lie--Trotter splitting. We prove $L^{p}(\Omega)$-convergence of order $1$, for every $p \in \mathbb{N}$, of the scheme and exploit the Lamperti transform to confine the numerical approximations to the state-space of the considered SDE. We provide numerical experiments that confirm the theoretical results and compare the proposed Lamperti-splitting scheme to other numerical schemes for SDEs.
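As background, the Lamperti transform follows a generic recipe (generic notation, not specific to the SDEs treated in the paper): for $\mathrm{d}X_t = \mu(X_t)\,\mathrm{d}t + \sigma(X_t)\,\mathrm{d}W_t$ with $\sigma > 0$ on the interior of the state space, one sets

```latex
% generic Lamperti transform: removes the state dependence of the diffusion
Y_t = F(X_t), \qquad F(x) = \int^{x} \frac{\mathrm{d}u}{\sigma(u)},
\qquad
\mathrm{d}Y_t
  = \left( \frac{\mu(X_t)}{\sigma(X_t)} - \frac{\sigma'(X_t)}{2} \right)\mathrm{d}t
    + \mathrm{d}W_t,
\quad X_t = F^{-1}(Y_t),
```

by It\^o's formula. A Lie--Trotter splitting is then applied to the transformed drift, and mapping the numerical iterates back through $F^{-1}$ is what confines the approximations to the original state space.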
Over the past decades, cognitive neuroscientists and behavioral economists have recognized the value of describing the process of decision making in detail and modeling the emergence of decisions over time. For example, the time it takes to decide can reveal more about an agent's true hidden preferences than the decision itself. Similarly, data that track the ongoing decision process, such as eye movements or neural recordings, contain critical information that can be exploited, even if no decision is made. Here, we argue that artificial intelligence (AI) research would benefit from a stronger focus on insights about how decisions emerge over time and from incorporating related process data to improve AI predictions in general and human-AI interactions in particular. First, we introduce a well-established computational framework that assumes decisions emerge from the noisy accumulation of evidence, and we review related empirical work in psychology, neuroscience, and economics. Next, we discuss the extent to which current approaches in multi-agent AI do or do not incorporate process data and models of decision making. Finally, we outline how a more principled inclusion of the evidence-accumulation framework in the training and use of AI can help to improve human-AI interactions in the future.
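The evidence-accumulation framework referred to here is commonly formalized as a drift-diffusion model; a minimal simulation sketch follows (parameter names and defaults are illustrative, not taken from the article):

```python
import random

def simulate_ddm(drift=0.5, noise=1.0, threshold=1.0, dt=0.001,
                 max_t=10.0, rng=None):
    """One trial of a drift-diffusion model: evidence x accumulates at
    rate `drift` plus Gaussian noise until it hits +threshold or -threshold.
    Returns (choice, reaction_time); choice is None if no bound is reached
    before max_t (a 'no decision' trial)."""
    rng = rng or random.Random()
    x, t = 0.0, 0.0
    while t < max_t:
        # Euler-Maruyama step of dx = drift * dt + noise * dW
        x += drift * dt + noise * (dt ** 0.5) * rng.gauss(0.0, 1.0)
        t += dt
        if abs(x) >= threshold:
            return (1 if x > 0 else -1), t
    return None, t
```

Higher drift yields faster and more accurate choices; crucially, the accumulation trajectory itself, not just the endpoint, is the kind of process data the article argues AI systems should exploit.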
This work introduces a stabilised finite element formulation for the Stokes flow problem with a nonlinear slip boundary condition of friction type. The boundary condition is enforced with the help of an additional Lagrange multiplier, and the stabilised formulation simultaneously stabilises both the pressure and the Lagrange multiplier. We carry out the stability and a priori error analyses and perform a numerical convergence study to verify the theory.
In natural language processing (NLP), deep neural networks (DNNs) can model complex interactions within context and have achieved impressive results on a range of NLP tasks. Prior work on feature interaction attribution has mainly focused on symmetric interaction, which explains only the additional influence of a set of words in combination and fails to capture the asymmetric influence that contributes to model predictions. In this work, we propose an asymmetric feature interaction attribution explanation model that aims to explore asymmetric higher-order feature interactions in the inference of deep neural NLP models. By representing our explanation as a directed interaction graph, we experimentally demonstrate the interpretability of the graph for discovering asymmetric feature interactions. Experimental results on two sentiment classification datasets show the superiority of our model over state-of-the-art feature interaction attribution methods in identifying influential features for model predictions. Our code is available at //github.com/StillLu/ASIV.
We introduce and analyze a hybridizable discontinuous Galerkin (HDG) method for the dual-porosity-Stokes problem. This coupled problem describes the interaction between free flow in macrofractures/conduits, governed by the Stokes equations, and flow in microfractures/matrix, governed by a dual-porosity model. We prove that the HDG method is strongly conservative and well-posed, and we provide an a priori error analysis showing the dependence on the problem parameters. Our theoretical findings are corroborated by numerical examples.
In epidemiology and the social sciences, propensity score methods are popular for estimating treatment effects using observational data, and multiple imputation is popular for handling covariate missingness. However, how to appropriately use multiple imputation for propensity score analysis is not completely clear. This paper aims to bring clarity to the consistency (or lack thereof) of methods that have been proposed, focusing on the within approach (where the effect is estimated separately in each imputed dataset and then the multiple estimates are combined) and the across approach (where typically propensity scores are averaged across imputed datasets before being used for effect estimation). We show that the within method is valid and can be used with any causal effect estimator that is consistent in the full-data setting. Existing across methods are inconsistent, but a different across method that averages the inverse probability weights across imputed datasets is consistent for propensity score weighting. We also comment on methods that rely on imputing a function of the missing covariate rather than the covariate itself, including imputation of the propensity score and of the probability weight. Based on the consistency results and practical flexibility, we recommend generally using the standard within method. Throughout, we provide intuition to make the results meaningful to the broad audience of applied researchers.
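The contrast between the within approach and the weight-averaging across variant can be sketched concretely. The toy functions below assume treatment and outcome are fully observed (only the covariates, and hence the propensity scores, vary across imputations); all names and the Hajek-style estimator are illustrative, not the paper's notation:

```python
def hajek_ipw_ate(rows):
    """Hajek inverse-probability-weighted ATE for one (imputed) dataset.
    rows: list of (a, y, p) with treatment a in {0, 1}, outcome y,
    and estimated propensity score p = P(A = 1 | X)."""
    num1 = sum(a * y / p for a, y, p in rows)
    den1 = sum(a / p for a, y, p in rows)
    num0 = sum((1 - a) * y / (1 - p) for a, y, p in rows)
    den0 = sum((1 - a) / (1 - p) for a, y, p in rows)
    return num1 / den1 - num0 / den0

def within_estimate(imputed):
    # "within": estimate the effect separately in each imputed dataset,
    # then average the point estimates
    return sum(hajek_ipw_ate(d) for d in imputed) / len(imputed)

def across_weights_estimate(imputed):
    # across variant described in the abstract as consistent: average the
    # inverse probability *weights* over imputations for each unit,
    # then form a single weighted estimate
    m, n = len(imputed), len(imputed[0])
    num1 = den1 = num0 = den0 = 0.0
    for i in range(n):
        a, y = imputed[0][i][0], imputed[0][i][1]  # fully observed
        w1 = sum(d[i][0] / d[i][2] for d in imputed) / m
        w0 = sum((1 - d[i][0]) / (1 - d[i][2]) for d in imputed) / m
        num1 += y * w1
        den1 += w1
        num0 += y * w0
        den0 += w0
    return num1 / den1 - num0 / den0
```

The inconsistent across methods discussed in the paper instead average the propensity scores (or other intermediate quantities) before weighting, which is not the same operation as averaging the weights themselves.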