This paper presents a numerical study of immiscible, compressible two-phase flows in porous media that takes into account heterogeneity, gravity, anisotropy, and injection/production wells. We formulate a fully implicit, stable discontinuous Galerkin solver for this system that is accurate, respects the maximum principle for the approximation of saturation, and is locally mass conservative. To completely eliminate overshoot and undershoot, we construct a flux limiter that produces bound-preserving elementwise averages of the saturation. Adding a slope limiter then recovers a pointwise bound-preserving discrete saturation. Numerical results show that both the maximum principle and monotonicity of the solution are satisfied. The proposed flux limiter affects neither the local mass error nor the number of nonlinear solver iterations.
Recently, social media platforms and researchers have made efforts to detect hateful or toxic language using large language models. However, none of these works aims to use explanations, additional context, and victim community information in the detection process. We use different prompt variations and input information, and evaluate large language models in a zero-shot setting (without adding any in-context examples). We select three large language models (GPT-3.5, text-davinci and Flan-T5) and three datasets: HateXplain, implicit hate and ToxicSpans. We find that, on average, including the target information in the pipeline improves model performance substantially (~20-30%) over the baseline across the datasets. Adding the rationales/explanations to the pipeline also has a considerable effect (~10-20%) over the baseline across the datasets. In addition, we provide a typology of the error cases where these large language models fail to (i) classify and (ii) explain the reason for the decisions they take. Such vulnerable points automatically constitute 'jailbreak' prompts for these models, and industry-scale safeguard techniques need to be developed to make the models robust against such prompts.
In this paper, the interpolating rational functions introduced by Floater and Hormann are generalized, leading to a whole new family of rational functions depending on $\gamma$, an additional positive integer parameter. For $\gamma = 1$, the original Floater--Hormann interpolants are obtained. When $\gamma>1$ we prove that the new rational functions share many of the nice properties of the original Floater--Hormann functions. Indeed, for any configuration of nodes in a compact interval, they have no real poles, interpolate the given data, preserve the polynomials up to a certain fixed degree, and have a barycentric-type representation. Moreover, we estimate the associated Lebesgue constants in terms of the minimum ($h^*$) and maximum ($h$) distance between two consecutive nodes. It turns out that, in contrast to the original Floater--Hormann interpolants, for all $\gamma > 1$ we get uniformly bounded Lebesgue constants in the case of equidistant and quasi-equidistant node configurations (i.e., when $h\sim h^*$). For such configurations, as the number of nodes tends to infinity, we prove that the new interpolants uniformly converge to the interpolated function $f$, for any continuous function $f$ and all $\gamma>1$. The same is not ensured by the original Floater--Hormann interpolants ($\gamma=1$). Moreover, we provide uniform and pointwise estimates of the approximation error for functions having different degrees of smoothness. Numerical experiments illustrate the theoretical results and show a better error profile for less smooth functions compared to the original Floater--Hormann interpolants.
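As a point of reference for the abstract above, the following is a minimal sketch of the original Floater--Hormann interpolant only (the $\gamma = 1$ case) in its standard barycentric form. It does not implement the generalized family, whose definition is not given in the abstract. The function names `fh_weights` and `fh_eval` and the blending degree parameter `d` are illustrative choices, and the nodes are assumed to be sorted in strictly increasing order.

```python
def fh_weights(x, d):
    """Barycentric weights of the original Floater-Hormann interpolant.

    x: strictly increasing nodes x_0 < ... < x_n; d: blending degree, 0 <= d <= n.
    """
    n = len(x) - 1
    weights = []
    for k in range(n + 1):
        total = 0.0
        # Sum over the windows {i, ..., i+d} of d+1 consecutive nodes containing k.
        for i in range(max(0, k - d), min(k, n - d) + 1):
            prod = 1.0
            for j in range(i, i + d + 1):
                if j != k:
                    prod /= abs(x[k] - x[j])
            total += prod
        sign = -1.0 if (k - d) % 2 else 1.0
        weights.append(sign * total)
    return weights


def fh_eval(x, f, w, t):
    """Evaluate the barycentric rational interpolant at the point t."""
    num = 0.0
    den = 0.0
    for xk, fk, wk in zip(x, f, w):
        if t == xk:
            return fk  # exactly at a node: return the data value
        c = wk / (t - xk)
        num += c * fk
        den += c
    return num / den


# Usage: 11 equidistant nodes on [-1, 1], blending degree d = 3.
nodes = [-1.0 + 2.0 * k / 10 for k in range(11)]
cubic = lambda t: t**3 - 2.0 * t + 1.0
values = [cubic(t) for t in nodes]
w = fh_weights(nodes, 3)
approx = fh_eval(nodes, values, w, 0.123)
```

Because the interpolant reproduces polynomials of degree at most `d`, the cubic in the usage example should be recovered up to rounding error, which gives a quick sanity check of the weights.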
Its numerous applications make multi-human 3D pose estimation a remarkably impactful area of research. Nevertheless, assuming a multiple-view system composed of several regular RGB cameras, 3D multi-pose estimation presents several challenges. First of all, each person must be uniquely identified in the different views to separate the 2D information provided by the cameras. Secondly, the 3D pose estimation process from the multi-view 2D information of each person must be robust against noise and potential occlusions in the scenario. In this work, we address these two challenges with the help of deep learning. Specifically, we present a model based on Graph Neural Networks capable of predicting the cross-view correspondence of the people in the scenario along with a Multilayer Perceptron that takes the 2D points to yield the 3D poses of each person. These two models are trained in a self-supervised manner, thus avoiding the need for large datasets with 3D annotations.
Toward large-scale electrophysiology data analysis, many preprocessing pipelines have been developed to reject artifacts as a prerequisite step before downstream analysis. A mainstay of these pipelines is the data-driven approach of Independent Component Analysis (ICA). Nevertheless, little effort has been put into preprocessing quality control. This paper addresses the issue, motivated by our observation that, after running an ICA-based preprocessing pipeline, some subjects showed approximately Parallel multichannel Log power Spectra (PaLOS); that is, the multichannel power spectra are proportional to each other. First, we describe the presence of PaLOS and its implications for connectivity analysis using a real instance and simulation; second, we build its mathematical model and propose the PaLOS index (PaLOSi), based on common principal component analysis, to detect its presence; third, we test the performance of PaLOSi on 30094 cases of EEG from 5 databases. The results show that 1) PaLOS implies a sole source, which is physiologically implausible; 2) PaLOSi can detect the excessive elimination of brain components and is robust to channel number, electrode layout, reference, and other factors; 3) PaLOSi can output channel-wise and frequency-wise indices to support in-depth checks. This paper presents the PaLOS issue in the quality control step after running the preprocessing pipeline, and the proposed PaLOSi may serve as a novel data quality metric in large-scale automatic preprocessing.
With the popularization of the internet, smartphones and social media, information spreads quickly and easily, which implies a larger flow of information in the world; however, the dissemination of fake news is harming society. With this larger flow of information, some people try to disseminate deceptive information and fake news. The automatic detection of fake news is a challenging task because obtaining good results requires dealing with linguistic problems, especially for languages that have not yet been comprehensively studied, although some techniques can help when working with text data. The motivation for detecting deceptive information is that people need to know which information is true and trustworthy and which is not. In this work, we present the effect that pre-processing methods such as lemmatization and stemming have on fake news classification; to that end, we designed several classifier models applying different pre-processing techniques. The results show that the pre-processing step is important for obtaining better results, and that stemming and lemmatization are interesting methods that need further study in order to develop techniques focused on the Portuguese language and reach better results.
In this paper, I present three closed-form approximations of the two-sample Pearson Bayes factor. The techniques rely on some classical asymptotic results about gamma functions. These approximations permit simple closed-form calculation of the Pearson Bayes factor in cases where only the summary statistics are available (i.e., the t-score and degrees of freedom).
We study signals that are sparse in the graph spectral domain and develop explicit algorithms to reconstruct the support set as well as partial components from samples on a few vertices of the graph. The number of required samples is independent of the total size of the graph and takes only local properties of the graph into account. Our results rely on an operator-based framework for subspace methods and become effective when the spectral eigenfunctions are zero-free or linearly independent on small sets of the vertices. The latter has recently been addressed using algebraic methods by the first author.
In general insurance, claims are often lower-truncated and right-censored because insurance contracts may involve deductibles and maximal covers. Most classical statistical models are not (directly) suited to model lower-truncated and right-censored claims. A surprisingly flexible family of distributions that can cope with lower-truncated and right-censored claims is the class of MBBEFD distributions, which was originally introduced by Bernegger (1997) for reinsurance pricing but has not gained much attention outside the reinsurance literature. We derive properties of the class of MBBEFD distributions, and we extend it to a bigger family of distribution functions suitable for modeling lower-truncated and right-censored claims. Interestingly, in general insurance, we mainly rely on unimodal skewed densities, whereas the reinsurance literature typically proposes monotonically decreasing densities within the MBBEFD class.
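To make the MBBEFD class mentioned above concrete, here is a small sketch of its standard two-parameter form on the normalized loss scale $[0, 1]$, as it is usually stated following Bernegger (1997). It covers only the generic parameter case $b > 0$, $b \neq 1$, $g > 1$, $bg \neq 1$, not the limiting cases and not the extended family proposed in the paper. The function names are illustrative.

```python
import math


def mbbefd_cdf(x, b, g):
    """Distribution function of the normalized loss X in [0, 1].

    Generic case only: b > 0, b != 1, g > 1, b * g != 1.
    The jump at x = 1 is the probability 1/g of a total loss, i.e. of a
    claim that is right-censored at the maximal cover.
    """
    if x < 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return 1.0 - (1.0 - b) / ((g - 1.0) * b ** (1.0 - x) + (1.0 - g * b))


def mbbefd_exposure_curve(x, b, g):
    """Exposure curve G(x) = E[min(X, x)] / E[X] for 0 <= x <= 1 (generic case)."""
    inner = ((g - 1.0) * b + (1.0 - g * b) * b ** x) / (1.0 - b)
    return math.log(inner) / math.log(g * b)


# Usage with illustrative parameters b = 0.1, g = 4 (total-loss probability 1/4).
b_par, g_par = 0.1, 4.0
mass_at_one = 1.0 - mbbefd_cdf(1.0 - 1e-12, b_par, g_par)
half_cover_share = mbbefd_exposure_curve(0.5, b_par, g_par)
```

The point mass at the upper end is what lets the family represent right-censoring at a maximal cover directly, which is the property the abstract relies on.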
To overcome the computational bottleneck of various data perturbation procedures such as the bootstrap and cross-validation, we propose the Generative Multiple-purpose Sampler (GMS), which constructs a generator function to produce solutions of weighted M-estimators from a set of given weights and tuning parameters. The GMS is implemented by a single optimization without having to repeatedly evaluate the minimizers of weighted losses, and is thus capable of significantly reducing the computational time. We demonstrate that the GMS framework enables the implementation of various statistical procedures that would be infeasible in a conventional framework, such as the iterated bootstrap, bootstrapped cross-validation for penalized likelihood, bootstrapped empirical Bayes with nonparametric maximum likelihood, etc. To construct a computationally efficient generator function, we also propose a novel form of neural network called the \emph{weight multiplicative multilayer perceptron} to achieve fast convergence. Our numerical results demonstrate that the new neural network structure enjoys a few orders of magnitude speed advantage in comparison to the conventional one. An R package called GMS is provided, which runs under PyTorch to implement the proposed methods and allows the user to provide a customized loss function to tailor to their own models of interest.
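For context, the conventional procedure that the abstract describes as a bottleneck can be sketched as follows: the bootstrap is written as a weighted M-estimation problem, and the weighted loss is re-minimized once per draw of weights. This is not the GMS itself. The sketch assumes a squared-error loss for a location parameter, chosen because its weighted minimizer has a closed form, and all function names are illustrative.

```python
import random


def weighted_mean(y, w):
    """Minimizer over theta of sum_i w_i * (y_i - theta)**2."""
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)


def multinomial_weights(n, rng):
    """Bootstrap weights: counts from n draws with replacement out of n items."""
    counts = [0] * n
    for _ in range(n):
        counts[rng.randrange(n)] += 1
    return counts


def bootstrap_estimates(y, n_boot, seed=0):
    """Re-solve the weighted M-estimation problem once per bootstrap draw."""
    rng = random.Random(seed)
    n = len(y)
    return [weighted_mean(y, multinomial_weights(n, rng)) for _ in range(n_boot)]


# Usage on a small made-up sample (illustrative numbers only).
sample = [2.0, 3.5, 1.0, 4.0, 2.5, 3.0]
estimates = bootstrap_estimates(sample, n_boot=200, seed=1)
```

With a loss that has no closed-form minimizer, each of the `n_boot` calls would be a separate numerical optimization, which is the repeated cost the abstract says a single trained generator is meant to avoid.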
The aim of this paper is to examine the relationship between a person's handedness (right or left) and their skateboarding stance. Starting from the null hypothesis that there is no relationship, Pearson's chi-squared test with Yates' correction and its p-value will be used to test the hypothesis. The residuals, Cramér's V, and the risk and odds ratios, with their respective confidence intervals, will also be calculated and analyzed to assess the strength of the association.
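The statistics named in the abstract above can be sketched for a 2x2 table with cells `a`, `b` (first row) and `c`, `d` (second row). The counts in the usage example are made up for illustration and are not the paper's data. The sketch assumes that Cramér's V is computed from the uncorrected statistic and that the confidence intervals are the usual large-sample Wald intervals on the log scale; the paper may use different conventions.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Uncorrected Pearson chi-squared statistic for a 2x2 table."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def yates_chi2_2x2(a, b, c, d):
    """Pearson chi-squared statistic with Yates' continuity correction."""
    n = a + b + c + d
    shrunk = max(abs(a * d - b * c) - n / 2.0, 0.0)
    return n * shrunk ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def chi2_pvalue_df1(chi2):
    """Upper-tail p-value of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def cramers_v_2x2(a, b, c, d):
    """Cramer's V for a 2x2 table (assumed here to use the uncorrected statistic)."""
    return math.sqrt(pearson_chi2_2x2(a, b, c, d) / (a + b + c + d))


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with an approximate 95% Wald interval on the log scale."""
    est = (a * d) / (b * c)
    se = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    return est, math.exp(math.log(est) - z * se), math.exp(math.log(est) + z * se)


def risk_ratio_ci(a, b, c, d, z=1.96):
    """Risk ratio of row 1 versus row 2 with an approximate 95% Wald interval."""
    est = (a / (a + b)) / (c / (c + d))
    se = math.sqrt(1.0 / a - 1.0 / (a + b) + 1.0 / c - 1.0 / (c + d))
    return est, math.exp(math.log(est) - z * se), math.exp(math.log(est) + z * se)


# Usage with made-up counts: rows = handedness, columns = stance.
a, b, c, d = 10, 20, 30, 40
chi2_corrected = yates_chi2_2x2(a, b, c, d)
p_value = chi2_pvalue_df1(chi2_corrected)
odds, odds_lo, odds_hi = odds_ratio_ci(a, b, c, d)
risk, risk_lo, risk_hi = risk_ratio_ci(a, b, c, d)
```

A confidence interval for the odds or risk ratio that contains 1 is consistent with failing to reject the null hypothesis of no association.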