In this paper, we propose a non-parametric score to evaluate the quality of the solution produced by an iterative algorithm for Independent Component Analysis (ICA) with arbitrary Gaussian noise. The novelty of this score stems from the fact that it assumes only a finite second moment of the data and uses the characteristic function to evaluate the quality of the estimated mixing matrix without any knowledge of the parameters of the noise distribution. We also provide a new characteristic function-based contrast function for ICA and propose a fixed point iteration to optimize the corresponding objective function. Finally, we propose a theoretical framework to obtain sufficient conditions for the local and global optima of a family of contrast functions for ICA. This framework uses quasi-orthogonalization inherently, and our results extend the classical analysis of cumulant-based objective functions to noisy ICA. We demonstrate the efficacy of our algorithms via experimental results on simulated datasets.
We propose and analyse boundary-preserving schemes for the strong approximation of some scalar SDEs with non-globally Lipschitz drift and diffusion coefficients whose state-space is bounded. The schemes consist of a Lamperti transform followed by a Lie--Trotter splitting. We prove $L^{p}(\Omega)$-convergence of order $1$, for every $p \geq 1$, of the schemes and exploit the Lamperti transform to confine the numerical approximations to the state-space of the considered SDE. We provide numerical experiments that confirm the theoretical results and compare the proposed Lamperti-splitting schemes to other numerical schemes for SDEs.
In this paper we consider a superlinear one-dimensional elliptic boundary value problem that generalizes the one studied by Moore and Nehari in [43]. Specifically, we deal with piecewise-constant weight functions in front of the nonlinearity with an arbitrary number $\kappa\geq 1$ of vanishing regions. We study, from an analytic and numerical point of view, the number of positive solutions, depending on the value of a parameter $\lambda$ and on $\kappa$. Our main results are twofold. On the one hand, we study analytically the behavior of the solutions, as $\lambda\downarrow-\infty$, in the regions where the weight vanishes. Our result leads us to conjecture the existence of $2^{\kappa+1}-1$ solutions for sufficiently negative $\lambda$. On the other hand, we support such a conjecture with the results of numerical simulations which also shed light on the structure of the global bifurcation diagrams in $\lambda$ and the profiles of positive solutions. Finally, we give additional numerical results suggesting that the same high multiplicity result holds true for a much larger class of weights, also arbitrarily close to situations where there is uniqueness of positive solutions.
In this paper, we consider the generation and utilization of helper data for physical unclonable functions (PUFs) that provide real-valued readout symbols. Compared to classical binary PUFs, more entropy can be extracted from each basic building block (PUF node), resulting in longer keys/fingerprints and/or higher reliability. To this end, a coded modulation and signal shaping scheme that matches the (approximately) Gaussian distribution of the readout has to be employed. A new helper data scheme is proposed that works with any type of coded modulation/shaping scheme. Compared to the permutation scheme from the literature, less helper data has to be generated and higher reliability is achieved. Moreover, the recently proposed idea of a two-metric helper data scheme is generalized to coded modulation and a general S-metric scheme. It is shown how extra helper data can be generated to improve decodability. The proposed schemes are assessed by numerical simulations and by evaluation of measurement data. We compare multi-level codes using a new rate design strategy with bit-interleaved coded modulation and trellis shaping with a distribution matcher. By selecting a suitable design, the rate per PUF node that can be reliably extracted can be as high as 2~bit/node.
In this paper, we propose an adaptive approach, based on mesh refinement or parametric enrichment with polynomial degree adaption, for the numerical solution of convection-dominated equations with random input data. The parametric system arising from the stochastic Galerkin approach is discretized in the spatial domain by a symmetric interior penalty Galerkin (SIPG) method with upwinding for the convection term. We derive a residual-based error estimator with contributions from the SIPG discretization error, the (generalized) polynomial chaos discretization error in the stochastic space, and data oscillations. We then show the reliability of the proposed error estimator, that is, that it bounds the energy error from above up to a multiplicative constant. Moreover, to balance the errors stemming from the spatial and stochastic spaces, the truncation error of the Karhunen--Lo\`{e}ve expansion is also considered in the numerical simulations. Finally, several benchmark examples, including a random diffusivity parameter, a random velocity parameter, random diffusivity/velocity parameters, and a random (jump) discontinuous diffusivity parameter, are tested to illustrate the performance of the proposed estimator.
In the error estimation of finite element solutions to the Poisson equation, we usually impose the shape regularity assumption on the meshes to be used. In this paper, we show that even if the shape regularity condition is violated, the standard error estimate can still be obtained if "bad" elements (elements that violate the shape regularity or maximum angle condition) are covered virtually by "good" simplices. A numerical experiment confirms the theoretical result.
In this work, we present new constructions for topological subsystem codes using semi-regular Euclidean and hyperbolic tessellations. They give us new families of codes, and we also provide a new family of codes obtained through an already existing construction, due to Sarvepalli and Brown. We also prove new results that allow us to obtain the parameters of these new codes.
This paper introduces a novel evaluation framework for Large Language Models (LLMs) such as Llama-2 and Mistral, focusing on the adaptation of Precision and Recall metrics from image generation to text generation. This approach allows for a nuanced assessment of the quality and diversity of generated text without the need for aligned corpora. By conducting a comprehensive evaluation of state-of-the-art language models, the study reveals significant insights into their performance on open-ended generation tasks, which are not adequately captured by traditional benchmarks. The findings highlight a trade-off between the quality and diversity of generated samples, particularly when models are fine-tuned with human feedback. This work extends the toolkit for distribution-based NLP evaluation, offering insights into the practical capabilities and challenges faced by current LLMs in generating diverse and high-quality text.
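To make the quality/diversity trade-off concrete, the following is a minimal sketch of one estimator of Precision and Recall known from the image-generation literature, based on k-nearest-neighbour balls around embedded samples. Whether the framework above uses this particular estimator is an assumption; the function names, the choice `k=3`, and the use of generic embedding vectors (for text, these would come from some sentence or language-model encoder) are all hypothetical.

```python
import numpy as np


def knn_radii(points, k):
    """Distance from each point to its k-th nearest neighbour within the same set."""
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    # After sorting each row, column 0 is the distance to the point itself (0).
    return np.sort(d, axis=1)[:, k]


def coverage(queries, support, radii):
    """Fraction of queries that fall inside at least one k-NN ball of the support set."""
    d = np.linalg.norm(queries[:, None, :] - support[None, :, :], axis=-1)
    return float(np.mean((d <= radii[None, :]).any(axis=1)))


def precision_recall(real, generated, k=3):
    """Precision: generated samples covered by the real support (quality).
    Recall: real samples covered by the generated support (diversity)."""
    precision = coverage(generated, real, knn_radii(real, k))
    recall = coverage(real, generated, knn_radii(generated, k))
    return precision, recall
```

In this sketch, a generator that always emits the same realistic sample scores high precision and low recall, which is the kind of behaviour a single scalar benchmark score would hide.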
In this paper, we proposed applying a meta-learning approach to low-resource automatic speech recognition (ASR). We formulated ASR for different languages as different tasks, and meta-learned the initialization parameters from many pretraining languages to achieve fast adaptation to unseen target languages, via the recently proposed model-agnostic meta-learning algorithm (MAML). We evaluated the proposed approach using six languages as pretraining tasks and four languages as target tasks. Preliminary results showed that the proposed method, MetaASR, significantly outperforms the state-of-the-art multitask pretraining approach on all target languages with different combinations of pretraining languages. In addition, owing to MAML's model-agnostic property, this paper also opens new research directions for applying meta learning to more speech-related applications.
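The idea of meta-learning an initialization can be illustrated on a toy problem. The sketch below applies a one-inner-step MAML update to linear regression tasks, where the second-order term of the meta-gradient has a closed form. It is not the MetaASR setup: the model, the squared loss, the step sizes `alpha` and `beta`, and all function names are assumptions chosen for illustration only.

```python
import numpy as np


def loss_and_grad(theta, X, y):
    """Mean squared error 0.5 * mean((X @ theta - y)**2) and its gradient."""
    r = X @ theta - y
    n = len(y)
    return 0.5 * (r @ r) / n, X.T @ r / n


def maml_meta_grad(theta, task, alpha):
    """Query loss after one inner gradient step, and its exact gradient w.r.t. theta."""
    Xs, ys, Xq, yq = task  # support (adaptation) and query (evaluation) data
    _, g_support = loss_and_grad(theta, Xs, ys)
    theta_adapted = theta - alpha * g_support  # inner-loop adaptation
    q_loss, g_query = loss_and_grad(theta_adapted, Xq, yq)
    # Chain rule: d(theta_adapted)/d(theta) = I - alpha * H, with H the
    # (symmetric) Hessian of the support loss, here Xs^T Xs / n.
    H = Xs.T @ Xs / len(ys)
    return q_loss, g_query - alpha * (H @ g_query)


def meta_train(tasks, dim, alpha=0.1, beta=0.05, steps=200):
    """Gradient descent on the average post-adaptation query loss."""
    theta = np.zeros(dim)
    for _ in range(steps):
        grads = [maml_meta_grad(theta, t, alpha)[1] for t in tasks]
        theta -= beta * np.mean(grads, axis=0)
    return theta
```

The returned `theta` plays the role of the meta-learned initialization: for a new task one would run the same inner step from it on that task's support data.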
In this paper, we propose a novel multi-task learning architecture, which incorporates recent advances in attention mechanisms. Our approach, the Multi-Task Attention Network (MTAN), consists of a single shared network containing a global feature pool, together with task-specific soft-attention modules, which are trainable in an end-to-end manner. These attention modules allow for learning of task-specific features from the global pool, whilst simultaneously allowing for features to be shared across different tasks. The architecture can be built upon any feed-forward neural network, is simple to implement, and is parameter efficient. Experiments on the CityScapes dataset show that our method outperforms several baselines in both single-task and multi-task learning, and is also more robust to the various weighting schemes in the multi-task loss function. We further explore the effectiveness of our method through experiments over a range of task complexities, and show how our method scales well with task complexity compared to baselines.
In this paper, we propose a conceptually simple and geometrically interpretable objective function, i.e. additive margin Softmax (AM-Softmax), for deep face verification. In general, the face verification task can be viewed as a metric learning problem, so learning large-margin face features whose intra-class variation is small and inter-class difference is large is of great importance in order to achieve good performance. Recently, Large-margin Softmax and Angular Softmax have been proposed to incorporate the angular margin in a multiplicative manner. In this work, we introduce a novel additive angular margin for the Softmax loss, which is intuitively appealing and more interpretable than the existing works. We also emphasize and discuss the importance of feature normalization in the paper. Most importantly, our experiments on LFW BLUFR and MegaFace show that our additive margin softmax loss consistently performs better than the current state-of-the-art methods using the same network architecture and training dataset. Our code has also been made available at //github.com/happynear/AMSoftmax
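As an illustration of the additive margin idea, the following is a minimal NumPy sketch of the loss in its commonly cited form: features and class weights are L2-normalized so that logits are cosines, a margin `m` is subtracted from the target-class cosine only, and the result is scaled by `s` before the usual softmax cross-entropy. The function name, the array layout, and the default values `s=30.0` and `m=0.35` are assumptions made for this sketch and are not taken from the text above; the linked repository is the authoritative implementation.

```python
import numpy as np


def am_softmax_loss(features, weights, labels, s=30.0, m=0.35):
    """Mean additive-margin softmax loss.

    features: (N, D) embeddings, weights: (D, C) class weights,
    labels: (N,) integer class indices in [0, C).
    """
    # Feature and weight normalization: logits become cosine similarities.
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=0, keepdims=True)
    logits = f @ w  # (N, C), entries in [-1, 1]

    # Subtract the margin from the target-class cosine only, then scale.
    idx = np.arange(len(labels))
    logits[idx, labels] -= m
    logits *= s

    # Numerically stable log-softmax followed by cross-entropy.
    logits -= logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-log_prob[idx, labels].mean())
```

Because the margin enters additively, setting `m=0` recovers a plain normalized softmax loss, which makes the effect of the margin easy to isolate in experiments.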