The width of a well partial ordering (wpo) is the ordinal rank of the set of its antichains ordered by inclusion. We compute the width of wpos obtained as cartesian products of finitely many well-orderings.
We consider the Cauchy problem for the Helmholtz equation with a domain in R^d, d>2 with N cylindrical outlets to infinity with bounded inclusions in R^{d-1}. Cauchy data are prescribed on the boundary of the bounded domains and the aim is to find solution on the unbounded part of the boundary. In 1989, Kozlov and Maz'ya proposed an alternating iterative method for solving Cauchy problems associated with elliptic,self-adjoint and positive-definite operators in bounded domains. Different variants of this method for solving Cauchy problems associated with Helmholtz-type operators exists. We consider the variant proposed by Mpinganzima et al. for bounded domains and derive the necessary conditions for the convergence of the procedure in unbounded domains. For the numerical implementation, a finite difference method is used to solve the problem in a simple rectangular domain in R^2 that represent a truncated infinite strip. The numerical results shows that by appropriate truncation of the domain and with appropriate choice of the Robin parameters, the Robin-Dirichlet alternating iterative procedure is convergent.
The Infinitesimal Calculus explores mainly two measurements: the instantaneous rates of change and the accumulation of quantities. This work shows that scientists, engineers, mathematicians, and teachers increasingly apply another change measurements tool: functions' local trends. While it seems to be a special case of the rate (via the derivative sign), this work proposes a separate and favorable mathematical framework for the trend, called Semi-discrete Calculus.
In this paper we get error bounds for fully discrete approximations of infinite horizon problems via the dynamic programming approach. It is well known that considering a time discretization with a positive step size $h$ an error bound of size $h$ can be proved for the difference between the value function (viscosity solution of the Hamilton-Jacobi-Bellman equation corresponding to the infinite horizon) and the value function of the discrete time problem. However, including also a spatial discretization based on elements of size $k$ an error bound of size $O(k/h)$ can be found in the literature for the error between the value functions of the continuous problem and the fully discrete problem. In this paper we revise the error bound of the fully discrete method and prove, under similar assumptions to those of the time discrete case, that the error of the fully discrete case is in fact $O(h+k)$ which gives first order in time and space for the method. This error bound matches the numerical experiments of many papers in the literature in which the behaviour $1/h$ from the bound $O(k/h)$ have not been observed.
This paper establishes the asymptotic independence between the quadratic form and maximum of a sequence of independent random variables. Based on this theoretical result, we find the asymptotic joint distribution for the quadratic form and maximum, which can be applied into the high-dimensional testing problems. By combining the sum-type test and the max-type test, we propose the Fisher's combination tests for the one-sample mean test and two-sample mean test. Under this novel general framework, several strong assumptions in existing literature have been relaxed. Monte Carlo simulation has been done which shows that our proposed tests are strongly robust to both sparse and dense data.
The similarity between a pair of time series, i.e., sequences of indexed values in time order, is often estimated by the dynamic time warping (DTW) distance, instead of any in the well-studied family of measures including the longest common subsequence (LCS) length and the edit distance. Although it may seem as if the DTW and the LCS(-like) measures are essentially different, we reveal that the DTW distance can be represented by the longest increasing subsequence (LIS) length of a sequence of integers, which is the LCS length between the integer sequence and itself sorted. For a given pair of time series of length $n$ such that the dissimilarity between any elements is an integer between zero and $c$, we propose an integer sequence that represents any substring-substring DTW distance as its band-substring LIS length. The length of the produced integer sequence is $O(c n^2)$, which can be translated to $O(n^2)$ for constant dissimilarity functions. To demonstrate that techniques developed under the LCS(-like) measures are directly applicable to analysis of time series via our reduction of DTW to LIS, we present time-efficient algorithms for DTW-related problems utilizing the semi-local sequence comparison technique developed for LCS-related problems.
Bearing fault identification and analysis is an important research area in the field of machinery fault diagnosis. Aiming at the common faults of rolling bearings, we propose a data-driven diagnostic algorithm based on the characteristics of bearing vibrations called multi-size kernel based adaptive convolutional neural network (MSKACNN). Using raw bearing vibration signals as the inputs, MSKACNN provides vibration feature learning and signal classification capabilities to identify and analyze bearing faults. Ball mixing is a ball bearing production quality problem that is difficult to identify using traditional frequency domain analysis methods since it requires high frequency resolutions of the measurement signals and results in a long analyzing time. The proposed MSKACNN is shown to improve the efficiency and accuracy of ball mixing diagnosis. To further demonstrate the effectiveness of MSKACNN in bearing fault identification, a bearing vibration data acquisition system was developed, and vibration signal acquisition was performed on rolling bearings under five different fault conditions including ball mixing. The resulting datasets were used to analyze the performance of our proposed model. To validate the adaptive ability of MSKACNN, fault test data from the Case Western Reserve University Bearing Data Center were also used. Test results show that MSKACNN can identify the different bearing conditions with high accuracy with high generalization ability. We presented an implementation of the MSKACNN as a lightweight module for a real-time bearing fault diagnosis system that is suitable for production.
Park et al. [TCS 2020] observed that the similarity between two (numerical) strings can be captured by the Cartesian trees: The Cartesian tree of a string is a binary tree recursively constructed by picking up the smallest value of the string as the root of the tree. Two strings of equal length are said to Cartesian-tree match if their Cartesian trees are isomorphic. Park et al. [TCS 2020] introduced the following Cartesian tree substring matching (CTMStr) problem: Given a text string $T$ of length $n$ and a pattern string of length $m$, find every consecutive substring $S = T[i..j]$ of a text string $T$ such that $S$ and $P$ Cartesian-tree match. They showed how to solve this problem in $\tilde{O}(n+m)$ time. In this paper, we introduce the Cartesian tree subsequence matching (CTMSeq) problem, that asks to find every minimal substring $S = T[i..j]$ of $T$ such that $S$ contains a subsequence $S'$ which Cartesian-tree matches $P$. We prove that the CTMSeq problem can be solved efficiently, in $O(m n p(n))$ time, where $p(n)$ denotes the update/query time for dynamic predecessor queries. By using a suitable dynamic predecessor data structure, we obtain $O(mn \log \log n)$-time and $O(n \log m)$-space solution for CTMSeq. This contrasts CTMSeq with closely related order-preserving subsequence matching (OPMSeq) which was shown to be NP-hard by Bose et al. [IPL 1998].
White noise is a fundamental and fairly well understood stochastic process that conforms the conceptual basis for many other processes, as well as for the modeling of time series. Here we push a fresh perspective toward white noise that, grounded on combinatorial considerations, contributes to give new interesting insights both for modelling and theoretical purposes. To this aim, we incorporate the ordinal pattern analysis approach which allows us to abstract a time series as a sequence of patterns and their associated permutations, and introduce a simple functional over permutations that partitions them into classes encoding their level of asymmetry. We compute the exact probability mass function (p.m.f.) of this functional over the symmetric group of degree $n$, thus providing the description for the case of an infinite white noise realization. This p.m.f. can be conveniently approximated by a continuous probability density from an exponential family, the Gaussian, hence providing natural sufficient statistics that render a convenient and simple statistical analysis through ordinal patterns. Such analysis is exemplified on experimental data for the spatial increments from tracks of gold nanoparticles in 3D diffusion.
We recall some of the history of the information-theoretic approach to deriving core results in probability theory and indicate parts of the recent resurgence of interest in this area with current progress along several interesting directions. Then we give a new information-theoretic proof of a finite version of de Finetti's classical representation theorem for finite-valued random variables. We derive an upper bound on the relative entropy between the distribution of the first $k$ in a sequence of $n$ exchangeable random variables, and an appropriate mixture over product distributions. The mixing measure is characterised as the law of the empirical measure of the original sequence, and de Finetti's result is recovered as a corollary. The proof is nicely motivated by the Gibbs conditioning principle in connection with statistical mechanics, and it follows along an appealing sequence of steps. The technical estimates required for these steps are obtained via the use of a collection of combinatorial tools known within information theory as `the method of types.'
With the rapid increase of large-scale, real-world datasets, it becomes critical to address the problem of long-tailed data distribution (i.e., a few classes account for most of the data, while most classes are under-represented). Existing solutions typically adopt class re-balancing strategies such as re-sampling and re-weighting based on the number of observations for each class. In this work, we argue that as the number of samples increases, the additional benefit of a newly added data point will diminish. We introduce a novel theoretical framework to measure data overlap by associating with each sample a small neighboring region rather than a single point. The effective number of samples is defined as the volume of samples and can be calculated by a simple formula $(1-\beta^{n})/(1-\beta)$, where $n$ is the number of samples and $\beta \in [0,1)$ is a hyperparameter. We design a re-weighting scheme that uses the effective number of samples for each class to re-balance the loss, thereby yielding a class-balanced loss. Comprehensive experiments are conducted on artificially induced long-tailed CIFAR datasets and large-scale datasets including ImageNet and iNaturalist. Our results show that when trained with the proposed class-balanced loss, the network is able to achieve significant performance gains on long-tailed datasets.