A number of complexity measures for Boolean functions have previously been introduced. These include (1) sensitivity, (2) block sensitivity, (3) witness complexity, (4) subcube partition complexity and (5) algorithmic complexity. Each of these is concerned with "worst-case" inputs. It has been shown that there is "asymptotic separation" between these complexity measures, and very recently, due to the work of Huang, it has been established that they are all "polynomially related". In this paper, we study the notion of distributional complexity, where the input bits are independent and one considers all of the above notions in expectation. We obtain a number of results concerning distributional complexity measures, among others addressing the above concepts of "asymptotic separation" and being "polynomially related" in this context. We introduce a new distributional complexity measure, local witness complexity, which only makes sense in the distributional context, and we also study a new version of algorithmic complexity which involves partial information. Many interesting examples are presented, including some related to percolation. The latter connects a number of developments in percolation theory from the last two decades with the study of complexity measures in theoretical computer science.
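To make the distributional viewpoint concrete, the following brute-force sketch (our own illustration, not code from the paper) computes the expected sensitivity of a Boolean function under a product measure; the function name `expected_sensitivity` and its interface are illustrative choices.

```python
import itertools
import math

def expected_sensitivity(f, n, p=0.5):
    """Distributional sensitivity of f: {0,1}^n -> {0,1} under the
    product measure with P(bit = 1) = p: the mean, over a random
    input x, of the number of coordinates whose flip changes f(x)."""
    total = 0.0
    for x in itertools.product([0, 1], repeat=n):
        prob = math.prod(p if b else 1 - p for b in x)
        flips = sum(f(x[:i] + (1 - x[i],) + x[i + 1:]) != f(x)
                    for i in range(n))
        total += prob * flips
    return total

maj3 = lambda x: int(sum(x) >= 2)      # majority of three bits
print(expected_sensitivity(maj3, 3))   # 1.5 under the uniform measure
```

For comparison, the worst-case sensitivity of 3-bit majority is 2, while its expectation under the uniform measure is 1.5.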
In various scientific fields, researchers are interested in exploring the relationship between some response variable Y and a vector of covariates X. In order to make use of a specific model for the dependence structure, it first has to be checked whether the conditional density function of Y given X fits into a given parametric family. We propose a consistent bootstrap-based goodness-of-fit test for this purpose. The test statistic traces the difference between a nonparametric and a semi-parametric estimate of the marginal distribution function of Y. As its asymptotic null distribution is not distribution-free, a parametric bootstrap method is used to determine the critical value. A simulation study shows that, in some cases, the new method is more sensitive to deviations from the parametric model than other tests found in the literature. We also apply our method to real-world datasets.
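As a rough sketch of the procedure, under our own simplifying assumption of a homoscedastic Gaussian linear model as the parametric family (which need not match the paper's setting), the statistic below is the Kolmogorov distance between the empirical CDF of Y and the semi-parametric estimate $n^{-1}\sum_i F_{\hat\theta}(y \mid X_i)$, with the critical value obtained by a parametric bootstrap:

```python
import numpy as np
from scipy import stats

def gof_stat(X, y):
    # Fit the working parametric family: Y | X ~ N(b0 + X'b, s^2).
    n = len(y)
    B = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(B, y, rcond=None)
    s = (y - B @ beta).std(ddof=B.shape[1])
    grid = np.sort(y)
    ecdf = np.arange(1, n + 1) / n                 # nonparametric estimate
    semi = stats.norm.cdf((grid[:, None] - (B @ beta)[None, :]) / s).mean(axis=1)
    return np.max(np.abs(ecdf - semi)), beta, s, B # Kolmogorov-type distance

def bootstrap_pvalue(X, y, n_boot=500, rng=None):
    # Parametric bootstrap: resample Y from the fitted conditional model,
    # refit, and recompute the statistic to approximate its null law.
    rng = np.random.default_rng(rng)
    T_obs, beta, s, B = gof_stat(X, y)
    T_boot = [gof_stat(X, B @ beta + rng.normal(0, s, size=len(y)))[0]
              for _ in range(n_boot)]
    return np.mean(np.asarray(T_boot) >= T_obs)
```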
We investigate notions of complete representation by partial functions, where the operations in the signature include antidomain restriction and may include composition, intersection, update, preferential union, domain, antidomain, and set difference. When the signature includes both antidomain restriction and intersection, the join-complete and the meet-complete representations coincide. Otherwise, for the signatures we consider, meet-complete is strictly stronger than join-complete. A necessary condition to be meet-completely representable is that the atoms are separating. For the signatures we consider, this condition is sufficient if and only if composition is not in the signature. For each of the signatures we consider, the class of (meet-)completely representable algebras is not axiomatisable by any existential-universal-existential first-order theory. For 14 expressively distinct signatures, we show, by giving an explicit representation, that the (meet-)completely representable algebras form a basic elementary class, axiomatisable by a universal-existential-universal first-order sentence. The signatures we axiomatise are those containing antidomain restriction and any of intersection, update, and preferential union and also those containing antidomain restriction, composition, and intersection and any of update, preferential union, domain, and antidomain.
We develop the novel method of artificial barriers for scalar stochastic differential equations (SDEs) and use it to construct boundary-preserving numerical schemes for strong approximations of scalar SDEs, possibly with non-globally Lipschitz drift and diffusion coefficients, whose state-space is either bounded or half-bounded. The idea of artificial barriers is to augment the SDE with artificial barriers placed outside the state-space, so that the solution process is unchanged, and then apply a boundary-preserving numerical scheme to the resulting reflected SDE (RSDE). This enables us to construct boundary-preserving numerical schemes with the same strong convergence rate as the numerical scheme for the corresponding RSDE. Based on the method of artificial barriers, we construct two boundary-preserving schemes that we call the Artificial Barrier Euler-Maruyama (ABEM) scheme and the Artificial Barrier Euler-Peano (ABEP) scheme. We provide numerical experiments for the ABEM scheme, and the numerical results agree with the obtained theoretical results.
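A minimal sketch of what an ABEM-type step might look like, under our own reading: for a solution living in the open interval (lo, hi), reflecting barriers at lo and hi lie outside the state-space and do not change the solution process, while the reflected Euler-Maruyama update keeps every iterate in [lo, hi]. The function name `abem_paths` and the Wright-Fisher-type example are illustrative, not the paper's exact scheme.

```python
import numpy as np

def abem_paths(mu, sigma, x0, T, n_steps, lo, hi, n_paths=1000, rng=None):
    """Euler-Maruyama with reflection at the artificial barriers lo, hi:
    a plain EM step followed by folding any overshoot back into [lo, hi]."""
    rng = np.random.default_rng(rng)
    dt = T / n_steps
    x = np.full(n_paths, float(x0))
    for _ in range(n_steps):
        dw = rng.normal(0.0, np.sqrt(dt), size=n_paths)
        y = x + mu(x) * dt + sigma(x) * dw   # plain EM step
        y = np.where(y < lo, 2 * lo - y, y)  # reflect at the lower barrier
        y = np.where(y > hi, 2 * hi - y, y)  # reflect at the upper barrier
        x = y                                # one reflection suffices for small dt
    return x

# Illustrative example: a Wright-Fisher-type diffusion on [0, 1].
paths = abem_paths(mu=lambda x: 0.5 - x,
                   sigma=lambda x: np.sqrt(np.maximum(x * (1 - x), 0.0)),
                   x0=0.3, T=1.0, n_steps=200, lo=0.0, hi=1.0)
```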
Least trimmed squares (LTS) is a reasonable formulation of robust regression, but it suffers from high computational cost due to the nonconvexity and nonsmoothness of its objective function. The most frequently used FAST-LTS algorithm is particularly slow when a sparsity-inducing penalty such as the $\ell_1$ norm is added. This paper proposes a computationally inexpensive algorithm for the sparse LTS, based on the proximal gradient method with a reformulation technique. The proposed method is equipped with theoretical convergence guarantees preferable to those of existing methods. Numerical experiments show that our method efficiently yields small objective values.
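One plausible shape of the resulting iteration, sketched under our own assumptions rather than the paper's exact reformulation: at each step the h smallest squared residuals define the smooth trimmed-squares part, a gradient step is taken on it, and the $\ell_1$ penalty is handled by its proximal map, soft-thresholding.

```python
import numpy as np

def soft_threshold(z, t):
    # Proximal map of t * ||.||_1.
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def sparse_lts_pg(X, y, h, lam, step, n_iter=500):
    """Illustrative proximal-gradient iteration for the sparse LTS
    objective: mean of the h smallest squared residuals + lam*||beta||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        r = y - X @ beta
        keep = np.argsort(r ** 2)[:h]    # current untrimmed observations
        grad = -X[keep].T @ r[keep] / h  # gradient of the trimmed loss
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta
```

A fixed step size is used for simplicity; in practice one would choose it from a Lipschitz bound on the smooth part or by line search.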
Modeling the complex relationships between multiple categorical response variables as a function of predictors is a fundamental task in the analysis of categorical data. However, existing methods can be difficult to interpret and may lack flexibility. To address these challenges, we introduce a penalized likelihood method for multivariate categorical response regression that relies on a novel subspace decomposition to parameterize interpretable association structures. Our approach models the relationships between categorical responses by identifying mutual, joint, and conditionally independent associations, which yields a linear problem within a tensor product space. We establish theoretical guarantees for our estimator, including error bounds in high-dimensional settings, and demonstrate the method's interpretability and prediction accuracy through comprehensive simulation studies.
Log-symmetric distributions are useful for modeling data that exhibit high skewness and have found applications in various fields. Using a recent characterization of log-symmetric distributions, we propose a goodness-of-fit test for log symmetry. The asymptotic distributions of the test statistics under both the null and alternative distributions are obtained. As the normal-based test is difficult to implement, we also propose a jackknife empirical likelihood (JEL) ratio test for log symmetry. We conduct a Monte Carlo simulation to evaluate the performance of the JEL ratio test. Finally, we illustrate our methodology using different data sets.
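The JEL machinery itself is standard; the sketch below computes a JEL ratio from jackknife pseudo-values (which, in the paper's setting, would come from the characterization-based statistic), solving for the Lagrange multiplier by root-finding. Under the null, $-2\log R$ is asymptotically chi-square with one degree of freedom.

```python
import numpy as np
from scipy.optimize import brentq

def jel_log_ratio(pseudo, theta0=0.0):
    """-2 log of the jackknife empirical likelihood ratio for testing
    that the mean of the jackknife pseudo-values equals theta0."""
    v = np.asarray(pseudo, dtype=float) - theta0
    if v.min() >= 0 or v.max() <= 0:
        return np.inf                    # theta0 outside the convex hull
    # Solve sum_i v_i / (1 + lam * v_i) = 0 on the interval where all
    # weights 1 + lam * v_i stay positive; the map is strictly decreasing.
    eps = 1e-10
    lo = (-1 + eps) / v.max()
    hi = (-1 + eps) / v.min()
    lam = brentq(lambda l: np.sum(v / (1 + l * v)), lo, hi)
    return 2 * np.sum(np.log1p(lam * v))
```

The test rejects log symmetry at level alpha when the returned value exceeds the corresponding chi-square(1) quantile.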
We propose a fast scheme for approximating the Mittag-Leffler function by an efficient sum-of-exponentials (SOE), and apply the scheme to the viscoelastic model of wave propagation, with mixed finite element methods for the spatial discretization and the Newmark-beta scheme for the second-order temporal derivative. Compared with the traditional L1 scheme for the fractional derivative, our fast scheme reduces the memory complexity from $\mathcal O(N_sN)$ to $\mathcal O(N_sN_{exp})$ and the computation complexity from $\mathcal O(N_sN^2)$ to $\mathcal O(N_sN_{exp}N)$, where $N$ denotes the total number of temporal grid points, $N_{exp}$ is the number of exponentials in the SOE, and $N_s$ represents the complexity of memory and computation related to the spatial discretization. Numerical experiments are provided to verify the theoretical results.
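The complexity reduction comes from the one-step recurrence that an SOE kernel admits for the history term. A sketch, with the weights and exponents standing in for the paper's SOE approximation of the Mittag-Leffler kernel:

```python
import numpy as np

def soe_history(f, dt, weights, exponents):
    """Evaluate C_n = sum_{m<n} K((n-m)*dt) * f[m] * dt for a kernel
    K(t) ~ sum_j weights[j] * exp(-exponents[j] * t): each mode carries
    a running integral updated in O(1) per step, so memory is O(N_exp)
    and total work is O(N_exp * N), versus O(N) and O(N^2) for a direct
    (L1-type) evaluation of the full history sum."""
    w = np.asarray(weights, dtype=float)
    decay = np.exp(-np.asarray(exponents, dtype=float) * dt)
    H = np.zeros_like(w)                 # one running integral per exponential
    out = np.empty(len(f))
    for n, fn in enumerate(f):
        out[n] = w @ H                   # history convolution at step n
        H = decay * (H + fn * dt)        # absorb the new sample into each mode
    return out
```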
An ADER-DG numerical method with a local DG predictor for solving DAE systems has been developed, based on the formulation of ADER-DG methods using a local DG predictor for solving ODE and PDE systems. The basis functions were chosen in the form of Lagrange interpolation polynomials with nodal points at the roots of the Radau polynomials, which differs from the classical formulations of the ADER-DG method, where it is customary to use the roots of Legendre polynomials. It was shown that the use of this basis leads to A-stability and L1-stability when the DAE solver is used as an ODE solver. The ADER-DG numerical method allows one to obtain a highly accurate numerical solution even on very coarse grids, with a step greater than the main characteristic scale of variation of the solution. The local discrete time solution can be used as a numerical solution of the DAE system between grid nodes, thereby providing subgrid resolution even in the case of very coarse grids. Classical test examples were solved with the developed ADER-DG numerical method. With increasing index of the DAE system, a decrease in the empirical convergence orders p is observed. An unexpected result was obtained in the numerical solution of the stiff DAE system: the empirical convergence orders of the numerical solution obtained with the developed method turned out to be significantly higher than the values expected for this method in the case of stiff problems. It turns out that Lagrange interpolation polynomials with nodal points at the roots of the Radau polynomials are much better suited for solving stiff problems. Estimates showed that the computational costs of the ADER-DG method are approximately comparable to those of the implicit Runge-Kutta methods used to solve DAE systems. Methods to reduce the computational costs of the ADER-DG method were also proposed.
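For concreteness, here is a short sketch (our own illustration) of the nodal points and the predictor basis: the right Radau nodes on [0, 1] are the roots of $P_{s-1} - P_s$ on [-1, 1] (which include the right endpoint) mapped affinely, and the basis consists of the Lagrange polynomials at these nodes.

```python
import numpy as np
from numpy.polynomial import legendre

def radau_nodes(s):
    """Right Radau nodes on [0, 1] (the Radau IIA collocation points):
    roots of P_{s-1}(x) - P_s(x) on [-1, 1], mapped to [0, 1]."""
    c = np.zeros(s + 1)
    c[s - 1], c[s] = 1.0, -1.0   # Legendre coefficients of P_{s-1} - P_s
    return np.sort((legendre.legroots(c) + 1) / 2)

def lagrange_basis(nodes, t):
    """Values of the Lagrange interpolation polynomials with the given
    nodal points, evaluated at the points t."""
    t = np.atleast_1d(np.asarray(t, dtype=float))
    vals = np.ones((len(nodes), len(t)))
    for i, xi in enumerate(nodes):
        for j, xj in enumerate(nodes):
            if j != i:
                vals[i] *= (t - xj) / (xi - xj)
    return vals

print(radau_nodes(2))   # [1/3, 1]: the classical two-stage Radau IIA nodes
```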
Uncertainty around predictions from a model due to the finite size of the development sample has traditionally been approached using classical inferential techniques. The finite size of the sample results in a discrepancy between the final model and the correct model that maps covariates to predicted risks. From a decision-theoretic perspective, this discrepancy might affect subsequent treatment decisions, and thus is associated with utility loss. From this perspective, procuring more development data is associated with an expected gain in the utility of using the model. In this work, we define the Expected Value of Sample Information (EVSI) as the expected gain in clinical utility, quantified in net benefit (NB) terms in net true-positive units, from procuring a further development sample of a given size. We propose a bootstrap-based algorithm for EVSI computations, and show its feasibility and face validity in a case study. Decision-theoretic metrics can complement classical inferential methods when designing studies that are aimed at developing risk prediction models.
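A rough sketch of a bootstrap EVSI computation, under simplifying assumptions of our own (a logistic model, and the observed sample treated as the population for validation); the paper's actual algorithm may estimate the expected NB differently.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def net_benefit(y, p_hat, t):
    # Net benefit at risk threshold t, in net true-positive units.
    treat = p_hat >= t
    tp = np.mean((y == 1) & treat)
    fp = np.mean((y == 0) & treat)
    return tp - fp * t / (1 - t)

def evsi_bootstrap(X, y, n_future, t, n_boot=200, rng=None):
    """Expected gain in NB from developing the model on a sample larger
    by n_future, estimated by bootstrapping model development at the
    current and at the enlarged sample size."""
    rng = np.random.default_rng(rng)
    n = len(y)
    def expected_nb(m):
        nbs = []
        while len(nbs) < n_boot:
            idx = rng.integers(0, n, size=m)
            if len(np.unique(y[idx])) < 2:   # need both classes to fit
                continue
            model = LogisticRegression().fit(X[idx], y[idx])
            nbs.append(net_benefit(y, model.predict_proba(X)[:, 1], t))
        return np.mean(nbs)
    return expected_nb(n + n_future) - expected_nb(n)
```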
To solve many problems on graphs, graph traversals are used, the usual variants of which are depth-first search and breadth-first search. A graph traversal successively visits all vertices of the graph that belong to one connected component. Breadth-first search is the usual choice when constructing efficient algorithms for finding the connected components of a graph. Simple-iteration methods for solving systems of linear equations with modified graph adjacency matrices and properly specified right-hand sides can be considered graph traversal algorithms. These traversal algorithms are, generally speaking, equivalent neither to depth-first search nor to breadth-first search. An example of such a traversal algorithm is the one associated with the Gauss-Seidel method. For an arbitrary connected graph, this algorithm requires no more iterations to visit all vertices than breadth-first search requires, and for a large number of problem instances it requires fewer.
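A sketch of such an iteration-as-traversal, with our own illustrative update on a 0/1 adjacency matrix: the Jacobi-type sweep advances the reached set by exactly one BFS layer per iteration, whereas the Gauss-Seidel variant updates entries in place within a sweep and can therefore propagate several hops at once, never needing more sweeps and often needing fewer.

```python
import numpy as np

def reach_jacobi(A, source):
    """Jacobi-type simple iteration: after k sweeps the support of x is
    exactly the set of vertices within graph distance k of the source,
    i.e. the BFS layers."""
    x = np.zeros(A.shape[0])
    x[source] = 1.0
    sweeps = 0
    while True:
        x_new = np.minimum(A @ x + x, 1.0)  # grow the reached set by one hop
        if np.array_equal(x_new, x):
            return np.flatnonzero(x), sweeps
        x, sweeps = x_new, sweeps + 1

def reach_gauss_seidel(A, source):
    """Gauss-Seidel variant: within one sweep, vertex i already sees the
    updated values of vertices 0..i-1, so a single sweep can advance the
    reached set by several hops, depending on the vertex ordering."""
    n = A.shape[0]
    x = np.zeros(n)
    x[source] = 1.0
    sweeps = 0
    while True:
        changed = False
        for i in range(n):
            v = min(A[i] @ x + x[i], 1.0)
            if v != x[i]:
                x[i], changed = v, True
        sweeps += 1
        if not changed:
            return np.flatnonzero(x), sweeps
```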