In this note, we study a concatenation of quasi-Monte Carlo and plain Monte Carlo rules for high-dimensional numerical integration in weighted function spaces. In particular, we consider approximating the integral of periodic functions defined over the $s$-dimensional unit cube by using rank-1 lattice point sets only for the first $d\, (<s)$ coordinates and random points for the remaining $s-d$ coordinates. We prove that, by exploiting the decay of the weights of the function space, such a concatenated quadrature rule achieves almost the optimal order of the mean squared worst-case error as long as $d$ scales at most linearly with the number of points. This result may be useful for numerical integration in extremely high dimensions, such as in applications to partial differential equations with random coefficients, for which even the standard fast component-by-component algorithm is considered computationally expensive.
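A minimal sketch of such a concatenated rule (the generating vector and the test integrand below are placeholders for illustration, not taken from the paper): the first $d$ coordinates of each node come from a rank-1 lattice and the remaining $s-d$ coordinates are i.i.d. uniform.

    import numpy as np

    def concatenated_rule(f, s, d, n, z, rng=np.random.default_rng(0)):
        i = np.arange(n)[:, None]                    # point indices 0, ..., n-1
        lattice = np.mod(i * z[None, :d] / n, 1.0)   # rank-1 lattice points in [0,1)^d
        random_part = rng.random((n, s - d))         # plain Monte Carlo points in [0,1)^(s-d)
        points = np.hstack([lattice, random_part])   # concatenated nodes in [0,1)^s
        return np.mean(f(points))                    # equal-weight quadrature

    # Smooth periodic test integrand with exact integral 1 (placeholder choice).
    f = lambda x: np.prod(1.0 + 0.1 * np.sin(2 * np.pi * x), axis=1)
    z = np.array([1, 182667, 469891, 498753, 110745])  # placeholder generating vector
    print(concatenated_rule(f, s=20, d=5, n=2**13, z=z))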
In this paper we study the convex-concave saddle-point problem $\min_x\max_y f(x) + y^\top\mathbf{A} x - g(y)$, where $f(x)$ and $g(y)$ are smooth convex functions. We propose an Accelerated Primal-Dual Gradient Method for solving this problem which (i) achieves the optimal linear convergence rate in the strongly-convex-strongly-concave regime, matching the lower complexity bound (Zhang et al., 2021), and (ii) achieves an accelerated linear convergence rate when only one of the functions $f(x)$ and $g(y)$ is strongly convex, or even when neither is. Finally, we obtain a linearly convergent algorithm for the general smooth convex-concave saddle-point problem $\min_x\max_y F(x,y)$ without requiring strong convexity or strong concavity.
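For orientation only, the following is a plain gradient descent-ascent baseline on the bilinearly coupled objective above, not the accelerated primal-dual method of the paper; the quadratic choices of $f$ and $g$ and the step size are assumptions for illustration.

    import numpy as np

    rng = np.random.default_rng(0)
    n, m = 5, 4
    A = rng.standard_normal((m, n))
    grad_f = lambda x: x          # f(x) = 0.5 ||x||^2 (strongly convex)
    grad_g = lambda y: y          # g(y) = 0.5 ||y||^2 (strongly convex)

    x, y = np.ones(n), np.ones(m)
    eta = 0.1
    for _ in range(2000):
        gx = grad_f(x) + A.T @ y  # partial gradient in x of f(x) + y^T A x - g(y)
        gy = A @ x - grad_g(y)    # partial gradient in y
        x, y = x - eta * gx, y + eta * gy

    print(np.linalg.norm(x), np.linalg.norm(y))  # both approach 0, the unique saddle point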
The Feferman-Vaught theorem provides a way of evaluating a first-order sentence $\varphi$ on a disjoint union of structures by producing a decomposition of $\varphi$ into sentences that can be evaluated on the individual structures, with the results of these evaluations combined using a propositional formula. This decomposition can in general be non-elementarily larger than $\varphi$. We show that for first-order sentences in prenex normal form with a fixed number of quantifier alternations, such a decomposition, itself with the same number of quantifier alternations, can be obtained in time elementary in the size of $\varphi$. We obtain this result as a consequence of a more general decomposition theorem that we prove for a family of infinitary logics we define. We extend these results by considering binary operations other than disjoint union, in particular sum-like operations such as ordered sum and NLC-sum that are definable using quantifier-free interpretations.
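As a minimal illustration (a standard special case, not taken from the paper): for a quantifier-free formula $\varphi(x)$ with one free variable over a relational vocabulary,
\[
\mathfrak{A} \sqcup \mathfrak{B} \models \exists x\,\varphi(x)
\quad\Longleftrightarrow\quad
\big(\mathfrak{A} \models \exists x\,\varphi(x)\big) \;\vee\; \big(\mathfrak{B} \models \exists x\,\varphi(x)\big),
\]
so the decomposition consists of the single sentence $\exists x\,\varphi(x)$, evaluated on each component and combined by the propositional formula $p_{\mathfrak{A}} \vee p_{\mathfrak{B}}$; it is for sentences with many quantifier alternations that the decomposition can, in general, become non-elementarily larger.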
We introduce LAM, a subsystem of IMALL2 whose restricted additive rules, called linear additive rules, manage duplication linearly. LAM is presented as the type assignment system for a calculus endowed with copy constructors, which handle substitution in a linear fashion. Unlike the standard additive rules, the linear additive rules do not affect the complexity of term reduction: typable terms of LAM enjoy linear strong normalization. Moreover, we prove a mildly weakened version of cut elimination for this system that takes a cubic number of steps. Finally, we define a sound translation from proofs of LAM into linear lambda terms of IMLL2, and we study its complexity.
We propose a simple and efficient clustering method for high-dimensional data with a large number of clusters. Our algorithm achieves high performance by evaluating the distances of data points to only a subset of the cluster centres. Our contribution is substantially more efficient than k-means, as it does not require an all-to-all comparison of data points and cluster centres. We show that the optimal solutions of our approximation coincide with those of the exact formulation. However, our approach is considerably more efficient at extracting these clusters than the state of the art. We compare our approximation with exact k-means and alternative approximation approaches on a series of standardised clustering tasks. For the evaluation, we consider the algorithmic complexity, including the number of operations to convergence, and the stability of the results.
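A generic sketch of the subset idea (not the paper's exact algorithm; the candidate-selection rule below is an assumption): each point is compared with its current centre plus a small random subset of other centres, instead of with all $k$ centres as in exact Lloyd iterations.

    import numpy as np

    def subset_kmeans(X, k, n_iter=20, subset_size=8, rng=np.random.default_rng(0)):
        n = X.shape[0]
        centres = X[rng.choice(n, size=k, replace=False)]         # initialise centres from data
        assign = rng.integers(0, k, size=n)                       # arbitrary initial assignment
        for _ in range(n_iter):
            for i in range(n):
                cand = np.unique(np.append(
                    rng.choice(k, size=subset_size, replace=False), assign[i]))
                d = np.linalg.norm(X[i] - centres[cand], axis=1)  # distances to candidates only
                assign[i] = cand[np.argmin(d)]
            for c in range(k):                                    # standard centre update
                members = X[assign == c]
                if len(members) > 0:
                    centres[c] = members.mean(axis=0)
        return assign, centres

    X = np.random.default_rng(1).standard_normal((1000, 16))
    assign, centres = subset_kmeans(X, k=64)
    print(np.bincount(assign, minlength=64)[:10])                 # cluster sizes (first ten)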
In this paper, we study deep neural networks (DNNs) for solving high-dimensional evolution equations with oscillatory solutions. In contrast to deep least-squares methods that treat the time and space variables simultaneously, we propose a deep adaptive basis Galerkin (DABG) method, which employs a spectral-Galerkin method with a tensor-product basis in the time variable, well suited to oscillatory solutions, and deep neural networks in the high-dimensional space variables. The proposed method leads to a linear system of differential equations whose unknowns are DNNs trained via a loss function. We establish a posteriori estimates of the solution error, which is bounded by the minimal loss function plus a term $O(N^{-m})$, where $N$ is the number of basis functions and $m$ characterizes the regularity of the equation, and show that if the true solution is a Barron-type function, the error bound converges to zero as $M=O(N^p)$ tends to infinity, where $M$ is the width of the networks used and $p$ is a positive constant. Numerical examples, including high-dimensional linear parabolic and hyperbolic equations and the nonlinear Allen-Cahn equation, demonstrate that the proposed DABG method outperforms existing DNN-based approaches.
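Schematically, and with notation that is only illustrative (not taken from the paper), the separated ansatz underlying such a method can be written as
\[
u(x,t) \;\approx\; \sum_{n=1}^{N} u_n(x;\theta_n)\,\psi_n(t),
\]
where $\{\psi_n\}_{n=1}^{N}$ is the spectral (tensor-product) basis in time and each coefficient $u_n(\cdot;\theta_n)$ is a DNN of width $M$ in the space variables; testing the evolution equation against the time basis then yields a linear system of differential equations for the unknown networks, whose residual defines the training loss.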
This paper deals with a special type of Lyapunov function, namely the solution of Zubov's equation. Such a function can be used to characterize the domain of attraction for systems of ordinary differential equations. We derive and prove an integral-form solution of Zubov's equation. For its numerical computation, we develop two data-driven methods: one based on the integration of an augmented system of differential equations, and the other based on deep learning. The former is effective for systems with a relatively low state-space dimension, while the latter is developed for high-dimensional problems. The deep learning method is applied to a New England 10-generator power system model. We prove that a neural network approximation exists for the Lyapunov function of power systems such that the approximation error is a cubic polynomial of the number of generators. The error convergence rate as a function of $n$, the number of neurons, is also proved.
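A minimal sketch of the augmented-integration idea (the particular integral form, the forcing term $x^2$, and the scalar example are assumptions for illustration, not the paper's construction): integrate the state together with an accumulated integral and map it through $1-e^{-v}$, which is $0$ at the equilibrium and approaches $1$ towards the boundary of the domain of attraction.

    import numpy as np
    from scipy.integrate import solve_ivp

    def f(x):
        return -x + x**3            # scalar example: origin is stable, domain of attraction (-1, 1)

    def augmented(t, z):
        x, v = z
        return [f(x), x**2]         # v accumulates the integral of x^2 along the trajectory

    def V(x0, T=50.0):
        sol = solve_ivp(augmented, (0.0, T), [x0, 0.0], rtol=1e-8, atol=1e-10)
        return 1.0 - np.exp(-sol.y[1, -1])

    for x0 in [0.0, 0.5, 0.9, 0.99]:
        print(x0, V(x0))            # values increase towards 1 as x0 approaches the boundary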
Let $Q_{n}^{r}$ be the graph with vertex set $\{-1,1\}^{n}$ in which two vertices are joined if their Hamming distance is at most $r$. The edge-isoperimetric problem for $Q_{n}^{r}$ is the following: for every $(n,r,M)$ with $1\le r\le n$ and $1\le M\le2^{n}$, determine the minimum edge-boundary size of a subset of vertices of $Q_{n}^{r}$ of given size $M$. In this paper, we apply two different approaches to prove bounds for this problem: the first is a linear programming approach and the second a probabilistic one. Our bound derived by the first approach generalizes the tight bound for $M=2^{n-1}$ derived by Kahn, Kalai, and Linial in 1989. Moreover, our bound is also tight for $M=2^{n-2}$ and $r\le\frac{n}{2}-1$. Our bounds derived by the second approach are expressed in terms of the \emph{noise stability}; they are shown to be asymptotically tight as $n\to\infty$ when $r=2\lfloor\frac{\beta n}{2}\rfloor+1$ and $M=\lfloor\alpha2^{n}\rfloor$ for fixed $\alpha,\beta\in(0,1)$, and tight up to a factor of $2$ when $r=2\lfloor\frac{\beta n}{2}\rfloor$ and $M=\lfloor\alpha2^{n}\rfloor$. In fact, the edge-isoperimetric problem is equivalent to a ball-noise stability problem, a variant of the traditional (i.i.d.) noise stability problem, so our results can also be interpreted as bounds for the ball-noise stability problem.
Working constructively, we study continuous directed complete posets (dcpos) and the Scott topology. Our two primary novelties are a notion of intrinsic apartness and a notion of sharp elements. Being apart is a positive formulation of being unequal, similar to how inhabitedness is a positive formulation of nonemptiness. To exemplify sharpness, we note that a lower real is sharp if and only if it is located. Our first main result is that for a large class of continuous dcpos, the Bridges-Vita apartness topology and the Scott topology coincide. Although we cannot expect a tight or cotransitive apartness on nontrivial dcpos, we prove that the intrinsic apartness is both tight and cotransitive when restricted to the sharp elements of a continuous dcpo. These include the strongly maximal elements, as studied by Smyth and Heckmann. We develop the theory of strongly maximal elements, highlighting its connections to sharpness and the Lawson topology. Finally, we illustrate the intrinsic apartness, sharpness, and strong maximality by considering several natural examples of continuous dcpos: the Cantor and Baire domains, the partial Dedekind reals, and the lower reals.
The deep-learning-based least-squares method has shown successful results in solving high-dimensional nonlinear partial differential equations (PDEs). However, this method usually converges slowly. To speed up its convergence, an active-learning-based sampling algorithm is proposed in this paper. This algorithm actively chooses the most informative training samples from a probability density function based on residual errors, in order to facilitate error reduction. In particular, points with larger residual errors have a higher chance of being selected for training. The algorithm imitates the human learning process: learners are likely to spend more time repeatedly studying their mistakes than tasks they have already completed correctly. A series of numerical results demonstrates the effectiveness of our active-learning-based sampling in high dimensions in speeding up the convergence of the deep-learning-based least-squares method.
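A minimal sketch of the sampling step (the residual function below is a stand-in; in practice it would be the PDE residual of the current network approximation): training points are drawn from a density proportional to the squared residual, so regions with larger errors are revisited more often.

    import numpy as np

    def residual(x):
        # placeholder residual: pretend errors concentrate near the corner x = (1, ..., 1)
        return np.exp(-5.0 * np.linalg.norm(x - 1.0, axis=1))

    def active_sample(n_candidates, n_train, dim, rng=np.random.default_rng(0)):
        candidates = rng.random((n_candidates, dim))   # uniform candidate pool in [0,1]^dim
        weights = residual(candidates) ** 2
        p = weights / weights.sum()                    # density proportional to residual^2
        idx = rng.choice(n_candidates, size=n_train, replace=False, p=p)
        return candidates[idx]

    batch = active_sample(n_candidates=10000, n_train=256, dim=10)
    print(batch.shape, batch.mean())                   # selected points skew towards x = 1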
Importance sampling is one of the most widely used variance reduction strategies in Monte Carlo rendering. In this paper, we propose a novel importance sampling technique that uses a neural network to learn how to sample from a desired density represented by a set of samples. Our approach treats an existing Monte Carlo rendering algorithm as a black box. During a scene-dependent training phase, we learn to generate samples with a desired density in the primary sample space of the rendering algorithm using maximum likelihood estimation. We leverage a recent neural network architecture designed to represent real-valued non-volume preserving ('Real NVP') transformations in high-dimensional spaces. We use Real NVP to non-linearly warp primary sample space and obtain the desired densities. In addition, Real NVP efficiently computes the determinant of the Jacobian of the warp, which is required to implement the change of integration variables implied by the warp. A main advantage of our approach is that it is agnostic to the underlying light transport effects and can be combined with many existing rendering techniques by treating them as a black box. We show that our approach leads to effective variance reduction in several practical scenarios.
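A minimal affine-coupling sketch in the spirit of Real NVP (the tiny fixed 'networks' for scale and translation are placeholders, and the mapping back into the unit hypercube of primary sample space, e.g. via a sigmoid, is omitted); it shows the two properties used here: a cheap invertible warp and an analytic log-determinant of its Jacobian.

    import numpy as np

    rng = np.random.default_rng(0)
    d = 4                                          # dimension of primary sample space
    W_s, W_t = 0.1 * rng.standard_normal((2, d // 2, d // 2))

    def coupling_forward(x):
        x1, x2 = x[:, :d // 2], x[:, d // 2:]      # split coordinates into two halves
        s, t = np.tanh(x1 @ W_s), x1 @ W_t         # scale and translation conditioned on x1
        y2 = x2 * np.exp(s) + t                    # affine warp of the second half
        log_det = s.sum(axis=1)                    # log |det J| = sum of scale outputs
        return np.hstack([x1, y2]), log_det

    x = rng.random((5, d))                         # a few points in [0,1]^d
    y, log_det = coupling_forward(x)
    print(y.shape, log_det)                        # warped points and per-point log-Jacobians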