The kinetic theory provides a good basis for developing numerical methods for multiscale gas flows covering a wide range of flow regimes. A particular challenge for kinetic schemes is whether they can capture the correct hydrodynamic behaviors of the system in the continuum regime (i.e., when the Knudsen number $\epsilon\ll 1$) without enforcing kinetic scale resolution. At the current stage, the main approach to analyzing this property is the asymptotic preserving (AP) concept, which aims to show whether the kinetic scheme reduces to a solver for the hydrodynamic equations as $\epsilon \to 0$. However, under the AP framework the detailed asymptotic properties of the kinetic scheme are indistinguishable when $\epsilon$ is small but finite. In order to distinguish different characteristics of kinetic schemes, in this paper we introduce the concept of unified preserving (UP), which aims at assessing the asymptotic orders (in terms of $\epsilon$) of a kinetic scheme by employing the modified equation approach and Chapman-Enskog analysis. It is shown that the UP properties of a kinetic scheme generally depend on the spatial/temporal accuracy and closely on the interconnections among the three scales (kinetic scale, numerical scale, and hydrodynamic scale). Specifically, the numerical resolution and specific discretization determine the numerical flow behaviors of the scheme in different regimes, especially in the near continuum limit. As two examples, the UP analysis is applied to the discrete unified gas-kinetic scheme (DUGKS) and a second-order implicit-explicit Runge-Kutta (IMEX-RK) scheme to evaluate their asymptotic behaviors in the continuum limit.
We study the problem of {\sl certification}: given queries to a function $f : \{0,1\}^n \to \{0,1\}$ with certificate complexity $\le k$ and an input $x^\star$, output a size-$k$ certificate for $f$'s value on $x^\star$. This abstractly models a central problem in explainable machine learning, where we think of $f$ as a blackbox model whose predictions we seek to explain. For monotone functions, a classic local search algorithm of Angluin accomplishes this task with $n$ queries, which we show is optimal for local search algorithms. Our main result is a new algorithm for certifying monotone functions with $O(k^8 \log n)$ queries, which comes close to matching the information-theoretic lower bound of $\Omega(k \log n)$. The design and analysis of our algorithm are based on a new connection to threshold phenomena in monotone functions. We further prove exponential-in-$k$ lower bounds when $f$ is non-monotone, and when $f$ is monotone but the algorithm is only given random examples of $f$. These lower bounds show that assumptions on the structure of $f$ and query access to it are both necessary for the polynomial dependence on $k$ that we achieve.
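The local search baseline mentioned above can be sketched as follows. This is a minimal illustration of a greedy coordinate-by-coordinate search for a monotone $f$, not the authors' $O(k^8 \log n)$ algorithm; the function name `certify_monotone` and the example function `f` are hypothetical. In this sketch one extra query is spent reading $f(x^\star)$, so it uses at most $n+1$ queries.

```python
from typing import Callable, List, Set


def certify_monotone(f: Callable[[List[int]], int], x: List[int]) -> Set[int]:
    """Greedy local search for a certificate of f's value on x.

    Assumes f is monotone. Returns a set S of coordinates such that every
    input agreeing with x on S has the same value as f(x).
    """
    y = list(x)
    value = f(y)  # one query to learn the value being certified
    for i in range(len(y)):
        # Only coordinates equal to the value can support it for monotone f:
        # 1-coordinates support f = 1, 0-coordinates support f = 0.
        if y[i] != value:
            continue
        y[i] = 1 - value        # tentatively drop coordinate i
        if f(y) != value:       # the value changed, so i is needed
            y[i] = value        # restore it
    # Coordinates still equal to `value` form the certificate.
    return {i for i in range(len(y)) if y[i] == value}


# Hypothetical monotone example: f(x) = (x0 AND x1) OR x2 on 5 variables.
def f(x: List[int]) -> int:
    return int((x[0] and x[1]) or x[2])


print(certify_monotone(f, [1, 1, 0, 1, 1]))  # certificate for f = 1
print(certify_monotone(f, [1, 0, 0, 1, 1]))  # certificate for f = 0
```

For a monotone $f$ the returned set is a minimal certificate, so its size is at most the certificate complexity $k$; the cost is the linear dependence on $n$ in the number of queries.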
We investigate the numerical implementation of the limiting equation for the phonon transport equation in the small Knudsen number regime. The main contribution is that we derive the limiting equation that achieves second-order convergence, and provide a numerical recipe for computing the Robin coefficients. These coefficients are obtained by solving an auxiliary half-space equation. Numerically, the half-space equation is solved by a spectral method that relies on the even-odd decomposition to eliminate the corner-point singularity. Numerical evidence is presented to justify the second-order asymptotic convergence rate.
We show that the solution to the Hermite-Pad\'{e} type I approximation problem leads in a natural way to a subclass of solutions of the Hirota (discrete Kadomtsev-Petviashvili) system and of its adjoint linear problem. Our result explains the appearance of various ingredients of the integrable systems theory in application to multiple orthogonal polynomials, numerical algorithms, random matrices, and in other branches of mathematical physics and applied mathematics where the Hermite-Pad\'{e} approximation problem is relevant. We also present a geometric algorithm, based on the notion of Desargues maps, for constructing solutions of the problem in the projective space over the field of rational functions. As a byproduct we obtain the corresponding generalization of the Wynn recurrence. We isolate the boundary data of the Hirota system which provide solutions to the Hermite-Pad\'{e} problem, showing that the corresponding reduction lowers the dimensionality of the system. In particular, we obtain certain equations which, in addition to the known ones given by Paszkowski, can be considered as direct analogs of the Frobenius identities. We study the place of the reduced system within the integrability theory, which results in finding a multidimensional (in the sense of number of variables) extension of the discrete-time Toda chain equations.
The design of numerical approximations of the Cahn-Hilliard model preserving the maximum principle is a challenging problem, even more so when additional transport terms are considered. In this work we present a new upwind Discontinuous Galerkin scheme for the convective Cahn-Hilliard model with degenerate mobility which preserves the maximum principle and prevents non-physical spurious oscillations. Furthermore, we show some numerical experiments in agreement with the previous theoretical results. Finally, numerical comparisons with other schemes found in the literature are also carried out.
In this paper, we revisit the problem of Differentially Private Stochastic Convex Optimization (DP-SCO) and provide excess population risk bounds for some special classes of functions that are faster than the previous results for general convex and strongly convex functions. In the first part of the paper, we study the case where the population risk function satisfies the Tsybakov Noise Condition (TNC) with some parameter $\theta>1$. Specifically, we first show that under some mild assumptions on the loss functions, there is an algorithm whose output could achieve an upper bound of $\tilde{O}((\frac{1}{\sqrt{n}}+\frac{\sqrt{d\log \frac{1}{\delta}}}{n\epsilon})^\frac{\theta}{\theta-1})$ for $(\epsilon, \delta)$-DP when $\theta\geq 2$, where $n$ is the sample size and $d$ is the dimension of the space. Then we address the inefficiency issue, improve the upper bounds by $\text{Poly}(\log n)$ factors and extend to the case where $\theta\geq \bar{\theta}>1$ for some known $\bar{\theta}$. Next we show that the excess population risk of population functions satisfying TNC with parameter $\theta\geq 2$ is always lower bounded by $\Omega((\frac{d}{n\epsilon})^\frac{\theta}{\theta-1})$ and $\Omega((\frac{\sqrt{d\log \frac{1}{\delta}}}{n\epsilon})^\frac{\theta}{\theta-1})$ for $\epsilon$-DP and $(\epsilon, \delta)$-DP, respectively. In the second part, we focus on a special case where the population risk function is strongly convex. Unlike the previous studies, here we assume the loss function is {\em non-negative} and {\em the optimal value of population risk is sufficiently small}. With these additional assumptions, we propose a new method whose output could achieve an upper bound of $O(\frac{d\log\frac{1}{\delta}}{n^2\epsilon^2}+\frac{1}{n^{\tau}})$ for any $\tau\geq 1$ in the $(\epsilon,\delta)$-DP model if the sample size $n$ is sufficiently large.
In this paper we consider a linearized variable-time-step two-step backward differentiation formula (BDF2) scheme for solving nonlinear parabolic equations. The scheme is constructed by using the variable time-step BDF2 for the linear term and a Newton linearized method for the nonlinear term in time, combined with a Galerkin finite element method (FEM) in space. We prove the unconditionally optimal error estimate of the proposed scheme under mild restrictions on the ratio of adjacent time-steps, i.e. $0<r_k < r_{\max} \approx 4.8645$, and on the maximum time step. The proof involves the discrete orthogonal convolution (DOC) and discrete complementary convolution (DCC) kernels, and the error splitting approach. In addition, our analysis also shows that the first level solution $u^1$ obtained by BDF1 (i.e. backward Euler scheme) does not cause the loss of global accuracy of second order. Numerical examples are provided to demonstrate our theoretical results.
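The time discretization described above can be illustrated on a scalar model problem. The sketch below is an assumption-laden toy, not the paper's FEM scheme: it replaces the parabolic equation by the ODE $u' = -a u + f(u)$, uses the standard variable-step BDF2 difference quotient with step ratio $r_k = \tau_k/\tau_{k-1}$, linearizes $f(u^k) \approx f(u^{k-1}) + f'(u^{k-1})(u^k - u^{k-1})$, and starts with one BDF1 step. The names `bdf2_linearized` and `alternating_steps`, and the test problem with exact solution $1/(1+t)$, are hypothetical choices made for illustration.

```python
from typing import Callable, List


def bdf2_linearized(a: float,
                    f: Callable[[float], float],
                    df: Callable[[float], float],
                    u0: float,
                    steps: List[float]) -> float:
    """Integrate u' = -a*u + f(u) over the given time steps.

    Variable-step BDF2 for the time derivative, Newton-type linearization
    of f about the previous level, BDF1 for the first level u^1.
    Returns the approximation at the final time.
    """
    # First level by BDF1 (backward Euler) with linearized f.
    tau = steps[0]
    fp = df(u0)
    u_prev2 = u0
    u_prev = (u0 / tau + f(u0) - fp * u0) / (1.0 / tau + a - fp)

    for k in range(1, len(steps)):
        tau = steps[k]
        r = steps[k] / steps[k - 1]           # adjacent step ratio r_k
        b0 = (1.0 + 2.0 * r) / (tau * (1.0 + r))
        b1 = r * r / (tau * (1.0 + r))
        fp = df(u_prev)
        # b0*(u^k - u^{k-1}) - b1*(u^{k-1} - u^{k-2})
        #   = -a*u^k + f(u^{k-1}) + f'(u^{k-1})*(u^k - u^{k-1})
        rhs = b0 * u_prev + b1 * (u_prev - u_prev2) + f(u_prev) - fp * u_prev
        u_new = rhs / (b0 + a - fp)
        u_prev2, u_prev = u_prev, u_new
    return u_prev


def alternating_steps(n_pairs: int, T: float) -> List[float]:
    """Nonuniform mesh with steps tau, 2*tau, tau, ... (ratios 2 and 1/2)."""
    tau = T / (3.0 * n_pairs)
    return [tau, 2.0 * tau] * n_pairs


# Toy problem: u' = -u + (u - u^2) = -u^2, u(0) = 1, exact u(t) = 1/(1+t).
def final_error(n_pairs: int, T: float = 1.0) -> float:
    u_T = bdf2_linearized(1.0, lambda u: u - u * u, lambda u: 1.0 - 2.0 * u,
                          1.0, alternating_steps(n_pairs, T))
    return abs(u_T - 1.0 / (1.0 + T))


err_coarse = final_error(50)    # 100 steps
err_fine = final_error(100)     # 200 steps
print(err_coarse, err_fine)
```

Halving all steps should reduce the final-time error by roughly a factor of four if the sketch is second-order accurate, consistent with the claim that the BDF1 starting step does not degrade the global order. The step ratios used here (2 and 1/2) lie well inside the stated restriction $r_k < 4.8645$.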
A strict bramble of a graph $G$ is a collection of pairwise-intersecting connected subgraphs of $G.$ The order of a strict bramble ${\cal B}$ is the minimum size of a set of vertices intersecting all sets of ${\cal B}.$ The strict bramble number of $G,$ denoted by ${\sf sbn}(G),$ is the maximum order of a strict bramble in $G.$ The strict bramble number of $G$ can be seen as a way to extend the notion of acyclicity, departing from the fact that (non-empty) acyclic graphs are exactly the graphs where every strict bramble has order one. We initiate the study of this graph parameter by providing three alternative definitions, each revealing different structural characteristics. The first is a min-max theorem asserting that ${\sf sbn}(G)$ is equal to the minimum $k$ for which $G$ is a minor of the lexicographic product of a tree and a clique on $k$ vertices (also known as the lexicographic tree product number). The second characterization is in terms of a new variant of a tree decomposition called lenient tree decomposition. We prove that ${\sf sbn}(G)$ is equal to the minimum $k$ for which there exists a lenient tree decomposition of $G$ of width at most $k.$ The third characterization is in terms of extremal graphs. For this, we define, for each $k,$ the concept of a $k$-domino-tree and we prove that every edge-maximal graph of strict bramble number at most $k$ is a $k$-domino-tree. We also identify three graphs that constitute the minor-obstruction set of the class of graphs with strict bramble number at most two. We complete our results by proving that, given some $G$ and $k,$ deciding whether ${\sf sbn}(G) \leq k$ is an ${\sf NP}$-complete problem.
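The definitions at the start of the abstract above can be checked by brute force on small graphs. The following sketch is illustrative only and is exponential in the number of vertices; it assumes a non-empty collection in which each subgraph is given by its vertex set, with connectivity tested in the induced subgraph. The function names are hypothetical.

```python
from itertools import combinations
from typing import Dict, Iterable, List, Set


def is_connected(vertices: Iterable[int], adj: Dict[int, Set[int]]) -> bool:
    """Check that the subgraph induced by `vertices` is non-empty and connected."""
    vs = set(vertices)
    if not vs:
        return False
    start = next(iter(vs))
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for w in adj[v]:
            if w in vs and w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == vs


def is_strict_bramble(sets: List[Set[int]], adj: Dict[int, Set[int]]) -> bool:
    """Pairwise-intersecting collection of connected vertex sets."""
    return (all(is_connected(s, adj) for s in sets)
            and all(a & b for a, b in combinations(sets, 2)))


def bramble_order(sets: List[Set[int]]) -> int:
    """Minimum size of a vertex set intersecting every member (brute force)."""
    universe = sorted(set().union(*sets))
    for size in range(1, len(universe) + 1):
        for hit in combinations(universe, size):
            if all(set(hit) & s for s in sets):
                return size
    raise ValueError("collection must be non-empty")


# Triangle: the three edges form a strict bramble of order 2.
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
edges = [{0, 1}, {1, 2}, {0, 2}]
print(is_strict_bramble(edges, triangle), bramble_order(edges))

# Path 0-1-2 (a tree): its two edges form a strict bramble of order 1.
path = {0: {1}, 1: {0, 2}, 2: {1}}
path_edges = [{0, 1}, {1, 2}]
print(is_strict_bramble(path_edges, path), bramble_order(path_edges))
```

The two examples match the statement that acyclic graphs admit only strict brambles of order one, while a single cycle already supports order two.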
In this short note, we reify the connection between work on the storage capacity problem in wide two-layer treelike neural networks and the rapidly-growing body of literature on kernel limits of wide neural networks. Concretely, we observe that the "effective order parameter" studied in the statistical mechanics literature is exactly equivalent to the infinite-width Neural Network Gaussian Process Kernel. This correspondence connects the expressivity and trainability of wide two-layer neural networks.
Counterfactual explanations are usually generated through heuristics that are sensitive to the search's initial conditions. The absence of guarantees of performance and robustness hinders trustworthiness. In this paper, we take a disciplined approach towards counterfactual explanations for tree ensembles. We advocate for a model-based search aiming at "optimal" explanations and propose efficient mixed-integer programming approaches. We show that isolation forests can be modeled within our framework to focus the search on plausible explanations with a low outlier score. We provide comprehensive coverage of additional constraints that model important objectives, heterogeneous data types, structural constraints on the feature space, along with resource and actionability restrictions. Our experimental analyses demonstrate that the proposed search approach requires a computational effort that is orders of magnitude smaller than previous mathematical programming algorithms. It scales up to large data sets and tree ensembles, where it provides, within seconds, systematic explanations grounded on well-defined models solved to optimality.
Residual networks (ResNets) have displayed impressive results in pattern recognition and, recently, have garnered considerable theoretical interest due to a perceived link with neural ordinary differential equations (neural ODEs). This link relies on the convergence of network weights to a smooth function as the number of layers increases. We investigate the properties of weights trained by stochastic gradient descent and their scaling with network depth through detailed numerical experiments. We observe the existence of scaling regimes markedly different from those assumed in neural ODE literature. Depending on certain features of the network architecture, such as the smoothness of the activation function, one may obtain an alternative ODE limit, a stochastic differential equation or neither of these. These findings cast doubts on the validity of the neural ODE model as an adequate asymptotic description of deep ResNets and point to an alternative class of differential equations as a better description of the deep network limit.