In 1986, Flagg and Friedman \cite{ff} gave an elegant alternative proof of the faithfulness of the G\"{o}del translation $(\cdot)^\Box$ of Heyting arithmetic $\bf HA$ into Shapiro's epistemic arithmetic $\bf EA$. In \S 2, we shall prove the faithfulness of $(\cdot)^\Box$ without using stability, by introducing another translation from an epistemic system to the corresponding intuitionistic system, which we shall call \it the modified Rasiowa-Sikorski translation\rm . The new translation thus simplifies Flagg and Friedman's original proof. In \S 3, we shall give some applications of the modified translation to the disjunction property ($\mathsf{DP}$) and the numerical existence property ($\mathsf{NEP}$) of Heyting arithmetic. In \S 4, we shall show that epistemic Markov's rule $\mathsf{EMR}$ in $\bf EA$ is proved via $\bf HA$. Hence $\bf EA$ $\vdash \mathsf{EMR}$ and $\bf HA$ $\vdash \mathsf{MR}$ are equivalent. In \S 5, we shall give some relations among the translations treated in the previous sections. In \S 6, we shall give an alternative proof of Glivenko's theorem. In \S 7, we shall propose several (modal-)epistemic versions of Markov's rule for Horsten's modal-epistemic arithmetic $\bf MEA$. As in \S 4, we shall also study some meta-implications among those versions of Markov's rule in $\bf MEA$ and the one in $\bf HA$. Friedman and Sheard gave a modal analogue $\mathsf{FS}$ (i.e. Theorem in \cite{fs}) of Friedman's theorem $\mathsf{F}$ (i.e. Theorem 1 in \cite{friedman}): \it Any recursively enumerable extension of $\bf HA$ which has $\mathsf{DP}$ also has $\mathsf{NEP}$\rm . In \S 8, we shall propose a modified version of the \it Fundamental Conjecture \rm $\mathsf{FC}$ ($\mathsf{FS} \Longrightarrow \mathsf{F}$) proposed by the author, namely the $\Delta_0$-Fundamental Conjecture. In \S 9, we shall give some discussion and philosophical remarks.
Let $\Omega = [0,1]^d$ be the unit cube in $\mathbb{R}^d$. We study the problem of how efficiently, in terms of the number of parameters, deep neural networks with the ReLU activation function can approximate functions in the Sobolev spaces $W^s(L_q(\Omega))$ and Besov spaces $B^s_r(L_q(\Omega))$, with error measured in the $L_p(\Omega)$ norm. This problem is important when studying the application of neural networks in a variety of fields, including scientific computing and signal processing, and has previously been solved only when $p=q=\infty$. Our contribution is to provide a complete solution for all $1\leq p,q\leq \infty$ and $s > 0$ for which the corresponding Sobolev or Besov space compactly embeds into $L_p$. The key technical tool is a novel bit-extraction technique which gives an optimal encoding of sparse vectors. This enables us to obtain sharp upper bounds in the non-linear regime where $p > q$. We also provide a novel method for deriving $L_p$-approximation lower bounds based upon VC-dimension when $p < \infty$. Our results show that very deep ReLU networks significantly outperform classical methods of approximation in terms of the number of parameters, but that this comes at the cost of parameters which are not encodable.
The \emph{local edge-length ratio} of a planar straight-line drawing $\Gamma$ is the largest ratio between the lengths of any pair of edges of $\Gamma$ that share a common vertex. The \emph{global edge-length ratio} of $\Gamma$ is the largest ratio between the lengths of any pair of edges of $\Gamma$. The local (global) edge-length ratio of a planar graph is the infimum over all local (global) edge-length ratios of its planar straight-line drawings. We show that there exist planar graphs with $n$ vertices whose local edge-length ratio is $\Omega(\sqrt{n})$. We then show a technique to establish upper bounds on the global (and hence local) edge-length ratio of planar graphs and~apply~it to Halin graphs and to other families of graphs having outerplanarity two.
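The two ratios defined above can be computed directly for a given drawing. The following is a minimal sketch, not taken from the paper: the function name \texttt{edge\_length\_ratios} and the input format (a coordinate dictionary and an edge list) are assumptions made for illustration, and the code does not check that the drawing is planar.

```python
import math
from itertools import combinations


def edge_length_ratios(pos, edges):
    """Return (local, global) edge-length ratios of a straight-line drawing.

    pos   : dict mapping each vertex to its (x, y) coordinates (assumed format)
    edges : list of (u, v) vertex pairs; at least one edge, none of length zero
    Planarity of the drawing is NOT verified here.
    """
    length = {e: math.dist(pos[e[0]], pos[e[1]]) for e in edges}

    def ratio(e, f):
        # Ratio of the longer edge to the shorter one, so it is always >= 1.
        longer, shorter = max(length[e], length[f]), min(length[e], length[f])
        return longer / shorter

    pairs = list(combinations(edges, 2))
    # Global ratio: maximum over all pairs of edges.
    global_ratio = max((ratio(e, f) for e, f in pairs), default=1.0)
    # Local ratio: maximum over pairs of edges sharing a common vertex.
    local_ratio = max(
        (ratio(e, f) for e, f in pairs if set(e) & set(f)), default=1.0
    )
    return local_ratio, global_ratio


# A path drawn on a line with edge lengths 1, 2 and 4.
pos = {0: (0, 0), 1: (1, 0), 2: (3, 0), 3: (7, 0)}
edges = [(0, 1), (1, 2), (2, 3)]
local_ratio, global_ratio = edge_length_ratios(pos, edges)
```

In this example adjacent edges differ by a factor of $2$ while the shortest and longest edges differ by a factor of $4$, so the local ratio of the drawing is $2$ and its global ratio is $4$.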
For the numerical solution of the cubic nonlinear Schr\"{o}dinger equation with periodic boundary conditions, a pseudospectral method in space combined with a filtered Lie splitting scheme in time is considered. This scheme is shown to converge even for initial data with very low regularity. In particular, for data in $H^s(\mathbb T^2)$, where $s>0$, convergence of order $\mathcal O(\tau^{s/2}+N^{-s})$ is proved in $L^2$. Here $\tau$ denotes the time step size and $N$ the number of Fourier modes considered. The proof of this result is carried out in an abstract framework of discrete Bourgain spaces; the final convergence result, however, is given in $L^2$. The stated convergence behavior is illustrated by several numerical examples.
The characterization of the solution set for a class of algebraic Riccati inequalities is studied. This class arises in the passivity analysis of linear time-invariant control systems. Eigenvalue perturbation theory for the Hamiltonian matrix associated with the Riccati inequality is used to analyze the extremal points of the solution set.
We provide numerical bounds on the Crouzeix ratio for KLS matrices $A$ which have a line segment on the boundary of the numerical range. The Crouzeix ratio is the supremum over all polynomials $p$ of the spectral norm of $p(A)$ divided by the maximum absolute value of $p$ on the numerical range of $A$. Our bounds confirm the conjecture that this ratio is less than or equal to $2$. We also give a precise description of these numerical ranges.
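For reference, the verbal definition above can be written as a formula. The notation is an assumption made here, not fixed by the text: $W(A)$ denotes the numerical range of $A$ and $\|\cdot\|$ the spectral norm. With
\[
W(A) = \{\, x^{*} A x : \|x\|_2 = 1 \,\},
\]
the conjectured bound reads
\[
\sup_{p} \frac{\|p(A)\|}{\max_{z \in W(A)} |p(z)|} \le 2,
\]
where the supremum is taken over polynomials $p$ for which the denominator is nonzero.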
We explore the maximum likelihood degree of a homogeneous polynomial $F$ on a projective variety $X$, $\mathrm{MLD}_F(X)$, which generalizes the concept of Gaussian maximum likelihood degree. We show that $\mathrm{MLD}_F(X)$ is equal to the count of critical points of a rational function on $X$, and give different geometric characterizations of it via topological Euler characteristic, dual varieties, and Chern classes.
We study the convergence of specific inexact alternating projections for two non-convex sets in a Euclidean space. The $\sigma$-quasioptimal metric projection ($\sigma \geq 1$) of a point $x$ onto a set $A$ consists of points in $A$ the distance to which is at most $\sigma$ times larger than the minimal distance $\mathrm{dist}(x,A)$. We prove that quasioptimal alternating projections, when one or both projections are quasioptimal, converge locally and linearly for super-regular sets with transversal intersection. The theory is motivated by the successful application of alternating projections to low-rank matrix and tensor approximation. We focus on two problems -- nonnegative low-rank approximation and low-rank approximation in the maximum norm -- and develop fast alternating-projection algorithms for matrices and tensor trains based on cross approximation and acceleration techniques. The numerical experiments confirm that the proposed methods are efficient and suggest that they can be used to regularise various low-rank computational routines.
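The definition of the $\sigma$-quasioptimal metric projection can be illustrated on a toy case. The sketch below assumes, purely for illustration, that $A$ is a finite set of points given as coordinate tuples; the sets of interest in the text (low-rank matrices and tensors) are not finite, and the function name \texttt{quasioptimal\_projection} is hypothetical.

```python
import math


def quasioptimal_projection(x, A, sigma=1.0):
    """Return all points of the finite set A within sigma * dist(x, A) of x.

    x     : point as a tuple of coordinates
    A     : non-empty list of points (finite set; an illustrative assumption)
    sigma : quasioptimality constant, sigma >= 1
    """
    if sigma < 1:
        raise ValueError("sigma must be at least 1")
    # Minimal distance dist(x, A) from x to the set A.
    d_min = min(math.dist(x, a) for a in A)
    # Keep every point whose distance is at most sigma times the minimum.
    return [a for a in A if math.dist(x, a) <= sigma * d_min]


A = [(0, 0), (3, 0), (10, 0)]
exact = quasioptimal_projection((1, 0), A)             # sigma = 1: nearest points only
relaxed = quasioptimal_projection((1, 0), A, sigma=2)  # also admits points twice as far
```

With $\sigma = 1$ the function returns only the nearest points, which is the usual metric projection; larger $\sigma$ enlarges the set of admissible answers, which is what allows an inexact but cheaper projection routine to be used in its place.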
We present two new positive results for reliable computation using formulas over physical alphabets of size $q > 2$. First, we show that for logical alphabets of size $\ell = q$ the threshold for denoising using gates subject to $q$-ary symmetric noise with error probability $\varepsilon$ is strictly larger than that for Boolean computation, and that denoising is possible as long as signals remain distinguishable, i.e. $\varepsilon < (q - 1) / q$, in the limit of large fan-in $k \rightarrow \infty$. We also determine the point at which generalized majority gates with bounded fan-in fail, and show in particular that reliable computation is possible for $\varepsilon < (q - 1) / (q (q + 1))$ in the case of $q$ prime and fan-in $k = 3$. Secondly, we provide an example where $\ell < q$, showing that reliable Boolean computation can be performed using $2$-input ternary logic gates subject to symmetric ternary noise of strength $\varepsilon < 1/6$ by using the additional alphabet element for error signaling.
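The distinguishability condition can be made explicit under the standard convention for $q$-ary symmetric noise, which is assumed here rather than taken from the text: a symbol is left unchanged with probability $1 - \varepsilon$ and is replaced by each of the other $q - 1$ symbols with probability $\varepsilon / (q - 1)$. The correct symbol is then strictly the most likely output exactly when
\[
1 - \varepsilon > \frac{\varepsilon}{q - 1}
\iff (q - 1)(1 - \varepsilon) > \varepsilon
\iff \varepsilon < \frac{q - 1}{q}.
\]
For $q = 2$ this is the familiar condition $\varepsilon < 1/2$, and for $q = 3$ it becomes $\varepsilon < 2/3$.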
The classical Zarankiewicz problem asks for the maximum number of edges in a bipartite graph on $n$ vertices which does not contain the complete bipartite graph $K_{t,t}$. In one of the cornerstones of extremal graph theory, K\H{o}v\'ari, S\'os and Tur\'an proved an upper bound of $O(n^{2-\frac{1}{t}})$. In a celebrated result, Fox et al. obtained an improved bound of $O(n^{2-\frac{1}{d}})$ for graphs of VC-dimension $d$ (where $d<t$). Basit, Chernikov, Starchenko, Tao and Tran improved the bound for the case of semilinear graphs. At SODA'23, Chan and Har-Peled further improved Basit et al.'s bounds and presented (quasi-)linear upper bounds for several classes of geometrically-defined incidence graphs, including a bound of $O(n \log \log n)$ for the incidence graph of points and pseudo-discs in the plane. In this paper we present a new approach to Zarankiewicz's problem, via $\epsilon$-$t$-nets, a recently introduced generalization of the classical notion of $\epsilon$-nets. We show that the existence of `small'-sized $\epsilon$-$t$-nets implies upper bounds for Zarankiewicz's problem. Using the new approach, we obtain a sharp bound of $O(n)$ for the intersection graph of two families of pseudo-discs, thus both improving and generalizing the result of Chan and Har-Peled from incidence graphs to intersection graphs. We also obtain a short proof of the $O(n^{2-\frac{1}{d}})$ bound of Fox et al., and show improved bounds for several other classes of geometric intersection graphs, including a sharp $O(n\frac{\log n}{\log \log n})$ bound for the intersection graph of two families of axis-parallel rectangles.
Due to their inherent capability for semantic alignment of aspects and their context words, attention mechanisms and Convolutional Neural Networks (CNNs) are widely applied to aspect-based sentiment classification. However, these models lack a mechanism to account for relevant syntactical constraints and long-range word dependencies, and hence may mistakenly recognize syntactically irrelevant contextual words as clues for judging aspect sentiment. To tackle this problem, we propose to build a Graph Convolutional Network (GCN) over the dependency tree of a sentence to exploit syntactical information and word dependencies. Based on it, a novel aspect-specific sentiment classification framework is proposed. Experiments on three benchmarking collections illustrate that our proposed model has comparable effectiveness to a range of state-of-the-art models, and further demonstrate that both syntactical information and long-range word dependencies are properly captured by the graph convolution structure.