For a finite set $\cal F$ of polynomials over a fixed finite prime field of size $p$ that contains all polynomials $x^2 - x$, a Nullstellensatz (NS) proof of the unsolvability of the system $$ f = 0\ ,\ \mbox{ all } f \in {\cal F} $$ in the field is a linear combination $\sum_{f \in {\cal F}} \ h_f \cdot f$ that equals $1$ in the ring of polynomials. The measure of complexity of such a proof is its degree: $\max_f \deg(h_f f)$. We study the problem of establishing degree lower bounds for some {\em extended} NS proof systems: these systems prove the unsolvability of $\cal F$ by proving the unsolvability of a bigger set ${\cal F}\cup {\cal E}$, where the set $\cal E$ may use new variables $\overline r$, contains all polynomials $r^p - r$, and satisfies the following soundness condition: any $0,1$-assignment $\overline a$ to the variables $\overline x$ can be extended by an assignment $\overline b$ to the variables $\overline r$ such that $g(\overline a, \overline b) = 0$ for all $g \in {\cal E}$. We define a notion of pseudo-solutions of $\cal F$ and prove that the existence of pseudo-solutions with suitable parameters implies lower bounds for two extended NS proof systems, ENS and UENS, defined in Buss et al. (1996/97). Further, we give a combinatorial example of $\cal F$ and candidate pseudo-solutions based on the pigeonhole principle.
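For concreteness, here is a minimal toy instance (ours, not from the paper) over $\mathbb{F}_2$: the system ${\cal F} = \{x^2 - x,\ x,\ x - 1\}$ asserts $x = 0$ and $x = 1$ simultaneously and is therefore unsolvable, which is certified by the degree-$1$ linear combination $$0 \cdot (x^2 - x) \;+\; 1 \cdot x \;+\; (-1) \cdot (x - 1) \;=\; 1,$$ i.e.\ $h_{x^2-x} = 0$, $h_x = 1$, $h_{x-1} = -1$.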
In this paper we prove convergence rates for time discretisation schemes for semi-linear stochastic evolution equations with additive or multiplicative Gaussian noise, where the leading operator $A$ is the generator of a strongly continuous semigroup $S$ on a Hilbert space $X$, and the focus is on non-parabolic problems. The main results are optimal bounds for the uniform strong error $$\mathrm{E}_{k}^{\infty} := \Big(\mathbb{E} \sup_{j\in \{0, \ldots, N_k\}} \|U(t_j) - U^j\|^p\Big)^{1/p},$$ where $p \in [2,\infty)$, $U$ is the mild solution, $U^j$ is obtained from a time discretisation scheme, $k$ is the step size, and $N_k = T/k$. The usual schemes, such as the splitting/exponential Euler, implicit Euler, and Crank-Nicolson schemes, are included as special cases. Under conditions on the nonlinearity and the noise we show:
- $\mathrm{E}_{k}^{\infty}\lesssim k \log(T/k)$ (linear equation, additive noise, general $S$);
- $\mathrm{E}_{k}^{\infty}\lesssim \sqrt{k} \log(T/k)$ (nonlinear equation, multiplicative noise, contractive $S$);
- $\mathrm{E}_{k}^{\infty}\lesssim k \log(T/k)$ (nonlinear wave equation, multiplicative noise).
The logarithmic factor can be removed if the splitting scheme is used with a (quasi-)contractive $S$. The obtained bounds coincide with the optimal bounds for SDEs. Most of the existing literature is concerned with bounds for the simpler pointwise strong error $$\mathrm{E}_k:=\bigg(\sup_{j\in \{0,\ldots,N_k\}}\mathbb{E} \|U(t_j) - U^{j}\|^p\bigg)^{1/p}.$$ Applications to Maxwell equations, Schr\"odinger equations, and wave equations are included. For these equations our results improve on and reprove several existing results with a unified method.
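To fix ideas, for the equation $dU = (AU + F(U))\,dt + G(U)\,dW$, the splitting/exponential Euler scheme mentioned above takes the generic form (standard notation, our paraphrase rather than a verbatim statement of the scheme analysed in the paper) $$U^{j+1} = S(k)\Big(U^j + k\,F(U^j) + G(U^j)\,\Delta W_j\Big), \qquad \Delta W_j := W(t_{j+1}) - W(t_j),$$ in which the semigroup is applied exactly and the nonlinearity and the noise are frozen over each step.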
Transition systems are often used to describe the behaviour of software systems. Viewed as graphs, at their most basic level, their vertices correspond to the states of a program and each edge represents a transition between states via the labelled (atomic) action. In this setting, systems are assumed to be consistent, so that at each state formulas are evaluated as either True or False. On the other hand, when a structure of this sort - for example a map where states represent locations, some local properties are known, and labelled transitions represent information available about different routes - is built from multiple sources of information, it is common to find inconsistent or incomplete information about what holds at each state, both at the level of propositional variables and of transitions. This paper brings together Belnap's four values, Dynamic Logic, and hybrid machinery such as nominals and the satisfaction operator, so that reasoning remains possible in the face of contradictory evidence. The proof theory of this new logic is explored by means of a terminating, sound, and complete tableaux system.
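To illustrate the four-valued setting (a minimal Python sketch of Belnap's logic itself, independent of the paper's hybrid dynamic logic and tableaux system), each value can be encoded as a pair recording evidence for and evidence against a statement:

```python
# Belnap's four truth values as (evidence_for, evidence_against) pairs:
# True = (1, 0), False = (0, 1), Both = (1, 1), Neither = (0, 0).
TRUE, FALSE, BOTH, NEITHER = (1, 0), (0, 1), (1, 1), (0, 0)

def neg(v):
    # Negation swaps the evidence for and against.
    return (v[1], v[0])

def conj(v, w):
    # Evidence for a conjunction needs both conjuncts;
    # evidence against needs only one.
    return (v[0] & w[0], v[1] | w[1])

def disj(v, w):
    # Dual to conjunction.
    return (v[0] | w[0], v[1] & w[1])

# Contradictory evidence does not make everything derivable:
assert conj(BOTH, TRUE) == BOTH
assert disj(BOTH, FALSE) == BOTH
assert neg(NEITHER) == NEITHER
```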
This paper proposes a new game search algorithm, PN-MCTS, that combines Monte-Carlo Tree Search (MCTS) and Proof-Number Search (PNS). These two algorithms have been successfully applied to decision making in a range of domains. We define three areas where the additional knowledge provided by the proof and disproof numbers gathered in MCTS trees might be used: final move selection, solving subtrees, and the UCT formula. We test all possible combinations under different time settings, playing against vanilla UCT MCTS on several games: Lines of Action ($7 \times 7$ and $8 \times 8$), MiniShogi, Knightthrough, Awari, and Gomoku. Furthermore, we extend the new algorithm to properly address games with draws, such as Awari, by adding an additional layer of PNS on top of the MCTS tree. The experiments show that PN-MCTS confidently outperforms MCTS in 5 out of 6 game domains (all except Gomoku), achieving win rates of up to 96.2% for Lines of Action.
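As an illustration of the third integration point, here is a hypothetical sketch of a UCT formula biased by proof numbers; the `pn_weight` constant and the exact shape of the bonus are our assumptions, not the paper's tuned combination:

```python
import math
from dataclasses import dataclass

@dataclass
class Node:
    wins: float
    visits: int
    proof_number: int  # 0 means the subtree is a proven win

def uct_pn(child, parent_visits, c=1.4, pn_weight=0.5):
    # Standard UCT exploitation and exploration terms.
    exploit = child.wins / child.visits
    explore = c * math.sqrt(math.log(parent_visits) / child.visits)
    # Illustrative PN bias: a small proof number means few leaves remain
    # to prove the subtree a win, so such children get a larger bonus.
    return exploit + explore + pn_weight / (1 + child.proof_number)

# Usage: select the child maximizing the combined score.
children = [Node(wins=6, visits=10, proof_number=3),
            Node(wins=5, visits=10, proof_number=1)]
best = max(children, key=lambda ch: uct_pn(ch, parent_visits=20))
```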
The generalized coloring numbers of Kierstead and Yang (Order 2003) offer an algorithmically useful characterization of graph classes with bounded expansion. In this work, we consider the hardness and approximability of these parameters. First, we complete the work of Grohe et al. (WG 2015) by showing that computing the weak 2-coloring number is NP-hard. Our approach further establishes that determining whether a graph has weak $r$-coloring number at most $k$ is para-NP-hard when parameterized by $k$, for all $r \geq 2$. We adapt this to determining whether a graph has $r$-coloring number at most $k$ as well, proving para-NP-hardness for all $r \geq 2$. Para-NP-hardness implies that no XP algorithm (runtime $O(n^{f(k)})$) exists for testing whether a generalized coloring number is at most $k$. Moreover, there exists a constant $c$ such that it is NP-hard to approximate the generalized coloring numbers within a factor of $c$. To complement these results, we give an approximation algorithm for the generalized coloring numbers, improving both the runtime and the approximation factor of the existing approach of Dvo\v{r}\'{a}k (EuJC 2013). We prove that greedily ordering vertices with small estimated backconnectivity achieves a $(k-1)^{r-1}$-approximation for the $r$-coloring number and an $O(k^{r-1})$-approximation for the weak $r$-coloring number.
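A schematic Python sketch of such a greedy ordering; the remaining-degree estimate below is a hypothetical stand-in for the backconnectivity estimator analysed in the paper:

```python
def greedy_order(adj):
    """Greedy ordering built right to left: the vertex with the
    smallest estimate among unplaced vertices is picked first and
    ends up last in the final left-to-right order.

    adj: dict mapping each vertex to a set of neighbours.
    The degree into the not-yet-picked vertices is a placeholder
    for the actual backconnectivity estimate.
    """
    remaining = set(adj)
    order = []
    while remaining:
        v = min(remaining, key=lambda u: len(adj[u] & remaining))
        remaining.remove(v)
        order.append(v)
    order.reverse()  # vertices picked first land last
    return order

# Usage on a path a - b - c: a degree-1 endpoint is picked first.
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
print(greedy_order(adj))
```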
With the increasing complexity of software permeating critical domains such as autonomous driving, new challenges are emerging in how the engineering of these systems needs to be rethought. Autonomous driving is expected to continue gradually taking over all critical driving functions, which adds to the complexity of certifying autonomous driving systems. In response, certification authorities have already started introducing strategies for the certification of autonomous vehicles and their software. But even with these new approaches, the certification procedures are not fully catching up with the dynamism and unpredictability of future autonomous systems, and thus may not guarantee compliance with all requirements imposed on these systems. In this paper, we identify a number of issues with the proposed certification strategies that may impact the resulting systems substantially. For instance, we emphasize the lack of adequate reflection of software changes occurring in constantly evolving systems, and the limited support for the cooperation among systems that is needed to manage coordinated moves. Other shortcomings concern the narrow focus of the awarded certification, which neglects aspects such as the ethical behavior of autonomous software systems. The contribution of this paper is threefold. First, we discuss the motivation for modifying the current certification processes for autonomous driving systems. Second, we analyze current international standards used in certification processes against requirements derived from those imposed on dynamic software ecosystems and on autonomous systems themselves. Third, we outline a concept for incorporating the missing parts into the certification procedure.
The matrix sensing problem is an important low-rank optimization problem that has found a wide range of applications, such as matrix completion, phase synchronization/retrieval, robust PCA, and power system state estimation. In this work, we focus on the general matrix sensing problem with linear measurements that are corrupted by random noise. We investigate the scenario where the search rank $r$ is equal to the true rank $r^*$ of the unknown ground truth (the exactly parametrized case), as well as the scenario where $r$ is greater than $r^*$ (the overparametrized case). We quantify the role of the restricted isometry property (RIP) in shaping the landscape of the non-convex factorized formulation and in enabling the success of local search algorithms. First, we develop a global guarantee on the maximum distance between an arbitrary local minimizer of the non-convex problem and the ground truth under the assumption that the RIP constant is smaller than $1/(1+\sqrt{r^*/r})$. We then present a local guarantee for problems with an arbitrary RIP constant, which states that any local minimizer is either considerably close to the ground truth or far away from it. More importantly, we prove that this noisy, overparametrized problem exhibits the strict saddle property, which leads to the global convergence of the perturbed gradient descent algorithm in polynomial time. The results of this work provide a comprehensive understanding of the geometric landscape of the matrix sensing problem in the noisy and overparametrized regime.
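In standard notation (our paraphrase: $\mathcal{A}$ is the linear measurement operator and $b = \mathcal{A}(M^*) + w$ the noisy measurements of the rank-$r^*$ ground truth $M^* = Z^* Z^{*\top}$), the non-convex factorized formulation reads $$\min_{X \in \mathbb{R}^{n \times r}} \ \big\|\mathcal{A}(XX^\top) - b\big\|_2^2,$$ and the RIP constant quantifies how far $\mathcal{A}$ is from an isometry on low-rank matrices.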
This paper is concerned with low-rank matrix optimization, which has found a wide range of applications in machine learning. In the special case of matrix sensing, this problem has been studied extensively through the notion of the Restricted Isometry Property (RIP), leading to a wealth of results on the geometric landscape of the problem and the convergence rate of common algorithms. However, the existing results can handle the problem with a general objective function subject to noisy data only when the RIP constant is close to 0. In this paper, we develop a new mathematical framework that solves the above-mentioned problem with a far less restrictive RIP constant. We prove that as long as the RIP constant of the noiseless objective is less than $1/3$, any spurious local solution of the noisy optimization problem must be close to the ground truth solution. Via the strict saddle property, we also show that an approximate solution can be found in polynomial time. In addition, we characterize the geometry of the spurious local minima of the problem in a local region around the ground truth when the RIP constant is greater than $1/3$. Compared to the existing results in the literature, this paper offers the strongest RIP bound and provides a complete theoretical analysis of the global and local optimization landscapes of general low-rank optimization problems under random corruptions from any finite-variance family.
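For reference, the RIP condition in its standard form: a linear operator $\mathcal{A}$ satisfies the RIP with constant $\delta_{2r} \in [0,1)$ if $$(1-\delta_{2r})\,\|M\|_F^2 \;\le\; \|\mathcal{A}(M)\|_2^2 \;\le\; (1+\delta_{2r})\,\|M\|_F^2 \qquad \text{for all } M \text{ with } \operatorname{rank}(M) \le 2r,$$ and the threshold $1/3$ above refers to a bound on this constant.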
(Stochastic) bilevel optimization is a frequently encountered problem in machine learning with a wide range of applications, such as meta-learning, hyper-parameter optimization, and reinforcement learning. Most of the existing studies on this problem focus only on analyzing the convergence or improving the convergence rate, while little effort has been devoted to understanding its generalization behavior. In this paper, we conduct a thorough analysis of the generalization of first-order (gradient-based) methods for the bilevel optimization problem. We first establish a fundamental connection between algorithmic stability and generalization error in different forms and give a high-probability generalization bound which improves the previous best one from $\mathcal{O}(\sqrt{n})$ to $\mathcal{O}(\log n)$, where $n$ is the sample size. We then provide the first stability bounds for the general case where both inner and outer level parameters are subject to continuous updates, whereas existing work allows only the outer level parameter to be updated. Our analysis can be applied in various standard settings such as strongly-convex-strongly-convex (SC-SC), convex-convex (C-C), and nonconvex-nonconvex (NC-NC). Our analysis for the NC-NC setting can also be extended to a particular nonconvex-strongly-convex (NC-SC) setting that is commonly encountered in practice. Finally, we corroborate our theoretical analysis with experiments on meta-learning and hyper-parameter optimization that demonstrate how the number of iterations affects the generalization error.
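The problem in question has the standard bilevel form (notation ours) $$\min_{x}\; F(x) := f\big(x, y^*(x)\big) \qquad \text{s.t.} \qquad y^*(x) \in \operatorname*{arg\,min}_{y}\; g(x, y),$$ where $f$ and $g$ are the outer and inner objectives; the regimes SC-SC, C-C, and NC-NC describe the curvature assumptions placed on the two levels.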
We propose a new randomized method for solving systems of nonlinear equations, which can find sparse solutions or solutions under certain simple constraints. The scheme only takes gradients of component functions and uses Bregman projections onto the solution space of a Newton equation. In the special case of Euclidean projections, the method is known as the nonlinear Kaczmarz method. Furthermore, if the component functions are nonnegative, we are in the setting of optimization under the interpolation assumption, and the method reduces to SGD with the recently proposed stochastic Polyak step size. For general Bregman projections, our method is a stochastic mirror descent with a novel adaptive step size. We prove that in the convex setting each iteration of our method results in a smaller Bregman distance to exact solutions than the standard Polyak step. Our generalization to Bregman projections comes at the price that a convex one-dimensional optimization problem needs to be solved in each iteration, which can typically be done with globalized Newton iterations. Convergence is proved in two classical settings of nonlinearity: for convex nonnegative functions, and locally for functions which fulfill the tangential cone condition. Finally, we show examples in which the proposed method outperforms similar methods with the same memory requirements.
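In the Euclidean special case, one iteration projects onto the linearization of a randomly sampled equation; here is a minimal numpy sketch on a toy $2 \times 2$ system (the test functions are ours):

```python
import numpy as np

def nonlinear_kaczmarz(fs, grads, x, iters=200, seed=0):
    """Euclidean special case: each step projects x onto the
    linearization {z : f_i(x) + grad f_i(x) @ (z - x) = 0} of a
    randomly sampled equation f_i = 0, i.e.
        x <- x - f_i(x) / ||grad f_i(x)||^2 * grad f_i(x),
    which is SGD with the stochastic Polyak step size when f_i >= 0.
    """
    rng = np.random.default_rng(seed)
    for _ in range(iters):
        i = rng.integers(len(fs))
        g = grads[i](x)
        if g @ g > 0:  # skip degenerate steps with vanishing gradient
            x = x - fs[i](x) / (g @ g) * g
    return x

# Toy system: the unit circle intersected with the diagonal.
fs = [lambda x: x[0]**2 + x[1]**2 - 1, lambda x: x[0] - x[1]]
grads = [lambda x: np.array([2*x[0], 2*x[1]]),
         lambda x: np.array([1.0, -1.0])]
print(nonlinear_kaczmarz(fs, grads, np.array([2.0, 0.5])))  # ~ (0.707, 0.707)
```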
Coverings of convex bodies have emerged as a central component in the design of efficient solutions to approximation problems involving convex bodies. Intuitively, given a convex body $K$ and $\epsilon > 0$, a covering is a collection of convex bodies whose union covers $K$ and such that a constant-factor expansion of each body lies within an $\epsilon$-expansion of $K$. Coverings have been employed in many applications, such as approximations for diameter, width, and $\epsilon$-kernels of point sets, approximate nearest neighbor searching, polytope approximations, and approximations to the Closest Vector Problem (CVP). It is known how to construct coverings of size $n^{O(n)} / \epsilon^{(n-1)/2}$ for general convex bodies in $\mathbb{R}^n$. In special cases, such as when the convex body is the $\ell_p$ unit ball, this bound has been improved to $2^{O(n)} / \epsilon^{(n-1)/2}$. This raises the question of whether such a bound holds in general. In this paper we answer the question in the affirmative. We demonstrate the power and versatility of our coverings by applying them to the problem of approximating a convex body by a polytope under the Banach-Mazur metric. Given a well-centered convex body $K$ and an approximation parameter $\epsilon > 0$, we show that there exists a polytope $P$ consisting of $2^{O(n)} / \epsilon^{(n-1)/2}$ vertices (facets) such that $K \subset P \subset K(1+\epsilon)$. This bound is optimal in the worst case up to factors of $2^{O(n)}$. As an additional consequence, we obtain the fastest $(1+\epsilon)$-approximate CVP algorithm that works in any norm, with a running time of $2^{O(n)} / \epsilon^{(n-1)/2}$ up to polynomial factors in the input size, and we obtain the fastest $(1+\epsilon)$-approximation algorithm for integer programming. We also present a framework for constructing coverings of optimal size for any convex body (up to factors of $2^{O(n)}$).
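In symbols (our formalization of the informal description above), a collection $C_1, \ldots, C_m$ of convex bodies is a covering of $K$ for a fixed expansion constant $\lambda > 1$ if $$K \subseteq \bigcup_{i=1}^{m} C_i \qquad \text{and} \qquad \lambda\, C_i \subseteq (1+\epsilon) K \quad \text{for all } i,$$ where $\lambda\, C_i$ and $(1+\epsilon) K$ denote expansions of the bodies about their respective centers.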