For a given function $F$ from $\mathbb F_{p^n}$ to itself, determining whether there exists a function which is CCZ-equivalent but EA-inequivalent to $F$ is an important and interesting problem. For example, K\"olsch \cite{KOL21} showed that no function is CCZ-equivalent but EA-inequivalent to the inverse function. On the other hand, for the Gold function $F(x)=x^{2^i+1}$ and for $F(x)=x^3+{\rm Tr}(x^9)$ over $\mathbb F_{2^n}$, Budaghyan, Carlet and Pott (respectively, Budaghyan, Carlet and Leander) \cite{BCP06, BCL09FFTA} found functions which are CCZ-equivalent but EA-inequivalent to $F$. In this paper, when a given function $F$ admits a component function with a linear structure, we present functions which are CCZ-equivalent to $F$, and we show that, under suitable conditions, the constructed functions are EA-inequivalent to $F$. As a consequence, for every quadratic function $F$ on $\mathbb F_{2^n}$ ($n\geq 4$) with nonlinearity $>0$ and differential uniformity $\leq 2^{n-3}$, we explicitly construct functions which are CCZ-equivalent but EA-inequivalent to $F$. Likewise, for every non-planar quadratic function $F$ on $\mathbb F_{p^n}$ $(p>2, n\geq 4)$ with $|\mathcal W_F|\leq p^{n-1}$ and differential uniformity $\leq p^{n-3}$, we explicitly construct functions which are CCZ-equivalent but EA-inequivalent to $F$.
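For reference, the notions used above can be stated as follows (our paraphrase of the standard definitions). Functions $F,G:\mathbb F_{p^n}\to\mathbb F_{p^n}$ are EA-equivalent if
\[
G = A_1 \circ F \circ A_2 + A
\]
for affine permutations $A_1, A_2$ and an affine map $A$; they are CCZ-equivalent if some affine permutation of $\mathbb F_{p^n}\times\mathbb F_{p^n}$ maps the graph $\Gamma_F=\{(x,F(x)) : x\in\mathbb F_{p^n}\}$ onto $\Gamma_G$; and $\alpha$ is a linear structure of a component function $f$ if the derivative
\[
x \mapsto f(x+\alpha)-f(x)
\]
is constant. EA-equivalence implies CCZ-equivalence but not conversely, and it is precisely this gap that the constructions above exploit.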
The discrepancy between in-distribution (ID) and out-of-distribution (OOD) samples can lead to \textit{distributional vulnerability} in deep neural networks, which in turn produces high-confidence predictions on OOD samples. This is mainly because OOD samples are absent during training, so the network is not properly constrained. To tackle this issue, several state-of-the-art methods add extra OOD samples to training and assign them manually-defined labels. However, this practice can introduce unreliable labeling, which negatively affects ID classification. The distributional vulnerability presents a critical challenge for non-IID deep learning, which aims for OOD-tolerant ID classification by balancing ID generalization against OOD detection. In this paper, we introduce a novel \textit{supervision adaptation} approach that generates adaptive supervision information for OOD samples, making them more compatible with ID samples. First, we measure the dependency between ID samples and their labels using mutual information, and show that the supervision information can be represented in terms of negative probabilities across all classes. Second, we investigate data correlations between ID and OOD samples by solving a series of binary regression problems, with the goal of refining the supervision information so that the ID classes become more distinctly separable. Our extensive experiments on four advanced network architectures, two ID datasets, and eleven diversified OOD datasets demonstrate the efficacy of our supervision adaptation approach in improving both ID classification and OOD detection.
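As a rough illustration of how auxiliary OOD samples can be supervised during training, consider the following generic outlier-exposure-style loss (a minimal PyTorch sketch under our own assumptions, not the supervision-adaptation method of the paper, which adapts the OOD targets rather than fixing them to the uniform distribution):
\begin{verbatim}
import torch.nn.functional as F

def mixed_batch_loss(model, x_id, y_id, x_ood, beta=0.5):
    # Standard cross-entropy supervises the ID samples.
    loss_id = F.cross_entropy(model(x_id), y_id)
    # OOD samples are pulled toward a flat prediction over all classes;
    # adaptive supervision would replace this fixed uniform target.
    log_probs = F.log_softmax(model(x_ood), dim=1)
    loss_ood = -log_probs.mean(dim=1).mean()
    return loss_id + beta * loss_ood
\end{verbatim}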
The Subset Feedback Vertex Set problem (SFVS) asks to delete at most $k$ vertices from a given graph so that no vertex of a prescribed vertex subset (called the terminal set) lies on a cycle in the remaining graph; it generalizes the famous Feedback Vertex Set problem and the Multiway Cut problem. SFVS remains NP-hard even on split and chordal graphs, and SFVS in Chordal Graphs (SFVS-C) can be viewed as an implicit 3-Hitting Set problem. However, it is not easy to solve SFVS-C faster than 3-Hitting Set. Philip, Rajan, Saurabh, and Tale (Algorithmica 2019) proved that SFVS-C can be solved in $\mathcal{O}^{*}(2^{k})$ time, slightly improving on the best known bound $\mathcal{O}^{*}(2.076^{k})$ for 3-Hitting Set. In this paper, we break the "$2^{k}$-barrier" for SFVS-C by giving an $\mathcal{O}^{*}(1.820^{k})$-time algorithm. Our algorithm uses reduction and branching rules based on the Dulmage-Mendelsohn decomposition together with a divide-and-conquer method.
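To make the 3-Hitting Set connection concrete, below is the naive $\mathcal{O}^{*}(3^{k})$ branching algorithm for 3-Hitting Set (a toy Python sketch of the generic branching technique; the paper's rules for SFVS-C are far more refined):
\begin{verbatim}
def three_hitting_set(sets, k):
    """True iff <= k elements can intersect every set (each |set| <= 3)."""
    if not sets:
        return True            # every set is already hit
    if k <= 0:
        return False           # budget exhausted but unhit sets remain
    s = min(sets, key=len)     # branch on a smallest unhit set
    return any(
        three_hitting_set([t for t in sets if v not in t], k - 1)
        for v in s
    )

print(three_hitting_set([{1, 2, 3}, {3, 4, 5}, {1, 4, 6}], 2))  # True: {3, 4}
\end{verbatim}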
In 1953, Carlitz~\cite{Car53} showed that all permutation polynomials over $\F_q$, where $q>2$ is a power of a prime, are generated by the special permutation polynomials $x^{q-2}$ (the inversion) and $ax+b$ (affine functions, where $0\neq a, b\in \F_q$). Recently, Nikova, Nikov and Rijmen~\cite{NNR19} proposed an algorithm (NNR) to find a decomposition of the inverse function into quadratics, and computationally covered all dimensions $n\leq 16$. Petrides~\cite{P23} found a class of integers for which it is easy to decompose the inverse into quadratics, and improved the NNR algorithm, thereby extending the computation up to $n\leq 32$. Here, we extend Petrides' result, and we also propose a number-theoretic approach, which allows us to easily cover all (necessarily odd) exponents up to at least~$250$.
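To illustrate the flavour of the exponent arithmetic involved, here is a toy Python search restricted to quadratic power permutations (our sketch, much weaker than the general quadratic decompositions considered by NNR and Petrides): composing power maps multiplies exponents modulo $2^n-1$, and squaring is $\F_2$-linear, so $x^{2^n-2}$ decomposes into Gold maps $x^{2^i+1}$ exactly when $2^n-2$ is such a product up to a power of $2$.
\begin{verbatim}
from math import gcd

def quadratic_power_decomposition(n, max_len=8):
    """Breadth-first search for Gold exponents 2^i + 1 whose product is
    2^n - 2 modulo 2^n - 1, up to a factor 2^j (a linear squaring map)."""
    m = 2 ** n - 1
    target = m - 1                         # 2^n - 2 is -1 modulo 2^n - 1
    golds = [(2 ** i + 1) % m for i in range(1, n)
             if gcd(2 ** i + 1, m) == 1]   # keep only permutation exponents
    frontier = {1: []}
    for _ in range(max_len):
        nxt = {}
        for e, word in frontier.items():
            for g in golds:
                f = e * g % m
                if target in {f * 2 ** j % m for j in range(n)}:
                    return word + [g]      # decomposition found
                nxt.setdefault(f, word + [g])
        frontier = nxt
    return None                            # none of length <= max_len

for n in (3, 5, 7, 9, 11):                 # odd n, where x^3 is a permutation
    print(n, quadratic_power_decomposition(n))
\end{verbatim}
For instance, for $n=5$ the search returns $[3,5]$: $3\cdot 5\cdot 2 = 30 \equiv -1 \pmod{31}$, so the inverse is $x\mapsto ((x^3)^5)^2$ there.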
We give a fully dynamic algorithm for maintaining a $(1+\epsilon)$-approximation of the \emph{size} of the maximum matching of a graph with $n$ vertices and $m$ edges, using $m^{0.5-\Omega_{\epsilon}(1)}$ update time. This is the first polynomial improvement over the long-standing $O(n)$ update time, which can be trivially obtained by periodic recomputation. Thus, we resolve the value version of a major open question of the dynamic graph algorithms literature (see, e.g., [Gupta and Peng FOCS'13], [Bernstein and Stein SODA'16], [Behnezhad and Khanna SODA'22]). Our key technical component is the first algorithm that computes a $(1,\epsilon n)$-approximate maximum matching in sublinear time on dense graphs. All previous algorithms suffered a multiplicative approximation factor of at least $1.499$ or assumed that the graph has a very small maximum degree.
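The trivial $O(n)$-type baseline mentioned above is easy to make concrete: a single edge update changes the maximum matching size by at most one, so it suffices to recompute only every $\approx\epsilon\mu/2$ updates, where $\mu$ is the last computed size, and the stored value stays $(1\pm\epsilon)$-accurate in between. A Python sketch using networkx's static matching routine (the class and parameter names are our own):
\begin{verbatim}
import networkx as nx

class LazyMatchingSize:
    """Maintains an approximate maximum matching size under edge updates."""
    def __init__(self, eps=0.1):
        self.G, self.eps = nx.Graph(), eps
        self.size, self.stale = 0, 0   # last computed size, updates since then

    def _maybe_recompute(self):
        self.stale += 1
        if self.stale >= max(1, int(self.eps * self.size / 2)):
            matching = nx.max_weight_matching(self.G, maxcardinality=True)
            self.size, self.stale = len(matching), 0

    def insert(self, u, v):
        self.G.add_edge(u, v); self._maybe_recompute()

    def delete(self, u, v):
        self.G.remove_edge(u, v); self._maybe_recompute()

    def approx_size(self):
        return self.size               # within a (1 + eps) factor
\end{verbatim}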
The $L_{2}$-regularized loss of Deep Linear Networks (DLNs) with more than one hidden layer has multiple local minima, corresponding to matrices of different ranks. In tasks such as matrix completion, the goal is to converge to the local minimum of the smallest rank that still fits the training data. While rank-underestimating minima are avoided automatically, since they do not fit the data, gradient descent may get stuck at rank-overestimating minima. We show that with SGD there is always a positive probability of jumping from a higher-rank minimum to a lower-rank one, while the probability of jumping back is zero. More precisely, we define a sequence of sets $B_{1}\subset B_{2}\subset\cdots\subset B_{R}$ such that $B_{r}$ contains all minima of rank at most $r$ (and no minima of higher rank) and is absorbing for small enough ridge parameters $\lambda$ and learning rates $\eta$: SGD has probability $0$ of leaving $B_{r}$, and from any starting point there is a non-zero probability that SGD enters $B_{r}$.
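A toy numerical illustration of the setting (our sketch, not the paper's experiments): a three-layer deep linear network with ridge penalty $\lambda$, trained by mini-batch SGD on a rank-$2$ linear regression task, while tracking the effective rank of the end-to-end matrix $W_3 W_2 W_1$.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
d, r_true, n_samples = 10, 2, 200
A_star = rng.normal(size=(d, r_true)) @ rng.normal(size=(r_true, d))
X = rng.normal(size=(n_samples, d))
Y = X @ A_star.T                      # rank-2 teacher
Ws = [rng.normal(scale=0.2, size=(d, d)) for _ in range(3)]
eta, lam, batch = 0.01, 1e-3, 16      # learning rate, ridge, batch size

for step in range(20001):
    idx = rng.integers(0, n_samples, size=batch)
    h0 = X[idx]                       # forward pass (all layers linear)
    h1 = h0 @ Ws[0].T
    h2 = h1 @ Ws[1].T
    out = h2 @ Ws[2].T
    g3 = 2.0 * (out - Y[idx]) / batch # dLoss/d(out) for the MSE term
    g2 = g3 @ Ws[2]                   # backpropagate through the layers
    g1 = g2 @ Ws[1]
    grads = [g1.T @ h0, g2.T @ h1, g3.T @ h2]
    for W, G in zip(Ws, grads):
        W -= eta * (G + lam * W)      # SGD step with ridge
    if step % 5000 == 0:
        s = np.linalg.svd(Ws[2] @ Ws[1] @ Ws[0], compute_uv=False)
        print(step, "effective rank:", int((s > 1e-3 * s[0]).sum()))
\end{verbatim}
In line with the absorbing-set picture above, one expects the printed effective rank to decrease toward the target rank over training and not to climb back up.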
We consider the two-pronged fork frame $F$ and the variety $\mathbf{Eq}(B_F)$ generated by its dual closure algebra $B_F$. We describe the finite projective algebras in $\mathbf{Eq}(B_F)$ and give a purely semantic proof that unification in $\mathbf{Eq}(B_F)$ is finitary and not unitary.
The property that the velocity $\boldsymbol{u}$ belongs to $L^\infty(0,T;L^2(\Omega)^d)$ is an essential requirement in the definition of energy solutions of models for incompressible fluids. It is, therefore, highly desirable that the solutions produced by discretisation methods are uniformly stable in the $L^\infty(0,T;L^2(\Omega)^d)$-norm. In this work, we establish that this is indeed the case for Discontinuous Galerkin (DG) discretisations (in time and space) of non-Newtonian models with $p$-structure, assuming that $p\geq \frac{3d+2}{d+2}$; the time discretisation is equivalent to the Radau IIA implicit Runge-Kutta method. We also prove (weak) convergence of the numerical scheme to the weak solution of the system; this type of convergence result for schemes based on quadrature appears to be new. As an auxiliary result, we also derive Gagliardo-Nirenberg-type inequalities on DG spaces, which may be of independent interest.
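For orientation, the classical Gagliardo-Nirenberg interpolation inequality, whose broken (mesh-dependent) DG analogue is derived in the paper, states that for smooth compactly supported $u$ on $\mathbb R^d$,
\[
\|u\|_{L^{p}} \,\leq\, C\, \|\nabla u\|_{L^{r}}^{\theta}\, \|u\|_{L^{q}}^{1-\theta},
\qquad
\frac{1}{p} = \theta\Big(\frac{1}{r}-\frac{1}{d}\Big) + (1-\theta)\,\frac{1}{q},
\qquad \theta\in[0,1];
\]
on DG spaces the gradient must be replaced by a discrete analogue that accounts for the jumps across element faces.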
The nonlinear Poisson-Boltzmann equation (NPBE) is an elliptic partial differential equation used in applications such as protein interactions and biophysical chemistry (among many others). It describes the nonlinear electrostatic potential of charged bodies submerged in an ionic solution. The kinetic motion of the solvent molecules introduces randomness into the shape of a protein, and thus a more accurate model incorporates these random perturbations of the domain; we analyze such a model in order to compute statistics of quantities of interest of the solution. When the parameterization of the random perturbations is high-dimensional, this calculation is intractable, as it is subject to the curse of dimensionality. However, if the solution of the NPBE varies analytically with respect to the random parameters, the problem becomes amenable to techniques such as sparse grids and deep neural networks. In this paper, we show analyticity of the solution of the NPBE with respect to analytic perturbations of the domain, using the analytic implicit function theorem and the domain mapping method. Previous works have shown analyticity of solutions to linear elliptic equations, but not for nonlinear problems. We further show how to derive \emph{a priori} bounds on the size of the region of analyticity. The method is applied to the trypsin molecule to demonstrate that the convergence rates of the quantity of interest are consistent with the analyticity result. Furthermore, the approach developed here is sufficiently general to be applied to other nonlinear problems in uncertainty quantification.
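For context, a standard dimensionless form of the NPBE (our rendering; the precise coefficients and boundary conditions in the paper may differ) reads
\[
-\nabla\cdot\big(\epsilon(x)\,\nabla u(x)\big) + \bar{\kappa}^{2}(x)\,\sinh\big(u(x)\big)
= \sum_{i} q_{i}\,\delta(x-x_{i}) \quad \text{in } \Omega,
\]
where $\epsilon$ is the dielectric coefficient, $\bar\kappa$ is the modified Debye-Hückel parameter, and the $q_i$ are point charges at the atom centres $x_i$; the $\sinh$ term is the source of the nonlinearity discussed above.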
The $k$-means++ algorithm of Arthur and Vassilvitskii (SODA 2007) is often the practitioners' algorithm of choice for optimizing the popular $k$-means clustering objective, and it is known to give an $O(\log k)$-approximation in expectation. To obtain higher-quality solutions, Lattanzi and Sohler (ICML 2019) proposed augmenting $k$-means++ with $O(k \log \log k)$ local search steps, guided by the $k$-means++ sampling distribution, to yield a $c$-approximation to the $k$-means clustering problem, where $c$ is a large absolute constant. Here we generalize and extend their local search algorithm by considering larger and more sophisticated local search neighborhoods, thus allowing multiple centers to be swapped simultaneously. Our algorithm achieves a $9 + \varepsilon$ approximation ratio, which is the best possible for local search. Importantly, our approach yields substantial practical improvements: we observe significant quality gains over the approach of Lattanzi and Sohler (ICML 2019) on several datasets.
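A compact sketch of the $D^2$-sampling and single-swap local search pipeline of Lattanzi and Sohler, which the multi-swap neighborhoods of this paper generalize (a numpy toy implementation; all names are ours):
\begin{verbatim}
import numpy as np

def sq_dists(X, centers):
    return np.min(((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1), axis=1)

def d2_sample(X, centers, rng):
    d2 = sq_dists(X, centers)
    return X[rng.choice(len(X), p=d2 / d2.sum())]   # k-means++ distribution

def local_search_kmeanspp(X, k, steps, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), 1)]              # first center: uniform
    for _ in range(k - 1):                          # k-means++ seeding
        centers = np.vstack([centers, d2_sample(X, centers, rng)])
    for _ in range(steps):                          # single-swap local search
        cand, best = d2_sample(X, centers, rng), sq_dists(X, centers).sum()
        swap = None
        for i in range(k):                          # try replacing each center
            trial = centers.copy(); trial[i] = cand
            c = sq_dists(X, trial).sum()
            if c < best:
                best, swap = c, trial
        if swap is not None:
            centers = swap
    return centers
\end{verbatim}
Swapping several centers at once, as described in the abstract, replaces the inner loop by a search over small subsets of centers; this richer neighborhood is what improves the constant to $9+\varepsilon$.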
Let $E=\mathbb{Q}\big(\sqrt{-d}\big)$ be an imaginary quadratic field for a square-free positive integer $d$, and let $\mathcal{O}$ be its ring of integers. For each positive integer $m$, let $I_m$ be the free Hermitian lattice over $\mathcal{O}$ with an orthonormal basis, let $\mathfrak{S}_d(1)$ be the set of all positive definite integral unary Hermitian lattices over $\mathcal{O}$ that can be represented by some $I_m$, and let $g_d(1)$ be the least positive integer such that every Hermitian lattice in $\mathfrak{S}_d(1)$ can be represented by $I_{g_d(1)}$. The main results of this work provide an algorithm to calculate the explicit form of $\mathfrak{S}_d(1)$ and the exact value of $g_d(1)$ for every imaginary quadratic field $E$; the invariant $g_d(1)$ can be viewed as a natural extension of the Pythagoras number to the Hermitian lattice setting.
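For comparison, recall the classical invariant this extends (our restatement): the Pythagoras number of a ring $R$ is
\[
P(R) \;=\; \min\{\, m \;:\; \text{every sum of squares in } R \text{ is a sum of at most } m \text{ squares} \,\},
\]
so, for instance, $P(\mathbb Z)=4$ by Lagrange's four-square theorem. The quantity $g_d(1)$ plays the same uniform role with squares replaced by the norms represented by the Hermitian lattices $I_m$ over $\mathcal{O}$.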