A $k$-core is a subgraph in which every node has at least $k$ neighbors within the subgraph. $k$-core subgraphs have been employed by large platforms such as Network Repository to understand the underlying structure and dynamics of networks. Existing studies have primarily focused on finding $k$-core groups without considering their size, despite the relevance of solution size in many real-world scenarios. This paper addresses this gap by introducing the size-prescribed $k$-core search (SPCS) problem, whose goal is to find a subgraph of a specified size with the highest possible core number. We propose two algorithms, {\it TSizeKcore-BU} and {\it TSizeKcore-TD}, to identify cohesive subgraphs that satisfy both the $k$-core requirement and the size constraint. Our experimental results demonstrate the superiority of our approach in terms of solution quality and efficiency. The {\it TSizeKcore-BU} algorithm proves highly efficient at finding size-prescribed $k$-core subgraphs on large datasets, making it a favorable choice in such scenarios, whereas the {\it TSizeKcore-TD} algorithm is better suited to small datasets where running time is less critical.
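For background, the core number of every node (and hence any $k$-core) can be extracted by the classical peeling procedure sketched below; this is standard material, not the paper's {\it TSizeKcore-BU} or {\it TSizeKcore-TD} algorithms, and the Python function and variable names are purely illustrative.

```python
def core_numbers(adj):
    # adj: dict mapping each node to the set of its neighbours (undirected, no self-loops).
    # Classical peeling: repeatedly remove a node of minimum residual degree.
    deg = {v: len(adj[v]) for v in adj}
    remaining = set(adj)
    core, k = {}, 0
    while remaining:
        v = min(remaining, key=deg.__getitem__)   # node of minimum residual degree
        k = max(k, deg[v])                        # core numbers are non-decreasing in peel order
        core[v] = k
        remaining.remove(v)
        for w in adj[v]:
            if w in remaining:
                deg[w] -= 1
    return core                                   # the k-core is {v : core[v] >= k}
```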
The weighted essentially non-oscillatory technique using a stencil of $2r$ points (WENO-$2r$) is an interpolatory method that obtains a higher approximation order from the non-linear combination of interpolants on $r+1$ nodes. The result is an interpolant of order $2r$ in smooth regions and of order $r+1$ when an isolated discontinuity falls in any grid interval of the large stencil except the central one. Recently, a new WENO method based on Aitken-Neville's algorithm was designed for the interpolation of equally spaced data at the mid-points, attaining progressive order of accuracy close to discontinuities. This paper is devoted to constructing a general progressive WENO method for data that are not necessarily uniformly spaced, and for several variables, interpolating at any point of the central interval. We also provide explicit formulas for the linear and non-linear weights and prove the order obtained. Finally, some numerical experiments are presented to confirm the theoretical results.
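For background, a WENO-$2r$ interpolant is generically assembled as the convex combination below (standard notation shown only as an illustrative sketch; the paper's explicit linear weights $C_i$, smoothness indicators $IS_i$ and exponent $t$ are not reproduced here):
$$q(x)=\sum_{i=0}^{r-1}\omega_i(x)\,p_i(x),\qquad \omega_i(x)=\frac{\alpha_i(x)}{\sum_{j=0}^{r-1}\alpha_j(x)},\qquad \alpha_i(x)=\frac{C_i(x)}{\left(\epsilon+IS_i\right)^{t}},$$
where $p_i$ is the interpolant built on the $i$-th substencil of $r+1$ nodes. In smooth regions $\omega_i\approx C_i$ and the order-$2r$ interpolant of the full stencil is recovered, while near a discontinuity the weights of the affected substencils are driven towards zero, leaving an order-$r+1$ approximation.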
MAG$\pi$ is a Multiparty, Asynchronous and Generalised $\pi$-calculus that introduces timeouts into session types as a means of reasoning about failure-prone communication. Its type system guarantees that every possible message loss is handled by a timeout branch. In this work, we argue that this requirement is unnecessarily strict. We present MAG$\pi$!, an extension that constitutes the first introduction of replication into Multiparty Session Types (MPST). Replication is a standard $\pi$-calculus construct used to model infinitely available servers. We lift this construct to the type level and show that it simplifies the specification of distributed client-server interactions. We prove the properties relevant to generalised MPST: subject reduction, session fidelity and process property verification.
We give a polynomial-time algorithm for the $k$-edge-connected spanning subgraph ($k$-ECSS) problem that returns a solution of cost no greater than the cheapest $(k+10)$-ECSS on the same graph. Our approach enhances the iterative relaxation framework with a new ingredient, which we call ghost values, that allows for high sparsity in intermediate problems. Our guarantees improve upon the best-known approximation factor of $2$ for $k$-ECSS whenever the optimal value of $(k+10)$-ECSS is close to that of $k$-ECSS. This property holds for the closely related $k$-edge-connected spanning multi-subgraph ($k$-ECSM) problem, which is identical to $k$-ECSS except that edges can be selected multiple times at the same cost. As a consequence, we obtain a $\left(1+O\left(\frac{1}{k}\right)\right)$-approximation algorithm for $k$-ECSM, which resolves a conjecture of Pritchard and improves upon a recent $\left(1+O\left(\frac{1}{\sqrt{k}}\right)\right)$-approximation algorithm of Karlin, Klein, Oveis Gharan, and Zhang. Moreover, we present a matching lower bound for $k$-ECSM, showing that our approximation ratio is tight up to the constant factor in $O\left(\frac{1}{k}\right)$, unless $P=NP$.
This is a simplification of a previous version of this arXiv note. We present an example of a function $f$ from $\{-1,1\}^n$ to the unit sphere in $\mathbb{C}$ with influence bounded by $1$ and entropy of $|\hat f|^2$ larger than $\frac12\log n$.
We consider the distributed complexity of the (degree+1)-list coloring problem, in which each node $u$ of degree $d(u)$ is assigned a palette of $d(u)+1$ colors, and the goal is to find a proper coloring using these palettes. The (degree+1)-list coloring problem is a natural generalization of the classical $(\Delta+1)$-coloring and $(\Delta+1)$-list coloring problems, both of which are benchmark problems extensively studied in distributed and parallel computing. In this paper, we settle the complexity of the (degree+1)-list coloring problem in the Congested Clique model by showing that it can be solved deterministically in a constant number of rounds.
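For intuition, a (degree+1)-list instance is always properly colorable by a trivial sequential greedy scan; the difficulty settled above is doing this deterministically in a constant number of Congested Clique rounds. A minimal centralized sketch (illustrative Python, not the paper's distributed algorithm):

```python
def greedy_list_coloring(adj, palette):
    # adj: dict node -> set of neighbours; palette[v] must contain len(adj[v]) + 1 colours.
    # A free colour always exists: at most deg(v) neighbours are coloured when v is processed,
    # while the palette of v holds deg(v) + 1 colours.
    colour = {}
    for v in adj:
        used = {colour[w] for w in adj[v] if w in colour}
        colour[v] = next(c for c in palette[v] if c not in used)
    return colour
```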
One of the major open problems in complexity theory is to exhibit an explicit function that requires super-logarithmic depth, a.k.a. the $\mathbf{P}$ versus $\mathbf{NC^1}$ problem. The current best depth lower bound is $(3-o(1))\cdot \log n$, and it is wide open how to prove a super-$3\log n$ depth lower bound. Recently, Mihajlin and Sofronova (CCC'22) showed that if we consider formulas with a restriction on top, we can break the $3\log n$ barrier. Formally, they prove that there exist two functions $f:\{0,1\}^n \rightarrow \{0,1\}$ and $g:\{0,1\}^n \rightarrow \{0,1\}^n$ such that for any constants $0<\alpha<0.4$ and $0<\epsilon<\alpha/2$, their XOR composition $f(g(x)\oplus y)$ is not computable by an AND of $2^{(\alpha-\epsilon)n}$ formulas of size at most $2^{(1-\alpha/2-\epsilon)n}$. This implies that a modified version of the Andreev function is not computable by any circuit of depth $(3.2-\epsilon)\log n$ under the restriction that the top $0.4-\epsilon$ layers consist only of AND gates, for any small constant $\epsilon>0$. They ask whether the parameter $\alpha$ can be pushed up to nearly $1$, which would imply a nearly-$3.5\log n$ depth lower bound. In this paper, we provide a stronger answer to their question. We show that there exist two functions $f:\{0,1\}^n \rightarrow \{0,1\}$ and $g:\{0,1\}^n \rightarrow \{0,1\}^n$ such that for any constant $0<\alpha<2-o(1)$, their XOR composition $f(g(x)\oplus y)$ is not computable by an AND of $2^{\alpha n}$ formulas of size at most $2^{(1-\alpha/2-o(1))n}$. This implies a $(4-o(1))\log n$ depth lower bound under the restriction that the top $2-o(1)$ layers consist only of AND gates. We prove this by observing that one crucial component in Mihajlin and Sofronova's work, the well-mixed set of functions, can be significantly simplified and thus improved. With this observation and a more careful analysis, we obtain these nearly tight results.
Consider the problem of predicting the next symbol given a sample path of length $n$ whose joint distribution belongs to a distribution class that may have long-term memory. The goal is to compete with the conditional predictor that knows the true model. For both hidden Markov models (HMMs) and renewal processes, we determine the optimal prediction risk in Kullback-Leibler divergence up to universal constant factors. Extending existing results on finite-order Markov models [HJW23] and drawing ideas from universal compression, the proposed estimator has a prediction risk bounded by the redundancy of the distribution class and a memory term that accounts for the long-range dependency of the model. Notably, for HMMs with bounded state and observation spaces, a polynomial-time estimator based on dynamic programming is shown to achieve the optimal prediction risk $\Theta(\log n/n)$; prior to this work, the only known result of this type is $O(1/\log n)$, obtained using Markov approximation [Sha+18]. Matching minimax lower bounds are obtained by making connections to redundancy and mutual information via a reduction argument.
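As background on the performance metric (stated here in a standard form for this line of work, with notation chosen only for illustration), the prediction risk of an estimator $\hat{Q}$ is the expected Kullback-Leibler divergence between the true conditional law of the next symbol and the estimated one:
$$\mathsf{Risk}_n(\hat{Q}) \;=\; \mathbb{E}\left[\, D\!\left(P_{X_{n+1}\mid X_1^n}\,\middle\|\,\hat{Q}(\cdot\mid X_1^n)\right)\right].$$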
We suggest that straight-line programs designed for algebraic computations should be accompanied by a comprehensive complexity analysis that takes into account both the number of fundamental algebraic operations needed and the memory requirements arising during evaluation. We introduce an approach for formalising this idea and, as an illustration, construct and analyse straight-line programs for the Bruhat decomposition of $d\times d$ matrices with determinant $1$ over a finite field of order $q$; these programs have length $O(d^2\log(q))$ and require storing only $O(\log(q))$ matrices during evaluation.
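As a toy illustration of the kind of accounting we have in mind (not the paper's formalism; the instruction names and memory model below are simplifying assumptions), a straight-line program over matrices can be evaluated while tracking both its operation count and the peak number of matrices held simultaneously:

```python
import numpy as np

def evaluate_slp(program, registers):
    # program: list of instructions ('mul', dst, a, b), ('add', dst, a, b) or ('free', r).
    # registers: dict name -> matrix. Returns the final registers together with the
    # number of algebraic operations and the peak number of matrices held in memory.
    ops, peak = 0, len(registers)
    for instr in program:
        if instr[0] == 'free':                      # drop a value that is no longer needed
            registers.pop(instr[1])
        else:
            op, dst, a, b = instr
            registers[dst] = registers[a] @ registers[b] if op == 'mul' else registers[a] + registers[b]
            ops += 1
        peak = max(peak, len(registers))
    return registers, ops, peak

# Example: compute A*B*C with two multiplications while never holding more than four matrices.
A, B, C = (np.eye(2) for _ in range(3))
_, ops, peak = evaluate_slp([('mul', 'T', 'A', 'B'), ('free', 'A'), ('free', 'B'),
                             ('mul', 'R', 'T', 'C'), ('free', 'T')],
                            {'A': A, 'B': B, 'C': C})
```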
When writing high-performance code for numerical computation in a scripting language like MATLAB, it is crucial to vectorize the operations inside large for-loops. If not, the code becomes too slow to use even for moderately large problems. However, in the process of vectorizing, the code often loses its original structure and becomes less readable. This is particularly true for finite element implementations, even though finite element methods are inherently structured. A basic remedy is to separate the vectorization part from the mathematics part of the code, which is easily achieved by building the code on top of the basic linear algebra subprograms, which are already vectorized; this idea has been used in a series of papers over the last fifteen years to develop codes that are fast and still structured and readable. We discuss the vectorized basic linear algebra package and introduce a formalism using multi-linear algebra to explain and formally define the functions in the package, as well as MATLAB's page-wise functions (such as pagemtimes). We provide examples from computations of varying complexity, including the computation of normal vectors, volumes, and finite element methods. Benchmarking shows that the resulting computations are also fast. Using the library, we can write code that closely follows our mathematical thinking, which makes the code easier to write, follow, reuse, and extend.
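As a toy illustration of the loop-versus-vectorized contrast (written in NumPy rather than MATLAB, and not part of the package discussed above), the volumes of all tetrahedra in a mesh can be computed either one small determinant at a time or as a single batched, page-wise determinant:

```python
import numpy as np

def volumes_loop(coords):
    # coords: array of shape (n_elems, 4, 3) holding the vertex coordinates of each tetrahedron.
    vols = np.empty(len(coords))
    for e, X in enumerate(coords):            # slow: one tiny determinant per loop iteration
        vols[e] = abs(np.linalg.det(X[1:] - X[0])) / 6.0
    return vols

def volumes_vectorized(coords):
    E = coords[:, 1:, :] - coords[:, :1, :]   # (n_elems, 3, 3) edge matrices, built in one shot
    return np.abs(np.linalg.det(E)) / 6.0     # batched ("paged") determinants, no Python loop
```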
In recent years, object detection has experienced impressive progress. Despite these improvements, there is still a significant performance gap between the detection of small and large objects. We analyze a current state-of-the-art model, Mask R-CNN, on a challenging dataset, MS COCO. We show that the overlap between small ground-truth objects and the predicted anchors is much lower than the expected IoU threshold. We conjecture this is due to two factors: (1) only a few images contain small objects, and (2) small objects do not appear often enough even within the images that contain them. We therefore propose to oversample images with small objects and to augment each of them by copy-pasting small objects many times. This allows us to trade off the quality of the detector on large objects against that on small objects. We evaluate different pasting augmentation strategies and ultimately achieve a 9.7\% relative improvement on instance segmentation and a 7.1\% relative improvement on object detection of small objects, compared to the current state-of-the-art method on MS COCO.
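A minimal sketch of the copy-paste oversampling idea (illustrative only; it omits overlap checks, box/label bookkeeping, and the specific pasting strategies evaluated in the paper):

```python
import random
import numpy as np

def copy_paste_small_objects(image, masks, max_copies=3, small_area=32 * 32):
    # image: (h, w, 3) array; masks: list of boolean (h, w) instance masks.
    # Each small instance is pasted at a few random locations in the same image.
    h, w = image.shape[:2]
    out = image.copy()
    new_masks = list(masks)
    for mask in masks:
        if mask.sum() > small_area:
            continue                                    # only oversample small instances
        ys, xs = np.nonzero(mask)
        top, left = ys.min(), xs.min()
        patch = image[top:ys.max() + 1, left:xs.max() + 1]
        m = mask[top:ys.max() + 1, left:xs.max() + 1]
        ph, pw = m.shape
        for _ in range(random.randint(1, max_copies)):
            y0, x0 = random.randint(0, h - ph), random.randint(0, w - pw)
            out[y0:y0 + ph, x0:x0 + pw][m] = patch[m]   # paste only the masked pixels
            pasted = np.zeros_like(mask)
            pasted[y0:y0 + ph, x0:x0 + pw] = m
            new_masks.append(pasted)                    # record the new instance mask
    return out, new_masks
```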