In Path Set Packing, the input is an undirected graph $G$, a collection $\mathcal{P}$ of simple paths in $G$, and a positive integer $k$. The problem is to decide whether there exist $k$ edge-disjoint paths in $\mathcal{P}$. We study the parameterized complexity of Path Set Packing with respect to both natural and structural parameters. We show that the problem is $W[1]$-hard with respect to vertex cover number, and $W[1]$-hard with respect to pathwidth plus maximum degree plus solution size. These results answer an open question raised in COCOON 2018. On the positive side, we present an FPT algorithm parameterized by feedback vertex number plus maximum degree, and another FPT algorithm parameterized by treewidth plus maximum degree plus maximum length of a path in $\mathcal{P}$. These positive results complement the hardness of Path Set Packing with respect to any subset of the parameters used in the FPT algorithms. We also give a $4$-approximation algorithm for the maximum Path Set Packing problem that runs in FPT time when parameterized by feedback edge number.
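To make the decision problem concrete, the following minimal Python sketch checks an instance by brute force; it is illustrative only (exponential in $|\mathcal{P}|$) and is not one of the algorithms from the paper.
\begin{verbatim}
from itertools import combinations

def path_edges(path):
    # Edge set of a simple path given as a vertex sequence.
    return {frozenset(e) for e in zip(path, path[1:])}

def has_k_edge_disjoint(paths, k):
    # Brute-force check: is there a set of k pairwise edge-disjoint
    # paths?  Exponential in len(paths); tiny instances only.
    edge_sets = [path_edges(p) for p in paths]
    for combo in combinations(edge_sets, k):
        union, ok = set(), True
        for es in combo:
            if union & es:
                ok = False
                break
            union |= es
        if ok:
            return True
    return False

# Tiny example on the 4-cycle 1-2-3-4-1 with three candidate paths.
P = [[1, 2, 3], [3, 4, 1], [2, 3, 4]]
print(has_k_edge_disjoint(P, 2))  # True: [1,2,3] and [3,4,1]
\end{verbatim}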
We consider intersection graphs of disks of radius $r$ in the hyperbolic plane. Unlike in the Euclidean setting, these graph classes differ for different values of $r$: very small $r$ corresponds to an almost-Euclidean setting, while $r \in \Omega(\log n)$ corresponds to a firmly hyperbolic setting. We observe that larger values of $r$ create simpler graph classes, at least in terms of separators and the computational complexity of the \textsc{Independent Set} problem. First, we show that intersection graphs of disks of radius $r$ in the hyperbolic plane can be separated with $\mathcal{O}((1+1/r)\log n)$ cliques in a balanced manner. Our second structural insight concerns Delaunay complexes in the hyperbolic plane and may be of independent interest. We show that for any set $S$ of $n$ points with pairwise distance at least $2r$ in the hyperbolic plane, the corresponding Delaunay complex has outerplanarity $1+\mathcal{O}(\frac{\log n}{r})$, which implies a similar bound on the balanced separators and treewidth of such Delaunay complexes. Using this outerplanarity (and treewidth) bound, we prove that \textsc{Independent Set} can be solved in $n^{\mathcal{O}(1+\frac{\log n}{r})}$ time. The algorithm is based on dynamic programming on a sphere cut decomposition that is not known in advance but is derived from the solution itself. The resulting algorithm is a far-reaching generalization of a result of Kisfaludi-Bak (SODA 2020), and its running time is tight under the Exponential Time Hypothesis. In particular, \textsc{Independent Set} is polynomial-time solvable in the firmly hyperbolic setting of $r\in \Omega(\log n)$. Finally, in the case when the disks have ply (depth) at most $\ell$, we give a PTAS for \textsc{Maximum Independent Set} whose running time has only quasi-polynomial dependence on $1/\varepsilon$ and $\ell$. Our PTAS is a further generalization of our exact algorithm.
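To see why the firmly hyperbolic regime is polynomial, note how the exponent of the running time collapses when $r \ge c\log n$ for some constant $c>0$:
\[
n^{\mathcal{O}\left(1+\frac{\log n}{r}\right)} \le n^{\mathcal{O}\left(1+\frac{1}{c}\right)} = n^{\mathcal{O}(1)},
\]
while for constant $r$ the same bound only degrades to the quasi-polynomial $n^{\mathcal{O}(\log n)}$.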
In this work, we study the Biclique-Free Vertex Deletion problem: Given a graph $G$ and integers $k$ and $i \le j$, find a set of at most $k$ vertices that intersects every (not necessarily induced) biclique $K_{i, j}$ in $G$. This is a natural generalization of the Bounded-Degree Deletion problem, wherein one asks whether there is a set of at most $k$ vertices whose deletion results in a graph of a given maximum degree $r$. The two problems coincide when $i = 1$ and $j = r + 1$. We show that Biclique-Free Vertex Deletion is fixed-parameter tractable with respect to $k + d$ for the degeneracy $d$ by developing a $2^{O(d k^2)} \cdot n^{O(1)}$-time algorithm. We also show that it can be solved in $2^{O(f k)} \cdot n^{O(1)}$ time for the feedback vertex number $f$ when $i \ge 2$. In contrast, we find that it is W[1]-hard when parameterized by treedepth for every integer $i \ge 1$. Finally, we show that Biclique-Free Vertex Deletion has a polynomial kernel for every $i \ge 1$ when parameterized by the feedback edge number. Previously, for this parameter, its fixed-parameter tractability for $i = 1$ was known [Betzler et al., DAM '12] but the existence of a polynomial kernel was open.
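For intuition, the following brute-force Python check (illustrative only, usable on tiny graphs) verifies whether a vertex set $D$ intersects every $K_{i,j}$; since a biclique's two sides are unordered and $i \le j$, it suffices to try the smaller side as the size-$i$ set.
\begin{verbatim}
from itertools import combinations

def contains_biclique(adj, i, j):
    # Does the graph (adjacency-set dict) contain K_{i,j} as a not
    # necessarily induced subgraph?  Brute force; tiny graphs only.
    for A in combinations(adj, i):
        common = [v for v in adj
                  if v not in A and all(v in adj[a] for a in A)]
        if len(common) >= j:
            return True
    return False

def is_deletion_set(adj, D, i, j):
    # Check that deleting D leaves a K_{i,j}-free graph.
    rem = {v: adj[v] - D for v in adj if v not in D}
    return not contains_biclique(rem, i, j)

# K_{2,3} with sides {a, b} and {x, y, z}: deleting a alone already
# destroys every (not necessarily induced) K_{2,2}.
G = {"a": {"x", "y", "z"}, "b": {"x", "y", "z"},
     "x": {"a", "b"}, "y": {"a", "b"}, "z": {"a", "b"}}
print(is_deletion_set(G, {"a"}, 2, 2))  # True
\end{verbatim}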
Given a graph $G=(V,E)$, a function $f:V\to \{0,1,2\}$ is said to be a \emph{Roman Dominating function} if for every $v\in V$ with $f(v)=0$, there exists a vertex $u\in N(v)$ such that $f(u)=2$. A Roman Dominating function $f$ is said to be an \emph{Independent Roman Dominating function} (or IRDF) if $V_1\cup V_2$ forms an independent set, where $V_i=\{v\in V~\vert~f(v)=i\}$ for $i\in \{0,1,2\}$. The total weight of $f$ is equal to $\sum_{v\in V} f(v)$ and is denoted as $w(f)$. The \emph{Independent Roman Domination Number} of $G$, denoted by $i_R(G)$, is defined as $\min\{w(f)~\vert~f \text{ is an IRDF of } G\}$. For a given graph $G$, the problem of computing $i_R(G)$ is defined as the \emph{Minimum Independent Roman Domination problem}. The problem is already known to be NP-hard for bipartite graphs. In this paper, we further study the algorithmic complexity of the problem and propose polynomial-time algorithms that solve the Minimum Independent Roman Domination problem for distance-hereditary graphs, split graphs, and $P_4$-sparse graphs.
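As a small worked example (ours, not from the paper): on the star $K_{1,4}$, labeling the center $2$ and every leaf $0$ is an IRDF of weight $2$, so $i_R(K_{1,4}) = 2$. The following Python sketch verifies the definition directly.
\begin{verbatim}
def is_irdf(adj, f):
    # f: dict vertex -> {0,1,2}; adj: adjacency-set dict.
    # Roman domination: every 0-vertex has a neighbor labeled 2.
    for v, fv in f.items():
        if fv == 0 and not any(f[u] == 2 for u in adj[v]):
            return False
    # Independence of V_1 union V_2.
    pos = [v for v in f if f[v] >= 1]
    return all(u not in adj[v] for v in pos for u in pos if u != v)

def weight(f):
    return sum(f.values())

star = {"c": {1, 2, 3, 4}, 1: {"c"}, 2: {"c"}, 3: {"c"}, 4: {"c"}}
f = {"c": 2, 1: 0, 2: 0, 3: 0, 4: 0}
print(is_irdf(star, f), weight(f))  # True 2
\end{verbatim}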
We study a general factor analysis framework where the $n$-by-$p$ data matrix is assumed to follow a general exponential family distribution entry-wise. While this model framework has been proposed before, we here further relax its distributional assumption by using a quasi-likelihood setup. By parameterizing the mean-variance relationship on data entries, we additionally introduce a dispersion parameter and entry-wise weights to model large variations and missing values. The resulting model is thus not only robust to distribution misspecification but also more flexible and able to capture non-Gaussian covariance structures of the data matrix. Our main focus is on efficient computational approaches to perform the factor analysis. Previous modeling frameworks rely on simulated maximum likelihood (SML) to find the factorization solution, but this method was shown to lead to asymptotic bias when the simulated sample size grows slower than the square root of the sample size $n$, which limits its practical use for data matrices with large $n$. Borrowing from expectation-maximization (EM) and stochastic gradient descent (SGD), we investigate three estimation procedures based on iterative factorization updates. Our proposed estimators exhibit no such asymptotic bias and scale even better for large matrix factorizations, with error $O(1/p)$. To support our findings, we conduct simulation experiments and discuss their application in three case studies.
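For illustration, here is a generic Python sketch of one gradient-based iterative factorization update of the kind the paper builds on; the log link, the Poisson-type quasi-likelihood, the clipping, and all variable names are our illustrative assumptions, not the authors' exact estimator or tuning.
\begin{verbatim}
import numpy as np

def quasi_factor_step(Y, W, U, V, lr=1e-4):
    # One full-batch gradient step for a factor model with log link:
    # E[Y_ij] = exp((U V^T)_ij), with entry-wise weights W (0 marks
    # a missing entry).  Illustrative sketch only.
    M = np.clip(U @ V.T, -8.0, 8.0)  # clip natural params for stability
    R = W * (np.exp(M) - Y)          # weighted quasi-score residuals
    return U - lr * (R @ V), V - lr * (R.T @ U)

rng = np.random.default_rng(0)
n, p, q = 200, 50, 3
U0 = rng.normal(size=(n, q)) / 3
V0 = rng.normal(size=(p, q)) / 3
Y = rng.poisson(np.exp(U0 @ V0.T))           # synthetic count data
W = (rng.uniform(size=Y.shape) > 0.1) * 1.0  # ~10% entries missing
U = rng.normal(size=(n, q)) * 0.01
V = rng.normal(size=(p, q)) * 0.01
for _ in range(1000):
    U, V = quasi_factor_step(Y, W, U, V)
\end{verbatim}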
Hashing functions, which produce short, unpredictable digests of an input message, are the primary cryptographic primitives used in blockchain networks. Hashing is employed in blockchain networks to create linked block lists, which offer safe and secure distributed repository storage for critical information. Due to the nature of the hash-search problem in blockchain networks, the computations can be parallelized to a large extent. This technical report presents a performance evaluation of three popular hashing algorithms: Blake3, SHA-256, and SHA-512. These hashing algorithms are widely used in various applications, such as digital signatures, message authentication, and password storage. It then discusses the performance metrics used to evaluate the algorithms, such as hash rate/throughput and memory usage. The evaluation is conducted on a range of hardware platforms, including desktops and virtual machines, using synthetic benchmarks. The results of the evaluation show that Blake3 generally outperforms both SHA-256 and SHA-512 in terms of throughput and latency. However, the performance advantage of Blake3 varies depending on the specific hardware platform and the size of the input data. The report concludes with recommendations for selecting the most suitable hashing algorithm for a given application, based on its performance requirements and security needs. The evaluation results can also inform future research and development efforts to improve the performance and security of hashing algorithms.
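A minimal Python sketch of the throughput-measurement methodology (assuming the third-party \texttt{blake3} package; the report's actual numbers come from fuller benchmark runs):
\begin{verbatim}
import hashlib, os, time

def throughput_mib_s(hash_fn, size_mib=64, runs=5):
    # Rough single-threaded throughput of hash_fn in MiB/s,
    # taking the best of several runs to reduce noise.
    data = os.urandom(size_mib * 1024 * 1024)
    best = float("inf")
    for _ in range(runs):
        t0 = time.perf_counter()
        hash_fn(data).hexdigest()
        best = min(best, time.perf_counter() - t0)
    return size_mib / best

print("SHA-256:", round(throughput_mib_s(hashlib.sha256)), "MiB/s")
print("SHA-512:", round(throughput_mib_s(hashlib.sha512)), "MiB/s")
try:
    from blake3 import blake3   # pip install blake3
    print("Blake3: ", round(throughput_mib_s(blake3)), "MiB/s")
except ImportError:
    pass
\end{verbatim}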
This paper addresses the problem of finding a minimum-cost $m$-state Markov chain $(S_0,\ldots,S_{m-1})$ in a large set of chains. The chains studied have a reward associated with each state. The cost of a chain is its "gain", i.e., its average reward under its stationary distribution. Specifically, for each $k=0,\ldots,m-1$ there is a known set ${\mathbb S}_k$ of type-$k$ states. A permissible Markov chain contains exactly one state of each type; the problem is to find a minimum-cost permissible chain. The original motivation was to find a cheapest binary AIFV-$m$ lossless code on a source alphabet of size $n$. Such a code is an $m$-tuple of trees, in which each tree can be viewed as a Markov chain state. This formulation was later used to address other problems in lossless compression. The previously known techniques for finding minimum-cost Markov chains were iterative and ran in exponential time. This paper shows how to map every possible type-$k$ state into a type-$k$ hyperplane and then define a "Markov Chain Polytope" as the lower envelope of all such hyperplanes. Finding a minimum-cost Markov chain is then shown to be equivalent to finding a "highest" point on this polytope. The local optimization procedures used in the previous iterative algorithms are shown to be separation oracles for this polytope. Since these oracles often run in polynomial time, an application of the Ellipsoid method immediately leads to polynomial-time algorithms for these problems.
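For concreteness, the gain of any fixed chain is easy to compute; the difficulty addressed by the paper is searching over the exponentially many permissible chains. A small illustrative computation (ours, not the paper's algorithm):
\begin{verbatim}
import numpy as np

def gain(P, rewards):
    # Average reward of an ergodic Markov chain under its stationary
    # distribution: solve pi P = pi, sum(pi) = 1, return pi . rewards.
    m = P.shape[0]
    A = np.vstack([P.T - np.eye(m), np.ones(m)])
    b = np.append(np.zeros(m), 1.0)
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return float(pi @ rewards)

# Two-state chain: the stationary distribution is (0.6, 0.4), so the
# gain is 0.6 * 1.0 + 0.4 * 3.0 = 1.8.
P = np.array([[0.8, 0.2],
              [0.3, 0.7]])
print(gain(P, np.array([1.0, 3.0])))  # 1.8
\end{verbatim}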
We consider the k-outconnected directed Steiner tree problem (k-DST). Given a directed edge-weighted graph $G=(V,E,w)$, where $V=\{r\}\cup S \cup T$, and an integer $k$, the goal is to find a minimum-cost subgraph of $G$ in which there are $k$ edge-disjoint $rt$-paths for every terminal $t\in T$. The problem is known to be NP-hard. Furthermore, the question of whether a polynomial-time algorithm with a subpolynomial approximation ratio exists for $k$-DST was answered negatively by Grandoni et al. (2018), who proved an approximation hardness of $\Omega (|T|/\log |T|)$ under $NP\neq ZPP$. Inspired by modern-day applications, we focus on developing efficient algorithms for $k$-DST in graphs where terminals have out-degree $0$ and, furthermore, constitute the vast majority of the vertices in the graph. We provide the first approximation algorithm for $k$-DST on such graphs, in which the approximation ratio depends (primarily) on the size of $S$. We present a randomized algorithm that finds a solution of weight at most $\mathcal{O}(k|S|\log |T|)$ times the optimal weight and, with high probability, runs in polynomial time.
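As a sanity check on the feasibility condition (not the approximation algorithm itself), whether a candidate subgraph contains $k$ edge-disjoint $rt$-paths reduces to a unit-capacity maximum-flow computation, as in this Python sketch:
\begin{verbatim}
from collections import deque

def edge_disjoint_paths(edges, s, t, n):
    # Count edge-disjoint s->t paths in a digraph on vertices 0..n-1
    # via Edmonds-Karp max flow with unit capacities (sketch).
    cap, adj = {}, [[] for _ in range(n)]
    for u, v in edges:
        if (u, v) not in cap:
            adj[u].append(v)
            adj[v].append(u)          # residual arc
            cap[(u, v)] = 0
            cap.setdefault((v, u), 0)
        cap[(u, v)] += 1
    flow = 0
    while True:
        parent, q = {s: None}, deque([s])
        while q and t not in parent:  # BFS for an augmenting path
            u = q.popleft()
            for v in adj[u]:
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return flow
        v = t
        while parent[v] is not None:  # push one unit along the path
            u = parent[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
        flow += 1

# r = 0, terminal t = 3: two edge-disjoint 0->3 paths exist.
E = [(0, 1), (1, 3), (0, 2), (2, 3)]
print(edge_disjoint_paths(E, 0, 3, 4))  # 2
\end{verbatim}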
For a non-decreasing sequence of integers $S=(a_1,a_2, \dots, a_k)$, an $S$-packing coloring of $G$ is a partition of $V(G)$ into $k$ subsets $V_1,V_2,\dots,V_k$ such that the distance between any two distinct vertices $x,y \in V_i$ is at least $a_{i}+1$, for $1\leq i\leq k$. We consider the $S$-packing coloring problem on subclasses of subcubic graphs: For $0\le i\le 3$, a subcubic graph $G$ is said to be $i$-saturated if every vertex of degree 3 is adjacent to at most $i$ vertices of degree 3. Furthermore, a vertex of degree 3 in a subcubic graph is called heavy if all its three neighbors are of degree 3, and $G$ is said to be $(3,i)$-saturated if every heavy vertex is adjacent to at most $i$ heavy vertices. We prove that every 1-saturated subcubic graph is $(1,1,3,3)$-packing colorable and $(1,2,2,2,2)$-packing colorable. We also prove that every $(3,0)$-saturated subcubic graph is $(1,2,2,2,2,2)$-packing colorable.
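The definition can be checked mechanically; the following Python sketch (illustrative only) verifies an $S$-packing coloring via BFS distances, e.g., a coloring of the path $P_4$ witnessing that it is $(1,2,3)$-packing colorable.
\begin{verbatim}
from collections import deque
from itertools import combinations

def bfs_dist(adj, src):
    # Hop distances from src in a graph given as an adjacency-set dict.
    dist, q = {src: 0}, deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def is_S_packing(adj, S, parts):
    # parts[i] must be an a_i-packing: vertices in the same class V_i
    # are pairwise at distance at least S[i] + 1.
    for a_i, Vi in zip(S, parts):
        for x, y in combinations(Vi, 2):
            if bfs_dist(adj, x).get(y, float("inf")) <= a_i:
                return False
    return True

# The path P4 (0-1-2-3) is (1,2,3)-packing colorable:
P4 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
print(is_S_packing(P4, (1, 2, 3), [{0, 2}, {1}, {3}]))  # True
\end{verbatim}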
This work uniquely identifies and characterizes four prevalent multimodal model architectural patterns in the contemporary multimodal landscape. Systematically categorizing models by architecture type facilitates monitoring of developments in the multimodal domain. Distinct from recent survey papers that present general information on multimodal architectures, this research conducts a comprehensive exploration of architectural details and identifies four specific architectural types. The types are distinguished by their respective methodologies for integrating multimodal inputs into the deep neural network model. The first two types (Type-A and Type-B) deeply fuse multimodal inputs within the internal layers of the model, whereas the latter two (Type-C and Type-D) facilitate early fusion at the input stage. Type-A employs standard cross-attention, whereas Type-B utilizes custom-designed layers for modality fusion within the internal layers. On the other hand, Type-C utilizes modality-specific encoders, while Type-D leverages tokenizers to process the modalities at the model's input stage. The identified architecture types aid the monitoring of any-to-any multimodal model development. Notably, Type-C and Type-D are currently favored in the construction of any-to-any multimodal models. Type-C, distinguished by its non-tokenizing multimodal model architecture, is emerging as a viable alternative to Type-D, which utilizes input-tokenizing techniques. To assist in model selection, this work highlights the advantages and disadvantages of each architecture type based on data and compute requirements, architecture complexity, scalability, ease of adding modalities, training objectives, and any-to-any multimodal generation capability.
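As a rough illustration of the Type-A pattern, the following PyTorch sketch fuses an image stream into a text stream with standard cross-attention inside an internal layer; the module and all names are our own, and no specific surveyed model is implied.
\begin{verbatim}
import torch
import torch.nn as nn

class CrossAttentionFusionBlock(nn.Module):
    # Minimal Type-A-style block: text hidden states attend to image
    # features via standard cross-attention (illustrative sketch only).
    def __init__(self, d_model=512, n_heads=8):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads,
                                                batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, text_h, image_feats):
        # Queries come from the text stream; keys/values from the image.
        fused, _ = self.cross_attn(text_h, image_feats, image_feats)
        return self.norm(text_h + fused)  # residual connection

block = CrossAttentionFusionBlock()
text_h = torch.randn(2, 16, 512)       # (batch, text tokens, dim)
image_feats = torch.randn(2, 49, 512)  # (batch, image patches, dim)
print(block(text_h, image_feats).shape)  # torch.Size([2, 16, 512])
\end{verbatim}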
Neural machine translation (NMT) is a deep learning based approach to machine translation, which yields state-of-the-art translation performance in scenarios where large-scale parallel corpora are available. Although high-quality, domain-specific translation is crucial in the real world, domain-specific corpora are usually scarce or nonexistent, and thus vanilla NMT performs poorly in such scenarios. Domain adaptation, which leverages both out-of-domain parallel corpora and in-domain monolingual corpora for in-domain translation, is therefore very important for domain-specific translation. In this paper, we give a comprehensive survey of state-of-the-art domain adaptation techniques for NMT.