This work considers a discrete-time Poisson noise channel with an input amplitude constraint $\mathsf{A}$ and a dark current parameter $\lambda$. It is known that the capacity-achieving distribution for this channel is discrete with finitely many mass points. Recently, for $\lambda=0$, a lower bound of order $\sqrt{\mathsf{A}}$ and an upper bound of order $\mathsf{A} \log^2(\mathsf{A})$ were demonstrated on the cardinality of the support of the optimal input distribution. In this work, we improve these results in several ways. First, we provide upper and lower bounds that hold for non-zero dark current. Second, we produce a sharper upper bound with a far simpler technique. In particular, for $\lambda=0$, we sharpen the upper bound from the order of $\mathsf{A} \log^2(\mathsf{A})$ to the order of $\mathsf{A}$. Finally, we provide additional information about the location of the support.
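As a concrete illustration of the channel model (not of the capacity-achieving input, whose support size is the subject of the bounds above), the following sketch simulates the Poisson channel $Y \sim \mathrm{Poisson}(X+\lambda)$ with an amplitude-constrained discrete input; the support points and probabilities below are arbitrary placeholders:

```python
import numpy as np

# Discrete-time Poisson channel: Y ~ Poisson(X + lam), input X in [0, A].
# The support points and probabilities are illustrative placeholders,
# not the capacity-achieving distribution.
rng = np.random.default_rng(0)
A, lam = 10.0, 1.0
support = np.array([0.0, 3.0, A])        # hypothetical discrete input support
probs = np.array([0.5, 0.2, 0.3])

x = rng.choice(support, size=100_000, p=probs)
y = rng.poisson(x + lam)

# Sanity check: E[Y] = E[X] + lam for this channel.
print(y.mean(), x.mean() + lam)
```

Estimating the mutual information of such a discrete input against the channel is then a finite sum over the support, which is what makes small support cardinality computationally attractive.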
We study dynamic $(1-\epsilon)$-approximate rounding of fractional matchings -- a key ingredient in numerous breakthroughs in the dynamic graph algorithms literature. Our first contribution is a surprisingly simple deterministic rounding algorithm in bipartite graphs with amortized update time $O(\epsilon^{-1} \log^2 (\epsilon^{-1} \cdot n))$, matching an (unconditional) recourse lower bound of $\Omega(\epsilon^{-1})$ up to logarithmic factors. Moreover, this algorithm's update time improves provided the minimum (non-zero) weight in the fractional matching is lower bounded throughout. Combining this algorithm with novel dynamic \emph{partial rounding} algorithms to increase this minimum weight, we obtain several algorithms that improve this dependence on $n$. For example, we give a high-probability randomized algorithm with $\tilde{O}(\epsilon^{-1}\cdot (\log\log n)^2)$-update time against adaptive adversaries. (We use Soft-Oh notation, $\tilde{O}$, to suppress polylogarithmic factors in the argument, i.e., $\tilde{O}(f)=O(f\cdot \mathrm{poly}(\log f))$.) Using our rounding algorithms, we also round known $(1-\epsilon)$-decremental fractional bipartite matching algorithms with no asymptotic overhead, thus improving on state-of-the-art algorithms for the decremental bipartite matching problem. Further, we provide extensions of our results to general graphs and to maintaining almost-maximal matchings.
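For intuition on what rounding a fractional matching means (a toy static example, not the dynamic algorithm described above), consider a fractional bipartite matching supported on an even cycle with all weights $1/2$; alternately rounding edges up and down along the cycle preserves the matching value exactly:

```python
from collections import Counter

# Toy fractional bipartite matching on a 4-cycle, all weights 1/2.
edges = [("u0", "v0"), ("v0", "u1"), ("u1", "v1"), ("v1", "u0")]
frac = {e: 0.5 for e in edges}

# Round alternate edges of the cycle up to 1 and down to 0.
integral = {e: (1 if i % 2 == 0 else 0) for i, e in enumerate(edges)}

# Each vertex is matched at most once, and the value is preserved.
deg = Counter()
for (a, b), w in integral.items():
    deg[a] += w
    deg[b] += w
print(max(deg.values()), sum(frac.values()), sum(integral.values()))
```

The dynamic setting is harder precisely because such cycle/path decompositions must be maintained under updates with low recourse, which is what the partial-rounding machinery above addresses.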
Martin-L\"{o}f type theory $\mathbf{MLTT}$ was extended by Setzer with the so-called Mahlo universe types. This extension, called $\mathbf{MLM}$, was introduced to develop a variant of $\mathbf{MLTT}$ equipped with an analogue of a large cardinal. Another instance of a constructive system extended with an analogue of a large set was formulated in the context of Aczel's constructive set theory $\mathbf{CZF}$: Rathjen, Griffor and Palmgren extended $\mathbf{CZF}$ with inaccessible sets of all transfinite orders. It is unknown whether this extension of $\mathbf{CZF}$ is directly interpretable by Mahlo universes. In particular, it is not well understood how to construct the transfinite hierarchy of inaccessible sets using the reflection property of the Mahlo universe in $\mathbf{MLM}$. We extend $\mathbf{MLM}$ with the accessibility predicate and show that the above extension of $\mathbf{CZF}$ is directly interpretable in the resulting theory by means of this predicate.
In the online facility assignment problem on a line (OFAL) with a set $S$ of $k$ servers and a capacity function $c:S\to\mathbb{N}$, each server $s\in S$ with capacity $c(s)$ is placed on a line, and requests arrive on the line one by one. The task of an online algorithm is to irrevocably assign the current request to a server with vacancies before the next request arrives. An algorithm can assign at most $c(s)$ requests to each server $s\in S$. In this paper, we show that the competitive ratio of the permutation algorithm is at least $k+1$ for OFAL when the servers are evenly placed on the line. This disproves the claim of Ahmed et al. that the permutation algorithm is $k$-competitive.
The crossing number of a graph $G$ is the minimum number of crossings in a drawing of $G$ in the plane. A rectilinear drawing of a graph $G$ represents the vertices of $G$ by a set of points in the plane and represents each edge of $G$ by a straight-line segment connecting its two endpoints. The rectilinear crossing number of $G$ is the minimum number of crossings in a rectilinear drawing of $G$. By the crossing lemma, the crossing number of an $n$-vertex graph $G$ can be $O(n)$ only if $|E(G)|\in O(n)$. Graphs of bounded genus and bounded degree (B\"{o}r\"{o}czky, Pach and T\'{o}th, 2006), and in fact all bounded-degree proper minor-closed families (Wood and Telle, 2007), have been shown to admit linear crossing number, with a tight $\Theta(\Delta n)$ bound shown by Dujmovi\'c, Kawarabayashi, Mohar and Wood, 2008. Much less is known about the rectilinear crossing number; it is not bounded by any function of the crossing number. A single-crossing graph is a graph whose crossing number is at most one. We prove that graphs that exclude a single-crossing graph as a minor have rectilinear crossing number $O(\Delta n)$; this dependence on $n$ and $\Delta$ is best possible. Thus the result applies to $K_5$-minor-free graphs, for example. It also applies to bounded-treewidth graphs, since each family of bounded-treewidth graphs excludes some fixed planar graph as a minor. Prior to our work, the only bounded-degree minor-closed families known to have linear rectilinear crossing number were bounded-degree graphs of bounded treewidth (Wood and Telle, 2007), as well as bounded-degree $K_{3,3}$-minor-free graphs (Dujmovi\'c, Kawarabayashi, Mohar and Wood, 2008). In the case of bounded-treewidth graphs, our $O(\Delta n)$ result is again tight and improves on the previous best known bound of $O(\Delta^2 n)$ by Wood and Telle, 2007 (obtained for convex geometric drawings).
We study the \emph{in-context learning} (ICL) ability of a \emph{Linear Transformer Block} (LTB) that combines a linear attention component and a linear multi-layer perceptron (MLP) component. For ICL of linear regression with a Gaussian prior and a \emph{non-zero mean}, we show that LTB can achieve nearly Bayes optimal ICL risk. In contrast, using only linear attention must incur an irreducible additive approximation error. Furthermore, we establish a correspondence between LTB and one-step gradient descent estimators with learnable initialization ($\mathsf{GD}\text{-}\mathbf{\beta}$), in the sense that every $\mathsf{GD}\text{-}\mathbf{\beta}$ estimator can be implemented by an LTB estimator and every optimal LTB estimator that minimizes the in-class ICL risk is effectively a $\mathsf{GD}\text{-}\mathbf{\beta}$ estimator. Finally, we show that $\mathsf{GD}\text{-}\mathbf{\beta}$ estimators can be efficiently optimized with gradient flow, despite a non-convex training objective. Our results reveal that LTB achieves ICL by implementing $\mathsf{GD}\text{-}\mathbf{\beta}$, and they highlight the role of MLP layers in reducing approximation error.
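To make the $\mathsf{GD}\text{-}\mathbf{\beta}$ family concrete, the sketch below implements a one-step gradient-descent estimator with a learnable initialization for in-context linear regression; the names `beta0` (learned initialization) and `eta` (step size), and the plain least-squares objective, are illustrative assumptions rather than the paper's exact parameterization:

```python
import numpy as np

def gd_beta_predict(X, y, x_query, beta0, eta):
    """One GD step on the in-context least-squares loss, then predict."""
    n = X.shape[0]
    grad = X.T @ (X @ beta0 - y) / n     # gradient of (1/2n) * ||X b - y||^2
    beta1 = beta0 - eta * grad
    return x_query @ beta1

rng = np.random.default_rng(0)
d, n = 3, 50
beta_star = rng.normal(size=d)           # task vector for one context
X = rng.normal(size=(n, d))
y = X @ beta_star                        # noiseless in-context examples
x_query = rng.normal(size=d)

# If the initialization already equals the task vector, the gradient
# vanishes and the prediction is exact; a learned beta0 near the prior
# mean is what lets one GD step be nearly Bayes optimal.
pred = gd_beta_predict(X, y, x_query, beta0=beta_star, eta=0.1)
print(pred, x_query @ beta_star)
```

The correspondence result above says that an optimal LTB effectively computes a prediction of exactly this form, with the MLP component supplying the non-zero initialization that pure linear attention cannot.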
We present a parallel algorithm for the $(1-\epsilon)$-approximate maximum flow problem in capacitated, undirected graphs with $n$ vertices and $m$ edges, achieving $O(\epsilon^{-3}\text{polylog} n)$ depth and $O(m \epsilon^{-3} \text{polylog} n)$ work in the PRAM model. Although near-linear time sequential algorithms for this problem have been known for almost a decade, no parallel algorithms that simultaneously achieved polylogarithmic depth and near-linear work were known. At the heart of our result is a polylogarithmic depth, near-linear work recursive algorithm for computing congestion approximators. Our algorithm involves a recursive step to obtain a low-quality congestion approximator followed by a "boosting" step to improve its quality which prevents a multiplicative blow-up in error. Similar to Peng [SODA'16], our boosting step builds upon the hierarchical decomposition scheme of R\"acke, Shah, and T\"aubig [SODA'14]. A direct implementation of this approach, however, leads only to an algorithm with $n^{o(1)}$ depth and $m^{1+o(1)}$ work. To get around this, we introduce a new hierarchical decomposition scheme, in which we only need to solve maximum flows on subgraphs obtained by contracting vertices, as opposed to vertex-induced subgraphs used in R\"acke, Shah, and T\"aubig [SODA'14]. In particular, we are able to directly extract congestion approximators for the subgraphs from a congestion approximator for the entire graph, thereby avoiding additional recursion on those subgraphs. Along the way, we also develop a parallel flow-decomposition algorithm that is crucial to achieving polylogarithmic depth and may be of independent interest.
Despite the possibility to quickly compute reachable sets of large-scale linear systems, current methods are not yet widely applied by practitioners. The main reason is probably that current approaches are not push-button capable and still require manually setting crucial parameters, such as time step sizes and the accuracy of the employed set representation -- settings that require expert knowledge. We present a generic framework to automatically find near-optimal parameters for reachability analysis of linear systems, given a user-defined accuracy. To keep the computational overhead as low as possible, our methods tune all relevant parameters during runtime. We evaluate our approach on benchmarks from the ARCH competition as well as on random examples. Our results show that our new framework verifies the selected benchmarks faster than manually tuned parameters and is an order of magnitude faster than genetic algorithms.
We show that for large enough $n$, the number of non-isomorphic pseudoline arrangements of order $n$ is greater than $2^{c\cdot n^2}$ for some constant $c > 0.2604$, improving the previous best bound of $c>0.2083$ by Dumitrescu and Mandal (2020). Arrangements of pseudolines (and in particular arrangements of lines) are important objects appearing in many forms in discrete and computational geometry. They have strong ties with, for example, oriented matroids, sorting networks and point configurations. Let $B_n$ be the number of non-isomorphic pseudoline arrangements of order $n$ and let $b_n := \log_2(B_n)$. The problem of estimating $b_n$ dates back to Knuth, who conjectured that $b_n \leq 0.5n^2 + o(n^2)$ and derived the first bounds $n^2/6-O(n) \leq b_n \leq 0.7924(n^2+n)$. Both the upper and the lower bound have since been improved several times. The upper bound was first improved to $b_n < 0.6988n^2$ (Felsner, 1997), and then to $b_n < 0.6571 n^2$ for large enough $n$ by Felsner and Valtr (2011). In the same paper, Felsner and Valtr improved the constant in the lower bound to $c> 0.1887$, which was subsequently improved by Dumitrescu and Mandal to $c>0.2083$. Our new bound is based on a construction which starts with one of the constructions of Dumitrescu and Mandal and breaks it into constant-sized pieces. We then use software to compute the contribution of each piece to the overall number of pseudoline arrangements. This method adds considerable flexibility to the construction and thus offers many avenues for future tweaks and improvements, which could lead to further tightening of the lower bound.
In this work, we analyze the conditions under which information about the context of an input $X$ can improve the predictions of deep learning models in new domains. Following work in marginal transfer learning in Domain Generalization (DG), we formalize the notion of context as a permutation-invariant representation of a set of data points that originate from the same domain as the input itself. We offer a theoretical analysis of the conditions under which this approach can, in principle, yield benefits, and formulate two necessary criteria that can be easily verified in practice. Additionally, we contribute insights into the kind of distribution shifts for which the marginal transfer learning approach promises robustness. Empirical analysis shows that our criteria are effective in discerning both favorable and unfavorable scenarios. Finally, we demonstrate that we can reliably detect scenarios where a model is tasked with unwarranted extrapolation in out-of-distribution (OOD) domains, identifying potential failure cases. Consequently, we showcase a method to select between the most predictive and the most robust model, circumventing the well-known trade-off between predictive performance and robustness.
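A minimal sketch of the marginal transfer learning setup described above: each input is augmented with a permutation-invariant summary of a context set drawn from the same domain (mean pooling here is an illustrative choice, not the paper's exact architecture):

```python
import numpy as np

def with_context(x, context_set):
    """Augment input x with an order-independent summary of its domain."""
    summary = context_set.mean(axis=0)   # permutation-invariant pooling
    return np.concatenate([x, summary])

rng = np.random.default_rng(0)
context = rng.normal(size=(32, 4))       # 32 points from the input's domain
x = rng.normal(size=4)

z1 = with_context(x, context)
z2 = with_context(x, context[rng.permutation(32)])  # shuffled context

# The augmented representation ignores the ordering of the context set.
print(np.allclose(z1, z2), z1.shape)
```

Any downstream predictor then operates on the augmented vector, which is what allows it to adapt its decision rule to the domain-level distribution shift that the context summary captures.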
We devise a data structure that can answer shortest path queries for two query points in a polygonal domain $P$ on $n$ vertices. For any $\varepsilon > 0$, the space complexity of the data structure is $O(n^{10+\varepsilon})$ and queries can be answered in $O(\log n)$ time. Alternatively, we can achieve a space complexity of $O(n^{9+\varepsilon})$ by relaxing the query time to $O(\log^2 n)$. This is the first improvement upon the conference paper by Chiang and Mitchell from 1999, who presented a data structure with $O(n^{11})$ space complexity and $O(\log n)$ query time. Our main result can be extended to include a space-time trade-off. Specifically, we devise data structures with $O(n^{9+\varepsilon}/\ell^{4 + O(\varepsilon)})$ space complexity and $O(\ell \log^2 n)$ query time, for any integer $1 \leq \ell \leq n$. Furthermore, we present improved data structures with $O(\log n)$ query time for the special case where we restrict one (or both) of the query points to lie on the boundary of $P$. When one of the query points is restricted to lie on the boundary and the other query point is unrestricted, the space complexity becomes $O(n^{6+\varepsilon})$. When both query points are on the boundary, the space complexity decreases further to $O(n^{4+\varepsilon})$, thereby improving an earlier result of Bae and Okamoto.