In [Sau11,SPW13], Saunderson, Parrilo and Willsky asked the following elegant geometric question: what is the largest $m= m(d)$ such that there is an ellipsoid in $\mathbb{R}^d$ that passes through $v_1, v_2, \ldots, v_m$ with high probability when the $v_i$s are chosen independently from the standard Gaussian distribution $N(0,I_{d})$. The existence of such an ellipsoid is equivalent to the existence of a positive semidefinite matrix $X$ such that $v_i^{\top}X v_i =1$ for every $1 \leq i \leq m$ - a natural example of a random semidefinite program. SPW conjectured that $m= (1-o(1)) d^2/4$ with high probability. Very recently, Potechin, Turner, Venkat and Wein and Kane and Diakonikolas proved that $m \geq d^2/\log^{O(1)}(d)$ via certain explicit constructions. In this work, we give a substantially tighter analysis of their construction to prove that $m \geq d^2/C$ for an absolute constant $C>0$. This resolves one direction of the SPW conjecture up to a constant. Our analysis proceeds via the method of Graphical Matrix Decomposition that has recently been used to analyze correlated random matrices arising in various areas [BHK+19]. Our key new technical tool is a refined method to prove singular value upper bounds on certain correlated random matrices that are tight up to absolute dimension-independent constants. In contrast, all previous methods that analyze such matrices lose logarithmic factors in the dimension.
Given $k$ input graphs $G_1, \dots ,G_k$, where each pair $G_i$, $G_j$ with $i \neq j$ shares the same graph $G$, the problem Simultaneous Embedding With Fixed Edges (SEFE) asks whether there exists a planar drawing for each input graph such that all drawings coincide on $G$. While SEFE is still open for the case of two input graphs, the problem is NP-complete for $k \geq 3$ [Schaefer, JGAA 13]. In this work, we explore the parameterized complexity of SEFE. We show that SEFE is FPT with respect to $k$ plus the vertex cover number or the feedback edge set number of the the union graph $G^\cup = G_1 \cup \dots \cup G_k$. Regarding the shared graph $G$, we show that SEFE is NP-complete, even if $G$ is a tree with maximum degree 4. Together with a known NP-hardness reduction [Angelini et al., TCS 15], this allows us to conclude that several parameters of $G$, including the maximum degree, the maximum number of degree-1 neighbors, the vertex cover number, and the number of cutvertices are intractable. We also settle the tractability of all pairs of these parameters. We give FPT algorithms for the vertex cover number plus either of the first two parameters and for the number of cutvertices plus the maximum degree, whereas we prove all remaining combinations to be intractable.
We investigate the decidability of the ${0,\infty}$ fragment of Timed Propositional Temporal Logic (TPTL). We show that the satisfiability checking of TPTL$^{0,\infty}$ is PSPACE-complete. Moreover, even its 1-variable fragment (1-TPTL$^{0,\infty}$) is strictly more expressive than Metric Interval Temporal Logic (MITL) for which satisfiability checking is EXPSPACE complete. Hence, we have a strictly more expressive logic with computationally easier satisfiability checking. To the best of our knowledge, TPTL$^{0,\infty}$ is the first multi-variable fragment of TPTL for which satisfiability checking is decidable without imposing any bounds/restrictions on the timed words (e.g. bounded variability, bounded time, etc.). The membership in PSPACE is obtained by a reduction to the emptiness checking problem for a new "non-punctual" subclass of Alternating Timed Automata with multiple clocks called Unilateral Very Weak Alternating Timed Automata (VWATA$^{0,\infty}$) which we prove to be in PSPACE. We show this by constructing a simulation equivalent non-deterministic timed automata whose number of clocks is polynomial in the size of the given VWATA$^{0,\infty}$.
Topic modeling and text mining are subsets of Natural Language Processing with relevance for conducting meta-analysis (MA) and systematic review (SR). For evidence synthesis, the above NLP methods are conventionally used for topic-specific literature searches or extracting values from reports to automate essential phases of SR and MA. Instead, this work proposes a comparative topic modeling approach to analyze reports of contradictory results on the same general research question. Specifically, the objective is to find topics exhibiting distinct associations with significant results for an outcome of interest by ranking them according to their proportional occurrence and consistency of distribution across reports of significant results. The proposed method was tested on broad-scope studies addressing whether supplemental nutritional compounds significantly benefit macular degeneration (MD). Eight compounds were identified as having a particular association with reports of significant results for benefitting MD. Six of these were further supported in terms of effectiveness upon conducting a follow-up literature search for validation (omega-3 fatty acids, copper, zeaxanthin, lutein, zinc, and nitrates). The two not supported by the follow-up literature search (niacin and molybdenum) also had the lowest scores under the proposed methods ranking system, suggesting that the proposed method's score for a given topic is a viable proxy for its degree of association with the outcome of interest. These results underpin the proposed methods potential to add specificity in understanding effects from broad-scope reports, elucidate topics of interest for future research, and guide evidence synthesis in a systematic and scalable way.
In 2013, Pak and Panova proved the strict unimodality property of $q$-binomial coefficients $\binom{\ell+m}{m}_q$ (as polynomials in $q$) based on the combinatorics of Young tableaux and the semigroup property of Kronecker coefficients. They showed it to be true for all $\ell,m\geq 8$ and a few other cases. We propose a different approach to this problem based on computer algebra, where we establish a closed form for the coefficients of these polynomials and then use cylindrical algebraic decomposition to identify exactly the range of coefficients where strict unimodality holds. This strategy allows us to tackle generalizations of the problem, e.g., to show unimodality with larger gaps or unimodality of related sequences. In particular, we present proofs of two additional cases of a conjecture by Stanley and Zanello.
We provide a framework to prove convergence rates for discretizations of kinetic Langevin dynamics for $M$-$\nabla$Lipschitz $m$-log-concave densities. Our approach provides convergence rates of $\mathcal{O}(m/M)$, with explicit stepsize restrictions, which are of the same order as the stability threshold for Gaussian targets and are valid for a large interval of the friction parameter. We apply this methodology to various integration methods which are popular in the molecular dynamics and machine learning communities. Finally we introduce the property ``$\gamma$-limit convergent" (GLC) to characterise underdamped Langevin schemes that converge to overdamped dynamics in the high friction limit and which have stepsize restrictions that are independent of the friction parameter; we show that this property is not generic by exhibiting methods from both the class and its complement.
Open-World Object Detection (OWOD) extends object detection problem to a realistic and dynamic scenario, where a detection model is required to be capable of detecting both known and unknown objects and incrementally learning newly introduced knowledge. Current OWOD models, such as ORE and OW-DETR, focus on pseudo-labeling regions with high objectness scores as unknowns, whose performance relies heavily on the supervision of known objects. While they can detect the unknowns that exhibit similar features to the known objects, they suffer from a severe label bias problem that they tend to detect all regions (including unknown object regions) that are dissimilar to the known objects as part of the background. To eliminate the label bias, this paper proposes a novel approach that learns an unsupervised discriminative model to recognize true unknown objects from raw pseudo labels generated by unsupervised region proposal methods. The resulting model can be further refined by a classification-free self-training method which iteratively extends pseudo unknown objects to the unlabeled regions. Experimental results show that our method 1) significantly outperforms the prior SOTA in detecting unknown objects while maintaining competitive performance of detecting known object classes on the MS COCO dataset, and 2) achieves better generalization ability on the LVIS and Objects365 datasets.
The Fisher-Kolmogorov equation is a diffusion-reaction PDE that is used to model the accumulation of prionic proteins, which are responsible for many different neurological disorders. Likely, the most important and studied misfolded protein in literature is the Amyloid-$\beta$, responsible for the onset of Alzheimer disease. Starting from medical images we construct a reduced-order model based on a graph brain connectome. The reaction coefficient of the proteins is modelled as a stochastic random field, taking into account all the many different underlying physical processes, which can hardly be measured. Its probability distribution is inferred by means of the Monte Carlo Markov Chain method applied to clinical data. The resulting model is patient-specific and can be employed for predicting the disease's future development. Forward uncertainty quantification techniques (Monte Carlo and sparse grid stochastic collocation) are applied with the aim of quantifying the impact of the variability of the reaction coefficient on the progression of protein accumulation within the next 20 years.
Deep Learning (DL) is vulnerable to out-of-distribution and adversarial examples resulting in incorrect outputs. To make DL more robust, several posthoc anomaly detection techniques to detect (and discard) these anomalous samples have been proposed in the recent past. This survey tries to provide a structured and comprehensive overview of the research on anomaly detection for DL based applications. We provide a taxonomy for existing techniques based on their underlying assumptions and adopted approaches. We discuss various techniques in each of the categories and provide the relative strengths and weaknesses of the approaches. Our goal in this survey is to provide an easier yet better understanding of the techniques belonging to different categories in which research has been done on this topic. Finally, we highlight the unsolved research challenges while applying anomaly detection techniques in DL systems and present some high-impact future research directions.
It is important to detect anomalous inputs when deploying machine learning systems. The use of larger and more complex inputs in deep learning magnifies the difficulty of distinguishing between anomalous and in-distribution examples. At the same time, diverse image and text data are available in enormous quantities. We propose leveraging these data to improve deep anomaly detection by training anomaly detectors against an auxiliary dataset of outliers, an approach we call Outlier Exposure (OE). This enables anomaly detectors to generalize and detect unseen anomalies. In extensive experiments on natural language processing and small- and large-scale vision tasks, we find that Outlier Exposure significantly improves detection performance. We also observe that cutting-edge generative models trained on CIFAR-10 may assign higher likelihoods to SVHN images than to CIFAR-10 images; we use OE to mitigate this issue. We also analyze the flexibility and robustness of Outlier Exposure, and identify characteristics of the auxiliary dataset that improve performance.
Dynamic programming (DP) solves a variety of structured combinatorial problems by iteratively breaking them down into smaller subproblems. In spite of their versatility, DP algorithms are usually non-differentiable, which hampers their use as a layer in neural networks trained by backpropagation. To address this issue, we propose to smooth the max operator in the dynamic programming recursion, using a strongly convex regularizer. This allows to relax both the optimal value and solution of the original combinatorial problem, and turns a broad class of DP algorithms into differentiable operators. Theoretically, we provide a new probabilistic perspective on backpropagating through these DP operators, and relate them to inference in graphical models. We derive two particular instantiations of our framework, a smoothed Viterbi algorithm for sequence prediction and a smoothed DTW algorithm for time-series alignment. We showcase these instantiations on two structured prediction tasks and on structured and sparse attention for neural machine translation.