Various static analysis problems are reformulated as instances of the Context-Free Language Reachability (CFL-r) problem. One promising way to make solving CFL-r more practical for large-scale interprocedural graphs is to reduce CFL-r to linear algebra operations on sparse matrices, as they are efficiently executed on modern hardware. In this work, we present five optimizations for a matrix-based CFL-r algorithm that utilize the specific properties of both the underlying semiring and the widely-used linear algebra library SuiteSparse:GraphBlas. Our experimental results show that these optimizations result in orders of magnitude speedup, with the optimized matrix-based CFL-r algorithm consistently outperforming state-of-the-art CFL-r solvers across four considered static analyses.
We study the problem of differentially-private (DP) stochastic (convex-concave) saddle-points in the polyhedral setting. We propose $(\varepsilon, \delta)$-DP algorithms based on stochastic mirror descent that attain nearly dimension-independent convergence rates for the expected duality gap, a type of guarantee that was known before only for bilinear objectives. For convex-concave and first-order-smooth stochastic objectives, our algorithms attain a rate of $\sqrt{\log(d)/n} + (\log(d)^{3/2}/[n\varepsilon])^{1/3}$, where $d$ is the dimension of the problem and $n$ the dataset size. Under an additional second-order-smoothness assumption, we improve the rate on the expected gap to $\sqrt{\log(d)/n} + (\log(d)^{3/2}/[n\varepsilon])^{2/5}$. Under this additional assumption, we also show, by using bias-reduced gradient estimators, that the duality gap is bounded by $\log(d)/\sqrt{n} + \log(d)/[n\varepsilon]^{1/2}$ with constant success probability. This result provides evidence of the near-optimality of the approach. Finally, we show that combining our methods with acceleration techniques from online learning leads to the first algorithm for DP Stochastic Convex Optimization in the polyhedral setting that is not based on Frank-Wolfe methods. For convex and first-order-smooth stochastic objectives, our algorithms attain an excess risk of $\sqrt{\log(d)/n} + \log(d)^{7/10}/[n\varepsilon]^{2/5}$, and when additionally assuming second-order-smoothness, we improve the rate to $\sqrt{\log(d)/n} + \log(d)/\sqrt{n\varepsilon}$. Instrumental to all of these results are various extensions of the classical Maurey Sparsification Lemma, which may be of independent interest.
Automatic Text Summarization (ATS), utilizing Natural Language Processing (NLP) algorithms, aims to create concise and accurate summaries, thereby significantly reducing the human effort required in processing large volumes of text. ATS has drawn considerable interest in both academic and industrial circles. Many studies have been conducted in the past to survey ATS methods; however, they generally lack practicality for real-world implementations, as they often categorize previous methods from a theoretical standpoint. Moreover, the advent of Large Language Models (LLMs) has altered conventional ATS methods. In this survey, we aim to 1) provide a comprehensive overview of ATS from a ``Process-Oriented Schema'' perspective, which is best aligned with real-world implementations; 2) comprehensively review the latest LLM-based ATS works; and 3) deliver an up-to-date survey of ATS, bridging the two-year gap in the literature. To the best of our knowledge, this is the first survey to specifically investigate LLM-based ATS methods.
Efficiently pricing multi-asset options poses a significant challenge in quantitative finance. The Monte Carlo (MC) method remains the prevalent choice for pricing engines; however, its slow convergence rate impedes its practical application. Fourier methods leverage the knowledge of the characteristic function to accurately and rapidly value options with up to two assets. Nevertheless, they face hurdles in the high-dimensional settings due to the tensor product (TP) structure of commonly employed quadrature techniques. This work advocates using the randomized quasi-MC (RQMC) quadrature to improve the scalability of Fourier methods with high dimensions. The RQMC technique benefits from the smoothness of the integrand and alleviates the curse of dimensionality while providing practical error estimates. Nonetheless, the applicability of RQMC on the unbounded domain, $\mathbb{R}^d$, requires a domain transformation to $[0,1]^d$, which may result in singularities of the transformed integrand at the corners of the hypercube, and deteriorate the rate of convergence of RQMC. To circumvent this difficulty, we design an efficient domain transformation procedure based on the derived boundary growth conditions of the integrand. This transformation preserves the sufficient regularity of the integrand and hence improves the rate of convergence of RQMC. To validate this analysis, we demonstrate the efficiency of employing RQMC with an appropriate transformation to evaluate options in the Fourier space for various pricing models, payoffs, and dimensions. Finally, we highlight the computational advantage of applying RQMC over MC or TP in the Fourier domain, and over MC in the physical domain for options with up to 15 assets.
This paper introduces Bespoke Non-Stationary (BNS) Solvers, a solver distillation approach to improve sample efficiency of Diffusion and Flow models. BNS solvers are based on a family of non-stationary solvers that provably subsumes existing numerical ODE solvers and consequently demonstrate considerable improvement in sample approximation (PSNR) over these baselines. Compared to model distillation, BNS solvers benefit from a tiny parameter space ($<$200 parameters), fast optimization (two orders of magnitude faster), maintain diversity of samples, and in contrast to previous solver distillation approaches nearly close the gap from standard distillation methods such as Progressive Distillation in the low-medium NFE regime. For example, BNS solver achieves 45 PSNR / 1.76 FID using 16 NFE in class-conditional ImageNet-64. We experimented with BNS solvers for conditional image generation, text-to-image generation, and text-2-audio generation showing significant improvement in sample approximation (PSNR) in all.
Decentralized Congestion Control (DCC) mechanisms have been a core part of protocol stacks for vehicular networks since their inception and standardization. The ETSI ITS-G5 protocol stack for vehicular communications considers the usage of DCC not only in the network or access layers, but also as a part of the cross-layer architecture that influences how often messages are generated and transmitted. ETSI DCC mechanisms have evolved from a reactive approach based on a finite state machine, to an adaptive approach that relies on a linear control algorithm. This linear control algorithm, called LIMERIC, is the basis of the mechanism used in the ETSI DCC Adaptive Approach. The behavior of this algorithm depends on a set of parameters. Different values for these parameters have been proposed in the literature, including those defined in the ETSI specification. A recent proposal is Dual-$\alpha$, which chooses parameters to improve convergence and fairness when the algorithm has to react to fast changes in the use of the shared medium (transitory situations). This article evaluates, by means of simulations, the performance of the ETSI DCC Adaptive Approach and related algorithms, considering both steady state and transitory situations. Results show that a bad selection of parameters can make a DCC algorithm ineffective, that the ETSI DCC Adaptive algorithm performs well in steady state conditions, and that Dual-$\alpha$ performs as well in steady state conditions and outperforms the ETSI DCC Adaptive Approach in transitory scenarios.
Growth in computational materials science and initiatives such as the Materials Genome Initiative (MGI) and the European Materials Modelling Council (EMMC) has motivated the development and application of ontologies. A key factor has been increased adoption of the FAIR principles, making research data findable, accessible, interoperable, and reusable (Wilkinson et al. 2016). This paper characterizes semantic interoperability among a subset of materials science ontologies in the MatPortal repository. Background context covers semantic interoperability, ontological commitment, and the materials science ontology landscape. The research focused on MatPortal's two interoperability protocols: LOOM term matching and URI matching. Results report the degree of overlap and demonstrate the different types of ambiguity among ontologies. The discussion considers implications for FAIR and AI, and the conclusion highlight key findings and next steps.
This paper intends to apply the sample-average-approximation (SAA) scheme to solve a system of stochastic equations (SSE), which has many applications in a variety of fields. The SAA is an effective paradigm to address risks and uncertainty in stochastic models from the perspective of Monte Carlo principle. Nonetheless, a numerical conflict arises from the sample size of SAA when one has to make a tradeoff between the accuracy of solutions and the computational cost. To alleviate this issue, we incorporate a gradually reinforced SAA scheme into a differentiable homotopy method and develop a gradually reinforced sample-average-approximation (GRSAA) differentiable homotopy method in this paper. By introducing a series of continuously differentiable functions of the homotopy parameter $t$ ranging between zero and one, we establish a differentiable homotopy system, which is able to gradually increase the sample size of SAA as $t$ descends from one to zero. The set of solutions to the homotopy system contains an everywhere smooth path, which starts from an arbitrary point and ends at a solution to the SAA with any desired accuracy. The GRSAA differentiable homotopy method serves as a bridge to link the gradually reinforced SAA scheme and a differentiable homotopy method and retains the nice property of global convergence the homotopy method possesses while greatly reducing the computational cost for attaining a desired solution to the original SSE. Several numerical experiments further confirm the effectiveness and efficiency of the proposed method.
Explainable Artificial Intelligence (XAI) is transforming the field of Artificial Intelligence (AI) by enhancing the trust of end-users in machines. As the number of connected devices keeps on growing, the Internet of Things (IoT) market needs to be trustworthy for the end-users. However, existing literature still lacks a systematic and comprehensive survey work on the use of XAI for IoT. To bridge this lacking, in this paper, we address the XAI frameworks with a focus on their characteristics and support for IoT. We illustrate the widely-used XAI services for IoT applications, such as security enhancement, Internet of Medical Things (IoMT), Industrial IoT (IIoT), and Internet of City Things (IoCT). We also suggest the implementation choice of XAI models over IoT systems in these applications with appropriate examples and summarize the key inferences for future works. Moreover, we present the cutting-edge development in edge XAI structures and the support of sixth-generation (6G) communication services for IoT applications, along with key inferences. In a nutshell, this paper constitutes the first holistic compilation on the development of XAI-based frameworks tailored for the demands of future IoT use cases.
The problem of Multiple Object Tracking (MOT) consists in following the trajectory of different objects in a sequence, usually a video. In recent years, with the rise of Deep Learning, the algorithms that provide a solution to this problem have benefited from the representational power of deep models. This paper provides a comprehensive survey on works that employ Deep Learning models to solve the task of MOT on single-camera videos. Four main steps in MOT algorithms are identified, and an in-depth review of how Deep Learning was employed in each one of these stages is presented. A complete experimental comparison of the presented works on the three MOTChallenge datasets is also provided, identifying a number of similarities among the top-performing methods and presenting some possible future research directions.
Multi-relation Question Answering is a challenging task, due to the requirement of elaborated analysis on questions and reasoning over multiple fact triples in knowledge base. In this paper, we present a novel model called Interpretable Reasoning Network that employs an interpretable, hop-by-hop reasoning process for question answering. The model dynamically decides which part of an input question should be analyzed at each hop; predicts a relation that corresponds to the current parsed results; utilizes the predicted relation to update the question representation and the state of the reasoning process; and then drives the next-hop reasoning. Experiments show that our model yields state-of-the-art results on two datasets. More interestingly, the model can offer traceable and observable intermediate predictions for reasoning analysis and failure diagnosis.