To handle the complexities of irregular and incomplete time series data, we propose an invertible method based on Neural Differential Equations (NDEs). While NDE-based methods are powerful tools for analyzing irregularly sampled time series, they typically do not guarantee reversible transformations in their standard form. Our method is a variation of Neural Controlled Differential Equations (Neural CDEs) with Neural Flow, which ensures invertibility while maintaining a lower computational burden. Additionally, it enables the training of a dual latent space, enhancing the modeling of temporal dynamics. Our research presents an advanced framework that excels in both classification and interpolation tasks. At the core of our approach is an enhanced dual latent state architecture, carefully designed for high precision across various time series tasks. Empirical analysis demonstrates that our method significantly outperforms existing models. This work advances irregular time series analysis, introducing innovative techniques and offering a versatile tool for diverse practical applications.
We examine the possibility of approximating Maximum Vertex-Disjoint Shortest Paths. In this problem, the input is an edge-weighted (directed or undirected) $n$-vertex graph $G$ along with $k$ terminal pairs $(s_1,t_1),(s_2,t_2),\ldots,(s_k,t_k)$. The task is to connect as many terminal pairs as possible by pairwise vertex-disjoint paths such that each path is a shortest path between the respective terminals. Our work is anchored in the recent breakthrough by Lochet [SODA '21], which demonstrates the polynomial-time solvability of the problem for a fixed value of $k$. Lochet's result implies the existence of a polynomial-time $ck$-approximation for Maximum Vertex-Disjoint Shortest Paths, where $c \leq 1$ is a constant. Our first result suggests that this approximation algorithm is, in a sense, the best we can hope for. More precisely, assuming the gap-ETH, we exclude the existence of an $o(k)$-approximation within $f(k) \cdot \mathrm{poly}(n)$ time for any function $f$ that only depends on $k$. Our second result demonstrates the infeasibility of achieving an approximation ratio of $n^{\frac{1}{2}-\varepsilon}$ in polynomial time, unless P = NP. It is not difficult to show that a greedy algorithm selecting a path with the minimum number of arcs results in a $\lceil\sqrt{\ell}\rceil$-approximation, where $\ell$ is the number of edges in all the paths of an optimal solution. Since $\ell \leq n$, this underscores the tightness of the $n^{\frac{1}{2}-\varepsilon}$-inapproximability bound. Additionally, we establish that Maximum Vertex-Disjoint Shortest Paths is fixed-parameter tractable when parameterized by $\ell$ but does not admit a polynomial kernel. Our hardness results hold for undirected graphs with unit weights, while our positive results extend to scenarios where the input graph is directed and features arbitrary (non-negative) edge weights.
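The greedy strategy mentioned in this abstract can be sketched as follows. This is a minimal illustration, assuming unit edge weights so that breadth-first search finds shortest paths; the function names `bfs_path` and `greedy_disjoint_shortest_paths`, the adjacency-dictionary input format, and the tie-breaking by pair index are assumptions made for the example, not details from the paper.

```python
from collections import deque

def bfs_path(adj, source, target, blocked=frozenset()):
    """Return a fewest-edge path from source to target that avoids
    every vertex in `blocked`, or None if no such path exists."""
    if source in blocked or target in blocked:
        return None
    parent = {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for v in adj.get(u, ()):
            if v not in parent and v not in blocked:
                parent[v] = u
                queue.append(v)
    return None

def greedy_disjoint_shortest_paths(adj, pairs):
    """Repeatedly pick a still-available shortest path with the fewest
    edges, then block its vertices. Returns {pair index: path}."""
    # Shortest-path lengths (in edges) measured in the full graph.
    dist = {}
    for i, (s, t) in enumerate(pairs):
        p = bfs_path(adj, s, t)
        if p is not None:
            dist[i] = len(p) - 1
    used, chosen, remaining = set(), {}, set(dist)
    while True:
        best = None
        for i in sorted(remaining):
            s, t = pairs[i]
            p = bfs_path(adj, s, t, used)
            # A detour around used vertices only counts if it is still
            # a shortest path in the original graph.
            if p is not None and len(p) - 1 == dist[i]:
                if best is None or len(p) < len(best[1]):
                    best = (i, p)
        if best is None:
            return chosen
        i, p = best
        chosen[i] = p
        used.update(p)
        remaining.discard(i)

def undirected(edges):
    """Build a symmetric adjacency dictionary from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    return adj

# Pairs (1,3) and (4,5) both need vertex 2; the detour 4-8-9-5 is
# longer than the true distance, so it does not qualify.
example_adj = undirected([(1, 2), (2, 3), (4, 2), (2, 5),
                          (6, 7), (4, 8), (8, 9), (9, 5)])
example_pairs = [(1, 3), (4, 5), (6, 7)]
example_solution = greedy_disjoint_shortest_paths(example_adj, example_pairs)
```

In the toy instance, the pair whose only remaining route is a detour stays unconnected, because the detour is no longer a shortest path between its terminals.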
This paper presents the first systematic study of the evaluation of Deep Neural Networks (DNNs) for discrete dynamical systems under stochastic assumptions, with a focus on wildfire prediction. We develop a framework to study the impact of stochasticity on two classes of evaluation metrics: classification-based metrics, which assess fidelity to observed ground truth (GT), and proper scoring rules, which test fidelity-to-statistic. Our findings reveal that evaluating for fidelity-to-statistic is a reliable alternative in highly stochastic scenarios. We extend our analysis to real-world wildfire data, highlighting limitations in traditional wildfire prediction evaluation methods, and suggest interpretable stochasticity-compatible alternatives.
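The contrast between the two metric classes can be made concrete with a toy calculation. The sketch below assumes a single cell that burns with true probability `p` and a model that forecasts probability `q`; it uses the Brier score as one example of a proper scoring rule and thresholded accuracy as one example of a classification-based metric. These choices and the function names are illustrative assumptions, not the paper's evaluation setup.

```python
def expected_brier(p, q):
    """Expected Brier score of forecast q when the outcome is
    Bernoulli(p); lower is better."""
    return p * (1 - q) ** 2 + (1 - p) * q ** 2

def expected_accuracy(p, q, threshold=0.5):
    """Expected accuracy of the hard label (q >= threshold) when the
    outcome is Bernoulli(p); higher is better."""
    return p if q >= threshold else 1 - p

true_p = 0.4
calibrated, overconfident = 0.4, 0.0

# Both forecasts fall below the threshold, so accuracy cannot tell
# them apart, while the Brier score favours the calibrated one.
same_accuracy = (expected_accuracy(true_p, calibrated)
                 == expected_accuracy(true_p, overconfident))
brier_prefers_calibrated = (expected_brier(true_p, calibrated)
                            < expected_brier(true_p, overconfident))
```

Because the expected Brier score is minimized at `q = p`, it rewards matching the underlying outcome statistics, whereas the thresholded metric only sees which side of the threshold the forecast falls on.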
Although reachable sets of large-scale linear systems can be computed quickly, current methods are not yet widely applied by practitioners. The main reason is probably that current approaches are not push-button-capable and still require users to manually set crucial parameters, such as time step sizes and the accuracy of the used set representation -- settings that require expert knowledge. We present a generic framework to automatically find near-optimal parameters for reachability analysis of linear systems given a user-defined accuracy. To limit the computational overhead as much as possible, our methods tune all relevant parameters during runtime. We evaluate our approach on benchmarks from the ARCH competition as well as on random examples. Our results show that our new framework verifies the selected benchmarks faster than manually tuned parameters and is an order of magnitude faster than genetic algorithms.
Contextualized embeddings are the preferred tool for modeling Lexical Semantic Change (LSC). Current evaluations typically focus on a specific task known as Graded Change Detection (GCD). However, performance comparisons across works are often misleading because they rely on diverse settings. In this paper, we evaluate state-of-the-art models and approaches for GCD under equal conditions. We further break the LSC problem into Word-in-Context (WiC) and Word Sense Induction (WSI) tasks, and compare models across these different levels. Our evaluation is performed across different languages on eight available benchmarks for LSC, and shows that (i) APD outperforms other approaches for GCD; (ii) XL-LEXEME outperforms other contextualized models for WiC, WSI, and GCD, while being comparable to GPT-4; (iii) there is a clear need to improve the modeling of word meanings and to focus on how, when, and why these meanings change, rather than solely on the extent of semantic change.
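For readers unfamiliar with the measure, the following is a minimal sketch of a graded change score, assuming that APD denotes the average pairwise distance between a word's usage embeddings from two time periods and that cosine distance is the metric. The abstract does not spell out either detail, so both are assumptions, as are the function names and the toy vectors.

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def average_pairwise_distance(emb_t1, emb_t2):
    """Mean cosine distance over all cross-period pairs of usage
    embeddings; larger values suggest more semantic change."""
    total = sum(cosine_distance(u, v) for u in emb_t1 for v in emb_t2)
    return total / (len(emb_t1) * len(emb_t2))

# Toy 2-d "embeddings": a stable word keeps its direction over time,
# a changed word points in an orthogonal direction.
stable_score = average_pairwise_distance([[1.0, 0.0], [2.0, 0.0]],
                                         [[3.0, 0.0]])
changed_score = average_pairwise_distance([[1.0, 0.0]], [[0.0, 1.0]])
```

In practice the vectors would come from a contextualized model, one per occurrence of the target word in each period's corpus.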
Datalog$^\circ$ is an extension of Datalog that allows for aggregation and recursion over an arbitrary commutative semiring. Like Datalog, Datalog$^\circ$ programs can be evaluated via the natural iterative algorithm until a fixed point is reached. However, unlike Datalog, the natural iterative evaluation of some Datalog$^\circ$ programs over some semirings may not converge. It is known that the commutative semirings for which the iterative evaluation of Datalog$^\circ$ programs is guaranteed to converge are exactly those semirings that are stable [7]. Previously, the best known upper bound on the number of iterations until convergence over $p$-stable semirings was $\sum_{i=1}^n (p+2)^i = \Theta(p^n)$ steps, where $n$ is (essentially) the output size. We establish that, in fact, the natural iterative evaluation of a Datalog$^\circ$ program over a $p$-stable semiring converges within a polynomial number of iterations. In particular, our upper bound is $O( \sigma p n^2( n^2 \lg \lambda + \lg \sigma))$, where $\sigma$ is the number of elements in the semiring present in either the input databases or the Datalog$^\circ$ program, and $\lambda$ is the maximum number of terms in any product in the Datalog$^\circ$ program.
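The natural iterative evaluation can be illustrated on a small case. The sketch below assumes a single recursive rule $T(x,y) = E(x,y) \oplus \bigoplus_z T(x,z) \otimes E(z,y)$ interpreted over the tropical semiring, where $\oplus$ is $\min$ and $\otimes$ is $+$ on non-negative costs. The rule, the function name `naive_eval`, and the toy graph are illustrative assumptions rather than material from the paper.

```python
import math

INF = math.inf  # additive identity of the (min, +) semiring

def naive_eval(nodes, edge_cost, max_iters=1000):
    """Apply the rule T(x,y) = E(x,y) (+) (+)_z T(x,z) (x) E(z,y)
    to all facts at once, starting from the all-zero (here all-INF)
    relation, until nothing changes.
    Returns (fixed point, number of rule applications performed)."""
    T = {(x, y): INF for x in nodes for y in nodes}
    for iteration in range(1, max_iters + 1):
        new_T = {}
        for x in nodes:
            for y in nodes:
                best = edge_cost.get((x, y), INF)
                for z in nodes:
                    # (x) is +, (+) is min in the tropical semiring
                    best = min(best, T[(x, z)] + edge_cost.get((z, y), INF))
                new_T[(x, y)] = best
        if new_T == T:
            return T, iteration
        T = new_T
    raise RuntimeError("no fixed point within max_iters")

nodes = ["a", "b", "c"]
edge_cost = {("a", "b"): 1, ("b", "c"): 2, ("a", "c"): 5}
closure, iterations = naive_eval(nodes, edge_cost)
```

Over this semiring the fixed point holds the cheapest path cost for every pair; the returned count includes the final application that confirms nothing changed.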
The primary objective of this work is to develop two estimation procedures, the maximum likelihood estimator (MLE) and the method of trimmed moments (MTM), for the mean and variance of lognormal insurance payment severity data sets affected by different loss control mechanisms, for example, truncation (due to deductibles), censoring (due to policy limits), and scaling (due to coinsurance proportions), in the insurance and financial industries. Maximum likelihood estimating equations for both payment-per-payment and payment-per-loss data sets are derived; they can be solved readily by any existing iterative numerical method. The asymptotic distributions of those estimators are established via Fisher information matrices. Further, with the goal of balancing efficiency and robustness and of removing point masses at certain data points, we develop a dynamic MTM estimation procedure for lognormal claim severity models for the above-mentioned transformed data scenarios. The asymptotic distributional properties of those MTM estimators are established and compared with the corresponding MLEs, along with extensive simulation studies. Purely for illustrative purposes, numerical examples for 1500 US indemnity losses are provided, which illustrate the practical performance of the results established in this paper.
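To make the three loss control mechanisms concrete, here is a minimal sketch of how a ground-up loss `x` is turned into a payment. It assumes the textbook definitions with deductible `d`, policy limit `u` (a cap on the loss), and coinsurance proportion `c`; the function names and numbers are illustrative, and the paper's exact parameterization may differ.

```python
def payment_per_loss(x, d, u, c):
    """Payment recorded for every loss: c * max(min(x, u) - d, 0).
    Losses at or below the deductible produce a zero payment."""
    return c * max(min(x, u) - d, 0.0)

def payment_per_payment(losses, d, u, c):
    """Payments recorded only when a payment is made, that is, for
    losses above the deductible: c * (min(x, u) - d) given x > d."""
    return [c * (min(x, u) - d) for x in losses if x > d]

losses = [50.0, 200.0, 1500.0]
d, u, c = 100.0, 1000.0, 0.8

per_loss = [payment_per_loss(x, d, u, c) for x in losses]
per_payment = payment_per_payment(losses, d, u, c)
```

The per-loss data keep a point mass at zero (losses under the deductible) and at `c * (u - d)` (losses over the limit), while the per-payment data drop the zero entries entirely.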
We study the extent to which it is possible to approximate the optimal value of a Unique Games instance in Fixed-Point Logic with Counting (FPC). Formally, we prove lower bounds against the accuracy of FPC-interpretations that map Unique Games instances (encoded as relational structures) to rational numbers giving the approximate fraction of constraints that can be satisfied. We prove two new FPC-inexpressibility results for Unique Games: the existence of a $(1/2, 1/3 + \delta)$-inapproximability gap, and inapproximability to within any constant factor. Recent work has established similar FPC-inapproximability results for a small handful of other problems. Our construction builds upon some of these ideas, but contains a novel technique. While most FPC-inexpressibility results are based on variants of the CFI-construction, ours is significantly different. We start with a graph of very large girth and label the edges with random affine vector spaces over $\mathbb{F}_2$ that determine the constraints in the two structures. Duplicator's strategy involves maintaining a partial isomorphism over a minimal tree that spans the pebbled vertices of the graph.
When constructing parametric models to predict the cost of future claims, several important details have to be taken into account: (i) models should be designed to accommodate deductibles, policy limits, and coinsurance factors, (ii) parameters should be estimated robustly to control the influence of outliers on model predictions, and (iii) all point predictions should be augmented with estimates of their uncertainty. The methodology proposed in this paper provides a framework for addressing all these aspects simultaneously. Using payment-per-payment and payment-per-loss variables, we construct an adaptive version of method of winsorized moments (MWM) estimators for the parameters of truncated and censored lognormal distributions. Further, the asymptotic distributional properties of this approach are derived and compared with those of the maximum likelihood estimator (MLE) and method of trimmed moments (MTM) estimators, the latter being a primary competitor to MWM. Moreover, the theoretical results are validated with extensive simulation studies and risk measure sensitivity analysis. Finally, the practical performance of these methods is illustrated using the well-studied data set of 1500 U.S. indemnity losses. With this real data set, it is also demonstrated that composite models do not provide much improvement in the quality of predictive models compared to a stand-alone fitted distribution, especially for truncated and censored sample data.
Existing approaches to Theory of Mind (ToM) in Artificial Intelligence (AI) overemphasize prompted, or cue-based, ToM, which may limit our collective ability to develop Artificial Social Intelligence (ASI). Drawing from research in computer science, cognitive science, and related disciplines, we contrast prompted ToM with what we call spontaneous ToM -- reasoning about others' mental states that is grounded in unintentional, possibly uncontrollable cognitive functions. We argue for a principled approach to studying and developing AI ToM and suggest that a robust, or general, ASI will respond to prompts \textit{and} spontaneously engage in social reasoning.
As soon as abstract mathematical computations were adapted to computation on digital computers, the problem of efficient representation, manipulation, and communication of the numerical values in those computations arose. Strongly related to the problem of numerical representation is the problem of quantization: in what manner should a set of continuous real-valued numbers be distributed over a fixed discrete set of numbers to minimize the number of bits required and also to maximize the accuracy of the attendant computations? This perennial problem of quantization is particularly relevant whenever memory and/or computational resources are severely restricted, and it has come to the forefront in recent years due to the remarkable performance of Neural Network models in computer vision, natural language processing, and related areas. Moving from floating-point representations to low-precision fixed integer values represented in four bits or less holds the potential to reduce the memory footprint and latency by a factor of 16x; and, in fact, reductions of 4x to 8x are often realized in practice in these applications. Thus, it is not surprising that quantization has emerged recently as an important and very active sub-area of research in the efficient implementation of computations associated with Neural Networks. In this article, we survey approaches to the problem of quantizing the numerical values in deep Neural Network computations, covering the advantages/disadvantages of current methods. With this survey and its organization, we hope to have presented a useful snapshot of the current research in quantization for Neural Networks and to have given an intelligent organization to ease the evaluation of future research in this area.
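As a concrete instance of mapping real values onto a small discrete set, the sketch below implements uniform affine quantization to unsigned `num_bits`-bit integers. It is one common scheme, chosen here for illustration and not a method singled out by the survey; the function names and the min/max range calibration are assumptions of the example.

```python
def quantize(values, num_bits=4):
    """Map floats onto the integers 0 .. 2**num_bits - 1 with one
    scale and zero point derived from the min/max of the data.
    Returns (quantized integers, scale, zero_point)."""
    qmin, qmax = 0, 2 ** num_bits - 1
    lo, hi = min(values), max(values)
    # Guard against a constant input, which would give scale == 0.
    scale = (hi - lo) / (qmax - qmin) if hi > lo else 1.0
    zero_point = round(qmin - lo / scale)
    quantized = [min(qmax, max(qmin, round(v / scale) + zero_point))
                 for v in values]
    return quantized, scale, zero_point

def dequantize(quantized, scale, zero_point):
    """Approximate reconstruction of the original floats."""
    return [scale * (q - zero_point) for q in quantized]

weights = [-1.0, -0.37, 0.0, 0.42, 1.0]
q, scale, zero_point = quantize(weights, num_bits=4)
restored = dequantize(q, scale, zero_point)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Storing each value in 4 bits instead of a 32-bit float accounts for an 8x reduction in memory for the values themselves; the scale and zero point add a small constant overhead per quantized group.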