Over the last decades, the family of $\alpha$-stable distributions has proven useful for modelling in telecommunication systems. In radar applications in particular, fast and accurate estimation of the parameters of the amplitude density function is very important. In this work, a maximum likelihood estimator (MLE) is proposed for the parameters of the amplitude distribution. To do this, the amplitude data are \emph{projected} onto the horizontal and vertical axes using two simple transformations. It is proved that the \emph{projected} data follow a zero-location symmetric $\alpha$-stable distribution, for which the MLE can be computed quite fast. The average of the MLEs computed from the two \emph{projections} is taken as the estimator of the parameters of the amplitude distribution. The performance of the proposed \emph{projection} method is demonstrated through a simulation study and an analysis of two sets of real radar data.
We present a new algorithm for finding isolated zeros of a system of real-valued functions in a bounded interval in $\mathbb{R}^n$. It uses the Chebyshev proxy method combined with a mixture of subdivision, reduction methods, and elimination checks that leverage special properties of Chebyshev polynomials. We prove the method has R-quadratic convergence locally near simple zeros of the system. We also analyze the time complexity and the numerical stability of the algorithm and provide numerical evidence in dimensions up to three that the method is both fast and accurate on a wide range of problems. The algorithm should also work well in higher dimensions. Our tests show that the algorithm outperforms other standard methods on this problem of finding all real zeros in a bounded domain. Our Python implementation of the algorithm is publicly available.
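To make the Chebyshev proxy idea concrete, here is a minimal one-dimensional sketch; it is not the authors' implementation. It assumes NumPy, replaces the function by a Chebyshev interpolant on the interval, and takes the real roots of that interpolant as approximate zeros. The function name `chebyshev_proxy_roots`, the default degree, and the tolerances are all hypothetical choices; the subdivision, reduction, and elimination steps described above, and the extension to $\mathbb{R}^n$, are omitted.

```python
import numpy as np
from numpy.polynomial import chebyshev as C


def chebyshev_proxy_roots(f, a, b, degree=40, tol=1e-8):
    """Approximate the real zeros of f on [a, b] via a Chebyshev proxy.

    Illustrative 1-D sketch only: no subdivision or elimination checks.
    """
    # Pull f back to the reference interval [-1, 1].
    def g(t):
        return f(0.5 * (b - a) * t + 0.5 * (b + a))

    # Interpolate g at Chebyshev points; coeffs are Chebyshev coefficients.
    coeffs = C.chebinterpolate(g, degree)

    # Drop negligible trailing coefficients so the companion-matrix
    # eigenvalue problem is not dominated by a tiny leading term.
    coeffs = C.chebtrim(coeffs, tol=1e-13 * np.max(np.abs(coeffs)))

    # Roots of the proxy polynomial (eigenvalues of its companion matrix).
    roots = C.chebroots(coeffs)

    # Keep roots that are numerically real and lie in [-1, 1].
    real = roots[np.abs(roots.imag) < tol].real
    inside = real[(real >= -1 - tol) & (real <= 1 + tol)]

    # Map back to [a, b].
    return np.sort(0.5 * (b - a) * inside + 0.5 * (b + a))


if __name__ == "__main__":
    # Zeros of sin(3x) on [0.5, 3.5] are pi/3, 2*pi/3 and pi.
    print(chebyshev_proxy_roots(lambda x: np.sin(3 * x), 0.5, 3.5))
```

A fixed degree is adequate only for smooth functions on short intervals; an adaptive scheme would raise the degree or subdivide until the interpolation error is acceptable.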
Neural marked temporal point processes have been a valuable addition to the existing toolbox of statistical parametric models for continuous-time event data. These models are useful for sequences where each event is associated with a single item (a single type of event or a "mark") -- but such models are not suited for the practical situation where each event is associated with a set of items. In this work, we develop a general framework for modeling set-valued data in continuous time, compatible with any intensity-based recurrent neural point process model. In addition, we develop inference methods that can use such models to answer probabilistic queries such as "the probability of item $A$ being observed before item $B$," conditioned on sequence history. Computing exact answers for such queries is generally intractable for neural models due to both the continuous-time nature of the problem setting and the combinatorially large space of potential outcomes for each event. To address this, we develop a class of importance sampling methods for querying with set-based sequences and demonstrate orders-of-magnitude improvements in efficiency over direct sampling via systematic experiments with four real-world datasets. We also illustrate how to use this framework to perform model selection using likelihoods that do not involve one-step-ahead prediction.
Mobile edge computing (MEC) is a powerful means of alleviating heavy computing tasks in integrated sensing and communication (ISAC) systems. In this paper, we investigate joint beamforming and offloading design in a three-tier integrated sensing, communication and computation (ISCC) framework comprising one cloud server, multiple mobile edge servers, and multiple terminals. While executing sensing tasks, the user terminals can optionally offload sensing data to either an MEC server or the cloud server. To minimize the execution latency, we jointly optimize the transmit beamforming matrices and offloading decision variables under a sensing performance constraint. An alternating optimization algorithm based on multidimensional fractional programming is proposed to tackle the non-convex problem. Simulation results demonstrate the superiority of the proposed mechanism in terms of convergence and task execution latency reduction, compared with the state-of-the-art two-tier ISCC framework.
Deep generative models have been shown to be problematic in the unsupervised out-of-distribution (OOD) detection task, where they tend to assign higher likelihoods to OOD samples. Previous studies on this issue are usually not applicable to the Variational Autoencoder (VAE). As a popular subclass of generative models, the VAE can be effective with a relatively small model size and can be more stable and faster in training and inference, which is advantageous in real-world applications. In this paper, we propose a novel VAE-based score called Error Reduction (ER) for OOD detection, which is based on a VAE that takes a lossy version of the training set as inputs and the original set as targets. Experiments on various datasets show the effectiveness of our method, and ablation experiments show the effect of our design choices. Our code is available at: //github.com/ZJLAB-AMMI/VAE4OOD.
We propose a method for estimating a log-concave density on $\mathbb R^d$ from samples, under the assumption that there exists an orthogonal transformation that makes the components of the random vector independent. While log-concave density estimation is hard both computationally and statistically, the independent components assumption alleviates both issues, while still maintaining a large non-parametric class. We prove that under mild conditions, at most $\tilde{\mathcal{O}}(\epsilon^{-4})$ samples (suppressing constants and log factors) suffice for our proposed estimator to be within $\epsilon$ of the original density in squared Hellinger distance. On the computational front, while the usual log-concave maximum likelihood estimate can be obtained via a finite-dimensional convex program, it is slow to compute -- especially in higher dimensions. We demonstrate through numerical experiments that our estimator can be computed efficiently, making it more practical to use.
The $a$-number is an invariant of the isomorphism class of the $p$-torsion group scheme. We use the Cartier operator on $H^0(\mathcal{A}_2,\Omega^1)$ to find a closed formula for the $a$-number of the form $\mathcal{A}_2 = v(Y^{\sqrt{q}}+Y-x^{\frac{\sqrt{q}+1}{2}})$ where $q=p^s$ over the finite field $\mathbb{F}_{q^2}$. The application of the computed $a$-number in coding theory is illustrated by the relationship between the algebraic properties of the curve and the parameters of codes that are supported by it.
We establish the unique ergodicity of the Markov chain generated by the stochastic theta method (STM) with $\theta \in [1/2, 1]$ for monotone SODEs, without growth restriction on the coefficients, driven by nondegenerate multiplicative noise. The main ingredient of the arguments lies in the construction of new Lyapunov functions, involving the coefficients, the stepsize, and $\theta$, and the irreducibility and the strong Feller property for the STM. We also generalize the arguments to the STM and its Galerkin-based full discretizations for a class of monotone SPDEs driven by infinite-dimensional nondegenerate multiplicative trace-class noise. Applying these results to the stochastic Allen--Cahn equation indicates that its drift-implicit Euler scheme is uniquely ergodic for any interface thickness, which gives an affirmative answer to a question proposed in (J. Cui, J. Hong, and L. Sun, Stochastic Process. Appl. (2021): 55--93). Numerical experiments verify our theoretical results.
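As an illustration of the scheme itself, the following is a minimal sketch of the stochastic theta method for a scalar SDE $dX = f(X)\,dt + g(X)\,dW$, with update $X_{n+1} = X_n + (1-\theta) f(X_n) h + \theta f(X_{n+1}) h + g(X_n)\Delta W_n$. The drift $f(x) = x - x^3$ (an Allen--Cahn-type nonlinearity), the diffusion $g(x) = \sigma\sqrt{1+x^2}$, the Newton solver for the implicit step, and all function names are assumptions chosen for illustration; they are not taken from the paper, and the sketch assumes $\theta h < 1$ so that the implicit equation is strictly monotone.

```python
import numpy as np


def drift(x):
    # Assumed illustrative drift: f(x) = x - x^3 (one-sided Lipschitz).
    return x - x**3


def drift_prime(x):
    # Derivative of the assumed drift, used by Newton's method.
    return 1.0 - 3.0 * x**2


def diffusion(x, sigma):
    # Assumed illustrative nondegenerate multiplicative noise coefficient.
    return sigma * np.sqrt(1.0 + x**2)


def stm_path(x0, theta, h, n_steps, sigma, rng):
    """Simulate one path of the stochastic theta method (scalar case)."""
    x = np.empty(n_steps + 1)
    x[0] = x0
    for n in range(n_steps):
        dw = rng.normal(0.0, np.sqrt(h))  # Brownian increment over one step
        # Explicit part of the update.
        rhs = x[n] + (1.0 - theta) * h * drift(x[n]) + diffusion(x[n], sigma) * dw
        # Solve y - theta*h*f(y) = rhs for y by Newton's method.
        y = x[n]
        for _ in range(50):
            residual = y - theta * h * drift(y) - rhs
            slope = 1.0 - theta * h * drift_prime(y)  # > 0 when theta*h < 1
            step = residual / slope
            y -= step
            if abs(step) < 1e-12:
                break
        x[n + 1] = y
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    path = stm_path(x0=0.5, theta=1.0, h=0.01, n_steps=1000, sigma=0.5, rng=rng)
    print(path[-5:])
```

Setting `theta=1.0` gives the drift-implicit (backward) Euler scheme, and `theta=0.5` gives the midpoint-weighted variant at the lower end of the range considered above.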
Watermarking of language model outputs enables statistical detection of model-generated text, which has many applications in the responsible deployment of language models. Existing watermarking strategies operate by altering the decoder of an existing language model, and the ability for a language model to directly learn to generate the watermark would have significant implications for the real-world deployment of watermarks. First, learned watermarks could be used to build open models that naturally generate watermarked text, allowing for open models to benefit from watermarking. Second, if watermarking is used to determine the provenance of generated text, an adversary can hurt the reputation of a victim model by spoofing its watermark and generating damaging watermarked text. To investigate the learnability of watermarks, we propose watermark distillation, which trains a student model to behave like a teacher model that uses decoding-based watermarking. We test our approach on three distinct decoding-based watermarking strategies and various hyperparameter settings, finding that models can learn to generate watermarked text with high detectability. We also find limitations to learnability, including the loss of watermarking capabilities under fine-tuning on normal text and high sample complexity when learning low-distortion watermarks.
We consider the problem of tracking multiple, unknown, and time-varying numbers of objects using a distributed network of heterogeneous sensors. In an effort to derive a formulation for practical settings, we consider limited and unknown sensor fields of view (FoVs), as well as sensors with limited local computational resources and communication channel capacity. The resulting distributed multi-object tracking algorithm involves solving an NP-hard multidimensional assignment problem either optimally for small-size problems or sub-optimally for general practical problems. For general problems, we propose an efficient distributed multi-object tracking algorithm that performs track-to-track fusion using a clustering-based analysis of the state space transformed into a density space to mitigate the complexity of the assignment problem. The proposed algorithm can group local track estimates for fusion more efficiently than existing approaches. To ensure globally consistent identities for tracks across a network of nodes as objects move between FoVs, we develop a graph-based algorithm to achieve label consensus and minimise track segmentation. Numerical experiments with a synthetic and a real-world trajectory dataset demonstrate that our proposed method is significantly more computationally efficient than state-of-the-art solutions, achieving similar tracking accuracy and bandwidth requirements but with improved label consistency.
Dynamic programming (DP) solves a variety of structured combinatorial problems by iteratively breaking them down into smaller subproblems. In spite of their versatility, DP algorithms are usually non-differentiable, which hampers their use as a layer in neural networks trained by backpropagation. To address this issue, we propose to smooth the max operator in the dynamic programming recursion, using a strongly convex regularizer. This allows us to relax both the optimal value and solution of the original combinatorial problem, and turns a broad class of DP algorithms into differentiable operators. Theoretically, we provide a new probabilistic perspective on backpropagating through these DP operators, and relate them to inference in graphical models. We derive two particular instantiations of our framework, a smoothed Viterbi algorithm for sequence prediction and a smoothed DTW algorithm for time-series alignment. We showcase these instantiations on two structured prediction tasks and on structured and sparse attention for neural machine translation.
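The smoothing step can be sketched as follows, under one specific assumption: the regularizer is taken to be the negative entropy scaled by $\gamma > 0$, in which case the smoothed max reduces to the log-sum-exp $\gamma \log \sum_i \exp(x_i/\gamma)$. Other strongly convex regularizers give different operators. The function names, the array layout of the scores, and the use of a single shared transition matrix are hypothetical choices for illustration, not the authors' code.

```python
import numpy as np


def smoothed_max(x, gamma=1.0):
    """Entropy-regularized max: gamma * log(sum(exp(x / gamma))).

    Computed in a numerically stable way; tends to max(x) as gamma -> 0.
    """
    x = np.asarray(x, dtype=float)
    m = x.max()
    return m + gamma * np.log(np.sum(np.exp((x - m) / gamma)))


def smoothed_viterbi_value(unary, pairwise, gamma=1.0):
    """Smoothed optimal score of a chain model.

    unary:    (T, S) array, score of state s at time t (assumed layout).
    pairwise: (S, S) array, score of moving from state i to state j.
    The usual Viterbi recursion is kept, with max replaced by smoothed_max.
    """
    n_steps, n_states = unary.shape
    v = unary[0].astype(float)
    for t in range(1, n_steps):
        v = np.array(
            [smoothed_max(v + pairwise[:, j], gamma) for j in range(n_states)]
        ) + unary[t]
    return smoothed_max(v, gamma)


if __name__ == "__main__":
    unary = np.array([[1.0, 0.0], [0.0, 2.0], [3.0, 0.0]])
    pairwise = np.zeros((2, 2))
    for gamma in (1.0, 0.1, 0.001):
        print(gamma, smoothed_viterbi_value(unary, pairwise, gamma))
```

Because every operation in the recursion is smooth for $\gamma > 0$, the value is differentiable with respect to the scores, which is the property that makes such a layer usable with backpropagation; as $\gamma$ shrinks, the value approaches the ordinary Viterbi score.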