Determining the weight distribution of a code is an old and fundamental topic in coding theory that has been thoroughly studied. In 1977, Helleseth, Kl{\o}ve, and Mykkeltveit presented a weight enumerator polynomial of the lifted code over $\mathbb{F}_{q^\ell}$ of a $q$-ary linear code with significant combinatorial properties, which can determine the support weight distribution of this linear code. The Solomon-Stiffler codes are a family of famous Griesmer codes, which were proposed by Solomon and Stiffler in 1965. In this paper, we determine the weight enumerator polynomials of the lifted codes of the projective Solomon-Stiffler codes using some combinatorial properties of subspaces. As a result, we determine the support weight distributions of the projective Solomon-Stiffler codes. In particular, we determine the weight hierarchies of the projective Solomon-Stiffler codes.
We present a rigorous and precise analysis of the maximum degree and the average degree in a dynamic duplication-divergence graph model introduced by Sol\'e, Pastor-Satorras et al. in which the graph grows according to a duplication-divergence mechanism, i.e. by iteratively creating a copy of some node and then randomly alternating the neighborhood of a new node with probability $p$. This model captures the growth of some real-world processes e.g. biological or social networks. In this paper, we prove that for some $0 < p < 1$ the maximum degree and the average degree of a duplication-divergence graph on $t$ vertices are asymptotically concentrated with high probability around $t^p$ and $\max\{t^{2 p - 1}, 1\}$, respectively, i.e. they are within at most a polylogarithmic factor from these values with probability at least $1 - t^{-A}$ for any constant $A > 0$.
The prediction accuracy of machine learning methods is steadily increasing, but the calibration of their uncertainty predictions poses a significant challenge. Numerous works focus on obtaining well-calibrated predictive models, but less is known about reliably assessing model calibration. This limits our ability to know when algorithms for improving calibration have a real effect, and when their improvements are merely artifacts due to random noise in finite datasets. In this work, we consider detecting mis-calibration of predictive models using a finite validation dataset as a hypothesis testing problem. The null hypothesis is that the predictive model is calibrated, while the alternative hypothesis is that the deviation from calibration is sufficiently large. We find that detecting mis-calibration is only possible when the conditional probabilities of the classes are sufficiently smooth functions of the predictions. When the conditional class probabilities are H\"older continuous, we propose T-Cal, a minimax optimal test for calibration based on a debiased plug-in estimator of the $\ell_2$-Expected Calibration Error (ECE). We further propose Adaptive T-Cal, a version that is adaptive to unknown smoothness. We verify our theoretical findings with a broad range of experiments, including with several popular deep neural net architectures and several standard post-hoc calibration methods. T-Cal is a practical general-purpose tool, which -- combined with classical tests for discrete-valued predictors -- can be used to test the calibration of virtually any probabilistic classification method.
The incompressibility method is a counting argument in the framework of algorithmic complexity that permits discovering properties that are satisfied by most objects of a class. This paper gives a preliminary insight into Kolmogorov's complexity of groupoids and some algebras. The incompressibility method shows that almost all the groupoids are asymmetric and simple: Only trivial or constant homomorphisms are possible. However, highly random groupoids allow subgroupoids with interesting restrictions that reveal intrinsic structural properties. We also study the issue of the algebraic varieties and wonder which equational identities allow randomness.
Transformer requires a fixed number of layers and heads which makes them inflexible to the complexity of individual samples and expensive in training and inference. To address this, we propose a sample-based Dynamic Hierarchical Transformer (DHT) model whose layers and heads can be dynamically configured with single data samples via solving contextual bandit problems. To determine the number of layers and heads, we use the Uniform Confidence Bound while we deploy combinatorial Thompson Sampling in order to select specific head combinations given their number. Different from previous work that focuses on compressing trained networks for inference only, DHT is not only advantageous for adaptively optimizing the underlying network architecture during training but also has a flexible network for efficient inference. To the best of our knowledge, this is the first comprehensive data-driven dynamic transformer without any additional auxiliary neural networks that implement the dynamic system. According to the experiment results, we achieve up to 74% computational savings for both training and inference with a minimal loss of accuracy.
We note a fact that stiff systems or differential equations that have highly oscillatory solutions cannot be solved efficiently using conventional methods. In this paper, we study two new classes of exponential Runge-Kutta (ERK) integrators for efficiently solving stiff systems or highly oscillatory problems. We first present a novel class of explicit modified version of exponential Runge-Kutta (MVERK) methods based on the order conditions. Furthermore, we consider a class of explicit simplified version of exponential Runge-Kutta (SVERK) methods. Numerical results demonstrate the high efficiency of the explicit MVERK integrators and SVERK methods derived in this paper compared with the well-known explicit ERK integrators for stiff systems or highly oscillatory problems in the literature.
We provide numerical bounds on the Crouzeix ratiofor KLS matrices $A$ which have a line segment on the boundary of the numerical range. The Crouzeix ratio is the supremum over all polynomials $p$ of the spectral norm of $p(A)$ dividedby the maximum absolute value of $p$ on the numerical range of $A$.Our bounds confirm the conjecture that this ratiois less than or equal to $2$. We also give a precise description of these numerical ranges.
We extend a localization result for the $H^{1/2}$ norm by B. Faermann to a wider class of subspaces of $H^{1/2}(\Gamma)$, and we prove an analogous result for the $H^{-1/2}(\Gamma)$ norm, $\Gamma$ being the boundary of a bounded polytopal domain $\Omega$ in $\mathbb{R}^n$, $n=2,3$. As a corollary, we obtain equivalent, better localized, norms for both $H^{1/2}(\Gamma)$ and $H^{-1/2}(\Gamma)$, which can be exploited, for instance, in the design of preconditioners or of stabilized methods.
In many applications, it is desired to obtain extreme eigenvalues and eigenvectors of large Hermitian matrices by efficient and compact algorithms. In particular, orthogonalization-free methods are preferred for large-scale problems for finding eigenspaces of extreme eigenvalues without explicitly computing orthogonal vectors in each iteration. For the top $p$ eigenvalues, the simplest orthogonalization-free method is to find the best rank-$p$ approximation to a positive semi-definite Hermitian matrix by algorithms solving the unconstrained Burer-Monteiro formulation. We show that the nonlinear conjugate gradient method for the unconstrained Burer-Monteiro formulation is equivalent to a Riemannian conjugate gradient method on a quotient manifold with the Bures-Wasserstein metric, thus its global convergence to a stationary point can be proven. Numerical tests suggest that it is efficient for computing the largest $k$ eigenvalues for large-scale matrices if the largest $k$ eigenvalues are nearly distributed uniformly.
We consider the application of the generalized Convolution Quadrature (gCQ) to approximate the solution of an important class of sectorial problems. The gCQ is a generalization of Lubich's Convolution Quadrature (CQ) that allows for variable steps. The available stability and convergence theory for the gCQ requires non realistic regularity assumptions on the data, which do not hold in many applications of interest, such as the approximation of subdiffusion equations. It is well known that for non smooth enough data the original CQ, with uniform steps, presents an order reduction close to the singularity. We generalize the analysis of the gCQ to data satisfying realistic regularity assumptions and provide sufficient conditions for stability and convergence on arbitrary sequences of time points. We consider the particular case of graded meshes and show how to choose them optimally, according to the behaviour of the data. An important advantage of the gCQ method is that it allows for a fast and memory reduced implementation. We describe how the fast and oblivious gCQ can be implemented and illustrate our theoretical results with several numerical experiments.
In the domain of Mobility Data Science, the intricate task of interpreting models trained on trajectory data, and elucidating the spatio-temporal movement of entities, has persistently posed significant challenges. Conventional XAI techniques, although brimming with potential, frequently overlook the distinct structure and nuances inherent within trajectory data. Observing this deficiency, we introduced a comprehensive framework that harmonizes pivotal XAI techniques: LIME (Local Interpretable Model-agnostic Explanations), SHAP (SHapley Additive exPlanations), Saliency maps, attention mechanisms, direct trajectory visualization, and Permutation Feature Importance (PFI). Unlike conventional strategies that deploy these methods singularly, our unified approach capitalizes on the collective efficacy of these techniques, yielding deeper and more granular insights for models reliant on trajectory data. In crafting this synthesis, we effectively address the multifaceted essence of trajectories, achieving not only amplified interpretability but also a nuanced, contextually rich comprehension of model decisions. To validate and enhance our framework, we undertook a survey to gauge preferences and reception among various user demographics. Our findings underscored a dichotomy: professionals with academic orientations, particularly those in roles like Data Scientist, IT Expert, and ML Engineer, showcased a profound, technical understanding and often exhibited a predilection for amalgamated methods for interpretability. Conversely, end-users or individuals less acquainted with AI and Data Science showcased simpler inclinations, such as bar plots indicating timestep significance or visual depictions pinpointing pivotal segments of a vessel's trajectory.