We describe a software package, TomOpt, developed to optimise the geometrical layout and specifications of detectors designed for tomography by scattering of cosmic-ray muons. The software exploits differentiable programming for the modelling of muon interactions with detectors and scanned volumes, the inference of volume properties, and the optimisation cycle performing the loss minimisation. In doing so, we provide the first demonstration of end-to-end differentiable and inference-aware optimisation of particle physics instruments. We study the performance of the software on a relevant benchmark scenario and discuss its potential applications. Our code is available on GitHub.
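As a rough illustration of the optimisation loop described above, the following PyTorch sketch wires a toy differentiable "simulation" and "inference" into a single loss and updates the detector layout by gradient descent. All names and models here are illustrative stand-ins, not TomOpt's actual API.

```python
import torch

# A minimal end-to-end differentiable optimisation loop in the spirit of
# the abstract. The "physics" and "inference" are toy stand-ins.

def simulate_scattering(panel_z: torch.Tensor, density: torch.Tensor) -> torch.Tensor:
    # Toy forward model: the recorded signal decays with mean panel spacing.
    spacing = panel_z[1:] - panel_z[:-1]
    return density * torch.exp(-spacing.mean())

def infer_density(signal: torch.Tensor) -> torch.Tensor:
    # Toy inference with a fixed (miscalibrated) gain, so the reconstruction
    # error depends on the detector layout.
    return 1.5 * signal

true_density = torch.tensor(2.0)
panel_z = torch.tensor([0.1, 0.4, 0.9], requires_grad=True)  # layout parameters
opt = torch.optim.Adam([panel_z], lr=1e-2)

for step in range(300):
    opt.zero_grad()
    estimate = infer_density(simulate_scattering(panel_z, true_density))
    loss = (estimate - true_density) ** 2   # inference-aware objective
    loss.backward()                         # gradients flow through simulation,
    opt.step()                              # inference, and loss jointly
```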
Despite outstanding progress on many tasks, Large Language Models (LLMs) still lack accuracy when dealing with highly technical domains. In particular, telecommunications (telco) is a challenging domain due to its many lexical, semantic, and conceptual peculiarities. Yet, this domain holds many valuable use cases, directly linked to industrial needs. Hence, this paper studies how LLMs can be adapted to the telco domain. It reports our effort to (i) collect a massive corpus of domain-specific data (800M tokens, 80K instructions), (ii) perform adaptation using various methodologies, and (iii) benchmark the adapted models against larger generalist models in downstream tasks that require extensive knowledge of telecommunications. Our experiments on Llama-2-7b show that domain-adapted models can challenge large generalist models. They also suggest that adaptation can be restricted to a single instruction-tuning step, discarding the need for any fine-tuning on raw texts beforehand.
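As a minimal illustration of such a single instruction-tuning step, the sketch below assumes the trl library's SFTTrainer interface; the prompt format, data, and hyperparameters are illustrative, not the paper's (which this abstract does not specify).

```python
from datasets import Dataset
from trl import SFTConfig, SFTTrainer

# Hypothetical telco instruction data in a simple prompt/response format.
examples = [
    {"text": "### Instruction:\nWhat does NR stand for in 5G NR?\n"
             "### Response:\nNew Radio."},
    # ... in the paper's setting, ~80K telco-specific instructions
]
dataset = Dataset.from_list(examples)

trainer = SFTTrainer(
    model="meta-llama/Llama-2-7b-hf",   # base model studied in the paper
    train_dataset=dataset,
    args=SFTConfig(output_dir="llama2-telco-sft", max_steps=1000),
)
trainer.train()
```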
Accelerated failure time (AFT) models are frequently used to model survival data, providing a direct quantification of the relationship between event times and covariates. These models allow for the acceleration or deceleration of failure times through a multiplicative factor that accounts for the effect of covariates. While existing literature provides numerous methods for fitting AFT models with time-fixed covariates, adapting these approaches to scenarios involving both time-varying covariates and partly interval-censored data remains challenging. Motivated by a randomised clinical trial dataset on advanced melanoma patients, we propose a maximum penalised likelihood approach for fitting a semiparametric AFT model to survival data with partly interval-censored failure times. This method accommodates both time-fixed and time-varying covariates. We utilise Gaussian basis functions to construct a smooth approximation of the non-parametric baseline hazard and fit the model using a constrained optimisation approach. The effectiveness of our method is demonstrated through extensive simulations. Finally, we illustrate the practical relevance of our approach by applying it to the motivating melanoma trial dataset.
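To fix ideas, one common way to write such a model (with illustrative notation; the paper's exact formulation may differ) is
\[
h\{t \mid x(\cdot)\} = h_0\big(\kappa(t)\big)\, e^{-x(t)^\top \beta}, \qquad \kappa(t) = \int_0^t e^{-x(s)^\top \beta}\, ds,
\]
with the baseline hazard approximated by a non-negative combination of Gaussian basis functions $\phi_k$,
\[
h_0(u) \approx \sum_{k=1}^{K} \theta_k\, \phi_k(u), \qquad \theta_k \ge 0,
\]
and $(\beta, \theta)$ obtained by maximising a penalised log-likelihood of the form $\ell(\beta, \theta) - \lambda\, \theta^\top R\, \theta$ under the non-negativity constraints.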
Twisted generalized Reed-Solomon (TGRS) codes constitute an interesting family of evaluation codes, containing a large class of maximum distance separable (MDS) codes that are non-equivalent to generalized Reed-Solomon (GRS) ones. Moreover, the Schur squares of TGRS codes may be much larger than those of GRS codes of the same dimension. Exploiting these structural differences, in 2018, Beelen, Bossert, Puchinger and Rosenkilde proposed a subfamily of MDS twisted Reed-Solomon (TRS) codes over $\mathbb{F}_q$ with $\ell$ twists and $q \approx n^{2^{\ell}}$ for McEliece encryption, claiming their resistance to both the Sidelnikov-Shestakov attack and Schur-product-based attacks. In short, they claimed that these codes resist the classical key-recovery attacks on the McEliece encryption scheme instantiated with Reed-Solomon (RS) or GRS codes. In 2020, Lavauzelle and Renner presented an original attack on this system based on the computation of the subfield subcode of the public TRS code. In this paper, we show that the original claim on the resistance of TRS and TGRS codes to Schur-product-based attacks is wrong. We identify a broad class of codes, including TRS and TGRS ones, that is distinguishable from random codes by computing the Schur square of some shortening of the code. Then, we focus on the case of a single twist (i.e., $\ell = 1$), which is the most efficient one in terms of decryption complexity, to derive an attack. The technique is similar to the distinguisher-based attacks on RS-code-based systems given by Couvreur, Gaborit, Gauthier-Uma\~na, Otmani and Tillich in 2014.
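For context, the distinguisher rests on standard facts about Schur products, stated here generically rather than as the paper's precise results. For codes $A, B \subseteq \mathbb{F}_q^n$,
\[
A \star B = \operatorname{span}\{ a \star b \,:\, a \in A,\ b \in B \}, \qquad (a \star b)_i = a_i b_i,
\]
and $C^{\star 2} = C \star C$ is the Schur square. A random code $C$ of dimension $k$ satisfies $\dim C^{\star 2} = \min\{n, \binom{k+1}{2}\}$ with high probability, whereas a GRS code of dimension $k$ always satisfies $\dim C^{\star 2} = \min\{n, 2k - 1\}$; an abnormally small Schur square of (a shortening of) the public code therefore betrays its algebraic structure.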
To manage exceptions, software relies on a key architectural guarantee, precision: that exceptions appear to execute between instructions. However, this definition, dating back over 60 years, fundamentally assumes a sequential programmer's model. Modern architectures such as Arm-A, with programmer-observable relaxed behaviour, make such a naive definition inadequate, and it is unclear exactly what guarantees programmers have on exception entry and exit. In this paper, we clarify the concepts needed to discuss exceptions in the relaxed-memory setting -- a key aspect of precisely specifying the architectural interface between hardware and software. We explore the basic relaxed behaviour across exception boundaries, and the semantics of external aborts, using Arm-A as a representative modern architecture. We identify an important problem that has been present yet unexplored for decades: pinning down what it means for exceptions to be precise in a relaxed setting. We describe key phenomena that any definition should account for. We develop an axiomatic model for Arm-A precise exceptions, tooling for axiomatic model execution, and a library of tests. Finally, we explore the relaxed semantics of software-generated interrupts, as used in sophisticated programming patterns, and sketch how they too could be modelled.
Gradient-boosted decision tree algorithms are increasingly used in actuarial applications as they show superior predictive performance over traditional generalized linear models. Many improvements and refinements of the first gradient boosting machine algorithm exist. We present in a unified notation, and contrast, the existing point and probabilistic gradient-boosted decision tree algorithms: GBM, XGBoost, DART, LightGBM, CatBoost, EGBM, PGBM, XGBoostLSS, cyclic GBM, and NGBoost. In a comprehensive numerical study, we compare their performance on five publicly available datasets for claim frequency and severity, of various sizes and containing different numbers of (high-cardinality) categorical variables. We explain how varying exposure-to-risk can be handled with boosting in frequency models. We compare the algorithms on the basis of computational efficiency, predictive performance, and model adequacy. LightGBM and XGBoostLSS win in terms of computational efficiency. The fully interpretable EGBM achieves competitive predictive performance compared to the black-box algorithms considered. We find that there is no trade-off between model adequacy and predictive accuracy: both are achievable simultaneously.
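One common way to handle varying exposure-to-risk in a boosted Poisson frequency model is to pass log(exposure) as an offset; the LightGBM sketch below does this via init_score. This is an illustrative recipe, not necessarily the paper's exact setup.

```python
import numpy as np
import lightgbm as lgb

# Simulated claim-frequency data with varying exposure.
rng = np.random.default_rng(0)
n = 5_000
X = rng.normal(size=(n, 3))
exposure = rng.uniform(0.1, 1.0, size=n)        # fraction of year at risk
lam = np.exp(0.3 * X[:, 0] - 0.2 * X[:, 1])     # true frequency per unit exposure
y = rng.poisson(lam * exposure)                 # observed claim counts

# log(exposure) enters as an offset through init_score.
train = lgb.Dataset(X, label=y, init_score=np.log(exposure))
params = {"objective": "poisson", "learning_rate": 0.05, "verbosity": -1}
booster = lgb.train(params, train, num_boost_round=200)

# predict() does not add init_score back, so restore the offset on the
# raw (log) scale before exponentiating to expected counts.
raw = booster.predict(X, raw_score=True)
expected_counts = np.exp(raw + np.log(exposure))
```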
This paper proposes an algorithm for a mobile application designed to monitor multidimensional child growth through digital phenotyping. Digital phenotyping offers a unique opportunity to collect and analyze high-frequency data in real time, capturing the behavioral, psychological, and physiological states of children in naturalistic settings. Traditional models of child growth focus primarily on physical metrics, often overlooking multidimensional aspects such as emotional, social, and cognitive development. We introduce a Bayesian artificial intelligence (AI) algorithm that leverages digital phenotyping to create a Multidimensional Index of Child Growth (MICG). This index integrates data from various dimensions of child development, including physical, emotional, cognitive, and environmental factors. By incorporating probabilistic modeling, the proposed algorithm dynamically updates its learning based on data collected by the mobile app used by mothers and children. The app also infers uncertainty from response times, adjusting the importance of each dimension of child growth accordingly. Our contribution applies state-of-the-art technology to track multidimensional child development, enabling families and healthcare providers to make more informed decisions in real time.
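A sketch of the kind of probabilistic update described above: each growth dimension keeps a Gaussian belief, updated from app observations, with observation noise inflated for slow responses as a proxy for uncertainty. The noise model and numbers are illustrative assumptions, not the paper's algorithm.

```python
# Kalman-style conjugate update for one child-growth dimension; all
# constants and the response-time noise model are illustrative.
def update(mean: float, var: float, obs: float,
           response_time_s: float, base_noise: float = 0.5):
    obs_var = base_noise * (1.0 + response_time_s)  # slower answer -> noisier
    gain = var / (var + obs_var)                    # Kalman-style gain
    return mean + gain * (obs - mean), (1.0 - gain) * var

mean, var = 0.0, 1.0  # prior belief for one dimension (e.g. emotional)
for obs, rt in [(0.8, 2.0), (0.6, 10.0), (0.9, 1.5)]:
    mean, var = update(mean, var, obs, rt)
print(f"posterior mean={mean:.2f}, variance={var:.2f}")
```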
We present a novel class of projected gradient (PG) methods for minimizing a smooth but not necessarily convex function over a convex compact set. We first provide a novel analysis of the "vanilla" PG method, achieving the best-known iteration complexity for finding an approximate stationary point of the problem. We then develop an "auto-conditioned" projected gradient (AC-PG) variant that achieves the same iteration complexity without requiring the input of the Lipschitz constant of the gradient or any line search procedure. The key idea is to estimate the Lipschitz constant using first-order information gathered from the previous iterations, and to show that the error caused by underestimating the Lipschitz constant can be properly controlled. We then generalize the PG methods to the stochastic setting, by proposing a stochastic projected gradient (SPG) method and a variance-reduced stochastic projected gradient (VR-SPG) method, achieving new complexity bounds in different oracle settings. We also present auto-conditioned stepsize policies for both stochastic PG methods and establish comparable convergence guarantees.
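The following sketch shows the flavour of an auto-conditioned projected gradient step: the Lipschitz constant of the gradient is estimated from past iterates rather than supplied as input. The estimation rule here is a simple illustrative variant, not the paper's exact AC-PG policy.

```python
import numpy as np

def project_ball(x: np.ndarray, radius: float = 1.0) -> np.ndarray:
    # Euclidean projection onto a ball: a simple convex compact set.
    nrm = np.linalg.norm(x)
    return x if nrm <= radius else x * (radius / nrm)

rng = np.random.default_rng(1)
A = rng.normal(size=(20, 20))
A = A @ A.T / 20                     # PSD Hessian with unknown Lipschitz const
b = rng.normal(size=20)
grad = lambda x: A @ x - b           # gradient of 0.5 x'Ax - b'x

x, g, L_hat = np.zeros(20), -b, 1.0  # start from an (under)estimate of L
for _ in range(500):
    x_new = project_ball(x - g / L_hat)
    g_new = grad(x_new)
    move = np.linalg.norm(x_new - x)
    if move > 1e-12:
        # Update the curvature estimate from first-order information only.
        L_hat = max(L_hat, np.linalg.norm(g_new - g) / move)
    x, g = x_new, g_new
```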
Statistical learning under distribution shift is challenging when neither prior knowledge nor fully accessible data from the target distribution is available. Distributionally robust learning (DRL) aims to control the worst-case statistical performance within an uncertainty set of candidate distributions, but how to properly specify this set remains difficult. To enable distributional robustness without being overly conservative, in this paper we propose a shape-constrained approach to DRL, which incorporates prior information about the way in which the unknown target distribution differs from its estimate. More specifically, we assume the unknown density ratio between the target distribution and its estimate is isotonic with respect to some partial order. At the population level, we provide a solution to the shape-constrained optimization problem that does not involve the isotonic constraint. At the sample level, we provide consistency results for an empirical estimator of the target in a range of different settings. Empirical studies on both synthetic and real data examples demonstrate the improved accuracy of the proposed shape-constrained approach.
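To illustrate the shape constraint alone, the toy below projects raw density-ratio estimates onto the isotonic cone with respect to a scalar covariate using pool-adjacent-violators. The paper works with general partial orders and a full estimator; this one-dimensional sketch only shows the monotone-projection step.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

# Noisy density-ratio estimates assumed monotone in an ordering covariate z.
rng = np.random.default_rng(2)
z = np.sort(rng.uniform(0.0, 1.0, 200))             # ordering covariate
ratio_raw = np.exp(z + 0.3 * rng.normal(size=200))  # noisy ratio estimates

iso = IsotonicRegression(increasing=True)
ratio_iso = iso.fit_transform(z, ratio_raw)         # pool-adjacent-violators

ratio_iso /= ratio_iso.mean()  # re-normalise: a density ratio averages to one
```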
We prove, for stably computably enumerable formal systems, direct analogues of the first and second incompleteness theorems of G\"odel. A typical stably computably enumerable set is the set of Diophantine equations with no integer solutions; in particular, such sets are generally not computably enumerable. This gives the first extension of the second incompleteness theorem to formal systems that are not classically computable. Let us motivate this with a somewhat physical application. Let $\mathcal{H}$ be the suitable infinite-time limit (stabilization in the sense of the paper) of the mathematical output of humanity, specialized to first-order sentences in the language of arithmetic (for simplicity), and understood as a formal system. Suppose that all the relevant physical processes in the formation of $\mathcal{H}$ are Turing computable. Then, as defined, $\mathcal{H}$ may \emph{not} be computably enumerable, but it is stably computably enumerable. Thus, the classical G\"odel disjunction applied to $\mathcal{H}$ is meaningless, but applying our incompleteness theorems to $\mathcal{H}$ we obtain a sharper version of G\"odel's disjunction: assuming $\mathcal{H} \vdash PA$, either $\mathcal{H}$ is not stably computably enumerable, or $\mathcal{H}$ is not 1-consistent (in particular, not sound), or $\mathcal{H}$ cannot prove a certain true statement of arithmetic (and cannot disprove it if, in addition, $\mathcal{H}$ is 2-consistent).
The stability number of a graph, defined as the cardinality of a largest set of pairwise non-adjacent vertices, is NP-hard to compute. The exact subgraph hierarchy (ESH) provides a sequence of increasingly tight upper bounds on the stability number: it starts from the Lov\'asz theta function at the first level and, at level $k$, adds the exact subgraph constraints of all subgraphs of order $k$ to the semidefinite program computing the Lov\'asz theta function. In this paper, we investigate the ESH for Paley graphs, a class of strongly regular, vertex-transitive graphs. We show that for Paley graphs the bounds obtained from the ESH remain equal to the Lov\'asz theta function up to a certain threshold level, i.e., the bounds of the ESH do not improve below that level. To overcome this limitation, we introduce the local ESH for the stable set problem on vertex-transitive graphs such as Paley graphs. We prove that this new hierarchy provides upper bounds on the stability number of vertex-transitive graphs that are at least as tight as those obtained from the ESH. Additionally, our computational experiments reveal that the local ESH produces superior bounds compared to the ESH for Paley graphs.
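For reference, the Lov\'asz theta function is the value of a standard semidefinite program,
\[
\vartheta(G) = \max\big\{ \langle J, X \rangle \,:\, \operatorname{tr}(X) = 1,\; X_{ij} = 0 \ \text{for all } \{i,j\} \in E(G),\; X \succeq 0 \big\},
\]
where $J$ is the all-ones matrix. At level $k$ of the ESH, for each vertex subset $I$ of size $k$, the (suitably rescaled) principal submatrix $X_I$ is additionally required to lie in the convex hull of the matrices $ss^\top$ over stable sets $s$ of the induced subgraph $G_I$; this paraphrases the exact subgraph constraint rather than quoting the paper's formulation.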