Calabi-Yau four-folds may be constructed as hypersurfaces in weighted projective spaces of complex dimension 5, defined via weight systems of 6 weights. In this work, neural networks were trained to learn the Calabi-Yau Hodge numbers from the weight systems; gradient saliency and symbolic regression then inspired a truncation of the Landau-Ginzburg model formula for the Hodge numbers of Calabi-Yau manifolds of any dimension constructed in this way. The approximation always provides a tight lower bound, is shown to be dramatically quicker to compute (with compute times reduced by up to four orders of magnitude), and gives remarkably accurate results for systems with large weights. Additionally, complementary datasets of weight systems satisfying the necessary but insufficient conditions for transversality were constructed, including considerations of the IP, reflexivity, and intradivisibility properties, producing a classification of this weight-system landscape that was further confirmed with machine-learning methods. Using this classification, together with the properties of the presented approximation, a novel dataset of transverse weight systems consisting of 7 weights was generated for sums of weights $\leq 200$, producing a new database of Calabi-Yau five-folds with their respective topological properties computed. Further to this, an equivalent database of candidate Calabi-Yau six-folds was generated with approximated Hodge numbers.
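As standard background behind these constructions (not a result of this paper), a degree-$d$ hypersurface in the weighted projective space $\mathbb{P}(w_1,\dots,w_n)$ is Calabi-Yau precisely when the degree equals the sum of the weights, so each weight system fixes the degree of the defining polynomial:

```latex
% Calabi-Yau (anticanonical) condition for a hypersurface of degree d in the
% weighted projective space P(w_1, ..., w_n): the degree must equal the sum
% of the weights.  For the four-fold case above, n = 6 and d = w_1 + ... + w_6.
d \;=\; \sum_{i=1}^{n} w_i .
```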
We study the edge-coloring problem in simple $n$-vertex $m$-edge graphs with maximum degree $\Delta$. This is one of the most classical and fundamental graph-algorithmic problems. Vizing's celebrated theorem provides a $(\Delta+1)$-edge-coloring in $O(m\cdot n)$ deterministic time. This running time was later improved to $O\left(m\cdot\min\left\{\Delta\cdot\log n, \sqrt{n}\right\}\right)$. It is also well known that a $3\left\lceil\frac{\Delta}{2}\right\rceil$-edge-coloring can be computed deterministically in $O(m\cdot\log\Delta)$ time. Duan et al. devised a randomized $(1+\varepsilon)\Delta$-edge-coloring algorithm with running time $O\left(m\cdot\frac{\log^6 n}{\varepsilon^2}\right)$. It was, however, open whether there exists a deterministic near-linear-time algorithm for this basic problem. We devise a simple deterministic $(1+\varepsilon)\Delta$-edge-coloring algorithm with running time $O\left(m\cdot\frac{\log n}{\varepsilon}\right)$. We also devise a randomized $(1+\varepsilon)\Delta$-edge-coloring algorithm with running time $O(m\cdot(\varepsilon^{-18}+\log(\varepsilon\cdot\Delta)))$. For $\varepsilon\geq\frac{1}{\log^{1/18}\Delta}$, this running time is $O(m\cdot\log\Delta)$.
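For contrast with the near-linear-time algorithms above, the sketch below shows the classical greedy baseline (not this paper's algorithm): each edge receives the smallest color absent at both endpoints, so at most $2\Delta-1$ colors are used.

```python
from collections import defaultdict

def greedy_edge_coloring(edges):
    """Classical greedy baseline (not the algorithm of this paper): color each
    edge with the smallest color not already used at either endpoint.  An edge
    sees at most 2*(Delta-1) previously colored neighbours, so at most
    2*Delta - 1 colors are ever needed."""
    used = defaultdict(set)   # vertex -> set of colors on its incident edges
    coloring = {}
    for u, v in edges:
        c = 0
        while c in used[u] or c in used[v]:
            c += 1
        coloring[(u, v)] = c
        used[u].add(c)
        used[v].add(c)
    return coloring

# Example: a triangle has Delta = 2 and needs 3 colors; greedy finds them.
print(greedy_edge_coloring([(0, 1), (1, 2), (2, 0)]))
```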
We propose an algorithm that solves symmetric cone programs (SCPs) by iteratively calling projection and rescaling methods, which are algorithms for solving exceptional cases of SCPs. Although our algorithm can solve SCPs on its own, we propose it primarily as a post-processing step for interior point methods, since it solves problems more efficiently when started from an approximate optimal (interior feasible) solution. We also conduct numerical experiments to evaluate the performance of the proposed algorithm when used as a post-processing step for solvers implementing interior point methods, using several instances where the symmetric cone is given by a direct product of positive semidefinite cones. The numerical results show that our algorithm obtains approximate optimal solutions more accurately than the solvers alone. When at least one of the primal and dual problems did not have an interior feasible solution, the performance of our algorithm was slightly reduced in terms of optimality; however, when both the primal and dual problems had interior feasible solutions, our algorithm stably returned more accurate solutions than the solvers.
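For reference, a standard primal-dual pair of symmetric cone programs over a symmetric (self-dual) cone $\mathcal{K}$ can be written as below; this is textbook notation, not necessarily the formulation used in the paper.

```latex
% Primal-dual pair of symmetric cone programs over a symmetric cone K
% (self-dual, so K^* = K).  A post-processing step starts from an approximate
% primal-dual solution (x, y, s) of this pair returned by an interior point solver.
\min_{x}\ \langle c, x\rangle \quad \text{s.t.}\ \ Ax = b,\ \ x \in \mathcal{K},
\qquad\qquad
\max_{y,\, s}\ b^{\top} y \quad \text{s.t.}\ \ A^{\top} y + s = c,\ \ s \in \mathcal{K}.
```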
This paper presents exact formulas for the probability density function (PDF) and moment generating function (MGF) of the sum-product of statistically independent but not necessarily identically distributed (i.n.i.d.) Nakagami-$m$ random variables (RVs) in terms of Meijer's G-function. Exact series representations are also derived for the sum of double-Nakagami RVs, providing useful insight into the trade-off between accuracy and computational cost. Simple asymptotic analytical expressions are provided to gain further insight into the derived formulas, and the achievable diversity order is obtained. The derived statistical properties prove to be a highly useful tool for modeling parallel cascaded Nakagami-$m$ fading channels. The application of these new results is illustrated by deriving exact expressions and simple tight upper bounds for the outage probability (OP) and average symbol error rate (ASER) of several binary and multilevel modulation signals in intelligent reflecting surface (IRS)-assisted communication systems operating over Nakagami-$m$ fading channels. It is demonstrated that the new asymptotic expression is highly accurate and can be extended to encompass a wider range of scenarios. To validate the theoretical frameworks and formulations, Monte Carlo simulation results are presented. Additionally, supplementary simulations are provided to compare the derived results with two common types of approximations available in the literature, namely the central limit theorem (CLT) and the gamma distribution.
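As a reminder of the underlying fading model (the standard definition, not a result of the paper), a Nakagami-$m$ RV with shape parameter $m \ge 1/2$ and spread $\Omega$ has the density below; the double-Nakagami links summed over IRS elements are built from products of such variables.

```latex
% Nakagami-m probability density function with shape m and spread Omega.
f_X(x) \;=\; \frac{2\, m^{m}}{\Gamma(m)\, \Omega^{m}}\; x^{2m-1}
             \exp\!\left(-\frac{m\, x^{2}}{\Omega}\right), \qquad x \ge 0 .
```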
Typically, training LLMs with long context sizes is computationally expensive, requiring extensive training hours and GPU resources. Existing long-context extension methods usually need additional training procedures to support the corresponding long context windows, where long-context training data (e.g., 32k) is needed and high GPU training costs are incurred. To address these issues, we propose an Efficient and Extreme length extension method for Large Language Models, called E2-LLM, with only one training procedure and dramatically reduced computation cost, which also removes the need to collect long-context data. Concretely, first, the training data of our E2-LLM only requires a short length (e.g., 4k), which greatly reduces the tuning cost. Second, the training procedure on the short context window is performed only once, and we can support different evaluation context windows at inference. Third, in E2-LLM, based on RoPE position embeddings, we introduce two different augmentation methods on the scale and position-index parameters for different samples in training. These aim to make the model more robust to different relative position differences when directly interpolating to arbitrary context lengths at inference. Comprehensive experimental results on multiple benchmark datasets demonstrate the effectiveness of our E2-LLM on challenging long-context tasks.
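To make the mechanism concrete, here is a minimal sketch of rotary position embeddings with a position-index scale factor, the kind of knob such interpolation methods adjust; the function name, the rotate-half convention, and the usage are illustrative assumptions, not E2-LLM's code.

```python
import numpy as np

def rope(x, positions, scale=1.0, base=10000.0):
    """Apply rotary position embeddings to x of shape (seq, dim).

    scale > 1 compresses the position indices (positional interpolation);
    varying such parameters per sample is the flavour of augmentation the
    abstract describes.  Illustrative sketch only."""
    seq, dim = x.shape
    half = dim // 2
    inv_freq = base ** (-np.arange(half) / half)                # (half,)
    angles = (positions[:, None] / scale) * inv_freq[None, :]   # (seq, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    # Rotate each (x1_i, x2_i) pair by its position-dependent angle.
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=-1)

# Toy usage: 8 positions, 16-dim vectors, with and without 4x interpolation.
x = np.random.default_rng(0).normal(size=(8, 16))
pos = np.arange(8, dtype=np.float64)
out_plain = rope(x, pos)               # standard RoPE
out_interp = rope(x, pos, scale=4.0)   # position indices compressed by 4x
```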
We address a key challenge for neuro-symbolic (NeSy) systems by leveraging convex and bilevel optimization techniques to develop a general gradient-based framework for end-to-end neural and symbolic parameter learning. The applicability of our framework is demonstrated with NeuPSL, a state-of-the-art NeSy architecture. To achieve this, we propose a smooth primal and dual formulation of NeuPSL inference and show that learning gradients are functions of the optimal dual variables. Additionally, we develop a dual block coordinate descent algorithm for the new formulation that naturally exploits warm-starts. This leads to over 100x learning runtime improvements over the current best NeuPSL inference method. Finally, we provide extensive empirical evaluations across $8$ datasets covering a range of tasks and demonstrate that our learning framework achieves up to a 16 percentage point improvement in prediction performance over alternative learning methods.
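As a sketch of why learning gradients can be expressed through optimal dual variables (a standard envelope-theorem argument; the paper's exact NeuPSL formulation differs in detail), consider a value function defined by a convex inner problem parameterised by the weights $w$:

```latex
% Envelope-theorem sketch: under strong duality and suitable regularity, the
% gradient of the optimal value V(w) is the partial gradient of the Lagrangian
% evaluated at an optimal primal-dual pair (y*, mu*).  Since g does not depend
% on w, only the partial gradient of f survives.
V(w) \;=\; \min_{y:\, g(y)\le 0} f(y, w)
      \;=\; \max_{\mu \ge 0}\ \min_{y}\ \underbrace{f(y,w) + \mu^{\top} g(y)}_{L(y,\mu,w)},
\qquad
\nabla_{w} V(w) \;=\; \nabla_{w} L\!\left(y^{*}, \mu^{*}, w\right)
             \;=\; \nabla_{w} f\!\left(y^{*}, w\right).
```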
SARRIGUREN, a new complete algorithm for SAT based on counting clauses (which is also valid for Unique-SAT and #SAT), is described, analyzed, and tested. Whereas existing complete SAT algorithms slow down on clauses with many literals, such clauses are an advantage for SARRIGUREN: the more literals a clause contains, the higher the probability of overlap among clauses, a property that makes the clause-counting process more efficient. It provides $O(m^2 \times n/k)$ time complexity for random $k$-SAT instances of $n$ variables and $m$ relatively dense clauses, where density is measured relative to the number of variables $n$; that is, clauses are relatively dense when $k\geq7\sqrt{n}$. Although there could in theory be worst cases with exponential complexity, the probability of such cases occurring in random $k$-SAT with relatively dense clauses is practically zero. The algorithm has been tested empirically, and the polynomial time complexity is maintained also for $k$-SAT instances with less dense clauses ($k\geq5\sqrt{n}$). That density can be as low as 0.049, for example when working with $n=20000$ variables and $k=989$ literals. In addition, two further complementary algorithms are presented that provide the solutions to $k$-SAT instances and valuable information about the number of solutions for each literal. Although this algorithm does not settle the P=NP question (it is not a polynomial algorithm for 3-SAT), it broadens knowledge of the subject, since $k$-SAT with $k>3$ and dense clauses is not harder than 3-SAT. Moreover, the Python implementation of the algorithms, together with all input datasets and experimental results, is made available.
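The brute-force #SAT counter below (an illustrative sketch, not SARRIGUREN itself) counts models by inclusion-exclusion over sets of simultaneously violated clauses; each term is cheap to evaluate, and heavily overlapping long clauses make many subsets inconsistent, which loosely illustrates why clause overlap helps counting.

```python
from itertools import combinations

def count_models(n_vars, clauses):
    """Count satisfying assignments of a CNF formula by inclusion-exclusion
    over sets of violated clauses.  Clauses are sets of DIMACS-style non-zero
    integer literals.  Exponential in the number of clauses, so only a toy."""
    total = 2 ** n_vars
    violating = 0
    for r in range(1, len(clauses) + 1):
        for subset in combinations(clauses, r):
            # An assignment violates a clause iff every literal is false, so
            # all clauses in the subset jointly force values on their variables.
            forced, consistent = {}, True
            for clause in subset:
                for lit in clause:
                    var, val = abs(lit), lit < 0  # literal false => this value
                    if forced.get(var, val) != val:
                        consistent = False
                        break
                    forced[var] = val
                if not consistent:
                    break
            if consistent:
                count = 2 ** (n_vars - len(forced))
                violating += count if r % 2 == 1 else -count
    return total - violating

# Example: (x1) AND (NOT x1 OR x2) over 2 variables has exactly 1 model.
print(count_models(2, [{1}, {-1, 2}]))
```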
This work introduces E3x, a software package for building neural networks that are equivariant with respect to the Euclidean group $\mathrm{E}(3)$, consisting of translations, rotations, and reflections of three-dimensional space. Compared to ordinary neural networks, $\mathrm{E}(3)$-equivariant models promise benefits whenever input and/or output data are quantities associated with three-dimensional objects. This is because the numeric values of such quantities (e.g. positions) typically depend on the chosen coordinate system. Under transformations of the reference frame, the values change predictably, but the underlying rules can be difficult to learn for ordinary machine learning models. With built-in $\mathrm{E}(3)$-equivariance, neural networks are guaranteed to satisfy the relevant transformation rules exactly, resulting in superior data efficiency and accuracy. The code for E3x is available from //github.com/google-research/e3x; detailed documentation and usage examples can be found at //e3x.readthedocs.io.
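E3x's actual API is described in the linked documentation; the snippet below does not use E3x at all and only illustrates the equivariance property itself with a toy, hand-written model (all function names here are hypothetical).

```python
import numpy as np

def toy_equivariant_model(positions):
    """A trivially E(3)-equivariant 'model': for each point it predicts the
    displacement vector towards the centroid of the point cloud.  A stand-in
    for a learned network, used only to illustrate the property (not E3x)."""
    return positions.mean(axis=0) - positions

def random_orthogonal(rng):
    # QR decomposition of a Gaussian matrix yields a random orthogonal matrix
    # (a rotation or a reflection); sign-fixing the R diagonal removes the
    # sign ambiguity of the decomposition.
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    return q * np.sign(np.diag(r))

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))
R = random_orthogonal(rng)
t = rng.normal(size=3)

# Rotating/reflecting the inputs rotates/reflects the predicted vectors ...
assert np.allclose(toy_equivariant_model(x @ R.T), toy_equivariant_model(x) @ R.T)
# ... while translating the inputs leaves displacement vectors unchanged.
assert np.allclose(toy_equivariant_model(x + t), toy_equivariant_model(x))
print("equivariance checks passed")
```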
Quantitative analysis of pseudo-diffusion in diffusion-weighted magnetic resonance imaging (DWI) data shows potential for assessing fetal lung maturation and generating valuable imaging biomarkers. Yet, the clinical utility of DWI data is hindered by unavoidable fetal motion during acquisition. We present IVIM-morph, a self-supervised deep neural network model for motion-corrected quantitative analysis of DWI data using the intra-voxel incoherent motion (IVIM) model. IVIM-morph combines two sub-networks, a registration sub-network and an IVIM model-fitting sub-network, enabling simultaneous estimation of IVIM model parameters and motion. To promote physically plausible image registration, we introduce a biophysically informed loss function that effectively balances registration and model-fitting quality. We validated the efficacy of IVIM-morph by establishing a correlation between the predicted IVIM model parameters of the lung and gestational age (GA) using fetal DWI data of 39 subjects. IVIM-morph exhibited a notably improved correlation with GA when performing in-vivo quantitative analysis of fetal lung DWI data during the canalicular phase. IVIM-morph shows potential for developing valuable biomarkers for non-invasive assessment of fetal lung maturity with DWI data. Moreover, its adaptability opens the door to potential applications in other clinical contexts where motion compensation is essential for quantitative DWI analysis. The IVIM-morph code is readily available at: //github.com/TechnionComputationalMRILab/qDWI-Morph.
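For context, the IVIM model fitted by the model-fitting sub-network is the standard bi-exponential signal decay (a textbook definition, not a contribution of the paper):

```latex
% Intra-voxel incoherent motion (IVIM) signal model: S_0 is the b=0 signal,
% f the perfusion fraction, D* the pseudo-diffusion coefficient, and D the
% tissue diffusion coefficient.
S(b) \;=\; S_0 \left[\, f\, e^{-b\, D^{*}} \;+\; (1 - f)\, e^{-b\, D} \,\right].
```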
Batteryless energy-harvesting systems enable a wide array of new sensing, computation, and communication platforms untethered by power delivery or battery maintenance demands. Energy harvesters charge a buffer capacitor from an unreliable environmental source until enough energy is stored to guarantee a burst of operation despite changes in power input. Current platforms use a fixed-size buffer chosen at design time to meet constraints on charge time or application longevity, but static energy buffers are a poor fit for the highly volatile power sources found in real-world deployments: fixed buffers waste energy both as heat when they reach capacity during a power surplus and as leakage when they fail to charge the system during a power deficit. To maximize batteryless system performance in the face of highly dynamic input power, we propose REACT: a responsive buffering circuit that varies total capacitance according to net input power. REACT uses a variable capacitor bank to expand capacitance and capture incoming energy during a power surplus, and reconfigures internal capacitors to reclaim additional energy from each capacitor as power input falls. Compared to fixed-capacity systems, REACT captures more energy, maximizes usable energy, and efficiently decouples system voltage from stored charge -- enabling low-power and high-performance designs previously limited by ambient power. Our evaluation on real-world platforms shows that REACT eliminates the tradeoff between responsiveness, efficiency, and longevity, increasing the energy available for useful work by 25.6% on average over static buffers optimized for reactivity and capacity, improving event responsiveness by 7.7x on average without sacrificing capacity, and enabling programmer-directed longevity guarantees.
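The energy arithmetic behind the buffer-sizing tension is elementary (standard capacitor physics, not REACT's specific circuit analysis): the energy usable between the maximum and minimum operating voltages grows linearly with the buffer capacitance, while a smaller capacitance reaches the turn-on voltage sooner after a deficit.

```latex
% Energy stored in a buffer capacitor, and the portion usable between the
% operating thresholds V_max and V_min: a larger C captures more surplus
% energy, a smaller C charges to V_min faster -- the trade-off a
% reconfigurable capacitor bank is designed to resolve.
E \;=\; \tfrac{1}{2}\, C V^{2},
\qquad
E_{\text{usable}} \;=\; \tfrac{1}{2}\, C \left(V_{\max}^{2} - V_{\min}^{2}\right).
```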
Most existing works in visual question answering (VQA) are dedicated to improving the accuracy of predicted answers while disregarding the explanations. We argue that the explanation for an answer is as important as, or even more important than, the answer itself, since it makes the question-answering process more understandable and traceable. To this end, we propose a new task of VQA-E (VQA with Explanation), where computational models are required to generate an explanation along with the predicted answer. We first construct a new dataset and then frame the VQA-E problem in a multi-task learning architecture. Our VQA-E dataset is automatically derived from the VQA v2 dataset by intelligently exploiting the available captions. We have conducted a user study to validate the quality of the explanations synthesized by our method. We quantitatively show that the additional supervision from explanations not only produces insightful textual sentences to justify the answers, but also improves the performance of answer prediction. Our model outperforms the state-of-the-art methods by a clear margin on the VQA v2 dataset.
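To make the multi-task framing concrete, here is a minimal sketch of a joint objective combining answer classification with explanation generation; the class name, loss weighting, and padding convention are hypothetical illustrations, not the paper's implementation.

```python
import torch.nn as nn

class VQAEMultiTaskLoss(nn.Module):
    """Illustrative multi-task objective for VQA-E-style training: a weighted
    sum of an answer-classification loss and an explanation-generation loss.
    Hypothetical sketch; the paper's architecture and weighting may differ."""
    def __init__(self, explanation_weight: float = 1.0):
        super().__init__()
        self.answer_loss = nn.CrossEntropyLoss()
        # ignore_index lets padded explanation tokens drop out of the loss
        self.explanation_loss = nn.CrossEntropyLoss(ignore_index=0)
        self.explanation_weight = explanation_weight

    def forward(self, answer_logits, answer_targets,
                explanation_logits, explanation_targets):
        # answer_logits: (batch, num_answers)
        # explanation_logits: (batch, seq_len, vocab_size)
        l_ans = self.answer_loss(answer_logits, answer_targets)
        l_exp = self.explanation_loss(
            explanation_logits.reshape(-1, explanation_logits.size(-1)),
            explanation_targets.reshape(-1))
        return l_ans + self.explanation_weight * l_exp
```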