In this paper, we propose a fuzzy adaptive loss function for enhancing deep learning performance in classification tasks. Specifically, we redefine the cross-entropy loss to effectively address class-level noise conditions, including the challenging problem of class imbalance. Our approach introduces aggregation operators, leveraging fuzzy logic to improve classification accuracy. The rationale behind our proposed method lies in the iterative up-weighting of class-level components within the loss function, focusing on those with larger errors. To achieve this, we employ the ordered weighted average (OWA) operator and combine it with an adaptive scheme for gradient-based learning. In extensive experiments, our method outperforms commonly used loss functions, such as the standard cross-entropy and focal loss, across various binary and multiclass classification tasks. Furthermore, we explore the influence of the hyperparameters associated with the OWA operators and present a default configuration that performs well across different experimental settings.
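Although the abstract does not fix an implementation, the core mechanism, reordering per-class loss components and assigning the largest weights to the largest errors, can be sketched as follows. The decreasing weight profile `owa_weights` and the batch-level aggregation are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

def owa_weights(n_classes, rho=0.5):
    # Hypothetical decreasing weight profile: larger weights go to
    # larger per-class errors (rho controls how "or-like" the OWA is).
    w = rho ** np.arange(n_classes)
    return w / w.sum()

def owa_cross_entropy(probs, onehot, rho=0.5, eps=1e-12):
    # Per-class cross-entropy components, averaged over the batch.
    per_class = -(onehot * np.log(probs + eps)).mean(axis=0)  # shape: (C,)
    # OWA step: sort the components in descending order, then take a
    # weighted sum, so the worst-performing classes are up-weighted.
    ordered = np.sort(per_class)[::-1]
    return ordered @ owa_weights(len(per_class), rho)

# Toy usage: 4 samples, 3 classes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.3, 0.3, 0.4],
                  [0.2, 0.1, 0.7]])
onehot = np.eye(3)[[0, 1, 2, 2]]
print(owa_cross_entropy(probs, onehot))
```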
In this paper, we propose a computational interpretation of the generalized Kreisel-Putnam rule, also known as the generalized Harrop rule or simply the Split rule, in the style of BHK semantics. We achieve this by exploiting the Curry-Howard correspondence between formulas and types. First, we inspect the inferential behavior of the Split rule in the setting of a natural deduction system for intuitionistic propositional logic. This guides our process of formulating an appropriate program that captures the corresponding computational content of the typed Split rule. In other words, we want to find an appropriate selector function for the Split rule by considering its typed variant. Our investigation can also be reframed as an effort to answer the following question: is the Split rule constructively valid in the sense of BHK semantics? Our answer is positive for the Split rule as well as for its newly proposed generalized version, called the S rule.
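For reference, the original Kreisel-Putnam rule is the inference

\[
  \frac{\neg A \rightarrow (B \lor C)}{(\neg A \rightarrow B) \lor (\neg A \rightarrow C)}\,,
\]

and the Harrop (Split) generalization replaces $\neg A$ with an arbitrary Harrop formula. Under the Curry-Howard correspondence, its computational content is precisely the selector sought above: a program transforming any term of type $\neg A \rightarrow (B + C)$ into a term of type $(\neg A \rightarrow B) + (\neg A \rightarrow C)$.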
In this paper, we introduce a novel approach for generating random elements of a finite group given a set of its generators. Our method draws upon combinatorial group theory and automata theory to achieve this objective. Furthermore, we explore the application of this method to generating random elements of a particularly significant group, namely the symmetric group (the group of permutations of a set). Through rigorous analysis, we demonstrate that our proposed method requires fewer swaps on average to generate permutations than existing approaches. Recognizing the needs of practical applications, we also propose a hardware implementation of our theoretical approach and provide a comprehensive comparison with previous methods. Our evaluation reveals that our method outperforms existing approaches in certain scenarios. Although our primary method only aims to speed up shuffling and does not decrease its time complexity, we also extend it to improve the time complexity.
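The abstract does not spell out the construction, but the baseline it measures swaps against is presumably Fisher-Yates shuffling; a minimal sketch of that baseline, together with a generic generator-based random walk (one standard way to produce random group elements from a generating set), is given below. The walk length and the lazy-step probability are illustrative choices.

```python
import random

def fisher_yates(n, rng=random):
    # Baseline: uniform random permutation using exactly n - 1 swaps.
    p = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i + 1)
        p[i], p[j] = p[j], p[i]
    return p

def random_word_element(generators, length, rng=random):
    # Generic approach: multiply random generators together (a lazy random
    # walk on the Cayley graph). The walk only approaches the uniform
    # distribution as `length` grows; it is not exactly uniform.
    n = len(generators[0])
    elem = list(range(n))
    for _ in range(length):
        if rng.random() < 1 / 3:
            continue  # lazy step: avoids parity/periodicity artifacts
        g = rng.choice(generators)
        elem = [elem[g[i]] for i in range(n)]  # compose permutations
    return elem

# S_4 generated by a transposition and a 4-cycle.
gens = [[1, 0, 2, 3], [1, 2, 3, 0]]
print(fisher_yates(4), random_word_element(gens, 50))
```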
In many practical applications, 3D point cloud analysis requires rotation invariance. In this paper, we present a learnable descriptor invariant under 3D rotations and reflections, i.e., the O(3) actions, utilizing the recently introduced steerable 3D spherical neurons and vector neurons. Specifically, we propose an embedding of the 3D spherical neurons into 4D vector neurons, which enables end-to-end training of the model. In our approach, we first apply the TetraTransform, an equivariant embedding of the 3D input into 4D constructed from the steerable neurons, and then extract deeper O(3)-equivariant features using vector neurons. Integrating the TetraTransform into the VN-DGCNN framework, termed TetraSphere, increases the number of parameters by a negligible amount (less than 0.0002%). TetraSphere sets a new state of the art in classifying randomly rotated real-world object scans from the challenging subsets of ScanObjectNN. Additionally, TetraSphere outperforms all equivariant methods on randomly rotated synthetic data: classifying objects from ModelNet40 and segmenting parts of ShapeNet shapes. Our results thus demonstrate the practical value of steerable 3D spherical neurons for learning in 3D Euclidean space.
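Invariance under O(3) can be verified empirically by comparing descriptor outputs on randomly rotated and reflected copies of a point cloud. The sketch below uses a simple hand-crafted invariant as a stand-in for the learned TetraSphere descriptor; only the test harness itself is the point.

```python
import numpy as np

def random_o3_matrix(rng):
    # QR of a Gaussian matrix (with sign correction) yields a Haar-random
    # rotation; negating one column with probability 1/2 adds reflections.
    q, r = np.linalg.qr(rng.standard_normal((3, 3)))
    q *= np.sign(np.diag(r))
    if rng.random() < 0.5:
        q[:, 0] = -q[:, 0]  # reflection (det = -1)
    return q

def descriptor(points):
    # Stand-in invariant descriptor: sorted distances to the centroid.
    # (TetraSphere learns a far richer invariant representation.)
    centered = points - points.mean(axis=0)
    return np.sort(np.linalg.norm(centered, axis=1))

rng = np.random.default_rng(0)
cloud = rng.standard_normal((128, 3))
d0 = descriptor(cloud)
for _ in range(5):
    R = random_o3_matrix(rng)
    assert np.allclose(descriptor(cloud @ R.T), d0, atol=1e-8)
print("descriptor is O(3)-invariant on this cloud")
```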
In Computational Fluid Dynamics (CFD), coarse-mesh simulations offer computational efficiency but often lack precision. Applying conventional super-resolution to these simulations poses a significant challenge due to the fundamental difference between downsampling high-resolution images and authentically emulating low-resolution physics: downsampled high-resolution data retains more of the underlying physics than a genuine coarse simulation provides, making the conventional task unrealistically easy. We propose a novel definition of super-resolution tailored to PDE-based problems. Instead of downsampling from a high-resolution dataset, we use coarse-grid simulated data as our input and predict fine-grid simulated outcomes. Employing a physics-infused UNet upscaling method, we demonstrate its efficacy across various 2D CFD problems, such as discontinuity detection in Burgers' equation, methane combustion, and fouling in industrial heat exchangers. Our method enables the generation of fine-mesh solutions while bypassing traditional simulation, yielding considerable computational savings and fidelity to the fine-grid ground truth. By training with diverse boundary conditions, we further establish the robustness of our method, paving the way for its broad application in engineering and scientific CFD solvers.
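In this setting, a training pair couples a coarse-grid simulation (input) with a fine-grid simulation (target), rather than a downsampled fine-grid field. A minimal sketch of a physics-infused loss, illustrated here with a finite-difference residual of the 1D Burgers' equation and a placeholder weighting `lam`, could look as follows; the paper's UNet and exact loss are not specified in the abstract.

```python
import torch

def burgers_residual(u, dt, dx, nu=0.01):
    # Discrete residual of u_t + u u_x - nu u_xx = 0 on a (time, space) grid.
    u_t = (u[1:, 1:-1] - u[:-1, 1:-1]) / dt
    u_x = (u[:-1, 2:] - u[:-1, :-2]) / (2 * dx)
    u_xx = (u[:-1, 2:] - 2 * u[:-1, 1:-1] + u[:-1, :-2]) / dx**2
    return u_t + u[:-1, 1:-1] * u_x - nu * u_xx

def physics_infused_loss(pred_fine, true_fine, dt, dx, lam=0.1):
    # Data term: match the fine-grid simulation; physics term: penalize
    # violations of the governing PDE on the predicted fine-grid field.
    data_term = torch.mean((pred_fine - true_fine) ** 2)
    phys_term = torch.mean(burgers_residual(pred_fine, dt, dx) ** 2)
    return data_term + lam * phys_term
```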
In this paper we introduce a novel statistical framework based on the first two quantile conditional moments that facilitates effective goodness-of-fit testing for one-sided L\'evy distributions. The scale-ratio framework introduced here extends our previous results, in which we showed how to extract unique distribution features using the conditional variance ratio for the generic class of $\alpha$-stable distributions. We show that the conditional moment-based goodness-of-fit statistics are a good alternative to other methods in the literature tailored to one-sided L\'evy distributions. The usefulness of our approach is verified through an empirical study of test power. For completeness, we also derive the asymptotic distributions of the test statistics and show how to apply our framework to real data.
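As an illustration of the conditional-moment idea, one can compare a sample conditional variance ratio computed on quantile-defined subsets against its Monte Carlo distribution under the hypothesized law; the quantile level, subset choice, and calibration below are our illustrative assumptions, not the paper's exact statistics. (The $\alpha = 1/2$ one-sided L\'evy variates are generated via $1/Z^2$ for standard normal $Z$.)

```python
import numpy as np

def conditional_variance_ratio(x, p=0.5):
    # Sample conditional variances below and above the empirical p-quantile.
    q = np.quantile(x, p)
    lower, upper = x[x <= q], x[x > q]
    return lower.var(ddof=1) / upper.var(ddof=1)

def levy_sample(size, rng):
    # One-sided (alpha = 1/2) Levy variates via 1/Z^2, Z ~ N(0, 1).
    z = rng.standard_normal(size)
    return 1.0 / z**2

rng = np.random.default_rng(1)
data = levy_sample(10_000, rng)
observed = conditional_variance_ratio(data)

# Monte Carlo reference distribution of the statistic under H0.
ref = np.array([conditional_variance_ratio(levy_sample(10_000, rng))
                for _ in range(200)])
p_value = np.mean(ref >= observed)  # one-sided illustration
print(observed, p_value)
```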
This work considers the fundamental problem of learning an unknown object from training data using a given model class. We introduce a unified framework that allows for objects in arbitrary Hilbert spaces, general types of (random) linear measurements as training data and general types of nonlinear model classes. We establish a series of learning guarantees for this framework. These guarantees provide explicit relations between the amount of training data and properties of the model class to ensure near-best generalization bounds. In doing so, we also introduce and develop the key notion of the variation of a model class with respect to a distribution of sampling operators. To exhibit the versatility of this framework, we show that it accommodates many well-known problems of interest. We present examples such as matrix sketching by random sampling, compressed sensing with isotropic vectors, active learning in regression and compressed sensing with generative models. In all cases, we show how known results become straightforward corollaries of our general learning guarantees. For compressed sensing with generative models, we also present a number of generalizations and improvements of recent results. In summary, our work not only introduces a unified way to study learning unknown objects from general types of data, but also establishes a series of general theoretical guarantees which consolidate and improve various known results.
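As one concrete instance that the framework subsumes, compressed sensing with isotropic random vectors can be posed as recovering a sparse vector from Gaussian measurements; the sketch below solves the resulting $\ell_1$-regularized least-squares problem with plain ISTA. This is a textbook-style instance, not the paper's general formulation.

```python
import numpy as np

def ista(A, y, lam=0.05, n_iter=500):
    # Iterative soft-thresholding for min_x 0.5*||Ax - y||^2 + lam*||x||_1.
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)
        z = x - grad / L
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)
    return x

rng = np.random.default_rng(2)
n, m, s = 200, 60, 5                       # ambient dim, measurements, sparsity
x_true = np.zeros(n)
x_true[rng.choice(n, s, replace=False)] = rng.standard_normal(s)
A = rng.standard_normal((m, n)) / np.sqrt(m)   # isotropic measurement vectors
y = A @ x_true
x_hat = ista(A, y)
print(np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```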
In this paper, we study numerical approximations for stochastic differential equations (SDEs) that use adaptive step sizes. In particular, we consider a general setting where decisions to reduce step sizes are allowed to depend on the future trajectory of the underlying Brownian motion. Since these adaptive step sizes may not be previsible, the standard mean squared error analysis cannot be directly applied to show that the numerical method converges to the solution of the SDE. Building upon the pioneering work of Gaines and Lyons, we shall instead use rough path theory to establish convergence for a wide class of adaptive numerical methods on general Stratonovich SDEs (with sufficiently smooth vector fields). To the author's knowledge, this is the first error analysis applicable to standard solvers, such as the Milstein and Heun methods, with non-previsible step sizes. In our analysis, we require the sequence of adaptive step sizes to be nested and the SDE solver to have unbiased "L\'evy area" terms in its Taylor expansion. We conjecture that for adaptive SDE solvers more generally, convergence is still possible provided the method does not introduce "L\'evy area bias". We present a simple example where the step size control can skip over previously considered times, resulting in the numerical method converging to an incorrect limit (i.e. not the Stratonovich SDE). Finally, we conclude with a numerical experiment demonstrating a newly introduced adaptive scheme and showing the potential improvements in accuracy when step sizes are allowed to be non-previsible.
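A minimal sketch of a nested, non-previsible step-size mechanism for a scalar Stratonovich SDE: when a step is rejected, its Brownian increment is split by sampling the midpoint from the Brownian bridge, so refinements remain consistent with the already-revealed path. The error estimate and tolerance rule below are illustrative choices, not the paper's scheme.

```python
import numpy as np

def heun_step(x, h, dw, f, g):
    # One Stratonovich-Heun step for dX = f(X) dt + g(X) o dW.
    xp = x + f(x) * h + g(x) * dw
    return x + 0.5 * (f(x) + f(xp)) * h + 0.5 * (g(x) + g(xp)) * dw

def adaptive_heun(x0, t_end, f, g, tol=1e-3, h0=0.05, rng=None):
    rng = rng or np.random.default_rng()
    n0 = int(np.ceil(t_end / h0))
    h = t_end / n0
    # Coarsest-level Brownian increments; later refinements stay nested.
    stack = [(h, rng.standard_normal() * np.sqrt(h)) for _ in range(n0)][::-1]
    x = x0
    while stack:
        h, dw = stack.pop()
        # Brownian bridge: midpoint increment conditioned on (h, dw).
        dw1 = 0.5 * dw + rng.standard_normal() * np.sqrt(h / 4)
        dw2 = dw - dw1
        coarse = heun_step(x, h, dw, f, g)
        fine = heun_step(heun_step(x, h / 2, dw1, f, g), h / 2, dw2, f, g)
        if abs(fine - coarse) > tol * h:
            # Reject: the step size now depends on the future of the path,
            # i.e. it is not previsible. Process the nested halves.
            stack.append((h / 2, dw2))
            stack.append((h / 2, dw1))
        else:
            x = fine
    return x

# Toy usage: scalar linear SDE with multiplicative noise.
print(adaptive_heun(1.0, 1.0, lambda x: 0.1 * x, lambda x: 0.5 * x))
```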
In this paper, we introduce a new simple approach to developing and establishing the convergence of splitting methods for a large class of stochastic differential equations (SDEs), including additive, diagonal and scalar noise types. The central idea is to view the splitting method as a replacement of the driving signal of an SDE, namely Brownian motion and time, with a piecewise linear path that yields a sequence of ODEs, which can then be discretised to produce a numerical scheme. This new way of understanding splitting methods is inspired by, but does not use, rough path theory. We show that when the driving piecewise linear path matches certain iterated stochastic integrals of Brownian motion, then a high order splitting method can be obtained. We propose a general proof methodology for establishing the strong convergence of these approximations that is akin to the general framework of Milstein and Tretyakov. That is, once local error estimates are obtained for the splitting method, then a global rate of convergence follows. This approach can then be readily applied in future research on SDE splitting methods. By incorporating recently developed approximations for iterated integrals of Brownian motion into these piecewise linear paths, we propose several high order splitting methods for SDEs satisfying a certain commutativity condition. In our experiments, which include the Cox-Ingersoll-Ross model and additive noise SDEs (noisy anharmonic oscillator, stochastic FitzHugh-Nagumo model, underdamped Langevin dynamics), the new splitting methods exhibit convergence rates of $O(h^{3/2})$ and outperform schemes previously proposed in the literature.
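The path-replacement viewpoint can be made concrete in a few lines: over each step, the Brownian driver is replaced by a single linear piece with the same increment, and the resulting ODE is integrated with RK4 substeps. Matching additional iterated integrals of Brownian motion (not attempted in this minimal sketch) is what produces the higher-order splittings described above.

```python
import numpy as np

def linear_path_step(x, h, dw, f, g, n_sub=8):
    # Replace (t, W) on the step by the linear path s -> (s, s * dw / h),
    # so the SDE becomes the ODE  x' = f(x) + g(x) * dw / h,
    # which we integrate with classical RK4 substeps.
    def rhs(x):
        return f(x) + g(x) * (dw / h)
    dt = h / n_sub
    for _ in range(n_sub):
        k1 = rhs(x)
        k2 = rhs(x + 0.5 * dt * k1)
        k3 = rhs(x + 0.5 * dt * k2)
        k4 = rhs(x + dt * k3)
        x = x + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return x

rng = np.random.default_rng(3)
f = lambda x: -x            # drift
g = lambda x: 0.5 * x       # (Stratonovich) diffusion
h, x = 0.01, 1.0
for _ in range(100):
    x = linear_path_step(x, h, rng.standard_normal() * np.sqrt(h), f, g)
print(x)
```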
In this paper, we derive an optimal first-order Taylor-like formula. In a previous paper [14], we introduced a new first-order Taylor-like formula that yields a reduced remainder compared to the classical Taylor formula. Here, we relax the assumption of equally spaced points in that formula: instead, we consider a sequence of unknown points together with a sequence of unknown weights, and solve an optimization problem to determine the distribution of points and weights that makes the remainder as small as possible.
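In our reading, the formulas under consideration take the form

\[
  f(a+h) = f(a) + h \sum_{k=0}^{n} \lambda_k\, f'\!\left(a + \theta_k h\right) + R_n(f), \qquad \theta_k \in [0,1],\ \ \sum_{k=0}^{n} \lambda_k = 1,
\]

where [14] fixes the equally spaced choice $\theta_k = k/n$, while here both the points $(\theta_k)$ and the weights $(\lambda_k)$ are obtained by minimizing a bound on the remainder $R_n(f)$. The normalization constraint, which makes the formula exact for affine $f$, is our assumption.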
We propose, in this paper, a Variable Spiking Wavelet Neural Operator (VS-WNO), which aims to bridge the gap between the theoretical and practical implementation of Artificial Intelligence (AI) algorithms for mechanics applications. With recent developments such as the introduction of neural operators, AI's potential for use in mechanics applications has increased significantly. However, AI's immense energy and resource requirements are a hurdle to its practical use in the field. The proposed VS-WNO is based on the principles of spiking neural networks, which have shown promise in reducing the energy requirements of neural networks, making such algorithms viable for edge computing. The proposed VS-WNO utilizes variable spiking neurons, which promote sparse communication and thus conserve energy, and it can tackle the regression tasks often encountered in mechanics. We demonstrate the approach on various examples dealing with partial differential equations, including the Burgers, Allen-Cahn, and Darcy equations, and compare it against a wavelet neural operator utilizing leaky integrate-and-fire neurons (with direct and encoded inputs) and a vanilla wavelet neural operator utilizing artificial neurons. The results illustrate the ability of the proposed VS-WNO to converge to the ground truth while promoting sparse communication.
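As a rough illustration of why spiking helps with energy, a leaky integrate-and-fire style unit emits output, and hence triggers downstream computation, only when its membrane potential crosses a threshold; variable spiking neurons, as we understand the abstract, make this behaviour tunable to balance sparsity against regression accuracy. All constants below are placeholders.

```python
import numpy as np

def lif_sequence(inputs, beta=0.9, threshold=1.0):
    # Leaky integrate-and-fire over a time-encoded input sequence.
    # Returns binary spikes; the fraction of zeros measures sparsity
    # (fewer spikes -> fewer downstream multiply-accumulates).
    v, spikes = 0.0, []
    for x in inputs:
        v = beta * v + x              # leaky integration
        s = 1.0 if v >= threshold else 0.0
        v -= s * threshold            # soft reset after a spike
        spikes.append(s)
    return np.array(spikes)

rng = np.random.default_rng(4)
x = rng.uniform(0, 0.4, size=100)
s = lif_sequence(x)
print("spike rate:", s.mean())        # sparse-communication proxy
```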