We study the problem of fairly allocating a set of indivisible goods among agents with matroid rank valuations -- every good provides a marginal value of $0$ or $1$ when added to a bundle and valuations are submodular. We generalize the Yankee Swap algorithm to create a simple framework, called General Yankee Swap, that can efficiently compute allocations that maximize any justice criterion (or fairness objective) satisfying some mild assumptions. Along with maximizing a justice criterion, General Yankee Swap is guaranteed to maximize utilitarian social welfare, ensure strategyproofness and use at most a quadratic number of valuation queries. We show how General Yankee Swap can be used to compute allocations for five different well-studied justice criteria: (a) Prioritized Lorenz dominance, (b) Maximin fairness, (c) Weighted leximin, (d) Max weighted Nash welfare, and (e) Max weighted $p$-mean welfare. In particular, our framework provides the first polynomial time algorithms to compute weighted leximin, max weighted Nash welfare and max weighted $p$-mean welfare allocations for agents with matroid rank valuations.
This research proposes a machine learning-based attack detection model for power systems, specifically targeting smart grids. By utilizing data and logs collected from Phasor Measurement Units (PMUs), the model aims to learn system behaviors and effectively identify potential security boundaries. The proposed approach involves crucial stages including dataset pre-processing, feature selection, model creation, and evaluation. To validate our approach, we used a dataset consisting of 15 separate datasets obtained from different PMUs, relay Snort alarms and logs. Three machine learning models (Random Forest, Logistic Regression, and K-Nearest Neighbour) were built and evaluated using various performance metrics. The findings indicate that the Random Forest model achieves the highest performance, with an accuracy of 90.56% in detecting power system disturbances, and has the potential to assist operators in decision-making processes.
The operationalization of algorithmic fairness comes with several practical challenges, not the least of which is the availability or reliability of protected attributes in datasets. In real-world contexts, practical and legal impediments may prevent the collection and use of demographic data, making it difficult to ensure algorithmic fairness. While initial fairness algorithms did not consider these limitations, recent proposals aim to achieve algorithmic fairness in classification by incorporating noisiness in protected attributes or not using protected attributes at all. To the best of our knowledge, this is the first head-to-head study of fair classification algorithms to compare attribute-reliant, noise-tolerant and attribute-blind algorithms along the dual axes of predictivity and fairness. We evaluated these algorithms via case studies on four real-world datasets and synthetic perturbations. Our study reveals that attribute-blind and noise-tolerant fair classifiers can potentially achieve a similar level of performance as attribute-reliant algorithms, even when protected attributes are noisy. However, implementing them in practice requires careful nuance. Our study provides insights into the practical implications of using fair classification algorithms in scenarios where protected attributes are noisy or partially available.
Citation metrics are the best tools for research assessments. However, current metrics are misleading in research systems that simultaneously pursue different goals, such as the advance of science and incremental innovations, because their publications have different citation distributions. We estimate the contribution to the progress of knowledge by studying only a limited number of the most cited papers, which are dominated by publications pursuing this progress. To field-normalize the metrics, we replace the number of citations with the rank position of papers from one country in the global list of papers. Using synthetic series of lognormally distributed numbers, we developed the Rk-index, which is calculated from the global ranks of the 10 highest numbers in each series, and demonstrate its equivalence to the number of papers in top percentiles, $P_{\mathrm{top}\,0.1\%}$ and $P_{\mathrm{top}\,0.01\%}$. In real cases, the Rk-index is simple and easy to calculate, and evaluates the contribution to the progress of knowledge much better than commonly used metrics. Although further research is needed, rank analysis of the most cited papers is a promising approach for research evaluation. It is also demonstrated that, for this purpose, domestic and collaborative papers should be studied independently.
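The rank substitution described in the abstract above can be illustrated with a minimal sketch. It only extracts the global rank positions of one country's most cited papers; it does not compute the Rk-index itself, whose formula is not given in the abstract. The function name `top_global_ranks`, the boolean `from_country` flags, and the tie-breaking rule (ties keep their input order) are assumptions made for illustration.

```python
def top_global_ranks(citations, from_country, k=10):
    """Global rank positions (1 = most cited) of a country's k most cited papers.

    citations    : citation counts for every paper in the global list
    from_country : parallel list of booleans, True if the paper belongs
                   to the country under study (hypothetical input format)
    """
    # Order all papers worldwide by citations, highest first.
    # sorted() is stable, so tied papers keep their input order (an assumption).
    order = sorted(range(len(citations)), key=lambda i: citations[i], reverse=True)
    # Keep the 1-based global position of each paper from the country.
    ranks = [pos + 1 for pos, i in enumerate(order) if from_country[i]]
    return ranks[:k]


# Toy data: five papers, three of them from the country under study.
ranks = top_global_ranks([50, 10, 40, 30, 20], [False, True, True, False, True], k=2)
print(ranks)
```

Working with rank positions rather than raw counts is what makes the comparison independent of field-specific citation levels, which is the normalization the abstract describes.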
Over the course of the past two decades, a substantial body of research has substantiated the viability of utilising cardiac signals as a biometric modality. This paper presents a novel approach for patient identification in healthcare systems using electrocardiogram signals. A convolutional neural network is used to classify users based on images extracted from ECG signals. The proposed identification system is evaluated in multiple databases, providing a comprehensive understanding of its potential in real-world scenarios. The impact of Cardiovascular Diseases on generic user identification has been largely overlooked in previous studies. The presented method takes into account the cardiovascular condition of the patients, ensuring that the results obtained are not biased or limited. Furthermore, the results obtained are consistent and reliable, with lower error rates and higher accuracy metrics, as demonstrated through extensive experimentation. All these features make the proposed method a valuable contribution to the field of patient identification in healthcare systems, and make it a strong contender for practical applications.
The problem of comparing probability distributions is at the heart of many tasks in statistics and machine learning and the most classical comparison methods assume that the distributions occur in spaces of the same dimension. Recently, a new geometric solution has been proposed to address this problem when the measures live in Euclidean spaces of differing dimensions. Here, we study the same problem of comparing probability distributions of different dimensions in the tropical geometric setting, which is becoming increasingly relevant in computations and applications involving complex, geometric data structures. Specifically, we construct a Wasserstein distance between measures on different tropical projective tori - the focal metric spaces in both theory and applications of tropical geometry - via tropical mappings between probability measures. We prove equivalence of the directionality of the maps, whether starting from the lower dimensional space and mapping to the higher dimensional space or vice versa. As an important practical implication, our work provides a framework for comparing probability distributions on the spaces of phylogenetic trees with different leaf sets.
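As background for the abstract above, the following sketch shows the tropical metric that is commonly used on the tropical projective torus, $d_{\mathrm{tr}}(x, y) = \max_i (x_i - y_i) - \min_i (x_i - y_i)$. The abstract does not state which ground metric the Wasserstein distance is built on, so treating this as the ground metric is an assumption, and the function name `tropical_distance` is hypothetical.

```python
def tropical_distance(x, y):
    """Tropical metric between two points of the same dimension.

    Points are representatives of classes in R^n modulo adding the same
    constant to every coordinate, so the result must not change when a
    constant vector is added to x or to y.
    """
    if len(x) != len(y):
        raise ValueError("points must have the same number of coordinates")
    diffs = [a - b for a, b in zip(x, y)]
    return max(diffs) - min(diffs)


# Shifting either point by a constant vector leaves the distance unchanged.
print(tropical_distance([0, 1, 3], [0, 0, 0]))
print(tropical_distance([5, 6, 8], [2, 2, 2]))
```

Both calls describe the same pair of classes in the torus, which is why they give the same value.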
Mobile edge computing (MEC) enables low-latency and high-bandwidth applications by bringing computation and data storage closer to end-users. Intelligent computing is an important application of MEC, where computing resources are used to solve intelligent task-related problems based on task requirements. However, efficiently offloading computing and allocating resources for intelligent tasks in MEC systems is a challenging problem due to complex interactions between task requirements and MEC resources. To address this challenge, we investigate joint computing offloading and resource allocation for intelligent tasks in MEC systems. Our goal is to optimize system utility by jointly considering computing accuracy and task delay to achieve maximum system performance. We focus on classification intelligence tasks and formulate an optimization problem that considers both the accuracy requirements of tasks and the parallel computing capabilities of MEC systems. To solve the optimization problem, we decompose it into three subproblems: subcarrier allocation, computing capacity allocation, and compression offloading. We use convex optimization and successive convex approximation to derive closed-form expressions for the subcarrier allocation, offloading decisions, computing capacity, and compression ratio. Based on our solutions, we design an efficient computing offloading and resource allocation algorithm for intelligent tasks in MEC systems. Our simulation results demonstrate that, compared with the benchmarks, our proposed algorithm significantly improves the performance of intelligent tasks in MEC systems and achieves a flexible trade-off between system revenue and cost.
This paper presents the FormAI dataset, a large collection of 112,000 AI-generated compilable and independent C programs with vulnerability classification. We introduce a dynamic zero-shot prompting technique, constructed to spawn a diverse set of programs utilizing Large Language Models (LLMs). The dataset is generated by GPT-3.5-turbo and comprises programs with varying levels of complexity. Some programs handle complicated tasks such as network management, table games, or encryption, while others deal with simpler tasks like string manipulation. Every program is labeled with the vulnerabilities found within the source code, indicating the type, line number, and vulnerable function name. This is accomplished by employing a formal verification method using the Efficient SMT-based Bounded Model Checker (ESBMC), which performs model checking, abstract interpretation, constraint programming, and satisfiability modulo theories, to reason over safety/security properties in programs. This approach definitively detects vulnerabilities and offers a formal model known as a counterexample, thus eliminating the possibility of generating false positive reports. This property of the dataset makes it suitable for evaluating the effectiveness of various static and dynamic analysis tools. Furthermore, we have associated the identified vulnerabilities with relevant Common Weakness Enumeration (CWE) numbers. We make the source code available for the 112,000 programs, accompanied by a comprehensive list detailing the vulnerabilities detected in each individual program including location and function name, which makes the dataset ideal to train LLMs and machine learning algorithms.
As Internet of Things (IoT) devices proliferate, sustainable methods for powering them are becoming indispensable. The wireless provision of power enables battery-free operation and is crucial for complying with weight and size restrictions. For the energy harvesting components of these devices to be small, a high operating frequency is necessary. In conjunction with an electrically large antenna, the receivers may be located in the radiating near-field (Fresnel) region, e.g., in indoor scenarios. In this paper, we propose a wireless power transfer system to ensure a reliable supply of power to an arbitrary number of mobile, low-power, and single-antenna receivers, which are located in a three-dimensional cuboid room. To this end, we formulate a max-min optimisation problem to determine the optimal allocation of transmit power among an infinite number of radiating elements of the system's transmit antenna array. Thereby, the optimal deployment, i.e., the set of transmit antenna positions that are allocated non-zero transmit power according to the optimal allocation, is obtained implicitly. Generally, the set of transmit antenna positions corresponding to the optimal deployment has Lebesgue measure zero and the closure of the set has empty interior. Moreover, for a one-dimensional transmit antenna array, the set of transmit antenna positions is proven to be finite. The proposed optimal solution is validated through simulation. Simulation results indicate that the optimal deployment requires a finite number of transmit antennas and depends on the geometry of the environment and the dimensionality of the transmit antenna array. The robustness of the solution, which is obtained under a line-of-sight (LoS) assumption between the transmitter and receiver, is assessed in an isotropic scattering environment containing a strong LoS component.
To improve the convergence property of the randomized Kaczmarz (RK) method for solving linear systems, Bai and Wu (SIAM J. Sci. Comput., 40(1):A592--A606, 2018) originally introduced a greedy probability criterion for effectively selecting the working row from the coefficient matrix and constructed the greedy randomized Kaczmarz (GRK) method. Due to its simplicity and efficiency, this approach has inspired numerous subsequent works in recent years, such as the capped adaptive sampling rule, the greedy augmented randomized Kaczmarz method, and the greedy randomized coordinate descent method. Since the iterates of the GRK method are actually random variables, existing convergence analyses are all related to the expectation of the error. In this note, we prove that the linear convergence rate of the GRK method is deterministic, i.e., not in the sense of expectation. Moreover, Polyak's heavy ball momentum technique is incorporated to improve the performance of the GRK method. We propose a refined convergence analysis, compared with the technique used in Loizou and Richt\'{a}rik (Comput. Optim. Appl., 77(3):653--710, 2020), of momentum variants of randomized iterative methods, which shows that the proposed GRK method with momentum (mGRK) also enjoys a deterministic linear convergence. Numerical experiments show that the mGRK method is more efficient than the GRK method.
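The greedy row selection and the heavy ball momentum step mentioned in the abstract above can be sketched as follows. This is a minimal illustration written from the commonly stated form of the GRK criterion, not the authors' implementation: the function name `grk_solve`, the parameters `alpha` and `beta`, the stopping tolerance, and the fallback for an empty candidate set are all assumptions. The sketch assumes a consistent system with no zero rows; with `beta=0.0` and `alpha=1.0` it reduces to a plain greedy randomized Kaczmarz iteration.

```python
import random


def grk_solve(A, b, iters=200, alpha=1.0, beta=0.0, seed=0):
    """Greedy randomized Kaczmarz sketch with optional heavy ball momentum.

    A is a list of rows, b a list of right-hand sides. alpha (step size)
    and beta (momentum weight) are illustrative parameters.
    """
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    row_sq = [sum(v * v for v in row) for row in A]  # squared row norms
    fro_sq = sum(row_sq)                             # squared Frobenius norm
    x = [0.0] * n
    x_prev = list(x)
    for _ in range(iters):
        # Residual r = b - A x at the current iterate.
        r = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(m)]
        r_sq = sum(v * v for v in r)
        if r_sq < 1e-28:  # assumed stopping tolerance
            break
        ratios = [r[i] ** 2 / row_sq[i] for i in range(m)]
        # Greedy threshold: average of the largest relative residual
        # and the uniform level 1 / ||A||_F^2.
        eps = 0.5 * (max(ratios) / r_sq + 1.0 / fro_sq)
        cand = [i for i in range(m) if r[i] ** 2 >= eps * r_sq * row_sq[i]]
        if not cand:  # guard against floating-point rounding
            cand = [max(range(m), key=lambda i: ratios[i])]
        # Sample a candidate row with probability proportional to r_i^2.
        i = rng.choices(cand, weights=[r[c] ** 2 for c in cand], k=1)[0]
        step = alpha * r[i] / row_sq[i]
        # Kaczmarz projection step plus momentum term beta * (x - x_prev).
        x_new = [x[j] + step * A[i][j] + beta * (x[j] - x_prev[j]) for j in range(n)]
        x_prev, x = x, x_new
    return x


# Consistent 3x2 system whose exact solution is x = [1, 2].
A = [[2.0, 1.0], [1.0, 3.0], [1.0, -1.0]]
b = [4.0, 7.0, -1.0]
x = grk_solve(A, b)
print([round(v, 6) for v in x])
```

The candidate set keeps only rows whose relative residual is at least the threshold, so every selected row makes at least an average amount of progress; this is the property behind the convergence guarantees discussed in the abstract.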
In real applications, non-Gaussian distributions are frequently caused by outliers and impulsive disturbances, which impair the performance of the classical cubature Kalman filter (CKF) algorithm. In this letter, a modified generalized minimum error entropy criterion with fiducial point (GMEEFP) is studied to ensure that the error concentrates around zero, and a new CKF algorithm based on the GMEEFP criterion, called the GMEEFP-CKF algorithm, is developed. To demonstrate the practicality of the GMEEFP-CKF algorithm, several simulations are performed, and it is demonstrated that the proposed GMEEFP-CKF algorithm outperforms the existing CKF algorithms in the presence of impulse noise.
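For context on the classical CKF that the abstract above builds on, the sketch below generates the cubature points of the standard third-degree spherical-radial rule: for an $n$-dimensional state with mean $m$ and covariance $P = LL^{\top}$, the $2n$ points are $m \pm \sqrt{n}\,L e_j$, each with weight $1/(2n)$. It does not implement the GMEEFP criterion, whose details are not given in the abstract. The helper names `cholesky` and `cubature_points` are hypothetical, and the covariance is assumed to be symmetric positive definite.

```python
import math


def cholesky(P):
    """Lower-triangular L with L L^T = P (P assumed symmetric positive definite)."""
    n = len(P)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(P[i][i] - s)
            else:
                L[i][j] = (P[i][j] - s) / L[j][j]
    return L


def cubature_points(mean, cov):
    """Return the 2n equally weighted cubature points for N(mean, cov)."""
    n = len(mean)
    L = cholesky(cov)
    scale = math.sqrt(n)
    points = []
    for sign in (1.0, -1.0):
        for j in range(n):
            # Move along the j-th column of L, scaled by sqrt(n).
            points.append([mean[i] + sign * scale * L[i][j] for i in range(n)])
    return points


pts = cubature_points([1.0, 2.0], [[4.0, 0.0], [0.0, 9.0]])
for p in pts:
    print(p)
```

Averaging the points with equal weights recovers the mean and covariance they were built from, which is what lets the filter propagate both through a nonlinear model.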