The combined global growth of energy demand and environmental pollution is prompting a rethinking of production models in sustainable terms. As a consequence, energy suppliers are starting to adopt strategies that flatten demand peaks in power plants by means of pricing policies that encourage customers to change their consumption practices. A representative example is the Time-of-Use (TOU)-based tariffs policy, which encourages electricity usage at off-peak hours by means of low prices, while penalizing peak hours with higher prices. The TOU-based tariffs policy induces a partitioning of the time horizon into a set of time slots, each associated with a cost that becomes part of the optimization objective. This thesis focuses on a representative bi-objective energy-efficient job scheduling problem on parallel identical machines under TOU-based tariffs, delving into its inherent properties, mathematical formulations, and solution approaches. Specifically, the thesis starts by reviewing the flourishing literature on the subject and providing a useful framework for theoreticians and practitioners. Subsequently, it describes the considered problem and investigates its theoretical properties. In the same chapter, it presents a first mathematical model for the problem, as well as a reformulation that exploits the structure of the solution space to achieve a considerably more compact model. Afterwards, the thesis introduces a sophisticated heuristic scheme to tackle the inherent hardness of the problem, and an exact algorithm that exploits the mathematical models. It then demonstrates the computational efficiency of the proposed solution approaches on a broad benchmark of test instances. Finally, it offers a perspective on future research directions for the class of energy-efficient scheduling problems under TOU-based tariffs as a whole.
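Since the abstract only summarizes the model, the following minimal Python sketch illustrates how a TOU tariff turns the partition of the time horizon into a cost component of a schedule; the unit-power machines, the discretised horizon, and the helper shown are assumptions made for illustration and are not the thesis' formulation.

\begin{verbatim}
# Illustrative sketch (not the thesis' model): energy cost of a schedule on
# identical parallel machines under a Time-of-Use tariff. Time is discretised
# into unit periods, each belonging to a tariff slot with a given unit price.

def tou_cost(schedule, slot_price, power_per_machine=1.0):
    """schedule: dict machine -> list of (start, end) busy intervals;
    slot_price: slot_price[t] is the energy price in period t."""
    cost = 0.0
    for intervals in schedule.values():
        for start, end in intervals:
            for t in range(start, end):
                cost += power_per_machine * slot_price[t]
    return cost

# Two machines over a 6-period horizon with off-peak / peak / off-peak prices.
prices = [1.0, 1.0, 3.0, 3.0, 1.0, 1.0]
schedule = {0: [(0, 3)], 1: [(2, 5)]}   # machine 0 busy in [0,3), machine 1 in [2,5)
print(tou_cost(schedule, prices))        # (1+1+3) + (3+3+1) = 12.0
\end{verbatim}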
As an efficient distributed machine learning approach, federated learning (FL) obtains a shared model through iterative local model training on the user side and global model aggregation on the central server side, thereby protecting user privacy. Mobile users in FL systems typically communicate with base stations (BSs) via wireless channels, where training performance can be degraded by unreliable access caused by user mobility. However, existing work investigates only static scenarios or random initializations of user locations, which fail to capture mobility in real-world networks. To tackle this issue, we propose a practical model for user mobility in FL across multiple BSs, and develop a user scheduling and resource allocation method to minimize the training delay under constrained communication resources. Specifically, we first formulate an optimization problem with user mobility that jointly considers user selection, BS assignment to users, and bandwidth allocation to minimize the latency in each communication round. This problem turns out to be NP-hard, and we propose a delay-aware greedy search algorithm (DAGSA) to solve it. Simulation results show that the proposed algorithm achieves better performance than state-of-the-art baselines and that a certain level of user mobility can improve training performance.
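The abstract does not detail DAGSA, so the sketch below is only a generic illustration of delay-aware greedy user selection under a shared bandwidth budget; the equal bandwidth split, the parameter names, and the latency model are assumptions and do not reproduce the paper's algorithm.

\begin{verbatim}
# Generic delay-aware greedy sketch (not the paper's DAGSA): pick K users per
# round so that the round latency, i.e. the slowest selected user's local
# computation plus upload time, stays small when the bandwidth B is split
# equally among the selected users.

def greedy_user_selection(users, K, B):
    """users: list of dicts with 'comp' (local training time, s), 'bits'
    (model update size, bits) and 'rate' (achievable bits/s per Hz).
    Returns the selected users and the resulting round latency."""
    share = B / K                                    # equal bandwidth split
    def delay(u):
        return u['comp'] + u['bits'] / (u['rate'] * share)
    chosen = sorted(users, key=delay)[:K]            # keep the K fastest users
    return chosen, max(delay(u) for u in chosen)

users = [
    {'comp': 2.0, 'bits': 1e6, 'rate': 1e3},
    {'comp': 0.5, 'bits': 1e6, 'rate': 5e2},
    {'comp': 1.0, 'bits': 2e6, 'rate': 2e3},
]
selected, latency = greedy_user_selection(users, K=2, B=1e3)   # 1 kHz of bandwidth
print(latency)                                                 # 4.0 for this toy data
\end{verbatim}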
In this paper, we show that in a parallel processing system, if a partial order is induced among the local states visited by a node, then synchronization cost can be eliminated. As a result of this partial order, a DAG is induced among the global states. Specifically, we show that in such systems, correctness is preserved even if the nodes execute asynchronously and read old information about other nodes. We present two variations for inducing DAGs -- \textit{DAG-inducing problems}, where the problem definition itself induces a DAG, and \textit{DAG-inducing algorithms}, where a DAG is induced by the algorithm. We demonstrate that the dominant clique (DC) problem and shortest path (SP) problem are DAG-inducing problems. Among these, DC allows self-stabilization, whereas the algorithm that we present for SP does not. We demonstrate that maximal matching (MM) is not a DAG-inducing problem; however, a DAG-inducing algorithm can be developed for it. The algorithm for MM allows self-stabilization. This algorithm converges in $2n$ moves and does not require a synchronous environment, which is an improvement over the existing algorithms in the literature. The algorithm for DC converges in $2m$ moves, and the algorithm for SP converges in $\mathcal{D}$ rounds. (Here $n$ is the number of nodes and $m$ is the number of edges of the input graph, and $\mathcal{D}$ is its diameter.) We also note that DAG-inducing problems are more general than, and encapsulate, lattice linear problems (Garg, SPAA 2020). Similarly, DAG-inducing algorithms encapsulate lattice linear algorithms (Gupta and Kulkarni, SSS 2022). We also show that a partial order induced among the local states visited by a node, as discussed above, is a necessary and sufficient condition to allow asynchrony.
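As a rough illustration of why asynchrony with stale reads can be tolerated for a problem such as SP (this is not the paper's algorithm), the following Python sketch relaxes single-source shortest-path distances while each node reads possibly outdated copies of its neighbours' values; since the distances only ever decrease towards their final values, the stale reads delay convergence but do not break it.

\begin{verbatim}
# Illustrative sketch (not the paper's algorithm): asynchronous single-source
# shortest-path relaxation where each node updates its distance from possibly
# stale copies of its neighbours' values.

import random

def async_sssp(adj, source, steps=10_000, seed=0):
    """adj: dict node -> list of (neighbour, edge_weight); returns distances."""
    rng = random.Random(seed)
    dist = {v: float('inf') for v in adj}
    dist[source] = 0.0
    stale = {v: dict(dist) for v in adj}       # each node's possibly old view
    nodes = list(adj)
    for _ in range(steps):
        v = rng.choice(nodes)                  # nodes act in an arbitrary order
        if rng.random() < 0.5:                 # occasionally refresh the stale view
            stale[v] = dict(dist)
        if v != source:
            best = min((stale[v][u] + w for u, w in adj[v]), default=float('inf'))
            dist[v] = min(dist[v], best)       # relax using possibly old information
    return dist

graph = {0: [(1, 1)], 1: [(0, 1), (2, 2)], 2: [(1, 2)]}
print(async_sssp(graph, source=0))             # converges to {0: 0.0, 1: 1.0, 2: 3.0}
\end{verbatim}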
In this paper we study the relation between two fundamental problems in scheduling and fair allocation: makespan minimization on unrelated parallel machines and max-min fair allocation, also known as the Santa Claus problem. For both of these problems the best approximation factor is a notorious open question; more precisely, whether there is a better-than-2 approximation for the former problem and whether there is a constant-factor approximation for the latter. While the two problems are intuitively related and history has shown that techniques can often be transferred between them, no formal reductions are known. We first show that an affirmative answer to the open question for makespan minimization implies the same for the Santa Claus problem by reducing the latter problem to the former. We also prove that for problem instances with only two input values both questions are equivalent. We then move to a special case called ``restricted assignment'', which is well studied for both problems. Although our reductions do not maintain the characteristics of this special case, we give a reduction in a slight generalization, where the jobs or resources are assigned to multiple machines or players subject to a matroid constraint and, in addition, we have only two values. This draws a picture similar to the one above: the two problems are equivalent in the two-value case, and the general Santa Claus problem can only be easier than makespan minimization. To complete the picture, we give an algorithm for our new matroid variant of the Santa Claus problem using a non-trivial extension of the local search method from restricted assignment. Thereby we unify, generalize, and improve several previous results. We believe that this matroid generalization may be of independent interest and provide several sample applications.
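For concreteness, the two objectives can be stated side by side: with an assignment $\sigma$ of jobs/resources to machines/players, $p_{ij}$ the processing time of job $j$ on machine $i$, and $v_{ij}$ the value of resource $j$ to player $i$,
\[
\text{makespan minimization: } \min_{\sigma}\, \max_{i} \sum_{j:\, \sigma(j)=i} p_{ij},
\qquad
\text{Santa Claus: } \max_{\sigma}\, \min_{i} \sum_{j:\, \sigma(j)=i} v_{ij},
\]
which makes the min-max versus max-min duality behind the reductions explicit.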
In many stochastic service systems, decision-makers find themselves making a sequence of decisions, with the number of decisions being unpredictable. To enhance these decisions, it is crucial to uncover the causal impact of these decisions through careful analysis of observational data from the system. However, these decisions are not made independently, as they are shaped by previous decisions and outcomes. This phenomenon is called sequential bias and violates a key assumption in causal inference that one person's decision does not interfere with the potential outcomes of another. To address this issue, we establish a connection between sequential bias and the subfield of causal inference known as dynamic treatment regimes. We expand these frameworks to account for the random number of decisions by modeling the decision-making process as a marked point process. Consequently, we can define and identify causal effects to quantify sequential bias. Moreover, we propose estimators and explore their properties, including double robustness and semiparametric efficiency. In a case study of 27,831 encounters with a large academic emergency department, we use our approach to demonstrate that the decision to route a patient to an area for low-acuity patients has a significant impact on the care of future patients.
\textit{Weighted shortest processing time first} (WSPT) is one of the best known algorithms for total weighted completion time scheduling problems. For each job $J_j$, it first combines the two independent job parameters weight $w_j$ and processing time $p_j$ by simply forming the so-called Smith ratio $w_j/p_j$. Then it schedules the jobs in order of decreasing Smith ratio values. The algorithm guarantees an optimal schedule on a single machine and an approximation factor of $1.2071$ on parallel identical machines. For the corresponding online problem in a single machine environment with preemption, the \textit{weighted shortest remaining processing time first} (WSRPT) algorithm replaces the processing time $p_j$ with the remaining processing time $p_j(t)$ for every job that is only partially executed at time $t$ when determining the Smith ratio. For more than ten years, it has only been known that the competitive ratio of this algorithm lies in the interval $[1.2157,2]$. In this paper, we present the tight competitive ratio $1.2259$ for WSRPT. To this end, we iteratively reduce the instance space of the problem without affecting the worst-case performance until we are able to analyze the remaining instances. This result makes WSRPT the best known algorithm for deterministic online total weighted completion time scheduling in a preemptive single machine environment, improving on the previous competitive ratio of $1.5651$. Additionally, we raise the lower bound on this competitive ratio from $1.0730$ to $1.1038$.
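A minimal Python sketch of the offline WSPT rule described above (the instance is made up for illustration; the paper's WSRPT analysis is not reproduced here):

\begin{verbatim}
# WSPT on a single machine: schedule jobs in non-increasing order of the Smith
# ratio w_j / p_j, which is optimal for total weighted completion time. The
# online WSRPT rule uses the remaining processing time p_j(t) instead of p_j.

def wspt(jobs):
    """jobs: list of (weight, processing_time); returns (order, sum of w_j * C_j)."""
    order = sorted(jobs, key=lambda j: j[0] / j[1], reverse=True)
    t, objective = 0.0, 0.0
    for w, p in order:
        t += p                      # completion time C_j of the job just scheduled
        objective += w * t
    return order, objective

jobs = [(3, 2), (1, 4), (2, 1)]     # (w_j, p_j) with Smith ratios 1.5, 0.25, 2.0
print(wspt(jobs))                   # ([(2, 1), (3, 2), (1, 4)], 18.0)
\end{verbatim}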
Linear Regression is a seminal technique in statistics and machine learning, where the objective is to build linear predictive models between a response (i.e., dependent) variable and one or more predictor (i.e., independent) variables. In this paper, we revisit the classical technique of Quantile Regression (QR), which is statistically a more robust alternative to the other classical technique of Ordinary Least Squares Regression (OLS). However, while there exist efficient algorithms for OLS, almost all of the known results for QR are only weakly polynomial. Towards filling this gap, this paper proposes several efficient strongly polynomial algorithms for QR in various settings. For two-dimensional QR, making a connection to the geometric concept of $k$-sets, we propose an algorithm with a deterministic worst-case time complexity of $\mathcal{O}(n^{4/3}\,\mathrm{polylog}(n))$ and an expected time complexity of $\mathcal{O}(n^{4/3})$ for its randomized version. We also propose a randomized divide-and-conquer algorithm, RandomizedQR, with an expected time complexity of $\mathcal{O}(n\log^2{(n)})$ for the two-dimensional QR problem. For the general case with more than two dimensions, our RandomizedQR algorithm has an expected time complexity of $\mathcal{O}(n^{d-1}\log^2{(n)})$.
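For context, the classical weakly polynomial route that the paper improves upon treats QR as a linear program: at quantile level $\tau$, the pinball loss is minimized after splitting each residual into its positive and negative parts. A minimal sketch using scipy (the data and the helper name are illustrative, not the paper's algorithms):

\begin{verbatim}
# Classical LP view of quantile regression: minimize
#   sum_i [ tau * u_i^+ + (1 - tau) * u_i^- ]  subject to  X beta + u^+ - u^- = y,
# with u^+, u^- >= 0 and beta free.

import numpy as np
from scipy.optimize import linprog

def quantile_regression_lp(X, y, tau=0.5):
    """X: (n, d) design matrix (include a column of ones for an intercept);
    y: (n,) responses. Returns the fitted coefficient vector beta."""
    n, d = X.shape
    c = np.concatenate([np.zeros(d), tau * np.ones(n), (1 - tau) * np.ones(n)])
    A_eq = np.hstack([X, np.eye(n), -np.eye(n)])      # X beta + u+ - u- = y
    bounds = [(None, None)] * d + [(0, None)] * (2 * n)
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    return res.x[:d]

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(50), rng.uniform(0, 10, 50)])
y = 2.0 + 0.5 * X[:, 1] + rng.standard_normal(50)
print(quantile_regression_lp(X, y, tau=0.5))          # roughly [2.0, 0.5]
\end{verbatim}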
Current automated machine learning (ML) tools are model-centric, focusing on model selection and parameter optimization. However, the majority of the time in data analysis is devoted to data cleaning and wrangling, for which limited tools are available. Here we present DataAssist, an automated data preparation and cleaning platform that enhances dataset quality using ML-informed methods. We show that DataAssist provides a pipeline for exploratory data analysis and data cleaning, including generating visualizations for user-selected variables, unifying data annotation, suggesting anomaly removal, and preprocessing data. The exported dataset can be readily integrated with other autoML tools or user-specified models for downstream analysis. Our data-centric tool is applicable to a variety of fields, including economics, business, and forecasting applications, saving over 50\% of the time spent on data cleaning and preparation.
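Since the abstract does not expose DataAssist's interface, the following pandas sketch only illustrates the kind of steps described (duplicate removal, gap filling, and anomaly suggestion); the helper and its behaviour are hypothetical and not the tool's actual API.

\begin{verbatim}
# Hypothetical data-cleaning step of the kind the abstract describes.

import pandas as pd

def basic_clean(df: pd.DataFrame) -> pd.DataFrame:
    """Drop duplicate rows, fill numeric gaps with the median, and flag
    IQR-based outliers for review by the user or a downstream autoML tool."""
    df = df.drop_duplicates().copy()
    numeric = df.select_dtypes("number").columns
    df[numeric] = df[numeric].fillna(df[numeric].median())
    q1, q3 = df[numeric].quantile(0.25), df[numeric].quantile(0.75)
    iqr = q3 - q1
    outlier = ((df[numeric] < q1 - 1.5 * iqr) | (df[numeric] > q3 + 1.5 * iqr)).any(axis=1)
    df["suggested_outlier"] = outlier
    return df

raw = pd.DataFrame({"price": [10, 11, 12, None, 500], "units": [1, 2, 2, 3, 2]})
print(basic_clean(raw))
\end{verbatim}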
This paper discusses and evaluates ideas of data balancing and data augmentation in the context of mathematical objects: an important topic for both the symbolic computation and satisfiability checking communities, when they are making use of machine learning techniques to optimise their tools. We consider a dataset of non-linear polynomial problems and the problem of selecting a variable ordering for cylindrical algebraic decomposition with which to tackle them. By swapping the variable names in already labelled problems, we generate new problem instances that do not require any further labelling when viewing the selection as a classification problem. We find that this augmentation increases the accuracy of ML models by 63\% on average. We study how much of this improvement is due to the balancing of the dataset and how much is achieved by further increasing the size of the dataset, concluding that both have a very significant effect. We finish the paper by reflecting on how this idea could be applied in other uses of machine learning in mathematics.
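A small Python sketch of the augmentation idea: permuting variable names in an already labelled problem yields new labelled instances at no labelling cost. The exponent-vector representation and the single-variable label are simplifications made for illustration; the paper works with full CAD variable orderings.

\begin{verbatim}
# Each monomial is an exponent tuple (one entry per variable); the label is the
# index of the variable chosen first by the best ordering. Permuting the
# variables permutes both the instance and its label consistently.

from itertools import permutations

def augment(monomials, label, n_vars=3):
    """Yield one relabelled copy of the instance per variable permutation."""
    for perm in permutations(range(n_vars)):
        new_monomials = [tuple(m[perm[i]] for i in range(n_vars)) for m in monomials]
        yield new_monomials, perm.index(label)

# x0^2*x1 + x2, labelled "project x0 first", yields 3! = 6 labelled instances.
for instance, lab in augment([(2, 1, 0), (0, 0, 1)], label=0):
    print(instance, lab)
\end{verbatim}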
In recent years, there has been rapid development of deep learning (DL) in both industry and academia. However, finding the optimal hyperparameters of a DL model often requires high computational cost and human expertise. To mitigate this issue, evolutionary computation (EC), a powerful heuristic search approach, has shown significant merit in the automated design of DL models, so-called evolutionary deep learning (EDL). This paper aims to analyze EDL from the perspective of automated machine learning (AutoML). Specifically, we first discuss EDL from the perspectives of machine learning and EC and frame EDL as an optimization problem. Following the DL pipeline, we systematically introduce EDL methods, ranging from feature engineering and model generation to model deployment, within a new taxonomy (i.e., what to evolve/optimize and how), and focus on the discussion of solution representations and search paradigms used by EC to handle the optimization problem. Finally, key applications, open issues, and potentially promising lines of future research are suggested. This survey reviews recent developments in EDL and offers insightful guidelines for its further development.
This paper serves as a survey of recent advances in large margin training and its theoretical foundations, mostly for (nonlinear) deep neural networks (DNNs), which have arguably been the most prominent machine learning models for large-scale data over the past decade. We generalize the formulation of classification margins from classical research to the latest DNNs, summarize theoretical connections between the margin, network generalization, and robustness, and comprehensively introduce recent efforts to enlarge the margins of DNNs. Since different methods adopt differing viewpoints, we categorize them into groups for ease of comparison and discussion. We hope that our discussion and overview inspire new research in the community aimed at improving the performance of DNNs, and we also point to directions in which the large margin principle can be verified to provide theoretical evidence for why certain regularizations of DNNs work well in practice. We have kept the paper concise so that the crucial spirit of large margin learning and related methods is better emphasized.
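As a reminder of the classical starting point that the survey generalizes, the geometric margin of a linear classifier $f(x) = w^\top x + b$ on labelled samples $(x_i, y_i)$ with $y_i \in \{-1, +1\}$ is
\[
\gamma_i = \frac{y_i\,(w^\top x_i + b)}{\lVert w \rVert_2}, \qquad \gamma = \min_i \gamma_i,
\]
and large margin training seeks parameters that make $\gamma$ large; how this quantity is defined, bounded, and enlarged for nonlinear DNNs is the subject of the survey.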