We consider the computational problem of finding short paths in the skeleton of the perfect matching polytope of a bipartite graph. We prove that unless $P=NP$, there is no polynomial-time algorithm that computes a path of constant length between two vertices at distance two of the perfect matching polytope of a bipartite graph. Conditioned on $P\neq NP$, this disproves a conjecture by Ito, Kakimura, Kamiyama, Kobayashi and Okamoto [SIAM Journal on Discrete Mathematics, 36(2), pp. 1102-1123 (2022)]. Assuming the Exponential Time Hypothesis we prove the stronger result that there exists no polynomial-time algorithm computing a path of length at most $\left(\frac{1}{4}-o(1)\right)\frac{\log N}{\log \log N}$ between two vertices at distance two of the perfect matching polytope of an $N$-vertex bipartite graph. These results remain true if the bipartite graph is restricted to be of maximum degree three. The above has the following interesting implication for the performance of pivot rules for the simplex algorithm on simply-structured combinatorial polytopes: If $P\neq NP$, then for every simplex pivot rule executable in polynomial time and every constant $k \in \mathbb{N}$ there exists a linear program on a perfect matching polytope and a starting vertex of the polytope such that the optimal solution can be reached in two monotone steps from the starting vertex, yet the pivot rule will require at least $k$ steps to reach the optimal solution. This result remains true in the more general setting of pivot rules for so-called circuit-augmentation algorithms.
Explicitly accounting for uncertainties is paramount to the safety of engineering structures. Optimization which is often carried out at the early stage of the structural design offers an ideal framework for this task. When the uncertainties are mainly affecting the objective function, robust design optimization is traditionally considered. This work further assumes the existence of multiple and competing objective functions that need to be dealt with simultaneously. The optimization problem is formulated by considering quantiles of the objective functions which allows for the combination of both optimality and robustness in a single metric. By introducing the concept of common random numbers, the resulting nested optimization problem may be solved using a general-purpose solver, herein the non-dominated sorting genetic algorithm (NSGA-II). The computational cost of such an approach is however a serious hurdle to its application in real-world problems. We therefore propose a surrogate-assisted approach using Kriging as an inexpensive approximation of the associated computational model. The proposed approach consists of sequentially carrying out NSGA-II while using an adaptively built Kriging model to estimate the quantiles. Finally, the methodology is adapted to account for mixed categorical-continuous parameters as the applications involve the selection of qualitative design parameters as well. The methodology is first applied to two analytical examples showing its efficiency. The third application relates to the selection of optimal renovation scenarios of a building considering both its life cycle cost and environmental impact. It shows that when it comes to renovation, the heating system replacement should be the priority.
While deep learning-based methods for blind face restoration have achieved unprecedented success, they still suffer from two major limitations. First, most of them deteriorate when facing complex degradations out of their training data. Second, these methods require multiple constraints, e.g., fidelity, perceptual, and adversarial losses, which require laborious hyper-parameter tuning to stabilize and balance their influences. In this work, we propose a novel method named DifFace that is capable of coping with unseen and complex degradations more gracefully without complicated loss designs. The key of our method is to establish a posterior distribution from the observed low-quality (LQ) image to its high-quality (HQ) counterpart. In particular, we design a transition distribution from the LQ image to the intermediate state of a pre-trained diffusion model and then gradually transmit from this intermediate state to the HQ target by recursively applying a pre-trained diffusion model. The transition distribution only relies on a restoration backbone that is trained with $L_2$ loss on some synthetic data, which favorably avoids the cumbersome training process in existing methods. Moreover, the transition distribution can contract the error of the restoration backbone and thus makes our method more robust to unknown degradations. Comprehensive experiments show that DifFace is superior to current state-of-the-art methods, especially in cases with severe degradations. Our code and model are available at //github.com/zsyOAOA/DifFace.
Nesterov's accelerated gradient descent (NAG) is one of the milestones in the history of first-order algorithms. It was not successfully uncovered until the high-resolution differential equation framework was proposed in [Shi et al., 2022] that the mechanism behind the acceleration phenomenon is due to the gradient correction term. To deepen our understanding of the high-resolution differential equation framework on the convergence rate, we continue to investigate NAG for the $\mu$-strongly convex function based on the techniques of Lyapunov analysis and phase-space representation in this paper. First, we revisit the proof from the gradient-correction scheme. Similar to [Chen et al., 2022], the straightforward calculation simplifies the proof extremely and enlarges the step size to $s=1/L$ with minor modification. Meanwhile, the way of constructing Lyapunov functions is principled. Furthermore, we also investigate NAG from the implicit-velocity scheme. Due to the difference in the velocity iterates, we find that the Lyapunov function is constructed from the implicit-velocity scheme without the additional term and the calculation of iterative difference becomes simpler. Together with the optimal step size obtained, the high-resolution differential equation framework from the implicit-velocity scheme of NAG is perfect and outperforms the gradient-correction scheme.
Warning: this paper contains content that may be offensive or upsetting. Considering the large amount of content created online by the minute, slang-aware automatic tools are critically needed to promote social good, and assist policymakers and moderators in restricting the spread of offensive language, abuse, and hate speech. Despite the success of large language models and the spontaneous emergence of slang dictionaries, it is unclear how far their combination goes in terms of slang understanding for downstream social good tasks. In this paper, we provide a framework to study different combinations of representation learning models and knowledge resources for a variety of downstream tasks that rely on slang understanding. Our experiments show the superiority of models that have been pre-trained on social media data, while the impact of dictionaries is positive only for static word embeddings. Our error analysis identifies core challenges for slang representation learning, including out-of-vocabulary words, polysemy, variance, and annotation disagreements, which can be traced to characteristics of slang as a quickly evolving and highly subjective language.
In this paper, we study the asymptotic properties (bias, variance, mean squared error) of Bernstein estimators for cumulative distribution functions and density functions near and on the boundary of the $d$-dimensional simplex. Our results generalize those found by Leblanc (2012), who treated the case $d=1$, and complement the results from Ouimet (2021) in the interior of the simplex. Since the "edges" of the $d$-dimensional simplex have dimensions going from $0$ (vertices) up to $d - 1$ (facets) and our kernel function is multinomial, the asymptotic expressions for the bias, variance and mean squared error are not straightforward extensions of one-dimensional asymptotics as they would be for product-type estimators studied by almost all past authors in the context of Bernstein estimators or asymmetric kernel estimators. This point makes the mathematical analysis much more interesting.
We study geometric variations of the discriminating code problem. In the \emph{discrete version} of the problem, a finite set of points $P$ and a finite set of objects $S$ are given in $\mathbb{R}^d$. The objective is to choose a subset $S^* \subseteq S$ of minimum cardinality such that for each point $p_i \in P$, the subset $S_i^* \subseteq S^*$ covering $p_i$ satisfies $S_i^*\neq \emptyset$, and each pair $p_i,p_j \in P$, $i \neq j$, we have $S_i^* \neq S_j^*$. In the \emph{continuous version} of the problem, the solution set $S^*$ can be chosen freely among a (potentially infinite) class of allowed geometric objects. In the 1-dimensional case ($d=1$), the points in $P$ are placed on a horizontal line $L$, and the objects in $S$ are finite-length line segments aligned with $L$ (called intervals). We show that the discrete version of this problem is NP-complete. This is somewhat surprising as the continuous version is known to be polynomial-time solvable. Still, for the 1-dimensional discrete version, we design a polynomial-time $2$-approximation algorithm. We also design a PTAS for both discrete and continuous versions in one dimension, for the restriction where the intervals are all required to have the same length. We then study the 2-dimensional case ($d=2$) for axis-parallel unit square objects. We show that both continuous and discrete versions are NP-complete, and design polynomial-time approximation algorithms that produce $(16\cdot OPT+1)$-approximate and $(64\cdot OPT+1)$-approximate solutions respectively, using rounding of suitably defined integer linear programming problems. We show that the identifying code problem for axis-parallel unit square intersection graphs (in $d=2$) can be solved in the same manner as for the discrete version of the discriminating code problem for unit square objects.
The quantum circuits that declare quantum supremacy, such as Google Sycamore [Nature \textbf{574}, 505 (2019)], raises a paradox in building reliable result references. While simulation on traditional computers seems the sole way to provide reliable verification, the required run time is doomed with an exponentially-increasing compute complexity. To find a way to validate current ``quantum-supremacy" circuits with more than $50$ qubits, we propose a simulation method that exploits the ``classical advantage" (the inherent ``store-and-compute" operation mode of von Neumann machines) of current supercomputers, and computes uncorrelated amplitudes of a random quantum circuit with an optimal reuse of the intermediate results and a minimal memory overhead throughout the process. Such a reuse strategy reduces the original linear scaling of the total compute cost against the number of amplitudes to a sublinear pattern, with greater reduction for more amplitudes. Based on a well-optimized implementation of this method on a new-generation Sunway supercomputer, we directly verify Sycamore by computing three million exact amplitudes for the experimentally generated bitstrings, obtaining an XEB fidelity of $0.191\%$ which closely matches the estimated value of $0.224\%$. Our computation scales up to $41,932,800$ cores with a sustained single-precision performance of $84.8$ Pflops, which is accomplished within $8.5$ days. Our method has a far-reaching impact in solving quantum many-body problems, statistical problems as well as combinatorial optimization problems where one often needs to contract many tensor networks which share a significant portion of tensors in common.
Assignment mechanisms for many-to-one matching markets with preferences revolve around the key concept of stability. Using school choice as our matching market application, we introduce the problem of jointly allocating a school capacity expansion and finding the best stable allocation for the students in the expanded market. We analyze theoretically the problem, focusing on the trade-off behind the multiplicity of student-optimal assignments, the incentive properties, and the problem's complexity. Due to the impossibility of efficiently solving the problem with classical methods, we generalize existent mathematical programming formulations of stability constraints to our setting, most of which result in integer quadratically-constrained programs. In addition, we propose a novel mixed-integer linear programming formulation that is exponentially-large on the problem size. We show that its stability constraints can be separated in linear time, leading to an effective cutting-plane method. We evaluate the performance of our approaches in a detailed computational study, and we find that our cutting-plane method outperforms mixed-integer programming solvers applied to the formulations obtained by extending existing approaches. We also propose two heuristics that are effective for large instances of the problem. Finally, we use the Chilean school choice system data to demonstrate the impact of capacity planning under stability conditions. Our results show that each additional school seat can benefit multiple students. Moreover, our methodology can prioritize the assignment of previously unassigned students or improve the assignment of several students through improvement chains. These insights empower the decision-maker in tuning the matching algorithm to provide a fair application-oriented solution.
Integrated information theory (IIT) is a theoretical framework that provides a quantitative measure to estimate when a physical system is conscious, its degree of consciousness, and the complexity of the qualia space that the system is experiencing. Formally, IIT rests on the assumption that if a surrogate physical system can fully embed the phenomenological properties of consciousness, then the system properties must be constrained by the properties of the qualia being experienced. Following this assumption, IIT represents the physical system as a network of interconnected elements that can be thought of as a probabilistic causal graph, $\mathcal{G}$, where each node has an input-output function and all the graph is encoded in a transition probability matrix. Consequently, IIT's quantitative measure of consciousness, $\Phi$, is computed with respect to the transition probability matrix and the present state of the graph. In this paper, we provide a random search algorithm that is able to optimize $\Phi$ in order to investigate, as the number of nodes increases, the structure of the graphs that have higher $\Phi$. We also provide arguments that show the difficulties of applying more complex black-box search algorithms, such as Bayesian optimization or metaheuristics, in this particular problem. Additionally, we suggest specific research lines for these techniques to enhance the search algorithm that guarantees maximal $\Phi$.
Spectral clustering (SC) is a popular clustering technique to find strongly connected communities on a graph. SC can be used in Graph Neural Networks (GNNs) to implement pooling operations that aggregate nodes belonging to the same cluster. However, the eigendecomposition of the Laplacian is expensive and, since clustering results are graph-specific, pooling methods based on SC must perform a new optimization for each new sample. In this paper, we propose a graph clustering approach that addresses these limitations of SC. We formulate a continuous relaxation of the normalized minCUT problem and train a GNN to compute cluster assignments that minimize this objective. Our GNN-based implementation is differentiable, does not require to compute the spectral decomposition, and learns a clustering function that can be quickly evaluated on out-of-sample graphs. From the proposed clustering method, we design a graph pooling operator that overcomes some important limitations of state-of-the-art graph pooling techniques and achieves the best performance in several supervised and unsupervised tasks.