Reweighted wake-sleep (RWS) is a machine learning method for performing Bayesian inference in a very general class of models. RWS draws $K$ samples from an underlying approximate posterior, then uses importance weighting to provide a better estimate of the true posterior. RWS then updates its approximate posterior towards the importance-weighted estimate of the true posterior. However, recent work [Chatterjee and Diaconis, 2018] indicates that the number of samples required for effective importance weighting is exponential in the number of latent variables. Attaining such a large number of importance samples is intractable in all but the smallest models. Here, we develop massively parallel RWS, which circumvents this issue by drawing $K$ samples of all $n$ latent variables, and individually reasoning about all $K^n$ possible combinations of samples. While reasoning about $K^n$ combinations might seem intractable, the required computations can be performed in polynomial time by exploiting conditional independencies in the generative model. We show considerable improvements over standard "global" RWS, which draws $K$ samples from the full joint.
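As a toy illustration of why reasoning about all $K^n$ combinations can be tractable, the sketch below (an assumed example, not the authors' implementation) compares brute-force enumeration of the $K^n$ weight products with the factorized computation that applies when the per-latent importance weights are independent; the method in the paper exploits more general conditional independencies via similar factorizations.

```python
from itertools import product
import numpy as np

# Toy sketch: with n latent variables and K samples each, the sum of the
# K^n importance-weight products factorizes into a product of per-latent
# sums when the weights are independent, costing O(nK) instead of O(K^n).
rng = np.random.default_rng(0)
n, K = 4, 6
log_w = rng.normal(size=(n, K))   # toy log-weights log p/q for sample k of latent i

# Brute force: enumerate all K^n combinations (feasible only for tiny n and K).
naive = sum(np.exp(sum(log_w[i, k] for i, k in enumerate(combo)))
            for combo in product(range(K), repeat=n))

# Factorized: product over latents of each latent's sum of weights.
factorized = np.prod(np.exp(log_w).sum(axis=1))
assert np.isclose(naive, factorized)
```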
The Independent Cutset problem asks whether there is a set of vertices in a given graph that is both independent and a cutset. The problem is $\textsf{NP}$-complete even when the input graph is planar and has maximum degree five. In this paper, we first present an $\mathcal{O}^*(1.4423^{n})$-time algorithm for the problem. We also show how to compute a minimum independent cutset (if any) in the same running time. Since the property of having an independent cutset is MSO$_1$-expressible, our main results concern structural parameterizations of the problem by parameters that are not bounded by a function of the clique-width of the input. We present $\textsf{FPT}$-time algorithms for the problem parameterized by the following: the dual of the maximum degree, the dual of the solution size, the size of a dominating set (where a dominating set is given as an additional input), the size of an odd cycle transversal, the distance to chordal graphs, and the distance to $P_5$-free graphs. We close by introducing the notion of $\alpha$-domination, which allows us to identify further fixed-parameter tractable and polynomial-time solvable cases.
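To make the problem definition concrete, here is a minimal brute-force search for a minimum independent cutset (exponential time; it illustrates the definition only, not the $\mathcal{O}^*(1.4423^{n})$-time algorithm of the paper). The graph is given as an adjacency dictionary:

```python
from itertools import combinations

def is_independent(S, adj):
    # no two vertices of S are adjacent
    return all(v not in adj[u] for u, v in combinations(S, 2))

def is_cutset(S, adj):
    # G - S is disconnected: DFS from one remaining vertex must miss another
    remaining = set(adj) - set(S)
    if not remaining:
        return False
    start = next(iter(remaining))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v in remaining and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen != remaining

def min_independent_cutset(adj):
    verts = list(adj)
    for size in range(1, len(verts)):
        for S in combinations(verts, size):
            if is_independent(S, adj) and is_cutset(S, adj):
                return set(S)
    return None

# Path a-b-c-d-e: every internal vertex is an independent cutset of size 1.
adj = {'a': {'b'}, 'b': {'a', 'c'}, 'c': {'b', 'd'}, 'd': {'c', 'e'}, 'e': {'d'}}
print(min_independent_cutset(adj))  # e.g. {'b'}
```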
For a graph class $\mathcal{G}$, we define the $\mathcal{G}$-modular cardinality of a graph $G$ as the minimum size of a vertex partition of $G$ into modules, each of which induces a graph in $\mathcal{G}$. This generalizes other module-based graph parameters such as neighborhood diversity and iterated type partition. Moreover, if $\mathcal{G}$ has bounded modular-width, the W[1]-hardness of a problem in $\mathcal{G}$-modular cardinality implies hardness on modular-width, clique-width, and other related parameters. On the other hand, fixed-parameter tractable (FPT) algorithms in $\mathcal{G}$-modular cardinality may provide new ideas for algorithms using such parameters. Several FPT algorithms based on modular partitions compute a solution table in each module and then combine the tables into a global solution. This works well when each table has a succinct representation, but, as we argue, when no such representation exists, the problem is typically W[1]-hard. We illustrate these ideas on the generic $(\alpha, \beta)$-domination problem, which asks for a set of vertices that contains, for each unchosen vertex, at least a fraction $\alpha$ of its neighbors plus some (possibly negative) amount $\beta$. This generalizes known domination problems such as Bounded Degree Deletion, $k$-Domination, and $\alpha$-Domination. We show that for graph classes $\mathcal{G}$ that require arbitrarily large solution tables, these problems are W[1]-hard in the $\mathcal{G}$-modular cardinality, whereas they are fixed-parameter tractable when they admit succinct solution tables. This leads to several new positive and negative results for many domination problems parameterized by known and novel structural graph parameters such as clique-width, modular-width, and cluster-modular cardinality.
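A minimal sketch of the $(\alpha, \beta)$-domination condition as read from the description above (the exact inequality convention used in the paper may differ):

```python
def is_alpha_beta_dominating(S, adj, alpha, beta):
    # Every unchosen vertex v must see at least alpha*deg(v) + beta of its
    # neighbours inside S (one reading of the condition stated above).
    S = set(S)
    return all(len(adj[v] & S) >= alpha * len(adj[v]) + beta
               for v in adj if v not in S)

# 4-cycle: with alpha = 1/2 and beta = 0, the two opposite vertices {0, 2}
# form an (alpha, beta)-dominating set.
adj = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
print(is_alpha_beta_dominating({0, 2}, adj, 0.5, 0))  # True
```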
Non-Gaussian Bayesian filtering is a core problem in stochastic filtering. The difficulty of the problem lies in parameterizing the state estimates, which existing methods do not handle well. We propose to use power moments to obtain such a parameterization. Unlike existing parametric estimation methods, the proposed algorithm does not require prior knowledge about the state to be estimated, e.g., the number of modes or the feasible class of functions. Moreover, unlike existing nonparametric Bayesian filters such as the particle filter, the proposed algorithm does not need to store a massive number of parameters during filtering. The parameters of the proposed parameterization can be determined by a convex optimization scheme with moment constraints, whose solution is proved to exist and to be unique. A necessary and sufficient condition for all the power moments of the density estimate to exist and be finite is provided. The errors of the power moments are analyzed for both light-tailed and heavy-tailed density estimates, and upper bounds on the error of the one-step prediction density estimate are given. Simulation results on different types of state densities, including heavy-tailed ones, are provided to validate the proposed algorithm.
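As a small illustration of computing with power moments in the prediction step, the sketch below propagates the moments of a scalar linear state update $x' = ax + w$ with independent noise; this is only the moment-propagation idea, shown for a linear toy model, and not the paper's filter or its convex optimization step.

```python
import numpy as np
from math import comb

def predict_power_moments(mx, mw, a, order):
    """Power moments of x' = a*x + w from those of x and w (x, w independent).

    mx[i] and mw[i] hold E[x^i] and E[w^i] for i = 0..order.
    """
    return np.array([sum(comb(j, i) * a**i * mx[i] * mw[j - i]
                         for i in range(j + 1)) for j in range(order + 1)])

# Sanity check against Monte Carlo for standard normal state and noise.
rng = np.random.default_rng(0)
x, w, a, order = rng.normal(size=10**6), rng.normal(size=10**6), 0.8, 4
mx = np.array([np.mean(x**i) for i in range(order + 1)])
mw = np.array([np.mean(w**i) for i in range(order + 1)])
print(predict_power_moments(mx, mw, a, order))
print([np.mean((a * x + w)**j) for j in range(order + 1)])
```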
The multivariate adaptive regression spline (MARS) is one of the popular estimation methods for nonparametric multivariate regression. However, as MARS is based on marginal splines, incorporating interactions of covariates requires products of the marginal splines, which leads to an unmanageable number of basis functions when the order of interaction is high and results in low estimation efficiency. In this paper, we improve the performance of MARS by using linear combinations of the covariates that achieve sufficient dimension reduction. The special basis functions of MARS facilitate calculation of gradients of the regression function, and estimation of the linear combinations is obtained via eigen-analysis of the outer product of the gradients. Under some technical conditions, the asymptotic theory is established for the proposed estimation method. Numerical studies, including both simulations and empirical applications, show its effectiveness in dimension reduction and its improvement over MARS and other commonly used nonparametric methods in regression estimation and prediction.
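The eigen-analysis step can be illustrated on a toy single-index model where the gradient of the regression function is available in closed form; the actual method estimates gradients from the fitted MARS basis functions, so the sketch below only conveys why the outer product of gradients recovers the reduction directions.

```python
import numpy as np

# Toy outer-product-of-gradients (OPG) illustration: for m(x) = g(b1'x), the
# matrix E[grad m(x) grad m(x)'] has column space span(b1), so its leading
# eigenvector recovers the dimension-reduction direction b1.
rng = np.random.default_rng(0)
n, p = 2000, 6
X = rng.normal(size=(n, p))
b1 = np.zeros(p); b1[:2] = [1.0, -1.0]; b1 /= np.linalg.norm(b1)
grads = np.cos(X @ b1)[:, None] * b1       # grad of sin(b1'x) = cos(b1'x) * b1
M = grads.T @ grads / n                    # average outer product of gradients
eigvals, eigvecs = np.linalg.eigh(M)
print(abs(eigvecs[:, -1] @ b1))            # close to 1: leading eigenvector ~ b1
```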
In our previous paper, we proposed a non-Gaussian Bayesian filter using power moments of the system state. A density surrogate parameterized as an analytic function is used to approximate the density of the true system state, which is only assumed to be Lebesgue integrable. To our knowledge, it is the first Bayesian filter that places no prior constraints on the true density of the state and yields a state estimate in continuous functional form. In this preliminary paper, we propose a new type of statistic, called the generalized logarithmic moments, which is used together with the power moments to parameterize the state distribution. The map from the parameters of the proposed density surrogate to the power moments is proved to be a diffeomorphism, which allows gradient methods to be used for the optimization problem that determines the parameters. Simulation results reveal the advantage of using both types of moments for estimating mixtures of complicated density functions.
We consider the problem of energy-efficient scheduling across multiple processors with a power-down mechanism. In this setting, a set of $n$ jobs with individual release times, deadlines, and processing volumes must be scheduled across $m$ parallel processors while minimizing the consumed energy. Idle processors can be turned off to save energy, while turning them on requires a fixed amount of energy. For the special case of a single processor, the greedy Left-to-Right algorithm guarantees an approximation factor of $2$. We generalize this simple greedy policy to the case of $m \geq 1$ processors running in parallel and show that the energy cost is still bounded by $2\,\text{OPT} + P$, where $\text{OPT}$ is the energy consumed by an optimal solution and $P < \text{OPT}$ is the total processing volume. Our algorithm has a running time of $\mathcal{O}(n f \log d)$, where $d$ is the difference between the latest deadline and the earliest release time, and $f$ is the running time of a maximum-flow computation in a network of $\mathcal{O}(n)$ nodes.
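The maximum-flow subroutine in the stated running time can be illustrated with the classical flow network that tests whether jobs with release times, deadlines, and processing volumes can be feasibly scheduled (preemptively, with migration) on $m$ parallel processors; this is an assumed sketch of that kind of subroutine, not the paper's algorithm.

```python
import networkx as nx

def feasible(jobs, m):
    """Feasibility of preemptive scheduling of jobs (release, deadline, volume)
    on m parallel processors, via the classical max-flow construction."""
    times = sorted({t for r, d, _ in jobs for t in (r, d)})
    G = nx.DiGraph()
    for i, (r, d, p) in enumerate(jobs):
        G.add_edge('source', ('job', i), capacity=p)
        for j, (a, b) in enumerate(zip(times, times[1:])):
            if r <= a and b <= d:
                # one job can occupy at most (b - a) time units in this interval
                G.add_edge(('job', i), ('interval', j), capacity=b - a)
    for j, (a, b) in enumerate(zip(times, times[1:])):
        # m processors together provide m * (b - a) time units per interval
        G.add_edge(('interval', j), 'sink', capacity=m * (b - a))
    value, _ = nx.maximum_flow(G, 'source', 'sink')
    return value == sum(p for _, _, p in jobs)

# Three jobs of volume 2 in the window [0, 3]: infeasible on one processor,
# feasible on two.
jobs = [(0, 3, 2), (0, 3, 2), (0, 3, 2)]
print(feasible(jobs, 1), feasible(jobs, 2))  # False True
```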
Learning the community structure of a large-scale graph is a fundamental problem in machine learning, computer science and statistics. We study the problem of exactly recovering the communities in a graph generated from the Stochastic Block Model (SBM) in the Massively Parallel Computation (MPC) model. Specifically, given $kn$ vertices that are partitioned into $k$ equal-sized clusters (i.e., each has size $n$), a graph on these $kn$ vertices is randomly generated such that each pair of vertices is connected with probability~$p$ if they are in the same cluster and with probability $q$ if not, where $p > q > 0$. We give MPC algorithms for the SBM in the (very general) \emph{$s$-space MPC model}, where each machine has memory $s=\Omega(\log n)$. Under the condition that $\frac{p-q}{\sqrt{p}}\geq \tilde{\Omega}(k^{\frac12}n^{-\frac12+\frac{1}{2(r-1)}})$ for any integer $r\in [3,O(\log n)]$, our first algorithm exactly recovers all the $k$ clusters in $O(kr\log_s n)$ rounds using $\tilde{O}(m)$ total space, or in $O(r\log_s n)$ rounds using $\tilde{O}(km)$ total space. If $\frac{p-q}{\sqrt{p}}\geq \tilde{\Omega}(k^{\frac34}n^{-\frac14})$, our second algorithm achieves $O(\log_s n)$ rounds and $\tilde{O}(m)$ total space complexity. Both algorithms significantly improve upon a recent result of Cohen-Addad et al. [PODC'22], who gave algorithms that only work in the \emph{sublinear space MPC model}, where each machine has local memory~$s=O(n^{\delta})$ for some constant $\delta>0$, under a much stronger condition on $p$, $q$, and $k$. Our algorithms are based on collecting the $r$-step neighborhood of each vertex and comparing, for each pair of vertices, statistics derived from their local neighborhoods. To implement the clustering algorithms in parallel, we present efficient approaches for implementing some basic graph operations in the $s$-space MPC model.
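A single-machine toy version of the underlying statistic (not an MPC implementation) generates an SBM instance, collects $r$-step neighborhood information for each vertex, and compares pairs of vertices; within-cluster pairs exhibit much smaller differences than cross-cluster pairs, which is the signal the parallel algorithms exploit.

```python
import numpy as np

# Generate a k-cluster SBM, compute r-step walk counts for every vertex, and
# compare the statistic across vertex pairs (toy single-machine illustration).
rng = np.random.default_rng(1)
k, n, p, q, r = 3, 50, 0.7, 0.05, 2
N = k * n
labels = np.repeat(np.arange(k), n)
prob = np.where(labels[:, None] == labels[None, :], p, q)
A = np.triu((rng.random((N, N)) < prob).astype(int), 1)
A = A + A.T                                        # symmetric, no self-loops

S = np.linalg.matrix_power(A, r).astype(float)     # r-step walk counts per vertex
sq = (S ** 2).sum(axis=1)
dist2 = sq[:, None] + sq[None, :] - 2 * (S @ S.T)  # squared pairwise differences
same = labels[:, None] == labels[None, :]
off_diag = ~np.eye(N, dtype=bool)
print(dist2[same & off_diag].mean(), dist2[~same].mean())  # within << across
```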
This paper studies an intelligent reflecting surface (IRS)-aided multi-antenna simultaneous wireless information and power transfer (SWIPT) system where an $M$-antenna access point (AP) serves $K$ single-antenna information users (IUs) and $J$ single-antenna energy users (EUs) with the aid of an IRS with phase errors. We explicitly concentrate on overloaded scenarios where $K + J > M$ and $K \geq M$. Our goal is to maximize the minimum throughput among all the IUs by optimizing the allocation of resources (including time, transmit beamforming at the AP, and reflect beamforming at the IRS), while guaranteeing the minimum amount of harvested energy at each EU. Towards this goal, we propose two user grouping (UG) schemes, namely, the non-overlapping UG scheme and the overlapping UG scheme, which differ in whether an IU can belong to multiple groups. Different IU groups are served in orthogonal time dimensions, while the IUs in the same group are served simultaneously with all the EUs via spatial multiplexing. The two problems corresponding to the two UG schemes are mixed-integer non-convex optimization problems and are difficult to solve optimally. We propose efficient algorithms for these two problems based on the big-M formulation, the penalty method, block coordinate descent, and successive convex approximation. Simulation results show that: 1) the non-robust counterparts of the proposed robust designs are unsuitable for practical IRS-aided SWIPT systems with phase errors since the energy harvesting constraints cannot be satisfied; 2) the proposed UG strategies significantly improve the max-min throughput over benchmark schemes without UG or with random UG; 3) the overlapping UG scheme performs much better than its non-overlapping counterpart when the absolute difference between $K$ and $M$ is small and the energy harvesting constraints are not stringent.
Intelligent reflecting surfaces (IRSs) were introduced to enhance the performance of wireless systems. However, from a cellular service provider's view, a concern with the use of an IRS is its effect on out-of-band (OOB) quality of service. Specifically, given two operators, say X and Y, providing services in a geographical area using non-overlapping frequency bands, if operator-X uses an IRS to optimally enhance the throughput of its users, does the IRS degrade the performance of operator-Y? We answer this by deriving the ergodic sum spectral efficiency (SE) of both operators under round-robin scheduling. We also derive the complementary cumulative distribution function of the change in effective channel at an OOB user with and without the IRS, which provides deeper insights into OOB performance. Surprisingly, we find that even though the IRS is randomly configured from operator-Y's view, the OOB operator still benefits from the IRS, witnessing a performance enhancement for free. This happens because the IRS introduces additional paths between the nodes, increasing the signal power at the receiver and providing diversity benefits. We verify our findings numerically and conclude that an IRS is beneficial to every operator, even when the IRS is deployed to optimally serve only one operator.
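The intuition that even a randomly configured IRS benefits the OOB user can be checked with a small Monte Carlo experiment; the i.i.d. Rayleigh channels and unit path losses below are assumptions made purely for illustration and do not reflect the exact system model of the paper.

```python
import numpy as np

# Average effective channel gain of an OOB user with and without an IRS whose
# phases are random (i.e., not optimized for this user). The extra cascaded
# paths add power on average: E|h_d + sum_i g_i h_i e^{j theta_i}|^2 = 1 + N
# under the unit-variance assumptions used here.
rng = np.random.default_rng(0)
N, trials = 32, 10_000
cn = lambda *shape: (rng.normal(size=shape) + 1j * rng.normal(size=shape)) / np.sqrt(2)
h_d = cn(trials)                       # direct AP-user channel
h, g = cn(trials, N), cn(trials, N)    # AP-IRS and IRS-user channels
theta = rng.uniform(0, 2 * np.pi, size=(trials, N))      # random IRS phases
eff = h_d + np.sum(g * h * np.exp(1j * theta), axis=1)   # effective channel
print(np.mean(np.abs(h_d) ** 2), np.mean(np.abs(eff) ** 2))  # ~1 vs ~1 + N
```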
We consider the problem of discovering $K$ related Gaussian directed acyclic graphs (DAGs), where the involved graph structures share a consistent causal order and sparse unions of supports. Under the multi-task learning setting, we propose an $l_1/l_2$-regularized maximum likelihood estimator (MLE) for learning $K$ linear structural equation models. We theoretically show that the joint estimator, by leveraging data across related tasks, can achieve a better sample complexity for recovering the causal order (or topological order) than separate estimations. Moreover, the joint estimator is able to recover non-identifiable DAGs by estimating them together with some identifiable DAGs. Lastly, our analysis also shows consistent recovery of the union support of the structures. To allow practical implementation, we design a continuous optimization problem whose optimizer coincides with the joint estimator and can be approximated efficiently by an iterative algorithm. We validate the theoretical analysis and the effectiveness of the joint estimator in experiments.
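For concreteness, the sketch below shows only the $l_1/l_2$ (group) penalty that couples the $K$ tasks by penalizing, for each candidate edge, the Euclidean norm of its $K$ task-specific coefficients, which encourages a shared sparse union support; the full joint MLE and the continuous optimization scheme are not reproduced here.

```python
import numpy as np

def l1_l2_penalty(B):
    """Group penalty for K weighted adjacency matrices stacked as B of shape
    (K, p, p): sum over edges (i, j) of the Euclidean norm of the K-vector
    of task-specific coefficients B[:, i, j]."""
    return np.sum(np.linalg.norm(B, axis=0))

# Toy coefficient matrices with a shared sparse union support across tasks.
K, p = 3, 5
rng = np.random.default_rng(0)
support = rng.random((1, p, p)) < 0.2          # same candidate edges in all tasks
B = rng.normal(size=(K, p, p)) * support
B[:, np.arange(p), np.arange(p)] = 0           # no self-loops
print(l1_l2_penalty(B))
```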