
The sub-packetization $\ell$ and the field size $q$ are of paramount importance in MSR array code constructions. For optimal-access MSR codes, Balaji et al. proved that $\ell\geq s^{\left\lceil n/s \right\rceil}$, where $s = d-k+1$. Rawat et al. showed that this lower bound is attainable for all admissible values of $d$ when the field size is exponential in $n$. Since then, tremendous effort has been devoted to reducing the field size. However, to date, a reduction to linear field size is available only for $d\in\{k+1,k+2,k+3\}$ and $d=n-1$. In this paper, we construct the first class of explicit optimal-access MSR codes with the smallest sub-packetization $\ell = s^{\left\lceil n/s \right\rceil}$ for all $d$ between $k+1$ and $n-1$, resolving an open problem in the survey (Ramkumar et al., Foundations and Trends in Communications and Information Theory, Vol. 19, No. 4). We further propose another class of explicit MSR code constructions (not optimal-access) with even smaller sub-packetization $s^{\left\lceil n/(s+1)\right\rceil}$ for all admissible values of $d$, making significant progress on another open problem in the survey. Previously, MSR codes with $\ell=s^{\left\lceil n/(s+1)\right\rceil}$ and $q=O(n)$ were known only for $d=k+1$ and $d=n-1$. The key insight that enables a linear field size in our constructions is to reduce the $\binom{n}{r}$ global constraints of non-vanishing determinants to $O_s(n)$ local ones, which we achieve by carefully designing the parity-check matrices.
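
As a quick sanity check on the parameters above, the following sketch (ours, not part of the paper) evaluates the two sub-packetization formulas, $s^{\lceil n/s\rceil}$ for the optimal-access construction and $s^{\lceil n/(s+1)\rceil}$ for the second construction, over all admissible repair degrees $d$; the function names and the sample values of $(n,k)$ are illustrative only.

```python
from math import ceil

def optimal_access_subpacketization(n: int, k: int, d: int) -> int:
    """Smallest sub-packetization s^ceil(n/s) for optimal-access MSR codes,
    with s = d - k + 1 (the lower bound of Balaji et al.)."""
    s = d - k + 1
    return s ** ceil(n / s)

def reduced_subpacketization(n: int, k: int, d: int) -> int:
    """Sub-packetization s^ceil(n/(s+1)) of the second (non-optimal-access)
    class of constructions described in the abstract."""
    s = d - k + 1
    return s ** ceil(n / (s + 1))

if __name__ == "__main__":
    n, k = 14, 10                       # illustrative parameters only
    for d in range(k + 1, n):           # all admissible repair degrees k+1 <= d <= n-1
        print(d, optimal_access_subpacketization(n, k, d), reduced_subpacketization(n, k, d))
```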

Related content

The Mining Software Repositories (MSR) conference analyzes the rich data available in software repositories to uncover interesting and actionable information about software systems and projects.
January 31, 2024

We investigate the complexity of solving stable or perturbation-resilient instances of $k$-Means and $k$-Median clustering in fixed-dimension Euclidean metrics (more generally, doubling metrics). The notion of stable (perturbation-resilient) instances was introduced by Bilu and Linial [2010] and Awasthi et al. [2012]. In our context, we say a $k$-Means instance is $\alpha$-stable if there is a unique optimal solution which remains optimal when distances are (non-uniformly) stretched by a factor of at most $\alpha$. Stable clustering instances have been studied to explain why heuristics such as Lloyd's algorithm perform well in practice. In this work we show that for any fixed $\epsilon>0$, $(1+\epsilon)$-stable instances of $k$-Means in doubling metrics can be solved in polynomial time. More precisely, we show that a natural multiswap local search algorithm finds OPT for $(1+\epsilon)$-stable instances of $k$-Means and $k$-Median in a polynomial number of iterations. We complement this result by showing that, under a new PCP theorem, this is essentially tight: when the dimension $d$ is part of the input, there is a fixed $\epsilon_0>0$ such that there is not even a PTAS for $(1+\epsilon_0)$-stable $k$-Means in $\mathbb{R}^d$ unless NP=RP. To do this, we consider a robustness property of CSPs; we call an instance stable if there is a unique optimal solution $x^*$ and, for any other solution $x'$, the number of unsatisfied clauses is proportional to the Hamming distance between $x^*$ and $x'$. Dinur et al. have already shown that stable QSAT is hard to approximate for some constant Q; our hypothesis is simply that stable QSAT with bounded variable occurrence is also hard. Given this hypothesis, we use "stability-preserving" reductions to prove our hardness result for stable $k$-Means. Such reductions seem to be more fragile than standard L-reductions and may be of further use in demonstrating that other stable optimization problems are hard.
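
To make the algorithmic object concrete, here is a minimal single-swap local-search sketch for discrete $k$-Median with centers restricted to input points; the paper's guarantee concerns a multiswap variant on stable instances, and this simplified version, including all names and parameters, is our own illustration.

```python
import numpy as np

def kmedian_cost(points: np.ndarray, centers: list) -> float:
    """Sum over all points of the distance to the nearest chosen center."""
    d = np.linalg.norm(points[:, None, :] - points[centers][None, :, :], axis=2)
    return float(d.min(axis=1).sum())

def single_swap_local_search(points: np.ndarray, k: int, max_rounds: int = 1000):
    """Repeatedly swap one center for one non-center while the cost improves."""
    n = len(points)
    centers = list(range(k))                       # arbitrary initial solution
    cost = kmedian_cost(points, centers)
    for _ in range(max_rounds):
        improved = False
        for i in range(k):
            for j in range(n):
                if j in centers:
                    continue
                cand = centers[:i] + [j] + centers[i + 1:]
                cand_cost = kmedian_cost(points, cand)
                if cand_cost < cost - 1e-12:
                    centers, cost, improved = cand, cand_cost, True
        if not improved:
            break                                  # local optimum reached
    return centers, cost

rng = np.random.default_rng(0)
pts = rng.normal(size=(60, 2))
print(single_swap_local_search(pts, k=3))
```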

We investigate the Witsenhausen counterexample in a continuous, vector-valued setting with a causal encoder and a noncausal decoder. Our main result is a single-letter condition that exactly characterizes the set of achievable Witsenhausen power costs and estimation costs, obtained via a modified weak-typicality approach. In particular, we adapt the power analysis to the causal encoder constraint and provide an improved distortion error analysis for the challenging estimation of the interim state. Interestingly, the dual role of control is explicitly captured by the two auxiliary random variables.

Vertex splitting is a graph operation that replaces a vertex $v$ with two nonadjacent new vertices and makes each neighbor of $v$ adjacent to one or both of the introduced vertices. Vertex splitting has been used in contexts ranging from circuit design to statistical analysis. In this work, we explore the computational complexity of achieving a given graph property $\Pi$ by a limited number of vertex splits, formalized as the problem $\Pi$ Vertex Splitting ($\Pi$-VS). We focus on hereditary graph properties and contribute four groups of results: First, we classify the classical complexity of $\Pi$-VS for graph properties characterized by forbidden subgraphs of size at most 3. Second, we provide a framework that allows one to show NP-completeness whenever a combination of a forbidden subgraph and prescribed vertex splits satisfying certain conditions can be constructed. Leveraging this framework, we show NP-completeness when $\Pi$ is characterized by forbidden subgraphs that are sufficiently well connected. In particular, we show that $F$-Free-VS is NP-complete for each biconnected graph $F$. Third, we study infinite families of forbidden subgraphs, obtaining NP-hardness for Bipartite-VS and Perfect-VS. Finally, we touch upon the parameterized complexity of $\Pi$-VS with respect to the number of allowed splits, showing para-NP-hardness for $K_3$-Free-VS and deriving an XP algorithm for the case in which each vertex may be split at most once.
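
The splitting operation itself is easy to state in code. Below is a small sketch of a single vertex split on a graph stored as an adjacency-set dictionary; the function name and the particular distribution of neighbors are ours, and the snippet is only meant to illustrate the definition above.

```python
def split_vertex(adj: dict, v, part1: set, part2: set, v1="v1", v2="v2") -> dict:
    """Replace v by two nonadjacent new vertices v1 and v2; every former
    neighbor of v is attached to v1, to v2, or to both, as prescribed by
    part1 and part2 (their union must cover all neighbors of v)."""
    assert part1 | part2 == adj[v]
    new = {u: set(nbrs) for u, nbrs in adj.items() if u != v}
    for u in new:
        new[u].discard(v)
    new[v1], new[v2] = set(part1), set(part2)
    for u in part1:
        new[u].add(v1)
    for u in part2:
        new[u].add(v2)
    return new

# Splitting the hub of a triangle with a pendant edge attached:
g = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
print(split_vertex(g, "a", part1={"b", "c"}, part2={"d"}))
```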

The Sibson and Arimoto capacities, which are based on the Sibson and Arimoto mutual information (MI) of order $\alpha$, respectively, are well-known generalizations of the channel capacity $C$. In this study, we derive novel alternating optimization algorithms for computing these capacities by providing new variational characterizations of the Sibson MI and the Arimoto MI. Moreover, we prove that all iterative algorithms for computing these capacities are equivalent under appropriate conditions imposed on their initial distributions.
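
For context, the classical alternating optimization of this kind is the Blahut-Arimoto algorithm for the ordinary capacity $C$; the sketch below implements that standard algorithm, not the order-$\alpha$ variants derived in the paper, whose update rules are not reproduced here.

```python
import numpy as np

def blahut_arimoto(W: np.ndarray, iters: int = 200) -> float:
    """W[x, y] = transition probability of a discrete memoryless channel;
    returns an estimate of the capacity C in nats."""
    nx, _ = W.shape
    p = np.full(nx, 1.0 / nx)                     # input distribution
    for _ in range(iters):
        q = p[:, None] * W                        # joint p(x) W(y|x)
        q /= q.sum(axis=0, keepdims=True)         # reverse channel q(x|y)
        logits = (W * np.log(q + 1e-300)).sum(axis=1)
        p = np.exp(logits - logits.max())
        p /= p.sum()                              # p(x) ∝ exp(sum_y W(y|x) log q(x|y))
    joint = p[:, None] * W
    py = joint.sum(axis=0)
    return float((joint * np.log((joint + 1e-300) / (p[:, None] * py[None, :] + 1e-300))).sum())

bsc = np.array([[0.9, 0.1], [0.1, 0.9]])          # binary symmetric channel, crossover 0.1
print(blahut_arimoto(bsc))                        # approx. ln 2 - H(0.1) in nats
```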

Score-based statistical models play an important role in modern machine learning, statistics, and signal processing. A score-based hypothesis test for such models was proposed in \cite{wu2022score}. We analyze the performance of this score-based testing procedure and derive upper bounds on its Type I and Type II error probabilities. We prove that the exponents of our error bounds are asymptotically (in the number of samples) tight for the case of simple null and alternative hypotheses. We calculate these error exponents explicitly in specific cases and provide numerical studies for various other scenarios of interest.

Markov processes are widely used mathematical models for describing dynamic systems in various fields. However, accurately simulating large-scale systems at long time scales is computationally expensive due to the short time steps required for accurate integration. In this paper, we introduce an inference process that maps complex systems into a simplified representational space and models large jumps in time. To achieve this, we propose Time-lagged Information Bottleneck (T-IB), a principled objective rooted in information theory, which aims to capture relevant temporal features while discarding high-frequency information to simplify the simulation task and minimize the inference error. Our experiments demonstrate that T-IB learns information-optimal representations for accurately modeling the statistical properties and dynamics of the original process at a selected time lag, outperforming existing time-lagged dimensionality reduction methods.
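
As a point of comparison with the time-lagged dimensionality reduction baselines mentioned above, the sketch below implements a crude linear alternative (ours, not T-IB): it forms time-lagged pairs $(x_t, x_{t+\tau})$ from a trajectory and projects onto the leading singular directions of the time-lagged cross-covariance. All names and the toy data are illustrative assumptions.

```python
import numpy as np

def lagged_pairs(traj: np.ndarray, lag: int):
    """Split a trajectory into pairs (x_t, x_{t+lag})."""
    return traj[:-lag], traj[lag:]

def time_lagged_directions(traj: np.ndarray, lag: int, dim: int) -> np.ndarray:
    """Leading singular directions of the time-lagged cross-covariance."""
    x0, x1 = lagged_pairs(traj, lag)
    x0 = x0 - x0.mean(axis=0)
    x1 = x1 - x1.mean(axis=0)
    c_lag = x0.T @ x1 / (len(x0) - 1)
    u, _, _ = np.linalg.svd(c_lag)
    return u[:, :dim]

rng = np.random.default_rng(3)
t = np.arange(5000)
latent = np.sin(0.01 * t)[:, None]                     # slow 1-d latent signal
traj = latent @ rng.normal(size=(1, 10)) + 0.3 * rng.normal(size=(5000, 10))
w = time_lagged_directions(traj, lag=50, dim=2)
reduced = traj @ w                                     # simplified representation
print(reduced.shape)
```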

Consider the supervised learning setting where the goal is to learn to predict labels $\mathbf y$ given points $\mathbf x$ from a distribution. An \textit{omnipredictor} for a class $\mathcal L$ of loss functions and a class $\mathcal C$ of hypotheses is a predictor whose predictions incur less expected loss than the best hypothesis in $\mathcal C$ for every loss in $\mathcal L$. Since the work of [GKR+21] that introduced the notion, there has been a large body of work in the setting of binary labels where $\mathbf y \in \{0, 1\}$, but much less is known about the regression setting where $\mathbf y \in [0,1]$ can be continuous. Our main conceptual contribution is the notion of \textit{sufficient statistics} for loss minimization over a family of loss functions: these are a set of statistics about a distribution such that knowing them allows one to take actions that minimize the expected loss for any loss in the family. The notion of sufficient statistics relates directly to the approximate rank of the family of loss functions. Our key technical contribution is a bound of $O(1/\epsilon^{2/3})$ on the $\epsilon$-approximate rank of convex, Lipschitz functions on the interval $[0,1]$, which we show is tight up to a factor of $\mathrm{polylog}(1/\epsilon)$. This yields improved runtimes for learning omnipredictors for the class of all convex, Lipschitz loss functions under weak learnability assumptions about the class $\mathcal C$. We also give efficient omnipredictors when the loss families have low-degree polynomial approximations, or arise from generalized linear models (GLMs). This translation from sufficient statistics to faster omnipredictors is made possible by lifting the technique of loss outcome indistinguishability introduced by [GKH+23] for Boolean labels to the regression setting.
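
The notion of approximate rank can be probed numerically. The sketch below (our illustration, not the paper's argument) samples convex, 1-Lipschitz functions on a grid of $[0,1]$, stacks their values into a matrix, and counts how many SVD directions suffice for a small entrywise reconstruction error; since the SVD controls average rather than worst-case error, this is only a rough proxy for the sup-norm notion of $\epsilon$-approximate rank.

```python
import numpy as np

rng = np.random.default_rng(1)
grid = np.linspace(0.0, 1.0, 100)

def random_convex_lipschitz(pieces: int = 50) -> np.ndarray:
    """A convex, 1-Lipschitz function on [0,1]: a max of affine pieces with slopes in [-1, 1]."""
    slopes = rng.uniform(-1.0, 1.0, size=pieces)
    anchors = rng.uniform(0.0, 1.0, size=pieces)
    return np.max(slopes[None, :] * (grid[:, None] - anchors[None, :]), axis=1)

def approx_rank_proxy(F: np.ndarray, eps: float) -> int:
    """Smallest r such that the rank-r SVD truncation of F is within eps entrywise."""
    U, s, Vt = np.linalg.svd(F, full_matrices=False)
    for r in range(1, len(s) + 1):
        if np.abs(F - (U[:, :r] * s[:r]) @ Vt[:r]).max() <= eps:
            return r
    return len(s)

F = np.stack([random_convex_lipschitz() for _ in range(300)])   # 300 sampled losses x 100 grid points
for eps in (0.1, 0.03, 0.01):
    print(f"eps = {eps}: approximate-rank proxy = {approx_rank_proxy(F, eps)}")
```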

In this note we give proofs for results relating to the Instrumental Variable (IV) model with binary response $Y$ and binary treatment $X$, but with an instrument $Z$ with $K$ states. These results were originally stated in Richardson & Robins (2014), "ACE Bounds; SEMS with Equilibrium Conditions," arXiv:1410.0470.

Calibrating simulation models that take large quantities of multi-dimensional data as input is a hard simulation optimization problem. Existing adaptive sampling strategies offer a methodological solution. However, due to extreme noise levels and heteroskedasticity of the system responses, they may not sufficiently reduce the computational cost of estimation and of the solution algorithm's progress within a limited budget. We propose integrating stratification with adaptive sampling for the purpose of efficiency in optimization. Stratification can exploit local dependence in the simulation inputs and outputs. Yet, the state of the art does not provide a full capability to adaptively stratify the data as different solution alternatives are evaluated. We devise two procedures for data-driven calibration problems that involve a large dataset with multiple covariates, aiming to calibrate models within a fixed overall simulation budget. The first approach dynamically stratifies the input data using binary trees, while the second uses closed-form solutions based on linearity assumptions between the objective function and concomitant variables. We find that dynamic adjustment of the stratification structure accelerates optimization and reduces run-to-run variability in the generated solutions. A case study on calibrating a wind power simulation model widely used in the wind industry shows that the proposed stratified adaptive sampling yields better-calibrated parameters under a limited budget.
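
To illustrate why stratification helps with heteroskedastic responses, the sketch below compares crude Monte Carlo with stratified sampling under classical Neyman allocation (sample sizes proportional to stratum size times stratum standard deviation) on a toy simulation; the strata here are fixed by hand, whereas the paper's procedures learn them adaptively (binary trees or closed-form rules), and every name and number in the snippet is our own example.

```python
import numpy as np

rng = np.random.default_rng(2)

def response(x: np.ndarray) -> np.ndarray:
    """Toy heteroskedastic simulation output: the noise level depends on the covariate."""
    return np.sin(3 * x) + rng.normal(scale=0.1 + 2.0 * (x > 0.7), size=x.shape)

covariates = rng.uniform(0, 1, size=100_000)          # large input dataset
edges = np.array([0.0, 0.35, 0.7, 1.0])               # hand-picked strata on the covariate
budget = 2_000

# Crude Monte Carlo estimate of the mean response.
idx = rng.choice(len(covariates), size=budget, replace=False)
crude = response(covariates[idx]).mean()

# Stratified estimate with Neyman allocation based on pilot standard deviations.
strata = [np.where((covariates >= lo) & (covariates < hi))[0]
          for lo, hi in zip(edges[:-1], edges[1:])]
weights = np.array([len(s) / len(covariates) for s in strata])
pilot_sd = np.array([response(covariates[rng.choice(s, 50)]).std() for s in strata])
alloc = np.maximum(1, np.round(budget * weights * pilot_sd / (weights * pilot_sd).sum())).astype(int)
stratified = sum(w * response(covariates[rng.choice(s, m)]).mean()
                 for w, s, m in zip(weights, strata, alloc))
print(f"crude = {crude:.4f}, stratified = {stratified:.4f}")
```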

The family of log-concave density functions contains many common probability distributions. Due to the shape restriction, it is possible to compute a nonparametric estimate of the density, for example the nonparametric maximum likelihood estimate (NPMLE). However, the associated uncertainty quantification for the NPMLE is less well developed. The current techniques for uncertainty quantification are Bayesian, using a Dirichlet process prior combined with Markov chain Monte Carlo (MCMC) sampling from the posterior. In this paper, we start with the NPMLE and use a version of the martingale posterior distribution to quantify the uncertainty about it. The algorithm can be implemented in parallel and hence is fast. We prove the convergence of the algorithm by constructing suitable submartingales. We also illustrate the results with different models and settings and some real data, and compare our method with existing approaches in the literature.
