美国式禁忌电影在线观看免费观看_国产日本亚洲欧美一区二区_91久久久精品人妻专区不卡_国产欧美另类久久久精品9_色综合无码不卡在线播放AV_亚洲中文无码永久免费视频_国产成人无码18禁午夜在线

We consider the classical fair division problem which studies how to allocate resources fairly and efficiently. We give a complete landscape on the computational complexity and approximability of maximizing the social welfare within (1) envy-free up to any item (EFX) and (2) envy-free up to one item (EF1) allocations of indivisible goods for both normalized and unnormalized valuations. We show that a partial EFX allocation may have a higher social welfare than a complete EFX allocation, while it is well-known that this is not true for EF1 allocations. Thus, our first group of results focuses on the problem of maximizing social welfare subject to (partial) EFX allocations. For $n=2$ agents, we provide a polynomial time approximation scheme (PTAS) and an NP-hardness result. For a general number of agents $n>2$, we present algorithms that achieve approximation ratios of $O(n)$ and $O(\sqrt{n})$ for unnormalized and normalized valuations, respectively. These results are complemented by the asymptotically tight inapproximability results. We also study the same constrained optimization problem for EF1. For $n=2$, we show a fully polynomial time approximation scheme (FPTAS) and complement this positive result with an NP-hardness result. For general $n$, we present polynomial inapproximability ratios for both normalized and unnormalized valuations. Our results also imply the price of EFX is $\Theta(\sqrt{n})$ for normalized valuations, which is unknown in the previous literature.

相關內容

規(gui)范(fan)化的

關注 2

約束 · 連續變量 · 凸函數 · 約束模型 · 線性可分 ·

2023 年 3 月 31 日

Constrained Optimization of Rank-One Functions with Indicator Variables

Soroosh Shafieezadeh-Abadeh,Fatma K?l?n?-Karzan

Optimization problems involving minimization of a rank-one convex function over constraints modeling restrictions on the support of the decision variables emerge in various machine learning applications. These problems are often modeled with indicator variables for identifying the support of the continuous variables. In this paper we investigate compact extended formulations for such problems through perspective reformulation techniques. In contrast to the majority of previous work that relies on support function arguments and disjunctive programming techniques to provide convex hull results, we propose a constructive approach that exploits a hidden conic structure induced by perspective functions. To this end, we first establish a convex hull result for a general conic mixed-binary set in which each conic constraint involves a linear function of independent continuous variables and a set of binary variables. We then demonstrate that extended representations of sets associated with epigraphs of rank-one convex functions over constraints modeling indicator relations naturally admit such a conic representation. This enables us to systematically give perspective formulations for the convex hull descriptions of these sets with nonlinear separable or non-separable objective functions, sign constraints on continuous variables, and combinatorial constraints on indicator variables. We illustrate the efficacy of our results on sparse nonnegative logistic regression problems.

樣本復雜度 · 獨立成分分析 · 最優 · 最優性 · 樣本 ·

2023 年 3 月 31 日

Large Dimensional Independent Component Analysis: Statistical Optimality and Computational Tractability

Arnab Auddy,Ming Yuan

In this paper, we investigate the optimal statistical performance and the impact of computational constraints for independent component analysis (ICA). Our goal is twofold. On the one hand, we characterize the precise role of dimensionality on sample complexity and statistical accuracy, and how computational consideration may affect them. In particular, we show that the optimal sample complexity is linear in dimensionality, and interestingly, the commonly used sample kurtosis-based approaches are necessarily suboptimal. However, the optimal sample complexity becomes quadratic, up to a logarithmic factor, in the dimension if we restrict ourselves to estimates that can be computed with low-degree polynomial algorithms. On the other hand, we develop computationally tractable estimates that attain both the optimal sample complexity and minimax optimal rates of convergence. We study the asymptotic properties of the proposed estimates and establish their asymptotic normality that can be readily used for statistical inferences. Our method is fairly easy to implement and numerical experiments are presented to further demonstrate its practical merits.

Softmax · 損失函數 · 損失 · 多標簽分類 · 閾值 ·

2023 年 3 月 31 日

A two-head loss function for deep Average-K classification

Camille Garcin,Maximilien Servajean,Alexis Joly,Joseph Salmon

Average-K classification is an alternative to top-K classification in which the number of labels returned varies with the ambiguity of the input image but must average to K over all the samples. A simple method to solve this task is to threshold the softmax output of a model trained with the cross-entropy loss. This approach is theoretically proven to be asymptotically consistent, but it is not guaranteed to be optimal for a finite set of samples. In this paper, we propose a new loss function based on a multi-label classification head in addition to the classical softmax. This second head is trained using pseudo-labels generated by thresholding the softmax head while guaranteeing that K classes are returned on average. We show that this approach allows the model to better capture ambiguities between classes and, as a result, to return more consistent sets of possible classes. Experiments on two datasets from the literature demonstrate that our approach outperforms the softmax baseline, as well as several other loss functions more generally designed for weakly supervised multi-label classification. The gains are larger the higher the uncertainty, especially for classes with few samples.

PAC學習理論 · 一致 · 混合方法 · 非i.i.d.數據 · 非平穩 ·

2023 年 3 月 31 日

A unified recipe for deriving (time-uniform) PAC-Bayes bounds

Ben Chugg,Hongjian Wang,Aaditya Ramdas

from arxiv, 46 pages

We present a unified framework for deriving PAC-Bayesian generalization bounds. Unlike most previous literature on this topic, our bounds are anytime-valid (i.e., time-uniform), meaning that they hold at all stopping times, not only for a fixed sample size. Our approach combines four tools in the following order: (a) nonnegative supermartingales or reverse submartingales, (b) the method of mixtures, (c) the Donsker-Varadhan formula (or other convex duality principles), and (d) Ville's inequality. Our main result is a PAC-Bayes theorem which holds for a wide class of discrete stochastic processes. We show how this result implies time-uniform versions of well-known classical PAC-Bayes bounds, such as those of Seeger, McAllester, Maurer, and Catoni, in addition to many recent bounds. We also present several novel bounds. Our framework also enables us to relax traditional assumptions; in particular, we consider nonstationary loss functions and non-i.i.d. data. In sum, we unify the derivation of past bounds and ease the search for future bounds: one may simply check if our supermartingale or submartingale conditions are met and, if so, be guaranteed a (time-uniform) PAC-Bayes bound.

布爾函數 · 比特 · 度量 · 決策樹 · 分解 ·

2023 年 3 月 31 日

Level-p-complexity of Boolean functions using Thinning, Memoization, and Polynomials

Julia Jansson,Patrik Jansson

from arxiv, 20 pages, 10 figures

This paper describes a purely functional library for computing level-$p$-complexity of Boolean functions, and applies it to two-level iterated majority. Boolean functions are simply functions from $n$ bits to one bit, and they can describe digital circuits, voting systems, etc. An example of a Boolean function is majority, which returns the value that has majority among the $n$ input bits for odd $n$. The complexity of a Boolean function $f$ measures the cost of evaluating it: how many bits of the input are needed to be certain about the result of $f$. There are many competing complexity measures but we focus on level-$p$-complexity -- a function of the probability $p$ that a bit is 1. The level-$p$-complexity $D_p(f)$ is the minimum expected cost when the input bits are independent and identically distributed with Bernoulli($p$) distribution. We specify the problem as choosing the minimum expected cost of all possible decision trees -- which directly translates to a clearly correct, but very inefficient implementation. The library uses thinning and memoization for efficiency and type classes for separation of concerns. The complexity is represented using polynomials, and the order relation used for thinning is implemented using polynomial factorisation and root-counting. Finally we compute the complexity for two-level iterated majority and improve on an earlier result by J.~Jansson.

正確性 · 重現性 · 潛在 · 程序正確性 · Conformer ·

2023 年 3 月 31 日

Reproducibility is Nothing without Correctness: The Importance of Testing Code in NLP

Sara Papi,Marco Gaido,Andrea Pilzer,Matteo Negri

Despite its pivotal role in research experiments, code correctness is often presumed only on the basis of the perceived quality of the results. This comes with the risk of erroneous outcomes and potentially misleading findings. To address this issue, we posit that the current focus on result reproducibility should go hand in hand with the emphasis on coding best practices. We bolster our call to the NLP community by presenting a case study, in which we identify (and correct) three bugs in widely used open-source implementations of the state-of-the-art Conformer architecture. Through comparative experiments on automatic speech recognition and translation in various language settings, we demonstrate that the existence of bugs does not prevent the achievement of good and reproducible results and can lead to incorrect conclusions that potentially misguide future research. In response to this, this study is a call to action toward the adoption of coding best practices aimed at fostering correctness and improving the quality of the developed software.

小批量 · 對抗 · 批量大小 · 對抗樣本 · 樣本 ·

2023 年 3 月 30 日

Generating Adversarial Samples in Mini-Batches May Be Detrimental To Adversarial Robustness

Timothy Redgrave,Colton Crum

from arxiv, 6 pages, 3 figures

Neural networks have been proven to be both highly effective within computer vision, and highly vulnerable to adversarial attacks. Consequently, as the use of neural networks increases due to their unrivaled performance, so too does the threat posed by adversarial attacks. In this work, we build towards addressing the challenge of adversarial robustness by exploring the relationship between the mini-batch size used during adversarial sample generation and the strength of the adversarial samples produced. We demonstrate that an increase in mini-batch size results in a decrease in the efficacy of the samples produced, and we draw connections between these observations and the phenomenon of vanishing gradients. Next, we formulate loss functions such that adversarial sample strength is not degraded by mini-batch size. Our findings highlight a potential risk for underestimating the true (practical) strength of adversarial attacks, and a risk of overestimating a model's robustness. We share our codes to let others replicate our experiments and to facilitate further exploration of the connections between batch size and adversarial sample strength.

分析 · 分布式機器學習 · 機器學習平臺 · 匿名化技術 · 匿名化 ·

2023 年 3 月 27 日

PADME-SoSci: A Platform for Analytics and Distributed Machine Learning for the Social Sciences

Zeyd Boukhers,Arnim Bleier,Yeliz Ucer Yediel,Mio Hienstorfer-Heitmann,Mehrshad Jaberansaryand Adamantios Koumpisand Oya Beyan

from arxiv, accepted to be published @ ACM/IEEE JCDL 2023 - Joint Conference on Digital Libraries

Data privacy and ownership are significant in social data science, raising legal and ethical concerns. Sharing and analyzing data is difficult when different parties own different parts of it. An approach to this challenge is to apply de-identification or anonymization techniques to the data before collecting it for analysis. However, this can reduce data utility and increase the risk of re-identification. To address these limitations, we present PADME, a distributed analytics tool that federates model implementation and training. PADME uses a federated approach where the model is implemented and deployed by all parties and visits each data location incrementally for training. This enables the analysis of data across locations while still allowing the model to be trained as if all data were in a single location. Training the model on data in its original location preserves data ownership. Furthermore, the results are not provided until the analysis is completed on all data locations to ensure privacy and avoid bias in the results.

貪心 · 模態 · MoDELS · 學成 · 泛化理論 ·

2022 年 2 月 10 日

Characterizing and overcoming the greedy nature of learning in multi-modal deep neural networks

Nan Wu,Stanis?aw Jastrz?bski,Kyunghyun Cho,Krzysztof J. Geras

We hypothesize that due to the greedy nature of learning in multi-modal deep neural networks, these models tend to rely on just one modality while under-fitting the other modalities. Such behavior is counter-intuitive and hurts the models' generalization, as we observe empirically. To estimate the model's dependence on each modality, we compute the gain on the accuracy when the model has access to it in addition to another modality. We refer to this gain as the conditional utilization rate. In the experiments, we consistently observe an imbalance in conditional utilization rates between modalities, across multiple tasks and architectures. Since conditional utilization rate cannot be computed efficiently during training, we introduce a proxy for it based on the pace at which the model learns from each modality, which we refer to as the conditional learning speed. We propose an algorithm to balance the conditional learning speeds between modalities during training and demonstrate that it indeed addresses the issue of greedy learning. The proposed algorithm improves the model's generalization on three datasets: Colored MNIST, Princeton ModelNet40, and NVIDIA Dynamic Hand Gesture.

Softmax · 邊緣化 · Performer · Better · state-of-the-art ·

2018 年 1 月 18 日

Additive Margin Softmax for Face Verification

Feng Wang,Weiyang Liu,Haijun Liu,Jian Cheng

from arxiv, technical report

In this paper, we propose a conceptually simple and geometrically interpretable objective function, i.e. additive margin Softmax (AM-Softmax), for deep face verification. In general, the face verification task can be viewed as a metric learning problem, so learning large-margin face features whose intra-class variation is small and inter-class difference is large is of great importance in order to achieve good performance. Recently, Large-margin Softmax and Angular Softmax have been proposed to incorporate the angular margin in a multiplicative manner. In this work, we introduce a novel additive angular margin for the Softmax loss, which is intuitively appealing and more interpretable than the existing works. We also emphasize and discuss the importance of feature normalization in the paper. Most importantly, our experiments on LFW BLUFR and MegaFace show that our additive margin softmax loss consistently performs better than the current state-of-the-art methods using the same network architecture and training dataset. Our code has also been made available at //github.com/happynear/AMSoftmax