In many areas of interest, modern risk assessment requires estimation of the extremal behaviour of sums of random variables. We derive the first-order upper-tail behaviour of the weighted sum of bivariate random variables under weak assumptions on their marginal distributions and their copula. The extremal behaviour of the marginal variables is characterised by the generalised Pareto distribution and their extremal dependence through subclasses of the limiting representations of Ledford and Tawn (1997) and Heffernan and Tawn (2004). We find that the upper-tail behaviour of the aggregate is driven by different factors depending on the signs of the marginal shape parameters: if they are both negative, the extremal behaviour of the aggregate is determined by both marginal shape parameters and the coefficient of asymptotic independence (Ledford and Tawn, 1996); if they are both positive or have different signs, the upper-tail behaviour of the aggregate is given solely by the largest marginal shape parameter. We also derive the aggregate upper-tail behaviour for some well-known copulae, which reveals further insight into the tail structure when the copula falls outside the conditions for the subclasses of the limiting dependence representations.
Mark-point dependence plays a critical role in research problems that can be fitted into the general framework of marked point processes. In this work, we focus on adjusting for mark-point dependence when estimating the mean and covariance functions of the mark process, given independent replicates of the marked point process. We assume that the mark process is a Gaussian process and the point process is a log-Gaussian Cox process, where the mark-point dependence is generated through the dependence between two latent Gaussian processes. Under this framework, naive local linear estimators that ignore the mark-point dependence can be severely biased. We show that this bias can be corrected using a local linear estimator of the cross-covariance function and establish uniform convergence rates of the bias-corrected estimators. Furthermore, we propose a test statistic for mark-point independence based on local linear estimators, which is shown to be asymptotically normal at the parametric $\sqrt{n}$ convergence rate. Model diagnostic tools are developed for key model assumptions, and a robust functional permutation test is proposed for a more general class of mark-point processes. The effectiveness of the proposed methods is demonstrated using extensive simulations and applications to two real data examples.
In this paper, we study a sequential decision-making problem faced by e-commerce carriers: when to send out a vehicle from the central depot to serve customer requests, and in which order to provide the service, under the assumption that the time at which parcels arrive at the depot is stochastic and dynamic. The objective is to maximize the number of parcels that can be delivered during the service hours. We propose two reinforcement learning approaches for solving this problem, one based on a policy function approximation (PFA) and the second on a value function approximation (VFA). Both methods are combined with a look-ahead strategy, in which future release dates are sampled in a Monte-Carlo fashion and a tailored batch approach is used to approximate the value of future states. Our PFA and VFA make good use of branch-and-cut-based exact methods to improve the quality of decisions. We also establish sufficient conditions for a partial characterization of the optimal policy and integrate them into PFA/VFA. In an empirical study based on 720 benchmark instances, we conduct a competitive analysis using upper bounds with perfect information and we show that PFA and VFA greatly outperform two alternative myopic approaches. Overall, PFA provides the best solutions, while VFA (which benefits from a two-stage stochastic optimization model) achieves a better tradeoff between solution quality and computing time.
Real-world behavior is often shaped by complex interactions between multiple agents. To study multi-agent behavior at scale, advances in unsupervised and self-supervised learning have enabled a variety of behavioral representations to be learned from trajectory data. To date, there is no unified set of benchmarks that enables methods to be compared quantitatively and systematically across a broad set of behavior analysis settings. We aim to address this by introducing a large-scale, multi-agent trajectory dataset from real-world behavioral neuroscience experiments that covers a range of behavior analysis tasks. Our dataset consists of trajectory data from common model organisms, with 9.6 million frames of mouse data and 4.4 million frames of fly data, in a variety of experimental settings, such as different strains, lengths of interaction, and optogenetic stimulation. A subset of the frames also includes expert-annotated behavior labels. Improvements on our dataset correspond to behavioral representations that work across multiple organisms and are able to capture differences for common behavior analysis tasks.
We showcase a variety of functions and classes that implement sampling procedures with improved exploration of the parameter space assisted by machine learning. Special attention is paid to setting sane defaults so that the adjustments required by different problems remain minimal. This collection of routines can be employed for different types of analysis, from finding bounds on the parameter space to accumulating samples in areas of interest. In particular, we discuss two methods assisted by different machine learning models: regression and classification. We show that a machine learning classifier can provide higher efficiency for exploring the parameter space. We also introduce a boosting technique to improve the slow convergence at the start of the process. The use of these routines is best explained with a few examples that illustrate the type of results one can obtain. We also include the code used to produce these examples, as well as descriptions of the adjustments that can be made to adapt the calculation to other problems. We conclude by showing the impact of these techniques when exploring the parameter space of the two-Higgs-doublet model that matches the measured Higgs boson signal strength. The code used for this paper and instructions on how to use it are available on the web.
We propose united implicit functions (UNIF), a part-based method for clothed human reconstruction and animation with raw scans and skeletons as the input. Previous part-based methods for human reconstruction rely on ground-truth part labels from SMPL and are thus limited to minimally clothed humans. In contrast, our method learns to separate parts from body motions instead of part supervision, and can thus be extended to clothed humans and other articulated objects. Our Partition-from-Motion is achieved by a bone-centered initialization, a bone limit loss, and a section normal loss that ensure stable part division even when the training poses are limited. We also present a minimal perimeter loss for SDF to suppress extra surfaces and part overlapping. Another core component of our method is an adjacent part seaming algorithm that produces non-rigid deformations to maintain the connection between parts, which significantly reduces part-based artifacts. Under this algorithm, we further propose "Competing Parts", a method that defines blending weights by the relative position of a point to bones instead of the absolute position, avoiding the generalization problem of neural implicit functions with inverse LBS (linear blend skinning). We demonstrate the effectiveness of our method through clothed human body reconstruction and animation on the CAPE and ClothSeq datasets.
We consider power means of independent and identically distributed (i.i.d.) non-integrable random variables. The power mean is a homogeneous quasi-arithmetic mean, and under some conditions, several limit theorems hold for the power mean just as they do for the arithmetic mean of i.i.d. integrable random variables. We establish integrabilities and a limit theorem for the variances of the power mean of i.i.d. non-integrable random variables. We also consider the behavior of the power mean as the power parameter varies. A distinctive feature of our approach is that the generator of the power mean is allowed to be complex-valued, which enables us to consider the power mean of random variables supported on the whole set of real numbers. The complex-valued power mean is an unbiased, strongly consistent estimator for the joint of the location and scale parameters of the Cauchy distribution.
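As a rough illustration of the complex-valued power mean, the sketch below assumes a generator of the form $x \mapsto (x+\alpha)^p$ with $\operatorname{Im}\alpha > 0$ and $0 < p < 1$; this specific form, the function names, and the parameter values are assumptions made for illustration and are not stated in the abstract. Under that assumption, the real and imaginary parts of the estimate should approach the location and scale of a Cauchy sample.

```python
import math
import random


def complex_power_mean(xs, p, alpha):
    """Power mean with the (assumed) complex generator x -> (x + alpha)**p.

    Shifting by alpha with Im(alpha) > 0 moves every real sample into the
    upper half-plane, so the principal branch of the power is well defined.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("this sketch assumes 0 < p < 1")
    if alpha.imag <= 0.0:
        raise ValueError("this sketch assumes Im(alpha) > 0")
    mean = sum((x + alpha) ** p for x in xs) / len(xs)
    # Invert the generator: raise to 1/p, then undo the shift.
    return mean ** (1.0 / p) - alpha


def sample_cauchy(rng, location, scale, n):
    """Draw n Cauchy(location, scale) samples by inverse-CDF sampling."""
    return [location + scale * math.tan(math.pi * (rng.random() - 0.5))
            for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    xs = sample_cauchy(rng, location=1.0, scale=2.0, n=100_000)
    est = complex_power_mean(xs, p=0.25, alpha=1j)
    print("location estimate:", est.real)
    print("scale estimate:   ", est.imag)
```

Choosing $p < 1/2$ keeps the second moment of $(X+\alpha)^p$ finite for Cauchy data, which is why the example uses $p = 0.25$.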
The literature on treatment choice focuses on the mean of welfare regret. Ignoring other features of the regret distribution, however, can lead to an undesirable rule that suffers from a high chance of welfare loss due to sampling uncertainty. We propose to minimize the mean of a nonlinear transformation of welfare regret. This paradigm shift alters optimal rules drastically. We show that for a wide class of nonlinear criteria, admissible rules are fractional. Focusing on mean square regret, we derive the closed-form probabilities of randomization for finite-sample Bayes and minimax optimal rules when data are normal with known variance. The minimax optimal rule is a simple logit based on the sample mean and agrees with the posterior probability for positive treatment effect under the least favorable prior. The Bayes optimal rule with an uninformative prior is different but produces quantitatively comparable mean square regret. We extend these results to limit experiments and discuss our findings through sample size calculations.
In model extraction attacks, adversaries can steal a machine learning model exposed via a public API by repeatedly querying it and adjusting their own model based on obtained predictions. To prevent model stealing, existing defenses focus on detecting malicious queries, truncating, or distorting outputs, thus necessarily introducing a tradeoff between robustness and model utility for legitimate users. Instead, we propose to impede model extraction by requiring users to complete a proof-of-work before they can read the model's predictions. This deters attackers by greatly increasing (even up to 100x) the computational effort needed to leverage query access for model extraction. Since we calibrate the effort required to complete the proof-of-work to each query, this only introduces a slight overhead for regular users (up to 2x). To achieve this, our calibration applies tools from differential privacy to measure the information revealed by a query. Our method requires no modification of the victim model and can be applied by machine learning practitioners to guard their publicly exposed models against being easily stolen.
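The idea of gating predictions behind a per-query proof-of-work can be sketched as follows. This is a minimal hashcash-style illustration, not the scheme from the paper: the puzzle (finding a nonce whose SHA-256 digest has a given number of leading zero bits), the function names, and the linear mapping from a per-query information-leakage score to difficulty are all hypothetical choices made for this example.

```python
import hashlib


def difficulty_for_cost(leakage_cost, base_bits=4, bits_per_unit=2.0, max_bits=20):
    """Hypothetical calibration: more estimated leakage -> harder puzzle."""
    if leakage_cost < 0:
        raise ValueError("leakage_cost must be non-negative")
    return min(max_bits, base_bits + int(bits_per_unit * leakage_cost))


def _digest_value(challenge: bytes, nonce: int) -> int:
    data = challenge + nonce.to_bytes(8, "big")
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def verify(challenge: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Server side: one hash checks that the digest has enough leading zero bits."""
    return _digest_value(challenge, nonce) >> (256 - difficulty_bits) == 0


def solve(challenge: bytes, difficulty_bits: int) -> int:
    """Client side: brute-force search, about 2**difficulty_bits hashes on average."""
    nonce = 0
    while not verify(challenge, nonce, difficulty_bits):
        nonce += 1
    return nonce


if __name__ == "__main__":
    challenge = b"query-id-123"  # hypothetical per-query challenge string
    bits = difficulty_for_cost(leakage_cost=2.0)
    nonce = solve(challenge, bits)
    print(bits, nonce, verify(challenge, nonce, bits))
```

The asymmetry is the point of the design: verification costs a single hash, while solving cost grows exponentially with the difficulty, so queries judged to reveal more information can be made much more expensive than ordinary ones.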
Advances in artificial intelligence often stem from the development of new environments that abstract real-world situations into a form where research can be done conveniently. This paper contributes such an environment based on ideas inspired by elementary Microeconomics. Agents learn to produce resources in a spatially complex world, trade them with one another, and consume those that they prefer. We show that the emergent production, consumption, and pricing behaviors respond to environmental conditions in the directions predicted by supply and demand shifts in Microeconomics. We also demonstrate settings where the agents' emergent prices for goods vary over space, reflecting the local abundance of goods. After the price disparities emerge, some agents then discover a niche of transporting goods between regions with different prevailing prices -- a profitable strategy because they can buy goods where they are cheap and sell them where they are expensive. Finally, in a series of ablation experiments, we investigate how choices in the environmental rewards, bartering actions, agent architecture, and ability to consume tradable goods can either aid or inhibit the emergence of this economic behavior. This work is part of the environment development branch of a research program that aims to build human-like artificial general intelligence through multi-agent interactions in simulated societies. By exploring which environment features are needed for the basic phenomena of elementary microeconomics to emerge automatically from learning, we arrive at an environment that differs from those studied in prior multi-agent reinforcement learning work along several dimensions. For example, the model incorporates heterogeneous tastes and physical abilities, and agents negotiate with one another as a grounded form of communication.
With the rapid increase of large-scale, real-world datasets, it becomes critical to address the problem of long-tailed data distribution (i.e., a few classes account for most of the data, while most classes are under-represented). Existing solutions typically adopt class re-balancing strategies such as re-sampling and re-weighting based on the number of observations for each class. In this work, we argue that as the number of samples increases, the additional benefit of a newly added data point will diminish. We introduce a novel theoretical framework to measure data overlap by associating with each sample a small neighboring region rather than a single point. The effective number of samples is defined as the volume of samples and can be calculated by a simple formula $(1-\beta^{n})/(1-\beta)$, where $n$ is the number of samples and $\beta \in [0,1)$ is a hyperparameter. We design a re-weighting scheme that uses the effective number of samples for each class to re-balance the loss, thereby yielding a class-balanced loss. Comprehensive experiments are conducted on artificially induced long-tailed CIFAR datasets and large-scale datasets including ImageNet and iNaturalist. Our results show that when trained with the proposed class-balanced loss, the network is able to achieve significant performance gains on long-tailed datasets.
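The effective-number formula above can be turned into per-class loss weights with a few lines of code. The sketch below assumes that each class weight is taken inversely proportional to its effective number and that the weights are normalized to sum to the number of classes; both choices, and the function names, are assumptions for illustration rather than details given in the abstract.

```python
def effective_number(n: int, beta: float) -> float:
    """Effective number of samples, (1 - beta**n) / (1 - beta)."""
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must lie in [0, 1)")
    if n < 0:
        raise ValueError("n must be non-negative")
    return (1.0 - beta ** n) / (1.0 - beta)


def class_balanced_weights(counts, beta):
    """Per-class weights, assumed inversely proportional to the effective number.

    The weights are rescaled to sum to the number of classes (an assumed
    normalization, so that the overall loss scale stays comparable).
    """
    if any(n < 1 for n in counts):
        raise ValueError("every class needs at least one sample")
    inverse = [1.0 / effective_number(n, beta) for n in counts]
    scale = len(counts) / sum(inverse)
    return [w * scale for w in inverse]


if __name__ == "__main__":
    counts = [5000, 500, 50]  # hypothetical long-tailed class counts
    for beta in (0.0, 0.99, 0.999):
        print(beta, class_balanced_weights(counts, beta))
```

With $\beta = 0$ every class has effective number 1, so all weights are equal (no re-weighting); as $\beta$ approaches 1 the effective number approaches $n$, and the weights approach inverse class frequency.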