<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/" xmlns:ynews="http://news.yahoo.com/rss/">
    <channel>
        <title>Nova Reader - Subject</title>
        <link>https://www.novareader.co</link>
        <description>Default RSS Feed</description>
        <language>en-us</language>
        <copyright>Newgen KnowledgeWorks</copyright>
        <item>
            <title><![CDATA[Density estimation using deep generative neural networks]]></title>
            <media:thumbnail url="https://storage.googleapis.com/nova-demo-unsecured-files/unsecured/content-1766061150175-cc941679-92d7-403e-9398-5a3508c4c26a/cover.png"></media:thumbnail>
            <link>https://www.novareader.co/book/isbn/10.1073/pnas.2101344118</link>
            <description><![CDATA[<p class="para" id="N65542">Density estimation is among the most fundamental problems in statistics. It is notoriously difficult to estimate the density of high-dimensional data due to the “curse of dimensionality.” Here, we introduce a new general-purpose density estimator based on deep generative neural networks. By modeling the data as normally distributed around a manifold of reduced dimension, we show how the power of bidirectional generative neural networks (e.g., cycleGAN) can be exploited for explicit evaluation of the data density. Simulation and real-data experiments suggest that our method is effective in a wide range of problems. This approach should be helpful in many applications where an accurate density estimator is needed.</p><p class="para" id="N65539">Density estimation is one of the fundamental problems in both statistics and machine learning. In this study, we propose Roundtrip, a computational framework for general-purpose density estimation based on deep generative neural networks. Roundtrip retains the generative power of deep generative models, such as generative adversarial networks (GANs), while also providing estimates of density values, thus supporting both data generation and density estimation. Unlike previous neural density estimators that put stringent conditions on the transformation from the latent space to the data space, Roundtrip enables the use of much more general mappings in which the target density is modeled by learning a manifold induced from a base density (e.g., a Gaussian distribution). Roundtrip provides a statistical framework for GAN models in which an explicit evaluation of density values is feasible. In numerical experiments, Roundtrip exceeds state-of-the-art performance in a diverse range of density estimation tasks.</p>]]></description>
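            <!-- Illustrative Python sketch (not from the paper): Roundtrip-style density
                 evaluation by importance sampling around a learned inverse mapping. The
                 affine G and H and the noise scale sigma2 are hypothetical stand-ins for
                 trained networks; the paper's exact proposal and estimator may differ.

            import numpy as np

            rng = np.random.default_rng(0)

            # Toy stand-ins for the trained forward (G: z to x) and backward (H: x to z)
            # mappings of a roundtrip/cycle model; a fixed affine pair keeps the demo exact.
            A = np.array([[1.5, 0.2], [0.3, 0.8]])
            b = np.array([0.5, -1.0])
            def G(z): return z @ A.T + b
            def H(x): return (x - b) @ np.linalg.inv(A).T

            def log_gauss(u, mean, var):
                # Log density of an isotropic Gaussian N(mean, var * I).
                d = np.shape(u)[-1]
                return -0.5 * (d * np.log(2 * np.pi * var)
                               + np.sum((u - mean) ** 2, axis=-1) / var)

            def roundtrip_density(x, sigma2=0.05, n_mc=5000):
                # p(x) = E_{z ~ N(0, I)}[ N(x; G(z), sigma2 * I) ], estimated by importance
                # sampling with proposal q(z) = N(H(x), I) centered at the inverse point.
                z0 = H(x)
                z = z0 + rng.standard_normal((n_mc, z0.size))
                log_w = (log_gauss(x, G(z), sigma2)   # likelihood of x given z
                         + log_gauss(z, 0.0, 1.0)     # base density N(0, I)
                         - log_gauss(z, z0, 1.0))     # proposal density
                m = log_w.max()
                return np.exp(m) * np.mean(np.exp(log_w - m))

            print("estimated p(x):", roundtrip_density(np.array([0.7, -0.9])))
            -->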
            <pubDate><![CDATA[Thu, 08 Apr 2021 00:00:00 GMT]]></pubDate>
        </item>
        <item>
            <title><![CDATA[Confidence intervals for policy evaluation in adaptive experiments]]></title>
            <media:thumbnail url="https://storage.googleapis.com/nova-demo-unsecured-files/unsecured/content-1766059935978-173af4ae-eb1f-40bf-bd99-a7aa8644ae43/cover.png"></media:thumbnail>
            <link>https://www.novareader.co/book/isbn/10.1073/pnas.2014602118</link>
            <description><![CDATA[<p class="para" id="N65542">Randomized controlled trials are central to the scientific process, but they can be costly. For example, a clinical trial may assign patients to treatments that are detrimental to them. Adaptive experimental designs, such as multiarmed bandit algorithms, reduce costs by increasing the probability of assigning promising treatments over the course of the experiment. However, because observations collected by these methods are dependent and their distribution is nonstationary, statistical inference can be challenging. We propose a treatment-effect estimator whose test statistic is asymptotically unbiased and normally distributed under straightforward, relatively weak conditions on the adaptive design. This estimator generalizes to a variety of parameters of interest.</p><p class="para" id="N65539">Adaptive experimental designs can dramatically improve efficiency in randomized trials. But with adaptively collected data, common estimators based on sample means and inverse propensity-weighted means can be biased or heavy-tailed. This poses statistical challenges, in particular when the experimenter would like to test hypotheses about parameters that were not targeted by the data-collection mechanism. In this paper, we present a class of test statistics that can handle these challenges. Our approach is to adaptively reweight the terms of an augmented inverse propensity-weighting estimator to control the contribution of each term to the estimator’s variance. This scheme reduces overall variance and yields an asymptotically normal test statistic. We validate the accuracy of the resulting estimates and their confidence intervals (CIs) in numerical experiments and show that our methods compare favorably to existing alternatives in terms of mean squared error, coverage, and CI size.</p>]]></description>
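            <!-- Illustrative Python sketch (not the paper's exact construction): an
                 adaptively weighted AIPW estimate of one arm's mean with a normal CI.
                 The variance-stabilizing choice h_t = sqrt(e_t) stands in for the paper's
                 allocation-rate weights; all names below are hypothetical.

            import numpy as np

            def weighted_aipw(y, w, arm, e, muhat, z975=1.959964):
                # y: outcomes (T,); w: assigned treatments (T,);
                # e: assignment probabilities e_t(arm) used by the adaptive design (T,);
                # muhat: sequential outcome-model predictions muhat_t(arm) (T,), each
                # fitted only on data observed before step t.
                gamma = muhat + (w == arm) / e * (y - muhat)    # AIPW score per step
                h = np.sqrt(e)                                  # variance-stabilizing weights
                est = np.sum(h * gamma) / np.sum(h)
                se = np.sqrt(np.sum(h ** 2 * (gamma - est) ** 2)) / np.sum(h)
                return est, (est - z975 * se, est + z975 * se)  # point estimate, 95% CI
            -->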
            <pubDate><![CDATA[Mon, 05 Apr 2021 00:00:00 GMT]]></pubDate>
        </item>
        <item>
            <title><![CDATA[Task-specific information outperforms surveillance-style big data in predictive analytics]]></title>
            <media:thumbnail url="https://storage.googleapis.com/nova-demo-unsecured-files/unsecured/content-1766030723942-774f0d0f-01a9-4817-9466-a221a730a952/cover.png"></media:thumbnail>
            <link>https://www.novareader.co/book/isbn/10.1073/pnas.2020258118</link>
            <description><![CDATA[<p class="para" id="N65539">Increasingly, human behavior can be monitored through the collection of data from digital devices revealing information on behaviors and locations. In the context of higher education, a growing number of schools and universities collect data on their students with the purpose of assessing or predicting behaviors and academic performance, and the COVID-19–induced move to online education dramatically increases what can be accumulated in this way, raising concerns about students’ privacy. We focus on academic performance and ask whether the predictive performance obtained from such data can be matched with less privacy-invasive, but more task-specific, data. We draw on a unique dataset on a large student population containing both highly detailed measures of behavior and personality and high-quality, third-party-reported, individual-level administrative data. We find that models estimated using the big behavioral data are indeed able to accurately predict academic performance out of sample. However, models using only low-dimensional and arguably less privacy-invasive administrative data perform considerably better and, importantly, do not improve when we add the high-resolution, privacy-invasive behavioral data. We argue that combining big behavioral data with “ground truth” administrative registry data can ideally allow the identification of privacy-preserving task-specific features that can be employed instead of current indiscriminate troves of behavioral data, resulting in both better privacy and better prediction.</p>]]></description>
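            <!-- Illustrative Python sketch of the evaluation protocol only, on synthetic
                 data: compare out-of-sample R^2 for high-dimensional "behavioral" features
                 versus low-dimensional "administrative" features and their combination.
                 The data below are simulated (the outcome is built from the administrative
                 block), so the resulting ranking is illustrative, not a finding.

            import numpy as np
            from sklearn.linear_model import RidgeCV
            from sklearn.model_selection import cross_val_score

            rng = np.random.default_rng(1)
            n = 1000
            X_admin = rng.standard_normal((n, 5))      # stand-in administrative records
            X_behav = rng.standard_normal((n, 200))    # stand-in behavioral traces
            gpa = X_admin @ rng.standard_normal(5) + 0.5 * rng.standard_normal(n)

            for name, X in [("behavioral", X_behav),
                            ("administrative", X_admin),
                            ("combined", np.hstack([X_admin, X_behav]))]:
                r2 = cross_val_score(RidgeCV(), X, gpa, cv=5, scoring="r2")
                print(f"{name:>14}: out-of-sample R^2 = {r2.mean():.3f}")
            -->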
            <pubDate><![CDATA[Wed, 31 Mar 2021 00:00:00 GMT]]></pubDate>
        </item>
        <item>
            <title><![CDATA[Stable reliability diagrams for probabilistic classifiers]]></title>
            <media:thumbnail url="https://storage.googleapis.com/nova-demo-unsecured-files/unsecured/content-1765981191538-ebc75b4a-2963-431e-a9f7-595edf0f7b2a/cover.png"></media:thumbnail>
            <link>https://www.novareader.co/book/isbn/10.1073/pnas.2016191118</link>
            <description><![CDATA[<p class="para" id="N65542">Probabilistic classifiers assign predictive probabilities to binary events, such as rainfall tomorrow, a recession, or a personal health outcome. Such a system is reliable or calibrated if the predictive probabilities are matched by the observed frequencies. In practice, calibration is assessed graphically in reliability diagrams and quantified via the reliability component of mean scores. Extant approaches rely on binning and counting and have been hampered by ad hoc implementation decisions, a lack of reproducibility, and inefficiency. Here, we introduce the CORP approach, which uses the pool-adjacent-violators algorithm to generate optimally binned, reproducible, and provably statistically consistent reliability diagrams, along with a numerical measure of miscalibration based on a revisited score decomposition.</p><p class="para" id="N65539">A probability forecast or probabilistic classifier is reliable or calibrated if the predicted probabilities are matched by ex post observed frequencies, as examined visually in reliability diagrams. The classical binning and counting approach to plotting reliability diagrams has been hampered by a lack of stability under unavoidable, ad hoc implementation decisions. Here, we introduce the CORP approach, which generates provably statistically consistent, optimally binned, and reproducible reliability diagrams in an automated way. CORP is based on nonparametric isotonic regression and implemented via the pool-adjacent-violators (PAV) algorithm—essentially, the CORP reliability diagram shows the graph of the PAV-(re)calibrated forecast probabilities. The CORP approach allows for uncertainty quantification via either resampling techniques or asymptotic theory, furnishes a numerical measure of miscalibration, and provides a CORP-based Brier-score decomposition that generalizes to any proper scoring rule. We anticipate that judicious uses of the PAV algorithm yield improved tools for diagnostics and inference for a very wide range of statistical and machine learning methods.</p>]]></description>
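            <!-- Illustrative Python sketch: the PAV step behind a CORP reliability diagram
                 and the resulting Brier-score decomposition S = MCB - DSC + UNC. A minimal
                 version; CORP's uncertainty quantification (resampling or asymptotic
                 theory) is not shown.

            import numpy as np

            def pav(y):
                # Pool-adjacent-violators: least-squares nondecreasing fit to y.
                vals, cnts = [], []                     # block means and block sizes
                for v in y:
                    vals.append(float(v)); cnts.append(1)
                    while len(vals) > 1 and vals[-2] > vals[-1]:   # merge violating blocks
                        n1, n2 = cnts[-2], cnts[-1]
                        vals[-2:] = [(n1 * vals[-2] + n2 * vals[-1]) / (n1 + n2)]
                        cnts[-2:] = [n1 + n2]
                return np.repeat(vals, cnts)

            def corp(p, y):
                # Sort binary outcomes by forecast probability; the PAV fit gives the
                # recalibrated probabilities plotted in the CORP reliability diagram.
                order = np.argsort(p, kind="stable")
                ps, ys = p[order], y[order]
                cal = pav(ys)
                s = np.mean((ps - ys) ** 2)             # Brier score of the forecast
                s_cal = np.mean((cal - ys) ** 2)        # score after PAV recalibration
                s_ref = np.mean((ys.mean() - ys) ** 2)  # score of the climatology
                return s - s_cal, s_ref - s_cal, s_ref  # MCB, DSC, UNC (S = MCB - DSC + UNC)
            -->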
            <pubDate><![CDATA[Wed, 17 Feb 2021 00:00:00 GMT]]></pubDate>
        </item>
        <item>
            <title><![CDATA[Bayesian estimation of SARS-CoV-2 prevalence in Indiana by random testing]]></title>
            <media:thumbnail url="https://storage.googleapis.com/nova-demo-unsecured-files/unsecured/content-1765863720548-c27e9443-e9c0-47f6-9c85-284ea4af3232/cover.png"></media:thumbnail>
            <link>https://www.novareader.co/book/isbn/10.1073/pnas.2013906118</link>
            <description><![CDATA[<p class="para" id="N65542">Infection with the novel coronavirus, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has resulted in a worldwide pandemic of COVID-19 disease. Efforts to design local, regional, and national responses to the virus are constrained by a lack of information on the extent of the epidemic as well as inaccuracies in newly developed diagnostic tests. In this study, we analyze data from testing randomly selected Indiana state residents for infection or previous exposure to SARS-CoV-2 and derive estimates of the statewide COVID-19 prevalence in an attempt to address potential biases arising from nonresponse and diagnostic testing errors.</p><p class="para" id="N65539">From 25 to 29 April 2020, the state of Indiana undertook testing of 3,658 randomly chosen state residents for the novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the agent causing COVID-19 disease. This was the first statewide randomized study of COVID-19 testing in the United States. Both PCR and serological tests were administered to all study participants. This paper describes statistical methods used to address nonresponse among various demographic groups and to adjust for testing errors to reduce bias in the estimates of the overall disease prevalence in Indiana. These adjustments were implemented through Bayesian methods, which incorporated all available information on disease prevalence and test performance, along with external data obtained from a census of the Indiana statewide population. Both adjustments appeared to have a significant impact on the unadjusted estimates, mainly due to upweighting data from study participants of non-White races and Hispanic ethnicity and to anticipated false-positive and false-negative test results among both the PCR and antibody tests utilized in the study.</p>]]></description>
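            <!-- Illustrative Python sketch of the test-error adjustment only: a Bayesian
                 posterior for prevalence pi given k positives among n random tests, with
                 Beta posteriors for sensitivity and specificity integrated out by Monte
                 Carlo. All counts below are hypothetical, and the paper's nonresponse
                 weighting is not shown.

            import numpy as np

            rng = np.random.default_rng(0)

            def prevalence_posterior(k, n, se_ab, sp_ab, n_draws=2000):
                grid = np.linspace(1e-4, 0.5, 500)     # candidate prevalence values
                se = rng.beta(*se_ab, size=n_draws)    # sensitivity draws
                sp = rng.beta(*sp_ab, size=n_draws)    # specificity draws
                # P(test positive | pi, se, sp): true positives plus false positives.
                p_pos = grid[:, None] * se + (1 - grid[:, None]) * (1 - sp)
                loglik = k * np.log(p_pos) + (n - k) * np.log1p(-p_pos)
                w = np.exp(loglik - loglik.max())
                post = w.mean(axis=1)                  # average over se, sp draws
                return grid, post / post.sum()         # uniform prior on pi

            # n = 3,658 tested residents (from the study); k and the Beta parameters
            # are made-up illustrations, not the study's counts.
            grid, post = prevalence_posterior(k=60, n=3658, se_ab=(90, 10), sp_ab=(995, 5))
            print(f"posterior mean prevalence: {np.sum(grid * post):.2%}")
            -->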
            <pubDate><![CDATA[Wed, 13 Jan 2021 00:00:00 GMT]]></pubDate>
        </item>
    </channel>
</rss>