Smith Hall
Nonparametric Identified Methods to Handle Nonignorable Missing Data
Update 4/25/2019: Location of this seminar has been moved to SMI 211.
Bayesian Hierarchical Modeling of Demographic and Climate Change Indicators
Bayesian hierarchical modeling is a powerful tool for demography and climate science. In this talk we will focus on its use for accounting for uncertainty about past demographic quantities in population projections. Since the 1940s, population projections have in most cases been produced using the deterministic cohort component method. However, in 2015, for the first time, in a major advance, the United Nations issued official probabilistic population projections for all countries based on Bayesian hierarchical models for total fertility and life expectancy.
Generalized Score Matching for Non-Negative Data
A common challenge in estimating parameters of probability density functions is the intractability of the normalizing constant. While in such cases maximum likelihood estimation (MLE) may be implemented using numerical integration, the approach becomes computationally intensive. In contrast, the score matching method of Hyvärinen (2005) avoids direct calculation of the normalizing constant and yields closed-form estimates for exponential families of continuous distributions on the m-dimensional Euclidean space R^m.
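As a toy illustration of the score matching idea (not code from the talk): for a zero-mean Gaussian with unknown precision theta, the empirical score matching objective is quadratic in theta, so the estimator has a closed form that never touches the normalizing constant. The model, sample size, and true precision below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Zero-mean Gaussian model p(x) proportional to exp(-theta * x**2 / 2) with
# unknown precision theta. The score psi(x) = d/dx log p(x) = -theta * x does
# not involve the normalizing constant. Hyvarinen's empirical objective,
#     J(theta) = mean(psi(x)**2 / 2 + psi'(x)) = mean(theta**2 * x**2 / 2 - theta),
# is quadratic in theta and is minimized in closed form at
#     theta_hat = 1 / mean(x**2).

def score_matching_precision(x):
    x = np.asarray(x, dtype=float)
    return 1.0 / np.mean(x ** 2)

x = rng.normal(scale=2.0, size=200_000)  # true precision = 1 / 2.0**2 = 0.25
theta_hat = score_matching_precision(x)
```

For richer exponential families the same objective stays quadratic in the natural parameters, which is the source of the closed-form estimates mentioned above.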
Green Dot Bystander Intervention Training
Green Dot is a movement, a program, and an action. The aim of Green Dot is to prevent and reduce sexual assault & relationship violence at UW by engaging students as leaders and active bystanders who step in, speak up, and interrupt potential acts of violence. The Green Dot movement is about gaining a critical mass of students, staff and faculty who are willing to do their small part to actively and visibly reduce power-based personal violence at UW.
Sequential changepoint detection for a network of Hawkes processes
Hawkes processes have been a popular point process model for capturing mutual excitation of discrete events. In the network setting, this can capture the mutual influence between nodes, which has a wide range of applications in neural science, social networks, and crime data analysis. In this talk, I will present a statistical changepoint detection framework to detect, in real time, a change in the influence using streaming discrete events.
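A univariate special case makes the model concrete: with a baseline rate mu and an exponentially decaying kernel, each event temporarily raises the intensity of future events. The sketch below (parameters are illustrative assumptions, not values from the talk) simulates such a process with Ogata's thinning algorithm.

```python
import numpy as np

def simulate_hawkes(mu, alpha, beta, t_max, seed=0):
    """Simulate a univariate Hawkes process with conditional intensity
    lambda(t) = mu + alpha * sum over past events t_i of exp(-beta * (t - t_i)),
    using Ogata's thinning algorithm."""
    rng = np.random.default_rng(seed)
    events = []
    t = 0.0
    while True:
        past = np.array(events)
        # Between events the intensity only decays, so its current value
        # is a valid upper bound for the thinning step.
        lam_bar = mu + alpha * np.exp(-beta * (t - past)).sum()
        t += rng.exponential(1.0 / lam_bar)
        if t >= t_max:
            break
        lam_t = mu + alpha * np.exp(-beta * (t - np.array(events))).sum()
        if rng.uniform() < lam_t / lam_bar:  # accept with probability lam_t / lam_bar
            events.append(t)
    return np.array(events)

# Subcritical parameters (alpha / beta < 1), so the process does not explode.
events = simulate_hawkes(mu=0.5, alpha=0.8, beta=1.2, t_max=200.0)
```

In the network version described above, each node carries its own baseline rate and the kernel is replaced by a matrix of cross-excitation weights; a change in those weights is what the detection procedure monitors.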
A likelihood ratio test for shape-constrained density functions
The celebrated Grenander (1956) estimator is the maximum likelihood estimator of a decreasing density function. In contrast to alternative nonparametric density estimators, the Grenander estimator does not require any smoothing parameters and is often viewed as a fully automatic procedure. However, the monotone density assumption might be questionable. While testing qualitative constraints such as monotonicity is difficult in general, we show that a likelihood ratio test statistic Kₙ has a surprisingly simple asymptotic null distribution.
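Computationally, the Grenander estimator is the left derivative of the least concave majorant of the empirical distribution function, and it can be obtained with the pool-adjacent-violators algorithm. A minimal sketch (assuming distinct nonnegative observations; not the speaker's code):

```python
import numpy as np

def grenander(x):
    """Grenander estimator of a nonincreasing density on [0, inf):
    the left derivative of the least concave majorant of the empirical CDF.
    Returns (breaks, heights); the estimate equals heights[i] on
    (breaks[i], breaks[i+1]]. Assumes distinct nonnegative observations."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    widths = np.diff(np.concatenate([[0.0], x]))  # spacings between order statistics
    slopes = (1.0 / n) / widths                   # raw slopes of the empirical CDF
    # Pool adjacent violators: merge neighboring blocks (weighted by width)
    # until the sequence of slopes is nonincreasing.
    vals, wts = [], []
    for s, w in zip(slopes, widths):
        vals.append(s)
        wts.append(w)
        while len(vals) > 1 and vals[-2] < vals[-1]:
            v2, w2 = vals.pop(), wts.pop()
            v1, w1 = vals[-1], wts[-1]
            wts[-1] = w1 + w2
            vals[-1] = (v1 * w1 + v2 * w2) / (w1 + w2)
    breaks = np.concatenate([[0.0], np.cumsum(wts)])
    return breaks, np.array(vals)

rng = np.random.default_rng(3)
breaks, heights = grenander(rng.exponential(size=500))
```

The returned step function is nonincreasing by construction and integrates to one, with no tuning parameter anywhere, matching the "fully automatic" character described above.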
Rerandomization and ANCOVA
Randomization is a basis for inferring treatment effects with minimal additional assumptions. Appropriately using covariates in randomized experiments will yield even more precise estimators. In his seminal work Design of Experiments, R. A. Fisher suggested blocking on discrete covariates in the design stage and conducting the analysis of covariance (ANCOVA) in the analysis stage. In fact, blocking can be embedded into a wider class of experimental designs called rerandomization, and the classical ANCOVA can be extended to more general regression-adjusted estimators.
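As a small illustration of the analysis-stage idea (simulated data with an illustrative effect size, not an example from the talk), the sketch below compares the unadjusted difference in means with a regression-adjusted estimator in the style of ANCOVA with a treatment-covariate interaction:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

# Simulated completely randomized experiment: x is a prognostic covariate,
# the treatment effect is a constant tau = 2 (all values are illustrative).
x = rng.normal(size=n)
z = rng.binomial(1, 0.5, size=n)                  # random treatment assignment
y = 1.0 + 2.0 * z + 3.0 * x + rng.normal(size=n)

# Unadjusted estimator: difference in means between arms.
tau_dm = y[z == 1].mean() - y[z == 0].mean()

# ANCOVA-style adjustment: OLS of y on treatment, the centered covariate,
# and their interaction (a regression-adjusted estimator).
xc = x - x.mean()
X = np.column_stack([np.ones(n), z, xc, z * xc])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
tau_adj = beta[1]
```

Both estimators are unbiased here, but the adjusted one removes the covariate-driven noise in y and so has a much smaller sampling variance.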
Randomized Experiments on Amazon’s Supply Chain
At Amazon’s Inventory Planning and Control Laboratory (IPC Lab) we run randomized controlled trials (RCTs) that evaluate the efficacy of in-production buying and supply chain policies on important business metrics. Our customers are leading supply chain researchers and business managers within Amazon, and our mission is to help them best answer the question, ‘Should I roll out my policy?’ In this talk we discuss how we navigate multiple obstacles to fulfilling our mission.
Model compression as constrained optimization, with application to neural nets
Deep neural nets have become in recent years a widespread practical technology, with impressive performance in computer vision, speech recognition, natural language processing and many other applications. Deploying deep nets in mobile phones, robots, sensors and IoT devices is of great interest. However, state-of-the-art deep nets for tasks such as object recognition are too large to be deployed in these devices because of the devices' limits on CPU speed, memory, bandwidth, battery life or energy consumption.
Causal Inference with Unmeasured Confounding: an Instrumental Variable Approach
Causal inference is a challenging problem because causation cannot be established from the observational data alone. Researchers typically rely on additional sources of information to infer causation from association. Such information may come from powerful designs such as randomization, or background knowledge such as information on all confounders. However, perfect designs or background knowledge required for establishing causality may not always be available in practice.
Testing One Hypothesis Multiple Times: a simple tool for generalized inference
The identification of new rare signals in data, the detection of a sudden change in a trend, and the selection of competing models are among the most challenging problems in statistical practice.
Manifold Data Analysis with Applications to High-Resolution 3D Imaging
Many scientific areas are faced with the challenge of extracting information from large, complex, and highly structured data sets. A great deal of modern statistical work focuses on developing tools for handling such data. In this work we present a new subfield of functional data analysis (FDA), which we call Manifold Data Analysis, or MDA. MDA is concerned with the statistical analysis of samples where one or more variables measured on each unit is a manifold, thus resulting in as many manifolds as we have units.
The adaptive immune system: a grand and beautiful stochastic process
Antibodies must recognize a great diversity of antigens to protect us from infectious disease. The binding properties of antibodies are determined by the sequences of their corresponding B cell receptors (BCRs). These BCR sequences are created in "draft" form by VDJ recombination, which randomly selects and deletes from the ends of V, D, and J genes, then joins them together with additional random nucleotides.
A Monte Carlo Approach for Computing Strategies with Convex Constraints
We consider a mathematical model for a financial market and a trader who wants to optimize, by suitable trading, the value of his or her portfolio. The constraint in this optimization is given by a convex functional known as a convex risk measure. We propose a Monte Carlo algorithm whose inputs are the joint law of the stock prices and the parameters of the convex risk measure, and whose outputs are the numerical values of the optimal trading strategy. We also prove the optimality of the output.
Recent Advances in Causal Modelling Using Directed Graphs
Statistics at Google
I will give a brief overview of statistics at Google, covering topics like experimentation, measuring long-term effects of treatments in an online system, Google Consumer Surveys and, if time allows, Causal Impact. I will also touch on how we handle big data at Google, what it's like to work here and some tips for statisticians on interviewing at companies like Google.
Functional Principal Component Analysis for Longitudinal Data Truncated by Event-Time
Both functional and longitudinal data are data recorded over a time period for each subject in the study. However, the approaches to analyze them are intrinsically different, partly due to the difference in the sampling plans. Functional data refer to situations where the entire trajectory is observed for each subject, or when measurements are recorded for each subject at a dense grid of time points. Longitudinal data, however, are often recorded intermittently, leading to varying measurement schedules and numbers of measurements across subjects.
Cosmology in the Era of Big Data: Understanding Our Universe a Bit at a Time
With the development of new detectors, telescopes and computational facilities, astrophysics has entered an era of data-intensive science. During the last decade, astronomers have surveyed the sky across many decades of the electromagnetic spectrum, collecting hundreds of terabytes of astronomical images for hundreds of millions of sources. Over the next decade, data volumes will reach tens of petabytes, and provide accurate measurements for billions of sources.
Some Challenges in Environmental Statistics
Environmental statistics is a rich field for statistical problems. I will sketch four different problem areas, all with very different approaches. The first one has to do with statistical assessment of air quality standards.
Starting from a classical Neyman-Pearson approach, recent work has moved into analysis of maxima of Gaussian processes. The second problem deals with estimating trends in extreme climate events.
Optimal Inference After Model Selection
To perform inference after model selection, we propose controlling the selective type I error; i.e., the error rate of a test given that it was performed. By doing so, we recover long-run frequency properties among selected hypotheses analogous to those that apply in the classical (nonadaptive) context. Our proposal is closely related to data splitting and has a similar intuitive justification, but is more powerful.
Risk and the Eighteenth Century French Lottery: Napoleon Meets his Chi-Square
Did Casanova practice risky sex? What did "Powerball" have to do with the Fall of the Bastille? Just how risk-averse was Robespierre? How did the sans-culottes lose their culottes? In the eighteenth century in France, citizens and royalty faced a multitude of risks, from sexually transmitted disease to decapitation. An unusual data source on the French Lottery provides a window on how financial risk was addressed in that tumultuous time, and how the emerging calculus of probabilities affected its perception.
Nonparametric Graphical Models: Foundation and Trends
We consider the problem of learning the structure of a non-Gaussian graphical model. We introduce two strategies for constructing tractable nonparametric graphical model families. One approach is through semiparametric extension of the Gaussian or exponential family graphical models that allows arbitrary graphs. Another approach is to restrict the family of allowed graphs to be acyclic, enabling the use of fully nonparametric density estimation in high dimensions.
Measures of niche overlap in Ecology
In Ecology, the niche of a species is usually defined as a multidimensional hypervolume in which a species maintains a viable population (Hutchinson 1957).
The community structure may be shaped by resource partitioning between co-occurring species, so quantifying the degree of this partitioning (i.e. niche overlap) is very important when studying species coexistence (Geange et al. 2010). The niche space is often described by multiple axes or variables.
Philosophy of Probability and its Relationship (?) to Statistics
I will explain why frequency statistics has absolutely nothing in common with the frequency philosophy of probability. If time permits, I will explain why Bayesian statistics has absolutely nothing in common with the subjective philosophy of probability. My presentation will be an unbiased estimator of the truth, with subjective probability 90%.
Causal Inference from Imperfect Studies with Nonignorable Treatment Assignment
This second lecture will focus on more sophisticated methods applicable when too few covariates are available to make it plausible that treatment assignment is ignorable (i.e., conditionally randomized given the covariates). The template setting involves randomized experiments with noncompliance where "use-effectiveness" (i.e., the effect of exposure to the treatment, not the effect of assignment to the treatment) is the estimand.
Computational and Statistical Convergence for Graph Estimation: A General Framework
The general theme of my research in recent years is spatiotemporal modeling and sparse recovery with high-dimensional data under measurement error. In this talk, I will discuss several computational and statistical convergence results on graph and sparse vector recovery problems. Our methods are applicable to many application domains such as neuroscience, geoscience and spatiotemporal modeling, genomics, and network data analysis. I will present theory, simulation and data examples. Part of this talk is based on joint work with Mark Rudelson.
Approximate Bayesian Inference and Optimal Design in the Sparse Linear Model
The sparse linear model, where latent parameters are endowed with a Laplace prior, has seen many successful applications in Statistics, Machine Learning, and Computational Biology, such as identification of gene regulatory networks from microarray expression data, or sparse coding of images with overcomplete basis sets. Prior work has either approximated Bayesian inference by expensive Markov chain Monte Carlo, or replaced it by point estimation. We show how to obtain a good approximation to Bayesian inference efficiently, using the Expectation Propagation method.
Latent IBP Compound Dirichlet Allocation: Sparse Topic Models Fit for Natural Languages
I will introduce the four-parameter IBP compound Dirichlet process (ICDP), a stochastic process that generates sparse nonnegative vectors with potentially an unbounded number of entries. If we repeatedly sample from the ICDP we can generate sparse matrices with an infinite number of columns and power-law characteristics. We apply the four-parameter ICDP to sparse nonparametric topic modelling to account for the very large number of topics present in large text corpora and the power-law distribution of the vocabulary of natural languages.
Random Effects Models for Social Network Analysis
Social network data often have a special dependence structure, since they usually contain information about the strength of an individual's relation (e.g., friendship) with more than one other person. In most cases, one of the research questions concerns the effect of personal attributes on the occurrence or strength of a relation. Thus, a (cross-)nested data structure is obtained which is suitable for analysis with multilevel models or with related random effects models.
On Using Predictive Models for Decisions
We often use predictive models to make a decision afterwards.
For instance, we might estimate the number of patients at a medical clinic and then designate resources to serve those patients.
The Strength of Evidence Found by Searching a Database
The United Kingdom Home Office holds approximately 1 million DNA profiles in its database of known offenders. Suppose that a partial DNA profile is recovered from the scene of a crime. The probability of drawing this profile from a randomly selected individual in England and Wales is estimated to be 1/1,000,000. The crime scene profile is compared with each profile in the offender database and is found to match the profile of one person, S. S was not in custody at the time the crime took place, but no other evidence linking S to the scene of the crime is found.
A Bayesian Multivariate Functional Dynamic Linear Model
We present a Bayesian approach for modeling multivariate, dependent functional data. To account for the three dominant structural features in the data (functional, time-dependent, and multivariate components), we extend hierarchical dynamic linear models for multivariate time series to the functional data setting. We also develop Bayesian spline theory in a more general constrained optimization framework.
A Flexible Framework for Bayesian Learning and Estimation Using Gaussian Graphical Models
In many areas of economic analysis, economic theory restricts the shape as well as other characteristics of functions used to represent economic constructs. Obvious examples are the monotonicity and curvature conditions that apply to utility, profit, and cost functions. Commonly, these regularity conditions are imposed either locally or globally. Here we extend and improve upon currently available estimation methods for imposing regularity conditions by imposing regularity on a connected subset of the regressor space.
Characterizing Selection Bias Using Experimental Data
Statistical Inference for Exponential-Family Random Graph Models with Additional Structure: Theoretical and Computational Advances
Models of network data have witnessed a surge of interest in statistics and related areas. Such data arise in the study of insurgent and terrorist networks, contact networks facilitating the spread of infectious diseases, social networks, the World Wide Web, and other areas.
A GCV Approach for Bandwidth Selection in Positron Emission Tomography Image Reconstruction
The problem of bandwidth estimation for smoothed least squares (SLS) image reconstruction, such as filtered backprojection (FBP) in Positron Emission Tomography (PET), has been extensively studied in the statistics literature. Here, I extend the generalized cross-validation (GCV) strategy for ridge regression (Golub et al., 1979) and develop it to determine the optimal smoothing parameter in FBP reconstruction. Results on eigendecomposition of symmetric one- and two-dimensional circulant matrices are derived.
New Algorithms for M-Estimation of Multivariate Location and Scatter
The talk starts with an overview of multivariate M-functionals of location and scatter, including symmetrized M-functionals of scatter. Then we discuss general properties of the underlying log-likelihood function. After that we review the currently known algorithms, fixed-point or iteratively reweighted moments. It is explained why these algorithms are intrinsically suboptimal. Then an alternative strategy, based on a "partial Newton" approach, is developed. Numerical examples and, if time permits, applications of M-estimators to Independent Component Analysis are presented.
Causal Inference with General Treatment Regimes: Generalizing the Propensity Score
In this talk, we develop the theoretical properties of the propensity function, which is a generalization of the propensity score of Rosenbaum and Rubin (1983).
Methods based on the propensity score have long been used for causal inference in observational studies; they are easy to use and can effectively reduce the bias caused by nonrandom treatment assignment. Although treatment regimes are often not binary in practice, propensity score methods are generally confined to binary treatment scenarios.
Kalman Filtering from an Optimization Perspective
We review the classical notion of Kalman filters for state estimation in dynamical systems. We then reformulate the estimation problem as an optimization problem and show how this perspective allows one to overcome many of the perceived barriers to extending the basic model to a wide range of novel settings. In particular, we show how to extend the model to nonlinear settings involving state constraints, non-Gaussian densities, outliers, sparsity, trend shifts, and state-dependent covariances.
Missing Data in Morphometrics
Morphometric data sets have not only the usual parameter structures (mean shape, sample covariance) but also other geometric functions of the mean form that can structure prior knowledge. When information from data is absent or weak, these auxiliary formalisms can supply reasonable "expectations" in a context similar to the classic EM alternating algorithm. On odd-numbered steps, population parameters are estimated by least squares or ML; on even-numbered steps, individual missing data are estimated.
Conquering the Complexity of Time: Mining from Big Time Series Data
Many emerging applications of big data involve time series data. In this talk, I will discuss a collection of machine learning and data mining approaches to effectively analyze and model large-scale time series and spatiotemporal data. Experiment results will be shown to demonstrate the effectiveness of our models in healthcare and climate applications.
Bayesian Nonparametric Modelling with the Dirichlet Process Regression Smoother
In this paper we discuss the problem of Bayesian fully nonparametric regression. The paper is concerned with two issues: 1) a new construction of priors for nonparametric regression is discussed and a specific prior, the Dirichlet Process Regression Smoother, is proposed, and 2) we consider the problem of centring a dependent nonparametric prior over a class of regression models and propose fully nonparametric regression models with flexible location structures. Computational methods are developed for all models described. Results are presented for simulated and actual data examples.
Some Log-Linear and Log-Nonlinear Models for Ordinal Scales with Midpoints, with an Application to Public Opinion on the Environment
No Seminar
No Seminar
Feature Selection Through Lasso: Model Selection Consistency and the BLasso Algorithm
Information technology advances are making data collection possible in most if not all fields of science and engineering and beyond. Statistics as a scientific discipline is challenged and enriched by the new opportunities resulting from these high-dimensional data sets. Often data reduction or feature selection is the first step towards solving these massive data problems. However, data reduction through model selection or l_0-constrained least squares optimization leads to a combinatorial search which is computationally infeasible for massive data problems.
Exploring dynamic complex systems using time-varying networks
Extracting knowledge and providing insights into the complex
Statistical Rules of Thumb
A statistical rule of thumb is defined as a widely applicable guide to statistical practice with a sound theoretical basis. Characteristics include intuitive appeal, elegance, and transparency. A rule states not only what is important but, by implication of what is not included, makes an assertion about what is less important.
This talk is based on the recently published book, Statistical Rules of Thumb, Wiley and Sons, March 2002.
A Population Background for Nonparametric Density-Based Clustering
Despite its popularity, the investigation of some theoretical aspects of clustering has been relatively sparse. One of the main reasons for this lack of theoretical results is surely the fact that, whereas for other statistical problems the theoretical population goal is clearly defined (as in regression or classification), for some of the clustering methodologies it is difficult to specify the population goal to which the data-based clustering algorithms should try to get close.
Georges Matheron, the Father of Geostatistics
Georges Matheron has been an enormously influential figure in defining the principles and basic methodology of what is considered Geostatistics, and for similarly (co)defining the field of Mathematical Morphology. And yet, while his name is now well-known, widespread recognition of his work in Geostatistics, at least in the English-speaking world, was late in coming. Most people know Geostatistics through the work of his students. I will briefly review Matheron's career and name some of his major contributions in Geostatistics and Mathematical Morphology.
Before-After-Control-Impact Analysis in Ecology
Before-After-Control-Impact (BACI) designs are used to study ecological responses in large experimental units (e.g., lakes, forests and mesocosms) for which replication is difficult or impossible. Two units are monitored over time; one unit receives an intervention at some intermediate time, while the other is left as an undisturbed control. The pre-intervention differences in the response between units are compared to the post-intervention differences, with a large disparity interpreted as evidence of an effect of the intervention.
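In its simplest form the BACI contrast is a difference of differences: the change in the impact unit beyond the change seen in the control unit. A toy computation with made-up numbers:

```python
import numpy as np

# Toy repeated measurements of a response on two units (illustrative
# numbers, not real data): the impact unit receives the intervention
# halfway through monitoring, the control unit is left undisturbed.
impact_before  = np.array([10.2,  9.8, 10.5, 10.1])
impact_after   = np.array([ 7.9,  8.3,  8.1,  7.8])
control_before = np.array([11.0, 10.6, 11.2, 10.9])
control_after  = np.array([10.8, 11.1, 10.7, 11.0])

# The BACI effect is the interaction: change in the impact unit minus
# change in the control unit (a difference of differences).
baci = (impact_after.mean() - impact_before.mean()) \
     - (control_after.mean() - control_before.mean())
# Here the impact unit drops by about 2.1 units more than the control.
```

Differencing against the control removes shared temporal trends, which is what makes the design usable without replicated experimental units.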
Nonparametric Estimation and Comparison for Networks
Scientific questions about networks are often comparative: we want to know whether the difference between two networks is just noise, and, if not, how their structures differ. I'll describe a general framework for network comparison, based on testing whether the distance between models estimated from separate networks exceeds what we'd expect based on a pooled estimate.
Statistical Foundations of Distance-Based Phylogenetics
We will review some of the popular methods for distance-based phylogeny reconstruction with a focus on the statistical theory underlying the methods. In particular, we discuss least squares interpretations of the minimum evolution principle and neighbor-joining, and connections to Felsenstein's quantitative character models.
Gibbs Sampling for Subsequence Resemblance: Application to Rhetorical Sequences
Impacts of Climate Change on Species Distributions: Empirical and Statistical Challenges
One of the greatest challenges ecologists face is predicting how climate change will affect the organisms with which we share our planet. Ecological theory predicts that species' current distributions are determined by their climatic niches (i.e., fitness as a function of climate). Statistical models relating species' geographic distributions to climate (SDMs, species distribution models) are therefore used to predict shifts in species distributions with climate change.
On the Geometry of Log-Linear Models
Recent advances in Algebraic Statistics have suggested a more general approach to the study of log-linear models that relies on the tools and language of algebraic and polyhedral geometry. In this talk, the problem of the existence of the Maximum Likelihood Estimate (MLE) of the cell mean vector of a contingency table, fundamental for assessment of fit, model selection and interpretation, is considered.
Geometric and combinatorial conditions for the existence of the MLE are given, by combining tools from polyhedral geometry and the theory of linear exponential families.
Exploring the Structure of Networks and Communities
Networks are all around us: social networks allow for information and influence flow through society, viruses become epidemics by spreading through networks, and networks of neurons allow us to think and function. With the recent technological advances and the development of online social media we can study networks that were once essentially invisible to us. In this talk we discuss how computational perspectives and machine learning models can be developed to abstract networked phenomena such as: How will a community or a social network evolve in the future?
Modeling Atmospheric Circulation Changes Over the North Pacific
A major difficulty in investigating the nature of atmospheric circulation changes over the North Pacific is the shortness of historical time series. An approach to this problem is through comparison of models. In this talk we contrast two stochastic models and a 'signal plus noise' model for the winter-averaged sea level pressure time series for the Aleutian low (the North Pacific (NP) index) and for air temperatures from Sitka, Alaska. The two stochastic models are a first-order autoregressive (AR(1)) model and a fractionally differenced (FD) model.
PDW Methods for Support Recovery in Nonconvex High-Dimensional Problems
The primal-dual witness (PDW) technique is a now-standard proof strategy for establishing variable selection consistency for sparse high-dimensional estimation problems when the objective function and regularizer are convex. The method proceeds by optimizing the objective function over the parameter space restricted to the true support of the unknown vector, then using a dual witness to certify that the resulting solution is also a global optimum of the unrestricted problem.
Logic Regression and Statistical Issues Related to the Protein Folding Problem
Advisors: Michael LeBlanc & Charles Kooperberg
Using Radical Environmentalist Texts to Uncover Network Structure and Network Features
In their efforts to call attention to environmental problems, communicate with like-minded groups, and mobilize support for their activities, radical environmentalist organizations produce an enormous amount of text. These texts, like radical environmental groups themselves, are often (i) densely connected and (ii) highly variable in advocated protest activities. Given a corpus of radical environmentalist texts, can one uncover the underlying network structure of environmental (and related leftist) groups?
Structure-Based Characteristics and Time Series Clustering
We propose a new method for clustering time series. A univariate time series can be represented by a fixed-length vector whose components are statistical features of the time series, capturing the global structure. These descriptive vectors are then clustered using standard fast clustering algorithms. A further search mechanism is used to find the best selection of features for a specific problem domain or data set. We demonstrate the effectiveness and simplicity of our proposed method by clustering some benchmark datasets with empirical results.
The Logic of Cause and Effect: Unifying Counterfactual, Graphical and Structural Models
A systematic handling of causality requires a mathematical language in which causal relationships receive symbolic representation, clearly distinct from statistical associations. Two such languages have been proposed in the past: path analysis and structural equation models, used extensively in economics and the social sciences, and the Lewis-Neyman-Rubin counterfactual (or potential-response) model, used sporadically in philosophy and statistics.
From Big Data to Precision Oncology using Machine Learning
While targeting key drivers of tumor progression (e.g., BCR/ABL, HER2, and BRAF-V600E) has had a major impact in oncology, most patients with advanced cancer continue to receive drugs that do not work in concert with their specific biology. This is exemplified by acute myeloid leukemia (AML), a disease for which treatments and cure rates (in the range of 20%) have remained stagnant. Effectively deploying an ever-expanding array of cancer therapeutics holds great promise for improving these rates but requires methods to identify how drugs will affect specific patients.
Modeling of Pathways and Regulatory Dynamics, With Applications to the Salt-Loving Extremophile Halobacterium NRC-1
I will describe various efforts that we at the Institute for Systems Biology have undertaken to model the pathways and dynamics of systems in organisms from yeast to human. I will focus on our system for network inference and modeling of the regulatory network of Halobacterium, an organism that thrives in hypersaline environments.
Bayesian Reconstruction of Two-Sex Populations by Age: Estimating Sex Ratios at Birth and Sex Ratios of Mortality
We will describe Bayesian population reconstruction, a recent method for estimating past populations by age for all countries, including developing countries where data on past populations are fragmentary and of variable quality. Such reconstructions are needed for the World Population Prospects, a comprehensive set of demographic statistics for all countries issued by the United Nations and updated every two years.
Copulas and Tail Dependence and Applications in Finance
Although the subject of copulas has a history going back to the 1950s, it is now enjoying a period of fashionability and much of this can be explained by new applications for the theory in the modelling of multivariate financial time series.
Copulas are a useful tool for building multivariate distributions with interesting "dependence structures" and, in particular, dependence structures that differ markedly from that of the multivariate normal distribution, which is still widely used in financial applications.
Identifiability of Linear Structural Equation Models
Structural equation models are multivariate statistical models that are defined by specifying noisy functional relationships among random variables. This talk treats the classical case of linear relationships and additive Gaussian noise terms. Each linear structural equation model is associated with a graph and corresponds to a polynomially parametrized set of positive definite covariance matrices.
Model Selection Procedures in Nonparametric Regression
Consider the regression model Y = g0(X) + E, where E is the error term and g0: R^k -> R is the unknown regression function to be estimated from independent observations of (X, Y). Furthermore, we have a countable collection of models (classes of candidate regression functions of finite VC dimension) of growing complexity. The larger the model, the better the approximation error, but the worse the estimation error. In order to balance both errors, we propose to estimate g0 by means of penalised least squares, where the penalty is proportional to the VC dimension of the model.
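A toy version of this bias-variance trade-off (with illustrative data and penalty constant, and plain model dimension standing in for the VC dimension): choose among nested polynomial models by minimizing empirical risk plus a complexity penalty.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy complexity-penalized least squares: nested polynomial models of
# growing dimension, with a penalty proportional to the number of
# parameters (a stand-in for the VC dimension). The data-generating
# model, noise level, and penalty constant lam are all illustrative.
n = 400
x = rng.uniform(-1.0, 1.0, size=n)
y = 1.0 - 2.0 * x + x ** 2 + rng.normal(scale=0.3, size=n)  # true degree: 2

def penalized_risk(degree, lam=0.01):
    X = np.vander(x, degree + 1)                  # polynomial design matrix
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.mean((y - X @ coef) ** 2)            # approximation (fit) error
    return rss + lam * (degree + 1)               # plus complexity penalty

best_degree = min(range(9), key=penalized_risk)
```

Models that are too small pay through the residual sum of squares; models that are too large pay through the penalty, so the minimizer lands at the true complexity.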
Latent-variable graphical modeling via convex optimization
Suppose we have a graphical model with sample observations of only a subset of the variables. Can we separate the extra correlations induced by marginalization over the unobserved, hidden variables from the structure among the observed variables? In other words, is it still possible to consistently perform model selection despite the unobserved, latent variables?
Survival Analysis and Length-Biased Sampling: An Application to Survival with Dementia
When survival data are collected as part of a prevalent cohort study, the recruited cases have already experienced their initiating event. These prevalent cases are then followed for a fixed period of time, at the end of which the subjects will either have failed or have been censored. When interest lies in estimating the survival distribution, from onset, of subjects with the disease, one must take into account that the survival times of the cases in a prevalent cohort study are left truncated.
The Walking Dog Model, Tetrad Differences, and Sibling Resemblance
In this talk, I will try to trace some of the ideas that led from Herbert Costner's early work with multiple indicator models to simple models of sibling resemblance in social and economic standing, and to more elaborate models that combine direct and indirect measurement of family influence.
Estimation of a Two-Component Mixture Model with Applications to Multiple Testing
We consider estimation and inference in a two-component mixture model where the distribution of one component is completely unknown. We develop methods for estimating the mixing proportion and the unknown distribution nonparametrically, given i.i.d. data from the mixture model. We use ideas from shape-restricted function estimation and develop "tuning parameter free" estimators that are easily implementable and have good finite sample performance. We establish the consistency of our procedures.
Regularized Covariance Matrix Estimation
I will review and discuss some of the different themes of regularized estimation of the population covariance matrix: 1. Why estimate it, and in what norm?
Profile Likelihood Estimation in Semi-Parametric Models
This talk presents an alternative profile likelihood estimation theory. By introducing a new parametrization, we improve on the seminal work of Murphy and van der Vaart (2000) in two ways: we prove the no-bias condition in a general semiparametric model context, and we deal with the direct quadratic expansion of the profile likelihood rather than an approximate one. In addition, we discuss a difficulty that we encounter in profile likelihood estimation.
Model-Based Clustering of Magnetic Resonance Data
In radiology, magnetic resonance imaging (MRI) and magnetic resonance spectroscopic imaging (MRSI) play an increasingly important role. However, the wealth of data available to the radiologist makes it more difficult to extract the relevant information. One way to summarise information from several congruent images is to show a segmented image, i.e. an image where pixels are clustered.
Partial Identification and Confidence Sets for Functionals of the Joint Distribution of "Potential Outcomes"
Authors: Yanqin Fan, Emmanuel Guerre, and Dongming Zhu
Querying Probabilistic Data
A major challenge in data management is how to manage uncertain data. Many reasons for the uncertainty exist: the data may be extracted automatically from text, it may be derived from the physical world such as RFID data, it may be integrated using fuzzy matches, or it may be the result of complex stochastic models. Whatever the reason for the uncertainty, a data management system needs to offer predictable performance for queries over large instances of uncertain data.
Gini Association and the Pseudo-Lorenz Curve
We were motivated by the problem of assessing the influence on the inequality in income of the corresponding inequality in some other related variable (say, the number of years of formal education completed). More generally, consider the pseudo-Lorenz curve of a nonnegative r.v. Y relative to (i.e., with respect to the ordering of) another related nonnegative r.v. X. It is shown that this pseudo-Lorenz curve L(Y/X) always lies above the Lorenz curve L(Y) of Y.
Modeling hierarchical variance with Kronecker structure, with application to quality measures in Medicare Advantage
Studying covariance matrices in hierarchical models can reveal meaningful relationships among variables, but these become difficult to interpret as the number of variables grows. Conventional factor analysis reduces the dimension by mapping onto a set of one-dimensional factors, but does not accommodate variables with a cross-classified layout. For such applications, we develop hierarchical models with Kronecker-product (separable) covariance structure at the second level.
Point Process Models for Astronomy: Quasars, Coronal Mass Ejections, and Solar Flares
I will present my dissertation research, which consisted of the statistical analysis of two interesting astronomical applications involving point process data.
Local Discriminant Bases and Their Applications
For signal and image classification problems, such as those in medical or geophysical diagnostics and military applications, extracting relevant features is one of the most important tasks. As an attempt to automate the feature extraction procedure and to understand what the critical features for classification are, we developed the so-called local discriminant basis (LDB) method, which rapidly selects an orthonormal basis suitable for signal/image classification problems from a large collection of orthonormal bases (e.g., wavelet packets and local trigonometric bases).
Novel Approaches to Snowball / Respondent-Driven Sampling That Circumvent the Critical Threshold
Web crawling, snowball sampling, and respondent-driven sampling (RDS) are three types of network-driven sampling techniques that are popular when it is difficult to contact individuals in the population of interest. This talk will first review previous research showing that if participants refer too many other participants, then under the standard Markov model in the RDS literature, the standard approaches do not provide "square root n" consistent estimators. In fact, there is a critical threshold beyond which the design effect of network sampling grows with the sample size.
De Finetti's Ultimate Failure
The most scientific and least controversial claim of de Finetti's subjective philosophy of probability is that the rules of Bayesian inference can be derived from a system of axioms for rational decision making that does not presuppose the existence of probability. In fact, de Finetti's argument is fatally flawed. The error is irreparable. The slides in PowerPoint and PDF are available at http://www.math.washington.edu/~burdzy/Philosophy/.
Regular Variation and Extremes in Atmospheric Science
Dependence in the tail of the distribution can differ from that in the bulk of the distribution. A basic tenet of a univariate extreme value analysis is to discard the bulk of the data and analyze only the data considered to be extreme. This is true for multivariate problems as well. We will first introduce a framework for describing tail dependence: the probabilistic framework of regular variation, which has strong ties to classical extreme value theory.
Statistical Factor Models and Predictive Approaches for Problems of Molecular Characterisation
I will discuss aspects of data analysis and modelling arising from a number of clinical studies that aim to integrate gene expression, and other forms of molecular data, into predictive modelling of clinical outcomes and disease states. Some of our work on empirical and model-based approaches to defining underlying factor structure in large-scale expression data, and on the use of estimated factors in predictive regression and classification tree models, will be reviewed.
Nonstationary Time Series Modeling and Estimation with Applications in Oceanography
This talk will focus on nonstationary time series, from both a methodological and an applied perspective. On the methodology side, I will discuss new stochastic models for capturing structure in bivariate data by representing the series as complex-valued. This representation allows for novel ways of capturing features that are multiscale, anisotropic and/or nonstationary. I will also present new methodology and theory for maximum likelihood inference in the frequency domain, specifically by providing a method for removing estimation error from the Whittle likelihood.
UPS Delivers Optimal Phase Diagram for High Dimensional Variable Selection
Consider the linear regression model Y = Xβ + z, z ~ N(0, I_n), X = X_{n,p}, where both p and n are large but p > n. The vector β is unknown but is sparse in the sense that only a small proportion of its coordinates are nonzero, and we are interested in identifying these nonzero ones. We model the coordinates of β as samples from a two-component mixture (1 - ε)ν0 + επ, and the rows of X as samples from N(0, (1/n)Ω), where ν0 is the point mass at 0, π is a distribution, and Ω is a p × p correlation matrix which is unknown but is presumably sparse.
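As a hedged illustration (not code from the talk), the sparse regression model just described can be simulated directly; the values of n, p, and ε are made up, Ω is taken to be the identity, and π is chosen to be N(1, 0.2²):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, eps = 100, 500, 0.05                       # p > n; sparsity level (illustrative)

# beta_j ~ (1 - eps) * nu0 + eps * pi, with nu0 a point mass at 0 and pi = N(1, 0.2^2)
beta = np.where(rng.random(p) < eps, rng.normal(1.0, 0.2, size=p), 0.0)

X = rng.normal(scale=1 / np.sqrt(n), size=(n, p))  # rows ~ N(0, (1/n) * Omega), Omega = I
z = rng.normal(size=n)                             # z ~ N(0, I_n)
Y = X @ beta + z

support = np.flatnonzero(beta)                   # the nonzero coordinates to be recovered
```

A variable-selection procedure would then be judged by how well it recovers `support` from (X, Y) alone.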
Nonhomogeneous Hidden Markov Models for Downscaling Synoptic Atmospheric Patterns to Precipitation Amounts
Advisors: Peter Guttorp & Jim Hughes
Low rank tensor completion
Many problems can be formulated as recovering a low-rank tensor. Although an increasingly common task, tensor recovery remains a challenging problem because of the delicacy associated with the decomposition of higher-order tensors. We investigate several convex optimization approaches to low-rank tensor completion.
Statistics at Google
This presentation will describe some of the problems faced and methods used by statisticians at Google:
• A primary dimension of search quality is the relevance of search results to the search query. Preference rank allows us to convert pairwise comparisons into a ranking of search results.
• Through the AdSense program, Google delivers targeted advertising on third-party web sites, which we refer to as publishers. Publisher scores are a method of ranking publishers by their effectiveness as an ad delivery platform.
MS Thesis Presentation: Simple Transformation Techniques for Improved Nonparametric Regression
In this paper, the authors propose and investigate two new methods for achieving less bias in nonparametric regression. They use simulations to compare the bias, variance, and mean squared error of the second, preferred method to those of the local constant, local linear, and local cubic nonparametric regression estimators. The two new methods have bias of order h^4, where h is the estimator's smoothing parameter, in contrast to the basic kernel estimator's bias of order h^2.
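For reference, the baseline local-constant (Nadaraya-Watson) estimator whose O(h^2) bias the thesis improves on can be written in a few lines; the test function, Gaussian kernel, and bandwidth below are illustrative choices:

```python
import numpy as np

def local_constant(x0, x, y, h):
    # Nadaraya-Watson estimate at points x0: kernel-weighted average of the y's
    w = np.exp(-0.5 * ((x0[:, None] - x[None, :]) / h) ** 2)  # Gaussian kernel weights
    return (w @ y) / w.sum(axis=1)

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 400)
y = np.sin(np.pi * x) + rng.normal(scale=0.1, size=400)

grid = np.linspace(-0.5, 0.5, 11)
fit = local_constant(grid, x, y, h=0.1)   # pointwise bias shrinks like h^2 as h -> 0
```

Halving h here roughly quarters the bias, at the cost of higher variance; the thesis methods reduce the bias order to h^4 instead.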
A SMART Stochastic Algorithm for Nonconvex Optimization
We show how to transform any optimization problem that arises from fitting a machine learning model into one that (1) detects and removes contaminated data from the training set and (2) simultaneously fits the trimmed model on the remaining uncontaminated data. To solve the resulting nonconvex optimization problem, we introduce a fast stochastic proximal-gradient algorithm that incorporates prior knowledge through nonsmooth regularization.
Identification of Minimal Sets of Covariates for Matching Estimators
The availability of large observational databases allows empirical scientists to consider estimating treatment effects without conducting costly and/or unethical experiments in which the treatment would be randomized. The Neyman-Rubin model (potential outcome framework) and the associated matching estimators have become increasingly popular because they allow for the nonparametric estimation of average treatment effects.
Bayesian Models for Integrative Genomics
Novel methodological questions are being generated in the biological sciences, requiring the integration of different concepts, methods, tools and data types. Bayesian methods that employ variable selection have been particularly successful for genomic applications, as they can handle situations where the number of measured variables is much greater than the number of observations. In this talk I will focus on models that integrate experimental data from different platforms together with prior knowledge.
Markov Random Fields and Issues of Computation
Markov random fields are extremely useful and generally applicable for probabilistic modelling of a wide range of systems. We'll review methods for performing inference calculations (most likely configuration and marginal probabilities) on MRFs. Unfortunately, for many tasks, these basic calculations are computationally infeasible.
We'll discuss the limitations of standard computation methods and the graph-theoretic properties related to computational complexity.
Prior Adjusted Default Bayes Factors for Testing (In)Equality Constrained Hypotheses
Bayes factors have proven to be very useful when testing statistical hypotheses with inequality (or order) constraints and/or equality constraints between the parameters of interest. Two useful properties of the Bayes factor are its intuitive interpretation as the relative evidence in the data for one hypothesis over another, and the fact that it can straightforwardly be used for testing multiple hypotheses. The choice of the prior, which reflects one's knowledge about the unknown parameters before observing the data, has a substantial effect on the Bayes factor.
Probabilistic Weather Forecasting Using Bayesian Model Averaging
Probabilistic forecasts of wind vectors are becoming critical as interest grows in wind as a clean and renewable source of energy, in addition to a wide range of other uses, from aviation to recreational boating. Unlike other common forecasting problems, which deal with univariate quantities, statistical approaches to wind vector forecasting must be based on bivariate distributions. The prevailing paradigm in weather forecasting is to issue deterministic forecasts based on numerical weather prediction models.
Ergodic Limit Laws for Stochastic Optimization Problems
Department of Mathematics Optimization Seminar
Solution procedures for stochastic programming problems, statistical estimation problems (constrained or not), stochastic optimal control problems and other stochastic optimization problems often rely on sampling. The justification for such an approach passes through 'consistency.' A comprehensive, satisfying and powerful technique is to obtain the consistency of the optimal solutions, statistical estimators, controls, etc., as a consequence of the consistency of the stochastic optimization problems themselves.
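The consistency idea can be illustrated with the simplest sample-average approximation; the distribution of Z and the sample size below are arbitrary choices for the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
z = rng.normal(loc=2.0, scale=1.0, size=100_000)   # samples of Z

# Sample-average approximation: replace E[(x - Z)^2] by its sample average.
# The minimizer of the sampled problem is the sample mean, which converges to
# the minimizer of the true problem, argmin_x E[(x - Z)^2] = E[Z] = 2.
x_hat = z.mean()
```

Consistency of the approximating optimization problems (the sampled objectives converge to the true objective) delivers consistency of their minimizers, here x_hat → 2.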
Using Single-Cell Transcriptome Sequencing to Infer Olfactory Stem Cell Fate Trajectories
Single-cell transcriptome sequencing (scRNA-Seq), which combines high-throughput single-cell extraction and sequencing capabilities, enables the transcriptomes of large numbers of individual cells to be assayed efficiently.
Nonparametric Estimation of a Convex Bathtub-Shaped Hazard Function
In the analysis of lifetime data, a key object of interest is the hazard function, or instantaneous failure rate. One natural assumption is that the hazard is bathtub- or U-shaped (i.e., first decreasing, then increasing). In particular, this is often the case in reliability engineering or human mortality.
MS Thesis Presentation: Hierarchical Mixture of Experts and Applications
HME (hierarchical mixture of experts) is a tree-structured architecture for supervised learning. It is characterized by soft multiway probabilistic splits, generally based on linear functions of the input values, and by linear or logistic fits at the terminal nodes (called experts in the HME literature), rather than the constant functions used in CART. The statistical model underlying HME is a hierarchical mixture model, which allows for maximum likelihood estimation of the parameters using EM methods.
Graph Structured Signal Processing
Signal processing on graphs is a framework for nonparametric function estimation and hypothesis testing that generalizes spatial signal processing to heterogeneous domains. I will discuss the history of this line of research, highlighting common themes and major advances. I will introduce various graph wavelet algorithms and highlight any known approximation-theoretic guarantees. Recently, it has been determined that the fused lasso is theoretically competitive with wavelet thresholding under some conditions, meaning that the fused lasso is also a locally adaptive smoothing procedure.
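As a toy instance of signal processing on a graph, here is quadratic Laplacian-penalized smoothing on a path graph; this is the simplest smoother in the family, not one of the wavelet or fused-lasso methods of the talk, and the signal, noise level, and penalty weight are made-up:

```python
import numpy as np

n = 100
# combinatorial Laplacian of the path graph on n nodes
L = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
L[0, 0] = L[-1, -1] = 1   # boundary nodes have degree 1

rng = np.random.default_rng(0)
signal = np.where(np.arange(n) < n // 2, 0.0, 1.0)   # piecewise-constant truth
y = signal + rng.normal(scale=0.3, size=n)           # noisy observations on the nodes

# minimize ||x - y||^2 + lam * x' L x  =>  x = (I + lam L)^{-1} y
lam = 5.0
x = np.linalg.solve(np.eye(n) + lam * L, y)
```

The quadratic penalty blurs the jump; locally adaptive procedures such as the fused lasso replace x'Lx with an l1 penalty on edge differences to preserve it.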
Bootstrap and Subsampling for Non-Stationary Spatial Data
Subsampling and bootstrap methods have been suggested in the literature to nonparametrically estimate the variance and distribution of statistics computed from spatial data. Usually stationary data are required to ensure that the methods work. However, in empirical applications the assumption of stationarity often must be rejected. This talk presents consistent bootstrap and subsampling methods to estimate the variance and distributions of statistics based on nonstationary spatial lattice data. Applications to forestry are also discussed.
Controlling False Discovery Rate Via Knockoffs
In many fields of science, we observe a response variable together with a large number of potential explanatory variables, and would like to be able to discover which variables are associated with the response, while controlling the false discovery rate (FDR) to ensure that our results are reliable and replicable. The knockoff filter is a variable selection procedure for linear regression, proven to control FDR exactly under any type of correlation structure in the regime where n > p (sample size > number of variables).
Point Process Transformations and Applications to Wildfire Data
This talk will review some ways of transforming point processes, including smoothing, thinning, superposition, rescaling, and tessellation. Ways in which each of these may be used in the analysis of point process data will be examined, especially in relation to the problem of estimating wildfire hazard. We will explore in particular an important computational geometry problem involving tessellations, namely the estimation of point locations from piecewise constant image data via Dirichlet tessellation inversion.
Flexible, Reliable, and Scalable Nonparametric Learning
Applications of statistical machine learning increasingly involve datasets with rich hierarchical, temporal, spatial, or relational structure.
Bayesian nonparametric models offer the promise of effective learning from big datasets, but standard inference algorithms often fail in subtle and hard-to-diagnose ways. We explore this issue via variants of a popular and general model family, the hierarchical Dirichlet process.
Estimation of the Relative Risk and Risk Difference
I will first review well-known differences between odds ratios, relative risks and risk differences. These results motivate the development of methods, analogous to logistic regression, for estimating the latter two quantities. I will then describe simple parametrizations that facilitate maximum likelihood estimation of the relative risk and risk difference. Further, these parametrizations allow for doubly robust g-estimation of both quantities. (Joint work with James Robins, Harvard School of Public Health)
Curve Fitting and Neuron Firing Patterns
Reversible-jump Markov chain Monte Carlo may be used to fit scatterplot data with cubic splines having unknown numbers of knots and knot locations. Key features of the implementation my colleagues and I have investigated are (i) a fully Bayesian formulation that puts priors on the spline coefficients and (ii) Metropolis-Hastings proposal densities that attempt to place knots close to one another. Simulation results indicate this methodology can produce fitted curves with substantially smaller mean squared error than competing methods.
On generalizations of the log-linear model
Relational models generalize log-linear models for multivariate categorical data in three respects: the sample space does not have to be a Cartesian product of the ranges of the variables, the effects allowed in the model do not have to be associated with cylinder sets, and the existence of an overall effect present in every cell is not assumed. After discussing examples that motivate these generalizations, the talk will consider estimation and testing in relational models.
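For orientation, the classical log-linear model of independence that relational models generalize fits a two-way contingency table with cell means m_ij = (row total × column total) / grand total; the 2×3 table below is a made-up example that happens to satisfy independence exactly:

```python
import numpy as np

table = np.array([[10., 20., 30.],
                  [20., 40., 60.]])   # hypothetical 2x3 table of counts

# Independence model  log m_ij = u + u_i + u_j :
# the MLE fitted counts are m_ij = (row_i total) * (col_j total) / grand total
fitted = np.outer(table.sum(axis=1), table.sum(axis=0)) / table.sum()
```

Relational models drop the Cartesian-product sample space and the overall effect u, so the fitted cell means no longer take this simple product form.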
Survey of Generalized Inverses and Their Use in Stochastic Modelling
In many stochastic models, in particular Markov chains in discrete or continuous time and Markov renewal processes, a Markov chain is present either directly or indirectly through some form of embedding. The analysis of many problems of interest associated with these models, e.g. stationary distributions, moments of first passage time distributions and moments of occupation time random variables, often concerns the solution of a system of linear equations involving I - P, where P is the transition matrix of a finite, irreducible, discrete-time Markov chain.
MS Thesis Presentation: A Non-Parametric Approach for Handling Repeated Measures in Cancer Experiments
In longitudinal studies, the usual modeling assumptions for multivariate analyses don't always hold. One way to address this is to use nonparametric approaches. In the paper I will present, the authors analyzed tumor volume in rats as a function of lipids in their diet. The data were highly heteroscedastic and strongly correlated with time. To compare lipid diets, randomization F-tests were used. Then, local polynomial smoothing was used to create tumor growth curves for each diet, as well as confidence intervals that account for the serially correlated data.
Yule's "Nonsense Correlation" Solved!
In this talk, I will discuss how I recently resolved a long-standing open statistical problem, formulated by the British statistician Udny Yule in 1926: to mathematically prove Yule's 1926 empirical finding of "nonsense correlation." We solve the problem by analytically determining the second moment of the empirical correlation coefficient of two independent Wiener processes. Using tools from Fredholm integral equation theory, we calculate the second moment of the empirical correlation to obtain a value for the standard deviation of the empirical correlation of nearly 0.5.
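The magnitude of Yule's effect is easy to reproduce by simulation: the empirical correlation of two independent random walks (discrete approximations to independent Wiener processes) is wildly variable, with standard deviation close to the nearly-0.5 value derived in the talk. Path length and replication count below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
n_steps, n_reps = 500, 2000
corrs = np.empty(n_reps)
for i in range(n_reps):
    # two independent random walks, approximating independent Wiener processes
    w1 = np.cumsum(rng.normal(size=n_steps))
    w2 = np.cumsum(rng.normal(size=n_steps))
    corrs[i] = np.corrcoef(w1, w2)[0, 1]

mean_corr = corrs.mean()   # near 0, as it should be under independence
sd_corr = corrs.std()      # close to the analytic value of nearly 0.5
```

So a single observed correlation of, say, 0.6 between two such series carries essentially no evidence of a real relationship.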
Non-Stationary Analysis and Radial Localisation in 2D
Image analysis has in the last decade experienced a revolution via the development of new tools for the representation and analysis of local image features. At the heart of these developments is the construction of suitable local representations of structure, via decompositions in a set of localized functions. The chosen decomposition then forms the setting for further analysis and/or estimation methods. In particular, compression of a given representation ensures that most decomposition coefficients are of negligible magnitude, and this often simplifies the analysis considerably.
Clustering Based on Non-Parametric Density Estimation: A Proposal
Cluster analysis based on nonparametric density estimation represents an approach to the clustering problem whose roots date back several decades, but only in recent times could this approach actually be developed. The talk presents one proposal within this approach, among the few that have been brought to the operational stage.
Overdetermined Estimating Equations with Applications to Panel Data
Panel data have important advantages over purely cross-sectional or time-series data in studying many economic problems, because they contain information about both the intertemporal dynamics and the individuality of the entities being investigated. A commonly used class of models for panel studies identifies the parameters of interest through an overdetermined system of estimating equations. Two important problems that arise in such models are the following: (1) it may not be clear a priori whether certain estimating equations are valid.
Optimal Design of Experiments in the Presence of Network Interference
Causal inference research in statistics has been largely concerned with estimating the effect of treatment (e.g.
personalized tutoring) on outcomes (e.g., test scores) under the assumption of "lack of interference"; that is, the assumption that the outcome of an individual does not depend on the treatment assigned to others. Moreover, whenever its relevance is acknowledged (e.g., in study groups), interference is typically dealt with as an uninteresting source of variation in the data.
Two Related Problems Involving Gaussian Markov Random Fields
Gaussian Markov random fields (GMRFs) have been around for a long time; however, it is only in recent years that their computational benefits in Bayesian inference have become clear. In this talk, I'll discuss two related problems involving GMRFs. The first is the problem of constructing Gaussian fields on triangulated manifolds. By viewing this as finding the solution of a stochastic partial differential equation (SPDE), GMRFs appear as the solutions when the SPDE is solved using the "finite element" approach.
Computationally-Intensive Inference in Molecular Population Genetics
Modern molecular genetics generates extensive data which document the genetic variation in natural populations. Such data give rise to challenging statistical inference problems, both for the underlying evolutionary parameters and for the demographic history of the population. These problems are of considerable practical importance and have attracted recent attention, with the development of algorithms based on importance sampling (IS) and Markov chain Monte Carlo (MCMC).
A Bayesian information criterion for singular models
We consider approximate Bayesian model choice for model selection problems that involve models whose Fisher-information matrices may fail to be invertible along other competing submodels.
Such singular models do not obey the regularity conditions underlying the derivation of Schwarz's Bayesian information criterion (BIC), and the penalty structure in BIC generally does not reflect the frequentist large-sample behavior of their marginal likelihood.
Assessment of Scaling in High Frequency Data: Convex Rearrangements in the Wavelet Domain
We overview the notion of regular scaling in data and estimators of this regular scaling on several examples involving high-frequency measurements. Next we discuss the importance of wavelet domains and the ability of wavelets to precisely estimate regular scaling.
Statistical Problems in Large Networks
Natural modeling of large networks leads to exponential models with sufficient statistics being such things as the number of triangles or the degree sequence. These look like standard problems, but some surprises have emerged. For some models, it is possible to estimate n parameters based on a sample of size one. For other models, with two parameters, maximum likelihood is inconsistent. Many of these models show phase transitions. The new tools required include the emerging theory of graph limits. This is joint work with Sourav Chatterjee and Allan Sly.
From safe screening rules to working sets for faster Lasso-type solvers
Convex sparsity-promoting regularizations are now ubiquitous for regularizing inverse problems in statistics, signal processing, and machine learning. By construction, they yield solutions with few nonzero coefficients. This point is particularly appealing for working set (WS) strategies, an optimization technique that solves simpler problems by handling small subsets of variables, whose indices form the WS. Such methods involve two nested iterations: the outer loop corresponds to the definition of the WS and the inner loop calls a solver for the subproblems.
Random Effects Graphical Regression Models for Biological Monitoring Data
An emerging area of research in ecology is the analysis of functional species assemblages. In essence, the analysis of functional assemblages is concerned with determining and predicting the composition of individuals categorized by different life history traits instead of strict taxon names. We propose a state-space model for the analysis of multiple trait compositions along with site-specific covariate information. A site-specific random effects term allows for modeling extra variability, including spatial variability, in trait compositions.
Computational Considerations on Neuroengineering
Neuroengineering is an emerging interdisciplinary field with the goal of developing effective, robust devices that interact with the nervous system. These devices may act in closed loop with the nervous system to augment, repair, or even replace aspects of its basic function. Neuroengineering presents a set of interesting computational challenges that may require diverse solutions. For instance, how do we perform efficient computations on large quantities of neural data with severely limited computing resources?
The Covariance Structure of Circular Ranks
The linear representation of order statistics is a random permutation matrix which can be applied to obtain the usual covariance structure of ranks and other induced order statistics. In this talk, the algebraic structure of the standard case will be identified and extended to the ordering of observations indexed by circular, uniformly spaced coordinates. Such data are characteristic, for example, of corneal curvature maps used to assess regular astigmatism in the optics of the human eye.
Causal Discovery with Confidence Using Invariance Principles
What is interesting about causal inference?
One of the most compelling aspects is that any prediction under a causal model is valid in environments that may be very different from the environment used for inference. For example, variables can be actively changed and predictions will still be valid and useful. This invariance is very useful but still leaves open the difficult question of inference. We propose to turn this invariance principle around and exploit the invariance for inference.
Estimating Common Functional Principal Components in a Linear Mixed Effects Model Framework
The emerging area of statistical science known as functional data analysis is concerned with evaluating information on curves or functions. In recent years much of the research emphasis has focused on extending statistical methods from classical settings into the functional domain. For example, functional principal component analysis (FPCA) is analogous to traditional PCA, except that the observed data are entire functions rather than multivariate vectors.
Constrained Nonparametric Estimation via Mixtures, with an Application in Cancer Genetics
We discuss modeling probability measures constrained to a convex set. We represent measures in such sets as mixtures of simple, known extreme measures, so the problem of estimating a constrained measure becomes one of estimating an unconstrained mixing measure. Such convex constraints arise in many modeling situations, such as empirical likelihood and modeling under stochastic ordering constraints.
Robust Inference Using Higher Order Influence Functions
Suppose we obtain n i.i.d. copies of a random vector O with unknown distribution F(θ), θ ∈ Θ. Our goal is to construct honest 100(1 - α)% asymptotic confidence intervals (CIs), whose width shrinks to zero with increasing n at the fastest possible rate, through higher-order influence functions, for a functional ψ(θ) in a model that places no restrictions on F other than, perhaps, bounds on both the L_p norms and the roughness (more generally, the complexity) of certain density and conditional expectation functions.
"Insurance" Against Incorrect Inference after Variable Selection
Among statisticians, variable selection is a common and very dangerous activity. This talk will survey the dangers and then propose two forms of insurance to guarantee against the damages from this activity.
Confidence Sets for Phylogenetic Trees
