Academic talk: Surprise sampling: an optimal subsampling design (Associate Professor Wen Yu (郁文), Fudan University)
Published: 2018-11-05

Title: Surprise sampling: an optimal subsampling design


Abstract: Sampling for surprise is a working principle of efficient sampling, used among other purposes to reduce computational workload. A sample is deemed surprising if its pilot prediction error or its absolute score is large. Surprising samples are drawn with larger sampling probability, because they generally contain more information than non-surprising ones. Such sampling schemes are particularly useful when dealing with imbalanced data. Following this working principle, we propose a sampling design called surprise sampling, which caters to the specific forms of a variety of objectives. The estimation procedure is valid even if the model is misspecified and/or the pilot estimator is inconsistent. The proposed surprise sampling includes as a special case the local case-control sampling of Fithian and Hastie (2014), which achieves high efficiency through a clever adjustment that pertains only to the logistic model. The proposed estimator also performs no worse than that of Fithian and Hastie (2014) under the same model specification. We present theoretical justifications of the claimed advantages and optimality of the estimation and the sampling design. Numerical studies are carried out, and evidence in support of the theory is shown.
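For readers who want a concrete picture before the talk, the following is a minimal sketch of the working principle only, not the speaker's estimator. It assumes a logistic model on simulated imbalanced data, uses the pilot prediction error |y - p_pilot(x)| as the surprise measure, and corrects for the unequal sampling with a generic inverse-probability-weighted fit. The function names, the probability floor of 0.05, and the simulation settings are all illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def surprise_probabilities(X, y, pilot_theta, floor=0.05):
    """Sampling probability grows with the pilot prediction error.

    The floor (an assumption of this sketch) keeps the inverse
    weights bounded.
    """
    surprise = np.abs(y - sigmoid(X @ pilot_theta))
    return np.clip(surprise, floor, 1.0)

def weighted_logistic_fit(X, y, w, n_iter=25):
    """Newton's method for logistic regression with observation weights w."""
    theta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ theta)
        grad = X.T @ (w * (y - p))
        hess = (X * (w * p * (1.0 - p))[:, None]).T @ X
        theta = theta + np.linalg.solve(hess, grad)
    return theta

# Simulated imbalanced data (hypothetical settings, roughly 7% positives).
rng = np.random.default_rng(0)
n = 100_000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
true_theta = np.array([-3.0, 1.0])
y = rng.binomial(1, sigmoid(X @ true_theta)).astype(float)

# Pilot estimate from a small uniform subsample.
pilot_idx = rng.choice(n, size=5000, replace=False)
pilot_theta = weighted_logistic_fit(X[pilot_idx], y[pilot_idx], np.ones(5000))

# Keep each observation with its surprise-based probability,
# then refit with inverse-probability weights.
pi = surprise_probabilities(X, y, pilot_theta)
keep = rng.random(n) < pi
theta_hat = weighted_logistic_fit(X[keep], y[keep], 1.0 / pi[keep])

print("subsample size:", int(keep.sum()), "of", n)
print("estimate:", theta_hat)
```

In this sketch most of the retained observations are the rare positives and the negatives that the pilot model predicts poorly, while the weighting keeps the fit targeted at the full-data estimate. Inverse-probability weighting is chosen here because it does not rely on the logistic-specific adjustment mentioned in the abstract.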


Speaker: Associate Professor Wen Yu (郁文)

Time: 2018-11-14, 14:00

Venue: Room 302, Jinghui East Building (竞慧东楼)