Learning with Lower Information Costs

Sivan Sabato

In this talk I will consider learning with lower information costs, focusing on linear regression. Linear regression is one of the most widely used methods for prediction and forecasting, with applications in many fields, such as the natural sciences, economics, and medicine. I will show how to reduce the information costs of linear regression in two settings. First, I will present a new estimation algorithm for the standard supervised regression setting. This is the first efficient estimator that enjoys minimax optimal sample complexity, up to log factors, for general heavy-tailed distributions. The technique is general and can be applied to a larger class of smooth and strongly convex losses. Second, I will consider the challenge of using crowdsourcing for labeling in tasks that usually require experts, and show how to achieve this using linear regression combined with a feature multi-selection approach.

Based on joint work with Daniel Hsu and Adam Kalai.

Sivan Sabato is a post-doctoral researcher at Microsoft Research New England. Her main research interests are in statistical machine learning theory and its applications. Sivan received her M.Sc. in Computer Science from the Technion, and her Ph.D. in Computer Science from the Hebrew University of Jerusalem. She is an alumna of the Adams fellowship program for outstanding Ph.D. students, and has been awarded several honors, including the Wolf Prize for an outstanding M.Sc. thesis, the Google Anita Borg Scholarship, and the Intel Excellence Award.
