NeurIPS 2019 Outstanding Paper Award

Source: Deep Learning on Medium

NeurIPS 2019 has come to an end in Vancouver. The most noteworthy part of each year's event is the best paper awards. This year, however, the committee renamed them the Outstanding Paper Awards and applied the following criteria:

  • Potential to endure — Focused on the main game, not sidelines. Likely that people will still care about this in decades to come.
  • Insight — Provides new (and hopefully deep) understanding; does not just show a few percentage points of improvement.
  • Creativity / Unexpectedness / Wow — Looks at the problem in a creatively new way; a result that really surprises the reader.
  • Revolutionary — Will radically change the way people will think in the future.
  • Rigour — Unimpeachable rigour and care.
  • Elegance — Beautiful, clean, slick, polished.
  • Realism — Does not over-claim the significance.
  • Scientific — Actually falsifiable.
  • Reproducible — The results are actually reproducible; code available, and it works on a wide range of machines; data available; full detailed proofs.

They also agreed on some criteria that they would like to avoid:

  • Inefficient — Steering away from work that stands out only due to resource profligacy (achieving a higher league-table ranking largely by squandering huge resources).
  • Trendiness — An approach taken because an idea is fashionable, even though the problem could be addressed more effectively with other approaches.
  • Over-complicated — The paper engages in unnecessary complexity.

Finally, they determined it appropriate to introduce an additional Outstanding New Directions Paper Award, to highlight work that distinguished itself in setting a novel avenue for future research.

Distribution-Independent PAC Learning of Halfspaces with Massart Noise, by Ilias Diakonikolas, Themis Gouleakis, Christos Tzamos

We study the problem of distribution-independent PAC learning of halfspaces in the presence of Massart noise. Specifically, we are given a set of labeled examples (x, y) drawn from a distribution D on R^(d+1) such that the marginal distribution on the unlabeled points x is arbitrary and the labels y are generated by an unknown halfspace corrupted with Massart noise at noise rate η < 1/2. The goal is to find a hypothesis h that minimizes the misclassification error Pr_(x,y)~D[h(x) ≠ y]. We give a poly(d, 1/ε) time algorithm for this problem with misclassification error η + ε. We also provide evidence that improving on the error guarantee of our algorithm might be computationally hard. Prior to our work, no efficient weak (distribution-independent) learner was known in this model, even for the class of disjunctions. The existence of such an algorithm for halfspaces (or even disjunctions) has been posed as an open question in various works, starting with Sloan (1988) and Cohen (1997), and was most recently highlighted in Avrim Blum's FOCS 2003 tutorial.
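To make the setting concrete, here is a minimal sketch (not the paper's algorithm) of the Massart noise model the abstract describes: labels come from an unknown halfspace, but each label is flipped independently with an example-dependent probability η(x) that is bounded by η < 1/2. The target vector, the Gaussian marginal, and the particular choice of η(x) below are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, eta = 5, 10_000, 0.1  # dimension, sample count, Massart noise bound (illustrative)

# Hypothetical target halfspace w*: the clean label is sign(<w*, x>).
w_star = rng.normal(size=d)
X = rng.normal(size=(n, d))  # the marginal on x is arbitrary; Gaussian here for simplicity
clean = np.sign(X @ w_star)

# Massart noise: each label is flipped independently with probability
# eta(x) <= eta < 1/2; the flip rate may vary with x but never exceeds eta.
flip_prob = eta * np.abs(X[:, 0]) / (1 + np.abs(X[:, 0]))  # one example of eta(x) <= eta
flips = rng.random(n) < flip_prob
y = np.where(flips, -clean, clean)

def misclassification_error(w, X, y):
    """Pr[h(x) != y] for the halfspace h(x) = sign(<w, x>), estimated on the sample."""
    return np.mean(np.sign(X @ w) != y)

# The target halfspace itself incurs error equal to the realized flip rate,
# which is at most eta; the paper's algorithm guarantees error eta + eps.
print(misclassification_error(w_star, X, y))
```

Note that the learner sees only (X, y); the guarantee of error η + ε is remarkable precisely because neither w*, the marginal on x, nor η(x) is known.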

Panel Comment:
The paper studies the learning of linear threshold functions for binary classification in the presence of unknown, bounded label noise in the training data. It makes tremendous progress on a fundamental, long-standing open problem at the heart of machine learning, by deriving an efficient algorithm for learning halfspaces under Massart noise. To give a simple example highlighted in the paper, even weakly learning disjunctions (to error 49%) under 1% Massart noise was open. This paper shows how to efficiently achieve excess risk equal to the Massart noise level plus epsilon (and runs in time poly(1/epsilon), as desired). The algorithmic approach is sophisticated and the results are technically challenging to establish. The remaining goal is to efficiently achieve excess risk equal to epsilon alone (in time poly(1/epsilon)).

The Paper: