The document discusses weak supervision: training machine learning models from noisy, unreliable labels. Instead of assigning definitive labels, many labeling functions each cast a (possibly abstaining) vote per example; these functions can be rules, user reviews, model predictions, or other heuristics. A generative model then learns the accuracy of each labeling function and combines their outputs into probabilistic labels. The technique can achieve good results with only a small number of ground-truth labels, though without those labels it is hard to evaluate individual labeling functions or tune their influence. The document advocates creating many diverse labeling functions to take full advantage of weak supervision.
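As a minimal sketch of the idea (plain Python, not Snorkel's actual API; the labeling functions and the majority-vote combiner here are invented for illustration):

```python
# Hypothetical weak-supervision sketch: several noisy labeling
# functions (LFs) vote on each example; a simple majority-vote
# combiner turns the votes into one probabilistic label.
# ABSTAIN means the LF has no opinion on that example.
ABSTAIN, NEG, POS = -1, 0, 1

def lf_contains_refund(text):   # rule-based heuristic
    return POS if "refund" in text.lower() else ABSTAIN

def lf_short_message(text):     # weak structural heuristic
    return NEG if len(text.split()) < 3 else ABSTAIN

def lf_exclamation(text):       # another noisy signal
    return POS if "!" in text else ABSTAIN

LFS = [lf_contains_refund, lf_short_message, lf_exclamation]

def probabilistic_label(text):
    """Majority vote over non-abstaining LFs -> P(label == POS)."""
    votes = [lf(text) for lf in LFS if lf(text) != ABSTAIN]
    if not votes:
        return 0.5              # no evidence: uninformative prior
    return sum(v == POS for v in votes) / len(votes)

print(probabilistic_label("I want a refund!"))  # two POS votes -> 1.0
```

In Snorkel proper, the majority vote is replaced by a generative label model that estimates each LF's accuracy from the pattern of agreements and disagreements, without ground-truth labels.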
20. ... or maybe there is?
https://hazyresearch.github.io/snorkel/blog/superglue.html
“In many datasets, especially in real-world applications, there are
subsets of the data that our model underperforms on, or that we care
more about performing well on than others”
“... expert-heads are used to learn slice-specific representations. Then, an
attention mechanism is learned over expert heads to determine when and
how to combine these representations.”
yay!
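The quoted mechanism can be sketched in a few lines (hypothetical shapes and names, not the actual slice-based-learning code): each expert head emits a slice-specific representation, and a softmax over per-slice confidence scores decides how to mix them.

```python
import numpy as np

# Hypothetical sketch of "attention over expert heads": combine
# slice-specific representations into one, weighted by a softmax
# over per-slice confidence logits.
def combine_expert_heads(expert_reprs, slice_logits):
    # expert_reprs: (n_slices, d) slice-specific representations
    # slice_logits: (n_slices,) confidence that the example is in each slice
    weights = np.exp(slice_logits) / np.exp(slice_logits).sum()  # softmax
    return weights @ expert_reprs  # convex combination, shape (d,)
```

With equal logits this reduces to averaging the heads; a confident slice indicator pulls the representation toward that slice's expert.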
22. In the low-density setting, sparsity of labels will mean that
there is limited room for even an optimal weighting of the
labeling functions to diverge much from the majority vote.
Conversely, as the label density grows, the majority vote will
eventually be optimal. It is the middle-density regime where
we expect to most benefit from applying the generative model.
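The quote's point can be made concrete with a toy example (invented numbers, not from the paper): once LF accuracies are known or learned, the natural combiner weights each vote by its log-odds, log(acc / (1 - acc)), so one highly accurate LF can outvote several mediocre ones, which unweighted majority vote cannot do.

```python
import math

# Toy illustration: weighted vote (log-odds weights from known
# accuracies) vs. plain majority vote over +1/-1 LF outputs.
def majority_vote(votes):
    return 1 if sum(votes) > 0 else -1

def weighted_vote(votes, accuracies):
    score = sum(v * math.log(a / (1 - a)) for v, a in zip(votes, accuracies))
    return 1 if score > 0 else -1

# Two mediocre LFs (55% accurate) say -1; one strong LF (95%) says +1.
votes = [+1, -1, -1]
accs  = [0.95, 0.55, 0.55]
print(majority_vote(votes))         # -1: every vote counts equally
print(weighted_vote(votes, accs))   # +1: the accurate LF dominates
```

With very few overlapping labels (low density) the two combiners rarely disagree, and with very many votes the majority wins out anyway; the gap above is widest in the middle regime the quote describes.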
31. https://hazyresearch.github.io/snorkel/
🏊 Dive deep into snorkel
🏊‍♀️
Current research
Stream-line your labelling
Sea what it is all about
Betta e-fish-in-sea
Scale-able oppo-tuna-ty
The results shore are unbebreathable
32. You shoald at least breefly
mullet over despite its
fishues and crab-e-ats