Bommasani et al 2021 – On the Opportunities and Risks of Foundation Models
From bommasani21_oppor_risks_found_model and the Gradient post: Has AI found a new foundation?
A massive report (actually an anthology of reports) on the current (in 2021) state of NLP (actually, Transformer-ology) research.
The intro is a good read for getting to know the landmark papers in the field. The overview covers most of the main talking points.
1. opening arguments
- What is a foundation model? It's the current dominant paradigm in research: pretrain a huge model (typically a Transformer) on broad data, then fine-tune it on some downstream task (a minimal sketch of this follows the list)
- e.g. Devlin et al 2018 - BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
- this trend is characterized by:
- homogenization – it's now a given that nearly any applied ML paper is going to build on this approach
- emergence – this approach succeeds because of emergent properties of the trained weights. In contrast to manual feature engineering, all the power of these foundation models is implicitly specified by the training task
- What could go wrong? These models are being used in many downstream domains – medicine, law, etc. – where we would want guarantees on fairness and the like
- What could go wrong? The industry players poised to make the biggest research gains in this field are incentivized to ignore externalities that would be detrimental to the public, e.g. unfairness
- Instead, academia should take the lead in development
- However, training these models is now beyond the capacity of most academic computing centers
- So, the government should fund public research to provide academia with these resources
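To make the pretrain-then-fine-tune paradigm concrete, here is a minimal sketch of fine-tuning BERT for binary classification, assuming the Hugging Face transformers library and PyTorch. The toy sentiment data, label count, and hyperparameters are illustrative choices of mine, not from the report.

```python
# Pretrain-then-fine-tune in miniature: load pretrained BERT weights,
# attach a fresh classification head, take a few gradient steps.
import torch
from torch.optim import AdamW
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # classification head is randomly initialized
)

# Toy sentiment examples standing in for a real downstream dataset.
texts = ["a delightful read", "a tedious slog"]
labels = torch.tensor([1, 0])

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
optimizer = AdamW(model.parameters(), lr=2e-5)

model.train()
for _ in range(3):  # a few steps; a real run iterates over epochs of a dataset
    outputs = model(**batch, labels=labels)  # passing labels makes it return a loss
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    print(f"loss: {outputs.loss.item():.4f}")
```

The design point: the ~110M pretrained weights of bert-base-uncased are reused wholesale and only the small classification head is new, which is why one cheap fine-tuning run can amortize one expensive pretraining run across many downstream tasks.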