
Mostafa Dehghani
Google DeepMind
Topics: multi-task learning, transformers, pretraining, convolutions, adapters, hypernetworks, parameter-efficient fine-tuning
2 presentations · 8 views
Presentations

Parameter-efficient Multi-task Fine-tuning for Transformers via Shared Hypernetworks
Rabeeh Karimi Mahabadi and 3 other authors

Are Pretrained Convolutions Better than Pretrained Transformers?
Yi Tay and 6 other authors