K-Adapter: Infusing Knowledge into Pre-Trained Models with Adapters
Read on arXiv
One Sentence Abstract
K-Adapter, a framework for injecting multiple types of knowledge into pre-trained models such as RoBERTa, keeps the original parameters fixed and uses neural adapters as plug-ins, demonstrating improved performance on relation classification, entity typing, and question answering tasks.
Simplified Abstract
Researchers are working on improving pre-trained language models such as BERT and RoBERTa, tools that help machines understand language. However, when such a model is asked to learn new knowledge by updating its parameters, it can forget what it previously learned. To solve this problem, the researchers developed a new approach called K-Adapter.
K-Adapter is like a toolbox that adds extra parts to the model when it needs to learn new things. These extra parts, or "adapters," work like plug-ins that connect to the model without changing its original parts. This way, the model can learn new things without forgetting what it already knew.
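To make the plug-in idea concrete, here is a minimal sketch in PyTorch (not the authors' actual implementation; the class name AdapterLayer and the bottleneck size are illustrative) of what one such adapter could look like: a small bottleneck network that reads the frozen model's hidden states and adds its own learned features on top of them.

```python
# Minimal illustrative sketch of a knowledge adapter: a small bottleneck
# network whose output is added to the frozen backbone's hidden states.
import torch
import torch.nn as nn

class AdapterLayer(nn.Module):
    """Only these parameters are trained; the backbone stays untouched."""
    def __init__(self, hidden_size=1024, bottleneck=128):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)   # project down
        self.up = nn.Linear(bottleneck, hidden_size)     # project back up
        self.act = nn.GELU()

    def forward(self, hidden_states):
        # Residual connection: the original representation is preserved,
        # and the adapter only adds knowledge-specific features on top.
        return hidden_states + self.up(self.act(self.down(hidden_states)))
```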
In this study, the researchers used K-Adapter with RoBERTa to teach it two different types of knowledge: factual knowledge from aligned text and triplets on Wikipedia and Wikidata, and linguistic knowledge about how words are connected in sentences.
The results showed that with K-Adapter, the model performed better on tasks such as classifying relations between entities, identifying the type of an entity, and answering questions. The researchers also found that K-Adapter captures a wider range of knowledge than the original model.
K-Adapter's code is now available for others to use and build on, which could lead to even better and more accurate language understanding tools in the future.
Study Fields
Main fields:
- Natural Language Processing (NLP)
- Knowledge Infusion
- Pre-trained Models
Subfields:
- K-Adapter Framework Development
- RoBERTa as Backbone Model
- Neural Adapters for Infused Knowledge
- Distributed Training
- Factual Knowledge (from text-triplets on Wikipedia and Wikidata)
- Linguistic Knowledge (via dependency parsing)
- Relation Classification
- Entity Typing
- Question Answering
- Performance Improvements
- Knowledge Versatility
- Code Availability
Study Objectives
- Investigate the problem of injecting knowledge into large pre-trained models like BERT and RoBERTa
- Review existing methods, which typically update the original parameters of pre-trained models when injecting knowledge
- Address the problem that previously injected knowledge is flushed away when multiple kinds of knowledge are injected this way
- Propose K-Adapter, a framework that retains the original parameters of pre-trained models and supports versatile knowledge-infused models
- Utilize RoBERTa as the backbone model and develop neural adapters for each kind of infused knowledge
- Train adapters efficiently in a distributed way, with no information flow between them (a training sketch follows this list)
- Inject two kinds of knowledge in a case study: factual knowledge from automatically aligned text-triplets on Wikipedia and Wikidata, and linguistic knowledge from dependency parsing
- Evaluate K-Adapter's performance on three knowledge-driven tasks: relation classification, entity typing, and question answering
- Demonstrate that each adapter improves performance and the combination of both adapters brings further improvements
- Show through analysis that K-Adapter captures more versatile knowledge than RoBERTa
- Release the code publicly at https://github.com/microsoft/k-adapter
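As a rough illustration of the frozen-backbone setup described above, the sketch below (assuming PyTorch and the Hugging Face Transformers API; the adapter shape, learning rate, and the helper knowledge_features are illustrative, not the authors' code) freezes RoBERTa and optimizes only a small adapter, which is what allows adapters for different kinds of knowledge to be trained independently of one another.

```python
# Illustrative sketch: freeze the RoBERTa backbone and train only an adapter,
# so adapters for different kinds of knowledge can be trained independently.
import torch
import torch.nn as nn
from transformers import RobertaModel

backbone = RobertaModel.from_pretrained("roberta-large")
for p in backbone.parameters():
    p.requires_grad = False  # original pre-trained parameters stay fixed

hidden = backbone.config.hidden_size
adapter = nn.Sequential(          # simple bottleneck stand-in for one adapter
    nn.Linear(hidden, 128), nn.GELU(), nn.Linear(128, hidden)
)
optimizer = torch.optim.AdamW(adapter.parameters(), lr=1e-4)

def knowledge_features(input_ids, attention_mask):
    # The backbone runs without gradients; only the adapter's parameters learn
    # from a knowledge-specific pre-training task (e.g. relation prediction).
    with torch.no_grad():
        states = backbone(input_ids=input_ids,
                          attention_mask=attention_mask).last_hidden_state
    return states + adapter(states)  # adapter features added residually
```

Because the backbone never changes, a factual adapter and a linguistic adapter trained this way cannot interfere with each other or with what the original model already knows.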
Conclusions
- The paper addresses the problem that, because existing methods inject knowledge by updating the original parameters of pre-trained models such as BERT and RoBERTa, previously injected knowledge is flushed away when multiple kinds of knowledge are injected.
- The authors propose K-Adapter, a framework that keeps the original parameters of the pre-trained model fixed and attaches a neural adapter for each kind of infused knowledge, allowing the adapters to be trained efficiently in a distributed way.
- As a case study, the authors inject two types of knowledge: factual knowledge from automatically aligned text-triplets on Wikipedia and Wikidata, and linguistic knowledge via dependency parsing.
- K-Adapter's results on three knowledge-driven tasks show that each adapter improves performance, and combining both adapters brings further improvements (a combination sketch follows this list).
- The authors also find that K-Adapter captures more versatile knowledge than the base RoBERTa model.
- The code for K-Adapter is available publicly at https://github.com/microsoft/k-adapter
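To illustrate how the two adapters might be used together on a downstream task, here is a hedged sketch in PyTorch (the class name CombinedClassifier and the plain concatenation are assumptions for illustration; the paper's exact fusion may differ): the frozen backbone's output and the features from the factual and linguistic adapters are concatenated and fed to a task-specific classifier.

```python
# Hypothetical sketch of combining both adapters for a downstream task:
# concatenate the frozen backbone's features with the factual and linguistic
# adapter features, then classify (e.g. relation classification labels).
import torch
import torch.nn as nn

class CombinedClassifier(nn.Module):
    def __init__(self, hidden_size=1024, num_labels=2):
        super().__init__()
        # backbone + factual adapter + linguistic adapter -> 3 * hidden_size
        self.classifier = nn.Linear(3 * hidden_size, num_labels)

    def forward(self, backbone_out, factual_out, linguistic_out):
        features = torch.cat([backbone_out, factual_out, linguistic_out], dim=-1)
        return self.classifier(features)
```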





