Training BERT #3 - Next Sentence Prediction (NSP)

Next sentence prediction (NSP) is one half of the training process behind the BERT model (the other being masked language modeling, MLM). Where MLM teaches BERT to understand relationships between words, NSP teaches BERT to understand relationships between sentences. In the original BERT paper, the ablation study found that removing NSP hurt performance on downstream benchmarks such as QNLI, MNLI, and SQuAD, so it matters.

Now, when we use a pre-trained BERT model, training with NSP and MLM has already been done, so why do we need to know about it? Because we can further pre-train these pre-trained BERT models so that they better understand the language used in our specific use cases, and to do that we can use both MLM and NSP.

So, in this video, we'll go into depth on what NSP is, how it works, and how we can implement it in code.

Training with NSP
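To make the idea concrete, here is a minimal sketch of a single NSP forward pass using Hugging Face's transformers library. The checkpoint name ("bert-base-uncased") and the sentence pair are illustrative assumptions, not taken from the original material.

```python
# Minimal NSP sketch with Hugging Face transformers.
# Assumption: the "bert-base-uncased" checkpoint and the example
# sentences are illustrative choices, not from the original article.
import torch
from transformers import BertTokenizer, BertForNextSentencePrediction

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForNextSentencePrediction.from_pretrained("bert-base-uncased")

sentence_a = "The cat sat on the windowsill."
sentence_b = "It watched the birds in the garden below."

# Encoding a sentence pair builds [CLS] A [SEP] B [SEP] and sets
# token_type_ids to 0 for segment A and 1 for segment B.
inputs = tokenizer(sentence_a, sentence_b, return_tensors="pt")

# Label 0 means sentence B really follows sentence A ("IsNext");
# label 1 means B is a random sentence ("NotNext").
labels = torch.LongTensor([0])

outputs = model(**inputs, labels=labels)
print(outputs.loss)                    # NSP classification loss
print(outputs.logits.argmax().item())  # 0 = IsNext, 1 = NotNext
```

During pre-training, pairs are sampled so that roughly half the time sentence B is the true next sentence and half the time it is a random sentence from the corpus, and the NSP loss above is minimized alongside the MLM loss.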