
BERT Model by Google AI



Bidirectional Encoder Representations from Transformers (BERT) is a transformer-based machine learning technique for natural language processing (NLP) pre-training developed by Google.


Jacob Devlin and his Google colleagues developed and released BERT in 2018. Google said in 2019 that it had started using BERT in its search engine, and by the end of 2020 it was being used in nearly all English-language searches.


BERT is a transformer language model that was pre-trained in a self-supervised fashion on a large corpus of English data. This means it was pre-trained on the raw texts only, with no human labeling, and an automatic process was used to generate inputs and labels from those texts (which explains why it can use the large amount of data that is readily available to the public).

BERT was pre-trained on two tasks:

1. Masked language modeling (MLM): given a sentence, 15% of the words in the input are randomly masked, the entire masked sentence is run through the model, and the model has to predict the masked words. This allows the model to learn a bidirectional representation of the sentence. BERT was the first deeply bidirectional, unsupervised language representation pre-trained using only a plain-text corpus.


2. Next sentence prediction (NSP): during pre-training, the model takes two masked sentences concatenated as a single input. Sometimes the two sentences were next to one another in the original text, and sometimes they were not. The model must then determine whether or not the two sentences followed one another.
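
The two pre-training tasks above can be sketched in a few lines of plain Python. This is a simplified illustration, not BERT's actual data pipeline. The function names `mask_tokens` and `make_nsp_pairs` are hypothetical, tokens are whitespace-split words rather than WordPiece tokens, each word is masked independently with probability 0.15 (so the masked share is only 15% on average), every selected word is replaced by `[MASK]`, and the 50/50 split between true and random next sentences is an assumption of this sketch.

```python
import random

MASK = "[MASK]"  # placeholder string used for masked words in this sketch


def mask_tokens(tokens, mask_prob=0.15, rng=None):
    """Masked language modeling: hide some words and keep them as labels.

    Returns (masked_tokens, labels). labels[i] holds the original word
    where position i was masked, and None everywhere else.
    """
    rng = rng or random.Random()
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(MASK)   # the model sees the mask ...
            labels.append(tok)    # ... and must predict the original word
        else:
            masked.append(tok)
            labels.append(None)   # nothing to predict at this position
    return masked, labels


def make_nsp_pairs(sentences, rng=None):
    """Next sentence prediction: build (sentence_a, sentence_b, is_next) triples.

    About half of the pairs use the true next sentence (is_next=True);
    the rest use another sentence from the same list (is_next=False).
    """
    rng = rng or random.Random()
    pairs = []
    for i in range(len(sentences) - 1):
        # Candidates for a negative example: anything but a itself or its true successor.
        others = [j for j in range(len(sentences)) if j not in (i, i + 1)]
        if rng.random() < 0.5 or not others:
            pairs.append((sentences[i], sentences[i + 1], True))
        else:
            pairs.append((sentences[i], sentences[rng.choice(others)], False))
    return pairs


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs give the same result
    words = "the quick brown fox jumps over the lazy dog".split()
    print(mask_tokens(words, rng=rng))

    text = ["BERT was released in 2018.",
            "It is pre-trained on raw text.",
            "No human labels are needed.",
            "It can then be fine-tuned."]
    for pair in make_nsp_pairs(text, rng=rng):
        print(pair)
```

In both functions the labels come from the text itself, which is what makes the pre-training self-supervised: no human has to annotate anything.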


Applications of BERT

Sequence-to-sequence based language generation tasks such as:

  • Question answering

  • Summarization

  • Response generation

NLP Tasks

  • Sentiment classification

  • Word sense disambiguation

This article was written by Chepkorir Diana from the AI Class of 2022. You can learn more about AI models by registering for the AI Picodegree course at HURU School.
