Chapter 3: Feature Engineering for NLP

b) Text Frequency-Inverse Data Frequency

c) Token Frequency-Indexed Data Frequency

d) Term Frequency-Indexed Document Frequency

Which model is based on predicting context words given a target word or predicting a target word given context words?

a) TF-IDF

b) Bag of Words

c) Word2Vec

d) BERT

What is a key advantage of BERT over traditional word embeddings like Word2Vec and GloVe?

a) BERT is simpler to implement.

b) BERT generates context-aware embeddings.

c) BERT is based on frequency counts.

d) BERT uses a smaller model size.