Natural Language Processing with Python Updated Edition

Chapter 3: Feature Engineering for NLP

Section 5 of 8

  1. What does TF-IDF stand for?
     a) Term Frequency-Inverse Document Frequency
     b) Text Frequency-Inverse Data Frequency
     c) Token Frequency-Indexed Data Frequency
     d) Term Frequency-Indexed Document Frequency

  2. Which model is based on predicting context words given a target word, or predicting a target word given context words?
     a) TF-IDF
     b) Bag of Words
     c) Word2Vec
     d) BERT

  3. What is a key advantage of BERT over traditional word embeddings like Word2Vec and GloVe?
     a) BERT is simpler to implement.
     b) BERT generates context-aware embeddings.
     c) BERT is based on frequency counts.
     d) BERT uses a smaller model size.

  4. Which library is commonly used to implement BERT embeddings in Python?
     a) scikit-learn
     b) nltk
     c) transformers
     d) gensim