NLP with Transformers: Fundamentals and Core Applications, Chapter 95

5. Step 2: Loading and Exploring the Dataset


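For this project, we’ll use the IMDb dataset, which contains 50,000 movie reviews labeled as positive or negative, split evenly into 25,000 training and 25,000 test examples. The dataset itself is binary (label 0 is negative, label 1 is positive), so a three-class setup with a neutral label would require a different or additionally annotated dataset.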

Code Example: Load Dataset

from datasets import load_dataset

# Load IMDb dataset (downloaded from the Hugging Face Hub on first use)
dataset = load_dataset("imdb")

# Split dataset into train and test sets
train_data = dataset["train"]
test_data = dataset["test"]

# Display an example review
example = train_data[0]
print(f"Review: {example['text']}")
print(f"Label: {'Positive' if example['label'] == 1 else 'Negative'}")
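
The example above loads the data but only prints a single review. As a minimal sketch of the "exploring" half of this step, the helper below counts labels and computes rough review lengths for one split. The function name `summarize_split`, the `LABEL_NAMES` mapping, and the toy reviews are illustrative assumptions, not part of the original project; lengths are whitespace word counts, not tokenizer token counts.

```python
from collections import Counter
from statistics import mean, median

# Assumed label mapping, matching the 0 = negative / 1 = positive convention used above
LABEL_NAMES = {0: "Negative", 1: "Positive"}


def summarize_split(texts, labels):
    """Return label counts and word-length statistics for one dataset split."""
    label_counts = Counter(LABEL_NAMES[label] for label in labels)
    # Whitespace word count: a rough proxy for length, not the tokenizer's count
    lengths = [len(text.split()) for text in texts]
    return {
        "num_examples": len(lengths),
        "label_counts": dict(label_counts),
        "mean_words": mean(lengths),
        "median_words": median(lengths),
        "max_words": max(lengths),
    }


# Toy stand-in reviews (hypothetical) so the sketch runs without downloading anything
sample_texts = [
    "A wonderful, moving film.",
    "Dull and far too long.",
    "Great cast, great script, great fun.",
]
sample_labels = [1, 0, 1]

summary = summarize_split(sample_texts, sample_labels)
print(summary)

# With the real data loaded, the same helper could be called as:
# summarize_split(train_data["text"], train_data["label"])
```

Checking the label balance and the spread of review lengths before training helps when choosing a maximum sequence length later, since long reviews may need to be truncated to fit the model's input limit.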