Machine Learning Hero
Chapter 3: Data Preprocessing and Feature Engineering
- What is the purpose of data cleaning in data preprocessing?
- a) To improve model performance by transforming features
- b) To identify and handle missing data, remove duplicates, and correct errors
- c) To scale data to a consistent range
- d) To reduce the dimensionality of the dataset
- Which technique is typically used for handling missing data?
- a) One-hot encoding
- b) Data augmentation
- c) Imputation
- d) PCA
- Feature engineering involves which of the following?
- a) Creating new features from existing ones
- b) Reducing noise from the data
- c) Increasing the number of samples in the dataset
- d) Both a and b
- Why is it important to scale numerical features?
- a) To remove outliers from the dataset
- b) To ensure features with different ranges contribute equally to model performance
- c) To increase the size of the dataset
- d) To remove noise from the dataset
- What is a train-test split used for?
- a) Creating synthetic data samples
- b) Separating data into training and testing sets for model validation
- c) Increasing the number of features in the dataset
- d) Standardizing features to the same scale