Classification Algorithms II Class 12 Questions Answers
Objective Type Questions
1. What are K-Nearest Neighbors used for?
a) Regression
b) Classification
c) Both of the above
d) None of the above
Answer: c) Both of the above
2. Which of the following statements is false?
a) K-Nearest Neighbors uses proximity for prediction.
b) K-Nearest Neighbors takes a majority vote on K data points.
c) K-Nearest Neighbors can’t be used for regression.
d) None of the above
Answer: c) K-Nearest Neighbors can’t be used for regression.
3. What are some of the advantages of K-NN?
a) Easy to interpret
b) No extra training step
c) Lots of memory needed
d) Both a and b
Answer: d) Both a and b
4. K-NN works well with imbalanced data sets.
a) True
b) False
Answer: b) False
5. Cross-validation helps us do which of the following?
a) Remove bias
b) Tune parameters
c) Both a and b
Answer: c) Both a and b
Standard Questions
1. Write a short note on the advantages of using K-Nearest Neighbors.
K-Nearest Neighbors (KNN) offers several advantages:
It is easy to understand and use, and because it has no separate training phase, it can make predictions on new data immediately. It is versatile enough to handle both classification and regression tasks, and with a suitably chosen K it is reasonably robust to noisy data and outliers.
As a nonparametric method, it makes no assumptions about the underlying data distribution, and it handles multiclass classification naturally.
2. How does K-Nearest Neighbors work internally?
KNN computes the distance (usually Euclidean distance) between a query point and every point in the training set, then selects the K training points closest to the query as its nearest neighbors.
For classification, the query point is assigned the majority class among its K nearest neighbors; for regression, the prediction is the average of the target values of those same K neighbors.
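A minimal sketch of this procedure in Python follows; the tiny dataset, the K value of 3, and the helper name knn_predict are illustrative assumptions rather than material from the text.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, query, k=3):
    # Compute the Euclidean distance from the query point to every training point.
    distances = np.linalg.norm(X_train - query, axis=1)
    # Indices of the k nearest neighbors.
    nearest = np.argsort(distances)[:k]
    # Classification: majority vote among the k nearest labels.
    # (For regression, one would instead return y_train[nearest].mean().)
    return Counter(y_train[nearest]).most_common(1)[0][0]

X_train = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 3.0], [6.0, 5.0], [7.0, 7.0]])
y_train = np.array([0, 0, 0, 1, 1])
print(knn_predict(X_train, y_train, np.array([2.5, 2.5])))  # majority of 3 nearest -> 0
```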
3. What is cross-validation?
Cross-validation is a resampling technique for assessing a model’s performance. The data is split into several subsets (folds) that take turns serving as training and test data. By averaging results over these repeated splits, cross-validation reduces the effect of any single partitioning on the evaluation and yields a more reliable estimate of model performance.
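As a short, hedged illustration, the snippet below runs 5-fold cross-validation with scikit-learn; the Iris dataset and the KNN model are assumed here purely for demonstration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
model = KNeighborsClassifier(n_neighbors=5)

# Each of the 5 folds serves once as the test set; averaging the 5 scores
# gives a steadier estimate than a single train-test split would.
scores = cross_val_score(model, X, y, cv=5)
print(scores.mean(), scores.std())
```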
4. Write a short note on the disadvantages of using K-Nearest Neighbors.
- Computationally expensive on large datasets, since every prediction requires calculating distances to all training points.
- Highly sensitive to the choice of K, which can greatly alter prediction accuracy.
- Poorly suited to high-dimensional datasets because of the “curse of dimensionality.”
- Prone to bias when classes are imbalanced, since votes from the majority class can drown out the smaller classes.
- Requires careful preprocessing and normalization of features, or features with large scales will dominate the distance calculations (see the sketch below).
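As a hedged illustration of the last point, here is one common way to normalize features before KNN using a scikit-learn pipeline; the dataset and K value are assumptions for demonstration.

```python
from sklearn.datasets import load_iris
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier

# StandardScaler rescales every feature to zero mean and unit variance,
# so no single large-scale feature can dominate the distance calculation.
model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))

X, y = load_iris(return_X_y=True)
model.fit(X, y)
```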
Higher Order Thinking Skills (HOTS)
Please answer the questions below in no less than 200 words.
1. Describe how we can use K-Nearest Neighbors for multi-class classification.
K-Nearest Neighbors (KNN) extends naturally to multi-class classification: the algorithm finds the K nearest neighbors of a query point and assigns the class that holds the majority among them. Several strategies are commonly used for multi-class problems:
- One-vs-All (OvA), also called One-vs-Rest (OvR): a binary classifier is trained for each class to distinguish it from all the others; at prediction time, the class whose classifier reports the highest confidence score is chosen as the final prediction.
- One-vs-One (OvO): a binary classifier is trained for every pair of classes; at prediction time, each classifier votes for one of its two classes, and the class receiving the most votes is the final prediction.
- Weighted Voting: each of the K neighbors casts a vote weighted by its proximity to the query point, so that closer neighbors have greater influence on the outcome (illustrated in the sketch below).
To use KNN effectively for multi-class classification, it is crucial to select an appropriate K, handle imbalanced classes properly, and preprocess the data so that distance calculations are meaningful.
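The sketch below applies distance-weighted KNN to a three-class problem; the Wine dataset, the train-test split, and K = 7 are illustrative assumptions, not choices prescribed by the text.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier

X, y = load_wine(return_X_y=True)  # three classes of wine
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# weights="distance" implements weighted voting: closer neighbors
# get a larger say in the majority vote.
model = make_pipeline(StandardScaler(),
                      KNeighborsClassifier(n_neighbors=7, weights="distance"))
model.fit(X_tr, y_tr)
print(model.score(X_te, y_te))
```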
2. How does cross-validation help us remove bias in a model?
Cross-validation helps mitigate bias in a model’s evaluation by providing a more reliable estimate of its performance on unseen data. Bias can arise when a single train-test split is used, because that one split may not represent the data well and can hide overfitting or underfitting; cross-validation addresses this by repeatedly partitioning the data into different train-test sets.
In k-fold cross-validation, the data is divided into k subsets (folds), and the model is trained and evaluated k times, each time using a different fold as the test set and the remaining folds for training. The performance metrics from the k iterations are then averaged for a more reliable estimate of model accuracy.
By averaging results across multiple partitions, cross-validation reduces the randomness introduced by any single split and gives a more stable assessment of a model’s ability to generalize. This makes it a dependable basis for selecting hyperparameters and features, and it helps detect overfitting, since the model is tested on several different subsets of the data.
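As a closing illustration, the snippet below uses 5-fold cross-validation to choose K for KNN; the candidate K values and the Iris dataset are assumptions for demonstration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Try several K values; each one is scored with 5-fold cross-validation,
# and the K with the best averaged score is kept.
search = GridSearchCV(KNeighborsClassifier(),
                      {"n_neighbors": [1, 3, 5, 7, 9, 11]}, cv=5)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```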