1. In regression analysis, multicollinearity refers to:

• The perfect linear relationship between the dependent and independent variables.
• The presence of outliers in the dataset that affect the regression coefficients.
• High intercorrelation among the independent variables, leading to unstable estimates of the regression coefficients.
• The variance in the residuals of the regression model.
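
As a rough illustration of intercorrelated predictors, the sketch below computes the variance inflation factor (VIF) for the special case of exactly two predictors, where VIF = 1 / (1 - r²) and r is their Pearson correlation. The helper names `pearson_r` and `vif_two_predictors` and the toy data are assumptions made for this example only.

```python
import math

def pearson_r(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def vif_two_predictors(x1, x2):
    """VIF when the model has exactly two predictors.

    Undefined (division by zero) if the predictors are perfectly correlated.
    """
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)

# Hypothetical toy predictors: x2 is almost a multiple of x1, x3 is unrelated to x1.
x1 = [1, 2, 3, 4, 5]
x2 = [2.1, 3.9, 6.2, 7.8, 10.1]
x3 = [1, -1, 0, -1, 1]

vif_correlated = vif_two_predictors(x1, x2)
vif_uncorrelated = vif_two_predictors(x1, x3)
```

A large VIF for the pair (x1, x2) signals that their coefficients would be estimated with inflated variance, while a value near 1 for (x1, x3) signals no such problem.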

2. What type of data transformation technique scales data to a specific range, such as 0 to 1?

• Database normalization
• Aggregation
• Smoothing techniques
• Standardization/Normalization
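
Scaling to a fixed range such as 0 to 1 is commonly done with min-max scaling. The following is a minimal sketch; the function name `min_max_scale` and the sample values are assumptions chosen for illustration.

```python
def min_max_scale(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values so the smallest maps to new_min and the largest to new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to rescale.
        return [new_min for _ in values]
    span = hi - lo
    return [new_min + (v - lo) * (new_max - new_min) / span for v in values]

scaled = min_max_scale([10, 20, 30, 50])
print(scaled)  # [0.0, 0.25, 0.5, 1.0]
```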

3. Which of the following statements about the coefficient of determination (R-squared) is true?

• A higher R-squared value always indicates a lower model performance.
• A higher R-squared value always indicates better model performance, regardless of the number of predictor variables.
• R-squared ranges from 0 to 1 and represents the proportion of variation in the dependent variable explained by the independent variables.
• R-squared can only take positive values and is unaffected by the presence of multicollinearity in the regression model.
`Answer :- For Answer Click Here `

4. What does Ordinary Least Squares (OLS) aim to minimize in the context of linear regression?

• The sum of squared errors between the predicted and observed values of the dependent variable.
• The sum of squared residuals between the predicted and observed values of the independent variable.
• The total variance of the independent variables.
• The sum of squared errors between the predicted and observed values of the independent variable.
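
To make the OLS criterion concrete, the sketch below fits a one-predictor linear regression with the closed-form least-squares formulas and then checks that nudging either coefficient increases the sum of squared errors. The function names `fit_ols` and `sse` and the data are illustrative assumptions.

```python
def fit_ols(x, y):
    """Closed-form least-squares intercept and slope for a single predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def sse(x, y, intercept, slope):
    """Sum of squared differences between observed y and predicted y."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

# Hypothetical toy data, roughly y = 2x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_ols(x, y)
sse_fitted = sse(x, y, b0, b1)
sse_slope_nudged = sse(x, y, b0, b1 + 0.1)
sse_intercept_nudged = sse(x, y, b0 + 0.1, b1)
```

Both nudged values come out larger than `sse_fitted`, which is what minimizing the sum of squared errors in the dependent variable means in practice.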

5. The coefficient of determination (R-squared) value of 0.98 in a regression model implies:

• The model has a high level of multicollinearity.
• 98% of the variability in the dependent variable is explained by the independent variable.
• The regression model is overfitting the data by 98%.
• The residuals in the model are normally distributed with a z-value of 0.98.
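
R-squared can be computed as 1 minus the ratio of the residual sum of squares to the total sum of squares. The sketch below uses made-up observed and predicted values (not output from a fitted model) that happen to give 0.98; the function name `r_squared` is an assumption for this example.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Hypothetical values chosen so that SS_res = 0.10 and SS_tot = 5.0.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

r2 = r_squared(actual, predicted)  # approximately 0.98
```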

6. Prediction error in a model refers to:

• The difference between actual and predicted values.
• The degree of overfitting in the model.
• The number of features used in the model.
• The variability of the target variable.

7. Which of the following statements is incorrect with regard to overfitting in a machine learning model?

• The model is too simple to capture the underlying patterns in the data.
• The model performs well on training data but poorly on unseen data.
• The model fits the noise in the training data.
• None of the above
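
One way to see a model fitting noise is a 1-nearest-neighbour predictor, which memorizes its training points. In the sketch below the data are invented (the underlying relationship is assumed to be y = x, with noise added to the training targets), and `predict_1nn` and `mse` are hypothetical helper names.

```python
def predict_1nn(train_x, train_y, query):
    """Return the target of the training point closest to the query."""
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - query))
    return train_y[nearest]

def mse(actual, predicted):
    """Mean squared error between two equal-length sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

# Assumed true relationship: y = x. Training targets carry made-up noise.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [0.5, 0.8, 2.6, 2.7, 4.4]
test_x = [0.4, 1.4, 2.4, 3.4]
test_y = [0.4, 1.4, 2.4, 3.4]

train_pred = [predict_1nn(train_x, train_y, q) for q in train_x]
test_pred = [predict_1nn(train_x, train_y, q) for q in test_x]

train_mse = mse(train_y, train_pred)  # 0.0: every training point is reproduced exactly
test_mse = mse(test_y, test_pred)     # larger: the memorized noise does not generalize
```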

8. Underfitting in a machine learning model results in:

• Low bias and high variance.
• High bias and low variance.
• High bias and high variance.
• Low bias and low variance.
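
A too-simple model tends to show large error on both the training and the test data. The sketch below, on invented data following y = 2x, fits a constant model that always predicts the training mean; the helper name `mse` and the data split are assumptions for illustration.

```python
def mse(actual, predicted):
    """Mean squared error between two equal-length sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical data generated from y = 2x, with no noise.
train_x = [0.0, 2.0, 4.0, 6.0]
train_y = [2.0 * x for x in train_x]
test_x = [1.0, 3.0, 5.0]
test_y = [2.0 * x for x in test_x]

# Underfit model: ignore x entirely and always predict the training mean.
constant = sum(train_y) / len(train_y)

train_mse = mse(train_y, [constant] * len(train_y))
test_mse = mse(test_y, [constant] * len(test_y))
```

Here the error is high on both sets because the model cannot represent the trend at all, in contrast to an overfit model, whose training error is low.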

9. When should one focus on reducing bias in a machine learning model?

• When the model performs well on the training data but poorly on test data.
• When the model shows high variability in predictions.
• When the model consistently overfits the training data.
• When the model does not fit the data well and has poor explanatory or predictive performance.

10. What is the bias-variance trade-off in machine learning?

• Balancing the computational resources used in training with model accuracy
• Aiming to minimize the difference between predicted and actual values in a model.
• Finding the equilibrium between model complexity and its ability to generalize to unseen data.
• Choosing the best algorithm that minimizes both bias and variance simultaneously.

11. Training error refers to:

• Error calculated on the training dataset.
• Error due to overfitting.
• Error calculated on the testing dataset.
• Error due to underfitting.

12. What does Leave-One-Out Cross-Validation (LOOCV) do?

• It iteratively uses all but one sample as the test set and the remaining sample as the training set.
• It divides the dataset into k subsets and uses each subset as the testing set in turn.
• It creates a validation set from a small portion of the data.
• It iteratively uses all but one sample as the training set and the remaining sample as the testing set.
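
The LOOCV procedure can be sketched in a few lines: each sample is held out once while all the others are used for training. For simplicity the "model" here is just the mean of the training targets; `loocv_splits` and `loocv_mse` are hypothetical names and the data are invented.

```python
def loocv_splits(n_samples):
    """Yield (train_indices, test_indices) pairs, holding out one sample at a time."""
    for held_out in range(n_samples):
        train = [i for i in range(n_samples) if i != held_out]
        yield train, [held_out]

def loocv_mse(y):
    """LOOCV mean squared error of a model that predicts the training mean."""
    squared_errors = []
    for train_idx, test_idx in loocv_splits(len(y)):
        prediction = sum(y[i] for i in train_idx) / len(train_idx)
        squared_errors.append((y[test_idx[0]] - prediction) ** 2)
    return sum(squared_errors) / len(squared_errors)

splits = list(loocv_splits(4))
score = loocv_mse([1.0, 2.0, 3.0, 4.0])
```

With n samples this produces n train/test splits, each with n - 1 training samples and a single test sample.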

13. What is the primary purpose of cross-validation in machine learning?

• To fit the model to the training data efficiently.
• To evaluate the model’s performance on unseen data.
• To increase model complexity for better predictions.
• To reduce the number of features in the dataset.
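
As a sketch of how cross-validation holds data back for evaluation, the generator below splits sample indices into k contiguous folds, using each fold once as the test set. The name `k_fold_indices` is an assumption, and the sketch does not shuffle the data, which a practical implementation usually would.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous, near-equal folds."""
    # The first (n_samples % k) folds get one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, test
        start = stop

folds = list(k_fold_indices(10, 3))
```

Every sample appears in exactly one test fold, so each performance estimate is computed on data the model did not see during that round of training.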

14. What are the three sources of error in predicted Y in machine learning?

• Measurement error, data preprocessing error, and feature selection error
• Model complexity error, parameter tuning error, and overfitting error.
• Reducible error due to inaccurate estimation of f, irreducible error due to randomness, and test data variation.
• Training error, validation error, and testing error.
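
The split into reducible and irreducible error can be written out as a short decomposition. The notation is an assumption for this sketch: the true relationship is $Y = f(X) + \varepsilon$ with $E[\varepsilon] = 0$, the prediction is $\hat{Y} = \hat{f}(X)$, and both $\hat{f}$ and $X$ are treated as fixed.

```latex
E\big[(Y - \hat{Y})^2\big]
  = E\big[(f(X) + \varepsilon - \hat{f}(X))^2\big]
  = \underbrace{\big[f(X) - \hat{f}(X)\big]^2}_{\text{reducible}}
  + \underbrace{\operatorname{Var}(\varepsilon)}_{\text{irreducible}}
```

The first term shrinks as the estimate of $f$ improves; the second is a floor set by the randomness in $\varepsilon$ and cannot be reduced by a better model.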

15. Which of the following statements most accurately distinguishes supervised learning from unsupervised learning in machine learning?

• Supervised learning requires labelled data for training models to predict specific outcomes, while unsupervised learning uncovers patterns or structures in data without predefined outcomes.
• Supervised learning primarily deals with clustering data points based on similarities, while unsupervised learning focuses on predicting future trends based on historical data.
• Supervised learning utilizes human supervision to label data for analysis, while unsupervised learning relies on algorithms to classify data into distinct categories.
• Supervised learning involves training models without any prior knowledge of the dataset, while unsupervised learning requires prior information about the characteristics of the data.