Which of these methods can be used to address heteroscedasticity?
Adding more independent variables
Transforming the dependent variable
Removing outliers
All of the above
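For illustration, here is a minimal sketch (assuming hypothetical simulated data) of one of these remedies, a log transform of the dependent variable, with a Breusch-Pagan check before and after:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

# Hypothetical data: multiplicative noise makes the variance of y grow with x.
rng = np.random.default_rng(0)
x = rng.uniform(1, 10, 200)
y = 2 * x * np.exp(rng.normal(0, 0.3, 200))

X = sm.add_constant(x)
model_raw = sm.OLS(y, X).fit()          # heteroscedastic residuals
model_log = sm.OLS(np.log(y), X).fit()  # log transform stabilizes the variance

# Breusch-Pagan test: a small p-value signals heteroscedasticity.
_, p_raw, _, _ = het_breuschpagan(model_raw.resid, X)
_, p_log, _, _ = het_breuschpagan(model_log.resid, X)
print(f"Breusch-Pagan p-value, raw y: {p_raw:.4f}")
print(f"Breusch-Pagan p-value, log y: {p_log:.4f}")
```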
Who is credited with developing the foundational principles of linear regression?
Marie Curie
Albert Einstein
Sir Francis Galton
Isaac Newton
What graphical tool is commonly used to visualize the relationship between two continuous variables in linear regression?
Pie chart
Bar chart
Scatter plot
Histogram
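A quick sketch with made-up data, using matplotlib to draw the scatter plot:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical data with a roughly linear relationship.
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 100)
y = 3 * x + 5 + rng.normal(0, 4, 100)

# A scatter plot makes the linear trend (or lack of one) visible at a glance.
plt.scatter(x, y, alpha=0.6)
plt.xlabel("independent variable")
plt.ylabel("dependent variable")
plt.title("Scatter plot of two continuous variables")
plt.show()
```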
Which of the following is NOT an assumption of linear regression?
Multicollinearity
Homoscedasticity
Linearity
Normality of residuals
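Multicollinearity is the odd one out here: linear regression assumes the absence of strong multicollinearity, and a common diagnostic for it is the variance inflation factor (VIF). A minimal sketch with hypothetical data:

```python
import numpy as np
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor
from statsmodels.tools.tools import add_constant

# Hypothetical predictors; x3 is nearly a linear combination of x1 and x2.
rng = np.random.default_rng(2)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + x2 + rng.normal(0, 0.01, size=200)
X = add_constant(pd.DataFrame({"x1": x1, "x2": x2, "x3": x3}))

# Rule of thumb: VIF well above ~5-10 flags problematic collinearity.
for i, col in enumerate(X.columns):
    print(col, variance_inflation_factor(X.values, i))
```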
In forward selection, what criterion is typically used to decide which feature to add at each step?
The feature that is least correlated with the other features
The feature with the highest p-value
The feature that results in the largest improvement in model performance
The feature that results in the smallest increase in R-squared
What is the main difference between forward selection and backward elimination in linear regression?
Forward selection starts with no features and adds them one at a time, while backward elimination starts with all features and removes them one at a time.
Forward selection starts with all features and removes them one at a time, while backward elimination starts with no features and adds them one at a time.
Forward selection is used for classification, while backward elimination is used for regression.
There is no difference; both techniques achieve the same outcome.
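Both strategies are available in scikit-learn through SequentialFeatureSelector, which at each step adds (or removes) the feature whose inclusion (or removal) most improves cross-validated performance. A minimal sketch on the built-in diabetes dataset:

```python
from sklearn.datasets import load_diabetes
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LinearRegression

X, y = load_diabetes(return_X_y=True)

# Forward: start empty, add the most helpful feature at each step.
forward = SequentialFeatureSelector(
    LinearRegression(), n_features_to_select=4, direction="forward"
).fit(X, y)

# Backward: start with all features, drop the least helpful one at each step.
backward = SequentialFeatureSelector(
    LinearRegression(), n_features_to_select=4, direction="backward"
).fit(X, y)

print("forward selection kept:   ", forward.get_support(indices=True))
print("backward elimination kept:", backward.get_support(indices=True))
```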
What does the assumption of independence in linear regression refer to?
Independence between the independent and dependent variables
Independence between the errors and the dependent variable
Independence between the observations
Independence between the coefficients of the regression model
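Independence between the observations is the key assumption; with time-ordered data it is commonly checked with the Durbin-Watson statistic. A sketch assuming hypothetical autocorrelated errors:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

# Hypothetical time-ordered data whose errors follow an AR(1) process,
# violating the independence-of-observations assumption.
rng = np.random.default_rng(3)
x = np.arange(200, dtype=float)
e = np.zeros(200)
for t in range(1, 200):
    e[t] = 0.8 * e[t - 1] + rng.normal()
y = 1.5 * x + e

model = sm.OLS(y, sm.add_constant(x)).fit()
# Values near 2 suggest independent errors; values well below 2
# indicate positive autocorrelation between successive observations.
print("Durbin-Watson statistic:", durbin_watson(model.resid))
```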
Which scikit-learn class is used to perform linear regression?
linear_model.LogisticRegression()
preprocessing.StandardScaler()
linear_model.LinearRegression()
model_selection.train_test_split()
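A minimal usage sketch of linear_model.LinearRegression() on hypothetical data:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data: y is roughly 2x + 1 plus noise.
rng = np.random.default_rng(4)
X = rng.uniform(0, 10, (100, 1))
y = 2 * X.ravel() + 1 + rng.normal(0, 0.5, 100)

model = LinearRegression()  # ordinary least squares
model.fit(X, y)
print("slope:    ", model.coef_[0])
print("intercept:", model.intercept_)
print("R-squared:", model.score(X, y))
```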
Why is normality of errors an important assumption in linear regression?
It is necessary for the calculation of the regression coefficients
It guarantees the homoscedasticity of the errors
It ensures the linearity of the relationship between variables
It validates the use of hypothesis testing for the model's coefficients
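Normality matters mainly for inference: the t-statistics and p-values reported for the coefficients rely on (approximately) normal errors, especially in small samples. A sketch with hypothetical data, using statsmodels for the coefficient tests and a Jarque-Bera check on the residuals:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import jarque_bera

# Hypothetical data for illustration.
rng = np.random.default_rng(5)
x = rng.uniform(0, 10, 100)
y = 4 * x + 2 + rng.normal(0, 1, 100)

model = sm.OLS(y, sm.add_constant(x)).fit()
# The summary's t-tests and confidence intervals assume normal errors.
print(model.summary())

# Jarque-Bera on the residuals: a small p-value suggests non-normal errors.
jb_stat, jb_pvalue, _, _ = jarque_bera(model.resid)
print("Jarque-Bera p-value:", jb_pvalue)
```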
Can the R-squared value be negative?
No, it always ranges between 0 and 1.
Yes, if the model fits the data worse than a horizontal line.
Yes, if there is a perfect negative correlation between the variables.
No, it is always positive.
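R-squared can indeed be negative when a model fits worse than simply predicting the mean of y (a horizontal line). A small numeric sketch with scikit-learn's r2_score:

```python
import numpy as np
from sklearn.metrics import r2_score

y_true = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# Predicting the mean of y gives R-squared = 0 by definition.
y_mean = np.full_like(y_true, y_true.mean())
print(r2_score(y_true, y_mean))  # 0.0

# Predictions that trend in the wrong direction fit worse than that
# horizontal line, so R-squared goes negative.
y_bad = np.array([5.0, 4.0, 3.0, 2.0, 1.0])
print(r2_score(y_true, y_bad))  # -3.0
```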