1
Which of the following is NOT true about linear regression?
Linear regression allows us to predict new values of the independent variable.
Linear regression allows us to model how the target variable changes with the independent variables.
In linear regression, the target variable is a continuous quantity.
Linear regression is used to predict new values of the target variable.
2
The ordinary least squares (OLS) algorithm ________________.
Maximizes the sum of squared residuals
Minimizes the sum of squared residuals
Minimizes the square of the sum of residuals
Maximizes the square of the sum of residuals
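For reference, a minimal sketch of what OLS does for simple linear regression, using the closed-form solution in plain Python. The data points and the helper names `fit_ols` and `sum_squared_residuals` are hypothetical and chosen only for illustration.

```python
def fit_ols(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form solution: slope = cov(x, y) / var(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def sum_squared_residuals(xs, ys, intercept, slope):
    """Sum of squared differences between observed and fitted values."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, for illustration only
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_ols(xs, ys)
best = sum_squared_residuals(xs, ys, intercept, slope)
# Moving the slope away from the fitted value can only increase the criterion
nudged = sum_squared_residuals(xs, ys, intercept, slope + 0.1)
print(best < nudged)
```

Any other line through the same points gives a sum of squared residuals at least as large as the fitted one, which is the property the comparison at the end checks.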
3
Overfitting occurs when _____________.
The sum of squared residuals is too large
Our model does not have enough complexity
The average of the errors is positive
Our model becomes too specific to the training data
4
Using multiple linear regression to add in more independent variables ___________.
can help explain more variation in the target variable
allows us to fit a non-linear model to the data
allows us to add more observational data to the model
reduces the overfitting of the data
5
Multicollinearity is the phenomenon where _________________.
the independent variables are strongly correlated with the residuals
the target variable is strongly correlated with the residuals
the independent variables are strongly correlated with other independent variables
the target variable is strongly correlated with an independent variable
6
Which of the following is NOT an assumption of ordinary least squares (OLS)?
Homoscedasticity of Errors
Endogeneity
Random Sampling
Linearity
7
Which OLS assumption states that there is no correlation between the error term and the independent variables?
Zero Mean Errors
Multicollinearity
Endogeneity
Autocorrelation of Errors
8
A regression analysis between sales (S, in thousands of dollars) and price (P, in dollars) resulted in the following equation:
S = 50,000 - 8P
The above equation implies that an ___________.
increase of $1 in price is associated with a decrease of $8 in sales
increase of $1 in price is associated with a decrease of $8000 in sales
increase of $1 in price is associated with a decrease of $42,000 in sales
increase of $8 in price is associated with an increase of $8,000 in sales
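For reference, the unit conversion behind this question can be checked with a short sketch. The two prices (100 and 101 dollars) and the function name `predicted_sales_thousands` are hypothetical; only the equation and the units come from the question.

```python
def predicted_sales_thousands(price_dollars):
    """Evaluate S = 50,000 - 8P, where S is in thousands of dollars."""
    return 50_000 - 8 * price_dollars


# Hypothetical prices one dollar apart, for illustration only
change_thousands = predicted_sales_thousands(101) - predicted_sales_thousands(100)

# S is measured in thousands of dollars, so convert the change to dollars
change_dollars = change_thousands * 1000
print(change_thousands, change_dollars)
```

Because the model is linear, the change per one-dollar price increase is the same whichever starting price is used.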
9
Which of the following is the formula for the mean square error?
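The answer choices for this question did not survive in the text. For reference, the standard definition of the mean square error is given below; the symbols ($n$ observations, observed values $y_i$, predicted values $\hat{y}_i$) are notation assumed here, not taken from the quiz.

```latex
\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2
```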
10
Suppose we build a model to predict a store's sales with three independent variables: customers per day, average daily temperature, and number of products available. Given the p-values below, which variables are significant and should be kept in the model? Select all that apply.
Variable                              p-Value
Customers per day (I)                 0.0
Average daily temperature (II)        0.54
Number of products available (III)    0.03
Variable I
Variable II
Variable III
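For reference, a minimal sketch of screening variables by p-value. The question does not state a significance level, so the threshold `alpha = 0.05` below is an assumption (a common convention); a different threshold could change which variables pass.

```python
# p-values as given in the question
p_values = {
    "Customers per day (I)": 0.0,
    "Average daily temperature (II)": 0.54,
    "Number of products available (III)": 0.03,
}

# Assumed significance level; not stated in the question
alpha = 0.05

# Keep the variables whose p-value falls below the threshold
significant = [name for name, p in p_values.items() if p < alpha]
print(significant)
```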
11
Suppose we have produced a simple linear regression model with the following form:
y = 0.65x + 2.9
We then calculate a coefficient of determination of 0.92 and a p-value of 0.1 for the slope. Which of the following best describes our model?
The model explains a high amount of variance, and the slope is statistically significant
The model explains a high amount of variance but the slope is statistically insignificant
The model explains a low amount of variance, but the slope is statistically significant
The model explains a low amount of variance and the slope is statistically insignificant
12
Which of the following evaluation metrics is expressed relative to the total error (the total variation in the target variable)?
Mean absolute error
Mean square error
Root mean square error
Coefficient of determination
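For reference, the four metrics can be computed side by side with a short sketch in plain Python. The observed and predicted values are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical observed and predicted values, for illustration only
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 7.0, 9.0]

n = len(y_true)
errors = [t - p for t, p in zip(y_true, y_pred)]

mae = sum(abs(e) for e in errors) / n   # mean absolute error
mse = sum(e ** 2 for e in errors) / n   # mean square error
rmse = math.sqrt(mse)                   # root mean square error

# The coefficient of determination compares the residual sum of squares
# with the total sum of squares around the mean of the observed values
mean_y = sum(y_true) / n
ss_res = sum(e ** 2 for e in errors)
ss_tot = sum((t - mean_y) ** 2 for t in y_true)
r_squared = 1 - ss_res / ss_tot

print(mae, mse, rmse, r_squared)
```

The first three metrics are in the units of the target variable (or its square), while the last is a unitless ratio.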
13
Which method of regression produces a probability distribution as opposed to a point estimate?
Bayesian Regression
Poisson Regression
LASSO Regression
Logistic Regression
14
You are given a dataset of air pollution readings from several locations in an urban setting. The measurements are taken every hour and include information about traffic flow. To perform regression on this longitudinal data, what kind of regression technique would you use?
Repeated Measures Regression
LASSO Regression
Log-Log Regression
Polynomial Regression
15
You are working with customer data from a large video-on-demand provider. The data contains numerical fields such as average number of hours watched per month, number of logins per month, and time spent browsing per month.
The data also includes a flag that indicates whether the customer canceled the service (1 for yes, 0 for no). You want to build a model from this data to classify which current customers will cancel.
What type of model would you use?
Random Effects
Poisson Regression
Logistic Regression
Bayesian Regression