Introduction
Linear Regression is still the most prominently used statistical technique in the data science industry and in academia to explain relationships between features.
A total of 1,355 people registered for this skill test. It was specially designed for you to test your knowledge of linear regression techniques. If you are one of those who missed out on this skill test, here are the questions and solutions. You missed the real-time exam, but you can read this article to find out how many questions you could have answered correctly.
Here is the leaderboard for the participants who took the examination.
Overall Distribution
Below is the distribution of the scores of the participants:
You can access the scores here. More than 800 people participated in the skill test and the highest score obtained was 28.
Helpful Resources
Here are some resources to gain in-depth knowledge of the subject.
- 5 Questions which can teach you Multiple Regression (with R and Python)
- Going Deeper into Regression Analysis with Assumptions, Plots & Solutions
- 7 Types of Regression Techniques you should know!
Are you a beginner in Machine Learning? Do you want to master the concepts of Linear Regression and Machine Learning? Here are beginner-friendly courses to help you in your journey:
- Certified AI & ML BlackBelt+ Program
- Practical Machine Learning Course
Skill test Questions and Answers
1) True-False: Linear Regression is a supervised machine learning algorithm.
A) TRUE
B) FALSE
Solution: (A)
Yes, Linear Regression is a supervised learning algorithm because it uses true labels for training. A supervised learning algorithm has an input variable (x) and an output variable (Y) for each example.
2) True-False: Linear Regression is mainly used for Regression.
A) TRUE
B) FALSE
Solution: (A)
Linear Regression has dependent variables that take continuous values.
3) True-False: It is possible to design a linear regression algorithm using a neural network?
A) TRUE
B) FALSE
Solution: (A)
True. A neural network can be used as a universal approximator, so it can definitely implement a linear regression algorithm.
4) Which of the following methods do we use to find the best fit line for data in Linear Regression?
A) Least Square Error
B) Maximum Likelihood
C) Logarithmic Loss
D) Both A and B
Solution: (A)
In linear regression, we try to minimize the least square errors of the model to identify the line of best fit.
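As an illustration (not part of the original quiz), a least squares line fit can be sketched with NumPy; the data values below are invented:

```python
import numpy as np

# Toy data lying roughly on y = 2x + 1 (made-up values for illustration)
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

# Least squares: solve for the slope and intercept that minimize
# the sum of squared vertical errors
A = np.vstack([x, np.ones_like(x)]).T
slope, intercept = np.linalg.lstsq(A, y, rcond=None)[0]

print(slope, intercept)  # slope ≈ 1.99, intercept ≈ 1.04
```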
5) Which of the following evaluation metrics can be used to evaluate a model while modeling a continuous output variable?
A) AUC-ROC
B) Accuracy
C) Logloss
D) Mean-Squared-Error
Solution: (D)
Since linear regression outputs continuous values, we use the mean squared error metric to evaluate model performance. The remaining options are used for classification problems.
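For example, mean squared error is simple to compute by hand; a minimal sketch with made-up values:

```python
import numpy as np

y_true = np.array([3.0, 5.0, 7.5])  # actual continuous targets (toy values)
y_pred = np.array([2.5, 5.5, 7.0])  # model predictions

# MSE: average of the squared differences
mse = np.mean((y_true - y_pred) ** 2)
print(mse)  # 0.25
```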
6) True-False: Lasso Regularization can be used for variable selection in Linear Regression.
A) TRUE
B) FALSE
Solution: (A)
True. In the case of lasso regression we use an absolute penalty, which drives some of the coefficients to zero.
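As a quick illustration (assuming scikit-learn is available; the data is synthetic), the L1 penalty zeroes out the coefficients of irrelevant features, which is why lasso works for variable selection:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
# Only the first two features actually influence y
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=100)

lasso = Lasso(alpha=0.5).fit(X, y)
print(lasso.coef_)  # coefficients of the three irrelevant features are driven to 0
```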
7) Which of the following is true about Residuals?
A) Lower is better
B) Higher is better
C) A or B depending on the situation
D) None of these
Solution: (A)
Residuals refer to the error values of the model. Therefore lower residuals are desired.
8) Suppose that we have N independent variables (X1, X2… Xn) and the dependent variable is Y. Now imagine that you are applying linear regression by fitting the best fit line using least square error on this data.
You found that the correlation coefficient for one of its variables (say X1) with Y is -0.95.
Which of the following is true for X1?
A) The relation between X1 and Y is weak
B) The relation between X1 and Y is strong
C) The relation between X1 and Y is neutral
D) Correlation can't judge the relationship
Solution: (B)
The absolute value of the correlation coefficient denotes the strength of the relationship. Since the absolute correlation is very high, the relationship between X1 and Y is strong.
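To make this concrete, here is a small sketch (toy numbers) of a strong negative correlation:

```python
import numpy as np

x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([9.5, 7.8, 6.1, 4.2, 2.0])  # decreases almost linearly as x1 grows

# Pearson correlation coefficient between x1 and y
r = np.corrcoef(x1, y)[0, 1]
print(r)  # close to -1: strong negative relationship
```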
9) You are given two variables V1 and V2 that follow the two characteristics below. Which of the following options is correct for the Pearson correlation between V1 and V2?
1. If V1 increases then V2 also increases
2. If V1 decreases then V2's behaviour is unknown
A) Pearson correlation will be close to 1
B) Pearson correlation will be close to -1
C) Pearson correlation will be close to 0
D) None of these
Solution: (D)
We cannot comment on the correlation coefficient using statement 1 alone; we need to consider both statements. Consider V1 as x and V2 as |x|. The correlation coefficient would not be close to 1 in such a case.
10) Suppose the Pearson correlation between V1 and V2 is zero. In such a case, is it right to conclude that V1 and V2 do not have any relation between them?
A) TRUE
B) FALSE
Solution: (B)
The Pearson correlation coefficient between two variables can be zero even when they have a relationship. A zero correlation coefficient only means that they don't move together linearly. Consider examples like y = |x| or y = x^2.
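A minimal sketch of the y = x^2 example: with x symmetric around zero, the Pearson correlation comes out (numerically) zero even though y is fully determined by x:

```python
import numpy as np

x = np.linspace(-1, 1, 101)  # symmetric around zero
y = x ** 2                   # perfectly determined by x, yet...

# Pearson correlation between x and x^2 is essentially zero
r = np.corrcoef(x, y)[0, 1]
print(r)
```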
11) Which of the following offsets do we use in linear regression's least square line fit? Suppose the horizontal axis is the independent variable and the vertical axis is the dependent variable.
A) Vertical offset
B) Perpendicular offset
C) Both, depending on the situation
D) None of the above
Solution: (A)
We always consider residuals as vertical offsets. We calculate the direct differences between the actual values and the Y labels. Perpendicular offsets are useful in the case of PCA.
12) True-False: Overfitting is more likely when you have a huge amount of data to train on?
A) TRUE
B) FALSE
Solution: (B)
With a small training dataset, it's easier to find a hypothesis that fits the training data exactly, i.e. overfitting.
13) We can also compute the coefficients of linear regression with the help of an analytical method called the "Normal Equation". Which of the following is/are true about the Normal Equation?
1. We don't have to choose the learning rate
2. It becomes slow when the number of features is very large
3. There is no need to iterate
A) 1 and 2
B) 1 and 3
C) 2 and 3
D) 1, 2 and 3
Solution: (D)
Instead of gradient descent, the Normal Equation can also be used to find the coefficients. Refer to this article to read more about the normal equation.
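A sketch of the normal equation in NumPy (toy, noise-free data so the coefficients come out exact):

```python
import numpy as np

# Design matrix with a bias column; toy data generated by y = 1 + 2x
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])

# Normal equation: theta = (X^T X)^{-1} X^T y  -- no learning rate, no iteration
theta = np.linalg.solve(X.T @ X, X.T @ y)
print(theta)  # [1. 2.]
```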
14) Which of the following statements is true about the sum of residuals of A and B?
The graphs below show two fitted regression lines (A & B) on randomly generated data. Now, I want to find the sum of residuals in both cases A and B.
Note:
- The scale is the same in both graphs for both axes.
- The X-axis is the independent variable and the Y-axis is the dependent variable.
A) A has a higher sum of residuals than B
B) A has a lower sum of residuals than B
C) Both have the same sum of residuals
D) None of these
Solution: (C)
The sum of residuals will always be zero, therefore both have the same sum of residuals.
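This property is easy to verify numerically; a sketch with random data (it holds whenever the fitted model includes an intercept term):

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(size=50)
y = 1.5 * x + rng.normal(size=50)  # noisy linear data

# Fit y = a*x + b by least squares (with an intercept term)
a, b = np.polyfit(x, y, 1)
residuals = y - (a * x + b)
print(residuals.sum())  # effectively zero (up to floating point)
```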
Question Context 15-17:
Suppose you have fitted a complex regression model on a dataset. Now, you are using Ridge regression with penalty x.
15) Choose the option which best describes the bias.
A) In case of very large x, bias is low
B) In case of very large x, bias is high
C) We can't say anything about bias
D) None of these
Solution: (B)
If the penalty is very large, the model is less complex, therefore the bias would be high.
16) What will happen when you apply a very large penalty?
A) Some of the coefficients will become absolutely zero
B) Some of the coefficients will approach zero but not become absolutely zero
C) Both A and B, depending on the situation
D) None of these
Solution: (B)
In lasso, some of the coefficient values become zero, but in the case of Ridge, the coefficients get close to zero but not exactly zero.
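A small side-by-side comparison (assuming scikit-learn is available; synthetic data where only the first feature matters):

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))
y = 5 * X[:, 0] + rng.normal(size=200)  # only feature 0 matters

ridge = Ridge(alpha=1000).fit(X, y)  # large L2 penalty
lasso = Lasso(alpha=1.0).fit(X, y)   # large L1 penalty

print(ridge.coef_)  # all shrunk toward zero, but none exactly zero
print(lasso.coef_)  # irrelevant coefficients are exactly zero
```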
17) What will happen when you apply a very large penalty in the case of Lasso?
A) Some of the coefficients will become zero
B) Some of the coefficients will approach zero but not become absolutely zero
C) Both A and B, depending on the situation
D) None of these
Solution: (A)
As already discussed, lasso applies an absolute penalty, so some of the coefficients will become zero.
18) Which of the following statements is true about outliers in Linear Regression?
A) Linear regression is sensitive to outliers
B) Linear regression is not sensitive to outliers
C) Can't say
D) None of these
Solution: (A)
The slope of the regression line will change due to outliers in most cases. So Linear Regression is sensitive to outliers.
19) Suppose you plotted a scatter plot between the residuals and predicted values in linear regression and you found that there is a relationship between them. Which of the following conclusions would you make about this situation?
A) Since there is a relationship, our model is not good
B) Since there is a relationship, our model is good
C) Can't say
D) None of these
Solution: (A)
There should not be any relationship between predicted values and residuals. If there exists any relationship between them, it means that the model has not perfectly captured the information in the data.
Question Context 20-22:
Suppose you have a dataset D1 and you design a linear regression model of degree 3 polynomial and you find that the training and testing error is 0, or in other terms it perfectly fits the data.
20) What will happen when you fit a degree 4 polynomial in linear regression?
A) There is a high chance that the degree 4 polynomial will overfit the data
B) There is a high chance that the degree 4 polynomial will underfit the data
C) Can't say
D) None of these
Solution: (A)
Since a degree 4 polynomial is more complex (overfits the data) than the degree 3 model, it will again perfectly fit the data. In such a case the training error will be zero, but the test error may not be zero.
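The point can be demonstrated with a quick experiment (synthetic, noise-free degree 3 data): a degree 2 fit leaves training error, while degrees 3 and 4 both drive it to (numerically) zero:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, size=20)
y = x ** 3 - 2 * x + 1  # generated by a degree 3 polynomial, no noise

rmse = {}
for degree in (2, 3, 4):
    coefs = np.polyfit(x, y, degree)
    rmse[degree] = np.sqrt(np.mean((np.polyval(coefs, x) - y) ** 2))
    print(degree, rmse[degree])
# degree 2 underfits; degrees 3 and 4 both fit the training data exactly
```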
21) What will happen when you fit a degree 2 polynomial in linear regression?
A) There is a high chance that the degree 2 polynomial will overfit the data
B) There is a high chance that the degree 2 polynomial will underfit the data
C) Can't say
D) None of these
Solution: (B)
If a degree 3 polynomial fits the data perfectly, it's highly probable that a simpler model (degree 2 polynomial) will underfit the data.
22) In terms of bias and variance, which of the following is true when you fit a degree 2 polynomial?
A) Bias will be high, variance will be high
B) Bias will be low, variance will be high
C) Bias will be high, variance will be low
D) Bias will be low, variance will be low
Solution: (C)
Since a degree 2 polynomial is less complex compared to degree 3, the bias will be high and the variance will be low.
Question Context 23:
The graphs below (A, B, C from left to right) plot the cost function against the number of iterations.
23) Suppose l1, l2 and l3 are the learning rates for A, B and C respectively. Which of the following is true about l1, l2 and l3?
A) l2 < l1 < l3
B) l1 > l2 > l3
C) l1 = l2 = l3
D) None of these
Solution: (A)
With a high learning rate, the step size is large: the objective function decreases quickly at first, but it fails to find the global minimum and starts increasing after a few iterations.
With a low learning rate, the step size is small, so the objective function decreases slowly.
Question Context 24-25:
We have been given a dataset with n records in which we have the input attribute x and the output attribute y. Suppose we use a linear regression method to model this data. To test our linear regressor, we split the data into training and test sets randomly.
24) Now we increase the training set size gradually. As the training set size increases, what do you expect will happen with the mean training error?
A) Increase
B) Decrease
C) Remain constant
D) Can't say
Solution: (D)
Training error may increase or decrease depending on the values used to fit the model. If the values used for training gradually include more outliers, the error might increase.
25) What do you expect will happen with bias and variance as you increase the size of the training data?
A) Bias increases and Variance increases
B) Bias decreases and Variance increases
C) Bias decreases and Variance decreases
D) Bias increases and Variance decreases
E) Can't say
Solution: (D)
As we increase the size of the training data, the bias would increase while the variance would decrease.
Question Context 26:
Consider the following data where one input (X) and one output (Y) are given.
26) What would be the root mean squared training error for this data if you run a Linear Regression model of the form (Y = A0 + A1X)?
A) Less than 0
B) Greater than zero
C) Equal to 0
D) None of these
Solution: (C)
We can perfectly fit a line to the given data, so the mean error will be zero.
Question Context 27-28:
Suppose you have been given the following scenarios of training and validation error for Linear Regression.

| Scenario | Learning Rate | Number of iterations | Training Error | Validation Error |
|----------|---------------|----------------------|----------------|------------------|
| 1 | 0.1 | 1000 | 100 | 110 |
| 2 | 0.2 | 600 | 90 | 105 |
| 3 | 0.3 | 400 | 110 | 110 |
| 4 | 0.4 | 300 | 120 | 130 |
| 5 | 0.4 | 250 | 130 | 150 |

27) Which of the following scenarios would give you the right hyperparameters?
A) 1
B) 2
C) 3
D) 4
Solution: (B)
Option B is the better choice because it leads to the lowest training as well as validation error.
28) Suppose you got the tuned hyperparameters from the previous question. Now, imagine you want to add a variable to the feature space such that this added feature is important. What would you observe in such a case?
A) Training Error will decrease and Validation Error will increase
B) Training Error will increase and Validation Error will increase
C) Training Error will increase and Validation Error will decrease
D) Training Error will decrease and Validation Error will decrease
E) None of the above
Solution: (D)
If the added feature is important, both training and validation error would decrease.
Question Context 29-30:
Suppose you encounter a situation where you find that your linear regression model is underfitting the data.
29) In such a situation, which of the following options would you consider?
1. Add more variables
2. Start introducing polynomial degree variables
3. Remove some variables
A) 1 and 2
B) 2 and 3
C) 1 and 3
D) 1, 2 and 3
Solution: (A)
In case of underfitting, you need to introduce more variables into the feature space, or add some polynomial degree variables to make the model more complex so that it can fit the data better.
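The second option can be sketched with synthetic quadratic data: a plain linear model underfits, while adding a squared feature removes the error entirely:

```python
import numpy as np

x = np.linspace(-3, 3, 60)
y = x ** 2 + 0.5 * x  # a clearly non-linear target (no noise)

rmses = []
for features in ([x], [x, x ** 2]):
    A = np.vstack(features + [np.ones_like(x)]).T  # add intercept column
    coefs, *_ = np.linalg.lstsq(A, y, rcond=None)
    rmses.append(np.sqrt(np.mean((A @ coefs - y) ** 2)))

print(rmses)  # the linear model leaves large error; with x^2 the error is ~0
```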
30) The situation is the same as in the previous question (underfitting). Which of the following regularization algorithms would you prefer?
A) L1
B) L2
C) Any
D) None of these
Solution: (D)
I wouldn't use any regularization method because regularization is used in the case of overfitting.
End Notes
I tried my best to make the solutions as comprehensive as possible, but if you have any questions / doubts please drop them in your comments below. I would love to hear your feedback about the skill test. For more such skill tests, check out our current hackathons.
Source: https://www.analyticsvidhya.com/blog/2017/07/30-questions-to-test-a-data-scientist-on-linear-regression/