Which of the Following Tools Are Used for Reading a Map? Check All That Apply.

Introduction

Linear Regression is still the most prominently used statistical technique in the data science industry and in academia to explain relationships between features.

A total of 1,355 people registered for this skill test. It was specially designed for you to test your knowledge of linear regression techniques. If you are one of those who missed out on this skill test, here are the questions and solutions. You missed the real-time test, but you can read this article to find out how many you could have answered correctly.

Here is the leaderboard for the participants who took the test.

Overall Distribution

Below is the distribution of the scores of the participants:

You can access the scores here. More than 800 people participated in the skill test and the highest score obtained was 28.

Helpful Resources

Here are some resources to get in-depth knowledge of the subject.

  • 5 Questions which can teach you Multiple Regression (with R and Python)

  • Going Deeper into Regression Analysis with Assumptions, Plots & Solutions

  • 7 Types of Regression Techniques you should know!

Are you a beginner in Machine Learning? Do you want to master the concepts of Linear Regression and Machine Learning? Here are beginner-friendly courses to assist you in your journey:

  • Certified AI & ML Blackbelt+ Program
  • Applied Machine Learning Course

Skill test Questions and Answers

1) True-False: Linear Regression is a supervised machine learning algorithm.

A) TRUE
B) FALSE

2) True-False: Linear Regression is mainly used for Regression.

A) TRUE
B) FALSE

3) True-False: Is it possible to design a Linear Regression algorithm using a neural network?

A) TRUE
B) FALSE

4) Which of the following methods do we use to find the best-fit line for data in Linear Regression?

A) Least Square Error
B) Maximum Likelihood
C) Logarithmic Loss
D) Both A and B

5) Which of the following evaluation metrics can be used to evaluate a model while modeling a continuous output variable?

A) AUC-ROC
B) Accuracy
C) Logloss
D) Mean-Squared-Error

6) True-False: Lasso Regularization can be used for variable selection in Linear Regression.

A) True
B) FALSE

Solution: (A)

True. Lasso regression applies an absolute-value penalty, which drives some of the coefficients to exactly zero.

7) Which of the following is true about residuals?

A) Lower is better
B) Higher is better
C) A or B, depending on the situation
D) None of these

Solution: (A)

Residuals refer to the error values of the model. Therefore lower residuals are desired.

8) Suppose that we have N independent variables (X1, X2, … Xn) and the dependent variable is Y. Now imagine that you are applying linear regression by fitting the best-fit line using least square error on this data.

You found that the correlation coefficient of one of its variables (say X1) with Y is -0.95.

Which of the following is true for X1?

A) Relation between X1 and Y is weak
B) Relation between X1 and Y is strong
C) Relation between X1 and Y is neutral
D) Correlation can't judge the relationship

Solution: (B)

The absolute value of the correlation coefficient denotes the strength of the relationship. Since the absolute correlation is very high, the relationship between X1 and Y is strong.

9) Looking at the two characteristics below, which of the following options is correct for the Pearson correlation between V1 and V2?

Suppose you are given two variables V1 and V2 that have the following two characteristics.

1. If V1 increases then V2 also increases

2. If V1 decreases then V2 behavior is unknown

A) Pearson correlation will be close to 1
B) Pearson correlation will be close to -1
C) Pearson correlation will be close to 0
D) None of these

Solution: (D)

We cannot comment on the correlation coefficient by using only statement 1. We need to consider both of these statements. Consider V1 as x and V2 as |x|. The correlation coefficient would not be close to 1 in such a case.

10) Suppose the Pearson correlation between V1 and V2 is zero. In such a case, is it right to conclude that V1 and V2 do not have any relation between them?

A) TRUE
B) FALSE

Solution: (B)

The Pearson correlation coefficient between two variables might be zero even when they have a relationship between them. A correlation coefficient of zero only means that they have no linear relationship. Examples are y=|x| or y=x^2 with x values symmetric around zero.
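The point about y=|x| and y=x^2 can be checked numerically. The sketch below is only an illustration: the `pearson` helper and the sample values are assumptions made for this example, not part of the original skill test.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical sample: x values symmetric around zero.
xs = [-3, -2, -1, 0, 1, 2, 3]

r_abs = pearson(xs, [abs(x) for x in xs])  # y = |x|: clearly related, not linearly
r_sq = pearson(xs, [x ** 2 for x in xs])   # y = x^2: same situation
r_lin = pearson(xs, [-2 * x for x in xs])  # y = -2x: perfect negative line

print(r_abs, r_sq, r_lin)
```

The first two values come out at (numerically) zero even though y is fully determined by x, while the straight line with negative slope gives a correlation of -1, a strong relationship in the sense of question 8.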

11) Which of the following offsets do we use in linear regression's least square line fit? Suppose the horizontal axis is the independent variable and the vertical axis is the dependent variable.

A) Vertical offset
B) Perpendicular offset
C) Both, depending on the situation
D) None of the above

Solution: (A)

We always consider residuals as vertical offsets. We calculate the direct differences between the actual Y values and the predicted Y values. Perpendicular offsets are useful in the case of PCA.

12) True-False: Is overfitting more likely when you have a huge amount of data to train on?

A) TRUE
B) FALSE

Solution: (B)

With a small training dataset, it's easier to find a hypothesis that fits the training data exactly, i.e. overfitting.

13) We can also compute the coefficients of linear regression with the help of an analytical method called "Normal Equation". Which of the following is/are true about Normal Equation?

  1. We don't have to choose the learning rate
  2. It becomes slow when the number of features is very large
  3. There is no need to iterate

A) 1 and 2
B) 1 and 3
C) 2 and 3
D) 1, 2 and 3

Solution: (D)

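Instead of gradient descent, the Normal Equation can also be used to find the coefficients. It solves for them in closed form, so there is no learning rate to choose and no iteration, but it requires inverting a matrix whose size grows with the number of features. Refer to this article to read more about the normal equation.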
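As a rough illustration, here is a minimal sketch of the Normal Equation for the one-feature model y = a0 + a1*x, where the matrix to invert is only 2x2. The function name `fit_normal_equation` and the sample data are assumptions made for this example.

```python
def fit_normal_equation(xs, ys):
    """Solve (X^T X) theta = X^T y for the model y = a0 + a1 * x.

    X has a column of ones and a column of x values, so X^T X is the
    2x2 matrix [[n, sum_x], [sum_x, sum_xx]] and can be inverted by hand.
    """
    n = len(xs)
    sum_x = sum(xs)
    sum_xx = sum(x * x for x in xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    det = n * sum_xx - sum_x * sum_x  # determinant of X^T X
    if det == 0:
        raise ValueError("X^T X is singular; all x values are identical")

    a0 = (sum_xx * sum_y - sum_x * sum_xy) / det
    a1 = (n * sum_xy - sum_x * sum_y) / det
    return a0, a1

# Hypothetical data generated from y = 1 + 2x.
a0, a1 = fit_normal_equation([0, 1, 2, 3], [1, 3, 5, 7])
print(a0, a1)
```

There is no learning rate and no loop over iterations here. With many features the same idea needs a general matrix inverse, which is the part that becomes slow.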

14) Which of the following statements is true about the sum of residuals of A and B?

The graphs below show two fitted regression lines (A & B) on randomly generated data. Now, I want to find the sum of residuals in both cases A and B.

Note:

  1. Scale is the same in both graphs for both axes.
  2. X-axis is the independent variable and Y-axis is the dependent variable.

A) A has higher sum of residuals than B
B) A has lower sum of residuals than B
C) Both have same sum of residuals
D) None of these

Solution: (C)

The sum of residuals of a least-squares fit with an intercept term will always be zero, therefore both have the same sum of residuals.
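A small numerical check of this property, assuming a simple least-squares line with an intercept; the `fit_line` helper and the data points are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical noisy data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.4]

intercept, slope = fit_line(xs, ys)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

residual_sum = sum(residuals)               # cancels out to (numerically) zero
sum_abs = sum(abs(r) for r in residuals)    # the individual errors are not zero
print(residual_sum, sum_abs)
```

The signed residuals cancel even though the fit is not perfect, which is why the plain sum of residuals cannot distinguish fit A from fit B.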

Question Context 15-17:

Suppose you have fitted a complex regression model on a dataset. Now, you are using Ridge regression with penalty x.

15) Choose the option which describes bias in the best manner.
A) In case of very large x; bias is low
B) In case of very large x; bias is high
C) We can't say about bias
D) None of these

Solution: (B)

If the penalty is very large, the model is less complex, therefore the bias would be high.

16) What will happen when you apply a very large penalty?

A) Some of the coefficients will become absolute zero
B) Some of the coefficients will approach zero but not absolute zero
C) Both A and B, depending on the situation
D) None of these

Solution: (B)

In Lasso some of the coefficient values become zero, but in the case of Ridge, the coefficients get close to zero but not zero.

17) What will happen when you apply a very large penalty in the case of Lasso?
A) Some of the coefficients will become zero
B) Some of the coefficients will approach zero but not absolute zero
C) Both A and B, depending on the situation
D) None of these

Solution: (A)

As already discussed, Lasso applies an absolute penalty, so some of the coefficients will become zero.
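The difference between the two penalties is easiest to see with a single feature and no intercept, where both minimizers have a closed form. This is a simplified sketch under that assumption; the function names, the penalty values, and the data are illustrative only.

```python
def ridge_coef(xs, ys, penalty):
    """Minimizer of sum((y - b*x)^2) + penalty * b^2 (one feature, no intercept)."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return sxy / (sxx + penalty)

def lasso_coef(xs, ys, penalty):
    """Minimizer of sum((y - b*x)^2) + penalty * |b| (soft-thresholding)."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    shrunk = max(abs(sxy) - penalty / 2, 0.0)  # clipped at zero
    return shrunk / sxx if sxy >= 0 else -shrunk / sxx

# Hypothetical data from y = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

for penalty in (0.0, 10.0, 100.0):
    print(penalty, ridge_coef(xs, ys, penalty), lasso_coef(xs, ys, penalty))
```

With no penalty both return 2. As the penalty grows, the Ridge coefficient keeps shrinking towards zero without reaching it, while the Lasso coefficient is clipped to exactly zero once the penalty is large enough.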

18) Which of the following statements is true about outliers in Linear Regression?

A) Linear regression is sensitive to outliers
B) Linear regression is not sensitive to outliers
C) Can't say
D) None of these

Solution: (A)

The slope of the regression line will change due to outliers in most cases. So Linear Regression is sensitive to outliers.
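To see how much a single point can matter, the sketch below fits a least-squares slope with and without one outlier. The data and the `slope_of_fit` helper are hypothetical.

```python
def slope_of_fit(xs, ys):
    """Least-squares slope for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical clean data lying exactly on y = 2x.
clean_x = [1, 2, 3, 4, 5]
clean_y = [2, 4, 6, 8, 10]

slope_clean = slope_of_fit(clean_x, clean_y)

# Same data plus one outlier at x = 6 (the line would predict 12, not 40).
slope_outlier = slope_of_fit(clean_x + [6], clean_y + [40])

print(slope_clean, slope_outlier)
```

One extra point moves the slope from 2 to about 6, because squared error gives large residuals a disproportionate influence on the fit.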

19) Suppose you plotted a scatter plot between the residuals and predicted values in linear regression and you found that there is a relationship between them. Which of the following conclusions do you make about this situation?

A) Since there is a relationship, our model is not good
B) Since there is a relationship, our model is good
C) Can't say
D) None of these

Solution: (A)

There should not be any relationship between predicted values and residuals. If there exists any relationship between them, it means that the model has not perfectly captured the information in the data.

Question Context 20-22:

Suppose that you have a dataset D1 and you design a linear regression model with a degree 3 polynomial and you found that the training and testing error is "0", or in other terms it perfectly fits the data.

20) What will happen when you fit a degree 4 polynomial in linear regression?
A) There are high chances that the degree 4 polynomial will overfit the data
B) There are high chances that the degree 4 polynomial will underfit the data
C) Can't say
D) None of these

Solution: (A)

Since a degree 4 polynomial is more complex than the degree 3 model, it is likely to overfit the data, and it will again perfectly fit the training data. In such a case the training error will be zero but the test error may not be zero.

21) What will happen when you fit a degree 2 polynomial in linear regression?
A) There are high chances that the degree 2 polynomial will overfit the data
B) There are high chances that the degree 2 polynomial will underfit the data
C) Can't say
D) None of these

Solution: (B)

If a degree 3 polynomial fits the data perfectly, it's highly likely that a simpler model (degree 2 polynomial) might underfit the data.

22) In terms of bias and variance, which of the following is true when you fit a degree 2 polynomial?

A) Bias will be high, variance will be high
B) Bias will be low, variance will be high
C) Bias will be high, variance will be low
D) Bias will be low, variance will be low

Solution: (C)

Since a degree 2 polynomial will be less complex as compared to degree 3, the bias will be high and the variance will be low.

Question Context 23:

Which of the following is true about the graphs below (A, B, C left to right) between the cost function and the number of iterations?

23) Suppose l1, l2 and l3 are the three learning rates for A, B, C respectively. Which of the following is true about l1, l2 and l3?

A) l2 < l1 < l3
B) l1 > l2 > l3
C) l1 = l2 = l3
D) None of these

Solution: (A)

In case of a high learning rate, the step will be large; the objective function will decrease quickly initially, but it will not find the global minimum and the objective function starts increasing after a few iterations.

In case of a low learning rate, the step will be small, so the objective function will decrease slowly.
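The effect of the learning rate can be reproduced on a toy problem. The sketch below runs gradient descent on cost(w) = w^2; the cost function, the three learning rates, and the starting point are assumptions chosen for illustration, not the curves from the original graphs. On this simple quadratic a too-large rate makes the cost grow from the very first step, whereas on a real cost surface it may first drop and then rise.

```python
def cost_history(learning_rate, steps=20, start=5.0):
    """Run gradient descent on cost(w) = w^2 and record the cost after each step."""
    w = start
    history = [w * w]
    for _ in range(steps):
        w -= learning_rate * 2 * w  # the gradient of w^2 is 2w
        history.append(w * w)
    return history

low = cost_history(0.01)       # small steps: cost falls slowly
moderate = cost_history(0.4)   # well-chosen steps: cost falls quickly
too_high = cost_history(1.1)   # overshoots the minimum: cost grows

print(low[-1], moderate[-1], too_high[-1])
```

After 20 steps the low rate has only reduced the cost by roughly half, the moderate rate has driven it to almost zero, and the too-high rate has diverged.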

Question Context 24-25:

We have been given a dataset with n records in which we have the input attribute as x and the output attribute as y. Suppose we use a linear regression method to model this data. To test our linear regressor, we split the data into a training set and a test set randomly.

24) Now we increase the training set size gradually. As the training set size increases, what do you expect will happen with the mean training error?

A) Increase
B) Decrease
C) Remain constant
D) Can't Say

Solution: (D)

Training error may increase or decrease depending on the values that are used to fit the model. If the values used to train gradually contain more outliers, then the error might just increase.

25) What do you expect will happen with bias and variance as you increase the size of training data?

A) Bias increases and Variance increases
B) Bias decreases and Variance increases
C) Bias decreases and Variance decreases
D) Bias increases and Variance decreases
E) Can't Say

Solution: (D)

As we increase the size of the training data, the bias would increase while the variance would decrease.

Question Context 26:

Consider the following data where one input (X) and one output (Y) is given.

26) What would be the root mean square training error for this data if you run a Linear Regression model of the form (Y = A0+A1X)?

A) Less than 0
B) Greater than zero
C) Equal to 0
D) None of these

Solution: (C)

We can perfectly fit a line to the given data, so the mean error will be zero.

Question Context 27-28:

Suppose you have been given the following scenarios of training and validation error for Linear Regression.

Scenario  Learning Rate  Number of iterations  Training Error  Validation Error
1         0.1            1000                  100             110
2         0.2            600                   90              105
3         0.3            400                   110             110
4         0.4            300                   120             130
5         0.4            250                   130             150

27) Which of the following scenarios would give you the right hyperparameters?

A) 1
B) 2
C) 3
D) 4

Solution: (B)

Option B would be the better option because it leads to lower training as well as validation error.
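The same selection can be written as a small script. The numbers are the five scenarios listed in the question; the dictionary keys and the choice to rank by validation error are assumptions made for this sketch.

```python
# The five scenarios from the question.
scenarios = [
    {"scenario": 1, "learning_rate": 0.1, "iterations": 1000, "train_error": 100, "val_error": 110},
    {"scenario": 2, "learning_rate": 0.2, "iterations": 600, "train_error": 90, "val_error": 105},
    {"scenario": 3, "learning_rate": 0.3, "iterations": 400, "train_error": 110, "val_error": 110},
    {"scenario": 4, "learning_rate": 0.4, "iterations": 300, "train_error": 120, "val_error": 130},
    {"scenario": 5, "learning_rate": 0.4, "iterations": 250, "train_error": 130, "val_error": 150},
]

# Pick the scenario with the lowest validation error.
best = min(scenarios, key=lambda s: s["val_error"])

# It also happens to have the lowest training error.
best_train = min(scenarios, key=lambda s: s["train_error"])

print(best["scenario"], best["learning_rate"], best["iterations"])
```

Ranking by validation error is the usual choice because it estimates how the model behaves on unseen data. Here scenario 2 wins on both criteria, so the answer does not depend on which one is used.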

28) Suppose you got the tuned hyperparameters from the previous question. Now, imagine you want to add a variable to the variable space such that this added feature is important. Which of the following would you observe in such a case?

A) Training Error will decrease and Validation Error will increase
B) Training Error will increase and Validation Error will increase
C) Training Error will increase and Validation Error will decrease
D) Training Error will decrease and Validation Error will decrease
E) None of the above

Solution: (D)

If the added feature is important, the training and validation error would decrease.

Question Context 29-30:

Suppose you are in a situation where you find that your linear regression model is underfitting the data.

29) In such a situation, which of the following options would you consider?

  1. Add more variables
  2. Start introducing polynomial degree variables
  3. Remove some variables

A) 1 and 2
B) 2 and 3
C) 1 and 3
D) 1, 2 and 3

Solution: (A)

In case of underfitting, you need to introduce more variables into the variable space, or you can add some polynomial degree variables to make the model more complex so that it can fit the data better.
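A minimal sketch of the polynomial-variable idea: a straight line in x underfits data that follows y = x^2, while the same linear regression applied to the derived variable z = x^2 fits it exactly. The helper names and the data are hypothetical.

```python
def fit_line(zs, ys):
    """Ordinary least squares for y = intercept + slope * z."""
    n = len(zs)
    mean_z = sum(zs) / n
    mean_y = sum(ys) / n
    szy = sum((z - mean_z) * (y - mean_y) for z, y in zip(zs, ys))
    szz = sum((z - mean_z) ** 2 for z in zs)
    slope = szy / szz
    return mean_y - slope * mean_z, slope

def mse(zs, ys, intercept, slope):
    """Mean squared error of the fitted line on the given data."""
    return sum((y - (intercept + slope * z)) ** 2 for z, y in zip(zs, ys)) / len(zs)

# Hypothetical data following y = x^2.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [x ** 2 for x in xs]

# Underfitting model: a straight line in x.
b0, b1 = fit_line(xs, ys)
mse_linear = mse(xs, ys, b0, b1)

# Same regression, but on the polynomial variable z = x^2.
zs = [x ** 2 for x in xs]
c0, c1 = fit_line(zs, ys)
mse_poly = mse(zs, ys, c0, c1)

print(mse_linear, mse_poly)
```

The model stays linear in its coefficients; only the input variable changes, which is why polynomial terms can be added without leaving linear regression.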

30) Now the situation is the same as in the previous question (underfitting). Which of the following regularization algorithms would you prefer?

A) L1
B) L2
C) Any
D) None of these

Solution: (D)

I won't use any regularization method because regularization is used in case of overfitting.

End Notes

I tried my best to make the solutions as comprehensive as possible, but if you have any questions or doubts please drop in your comments below. I would love to hear your feedback about the skill test. For more such skill tests, check out our current hackathons.

Source: https://www.analyticsvidhya.com/blog/2017/07/30-questions-to-test-a-data-scientist-on-linear-regression/
