Robert Bosch Data Scientist interview 2019

Last Updated : 24 Apr, 2019
Telephonic :-

Q.1. What is a Decision Tree? How does it split? How does a decision tree work?
Q.2. What does each node contain in a Decision Tree?
Q.3. What are Entropy and the Gini Index, and how do they help?
Q.4. What is Random Forest? What is random in Random Forest? How do you calculate the OOB error?
Q.5. How does random forest work?
Q.6. Explain the entire process from the point you get the data until you reach the final stage of prediction.
Q.7. How does kNN work? Which distance measure should be used in kNN when the data is categorical?
Q.8. You have 10 documents. Each document has been tagged with a topic. When a new document comes in, how do you tag it with one of those topics?

Primary focus: the candidate should be good at coding and should also have sound knowledge of ML algorithms.

Face to Face :-

Coding round in R

1. Create a data frame of this form:

Date               Value
01/01/2019 12:00   xx
. . .              . . .
01/31/2019 11:59   .

Value can be randomly generated.

2. Transpose the data frame into this form:

Date         Hour1   Hour2   Hour3   . . .   Value
01/01/2019   12:00   13:00   14:00   . . .   xx
02/01/2019   12:00   13:00   14:00   . . .   xx
. . .        . . .   . . .   . . .   . . .   . . .
31/01/2019   12:00   13:00   14:00   . . .   xx

Technical Interview

Q.1. If I want to find a relationship between Price and Sales, should I use regression or correlation?
Answer: Simple linear regression can be used to understand the relationship between the dependent variable (Sales) and the independent variable (Price). Assumption: no other variables are present. The correlation coefficient, i.e. the standardized covariance (-1 ≤ r ≤ 1), will tell us:
1. Whether the correlation is positive or negative.
2. The strength of the linear relationship between the 2 variables.

Q.2. If I have multiple features in my dataset, how do I know which ones to include for my model building?
Answer: Check the coefficient of determination, i.e. R squared. It is the percentage of variation in the y variable that is explained by the x variable. If R squared is 0, x explains none of the variation in y, so you can't predict y from x. If R squared is 1, you can predict y from x without any error. I had answered with a dimensionality reduction technique such as Principal Component Analysis.

Q.3. Questions on SSE, RMSE, MAPE.

Q.4. More questions on the end-to-end process of data analysis.

Q.5. I was asked a few problems on practical scenarios:
a) If I want to improve traffic conditions, what data would I ask for?
b) "Which algorithm to use when" kind of questions.
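For the telephonic question on Entropy and the Gini Index, a small worked example may help show how both measures score a candidate split. This is a minimal Python sketch, not something asked in the interview; the function names and the toy "yes"/"no" labels are made up for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def weighted_impurity(left, right, impurity):
    """Size-weighted average impurity of the two child nodes of a split."""
    n = len(left) + len(right)
    return len(left) / n * impurity(left) + len(right) / n * impurity(right)


# Toy data (hypothetical): a balanced parent node and one candidate split.
parent = ["yes"] * 5 + ["no"] * 5
left = ["yes"] * 4 + ["no"] * 1
right = ["yes"] * 1 + ["no"] * 4

# A split is better the more it reduces impurity relative to the parent.
info_gain = entropy(parent) - weighted_impurity(left, right, entropy)
gini_drop = gini(parent) - weighted_impurity(left, right, gini)

print(f"information gain: {info_gain:.4f}")
print(f"Gini decrease:    {gini_drop:.4f}")
```

A tree-building algorithm would evaluate many candidate splits this way and keep the one with the largest impurity reduction.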
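The coding round asked for an hourly data frame to be reshaped so that each date becomes one row. The interview used R; the sketch below shows the same idea in plain Python instead, under two assumptions that the post does not confirm: readings are hourly starting at midnight on 1 January 2019, and the target layout has one column per hour of the day.

```python
import random
from collections import defaultdict
from datetime import datetime, timedelta

random.seed(0)  # fixed seed so the randomly generated values are reproducible

# Long format: one (timestamp, value) row per hour of January 2019.
# Assumption: hourly readings starting at midnight.
start = datetime(2019, 1, 1, 0, 0)
long_rows = [
    (start + timedelta(hours=h), random.random()) for h in range(31 * 24)
]

# Wide format: one row per date, one slot per hour of the day (0..23).
wide = defaultdict(lambda: [None] * 24)
for ts, value in long_rows:
    wide[ts.date()][ts.hour] = value

# Show the first three hourly values of the first two dates.
for day in sorted(wide)[:2]:
    print(day.strftime("%d/%m/%Y"), [round(v, 2) for v in wide[day][:3]], "...")
```

In R, the same reshape is typically done by splitting the timestamp into date and hour columns and then pivoting from long to wide.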
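The answers on Price versus Sales and on R squared can be tied together with a short calculation: for a simple linear regression with one predictor, R squared equals the square of the correlation coefficient r. The Python sketch below illustrates this; the price and sales numbers are invented purely for the example, and the function names are hypothetical.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def pearson_r(x, y):
    """Correlation coefficient: covariance scaled by both standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    return slope, my - slope * mx


def r_squared(x, y):
    """Share of the variation in y explained by the fitted line."""
    slope, intercept = fit_line(x, y)
    my = mean(y)
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1.0 - ss_res / ss_tot


# Invented illustrative data: sales fall as price rises.
price = [10, 12, 14, 16, 18]
sales = [200, 185, 160, 150, 120]

r = pearson_r(price, sales)
slope, intercept = fit_line(price, sales)
print(f"r = {r:.3f}, slope = {slope:.2f}, R squared = {r_squared(price, sales):.3f}")
```

The sign of r gives the direction of the relationship, while the regression slope additionally says how much Sales changes per unit change in Price.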
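For the question on SSE, RMSE and MAPE, the three metrics can be written in a few lines. This is a minimal Python sketch with made-up actual and predicted values; note that MAPE is undefined whenever an actual value is 0.

```python
from math import sqrt


def sse(actual, predicted):
    """Sum of squared errors."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the target."""
    return sqrt(sse(actual, predicted) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(
        abs((a - p) / a) for a, p in zip(actual, predicted)
    ) / len(actual)


# Made-up values for illustration only.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]

print(f"SSE  = {sse(actual, predicted):.1f}")
print(f"RMSE = {rmse(actual, predicted):.3f}")
print(f"MAPE = {mape(actual, predicted):.2f}%")
```

SSE and RMSE penalize large errors more heavily because errors are squared, whereas MAPE is scale-free and so is easier to compare across series with different magnitudes.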