# Applied Predictive Technologies Interview Questions

## Differentiate Between Data Analytics and Data Science

| Data Analytics | Data Science |
| --- | --- |
| Uses data to draw meaningful insights and solve problems. | Involves asking questions, writing algorithms, coding, and building statistical models. |
| Tools include data mining, data modelling, database management, and data analysis. | Tools include machine learning, Hadoop, Java, Python, and software development. |
| Uses existing information to uncover actionable data. | Discovers new questions to drive innovation. |
| Examines data from the given information using specialised systems and software. | Uses scientific methods and algorithms to extract knowledge from unstructured data. |

### Ola Cabs Interview Rounds and Process

• Q2. Tell me one cool thing you got to do at IIT.
• Q3. Why did you choose IIT, and do you regret it? What's different here?
• Q4. What is a portfolio? How do you measure risk?
• Q5. What is beta? What is value at risk? What is the formula for beta?
• Q6. What is covariance? How does it measure sensitivity? What is volatility?
• Q7. What is WACC? How do you value a company? Suggest a method that can help you decide whether to undertake a project.

### What is variance in Data Science?

Variance measures how far the values in a data set spread out around their mean: it is the average of the squared differences between each value and the mean. Data scientists use variance to understand the distribution of a data set.
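
As a small illustration, the sketch below computes variance by hand and compares it with Python's standard `statistics` module. The `values` list is made-up sample data, not taken from any real data set. Note that `pvariance` divides by n (population variance), while `variance` divides by n - 1 (sample variance).

```python
from statistics import mean, pvariance, variance

values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample data

mu = mean(values)

# Population variance: the average squared deviation from the mean.
manual_pvar = sum((x - mu) ** 2 for x in values) / len(values)

print("mean:", mu)
print("population variance (manual):", manual_pvar)
print("population variance (statistics):", pvariance(values))
print("sample variance (statistics):", variance(values))
```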

### How do you build a random forest?

The underlying principle of this technique is that several weak learners combine to provide a strong learner. The steps involved are:

• Build several decision trees on bootstrapped training samples of data
• On each tree, each time a split is considered, a random sample of $m$ predictors is chosen as split candidates out of all $p$ predictors
• Rule of thumb: at each split, $m = \sqrt{p}$
• Predictions: by majority vote across the trees
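
The steps above can be sketched in plain Python. This is a minimal illustration rather than a production implementation: the function names, the Gini impurity split criterion, the depth limit, the number of trees, and the toy data are all assumptions made for the example. Class labels are assumed to be non-tuple values such as integers, because tuples are used here to represent internal tree nodes.

```python
import math
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y, m, rng):
    """Search m randomly chosen features for the lowest weighted Gini split."""
    best = None  # (score, feature index, threshold)
    for f in rng.sample(range(len(X[0])), m):
        for t in sorted({row[f] for row in X}):
            left = [y[i] for i, row in enumerate(X) if row[f] <= t]
            right = [y[i] for i, row in enumerate(X) if row[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

def build_tree(X, y, m, rng, depth=0, max_depth=5):
    """Grow a tree; internal nodes are tuples, leaves are class labels."""
    split = None
    if len(set(y)) > 1 and depth < max_depth:
        split = best_split(X, y, m, rng)
    if split is None:
        return Counter(y).most_common(1)[0][0]  # leaf: majority class
    _, f, t = split
    left = [i for i, row in enumerate(X) if row[f] <= t]
    right = [i for i, row in enumerate(X) if row[f] > t]
    return (
        f, t,
        build_tree([X[i] for i in left], [y[i] for i in left], m, rng, depth + 1, max_depth),
        build_tree([X[i] for i in right], [y[i] for i in right], m, rng, depth + 1, max_depth),
    )

def predict_tree(node, row):
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def random_forest(X, y, n_trees=25, seed=0):
    rng = random.Random(seed)
    m = max(1, int(math.sqrt(len(X[0]))))  # rule of thumb: m = sqrt(p)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]  # bootstrap sample (with replacement)
        trees.append(build_tree([X[i] for i in idx], [y[i] for i in idx], m, rng))
    return trees

def predict_forest(trees, row):
    """Majority vote over the predictions of all trees."""
    return Counter(predict_tree(t, row) for t in trees).most_common(1)[0][0]

# Hypothetical toy data: two well-separated classes, p = 4 predictors.
X = [[1, 2, 1, 0], [2, 1, 0, 1], [1, 1, 2, 2], [2, 2, 1, 1],
     [8, 9, 8, 9], [9, 8, 9, 8], [8, 8, 9, 9], [9, 9, 8, 8]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

forest = random_forest(X, y)
print(predict_forest(forest, [1, 1, 0, 0]), predict_forest(forest, [9, 9, 9, 8]))
```

In practice a library implementation would normally be used instead; the sketch only makes the bootstrap, feature-subsampling, and voting steps visible.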

### For the given points, how will you calculate the Euclidean distance in Python?

The Euclidean distance can be calculated as follows:

    from math import sqrt
    euclidean_distance = sqrt((plot1[0] - plot2[0])**2 + (plot1[1] - plot2[1])**2)
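
Because the points themselves are not shown here, the following runnable sketch uses two hypothetical 2-D points for `plot1` and `plot2`. It applies the formula directly and, as a cross-check, calls `math.dist`, which is available in the standard library from Python 3.8.

```python
import math

plot1 = [1, 3]  # hypothetical point (x, y)
plot2 = [4, 7]  # hypothetical point (x, y)

# Direct application of the formula: square root of the summed squared differences.
manual = math.sqrt((plot1[0] - plot2[0]) ** 2 + (plot1[1] - plot2[1]) ** 2)

# Standard-library equivalent (Python 3.8+); works for any number of dimensions.
builtin = math.dist(plot1, plot2)

print(manual, builtin)
```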


### How can we select an appropriate value of k in k-means?

Selecting the correct value of k is an important aspect of k-means clustering. We can use the elbow method to pick an appropriate k. To do this, we run the k-means algorithm over a range of values, e.g., 1 to 15, and compute a score for each value of k. This score is called inertia, or the within-cluster sum of squares.

Inertia is the sum of the squared distances between each point and the centroid of its cluster. As k increases from a low value, the inertia first drops sharply. Beyond a certain value of k, the drop in inertia becomes quite small. That point, the "elbow" of the curve, is the value of k to choose for the k-means clustering algorithm.
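
The elbow method can be sketched with a basic k-means (Lloyd's algorithm) written in plain Python. Everything here is an assumption made for illustration: the function names, the synthetic data with three clusters, the fixed number of iterations, and the single random initialisation per k. Library implementations typically use several restarts, which makes the inertia curve smoother.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_inertia(points, k, n_iter=20, seed=0):
    """Run a basic Lloyd's k-means and return the final inertia."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise with k distinct data points
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(coord) / len(cl) for coord in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    # Inertia: sum of squared distances from each point to its nearest centroid.
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)

# Hypothetical data: three Gaussian blobs of 30 points each.
data_rng = random.Random(42)
centers = [(0, 0), (10, 10), (0, 10)]
points = [
    (cx + data_rng.gauss(0, 1), cy + data_rng.gauss(0, 1))
    for cx, cy in centers
    for _ in range(30)
]

inertias = {k: kmeans_inertia(points, k) for k in range(1, 8)}
for k, value in inertias.items():
    print(k, round(value, 1))
```

Plotting `inertias` against k and looking for the point where the curve flattens gives the elbow; for data like this, the bend would be expected near k = 3.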