# Ancestry Data Science Intern Interview Questions

Ancestry is a web-based platform that helps users create their own family tree and preserve their family history. The company offers a variety of services, including DNA testing and analysis, to help people connect with their ancestors and learn more about their family history.

If you’re interested in working for Ancestry, you’ll need to be prepared to answer some specific interview questions about the company and its services. In this article, we’ll give you some tips on how to answer Ancestry interview questions so you can make a good impression on the hiring manager.

The interview process at Ancestry can vary depending on the position you are applying for. For some positions, such as Senior Software Engineer or Product Manager, the process may include multiple rounds of interviews with different members of the team. For other positions, such as Customer Service Representative or Administrative Assistant, the process may be shorter and only include one or two interviews. Overall, the interview process is generally fairly lengthy, taking anywhere from a few weeks to a few months to complete.

### Your organization has a website where visitors randomly receive one of two coupons. It is also possible that visitors to the website will not receive a coupon. You have been asked to determine if offering a coupon to website visitors has any impact on their purchase decisions. Which analysis method should you use?

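• A. One-way ANOVA
• B. K-means clustering
• C. Association rules
• D. Student's t-test

The answer is A: One-way ANOVA. There are three groups to compare (first coupon, second coupon, no coupon), and one-way ANOVA tests whether the mean outcome differs across three or more groups, whereas a t-test compares only two.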
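To make the one-way ANOVA answer more concrete, here is a minimal sketch that computes the F statistic by hand using only the Python standard library. The purchase amounts and the names `no_coupon`, `coupon_a`, `coupon_b`, and `one_way_anova_f` are made up for illustration; they are not from Ancestry.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of samples."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Between-group sum of squares: distance of each group mean from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of values around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical purchase amounts per visitor (illustrative numbers only).
no_coupon = [1.0, 2.0, 3.0]
coupon_a = [2.0, 3.0, 4.0]
coupon_b = [5.0, 6.0, 7.0]

f_stat = one_way_anova_f([no_coupon, coupon_a, coupon_b])
print(f"F = {f_stat:.2f}")  # prints F = 13.00
```

A large F value suggests that at least one group mean differs from the others. Turning F into a p-value requires the F distribution; in practice a library routine such as `scipy.stats.f_oneway` would likely be used instead of this hand-rolled version.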

### How do you find RMSE and MSE in a linear regression model?

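RMSE and MSE are two of the most common measures of accuracy for a linear regression model.

MSE stands for Mean Square Error: the average of the squared differences between the actual values $y_i$ and the predicted values $\hat{y}_i$, that is, $\text{MSE} = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2$.

RMSE stands for Root Mean Square Error: the square root of the MSE, $\text{RMSE} = \sqrt{\text{MSE}}$, which is expressed in the same units as the target variable.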
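As a small illustration of how the two metrics can be computed, the sketch below uses plain Python. The values in `y_true` and `y_pred` are invented for the example; in practice they would be the actual targets and the predictions of a fitted regression model.

```python
import math

def mse(y_true, y_pred):
    """Mean square error: the average of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error: the square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))

# Made-up actual values and model predictions.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 11.0]

print(mse(y_true, y_pred))            # prints 1.5
print(f"{rmse(y_true, y_pred):.4f}")  # prints 1.2247
```

Because the errors are squared before averaging, both metrics penalize large errors more heavily than small ones.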

### How is logistic regression done?

Logistic regression measures the relationship between the dependent variable (our label of what we want to predict) and one or more independent variables (our features) by estimating probability using its underlying logistic function (sigmoid).

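The sigmoid function maps any real-valued input $z$ to a probability between 0 and 1: $\sigma(z) = \frac{1}{1 + e^{-z}}$. Its graph is an S-shaped curve that approaches 0 for large negative $z$, approaches 1 for large positive $z$, and crosses 0.5 at $z = 0$.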
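The idea can be sketched in a few lines of plain Python: a sigmoid function plus a simple gradient-descent fit of one weight and one intercept. This is only an illustrative sketch under assumed settings. The data (`hours`, `passed`), the learning rate, and the number of epochs are all hypothetical, and a real project would more likely rely on an established library.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit p(y=1 | x) = sigmoid(w * x + b) by gradient descent on the log-loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradient of the average log-loss with respect to w and b.
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum((sigmoid(w * x + b) - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical feature (hours studied) and binary label (passed or not).
hours = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
passed = [0, 0, 1, 0, 1, 1]

w, b = fit_logistic(hours, passed)
p_low = sigmoid(w * 0.5 + b)   # estimated probability for a small feature value
p_high = sigmoid(w * 4.5 + b)  # estimated probability for a large feature value
print(p_low, p_high)
```

The fitted model outputs a probability, and a threshold (commonly 0.5) turns that probability into a class prediction.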

### What is a bias-variance trade-off?

Bias: Bias is the error introduced when a machine learning algorithm is too simple for the problem. The model makes oversimplified assumptions during training so that the target function is easier to learn, which can lead to underfitting.

Some popular machine learning algorithms that are low on the bias scale are Support Vector Machines (SVM), K-Nearest Neighbors (KNN), and Decision Trees.

Algorithms that are high on the bias scale include Linear Regression and Logistic Regression.

Variance: Variance is the error introduced when a machine learning algorithm is too complex. The model learns even the noise in the training data set, so it performs poorly on a test data set. High variance leads to overfitting and makes the model overly sensitive to small changes in the training data.

To reduce bias, we typically increase the complexity of the machine learning algorithm. This helps up to a point, but beyond it the model starts to overfit, which results in hyper-sensitivity and high variance.

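Bias-Variance trade-off: To achieve the best performance, a supervised machine learning algorithm should have both low bias and low variance. Because reducing one tends to increase the other, the goal is to find the level of model complexity that balances the two.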
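One way to see the two extremes is with a tiny K-Nearest Neighbors regressor written in plain Python. This is a sketch on made-up numbers, and the helper names `knn_predict` and `knn_mse` are hypothetical. With K = 1 the model memorizes the training data (zero training error, the high-variance extreme); with K equal to the number of training points it always predicts the overall mean (the high-bias extreme).

```python
def knn_predict(train_x, train_y, query, k):
    """Average the targets of the k training points closest to query."""
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / k

def knn_mse(xs, ys, train_x, train_y, k):
    """Mean square error of k-NN predictions on the points (xs, ys)."""
    errors = [(knn_predict(train_x, train_y, x, k) - y) ** 2 for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)

# Hypothetical noisy samples of a roughly linear trend.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
train_y = [0.2, 0.9, 2.3, 2.8, 4.1, 5.2]
# Hypothetical held-out points.
test_x = [0.4, 1.6, 2.4, 3.6, 4.4]
test_y = [0.4, 1.6, 2.4, 3.6, 4.4]

for k in (1, 3, 6):
    train_err = knn_mse(train_x, train_y, train_x, train_y, k)
    test_err = knn_mse(test_x, test_y, train_x, train_y, k)
    print(f"k={k}: train MSE={train_err:.3f}, held-out MSE={test_err:.3f}")
```

On a sample this small the held-out numbers are only illustrative, but the training error shows the pattern: it is zero at K = 1 and grows as K increases and the model becomes smoother.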

The following is observed for some popular machine learning algorithms:

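• The Support Vector Machine (SVM) algorithm has high variance and low bias. The trade-off can be changed through the C parameter, which influences the margin violations allowed in the training data set. Allowing more violations decreases variance and increases bias. The direction depends on how C is defined: where C is a budget for violations, increasing C allows more of them, while in libraries such as scikit-learn, where C is a penalty on violations, it is decreasing C that allows more.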
• Like the SVM, the K-Nearest Neighbors (KNN) algorithm has high variance and low bias. To change the trade-off, we can increase the K value, which increases the number of neighbors that influence each prediction and thus increases the model's bias while decreasing its variance.

### What are Markov chains?

A Markov chain is defined by the property that the probability of a state's future depends only on its current state.

Markov chains are a type of stochastic process.

A classic example of a Markov chain is a word recommendation system. The model recommends the next word based only on the immediately preceding word, not on anything before it. It learns from earlier text, used as training data, which words tend to follow each word, and generates recommendations for the current paragraph accordingly.
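The word recommendation idea can be sketched as a first-order Markov chain over words. The example below is a minimal illustration in plain Python; the `corpus` string and the function names `build_transitions` and `recommend` are invented for the example.

```python
from collections import Counter, defaultdict

def build_transitions(text):
    """Count, for each word, which words immediately follow it."""
    words = text.lower().split()
    transitions = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        transitions[current][following] += 1
    return transitions

def recommend(transitions, word):
    """Return the most frequent follower of word, or None if it was never seen."""
    followers = transitions.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Tiny made-up training text.
corpus = "the cat sat on the mat and the cat ran"
transitions = build_transitions(corpus)

print(recommend(transitions, "the"))  # prints cat
```

The recommendation depends only on the current word: "the" is followed by "cat" twice and "mat" once in the training text, so "cat" is suggested regardless of what came earlier in the sentence.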