1 What is your approach to developing a new analytical product as a data engineer?
Hiring managers ask this to understand your role in developing a new product and to evaluate your grasp of the product development cycle. As a data engineer, you shape the final product because you are responsible for building its algorithms and metrics on the correct data.
Your first step would be to understand the outline of the entire product so that you grasp the complete requirements and scope. Your second step would be to look into the details of, and the reasons for, each metric. Think through as many potential issues as you can; doing so helps you build a more robust system with a suitable level of granularity.
2 What happens when Block Scanner detects a corrupted data block?
This is one of the most common interview questions for data engineers. Answer it by listing the steps that follow when the Block Scanner finds a corrupted block of data.
First, the DataNode reports the corrupted block to the NameNode. The NameNode then creates a new replica from an existing healthy replica of the block. The system does not delete the corrupted block until the number of healthy replicas matches the replication factor.
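The order of events can be sketched as a toy simulation. This is not HDFS code: the `NameNode` class, its `report_corrupted` method, and the node names are hypothetical, and the sketch assumes the corrupted replica is removed only once enough healthy copies exist.

```python
from dataclasses import dataclass, field


@dataclass
class NameNode:
    """Toy stand-in for the HDFS NameNode (hypothetical, not the real API)."""
    replication_factor: int
    # block id -> set of DataNode names holding a healthy replica
    healthy: dict = field(default_factory=dict)
    # block id -> set of DataNode names holding a corrupted replica
    corrupted: dict = field(default_factory=dict)

    def report_corrupted(self, block_id, datanode, spare_nodes):
        # Step 1: the DataNode reports its corrupted replica.
        self.healthy[block_id].discard(datanode)
        self.corrupted.setdefault(block_id, set()).add(datanode)

        # Step 2: copy from a healthy replica until the replication factor is met.
        for target in spare_nodes:
            if len(self.healthy[block_id]) >= self.replication_factor:
                break
            if not self.healthy[block_id]:
                break  # no healthy source left to copy from
            self.healthy[block_id].add(target)

        # Step 3: drop the corrupted replica only once enough healthy copies exist.
        if len(self.healthy[block_id]) >= self.replication_factor:
            self.corrupted[block_id].discard(datanode)


nn = NameNode(replication_factor=3, healthy={"blk_1": {"dn1", "dn2", "dn3"}})
nn.report_corrupted("blk_1", "dn2", spare_nodes=["dn4", "dn5"])
print(sorted(nn.healthy["blk_1"]))  # nodes that now hold healthy replicas
print(nn.corrupted["blk_1"])        # corrupted replicas still kept
```

In this run only one spare node is used, because the loop stops as soon as the healthy replica count reaches the replication factor.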
What do you mean by RDBMS?
Ans: RDBMS stands for Relational Database Management System. It is software that lets you store, manage, query, and retrieve data in a relational database. An RDBMS sits between users and the database, and it also carries out administrative tasks such as managing data storage, controlling data access, and assessing database performance.
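As a small illustration of storing, managing, querying, and retrieving data, here is a sketch using SQLite through Python's standard `sqlite3` module. SQLite is chosen only because it needs no server; the `employee` table and its rows are made up for the example.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Store: define a relation (table) and insert rows.
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO employee (name, dept) VALUES (?, ?)",
    [("Asha", "Data"), ("Ben", "Data"), ("Chen", "Sales")],
)

# Manage: update existing data.
conn.execute("UPDATE employee SET dept = ? WHERE name = ?", ("Finance", "Ben"))
conn.commit()

# Query and retrieve: count employees per department.
rows = conn.execute(
    "SELECT dept, COUNT(*) FROM employee GROUP BY dept ORDER BY dept"
).fetchall()
print(rows)
conn.close()
```

A server-based RDBMS would add the administrative side mentioned above (user access, storage management, performance monitoring), which this embedded example does not show.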
10 Compare Extended Stored Procedures and CLR Integration?
Ans:
Extended Stored Procedure | CLR Integration |
Supports functionality that cannot be implemented with T-SQL stored procedures. | Runs managed code with services such as cross-language integration, object lifetime management, code access security, and debugging and profiling support. |
Developers must write complex server-side logic. | Offers a simpler way to write that logic, which can be expressed as table-valued functions. |
Can compromise the integrity of the SQL Server process. | Doesn’t compromise the integrity of the SQL Server process. |
Supported in all versions of SQL Server. | Not supported in older versions of SQL Server. |
Code is written in C/C++. | Code is written in .NET programming languages. |
1 What are the different types of Hypothesis testing?
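Hypothesis testing is the procedure statisticians and scientists use to decide whether to reject a statistical hypothesis. It mainly involves two types of hypotheses:
Null hypothesis: states that there is no relationship between the variables being studied.
Example: There is no association between a patient's BMI and diabetes.
Alternative hypothesis: states that there is some relationship between the variables being studied.
Example: There could be an association between a patient's BMI and diabetes.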
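To make the BMI example concrete, here is a minimal sketch of one way to test it: a two-sided permutation test on the difference in mean BMI between the two groups. The BMI values are made up for illustration and are not real patient data; the 0.05 significance level and the choice of a permutation test are assumptions, not something the question prescribes.

```python
import random
from statistics import mean

# Made-up illustrative BMI values, not real patient data.
bmi_diabetic = [31.2, 29.8, 33.5, 30.1, 34.0, 28.9, 32.4, 35.1]
bmi_non_diabetic = [24.3, 26.1, 22.8, 27.5, 25.0, 23.9, 26.8, 24.6]

observed = mean(bmi_diabetic) - mean(bmi_non_diabetic)

# Under the null hypothesis the group labels are interchangeable, so shuffle
# them repeatedly and count how often a difference at least as large appears.
rng = random.Random(0)  # fixed seed so the run is reproducible
pooled = bmi_diabetic + bmi_non_diabetic
n = len(bmi_diabetic)
n_permutations = 10_000
extreme = 0
for _ in range(n_permutations):
    rng.shuffle(pooled)
    diff = mean(pooled[:n]) - mean(pooled[n:])
    if abs(diff) >= abs(observed):
        extreme += 1

# Add-one correction keeps the estimated p-value strictly above zero.
p_value = (extreme + 1) / (n_permutations + 1)
alpha = 0.05  # assumed significance level
print(f"observed difference = {observed:.2f}, p = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

A small p-value is evidence against the null hypothesis; a large one means only that the data do not rule it out, not that it has been proven.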
1 What is Time Series analysis?
Time series analysis is a statistical procedure that deals with an ordered sequence of values of a variable recorded at equally spaced time intervals. Because the observations are collected at adjacent periods, they are usually correlated with one another. This feature distinguishes time-series data from cross-sectional data.
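The correlation between neighbouring observations can be measured with the lag-1 autocorrelation. The sketch below computes it in plain Python; the `autocorrelation` helper and the `daily_counts` values are hypothetical and made up for illustration.

```python
from statistics import mean


def autocorrelation(series, lag=1):
    """Sample autocorrelation of an equally spaced series at the given lag."""
    m = mean(series)
    # Total variation of the series around its mean.
    denom = sum((x - m) ** 2 for x in series)
    # Co-variation between each value and the value `lag` steps later.
    num = sum(
        (series[t] - m) * (series[t + lag] - m)
        for t in range(len(series) - lag)
    )
    return num / denom


# Made-up daily counts with an upward trend (illustrative only).
daily_counts = [10, 12, 15, 19, 24, 30, 37, 45, 54, 64]
r1 = autocorrelation(daily_counts, lag=1)
print(f"lag-1 autocorrelation = {r1:.3f}")
```

A value well above zero indicates that each observation carries information about the next one, which is what independent cross-sectional samples would not show.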
Below is an example of time-series data on coronavirus cases and its graph.