# Apriori Algorithm Interview Questions

### Conviction

Conviction of a rule can be defined as follows:

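$$\text{conv}(x \Rightarrow y) = \frac{1 - \text{Support}(y)}{1 - \text{conf}(x \Rightarrow y)}$$

where $\text{conf}(x \Rightarrow y) = \text{Support}(x \cup y) / \text{Support}(x)$ is the confidence of the rule. Using the six transactions of the market basket example in the next section:

$$\text{conv}(\{\text{wine}, \text{chips}\} \Rightarrow \{\text{bread}\}) = \frac{1 - \text{Support}(\{\text{bread}\})}{1 - \text{conf}(\{\text{wine}, \text{chips}\} \Rightarrow \{\text{bread}\})} = \frac{1 - 4/6}{1 - 2/3} = 1$$

Its value range is [0, +∞]; conviction is infinite when the confidence of the rule is 1.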
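As a rough illustration, conviction can be computed directly from a list of transactions. The sketch below assumes the standard definition, (1 - Support(y)) / (1 - conf(x => y)), and uses the six transactions of the market basket example from the next section. The helper names `support`, `confidence`, and `conviction` are made up for this example and do not come from any library.

```python
# The six example transactions, written as sets of items.
transactions = [
    {"wine", "chips", "bread", "milk"},
    {"wine", "bread", "milk"},
    {"bread", "milk"},
    {"chips"},
    {"wine", "chips", "bread", "milk"},
    {"wine", "chips", "milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Support(x and y together) / Support(x); assumes Support(x) > 0."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


def conviction(antecedent, consequent, transactions):
    """(1 - Support(y)) / (1 - conf(x => y)); infinite when conf is 1."""
    conf = confidence(antecedent, consequent, transactions)
    if conf == 1.0:
        return float("inf")
    return (1 - support(consequent, transactions)) / (1 - conf)


print(conviction({"wine", "chips"}, {"bread"}, transactions))
```

For the rule {wine, chips} => {bread}, Support({bread}) is 4/6 and the confidence is 2/3, so the conviction works out to 1.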

• conv(x => y) = 1 means that x and y are independent, that is, x has no relation with y.
• The greater the conviction, the more interesting the rule.

Now that we know how to identify interesting rules, let us go back to the example. Before we get started, let us fix the support threshold at 50 per cent.


## How Does the Apriori Algorithm Work?

The key concept in the Apriori algorithm is the downward-closure property: every subset of a frequent itemset must itself be frequent. Equivalently, if an itemset is infrequent, all of its supersets must also be infrequent, so the algorithm never needs to count them.
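The property above is what lets the algorithm search level by level: frequent itemsets of size k - 1 are joined into candidates of size k, and any candidate with an infrequent subset is discarded before it is counted. The following is a minimal sketch of that idea in plain Python, not a tuned implementation. The function name `apriori` and its parameters are assumptions made for illustration, and the data is the six-transaction market basket example used in this section.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for all itemsets with support >= min_support."""
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    all_items = {item for t in transactions for item in t}
    current = {}
    for item in all_items:
        candidate = frozenset([item])
        s = support(candidate)
        if s >= min_support:
            current[candidate] = s

    frequent = {}
    k = 1
    while current:
        frequent.update(current)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: drop candidates that have an infrequent (k-1)-subset.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        # Count step: only the surviving candidates are checked against the data.
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
    return frequent


transactions = [
    {"wine", "chips", "bread", "milk"},
    {"wine", "bread", "milk"},
    {"bread", "milk"},
    {"chips"},
    {"wine", "chips", "bread", "milk"},
    {"wine", "chips", "milk"},
]

for itemset, s in sorted(apriori(transactions, 0.5).items(), key=lambda kv: len(kv[0])):
    print(sorted(itemset), round(s, 2))
```

With a 50 per cent threshold, {chips, bread} is infrequent (2 of 6 transactions), so {wine, chips, bread} and {chips, bread, milk} are pruned without ever being counted.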


Let us try to understand how the Apriori algorithm works with the help of a well-known business scenario: market basket analysis.

Here is a dataset consisting of six transactions in an hour. Each transaction is a combination of 0s and 1s, where 0 represents the absence of an item and 1 represents the presence of it.

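| Transaction ID | Wine | Chips | Bread | Milk |
|---|---|---|---|---|
| 1 | 1 | 1 | 1 | 1 |
| 2 | 1 | 0 | 1 | 1 |
| 3 | 0 | 0 | 1 | 1 |
| 4 | 0 | 1 | 0 | 0 |
| 5 | 1 | 1 | 1 | 1 |
| 6 | 1 | 1 | 0 | 1 |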

We can derive many rules from this scenario. For example, from a transaction containing wine, chips, and bread, we can form the rule {wine, chips} => {bread}: if wine and chips are bought, then customers also buy bread.

In order to select the interesting rules out of multiple possible rules from this small business scenario, we will be using the following measures:

## Using the famous Apriori algorithm in Python to do frequent itemset mining for basket analysis

In this article, you’ll learn everything you need to know about the Apriori algorithm. The Apriori algorithm can be considered the foundational algorithm in basket analysis. Basket analysis is the study of a client’s basket while shopping.

The goal is to find combinations of products that are often bought together, which we call frequent itemsets. The technical term for the domain is Frequent Itemset Mining.

Basket analysis is not the only type of analysis in which we use frequent itemsets and the Apriori algorithm. In theory, the algorithm can be used for any topic in which you want to study frequent itemsets.

Although I want to keep this article more applied than technical, it is important to understand the basics underlying the Apriori algorithm.

It is important to notice here that there are multiple things to take into account:

• How can you find frequent itemsets in a dataset?
• How can you find frequent itemsets in a dataset efficiently?
### Support

The support of an item x is the ratio of the number of transactions in which x appears to the total number of transactions.

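$$\text{Support}(\text{wine}) = \frac{\text{number of transactions containing wine}}{\text{total number of transactions}}$$

In the six-transaction market basket example, wine appears in four transactions:

$$\text{Support}(\text{wine}) = \frac{4}{6} \approx 0.67$$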
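The same ratio can be computed for every item at once. The sketch below assumes the data is stored as 0/1 rows, as in the six-transaction market basket example; the names `items`, `rows`, and `item_support` are chosen here for illustration only.

```python
# Column order of the 0/1 rows below.
items = ("wine", "chips", "bread", "milk")

# One row per transaction: 1 means the item was bought, 0 means it was not.
rows = [
    (1, 1, 1, 1),
    (1, 0, 1, 1),
    (0, 0, 1, 1),
    (0, 1, 0, 0),
    (1, 1, 1, 1),
    (1, 1, 0, 1),
]


def item_support(rows, items):
    """Support of each single item: its column sum divided by the row count."""
    n = len(rows)
    return {item: sum(row[i] for row in rows) / n for i, item in enumerate(items)}


supports = item_support(rows, items)
for item, value in supports.items():
    print(f"Support({item}) = {value:.2f}")
```

Wine, chips, and bread each appear in four of the six transactions and milk in five, so all four items clear a 50 per cent support threshold.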

## How to organize your data for the Apriori algorithm?

Let’s start at the beginning: you have a data set in which customers are buying multiple products. Your goal is to find out which combinations of products are frequently bought together.

You need to organize the data in such a way that you have a set of products on each line. Each of those sets contains products that were bought in the same transaction.
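As one possible way to get there, suppose the raw data is a purchase log with one (transaction ID, product) pair per line. That layout, the sample records, and the helper name `group_into_baskets` are assumptions made for this sketch; your own data may be stored differently.

```python
from collections import defaultdict

# Hypothetical purchase log: one (transaction_id, product) pair per purchased item.
purchases = [
    (1, "wine"), (1, "chips"), (1, "bread"), (1, "milk"),
    (2, "wine"), (2, "bread"), (2, "milk"),
    (3, "bread"), (3, "milk"),
    (4, "chips"),
]


def group_into_baskets(purchases):
    """Collect the products of each transaction into one set per transaction."""
    baskets = defaultdict(set)
    for transaction_id, product in purchases:
        baskets[transaction_id].add(product)
    return dict(baskets)


baskets = group_into_baskets(purchases)
transactions = list(baskets.values())  # one set of products per transaction

for transaction_id, products in baskets.items():
    print(transaction_id, sorted(products))
```

Using sets means a product bought twice in the same transaction is counted once, which is usually what frequent itemset mining expects.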

The most basic solution would be to loop through all the transactions and inside the transactions loop through all the combinations of products and count them. Unfortunately, this is going to take way too much time, so we need something better.
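To see why the basic approach does not scale, here is a small sketch of it, using the six transactions of the market basket example. The function name `count_all_itemsets` is hypothetical. A transaction with n products produces 2^n - 1 combinations, so the work grows exponentially with basket size.

```python
from collections import Counter
from itertools import combinations


def count_all_itemsets(transactions):
    """Count every non-empty combination of products in every transaction."""
    counts = Counter()
    for transaction in transactions:
        products = sorted(transaction)  # fixed order so each combination has one key
        for size in range(1, len(products) + 1):
            for combo in combinations(products, size):
                counts[combo] += 1
    return counts


transactions = [
    {"wine", "chips", "bread", "milk"},
    {"wine", "bread", "milk"},
    {"bread", "milk"},
    {"chips"},
    {"wine", "chips", "bread", "milk"},
    {"wine", "chips", "milk"},
]

counts = count_all_itemsets(transactions)
print(counts[("chips", "wine")])  # transactions containing both chips and wine
print(len(counts))                # number of distinct itemsets that were counted
```

A basket of four products yields only 15 combinations, but a basket of 30 products would yield over a billion, which is why a smarter search is needed.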

Two scientists, Agrawal and Srikant, were the first to propose a solution to this in their 1994 paper, Fast Algorithms for Mining Association Rules. Their first solution is the famous Apriori algorithm.