Testing the Efficacy of Your Website’s New Feature (Part 2)
Technically known as A|B testing or hypothesis testing
Following up on Part 1, I’m now going to walk through our simple A|B test using Python!
Here is an example of what our website’s data looks like:
import pandas as pd
import numpy as np
df = pd.read_csv('ad_actions.csv')
df.tail()

Our example dataset indicates testing was done on the control group and experiment group at the same time (no historical data), which is not a problem!
It’s necessary to explore the data a bit first. This is to make sure there aren’t any anomalies or noise in the collected data that would have to be dealt with or adjusted for.
print(df.group.value_counts(),'\n')
print(df.action.value_counts())
#####
control 4264
experiment 3924
Name: group, dtype: int64

view 6328
click 1860
Name: action, dtype: int64
So there are 6328 total views of our ads and 1860 total clicks.
What about the number of viewers who didn’t click? Let’s collect the unique user ids for each action into sets and make the simple calculation with a set difference:
click_ids = set(df[df.action=='click']['id'].unique())
view_ids = set(df[df.action=='view']['id'].unique())
print("Number of viewers: {} \t Number of clickers: {}".format(len(view_ids), len(click_ids)))
print("Number of viewers who didn't click: {}".format(len(view_ids-click_ids)))
#####
Number of viewers: 6328 Number of clickers: 1860
Number of viewers who didn't click: 4468
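Beyond these counts, a few extra sanity checks can help catch the kind of anomalies mentioned earlier. The following is only a sketch on a made-up toy frame, assuming the same `id`, `group` and `action` columns as our dataset. The helper name `sanity_report` and the three checks it runs are illustrative choices, not part of the original analysis.

```python
import pandas as pd

def sanity_report(df):
    """Count three kinds of anomalies in an (id, group, action) event log."""
    # Same user logged with the same action more than once
    duplicate_events = int(df.duplicated(subset=['id', 'action']).sum())
    # Users that appear in both the control and the experiment group
    groups_per_user = df.groupby('id')['group'].nunique()
    users_in_both_groups = int((groups_per_user > 1).sum())
    # Users with a recorded click but no recorded view
    click_ids = set(df.loc[df.action == 'click', 'id'])
    view_ids = set(df.loc[df.action == 'view', 'id'])
    clicks_without_view = len(click_ids - view_ids)
    return {
        'duplicate_events': duplicate_events,
        'users_in_both_groups': users_in_both_groups,
        'clicks_without_view': clicks_without_view,
    }

# Toy data (hypothetical) with one of each problem deliberately planted
toy = pd.DataFrame({
    'id': [1, 1, 2, 3, 3, 4, 4],
    'group': ['control', 'control', 'experiment', 'control',
              'experiment', 'experiment', 'experiment'],
    'action': ['view', 'click', 'click', 'view', 'view', 'view', 'view'],
})

report = sanity_report(toy)
print(report)
```

On the real data you would call `sanity_report(df)` and hope for zeros across the board. The duplicate check matters in practice because pandas' `pivot` raises an error when the same index/column pair appears more than once.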
Ok, now we want to run the test on our data. First, we will assign a count of 1 to each click and view row so that we can convert them to binary values: 1 if the user clicked the ad, 0 if they didn’t click.
df['count'] = 1
df.head(20)

Control group binary frame:
control = df[df.group == 'control'].pivot(index='id', columns='action', values='count')
control = control.fillna(value=0)
control.head()

Notice that users who clicked had to view the homepage, so there is a view count for every user.
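To see what the pivot and fill steps do, here is a minimal sketch on a tiny made-up frame. The ids and rows are invented for illustration; only the column names and the `pivot` / `fillna` calls mirror the code above.

```python
import pandas as pd

# Three hypothetical users: 10 and 12 viewed and clicked, 11 only viewed
toy = pd.DataFrame({
    'id': [10, 10, 11, 12, 12],
    'action': ['view', 'click', 'view', 'view', 'click'],
})
toy['count'] = 1

# One row per user, one column per action; a missing action becomes NaN
wide = toy.pivot(index='id', columns='action', values='count')

# Replace NaN with 0 so each cell is a 1/0 indicator
binary = wide.fillna(value=0)
print(binary)
```

User 11 has no click row, so the pivot leaves that cell empty and `fillna` turns it into 0. Every user still has a 1 in the `view` column.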
Experimental group binary frame:
experiment = df[df.group == 'experiment'].pivot(index='id', columns='action', values='count')
experiment = experiment.fillna(value=0)
experiment.head()

Now that we’ve converted clicks and views into binary variables, let’s get the average click-through-rate.
print('Sample sizes: Control:{} Experiment:{}'.format(len(control), len(experiment)))
print('Total clicks: Control:{} Experiment:{}'.format(control.click.sum(), experiment.click.sum()))
print('Average click-through-rate: Control:{} Experiment:{}'.format(control.click.mean(), experiment.click.mean()))
#####
Sample sizes: Control:3332 Experiment:2996
Total clicks: Control:932.0 Experiment:928.0
Average click-through-rate: Control:0.2797118847539016 Experiment:0.3097463284379172
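Because the click column is binary, its mean is simply total clicks divided by the number of users. The short sketch below redoes that arithmetic from the counts printed above and also computes the gap between the two groups. The variable names and the relative-lift figure are additions for illustration, not part of the original walkthrough.

```python
# Counts taken from the printed output above
control_clicks, control_users = 932, 3332
experiment_clicks, experiment_users = 928, 2996

# Mean of a 0/1 column equals clicks divided by users
control_ctr = control_clicks / control_users
experiment_ctr = experiment_clicks / experiment_users

# Gap between the groups, in absolute terms and relative to control
absolute_diff = experiment_ctr - control_ctr
relative_lift = absolute_diff / control_ctr

print('Control CTR: {:.4f}'.format(control_ctr))
print('Experiment CTR: {:.4f}'.format(experiment_ctr))
print('Absolute difference: {:.4f}'.format(absolute_diff))
print('Relative lift: {:.1%}'.format(relative_lift))
```

The experiment group's click-through rate comes out roughly 3 percentage points higher than the control group's. Whether a gap of that size could have arisen by chance is a question these numbers alone do not answer.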
Stay tuned for part 3, where we will calculate the p-value. Coming soon!