Testing the Efficacy of Your Website’s New Feature (Part 2)

Technically known as A|B testing or hypothesis testing

Photo by Markus Winkler on Unsplash

Following up on Part 1, I’m now going to walk through our simple A|B test using Python!

Here is an example of what our website’s data looks like:

import pandas as pd
import numpy as np
df = pd.read_csv('ad_actions.csv')
df.tail()

Our example dataset indicates testing was done on the control group and experiment group at the same time (no historical data), which is not a problem!

It’s necessary to explore the data a bit first. This is to ensure there aren’t any anomalies or noise in your collected data, which would have to be dealt with or adjusted for.

print(df.group.value_counts(),'\n')
print(df.action.value_counts())
#####
control 4264
experiment 3924
Name: group, dtype: int64
view 6328
click 1860
Name: action, dtype: int64

So there are 6328 total views of our ads and 1860 total clicks.
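
Beyond the raw counts, a few integrity checks are worth running before going further. Here is a minimal sketch on a made-up frame, assuming the same id, group and action columns as our dataset. The toy rows and the integrity_report helper are my own illustration, not part of ad_actions.csv:

```python
import pandas as pd

# Hypothetical stand-in for ad_actions.csv; column names follow the article.
toy = pd.DataFrame({
    "id":     [1, 1, 2, 3, 3, 4],
    "group":  ["control", "control", "control",
               "experiment", "experiment", "experiment"],
    "action": ["view", "click", "view", "view", "click", "view"],
})

def integrity_report(frame):
    """Count common problems in an (id, group, action) event log."""
    groups_per_id = frame.groupby("id")["group"].nunique()
    return {
        # Any NaN anywhere in the frame
        "missing_values": int(frame.isna().sum().sum()),
        # The same user logging the same action more than once
        "duplicate_id_action_rows": int(
            frame.duplicated(subset=["id", "action"]).sum()
        ),
        # Users who ended up in both control and experiment
        "ids_in_both_groups": int((groups_per_id > 1).sum()),
    }

report = integrity_report(toy)
print(report)
```

If any of these counts came back non-zero on the real data, that would be the kind of anomaly to deal with before testing.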

What about the number of viewers who didn’t click? Let’s collect the unique IDs of viewers and clickers into sets and take the difference:

click_ids = set(df[df.action=='click']['id'].unique())
view_ids = set(df[df.action=='view']['id'].unique())
print("Number of viewers: {} \t Number of clickers: {}".format(len(view_ids), len(click_ids)))
print("Number of viewers who didn't click: {}".format(len(view_ids-click_ids)))
#####
Number of viewers: 6328 Number of clickers: 1860
Number of viewers who didn't click: 4468
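
A quick check on the arithmetic: 6328 - 1860 = 4468, so the set difference matches a plain subtraction. That only happens when every clicker is also in the viewer set. Here is a minimal sketch of that check on made-up IDs (the toy_ names and values are my own, not from the dataset):

```python
# Toy ID sets (hypothetical values) standing in for view_ids and click_ids.
toy_view_ids = {101, 102, 103, 104}
toy_click_ids = {102, 104}

# Clickers who never registered a view; ideally this set is empty.
clickers_without_view = toy_click_ids - toy_view_ids

# Subset test: True when every clicker is also a viewer.
all_clickers_viewed = toy_click_ids <= toy_view_ids

# When the subset test passes, the difference has exactly
# len(viewers) - len(clickers) elements.
toy_non_clickers = toy_view_ids - toy_click_ids
print(all_clickers_viewed, len(toy_non_clickers))
```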

OK, now we want to get our data ready for the test. First, we will assign a count to each click and view, then convert them to binary values: 1 if the user clicked the ad, 0 if they didn’t.

df['count'] = 1
df.head(20)

Control group binary frame:

control = df[df.group == 'control'].pivot(index='id', columns='action', values='count')
control = control.fillna(value=0)
control.head()
control group

Notice that users who clicked had to view the homepage, so there is a view count for every user.
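
To see what the pivot and fillna steps are doing, here is a small sketch on a made-up event log for three users. The toy frame is hypothetical; only the column names match our data:

```python
import pandas as pd

# Hypothetical event log: users 1 and 3 viewed and clicked, user 2 only viewed.
toy = pd.DataFrame({
    "id":     [1, 1, 2, 3, 3],
    "action": ["view", "click", "view", "view", "click"],
})
toy["count"] = 1

# One row per user, one column per action.
# Actions a user never took come out as NaN, which fillna turns into 0.
binary = toy.pivot(index="id", columns="action", values="count").fillna(0)
print(binary)

# Sanity check mirroring the note above: every user should have a view.
every_user_viewed = bool((binary["view"] == 1).all())
print("Every user has a view:", every_user_viewed)
```

One thing to keep in mind: pivot raises a ValueError if the same id and action pair appears more than once, so duplicate rows would need to be handled first.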

Experimental group binary frame:

experiment = df[df.group == 'experiment'].pivot(index='id', columns='action', values='count')
experiment = experiment.fillna(value=0)
experiment.head()
experimental group

Now that we’ve converted clicks and views into binary variables, let’s get the average click-through-rate.

print('Sample sizes: Control:{}  Experiment:{}'.format(len(control), len(experiment)))
print('Total clicks: Control:{} Experiment:{}'.format(control.click.sum(), experiment.click.sum()))
print('Average click-through-rate: Control:{} Experiment:{}'.format(control.click.mean(), experiment.click.mean()))
#####
Sample sizes: Control:3332 Experiment:2996
Total clicks: Control:932.0 Experiment:928.0
Average click-through-rate: Control:0.2797118847539016 Experiment:0.3097463284379172
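
To put those two rates side by side, here is a short sketch that recomputes them from the totals printed above and expresses the gap both as an absolute difference and as a relative lift. The variable names are my own. The gap works out to about 3 percentage points, or roughly 10.7% relative to control:

```python
# Totals copied from the printed output above.
control_clicks, control_n = 932, 3332
experiment_clicks, experiment_n = 928, 2996

control_ctr = control_clicks / control_n
experiment_ctr = experiment_clicks / experiment_n

# Difference in proportions (experiment minus control)
absolute_diff = experiment_ctr - control_ctr
# The same difference expressed relative to the control rate
relative_lift = absolute_diff / control_ctr

print("Absolute difference: {:.4f}".format(absolute_diff))
print("Relative lift: {:.1%}".format(relative_lift))
```

Whether a gap of this size could be down to chance is a separate question, and these numbers alone don't answer it.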

Stay tuned for part 3, where we will calculate the p-value. Coming soon!
