Python Challenges, Episode 6

A coding challenge walkthrough, number 6.

Photo by Brett Jordan on Unsplash

I’ll be posting a few of these individually, as they are basically required practice if you are an entry-level data analyst or data scientist looking for your first position.

Think from the hiring manager’s perspective. They see you might have some projects and coding experience, but that doesn’t completely cut it for them. As one respectable senior data scientist once asked me, who will the hiring manager choose: the one who can code the right answers after scouring Stack Overflow and Google, or the one who can code the answer just from the knowledge in their head? I think you know the answer to that.

These questions might seem mindless and unreasonably challenging at times, but your answers to them demonstrate the skills in your head. And once you start to get the right answers, you will feel good. How you approach these problems is also key to completing them.

Here is a question on HackerRank (medium difficulty) that is a decent example of something you might see in a technical interview or coding challenge:

You are given an integer N followed by N email addresses. Your task is to finish a function which will print a list containing only valid email addresses in lexicographical order.

Valid email addresses must follow these rules:

It must have the username@websitename.extension format type.

The username can only contain letters, digits, dashes and underscores: [a-z], [A-Z], [0-9], [_-].

The website name can only have letters and digits: [a-z], [A-Z], [0-9].

The extension can only contain letters: [a-z], [A-Z].

The maximum length of the extension is 3.

The proper way to attack these problems is to think through them before you really even start coding. It helps to reread the problem slowly and break your solution down into steps based on what you are reading.

Here we input an integer, say 3, and then have 3 corresponding email addresses input for validation. The emails have to follow that format, which is sort of a hint to designate three parts for the email addresses: username, website, and extension. So we can assume the first part is splitting the email string into these three parts.

Once we manage to split the email address, we can run these parts through some sort of if/else statement: if any part falls outside the specified criteria, the email address is not valid. We keep only the email addresses that meet the criteria.

To show it first, here is a working solution I’ve glued together from my original code and other users’ discussion code. It passed all of the challenge’s test cases.

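def validate(email):
    # Split the address into its three parts; reject it if there aren't exactly three.
    try:
        check = email.replace(".", "@")
        username, website, extension = check.split("@")
    except ValueError:
        return False

    # Username: letters and digits, plus dashes and underscores.
    if username.replace("-", "").replace("_", "").isalnum() is False:
        return False
    # Website: letters and digits only.
    elif website.isalnum() is False:
        return False
    # Extension: letters only, at most 3 characters.
    elif extension.isalpha() is False or len(extension) > 3:
        return False
    else:
        return True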

Let’s break it down.

Here is a frame for practicing in a notebook:

def validate(email):
    # write function here
    pass


n = int(input())
emails = [input() for _ in range(n)]
valid = list(filter(validate, emails))
print(sorted(valid))

When you run this, it prompts for the integer n first and then for each of the n email addresses.
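If typing into input prompts gets tedious while experimenting, the same filter-and-sort pipeline can be sketched with a hard-coded list. The sample addresses and the stand-in validate below are assumptions made for illustration only; the stand-in is not the real set of checks.

```python
def validate(email):
    # Stand-in predicate for illustration only: it just looks for an "@".
    return "@" in email


# Hypothetical sample input, used in place of the input() calls.
emails = ["lara@hackerrank.com", "not-an-email", "brian-23@hackerrank.com"]

# filter() keeps the items for which validate() returns True.
valid = list(filter(validate, emails))

# sorted() puts the survivors in lexicographical order.
print(sorted(valid))  # ['brian-23@hackerrank.com', 'lara@hackerrank.com']
```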

Now we want to split the email strings. When I did this, I knew I could split a string on a designated value, but couldn’t figure out a way to split a string by two values in the same line. So I came up with a workaround. (I’m getting the feeling that workarounds are a norm in data analytics, so don’t be afraid to get creative at first to get a working solution!)

    try:
        check = email.replace(".", "@")
        username, website, extension = check.split("@")
    except ValueError:
        return False

First I replace the period with an "@" so that the string ends up containing two of them. This way I can use .split("@") and it will split the email at both spots, into three parts!
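To see the trick in action, here is a small sketch run on a single address. The sample address is an assumption picked for illustration.

```python
# Sample address chosen for illustration.
email = "lara@hackerrank.com"

# Swap the period for "@" so a single delimiter separates all three parts.
check = email.replace(".", "@")
print(check)  # lara@hackerrank@com

# Splitting on "@" now yields exactly three pieces to unpack.
username, website, extension = check.split("@")
print(username, website, extension)  # lara hackerrank com
```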

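I wrap this in a try/except statement to stop addresses like “crappyemail@email” from getting through. With no extension, the split produces only two parts, unpacking them into three names raises a ValueError, and we return False.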
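The sketch below isolates the split step to show when the ValueError fires. It also shows a caveat: a few malformed addresses still unpack into exactly three pieces. The helper names split_parts and split_parts_strict are hypothetical, and the stricter variant is one possible tightening under those assumptions, not part of the original solution.

```python
def split_parts(email):
    """Hypothetical helper: the replace-and-split step on its own."""
    try:
        check = email.replace(".", "@")
        username, website, extension = check.split("@")
    except ValueError:
        # Raised when the split yields fewer or more than three pieces.
        return None
    return username, website, extension


print(split_parts("crappyemail@email"))    # None (two pieces)
print(split_parts("a.b@site.com"))         # None (four pieces)
print(split_parts("lara@hackerrank.com"))  # ('lara', 'hackerrank', 'com')

# Caveat: these malformed addresses also split into three pieces.
print(split_parts("us.er@com"))  # ('us', 'er', 'com')
print(split_parts("a@b@com"))    # ('a', 'b', 'com')


def split_parts_strict(email):
    """Hypothetical stricter variant: exactly one "@", then a "." in the domain."""
    if email.count("@") != 1:
        return None
    username, domain = email.split("@")
    # rpartition splits on the last "."; dot is "" when there is none.
    website, dot, extension = domain.rpartition(".")
    if not dot:
        return None
    return username, website, extension


print(split_parts_strict("us.er@com"))            # None
print(split_parts_strict("a@b@com"))              # None
print(split_parts_strict("lara@hackerrank.com"))  # ('lara', 'hackerrank', 'com')
```

Whether the challenge's tests include inputs like these is not something the problem statement settles, so treat the stricter variant as optional.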

After this block, we apply some convenient Python string methods that cover the exact character ranges the problem specifies for each part of the email: [a-z], [A-Z], [0-9]. If we call .isalnum() on a string, it returns a boolean indicating whether the string is “alphanumeric”, meaning it is made up only of letters and digits. (There are additional methods like this too. For example, .isalpha() determines whether the string contains only letters: [a-z], [A-Z].)
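A few quick calls make the behavior of these methods concrete. The sample strings are assumptions for illustration. One thing worth knowing is that both methods also accept non-ASCII letters, which goes slightly beyond the ranges listed in the problem.

```python
# Each pair is (sample string, result of the method call).
alnum_results = {s: s.isalnum() for s in ["abc123", "abc_123", "", "ñ"]}
alpha_results = {s: s.isalpha() for s in ["abc", "abc1"]}

print(alnum_results["abc123"])   # True: letters and digits only
print(alnum_results["abc_123"])  # False: the underscore is not alphanumeric
print(alnum_results[""])         # False: an empty string never passes
print(alnum_results["ñ"])        # True: non-ASCII letters count as well

print(alpha_results["abc"])   # True: letters only
print(alpha_results["abc1"])  # False: contains a digit
```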

For username:

    if username.replace("-", "").replace("_", "").isalnum() is False:
        return False

We can take the dashes and underscores out of the picture by replacing them with empty strings before the check. This builds a new string, so username itself is left unchanged. Then we wrap this in an if statement because we want only strings that satisfy the conditions. If a part fails its check, we return False right away; we return True only at the end, once every check has passed.
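Here is the username check pulled out into its own function so it can be tried on a few names. The helper name username_ok and the sample usernames are assumptions for illustration.

```python
def username_ok(username):
    """Hypothetical helper isolating the username check."""
    # Remove the two extra allowed characters, then test what remains.
    stripped = username.replace("-", "").replace("_", "")
    return stripped.isalnum()


for name in ["brian-23", "britts_54", "bad!name", "---"]:
    print(name, username_ok(name))
# brian-23 True
# britts_54 True
# bad!name False
# --- False
```

Note the last case: a username made only of dashes or underscores is rejected, because nothing is left for .isalnum() to accept once they are stripped.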

For website:

    elif website.isalnum() is False:
        return False

We can basically do the same thing within the same if statement. For website, we don’t want underscores or anything other than [a-z], [A-Z] or [0-9], which is just alphanumeric again, so .isalnum() on its own is enough.

To continue an if statement with another condition, we can use elif.

For the last part extension:

    elif extension.isalpha() is False or len(extension) > 3:
        return False

Notice that in the problem, the extension must contain only letters and also can’t be more than 3 characters long. An address like “crapemail@hotmail.com1” fails on both counts. We can link more than one condition with a simple or after the elif.
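The same rule can be written as a positive test, which is the logical opposite of the elif condition above. The helper name extension_ok and the sample extensions are assumptions for illustration.

```python
def extension_ok(extension):
    """Hypothetical helper: True when the extension passes both rules."""
    # Opposite of: extension.isalpha() is False or len(extension) > 3
    return extension.isalpha() and len(extension) <= 3


for ext in ["com", "io", "com1", "info", ""]:
    print(repr(ext), extension_ok(ext))
# 'com' True
# 'io' True
# 'com1' False
# 'info' False
# '' False
```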

As you can see, we’ve now checked off all the criteria for a valid email address according to the problem. Now all we want to do is simply return True for the emails that have made it through our criteria:

    else:
        return True

That’s it!
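For comparison, the same rules can also be sketched as a single regular expression. This is an alternative technique, not the approach used above. The names EMAIL_PATTERN and validate_regex and the sample addresses are assumptions for illustration. It is ASCII-only and is not guaranteed to agree with the function above on every input (for example, it accepts a username made only of dashes).

```python
import re

# username: letters, digits, dashes, underscores
# website:  letters and digits
# extension: 1 to 3 letters
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9_-]+@[A-Za-z0-9]+\.[A-Za-z]{1,3}")


def validate_regex(email):
    # fullmatch() succeeds only if the whole string fits the pattern.
    return EMAIL_PATTERN.fullmatch(email) is not None


# Hypothetical sample input.
emails = [
    "lara@hackerrank.com",
    "brian-23@hackerrank.com",
    "britts_54@hackerrank.com",
    "us.er@com",
    "harsh@gmail",
]

valid = sorted(filter(validate_regex, emails))
print(valid)
# ['brian-23@hackerrank.com', 'britts_54@hackerrank.com', 'lara@hackerrank.com']
```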

It’s often the case that a seemingly complex challenge can be solved easily with a little extra knowledge of Python’s built-in methods. Often the quickest way to pick these up is to reverse engineer other people’s answers once you have tried for long enough to no avail.

For example, on HackerRank, if you are doing a question and have no idea how to approach it, spend a few minutes thinking about how what you already know in Python could apply. If you can’t at least write out the steps for an approach, don’t feel bad about checking the discussion section of HackerRank! By seeing how people (who are probably more experienced than you if you are reading this (: ) code, you can learn quickly.

I hope this helps you in your journey to becoming comfortable with Python, especially if you have an interview coming up! Practice, practice, practice.

Peace!

I’m a recent Data Science graduate with a B.S. in Environmental Science. Currently seeking job opportunities. Constantly learning!
