An introduction to a fundamental lesson in Data Science.

One of the most important concepts we can know in Data Science is Bayes’ Theorem. Named after the 18th-century British mathematician and Presbyterian minister Thomas Bayes, Bayes’ Theorem is built on conditional probability. It lets us reason our way toward estimates of unknown probabilities. The logic holds because we never know our reality perfectly; we can only improve our knowledge and assumptions when presented with new evidence.
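To make the idea of updating on new evidence concrete, here is a minimal sketch in Python. It applies the standard form of the theorem, P(H | E) = P(E | H) P(H) / P(E). The function name `posterior` is hypothetical and the spam-filter numbers are made up for illustration, not taken from any real dataset.

```python
def posterior(prior, likelihood, false_positive_rate):
    """Return P(H | E) using Bayes' Theorem.

    prior: P(H), belief in the hypothesis before seeing the evidence
    likelihood: P(E | H), chance of the evidence if H is true
    false_positive_rate: P(E | not H), chance of the evidence if H is false
    """
    # Total probability of the evidence, P(E), summed over both cases.
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


# Made-up numbers: 20% of emails are spam, and the word "free" appears
# in 60% of spam emails and in 5% of legitimate emails.
p_spam_given_free = posterior(prior=0.20, likelihood=0.60, false_positive_rate=0.05)
print(round(p_spam_given_free, 3))  # 0.75
```

Under these assumed numbers, seeing the word "free" moves the belief that an email is spam from 20% to 75%.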

Photo by Tim Mossholder on Unsplash

Applications of Bayes’ Theorem can take a variety of forms, from detecting spam emails to detecting…


Another question of the challenge series answered step-by-step

Back to my series of coding challenge questions. I’ll be posting a few of these individually, as they are basically required practice if you are an entry-level data analyst or data scientist looking for your first position.

Think from the hiring manager’s perspective. They see you might have some projects and coding experience, but that doesn’t completely cut it for them. As one respected senior data scientist once asked me, who will the hiring manager choose: the one who can code the right answers after scouring Stack Overflow and Google or…


and why doing these questions is worth your time

If you are like me, a noob and entry-level data scientist, you have probably sighed at the thought of the dreaded coding challenge. There are so many possible questions; how could you know what will be on the test? Data science and analytics knowledge is one thing, but having to prove your skills in your language of choice (mine is Python) is impossible to avoid.

Think from the hiring manager’s perspective. They see you might have some projects and coding experience, but that doesn’t completely cut it for them…


The proper way to program

In addition to relearning various aspects of programming foundations, I’d like to take an opportunity to write about Object-Oriented Programming (OOP). Object-Oriented Programming is a programming paradigm. A programming paradigm is just a way to classify a programming language based on its features.

I’ll just be typing it as OOP from now on, for brevity’s sake.

OOP is based on the concept of classes and objects, which bundle code and data into efficient, reusable blocks. This was a solution for earlier programmers who originally wrote procedural code in long sequences…
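As a small illustration of a class bundling data and code together, here is a minimal sketch. The `Dataset` class and its method names are hypothetical, invented only for this example.

```python
class Dataset:
    """Hypothetical class: holds data (values) plus the code that works on it."""

    def __init__(self, name, values):
        self.name = name
        self.values = list(values)  # copy so each object owns its own data

    def add(self, value):
        self.values.append(value)

    def mean(self):
        return sum(self.values) / len(self.values)


# Each object is a reusable block: same code, separate data.
temps = Dataset("temperatures", [20.0, 22.0, 24.0])
temps.add(26.0)
print(temps.name, temps.mean())  # temperatures 23.0
```

The same class can be reused for any other list of numbers without rewriting the averaging logic, which is the reuse that a long procedural script makes harder.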


Use cases include proper sampling!

As data scientists, we are constantly designing experiments that require surveying a population of something. Oftentimes, however, the population we are testing, whether it be patients with an illness, likely voters in a district, or animal populations, is just too big to test in full. Or testing every member would take more time and/or money than we have.

So what to do in that case? We sample from the population. It sounds simple, but there are important aspects of sampling that must be considered when doing it to ensure…
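One common approach is a simple random sample, where every member of the population has the same chance of being picked. The sketch below uses Python's standard `random` module; the population of 10,000 voter IDs and the sample size of 100 are made-up values for illustration.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: 10,000 voter IDs, too many to survey in full.
population = list(range(10_000))

# Simple random sample without replacement: every member is equally
# likely to be chosen, and nobody is chosen twice.
sample = rng.sample(population, k=100)

print(len(sample), len(set(sample)))  # 100 100
```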


An essential lesson in machine learning fundamentals.

The Bias vs Variance Trade-off is an essential concept to grasp if you want to learn machine learning. Understanding its relation to overfitting and underfitting is necessary to build an accurate machine learning model. It is also often a topic covered during data science interviews, which is a good reason to go over it. Practice makes perfect!

When talking about machine learning, bias and variance refer to measures of prediction error produced by your predictive model. By prediction error, I mean the difference between your model’s predicted values and its…
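To see bias and variance as numbers, here is a rough simulation sketch using only the standard library. Everything in it is an assumption made for illustration: the true function (a sine wave), the noise level, and the two toy models (one that always predicts the training mean, one that copies the nearest training point). It refits each model on many noisy training sets and measures how its prediction at a single point behaves.

```python
import math
import random
import statistics

rng = random.Random(0)
xs = [i / 20 for i in range(21)]  # fixed training inputs on [0, 1]
x0 = 0.25                         # point where we inspect predictions
noise_sd = 0.3                    # assumed noise level


def true_f(x):
    return math.sin(2 * math.pi * x)


def predict_constant(train_y):
    # Very simple model: always predict the mean of the training targets.
    return statistics.fmean(train_y)


def predict_nearest(train_y):
    # Very flexible model: copy the target of the closest training input.
    nearest = min(range(len(xs)), key=lambda i: abs(xs[i] - x0))
    return train_y[nearest]


def bias_sq_and_variance(predict, trials=500):
    preds = []
    for _ in range(trials):
        # Draw a fresh noisy training set and record the prediction at x0.
        train_y = [true_f(x) + rng.gauss(0, noise_sd) for x in xs]
        preds.append(predict(train_y))
    bias_sq = (statistics.fmean(preds) - true_f(x0)) ** 2
    return bias_sq, statistics.pvariance(preds)


const_bias_sq, const_var = bias_sq_and_variance(predict_constant)
nn_bias_sq, nn_var = bias_sq_and_variance(predict_nearest)
print(f"constant model: bias^2={const_bias_sq:.3f}, variance={const_var:.3f}")
print(f"nearest model:  bias^2={nn_bias_sq:.3f}, variance={nn_var:.3f}")
```

In this setup the constant model should show the larger squared bias (underfitting) and the nearest-point model the larger variance (overfitting), which is the trade-off in miniature.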


Avoiding Overfitting in Regression Models

Photo by Michael Cox on Unsplash

In this post, I’m going to cover another very common technical interview question regarding regression that I, myself, could always brush up on:

Describing L1 vs L2 regularization methods in regression modeling.

When working with complex data, we tend to create complex models, and more complexity is not always better. Overly complex models are what we call “overfit”: they perform very well on training data, yet fall short on unseen testing data. This also means high variance and low bias, which I delve into further in another post.

One way to adjust…
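For the L1 vs L2 comparison, here is a minimal sketch of how the two penalties differ. The function names are hypothetical, and the shrinkage formulas assume a simplified one-coefficient setting that I chose for illustration: minimizing 0.5 * (b - z)^2 + lam * |b| for L1 and 0.5 * (b - z)^2 + 0.5 * lam * b^2 for L2, where z is the unpenalized estimate.

```python
def l1_penalty(coefs, lam):
    # L1 (lasso) penalty: lam times the sum of absolute coefficient values.
    return lam * sum(abs(c) for c in coefs)


def l2_penalty(coefs, lam):
    # L2 (ridge) penalty: lam times the sum of squared coefficient values.
    return lam * sum(c * c for c in coefs)


def l1_shrink(z, lam):
    # Soft-thresholding: small coefficients are set exactly to zero.
    if abs(z) <= lam:
        return 0.0
    return z - lam if z > 0 else z + lam


def l2_shrink(z, lam):
    # Proportional shrinkage: coefficients get smaller but stay non-zero.
    return z / (1 + lam)


coefs = [3.0, -0.5, 0.0, 1.5]  # made-up unpenalized coefficients
lam = 1.0

print(l1_penalty(coefs, lam), l2_penalty(coefs, lam))  # 5.0 11.5
print([l1_shrink(c, lam) for c in coefs])              # [2.0, 0.0, 0.0, 0.5]
print([l2_shrink(c, lam) for c in coefs])              # [1.5, -0.25, 0.0, 0.75]
```

The L1 version zeroes out the small coefficient entirely, while the L2 version only scales every coefficient down, which is the usual reason L1 is associated with feature selection.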


AKA Git Squash

Photo by Florian Olivo on Unsplash

As I dive deeper into my job search, I’ve taken up a new data science project to widen my skill set and better align with my career goals. The main goal of this project is to work with satellite imagery and Convolutional Neural Networks. More specifically, I retrieved data from Kaggle and will be classifying cloud types to better understand weather patterns.

I’ve downloaded a dataset that I’ve realized is larger than I am used to working with and have gotten absolutely stuck trying to push it to GitHub!

I have found the issue…

Orin Conn

I’m a recent Data Science graduate with a B.S. in Environmental Science. Currently seeking job opportunities. Constantly learning!
