During the Anti-Trust hearing with Mark Zuckerberg in April 2018, we got to watch Zuckerberg explain the Internet to United States lawmakers. One senator, Orrin Hatch, seemed especially confused by how exactly Facebook managed to buy Zuckerberg his expensive suit. Zuckerberg explained that advertisements form the backbone of Facebook’s revenue. Hatch, however, didn’t seem to catch the drift, as he proceeded to ask, “How do you maintain a business model whereby your users do…

The Coca-Cola Company has embraced the reuse of its bottles and all the environmental and monetary benefits that come with it. When customers buy a Coke in a glass bottle, they are rewarded upon returning the empty bottle. This got me thinking about all the plastic bottles and cans that do not warrant a reward, which leads to them being tossed and wasted. There should be a way of automatically identifying Coca-Cola bottles for reuse within the company.

Coca-Cola bottles are easily discernible by their labels, which carry a large “Coca-Cola” print. The print is…

Lung cancer is the leading cancer killer of both men and women in America. It claims more lives than breast, colon, and prostate cancer combined. A nodule in the lungs is what can indicate its presence.

CT scans of the lungs are what doctors usually use to determine the presence of nodules. Note that several images are taken of a single patient so that a 3D image of the lungs can be formed. The images are of the sagittal, coronal and axial views. Hence for a…

As a recent believer in the works of dimensionality reduction and noise filtering, I would be remiss not to preach the good word of PCA.

Principal Component Analysis, like most other ML models, is self-descriptive. Take it to mean that we are taking the principal (or most important) components of the data that is provided to us. Thus, PCA is essentially a dimensionality reduction algorithm.

PCA is also used for noise reduction, feature extraction and feature engineering.


from sklearn.decomposition import PCA

pca = PCA()

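As a rough illustration of the noise-reduction use mentioned above, here is a minimal sketch on synthetic data. The data, the noise level and the choice of `n_components=1` are assumptions made for the example, not part of the original post. The idea is to project onto the leading component and map back with `inverse_transform`, which discards the variation in the dropped directions.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Synthetic data (assumed for illustration): 200 points lying on a line in 5-D space.
t = rng.normal(size=(200, 1))
direction = np.array([[1.0, 2.0, -1.0, 0.5, 3.0]])
clean = t @ direction

# Add isotropic Gaussian noise to every feature.
noisy = clean + rng.normal(scale=0.3, size=clean.shape)

# Keep only the leading principal component, then map back to the original 5-D space.
pca = PCA(n_components=1)
denoised = pca.inverse_transform(pca.fit_transform(noisy))

# Compare the mean squared error against the clean signal before and after filtering.
err_noisy = np.mean((noisy - clean) ** 2)
err_denoised = np.mean((denoised - clean) ** 2)
print(f"MSE before: {err_noisy:.4f}, after: {err_denoised:.4f}")
```

On data like this, the error after filtering should come out lower, because most of the noise lives in the four directions that were dropped.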

While it is entirely possible to learn ML with only a passing understanding of mathematical concepts such as Linear Algebra, it is paramount to develop a deep understanding of them so as to truly grasp the ins and outs of ML.

Higher-dimensional data can be reduced to a lower dimensionality by zeroing out some of the principal components, namely those that carry the least variance. The purpose is mainly to retain as much of the data’s variance as possible.

pca =…
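A minimal sketch of the reduction described above, using synthetic data. The data, the variable names and `n_components=2` are assumptions for illustration and are not taken from the original example. It keeps two components and reports how much of the variance they retain.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(42)

# Synthetic data (assumed): 100 samples with 10 features driven by only 2 latent factors.
latent = rng.normal(size=(100, 2))
mixing = rng.normal(size=(2, 10))
X = latent @ mixing + rng.normal(scale=0.05, size=(100, 10))

# Keep the 2 components with the largest variance and drop the other 8.
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)  # (100, 2)

# Fraction of the total variance that survives the reduction.
retained = pca.explained_variance_ratio_.sum()
print(f"variance retained: {retained:.3f}")
```

Checking `explained_variance_ratio_` is one way to judge whether the number of components kept is reasonable for a given dataset.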


One of the playground datasets for Data Scientists is the House Prices dataset. I will use this to demonstrate how to go about completing a data science project involving Regression.


Before taking up any DS project, one should always try to get an understanding of what exactly the output of the project should be. You can do that in a couple of ways:

1. Check the sample submission files

2. Determine the target data

Doing this helps you determine whether to use a Supervised or an Unsupervised ML technique. If there is target data (like in…
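The two checks above can be sketched in a few lines of pandas. The frames below are tiny invented stand-ins for the competition's train, test and sample submission files (the values are made up; only the column names `Id`, `LotArea` and `SalePrice` mirror the House Prices data), so treat this as an illustration of the idea rather than the project's actual loading code.

```python
import pandas as pd

# Toy stand-ins for the real files (values are invented for illustration).
train = pd.DataFrame({
    "Id": [1, 2, 3],
    "LotArea": [8450, 9600, 11250],
    "SalePrice": [208500, 181500, 223500],
})
test = pd.DataFrame({"Id": [4, 5], "LotArea": [11622, 14267]})
sample_submission = pd.DataFrame({"Id": [4, 5], "SalePrice": [0.0, 0.0]})

# 1. The sample submission shows which column we are expected to predict.
expected_outputs = [c for c in sample_submission.columns if c != "Id"]

# 2. The target is the column present in the training data but absent from the test data.
target_candidates = [c for c in train.columns if c not in test.columns]

# A labelled target means a supervised problem; a numeric target suggests regression.
is_supervised = len(target_candidates) > 0
is_regression = is_supervised and pd.api.types.is_numeric_dtype(train[target_candidates[0]])

print(expected_outputs, target_candidates, is_supervised, is_regression)
```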

Riddle me this: what probably has two thumbs but is definitely the reason why you are so screwed up mentally? The answer: your African parent.

I recently decided to go for therapy after I had suffered an intense bout of depression. I can’t lie, things were looking pretty bleak for me. Let’s delve a bit into what was happening with me. I had just got (notice I used ‘got’ and not ‘gotten’) out of a two-year relationship after finding out that my girlfriend had been having fun times with, honestly, my closest friend. My mind was in shambles since I…

A neural network can be described as a series of algorithms that solve a problem by mimicking the way the human brain works. Neural networks adapt to different inputs without having to change the algorithm.

Biological Inspiration

My lack of expertise in human biology keeps me from delving too deep into the human brain. However, I believe it’s fairly obvious that when we talk about neural networks, the inspiration comes from the neuron in human beings.

The main parts of a neuron are:

  • Cell body
  • Axons
  • Dendrites



Neurons take in inputs (p).

Each input is then weighted using a weight (w).
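The weighting step can be sketched in plain Python. The function name `weighted_input` and the sample numbers are assumptions for illustration. The sketch only covers multiplying each input p by its weight w and summing the results, not a full neuron.

```python
def weighted_input(p, w):
    """Return the weighted sum of inputs p using weights w (one weight per input)."""
    if len(p) != len(w):
        raise ValueError("p and w must have the same length")
    return sum(wi * pi for wi, pi in zip(w, p))


# Hypothetical inputs and weights, chosen only to show the arithmetic.
p = [1.0, 2.0, 3.0]
w = [0.5, -1.0, 2.0]

total = weighted_input(p, w)
print(total)  # 0.5*1.0 + (-1.0)*2.0 + 2.0*3.0 = 4.5
```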

For this we will use the Breast Cancer Wisconsin Dataset.

The aim here is to classify tumors of the breast as either ‘Malignant’ or ‘Benign’.

First, I feel it is important to decide whether we need a Supervised or an Unsupervised Machine Learning technique. Supervised ML techniques are used when we can feed the algorithm the target data (usually labelled y_train) alongside the inputs, whereas in Unsupervised ML the algorithm is given no target data and instead forms associations of its own, grouping the data using those associations.

The aim is clearly to classify the…
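To make the distinction concrete, here is a minimal sketch using the copy of the Breast Cancer Wisconsin dataset bundled with scikit-learn. The original post may load the data from another source, and the choice of logistic regression and k-means is an assumption made for this example, not necessarily the models the post goes on to use. The supervised model is given `y_train`; the clustering model never sees the labels.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Features X and labels y (in this copy of the data, 0 = malignant and 1 = benign).
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Supervised: the classifier learns from the inputs together with the target y_train.
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
print(f"supervised test accuracy: {accuracy:.3f}")

# Unsupervised: k-means groups the samples into 2 clusters without ever seeing y.
km = make_pipeline(StandardScaler(), KMeans(n_clusters=2, n_init=10, random_state=0))
clusters = km.fit_predict(X)
print(f"cluster ids found: {sorted(set(clusters.tolist()))}")
```

Since the dataset comes with labels and the goal is to assign one of two known classes, the supervised route is the natural fit here; the clustering half is included only for contrast.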

Anyone who has ever had a conversation on programming with me knows how I love to spam the phrase, “You have to be ready not to know anything”. Looking back, I can still remember being a total greenhorn (forgive my use of this cliché) on matters concerning DS. I am by no measure an expert on the matter, but I wouldn’t count myself a slouch either.

A lot of experts recommend a top-down approach when trying to learn Data Science and Machine Learning and I feel the same. …

