Computer vision has gained prominence in industry with the advent of GPUs. In particular, object recognition, detection, and segmentation play a pivotal role in self-driving cars 🚘, automated identification 👮‍♀️, and information retrieval. For image classification, one sometimes needs to first detect individual objects and pass them to a classifier. Over time, different algorithms have been proposed for object detection, such as R-CNN, Fast R-CNN, Faster R-CNN, YOLO, and many more. In this blog, we will primarily focus on the region-based algorithms; for YOLO, one can see this blog.


The algorithm as…

What is groupby in Python?

A groupby is an operation that involves some combination of splitting an object, applying a transformation, and combining the results.
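The split, apply, and combine steps can be sketched in plain Python. This is only an illustration under stated assumptions: the `innings` records and the `group_mean` helper are hypothetical and are not the dataset or code used in the post.

```python
from collections import defaultdict

# Hypothetical innings records as (batsman, runs); not the dataset from the post.
innings = [
    ("Batsman A", 120),
    ("Batsman B", 104),
    ("Batsman A", 150),
    ("Batsman B", 110),
]


def group_mean(records):
    """Average the runs per batsman using split, apply, combine."""
    # Split: bucket the runs by batsman.
    groups = defaultdict(list)
    for batsman, runs in records:
        groups[batsman].append(runs)
    # Apply and combine: average each bucket and collect the results in one dict.
    return {batsman: sum(runs) / len(runs) for batsman, runs in groups.items()}


averages = group_mean(innings)
print(averages)
```

With pandas, the same idea is typically written as `df.groupby("batsman")["runs"].mean()`, assuming a DataFrame `df` with those (hypothetical) column names.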

We will go through each step, but before we do that, let's load a dataset. Below is a snapshot of all innings where two Australian batsmen scored a century.

Image Source Unsplash

The first step for most computer vision tasks such as classification, segmentation, or detection is to have a dataset for your problem. The dataset depends on the objective, which can be broadly divided into three parts.

  • Object Detection
  • Object Segmentation
  • Image Classification

Here I will walk you through some of the most popular datasets for computer vision.

  • LabelMe is a dataset maintained by the MIT Computer Science and Artificial Intelligence Laboratory. It has 187,240 images, 62,197 annotated images, and 658,992 labeled objects.
  • ImageNet is the fulcrum of all datasets on which deep learning models are trained and evaluated. …

Natural Language Processing

Image Source Unsplash

Natural Language Processing (NLP) is one of the most active areas of research. In this blog, I will walk you through some of the popular data sets which one can use for research and learning.

NLP can be broadly divided into seven subsections based on the problem one is solving.

  • Speech Recognition
  • Text Classification
  • Document Summarization
  • Q/A and Product Title Summarization
  • Sentiment Analysis
  • Recommender System
  • Machine Translation and Language Modeling

Speech Recognition


The LibriSpeech dataset consists of about 1,000 hours of 16 kHz read English speech from audiobooks that are part of the LibriVox project.

Free Spoken Digit Dataset

The dataset has recordings of spoken digits in wav files at 8kHz of 2500…

Computer Vision


In this post, we will cover the metric used to evaluate object detection models. The metric is independent of the algorithm, whether one uses R-CNN, Fast R-CNN, Faster R-CNN, YOLO, etc.

The blog is divided into three sections. The first covers what IoU (Intersection over Union) is and why it is needed. The second is a Python implementation. The last wraps up with its application to bounding box selection via non-max suppression.
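As a minimal sketch of those two pieces, the code below computes IoU for axis-aligned boxes and uses it in a greedy non-max suppression. It assumes boxes are given as `(x1, y1, x2, y2)` corner tuples; the function names, the 0.5 threshold, and the sample boxes are illustrative choices, not the post's own implementation.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle.
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    # Clamp to zero when the boxes do not overlap.
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of the kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Hypothetical detections: two heavily overlapping boxes and one separate box.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(iou(boxes[0], boxes[1]))
print(non_max_suppression(boxes, scores))
```

Here the second box overlaps the first with an IoU of 81/119 (about 0.68), which is above the threshold, so only the first and third boxes are kept.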

Once you build an object detector, the next thing you want to know is how accurate your model is. This is common across all machine learning projects…

Computer Vision

Source: Unsplash


In the previous blog, we created both COCO and Pascal VOC datasets for object detection and segmentation. Here we take a deep dive into these dataset formats.

Pascal VOC

PASCAL (Pattern Analysis, Statistical Modelling, and Computational Learning) is a Network of Excellence funded by the EU. It ran the Visual Object Classes (VOC) challenge from 2005 to 2012.
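For orientation, Pascal VOC annotations are stored as one XML file per image, with an `object` element for each labeled box. The sketch below reads such a file with the standard library. The sample XML, the file name, and the `read_voc_objects` helper are made up for illustration and are not output from VoTT.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in the Pascal VOC XML layout; all values are made up.
SAMPLE_XML = """
<annotation>
  <filename>example.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>dog</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
</annotation>
"""


def read_voc_objects(xml_text):
    """Return a list of (label, (xmin, ymin, xmax, ymax)) tuples."""
    root = ET.fromstring(xml_text.strip())
    objects = []
    for obj in root.findall("object"):
        label = obj.findtext("name")
        box = obj.find("bndbox")
        # Read the four corner coordinates in a fixed order.
        coords = tuple(
            int(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        objects.append((label, coords))
    return objects


print(read_voc_objects(SAMPLE_XML))
```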

The file structure obtained after annotating with VoTT is as below.

Computer Vision

Source: Unsplash


The first step for most computer vision tasks such as classification, segmentation, or detection is to have custom data for your problem set. There are multiple ways of creating labeled data; one such method is annotations.

Annotation means manually marking regions in an image and assigning a label to each region.

To keep things simple, we will use two tools: Pixel Annotation Tool and Microsoft VoTT. You can read more about these tools here: Pixel Annotation Tool and Microsoft VoTT.

Pixel Annotation Tool

Installation for macOS.

git clone

Then update brew using brew update.

Next, you need to install a cross-platform application development framework such…


An introduction to Unix/Linux shell scripting

(Photo by Jeremy Yap on Unsplash)


Unix is a multi-user operating system built around 1969 at AT&T Bell Labs. The main purpose of UNIX was multi-tasking.

  • Multi-user: Different users can share the same resources.
  • Multi-tasking: Execution of more than one process at the same time.

Unix is commercial, whereas Linux (technically a kernel) is open source. Because Linux largely follows the POSIX standards, most Unix software compiles on it with little change.

The Unix architecture is composed of the kernel, the shell, and applications/programs.

In my previous blog, I walked you through all the steps to run a Jupyter notebook. If you're a data scientist or developer and have upgraded to macOS Catalina 10.15, you might have faced some issues with Jupyter notebook, because Catalina behaves differently from previous macOS versions. Follow the steps below to configure and run a Jupyter notebook on Catalina.

There are four easy steps to configure a Jupyter notebook.

Step I: Install the Anaconda distribution.

The preferable way to go forward is to use a command-line installer instead of a…

Which path to choose !!


This blog aims to introduce readers to the concept of decision trees and to the intuition and mathematics under the hood. Along the way, we will learn how to build a decision tree in Python and look at certain limitations of this robust algorithm.

The name might sound fascinating, but tree algorithms are just simple rule-based algorithms that we unknowingly use in our day-to-day lives. This family of supervised learning algorithms can be used for both classification and regression.
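To make the "rule-based" idea concrete, here is a small hand-written sketch of what a trained tree boils down to: nested if/else rules, where classification leaves return a label and regression leaves return a number. The features, thresholds, and leaf values are hypothetical and are not learned from any data.

```python
def classify_play(outlook, humidity):
    """Classification tree: each leaf returns a class label."""
    if outlook == "sunny":
        # Second-level split on humidity (hypothetical threshold).
        return "stay in" if humidity > 70 else "play"
    if outlook == "rainy":
        return "stay in"
    return "play"  # any other outlook, e.g. overcast


def estimate_commute_minutes(distance_km, rush_hour):
    """Regression tree: each leaf returns a number instead of a label."""
    if distance_km <= 5:
        return 20.0 if rush_hour else 12.0
    return 55.0 if rush_hour else 35.0


print(classify_play("sunny", 80))
print(estimate_commute_minutes(3, True))
```

A learning algorithm chooses these splits and leaf values from data instead of having them written by hand.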

What is a Decision Tree?

A decision tree is a type of supervised algorithm which uses the concept of a flow diagram…

Pushkar Pushp

Data Scientist | Deep Learning Practitioner | Machine Learning | Python | Cricket Blogger
