Introduction and Dependencies

How to perform basic image recognition with Python




There are many applications for image recognition. One of the most familiar is facial recognition, which is the art of matching faces in pictures to identities. Image recognition goes much further, however. It can allow computers to translate written text on paper into digital text, and it can help the field of machine vision, where robots and other devices recognize people and objects.

Here, our goal is to begin to use machine learning, in the form of pattern recognition, to teach our program what text looks like. In this case, we'll use numbers, but the same approach could extend to all letters of the alphabet, words, faces, really anything at all. The more complex the image, the more complex the code will need to become. When it comes to letters and characters, however, it is relatively simple.

How is it done? Just like any problem, especially in programming, we just need to break it down into steps, and the problem becomes much easier to solve. Let's break it down!

First, you are going to need some sample images to follow along with this series. You can get the sample images here.

From there, extract the zip file. Within it, you should have an "images" directory; move that directory to wherever you're writing this script. Inside "images" you have some simple images that we'll be using, along with a bunch of example numbers in the "numbers" directory.
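If you want to confirm the folder ended up in the right place, a quick check like the one below can help. This is only a minimal sketch: the function name summarize_image_dir is made up for illustration, and it assumes the "images" directory sits next to the script, as described above. It counts the files in each directory without assuming anything about their names.

```python
import os


def summarize_image_dir(root="images"):
    """Return {relative_dir: file_count} for root and every directory below it.

    An empty dict means root does not exist (os.walk yields nothing).
    """
    summary = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        # "." stands for the root directory itself.
        rel = os.path.relpath(dirpath, root)
        summary[rel] = len(filenames)
    return summary


if __name__ == "__main__":
    found = summarize_image_dir("images")
    if not found:
        print("No 'images' directory found next to this script.")
    for rel, count in sorted(found.items()):
        print("%-20s %d file(s)" % (rel, count))
```

If everything was extracted as described, you should see an entry for "." (the images directory itself) and one for "numbers".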

Once you have that, you're going to need the Python programming language. This specific series was created using Python 2.7. You can go through this with Python 3, though there may be some minor differences.

You will also need Matplotlib, NumPy, and either PIL or Pillow. You can follow the video for installation, or you can use pip install. At the time of my video, pip install wasn't really a method I would recommend. Any newer version of Python 2 or 3 comes with pip, and almost all packages support pip now.

Pip is probably the easiest way to install packages. Once you install Python, you should be able to open your command prompt, like cmd.exe on Windows or bash on Linux, and type:

pip install numpy
pip install matplotlib
pip install pillow
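Once the installs finish, it can be worth checking that Python can actually import everything. The sketch below is one way to do that, not the only way: the names REQUIRED_MODULES and check_dependencies are made up for illustration. Note that Pillow is imported under the name PIL, and this check assumes you are using Pillow or PIL under that import name.

```python
import importlib
import sys

# Import names for the packages this series relies on.
# Pillow installs as "pillow" but is imported as "PIL".
REQUIRED_MODULES = ["numpy", "matplotlib", "PIL"]


def check_dependencies(module_names=REQUIRED_MODULES):
    """Return {module_name: version string, or None if the import fails}."""
    results = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            results[name] = None
        else:
            # Not every module exposes __version__, so fall back gracefully.
            results[name] = getattr(module, "__version__", "unknown")
    return results


if __name__ == "__main__":
    print("Python %d.%d.%d" % sys.version_info[:3])
    for name, version in sorted(check_dependencies().items()):
        print("%-12s %s" % (name, version if version else "NOT INSTALLED"))
```

Anything reported as NOT INSTALLED needs another pip install before you continue.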

Still having trouble? No problem, there's a tutorial for that: pip install Python modules tutorial.

If you're still having trouble, feel free to contact us using the contact link in the footer of this website.

Once you have all of the dependencies, you are ready to move on to the next part!

The next tutorial: Understanding Pixel Arrays

  • Introduction and Dependencies
  • Understanding Pixel Arrays
  • More Pixel Arrays
  • Graphing our images in Matplotlib
  • Thresholding
  • Thresholding Function
  • Thresholding Logic
  • Saving our Data For Training and Testing
  • Basic Testing
  • Testing, visualization, and moving forward