Handwriting Recognition using TensorFlow

CO17 315 Ashish Upadhyay
4 min read · Mar 19, 2021

Written by Ashish Upadhyay & Amandeep Anand

Imagine you want to type a long paragraph, or draw a circle, and have certain symbols reproduced in your system exactly as drawn. Projects like canvas drawing recognition come in handy here. By recognizing drawn symbols and text, we can feed the text to a text-to-speech model, or supply drawings as input to apps like Paint or Photoshop. These are basic projects aimed at recognizing certain symbols, text, or drawings.

Our project, “Handwriting Recognition”, is a first step towards the points mentioned above.

Handwriting-Recognition is a project developed using Python and Django. It is a web application that recognizes handwriting and converts it into text by combining multiple machine learning models pre-trained on the EMNIST dataset from Kaggle. These neural network models recognize all digits, all uppercase letters, and the lowercase letters that are visibly different from their uppercase counterparts.

Major libraries used:

TensorFlow

TensorFlow is a free and open-source software library for machine learning. It can be used across a range of tasks but has a particular focus on training and inference of deep neural networks. TensorFlow is a symbolic math library based on dataflow and differentiable programming, and it is used for both research and production at Google. It was developed by the Google Brain team for internal Google use and was released under the Apache License 2.0 in 2015.

What is deep learning?

Deep learning is a subset of machine learning in artificial intelligence whose networks are capable of learning, unsupervised, from data that is unstructured or unlabeled. It is also known as deep neural learning or deep neural networks.

How the Incoming Data is Fed Into The Models:

->Example: A user writes and submits the handwriting, “Hey you”, on the client.

->The frontend takes the image data found in the canvas element and converts it into a binary blob.

->The blob is sent as a POST request to Django.

->The image is saved in Django and the filepath is loaded into cv2.

->The entire “Hey you” image is trimmed of excess pixels.
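The article does not show the trimming code. A minimal sketch of the bounding-box idea in NumPy (the project loads the image via cv2, but the cropping itself is plain array math; the function name here is illustrative) might look like:

```python
import numpy as np

def trim_excess(img):
    """Crop a drawing (nonzero pixels = ink) to the bounding box of its ink."""
    rows = np.any(img > 0, axis=1)   # which rows contain any drawing
    cols = np.any(img > 0, axis=0)   # which columns contain any drawing
    top, bottom = np.where(rows)[0][[0, -1]]
    left, right = np.where(cols)[0][[0, -1]]
    return img[top:bottom + 1, left:right + 1]
```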

->“Hey you” is cut up on each character giving us the 6 images “H”, “e”, “y”, “y”, “o”, “u”.

  • Images are cut up where drawing lines in the x-direction are not continuous, and where the discontinuous span is sufficiently large. Small discontinuous spaces are left alone.
  • The algorithm will notice a very large discontinuous space in the x-direction between the two “y” letters, which it takes to be a word space. We store this knowledge in the variable space_location.
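The gap-based splitting described above can be sketched by projecting the ink onto the x-axis and measuring the gaps between inked runs. The thresholds below (`min_gap`, `space_gap`) are illustrative; the project's actual values are not given in the article:

```python
import numpy as np

def segment_characters(img, min_gap=3, space_gap=15):
    """Split a trimmed line image into character images at x-direction gaps.

    Gaps of at least min_gap columns separate characters; gaps of at least
    space_gap columns additionally record a word space in space_location
    (the index of the character the space follows).
    """
    ink = np.any(img > 0, axis=0)  # True where a column contains drawing
    # Find contiguous runs of inked columns.
    runs, start = [], None
    for x, on in enumerate(ink):
        if on and start is None:
            start = x
        elif not on and start is not None:
            runs.append((start, x))
            start = None
    if start is not None:
        runs.append((start, len(ink)))
    # Merge runs separated by small gaps; big gaps split characters,
    # very big gaps additionally mark a word space.
    merged, space_location = [runs[0]], []
    for s, e in runs[1:]:
        prev_s, prev_e = merged[-1]
        gap = s - prev_e
        if gap < min_gap:
            merged[-1] = (prev_s, e)      # same character, keep merging
        else:
            if gap >= space_gap:
                space_location.append(len(merged) - 1)
            merged.append((s, e))
    chars = [img[:, s:e] for s, e in merged]
    return chars, space_location
```

For “Hey you”, this kind of routine would return six character slices and `space_location == [2]`, the space after the first “y”.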

->Each image is trimmed of excess pixels. The height of each “raw” image is accounted for in the variable char_img_heights.

->Each image is padded with extra pixels in a way where the image becomes a square shape. This is so that the image will not be warped when the image is resized down during data normalization.
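A simple way to do this square padding, sketched here with NumPy (the helper name and centering choice are assumptions, not taken from the project's source):

```python
import numpy as np

def pad_to_square(img, pad_value=0):
    """Pad the shorter dimension with background pixels so the image becomes
    square, preserving the aspect ratio for the later resize."""
    h, w = img.shape[:2]
    size = max(h, w)
    top = (size - h) // 2
    left = (size - w) // 2
    out = np.full((size, size), pad_value, dtype=img.dtype)
    out[top:top + h, left:left + w] = img   # center the original image
    return out
```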

->Each image is normalized: it is converted to a NumPy array, reshaped, and its pixel values are scaled to range from 0 to 1 instead of 0 to 255.
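Assuming the image has already been resized to the 28×28 EMNIST size (the resize itself would happen with something like cv2.resize), the normalization step can be sketched as:

```python
import numpy as np

def normalize(img):
    """Scale pixel values from 0-255 down to 0-1 and reshape to the
    (1, 28, 28, 1) batch shape a Keras EMNIST model typically expects."""
    arr = img.astype("float32") / 255.0
    return arr.reshape(1, 28, 28, 1)
```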

->We loop through all of these images, and each model makes a prediction for each image. The most popular prediction among the models is added to the final character result, final_prediction.

  • Each model's prediction for each image is a number between 0 and 46, corresponding to the index of one of the 47 characters the models were trained on. (Ex: an output of 17 corresponds to H in the mapping.)
  • The prediction of each model is mapped and compared with the model group.
  • The most popular prediction between the models in the group will be the final prediction.
  • If the final prediction among the models is alphabetical, we make sure that the lowercase complement is found inside the mapping. If it is not, that means we have a letter whose lowercase and uppercase forms are similar and differ only in size. We then decide the output casing based on the size of the image, which we get from char_img_heights. This decision is performed on the images “y”, “y”, “o” and “u”. The letter “y” gets a special constraint because its height is larger than that of the average lowercase letter.
  • While iterating, if the number of loop iterations equals a number inside space_location, a “ ” is appended to the final result. In this example, space_location will hold [2], signaling that there is a space after “y”, which gives us “Hey ” at the end of the first “y” iteration.

->Django responds with final_prediction, “Hey you”, to React, and React displays the result on the client.
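The voting and mapping steps above can be sketched in plain Python. The Counter-based vote and the assemble helper are illustrative (not the project's actual code), and MAPPING is only a fragment of the EMNIST “balanced” mapping, where indices 10-35 cover A-Z (so index 17 is “H”, matching the article's example); the casing decision from char_img_heights would be applied afterwards, so this sketch yields uppercase output:

```python
from collections import Counter

# Fragment of the EMNIST "balanced" mapping; the full mapping has 47 classes.
MAPPING = {14: "E", 17: "H", 24: "O", 30: "U", 34: "Y"}

def majority_vote(predictions):
    """Return the most popular class index among the models' outputs."""
    return Counter(predictions).most_common(1)[0][0]

def assemble(per_image_votes, space_location):
    """Map each image's winning index to a character, appending a space
    after the character indices recorded in space_location."""
    result = ""
    for i, votes in enumerate(per_image_votes):
        result += MAPPING[majority_vote(votes)]
        if i in space_location:
            result += " "
    return result
```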
The final output will look like this:
