Digit Recognizer (Part I: Feature Extraction) Hongzhi Li, September 2000 Besides the three required features, I used four more important features in digit recognition: 1. ratio of the holes' area over the bounding box's area 2. number of horizental lines longer than half of the box's width 3. number of vertical lines longer than half of the box's height All the features related to spatial dimensions (like height, aera) should be describled using the relatively ratio. In this case, these scaled features would be a constant for different font size. So, like the required features, such as (region area)/(box area) and width/height of box, I employed relatively value to described the above three features. The first feature can be important when trying to recognize the digits with holes, like '4' and '0'. In most cases, the '0's ratio value should be larger than that of the '4'. The second feature together with the third one are really very important features. Obviously, different digits have the differnt number of these two features. Moreover, in my program, I can adjust the ratio value to select different size of lines. Like I can pick out the lines with the equal length of the width/height of the bounding box, when I set the parameter h/vlineRatio=1.0 in the program. This flexibilty will help more when implementing these two features. Of course, like the other features, we should be careful when dealing with different fonts, for example, Italics. The algorithm I used to check long lines is to find a connected horizental (or vertical) lines first. Then check the length of the line. If it's longer than an specific value (like half-width of the box), check its neighborhood to see whether there is a parallel line already picked out. If not, this is a real line. Otherwise, it's just a part of a thick line. Discard it! We still need to mark the pixels in this line as if they were a real line, so we will not think the parallel neighborhood will be a new line. 4. number of notchs (notch area)/(box area) Just like holes' number is an important feature to distinguish digits like '8' and '0', notchs' number can also be an useful auxiliary feature in digit recognition. The notch is sort of the oppsite of the hole. The hole is composed of the white pixels surrounded by black pixels, while the notch is the region of white pixels connected to the edge of the bounding box. If a region of white pixels is not a hole, it is must be a notch. Checking the notch can be helpful in the digit recognition. For example, in most cases, the digit '1' will split the region into two parts, thus the notch number should be two (one for left-side and the other for right-side). In contrast, the notch number for the digit '2' could be as more as four. The codes for checking holes and notchs can be put together. In my program, I used recursively call to find a piece of connected white pixels (a white region). If the white region hits the edge of the bounding box, it is a notch. Otherwise, it's a hole. For a notch, I will carefully check its neighbor pixels to make sure all the white pixels in the region are marked. So that no extra faked notchs will be picked out around the edge region. According to my analysis of all the digits image in /pgm_format.images, none of the above features can be used as a 'sharp recognizer' to figure out a digit at one glance. For example, the holes feature should work perfect when trying to recognize '8'. That's the only digit who has two holes. But extracted the 32 samples of '8' from 'eights. pgm' file, my results shows sometimes the program can only find one hole for some specific images. In order to improve the recognition, several features need to be used together. At last, I will show a piece of my codes in my program. The correctness of results is shown by checking: Area of (notch+hole+region)/Area of box = 1 Here's my Java codes in digitFeature.java to deal with holes and notchs: /******************************************************* *Cal number of holes/notchs and their area in the region *Black pixels = 0; white pixels = 1; *White edges will be marked as 2 to avoid extra holes picked out *All the other checked white pixels are marked as 0. */ public void getHole() { getPixels(); //Recover the pixel values of the region for (int rows = 0; rows < height; rows++) { for (int cols = 0; cols < width; cols++) { if (pixels[rows][cols] == 1) { //white pixel? ahole = true; wArea=0; checkHole(rows, cols); if (ahole) { holes++; //hole there? holeArea += wArea; //cal hole area } else { notchs++; //must be a notch! notchArea += wArea; //notch area } } } } hbRatio = (double)holeArea/boxArea; //hole area ratio nbRatio = (double)notchArea/boxArea; //notch area ratio } /**Check whether there is a hole in region when spreading from the seed * pixel (rows, cols). * There will be a hole if none of the 4-white neighbor hit the edge * of the bounding box. * Only 4-neighbor are checked for holes, to be asymmetry for 8-neighbor * region checking! */ public void checkHole(int rows, int cols) { if (pixels[rows][cols]==1) wArea++; //Increment hole/notch area if (!edge(rows, cols)) { pixels[rows][cols]=0; //erase this white pixel goNeighbor(rows, cols); //spread checking to neighbor } else { pixels[rows][cols]=2; //marked edge notch pixels as 2!! ahole = false; //white region hits the box, not a hole! goNotchEdge(rows, cols); } } /**Checking four neighborhood pixels for possible holes */ public void goNeighbor(int rows, int cols) { if (!edge(rows, cols)) { if (pixels[rows-1][cols] != 0) checkHole(rows-1, cols); if (pixels[rows+1][cols] != 0) checkHole(rows+1, cols); if (pixels[rows][cols-1] != 0) checkHole(rows, cols-1); if (pixels[rows][cols+1] != 0) checkHole(rows, cols+1); //0: black pixel. 1:white. 2:white edge //Can't only check ==1! Edge = 2!! } } /**Check the neighbor of a notch for the edge pixels * Black it if it's a white pixel */ public void goNotchEdge(int rows, int cols){ if (rows == 0 || rows == (height-1)) { //top/bottom if (cols > 0 && pixels[rows][cols-1] == 1) checkHole(rows, cols-1); //search edge if (cols < (width-1) && pixels[rows][cols+1] ==1) checkHole(rows, cols+1); if ((cols == 0 || cols == (width-1)) && rows == 0 && pixels[rows+1][cols] == 1) checkHole(rows+1, cols); //top corner? if ((cols == 0 || cols == (width-1)) && rows == (height-1) && pixels[rows-1][cols] == 1) checkHole(rows-1,cols); //bottom corner? if (rows == 0 && pixels[rows+1][cols] ==1) checkHole(rows+1, cols); //search inside if (rows == (height-1) && pixels[rows-1][cols] ==1) checkHole(rows-1, cols); } else { //left/right and no corner if (rows > 0 && pixels[rows-1][cols] == 1) checkHole(rows-1, cols); //serach edge if (rows < (height-1) && pixels[rows+1][cols] == 1) checkHole(rows+1, cols); if (cols == 0 && pixels[rows][cols+1] == 1) checkHole(rows, cols+1); //serach inside if (cols == (width-1) && pixels[rows][cols-1] == 1) checkHole(rows, cols-1); } } /**Check whether a pixel is at edge of the bounding box */ public boolean edge(int rows, int cols) { return (rows == 0 || rows == (height -1) || cols == 0 || cols == (width - 1)); }