A Study of Feature Design for Online Handwritten Chinese Character Recognition based on Continuous-Density Hidden Markov Models

ICDAR 2009

We present a new feature extraction approach to online Chinese handwriting recognition based on continuous-density hidden Markov models (CDHMMs). Given an online handwriting sample, a sequence of time-ordered dominant points is extracted first; these include stroke endings, points corresponding to local extrema of curvature, and points with a large distance to the chords formed by pairs of previously identified neighboring dominant points. Then, at each dominant point, a 6-dimensional feature vector is extracted, consisting of two coordinate features, two delta features, and two double-delta features. The effectiveness of the approach has been confirmed by experiments on a recognition task with a vocabulary of 9119 Chinese characters, using CDHMMs trained from about 10 million samples with both maximum likelihood and discriminative training criteria.
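
The two steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give thresholds, a curvature estimator, or the exact delta formulas, so everything below is an assumption made for illustration. In particular, the function names, the use of the turning angle as a stand-in for curvature (only its local maxima are used), the threshold parameters `angle_thresh` and `dist_thresh`, and the definition of delta and double-delta features as backward differences between consecutive dominant points are all hypothetical.

```python
import math


def chord_distance(p, a, b):
    """Perpendicular distance from point p to the chord through a and b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length = math.hypot(dx, dy)
    if length == 0.0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    return abs(dx * (p[1] - a[1]) - dy * (p[0] - a[0])) / length


def turning_angle(a, b, c):
    """Absolute change of direction at b (assumed proxy for curvature)."""
    t1 = math.atan2(b[1] - a[1], b[0] - a[0])
    t2 = math.atan2(c[1] - b[1], c[0] - b[0])
    d = abs(t2 - t1)
    return min(d, 2 * math.pi - d)


def dominant_points(stroke, angle_thresh=math.pi / 4, dist_thresh=2.0):
    """Return sorted indices of dominant points within one stroke.

    Both thresholds are hypothetical values chosen for this sketch.
    """
    n = len(stroke)
    if n <= 2:
        return list(range(n))
    keep = {0, n - 1}  # stroke endings
    # Local maxima of the turning angle that exceed the angle threshold.
    angles = [0.0] * n
    for i in range(1, n - 1):
        angles[i] = turning_angle(stroke[i - 1], stroke[i], stroke[i + 1])
    for i in range(1, n - 1):
        if (angles[i] >= angle_thresh and angles[i] >= angles[i - 1]
                and angles[i] > angles[i + 1]):
            keep.add(i)
    # Between neighboring dominant points, add the point farthest from
    # their chord if it lies beyond the distance threshold; repeat until
    # no segment yields a new point.
    changed = True
    while changed:
        changed = False
        idx = sorted(keep)
        for lo, hi in zip(idx, idx[1:]):
            best, best_d = None, dist_thresh
            for i in range(lo + 1, hi):
                d = chord_distance(stroke[i], stroke[lo], stroke[hi])
                if d > best_d:
                    best, best_d = i, d
            if best is not None:
                keep.add(best)
                changed = True
    return sorted(keep)


def six_dim_features(points):
    """Map points to (x, y, dx, dy, ddx, ddy) using backward differences.

    The first point gets zero delta features (an assumption of this sketch).
    """
    feats = []
    prev_d = (0, 0)
    for k, (x, y) in enumerate(points):
        if k == 0:
            d = (0, 0)
        else:
            d = (x - points[k - 1][0], y - points[k - 1][1])
        dd = (d[0] - prev_d[0], d[1] - prev_d[1])
        feats.append((x, y, d[0], d[1], dd[0], dd[1]))
        prev_d = d
    return feats


def extract_features(strokes):
    """Concatenate dominant points of all strokes in time order, then featurize."""
    points = []
    for stroke in strokes:
        points.extend(stroke[i] for i in dominant_points(stroke))
    return six_dim_features(points)
```

For an L-shaped stroke sampled at unit steps, this sketch keeps the two stroke endings and the corner, so the resulting observation sequence has three 6-dimensional vectors. In a CDHMM recognizer, such a sequence would be the observation input to each character model; the thresholds control how aggressively the raw pen trajectory is compressed.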