The research project EINNS (Entwicklung intelligenter neuronaler Netze zur Schrifterkennung – Development of Intelligent Neural Networks for Character Recognition) was carried out by STABILO in cooperation with FAU Erlangen-Nuremberg, Chair of Computer Science 14 (Machine Learning and Data Analytics), as part of the R&D program “Information and Communication Technology” of the Free State of Bavaria. EINNS evaluated alternative approaches to character recognition and trajectory reconstruction. Instead of using HMMs as the basis for handwriting recognition and an analytical approach for trajectory reconstruction, EINNS trained different neural networks for both problems on raw pen sensor data.
At the beginning of the project, the work of a 2018 master's thesis at STABILO on the recognition of letter elements was continued. In doing so, the size and the regularization techniques of an LSTM network were adapted. To further validate sensor fusion even without optical tracking, letters, circles, and triangles of defined sizes were recorded. Although this is not as effective as optical tracking of the pen tip, it helped to improve the sensor fusion. The parameters from this improved sensor fusion were added as additional inputs to the LSTM but did not improve the result.
In a second step, an attempt was made to split the recognition process into several stages. A single 13-class LSTM achieved 85% accuracy with a known writer on a standardized training and test dataset. A composition of three LSTMs, in which the first distinguishes between rounded and straight elements and two subsequent, independent LSTMs classify the results, achieved only 82%. With an unknown writer (the training data had been written by a different person than the test data), recognition dropped to 68%. The small number of classes (13) follows from the letter-element approach (single strokes and arcs): the aim was to test whether recognizing these individual elements and then composing them into whole letters is already sufficient for handwriting recognition.
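The staged setup described above can be sketched as a simple router. This is only an illustration of the structure, not the project's implementation: the stub classifiers, the feature layout, and the label names (`arc_up`, `stroke_down`, and so on) are hypothetical stand-ins for the three LSTMs.

```python
from typing import Callable, Sequence

# A classifier maps a feature sequence to a label (hypothetical interface).
Classifier = Callable[[Sequence[float]], str]


def make_staged_classifier(
    coarse: Classifier,
    rounded_model: Classifier,
    straight_model: Classifier,
) -> Classifier:
    """Compose a coarse classifier with two specialised second-stage models."""

    def classify(sample: Sequence[float]) -> str:
        # Stage 1: decide whether the element is rounded or straight.
        group = coarse(sample)
        # Stage 2: pass the sample to the model responsible for that group.
        if group == "rounded":
            return rounded_model(sample)
        return straight_model(sample)

    return classify


# Stub models standing in for trained networks (assumptions for illustration).
coarse_stub = lambda s: "rounded" if s[0] > 0.5 else "straight"
rounded_stub = lambda s: "arc_up" if s[1] > 0.0 else "arc_down"
straight_stub = lambda s: "stroke_up" if s[1] > 0.0 else "stroke_down"

staged = make_staged_classifier(coarse_stub, rounded_stub, straight_stub)
print(staged([0.9, 1.0]))
print(staged([0.1, -1.0]))
```

One property of this design is that a wrong decision in the first stage cannot be repaired later: if the two second-stage models cover disjoint sets of classes, a misrouted sample reaches a model that cannot output the correct label.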
At the MaD-Lab of FAU, we found that existing pen-based recognition systems are limited to recognizing single characters and mostly focus on a single user. To advance the state of the art in this area, we conducted experiments that enabled separate recognition of uppercase and lowercase characters. Our focus was not only on improving recognition rates but also on ensuring that the developed system can recognize handwriting from unknown writers. This means that the presented models generalize to arbitrary writers who were not represented in the training data.
In the course of our research, we developed the first word recognizer that recognizes complete words without segmenting them into individual characters. We relied on methods adopted from the field of speech recognition that can recognize characters distributed sequentially over time. Because the participants in our studies were asked to write in their natural handwriting without constraints, the resulting models could generalize to unknown writers as well as to new words that were not included in the training set.
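The text does not name the speech recognition method that was used. One widely used technique for segmentation-free sequence recognition is connectionist temporal classification (CTC), and the sketch below shows its greedy decoding step purely as an illustration. The alphabet, the blank symbol `-`, and the per-frame scores are made-up assumptions, not project data.

```python
BLANK = "-"  # CTC blank symbol (assumed to be part of the alphabet)


def ctc_greedy_decode(frame_scores, alphabet):
    """Greedy CTC decoding: best label per frame, merge repeats, drop blanks."""
    # Pick the highest-scoring label for every time frame.
    best_path = [
        alphabet[max(range(len(row)), key=row.__getitem__)]
        for row in frame_scores
    ]
    decoded = []
    previous = None
    for label in best_path:
        # Keep a label only when it differs from the previous frame
        # and is not the blank symbol.
        if label != previous and label != BLANK:
            decoded.append(label)
        previous = label
    return "".join(decoded)


# Hypothetical network output: one row of scores per time frame,
# columns ordered as in `alphabet`.
alphabet = [BLANK, "p", "e", "n"]
frames = [
    [0.10, 0.70, 0.10, 0.10],  # p
    [0.20, 0.60, 0.10, 0.10],  # p (repeat, merged)
    [0.80, 0.10, 0.05, 0.05],  # blank
    [0.10, 0.10, 0.70, 0.10],  # e
    [0.10, 0.10, 0.20, 0.60],  # n
    [0.10, 0.10, 0.10, 0.70],  # n (repeat, merged)
]
print(ctc_greedy_decode(frames, alphabet))
```

Because the network emits one score vector per time frame and the decoder collapses the frames afterwards, no character boundaries have to be marked in the sensor stream. The blank symbol is what allows genuinely doubled letters to survive the merging of repeats.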
Since our system uses inertial data, we lacked a visualization of the written text. We therefore investigated methods for reconstructing the path of the pen tip. We found that all systems developed so far relied on classical physical methods: the acceleration signal is integrated to obtain the velocity, which is integrated again to obtain the position of the tip. These methods are unreliable because sensor errors accumulate through the integration, so the computed coordinates drift away from the true position. We therefore pursued a different approach, using neural networks to convert the raw data directly into position information without signal integration. As a result, we developed the first system of this kind for the field of handwriting recognition, and it yields promising results.
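The drift problem of the classical approach can be illustrated with a few lines of code. The sketch below integrates a one-dimensional acceleration signal twice for a pen that is actually at rest. The sampling rate, the duration, and the constant sensor bias are assumed values chosen for illustration, not measurements from the project.

```python
def integrate(samples, dt):
    """Cumulative rectangle-rule integral of a uniformly sampled signal."""
    total = 0.0
    result = []
    for value in samples:
        total += value * dt
        result.append(total)
    return result


def position_from_acceleration(accel, dt):
    """Classical reconstruction: acceleration -> velocity -> position."""
    velocity = integrate(accel, dt)
    return integrate(velocity, dt)


dt = 0.01    # assumed sampling interval (100 Hz)
n = 1000     # 10 seconds of data
bias = 0.05  # assumed constant accelerometer bias in m/s^2

true_accel = [0.0] * n                      # the pen does not move
measured = [a + bias for a in true_accel]   # what a biased sensor reports

drift = position_from_acceleration(measured, dt)
print(f"apparent displacement after 5 s:  {drift[n // 2 - 1]:.3f} m")
print(f"apparent displacement after 10 s: {drift[-1]:.3f} m")
```

Even this small constant bias produces an apparent displacement that grows roughly as 0.5 * bias * t**2, so doubling the recording time roughly quadruples the error. This is far larger than the scale of handwriting, which is why plain double integration of raw inertial signals is unreliable.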
Samples of pen trace reconstructions based on inertial data. The rightmost trace is the original.