The OnHW Dataset

The OnHWchars Dataset

The task associated with this dataset is to classify characters from the sensor data provided by the STABILO Digipen. This introduces unique problems to be solved by the research field of multivariate time series classification, as every writer has their personal way of writing and their own way of holding the pen. Furthermore, writer-dependent as well as writer-independent evaluations can be performed.

The dataset is described in detail in the paper referenced below.

The dataset consists of 31,275 handwritten characters of 52 classes (26 upper case, 26 lower case) written by 119 subjects. The sensor data (2 accelerometers, 1 gyroscope, 1 magnetometer, 1 force sensor) of the digital pen was saved while writing the letters on paper, along with the ground-truth characters and the corresponding timestamps.
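As a rough illustration of how such recordings might be prepared for a classifier, the sketch below builds a 52-class label map and zero-pads variable-length recordings to a common length. The class ordering, the channel count (assuming three axes per accelerometer, gyroscope and magnetometer, plus one force channel) and the helper name pad_recordings are assumptions made for this example. They are not the dataset's documented format, which is described in the readme.txt file.

```python
import string

import numpy as np

# 52 classes: 26 upper-case followed by 26 lower-case letters.
# This ordering is an assumption for illustration, not the dataset's own encoding.
CLASSES = list(string.ascii_uppercase + string.ascii_lowercase)
LABEL_TO_INDEX = {ch: i for i, ch in enumerate(CLASSES)}

# Assumed channel count: 2 accelerometers, 1 gyroscope and 1 magnetometer
# with three axes each, plus a single force channel.
N_CHANNELS = 2 * 3 + 3 + 3 + 1


def pad_recordings(recordings, max_len):
    """Zero-pad or truncate (time, channels) arrays to a common length."""
    batch = np.zeros((len(recordings), max_len, N_CHANNELS), dtype=np.float32)
    for i, rec in enumerate(recordings):
        n = min(len(rec), max_len)
        batch[i, :n] = rec[:n]
    return batch
```

Zero-padding is only one simple choice. Resampling each recording to a fixed number of time steps is a common alternative.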

The dataset can be downloaded here (876 MB).
For information on the content, see the readme.txt file.

As requested by researchers working with the dataset, we offer an alternative download that provides the data by writer, not split into training and test sets. This makes it possible to distinguish between writers and to investigate individual users independently. Download here (43.8 MB).
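With per-writer data, a writer-independent evaluation can be set up by holding out entire writers, so that no writer contributes to both training and test data. The following is a minimal sketch under the assumption that each sample carries a writer identifier in an array. The name writer_ids and the function writer_independent_split are hypothetical and not part of the dataset's tooling.

```python
import numpy as np


def writer_independent_split(writer_ids, test_fraction=0.2, seed=0):
    """Return boolean (train, test) masks such that no writer is in both."""
    writer_ids = np.asarray(writer_ids)
    writers = np.unique(writer_ids)

    # Shuffle the distinct writers, then hold out a fraction of them.
    rng = np.random.default_rng(seed)
    rng.shuffle(writers)
    n_test = max(1, int(round(test_fraction * len(writers))))
    test_writers = writers[:n_test]

    test_mask = np.isin(writer_ids, test_writers)
    return ~test_mask, test_mask
```

A writer-dependent evaluation would instead split the samples of each writer, so that every writer appears in both sets.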

The STABILO Challenge for Ubicomp 2020 used a slightly adapted version of this dataset.

2020-09-14: Dataset release
2020-09-22: Improved readme file
2021-06-30: Updated writer-independent dataset with splits for easier reproducibility
2021-11-30: Added an alternative download containing writer-based splits

How to cite the dataset?

If you use the OnHW dataset, please cite:

Felix Ott*, Mohamad Wehbi*, Tim Hamann, Jens Barth, Björn Eskofier, and Christopher Mutschler. "The OnHW Dataset: Online Handwriting Recognition from IMU-Enhanced Ballpoint Pens with Machine Learning." In Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. (IMWUT), vol. 4, no. 3, article 92, Cancún, Mexico, Sept. 2020.

@inproceedings{ott_onhw,
  author    = {Felix Ott and Mohamad Wehbi and Tim Hamann and Jens Barth and Bj{\"o}rn Eskofier and Christopher Mutschler},
  title     = {{The OnHW Dataset: Online Handwriting Recognition from IMU-Enhanced Ballpoint Pens with Machine Learning}},
  booktitle = {Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. (IMWUT)},
  volume    = {{4(3),article 92}},
  address   = {Canc\'{u}n, Mexico},
  month     = sep,
  year      = {2020}
}

Related Work

The dataset was created in collaboration with EINNS and the Schreibtrainer research project; the main contributors were the MaD-Lab and Fraunhofer IIS.

Additional data and a nice summary can be found here.


Can I demonstrate my algorithms live?

You or your research organization could consider purchasing a Digipen Development Kit.