The Dataset
The challenge is already finished, but you can download a very similar dataset: the onHW-Dataset.
Stage 1
In stage 1 of the open competition, your task is to classify the 26 uppercase letters. We recorded 100 volunteers, who provided 13,102 letters in total.
This is what you will be able to download:
- A folder containing 100 subfolders. Each subfolder contains one recording of one person. Each recording consists of three files:
  - `sensor_data.csv`: Contains the raw time series data of a complete recording of two 3D accelerometers, a 3D gyroscope, a 3D magnetometer and a 1D force sensor, sampled at 100 Hz. Furthermore, the last column of the file is a simple sample counter and the first column contains a timestamp indicating when the sample was transmitted to the recording device via Bluetooth.
  - `labels.csv`: Contains the information at which point in time which letter was written. The `start` and `stop` columns of the `labels.csv` file point to the `Millis` column in the `sensor_data.csv` file.
  - `calibration.txt`: Contains parameters obtained from the last calibration procedure performed with this pen.
- A Python 3 script, `split_characters.py`, that splits the `sensor_data.csv` file according to the timestamps given in the corresponding `labels.csv` file. It also saves the obtained letters and their labels in a pickle file for easier future use. Look at this file's source code to understand in depth what is going on.
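
To make the splitting step concrete, here is a minimal sketch of the idea. It is not the code of the provided script. The `Millis`, `start` and `stop` column names come from the description above; the `Label` and `Force` column names, the comma delimiter, the inclusive interval boundaries and the sample values are assumptions made up for this illustration.

```python
import csv
import io
import pickle


def read_rows(text, delimiter=","):
    """Parse CSV text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def split_letters(sensor_rows, label_rows, label_column="Label"):
    """Return one (label, samples) pair per row of the label table.

    A sample is assigned to a letter when start <= Millis <= stop.
    The inclusive boundaries and the label column name are assumptions.
    """
    letters = []
    for label_row in label_rows:
        start = float(label_row["start"])
        stop = float(label_row["stop"])
        samples = [
            row for row in sensor_rows
            if start <= float(row["Millis"]) <= stop
        ]
        letters.append((label_row[label_column], samples))
    return letters


# Made-up miniature recording: one sample every 10 ms (100 Hz) and a
# single hypothetical sensor column instead of the full sensor set.
sensor_text = "Millis,Force\n0,0.0\n10,0.4\n20,0.9\n30,0.5\n40,0.0\n50,0.7\n60,0.8\n"
labels_text = "Label,start,stop\nA,10,30\nB,50,60\n"

letters = split_letters(read_rows(sensor_text), read_rows(labels_text))
for label, samples in letters:
    print(label, [sample["Millis"] for sample in samples])

# Round-trip through pickle, mirroring the idea of storing the split
# letters and their labels for later use.
restored = pickle.loads(pickle.dumps(letters))
```

With real recordings you would read the two files from a subfolder instead of the inline strings, and check the actual header row and delimiter before relying on these names.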
Stage 2
For stage 2, we added lowercase letters to the dataset. In total, 26,179 letters are provided, and the task is to predict 52 classes. The data format stays the same.
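
As an illustration of how the class count grows from 26 to 52, the following sketch builds a letter-to-index mapping for each stage. The ordering (uppercase first, then lowercase) and the names `STAGE1_CLASSES`, `STAGE2_CLASSES` and `make_encoder` are assumptions chosen for this example; the challenge does not prescribe a particular encoding.

```python
import string

# Assumed class order: A-Z for stage 1, then a-z appended for stage 2.
STAGE1_CLASSES = list(string.ascii_uppercase)
STAGE2_CLASSES = list(string.ascii_uppercase + string.ascii_lowercase)


def make_encoder(classes):
    """Map each letter to an integer class index."""
    return {letter: index for index, letter in enumerate(classes)}


stage1 = make_encoder(STAGE1_CLASSES)
stage2 = make_encoder(STAGE2_CLASSES)

print(len(stage1), len(stage2))
print(stage2["A"], stage2["Z"], stage2["a"], stage2["z"])
```

Keeping the uppercase indices identical in both mappings means a stage 1 label encoding remains valid when the lowercase classes are added.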
![](https://stabilodigital.com/wp-content/uploads/2020/02/Recording-1920x1440.jpeg)
Data Acquisition
To obtain the sensor data we provide for this challenge, we implemented a recording app that connects to a DigiPen and tells the volunteers which letter to write. These are some of the constraints that were met during the recordings:
- The volunteers sat on a chair at a table during the recordings.
- The writing surface was horizontal.
- The volunteers wrote on normal white paper sheets (about 80 g/m²).
- The sheet was padded with five additional sheets.
- There was no guideline concerning the size of the handwriting. The subjects were asked to use a size that is natural for them.
- There was no guideline concerning the way of holding the pen. The subjects were asked to use a position that is natural for them.
- The volunteers were asked to make sure the STABILO logo faced up, to avoid differing pen orientations.
- Participants could choose freely between print and cursive writing styles.
- Only right-handed recordings are released during this challenge.