Data

The Dataset

Stage 1

In stage 1 of the open competition, your task is to classify the 26 upper case letters. We recorded 100 volunteers, who together provided 13102 letters.

This is what you will be able to download:

  • A folder containing 100 subfolders. Each subfolder contains one recording of one person. Each recording consists of three files:
    • sensor_data.csv: Contains the raw time series of one complete recording from two 3D accelerometers, a 3D gyroscope, a 3D magnetometer and a 1D force sensor, all sampled at 100 Hz. The first column contains a timestamp indicating when the sample was transmitted to the recording device via Bluetooth, and the last column is a simple sample counter. More info on the sensors.
    • labels.csv: Specifies which letter was written at which point in time. The values in the start and stop columns of labels.csv refer to the Millis column in sensor_data.csv.
    • calibration.txt: Contains parameters obtained from the last calibration procedure performed with this pen. More info on the calibration.
  • A Python 3 script split_characters.py that splits the sensor_data.csv file according to the timestamps given in the corresponding labels.csv file. It also saves the resulting letters and their labels in a pickle file for easier future use. See the script’s source code for the details.
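
To illustrate how labels.csv relates to sensor_data.csv, here is a minimal sketch of the splitting step. It is not the code of split_characters.py, only an approximation of the idea described above. The column names Millis, start and stop are taken from the description; the label column name ("Label"), the CSV delimiter, and the choice of inclusive start/stop bounds are assumptions that should be checked against the actual files. The helper names read_csv_rows, split_by_labels and iter_recordings are hypothetical.

```python
import csv
from pathlib import Path


def read_csv_rows(path, delimiter=","):
    """Read a CSV file with a header row into a list of dicts.

    The delimiter is an assumption; adjust it to match the real files.
    """
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle, delimiter=delimiter))


def split_by_labels(sensor_rows, label_rows, time_col="Millis",
                    start_col="start", stop_col="stop", label_col="Label"):
    """Return one (label, rows) pair per labelled letter.

    A sensor row belongs to a letter if its timestamp lies in
    [start, stop]. Inclusive bounds and the label column name are
    assumptions, not confirmed by the data description.
    """
    letters = []
    for label in label_rows:
        start = float(label[start_col])
        stop = float(label[stop_col])
        segment = [row for row in sensor_rows
                   if start <= float(row[time_col]) <= stop]
        letters.append((label[label_col], segment))
    return letters


def iter_recordings(root, delimiter=","):
    """Yield (folder_name, letters) for every recording subfolder of root."""
    for folder in sorted(Path(root).iterdir()):
        if not folder.is_dir():
            continue
        sensor_rows = read_csv_rows(folder / "sensor_data.csv", delimiter)
        label_rows = read_csv_rows(folder / "labels.csv", delimiter)
        yield folder.name, split_by_labels(sensor_rows, label_rows)
```

Calling iter_recordings on the downloaded folder would yield, for each of the subfolders, the list of labelled letter segments. For real work, the provided split_characters.py remains the reference.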

Stage 2

For stage 2, we added lower case letters to the data set. Overall, 26179 letters are provided, and the task is to predict 52 classes (26 upper case and 26 lower case letters). The data format stays the same.
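
As a small illustration of the 52-class setup, the sketch below maps each letter to an integer class index. The ordering (upper case first, then lower case) and the helper names are assumptions made for this example; the competition does not prescribe a particular encoding here.

```python
import string

# Assumed class order: 'A'..'Z' followed by 'a'..'z' (52 classes in total).
CLASSES = string.ascii_uppercase + string.ascii_lowercase


def letter_to_index(letter):
    """Map a single letter to its class index in CLASSES."""
    return CLASSES.index(letter)


def index_to_letter(index):
    """Map a class index back to its letter."""
    return CLASSES[index]
```

For stage 1, the same idea applies with string.ascii_uppercase alone, which gives 26 classes.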

Stage 3

Stay tuned!

Data Acquisition

To obtain the sensor data we provide for this challenge, we implemented a recording app that connects to a DigiPen and tells the volunteers which letter to write. These are some of the constraints that were met during the recordings:

  • The volunteers sat on a chair at a table during the recordings.
  • The writing surface was horizontal.
  • Normal white paper sheets (about 80 g/m^2) were used to write on.
  • The sheet was padded by five additional sheets.
  • There was no guideline concerning the size of the handwriting. The subjects were asked to use a size that is natural for them.
  • There was no guideline concerning the way of holding the pen. The subjects were asked to use a position that is natural for them.
  • The volunteers were asked to keep the STABILO logo facing up in order to avoid differing pen orientations.
  • Participants could choose freely between print and cursive writing styles.
  • Only recordings from right-handed writers are released during this challenge.