Submissions

Final Results

The STABILO Challenge has come to an end! We were impressed with the range of different algorithms that you applied to this task of classifying 52 letters. Keeping in mind how difficult it is, even for humans, to distinguish characters like S, s, X and x, the best teams' results are truly remarkable. The algorithms were evaluated on writing from 20 persons who did not contribute to the training set, ensuring that the generalization capabilities of the submissions were put to the test.

  • 1st place: TAL ML Team (72.02%)
    • Clever data augmentation techniques combined with an ensemble method secured the victory by a wide margin.
    • Affiliation: TAL Education Group (CN)
  • 2nd place: LME_SAGI (64.59%)
    • Affiliation: Pattern Recognition Lab at University of Erlangen-Nuremberg (GER)
  • 3rd place: ValueError shape mismatch (61.50%)
    • Affiliation: Human-Computer Interaction Group at University of Duisburg-Essen (GER)

Congratulations to the winning teams!

Thanks to everybody who participated in this challenge. We hope you enjoyed the process, even though the challenge ended after stage 2 and we could not celebrate together in Cancún. Cross your fingers for Ubicomp 2021 in Cancún, with a STABILO Challenge 2.0!

 

Stage 1 Results

The 44 participating teams (as of 2020/04/14) could make voluntary submissions to have their stage 1 algorithms evaluated. The teams that reached above 80% accuracy are listed below:

  • EINNS: 83.71%
  • ValueError shape mismatch: 80.85%

Congratulations to the leading teams! Keep tuning your algorithms and good luck in stage 2!

 

Evaluation Metrics (Stage 2)

The winning teams are determined based on the stage 2 submissions.

  • The most important evaluation criterion is the accuracy obtained from testing the submitted executable file with data from unknown writers.
  • If the difference in accuracy between multiple teams is below 2 percentage points, the quality of the written report and the quality of the source code are taken into consideration.

What to submit after stage 2?

  • An executable of your classifier (more info below)
  • Source code necessary for preprocessing, training and testing your model (along with a common open source license of your choice). Please provide documentation.
  • A short written report (PDF, 2–6 pages) describing the algorithms that your team used.

Format of the Submitted Executable

All evaluations will be conducted on Windows 10 (64 bit), so please test your executable in that environment.

Requirements of the executable:

  • System: It is an .exe file runnable on Windows 10 (64 bit); see the packaging sketch after this list
  • Arguments: It takes the following command line arguments:
    • -p C:/path/to/folder/containing/split/csv/files/ (more info below)
    • -c C:/path/to/calibration/file/calibration.txt
  • Naming: Please call your .exe file TeamName_Stage2.exe.
  • Output: For every .csv file in the given folder, the path of the file and the predicted letter have to be printed. Paths and predictions are separated by *** and different files are separated by ~~~. The complete output has to be one line. The order of path***prediction tuples does not matter. Example:
    C:/path/0.csv***R~~~C:/path/1.csv***X~~~C:/path/10.csv***T~~~C:/path/100.csv***E~~~C:/path/101.csv***Y~~~...
  • Speed: The executable loads your saved model only once and then performs inference on all the validation files in the given folder. If the evaluation takes an unusually long time, your submission might not be considered.
  • Independence: The executable has to work independently and offline.
  • How-To: Please include a short, concise ReadMe file if necessary.
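
The challenge does not mandate how you create the .exe. As one hedged example (PyInstaller is a common option, not a requirement), the example script below, saved e.g. as pred.py, can be bundled into a single executable with:

    pyinstaller --onefile --name TeamName_Stage2 pred.py

PyInstaller writes the resulting TeamName_Stage2.exe to its dist/ folder, which then matches the naming requirement above.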

Format of the validation folder:

We did not publish the recordings of a number of volunteers. These are used as the validation set in stages 1 and 2.
Along with the training data, you received the Python script split_characters.py. For each person, it creates a csv folder containing the split single letters. These files have exactly the same format as the validation files: they contain 15 columns with a header.
Consequently, you can test your executable on the training data folders created by the split_characters.py script. (Unfortunately, the validation files will not contain the ground truth in their names 😛 )
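
If you want to sanity-check the layout of such a folder before submitting, here is a minimal sketch (the folder path is hypothetical; the ';' delimiter matches the example snippet below, and the 15-column count comes from the format described above):

    import os
    import pandas as pd

    SPLIT_FOLDER = 'C:/path/to/person/split_letters_csv/'  # hypothetical path to a folder created by split_characters.py

    # verify that every split letter file has the documented 15 columns
    for entry in os.scandir(SPLIT_FOLDER):
        if entry.path.endswith('.csv'):
            df = pd.read_csv(entry.path, delimiter=';')
            assert len(df.columns) == 15, entry.path + ' has ' + str(len(df.columns)) + ' columns'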

Useful code snippets:

    • An example script that loads the model and prints the output in the expected format. This one should be compiled into an .exe file and handed in.
      import os
      import argparse
      import pandas as pd
      from keras.models import load_model
      
      MODEL_PATH = 'C:/.../my_model.h5'
      CLASSES = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
      CLASSES = list(CLASSES)
      
      
      def do_pred_cmd_line():
          '''
          Starts the prediction routine from the command line
          '''
          parser = argparse.ArgumentParser(description='Add path to folder')
          parser.add_argument('-p', '--path', help='Input folder path', required=True)
          parser.add_argument('-c', '--calib', help='Calibration file path', required=True)
          args = parser.parse_args()
      
          do_pred(args.path, args.calib)
      
      
      def do_pred(input_folder_path, calib_path):
          '''
          Does the predictions for the given folder and the given calibration file
          :param input_folder_path: the absolute path leading to the folder containing the split csv files
          :param calib_path: the path leading to the calibration file
          :return: the formatted output string (path***prediction~~~path***prediction)
          '''
      
          # gather all csv files of given path
          letter_files = [f.path for f in os.scandir(input_folder_path) if f.path.endswith(".csv")]
      
          # load the saved model
          model = load_model(MODEL_PATH)
      
          # prepare the list of predictions
          predictions = []
      
          # do the prediction for each file
          for letter_file in letter_files:
              path, prediction = single_char_pred(model, letter_file, calib_path)
      
              predictions.append((path, prediction))
      
          # build the return string in the right format (path***prediction~~~path***prediction)
          single_predictions = [path + "***" + prediction for path, prediction in predictions]
          result_string = '~~~'.join(single_predictions)
      
          # print the result string (this output string will be evaluated by STABILO)
          print(result_string)
      
          # return only for testing purposes
          return result_string
      
      
      def single_char_pred(model, letter_path, calib_path):
          '''
          Performs the prediction for a single given letter on the given model. Also takes the calibration into account.
          :param model: the model or None to initialize it for the first time
          :param letter_path: the path to the letter csv file to predict
          :param calib_path: the path to the calibration file
          :return: a (path,prediction) tuple
          '''
      
          # check if the model was already loaded. Make sure you do not reload the model for each prediction
          if model is None:
              model = load_model(MODEL_PATH)
      
          # read the csv files
          letter = pd.read_csv(letter_path, delimiter=';', index_col='Millis')
          calib = pd.read_csv(calib_path, delimiter=':', header=None)
      
          # apply custom preprocessing routines
          letter = do_preprocessing(letter, calib)
      
          # perform the prediction
          output = model.predict_classes([letter])    # output could be [22]
      
          # decode model output
          prediction = CLASSES[output[0]] #  [22] is 'W'
      
          # return a (path,prediction) tuple
          return (letter_path, prediction)
      
      
      def do_preprocessing(letter, calib):
          # magic: insert your team's preprocessing here (e.g., apply the calibration values, normalize the signals)
          return preprocessed_letter
      
      # entry point (needed when the script is compiled into an executable)
      if __name__ == '__main__':
          do_pred_cmd_line()
    • Some code that you can use to test your .exe file. Make sure the lines that invoke the executable (the subprocess.check_output call) also work for your executable.
      import os
      import subprocess
      
      VALIDATION_FOLDER = "C:/.../"    # the folder containing the different person folders like /0/, /1/, ...
      
      
      def gather_valid_persons(folder):
          '''
          Gathers all persons' folders contained in the given folder
          :param folder: the folder containing subfolders representing persons
          :return: a list containing the directories contained in the given folder
          '''
      
          valid_person_folders = [f.path for f in os.scandir(folder) if f.is_dir()]
          valid_person_folders.sort(key=len)  # rough numeric ordering for folders named 0, 1, ..., 10, ...
      
          return valid_person_folders
      
      
      def transform_output_line(line):
          '''
          Transform the executable's string output to a list of (path,prediction)-tuples
          :param line: the output string to transform (looks like path***prediction~~~path***prediction)
          :return: the transformed list containing tuples
          '''
          tuples = []
          letters = line.split("~~~")
          for letter in letters:
              path_and_prediction = letter.split("***")
              tuples.append((path_and_prediction[0], path_and_prediction[1]))
      
          return tuples
      
      
      valid_persons = gather_valid_persons(VALIDATION_FOLDER)
      csv_folder_suffix = "/split_letters_csv/"
      path_to_exe_folder = "C:/.../"
      
      use_exe = True  # can be set to False to evaluate/debug the python script, not the exe
      correct = 0
      incorrect = 0
      
      for valid_person in valid_persons:
          valid_dir = valid_person + csv_folder_suffix # imagine these file names don't contain the ground truth
          lookup_table = get_ground_truth(...) # placeholder: build a dict mapping file path -> ground-truth letter
          calib_file = valid_dir + "calibration.txt"
      
          if use_exe:
              os.chdir(path_to_exe_folder)
      
              with open(os.devnull, 'w') as devnull:  # to ignore stderr
                  print("pred.exe -p " + valid_dir + " -c " + calib_file)
                  output = subprocess.check_output(
                      [r"pred.exe", "-p", valid_dir , "-c", calib_file],
                      stderr=devnull)
      
              # get the last output line of the executable
              output_line = output.splitlines()[-1]
      
              # decode the output line (as it is a binary string)
              result_string = str(output_line, 'utf-8')
      
              predictions = transform_output_line(result_string)
      
          else:  # for testing purposes, evaluate pred.py directly
              from pred import do_pred
      
              print("pred.exe -p " + valid_dir  + " -c " + calib_file)
              result_string = do_pred(valid_dir , calib_file)
              predictions = transform_output_line(result_string)
      
          # print the parsed predictions and the ground truths
          print("PRED: ", end="")
          for path, pred in predictions:
              print(pred + " ", end="")
      
          print("\nGrTr: ", end="")
          for path, pred in predictions:
              ground_truth = lookup_table[path]
              print(ground_truth + " ", end="")
      
              # for computing accuracy, save correct and incorrect results
              if ground_truth == pred:
                  correct = correct + 1
              else:
                  incorrect = incorrect + 1
      
          print("\n---next person---")
      
      print("SUMMARY:")
      accuracy = 100 * correct / (correct + incorrect)  # convert the fraction into a percentage
      print(str(correct) + " out of " + str(correct + incorrect) +
            " predictions were correct. (" + str(accuracy) + "%)")

 

How to submit?

The participants will be given information on how to submit via email.