
Project description: Merging writing process data with lexica

During the last 20 years writing research has focused explicitly on the analysis of writing processes. More recently, logging programs (like Inputlog) have enabled researchers to record process data (e.g. keystrokes and pauses) in much more detail without interfering with the writer's cognitive activities.

In the current project we aggregate the logged process data from the letter level (keystroke) to the word level by merging them with lexica and Natural Language Processing (NLP) tools. This creates a valuable basis for more linguistically oriented writing process research.


Linking writing process data to existing lexica and using NLP tools enables researchers to analyze the data at a higher, more complex level.

Three steps

The flow of the linguistic analysis consists of three steps:
  1. aggregating the data from the letter level to the word level
  2. parsing the S-notation
  3. enriching the process data with linguistic information
[Figure: flow of the linguistic analysis]
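The first step, aggregating from the letter level to the word level, can be illustrated with a minimal Python sketch. It is not Inputlog's implementation: the `Keystroke` and `Word` records, the single `time_ms` timestamp per key, and the rule that any character other than a letter, digit, apostrophe or hyphen ends a word are all assumptions made for illustration. Revisions such as deletions are ignored here; handling them is what the S-notation step is for.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Keystroke:
    char: str      # character produced by the key press (hypothetical field)
    time_ms: int   # timestamp of the key press in milliseconds (hypothetical field)


@dataclass
class Word:
    text: str
    start_ms: int          # time of the first keystroke of the word
    end_ms: int            # time of the last keystroke of the word
    pause_before_ms: int   # gap between the preceding keystroke and the word's first key


def _to_word(buffer: List[Keystroke], pause_before: int) -> Word:
    # Collapse a run of keystrokes into one word-level record.
    return Word(
        text="".join(k.char for k in buffer),
        start_ms=buffer[0].time_ms,
        end_ms=buffer[-1].time_ms,
        pause_before_ms=pause_before,
    )


def aggregate_to_words(keystrokes: List[Keystroke]) -> List[Word]:
    words: List[Word] = []
    buffer: List[Keystroke] = []
    pause_before = 0
    previous_time: Optional[int] = None

    for ks in keystrokes:
        if ks.char.isalnum() or ks.char in "'-":
            if not buffer:
                # First key of a new word: measure the pause that preceded it.
                pause_before = 0 if previous_time is None else ks.time_ms - previous_time
            buffer.append(ks)
        elif buffer:
            # Assumed word boundary (space, punctuation, ...): close the current word.
            words.append(_to_word(buffer, pause_before))
            buffer = []
        previous_time = ks.time_ms

    if buffer:
        words.append(_to_word(buffer, pause_before))
    return words


if __name__ == "__main__":
    log = [Keystroke(c, t) for c, t in
           [("t", 0), ("h", 120), ("e", 230), (" ", 400),
            ("c", 1900), ("a", 2010), ("t", 2100)]]
    for word in aggregate_to_words(log):
        print(word)
```

Keeping the pause before each word on the word-level record is one possible design choice: it preserves the timing information from the keystroke log, so pauses can later be related to linguistic properties of the word.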
At this stage the Inputlog process data are enriched with the following linguistic information:
  • part-of-speech tags,
  • lemmas,
  • chunks,
  • syllable boundaries,
  • and word frequencies.
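
As a rough illustration of the enrichment idea, the sketch below merges word-level records with a lexicon lookup. Everything in it is hypothetical: the `TOY_LEXICON` entries, the `TOY_CORPUS` text used to derive frequency counts, and the dictionary layout of the records. The project itself relies on existing lexica and NLP tools rather than a hand-written table, and chunks and syllable boundaries are left out here.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

# Hypothetical toy lexicon: surface form -> (lemma, part-of-speech tag).
TOY_LEXICON: Dict[str, Tuple[str, str]] = {
    "the": ("the", "DET"),
    "cats": ("cat", "NOUN"),
    "sleep": ("sleep", "VERB"),
}

# Tiny made-up reference text, used only to derive toy frequency counts.
TOY_CORPUS = "the cats sleep and the dogs sleep while the birds sing"


def enrich(words: List[dict],
           lexicon: Dict[str, Tuple[str, str]],
           corpus: str) -> List[dict]:
    """Add lemma, part-of-speech tag and frequency to each word record."""
    counts = Counter(corpus.lower().split())
    enriched = []
    for record in words:
        form = record["text"].lower()
        lemma: Optional[str]
        pos: Optional[str]
        # Words missing from the lexicon keep None for lemma and tag.
        lemma, pos = lexicon.get(form, (None, None))
        enriched.append({**record, "lemma": lemma, "pos": pos,
                         "frequency": counts[form]})
    return enriched


if __name__ == "__main__":
    process_words = [
        {"text": "The", "pause_before_ms": 0},
        {"text": "cats", "pause_before_ms": 1500},
        {"text": "purr", "pause_before_ms": 300},
    ]
    for row in enrich(process_words, TOY_LEXICON, TOY_CORPUS):
        print(row)
```

Because the process fields (here `pause_before_ms`) travel along with the linguistic fields, a record enriched this way lets pause behaviour be compared across, for example, word classes or frequency bands.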


This project has been partly funded by the Flemish Research Foundation (FWO), Belgium.
Budget: 27 000 euro.





Read more

Leijten, M., Macken, L., Hoste, V., Van Horenbeeck, E., & Van Waes, L. (2012). From Character to Word Level: Enabling the Linguistic Analyses of Inputlog Process Data. In M. Piotrowski, C. Mahlow & R. Dale (Eds.), European Association for Computational Linguistics, EACL - Computational Linguistics and Writing (CL&W 2012): Linguistic and Cognitive Aspects of Document Creation and Document Engineering (pp. 1-8).

Macken, L., Hoste, V., Leijten, M., & Van Waes, L. (2012). From keystrokes to annotated process data: Enriching the output of Inputlog with linguistic information. In N. Calzolari et al. (Eds.), Proceedings of the Eight International Conference on Language Resources and Evaluation (LREC'12) (pp. 2224-2229). European Language Resources Association (ELRA): Istanbul, Turkey [ISBN: 978-2-9517408-7-7].