Blackbox, Five Years On: An Evaluation of a Large-scale Programming Data Collection Project

Research output: Chapter in Book/Report/Conference proceeding; Conference paper; peer-reviewed

39 Citations (Scopus)
295 Downloads (Pure)


The Blackbox project has been collecting programming activity data from users of BlueJ (a novice-targeted Java development environment) for nearly five years. The resulting dataset of more than two terabytes of data has been made available to interested researchers from the outset. In this paper, we assess the impact of the Blackbox project: we perform a mapping study to assess eighteen publications which have made use of the Blackbox data, and we report on the advantages and difficulties experienced by researchers working with this data, collected via a survey. We find that Blackbox has enabled pieces of research which otherwise would not have been possible, but there remain technical challenges in the analysis. Some of these -- but not all -- relate to the scale of the data. We provide suggestions for the future use of Blackbox, and reflections on the role of such data collection projects in programming research.
Original language: English
Title of host publication: ACM International Computing Education Research Conference
Number of pages: 9
Publication status: Published - 13 Aug 2018

