Member Initiatives

Discussion group: What (not) to share?

by Jasper de Groot

Open Science is hot. One day, the fact that “science” is “open” will have become so utterly normal that “open science” will be a tautology. Our grandchildren, at least the ones keen on identifying odd dichotomies, will laugh: “Was there ever a closed science?” Unfortunately, yes. The current era is one of transition. There are many benefits to open science (I won’t reiterate those here), which are recognized by an increasing number of individuals, united on platforms like this one that keep expanding. So far, so good. Yet, what we often do not realize in our (uncritical) enthusiasm is that there are risks associated with open science, and we should think carefully about what we share with whom. There is much room for improvement here, and this concerns both our knowledge about sharing (personal) data and the availability of facilities to do so responsibly.

As you may know, the “new” EU-based data protection act has been in place since 25 May 2018 (General Data Protection Regulation: https://gdpr-info.eu). Basically, this regulation specifies how organizations and their employees must handle the personal information they collect from and hold about, e.g., students and research subjects, as well as the files containing such information. This act may have (temporarily) increased awareness of how to handle personal information, but there is still a substantial number of data breaches. Oftentimes, the data breach is the first moment of “knowledge exchange” between data protection officers and researchers – a tad too late, perhaps. As prevention is better than treatment, communication between these two parties should be improved to avoid frustration (and extra workload) on either side, while facilities should be built (and their use made mandatory) to shield sensitive information in ways that burden researchers (and teachers) as little as possible, as time is costly.

Fortunately, initiatives for improved communication between the different parties have emerged at the national, university, and faculty levels. These initiatives are not mutually exclusive; they can benefit from cross-fertilization. It’s also a work in progress. First, at the national level, a task force from the National Coordination Point of Research Data Management (LCRDM: https://www.lcrdm.nl/en/task-groups) is devising a national strategy for research data management, for instance by assessing privacy risks associated with typical research scenarios in the social sciences. To arrive at typical research scenarios, they need researchers. Currently, I am the only researcher involved in this task force, and knowing that I cannot oversee all possible research scenarios myself, I would very much welcome your input here (email below). When the task force’s work is done, universities are free to adopt its recommendations regarding research data management. Second, at the university level, Utrecht University’s Research Data Management (RDM) support office offers tools and training on how to handle personal data (https://www.uu.nl/en/research/research-data-management), workshops are being planned around this theme, and around Halloween – yes, no joke – you can expect to hear some data horror stories (so reach out to the RDM support office if you have any of your own ;-)). Third, at the faculty level, the Faculty of Social & Behavioral Sciences (FSBS) has introduced the PRIvacy-Data-Ethics (PRIDE) tool, which enables us to register what kind of personal data we expect to gather, just stopping short of referring us (unless I overlooked something) to facilities we are expected to use when we do collect these data.

Talking about facilities, personal information is usually gathered during the earliest stages of research (e.g., screening information regarding sexual preference, ethnicity, political views, et cetera), and one solution that could nip potential data breaches in the bud is building a secure data server on which you can build questionnaires aimed at obtaining personal data. This way, personal information stays in one secure location, and access to that information remains traceable. Compare this to the risk of data “predation” by third parties when using Gmail to communicate about sensitive information with your prospective participants (which probably still happens out of ignorance, convenience, or both), then downloading these sensitive data to your personal computer and flash drive, and later forgetting your flash drive at the printer or unwittingly uploading the personal data to an online repository visible to anyone.

Long story short: Open science is awesome, but we should be careful when handling personal data. While this seems obvious, we can easily go wrong here, partly due to a lack of knowledge, partly due to a lack of adequate facilities. We have a shared responsibility here. Awareness is one step, and even the most modest input to the national task force could be another. Hence, this is also an initiative asking for further initiative. Feel free to reach out to me if you have any comments or ideas: j.h.b.degroot@uu.nl.