Comprehensive data quality: success factor for Open Science
Open Science promotes the creation of new knowledge and access to research results. The management of research data is a major challenge. Together with SWITCH, SATW is analysing existing standards and measures to simplify the exchange of data among researchers.
In the age of Big Data, everyone generates hundreds of megabytes of data per day, and the trend is rising. The quality standards applied to this data are generally low, and most of it is hardly ever reused. This should not be the case in research, where expensive measuring equipment or elaborate surveys can make data costly to obtain. Making research data widely accessible is a goal of the Open Science movement and in particular of Open Data.
Data must be FAIR
Research data management poses a number of challenges for the international scientific community, and several initiatives are currently underway in Switzerland to meet them. For example, the Swiss National Science Foundation (SNSF) has committed itself to making publicly financed research accessible to the public free of charge wherever possible. swissuniversities is developing an Open Science programme to enable Swiss universities to reuse and disseminate research data. The Swiss Academies of Arts and Sciences are also involved in this programme, and in their recently published fact sheet they make recommendations on the promotion of Open Access and Open Data. The central principles of these initiatives can be summarised by the acronym FAIR: Findable, Accessible, Interoperable and Reusable.
The documentation of research data depends on the science in question
Comprehensive data quality is central to the exchange and reuse of research data. Researchers need to know how raw data has been processed and that it has not been manipulated. This requires appropriate documentation, for example in the form of metadata. Such documentation is particularly important when data from different research disciplines is brought together. Since each field generates different kinds of data, each also uses a different set of attributes to describe it.
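As a rough illustration of what such documentation could look like, the following Python sketch attaches a small metadata record to a dataset: a checksum of the raw data and a list of the processing steps applied. The field names (title, discipline, checksum, processing_steps) and the helper functions are hypothetical and chosen for this example only; they do not follow any particular metadata standard or any method used by SATW or SWITCH.

```python
import hashlib
import json
from dataclasses import dataclass, field, asdict
from typing import List


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 checksum of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class DatasetRecord:
    # Hypothetical attributes; real disciplines use their own schemas.
    title: str
    discipline: str
    checksum: str
    processing_steps: List[str] = field(default_factory=list)


def describe(title: str, discipline: str, data: bytes,
             processing_steps: List[str]) -> DatasetRecord:
    """Create a metadata record for the given data."""
    return DatasetRecord(title, discipline, sha256_of(data),
                         list(processing_steps))


def is_unchanged(record: DatasetRecord, data: bytes) -> bool:
    """Check whether the data still matches the recorded checksum."""
    return record.checksum == sha256_of(data)


if __name__ == "__main__":
    raw = b"temperature_c\n21.4\n21.9\n"
    record = describe(
        "Example temperature series",   # made-up dataset
        "environmental science",
        raw,
        ["removed sensor warm-up phase", "converted Fahrenheit to Celsius"],
    )
    print(json.dumps(asdict(record), indent=2))
    print(is_unchanged(record, raw))               # True
    print(is_unchanged(record, raw + b"22.0\n"))   # False
```

A checksum of this kind only shows that the data has not changed since the record was created. It says nothing about how the data was handled before that point, which is why the description of the processing steps remains necessary.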
Comprehensive data quality for a research data connectome
SWITCH Innovation Labs was recently launched by the SWITCH foundation as an agile collaboration platform with higher education partners. In order to promote the Swiss Open Science ecosystem and the creation of a research data connectome, two labs were defined: "Comprehensive data quality" and "Technologies for a research data connectome". SATW was commissioned to run the first lab. Clear data quality metrics are essential for sharing research data across disciplines.
In addition to opening up new research questions and findings, comprehensive data quality makes results easier to reproduce. At present, reproducing results can be very time-consuming or even impossible. However, the requirement to disclose and share research data can also trigger resistance among researchers. After all, it increases the risk that errors are exposed, potentially damaging the reputation of researchers or their institutions. In addition, documentation should not generate significant additional effort, so that researchers can concentrate on their core tasks.
Expert survey documents the state of knowledge
By means of an expert survey, SATW is collecting the current state of knowledge and the measures in place for well-founded, comprehensive data quality in various research areas. The survey is based on the assumption that individual disciplines apply different standards and approach the issue from different angles. It identifies national needs and problems. Initial results are expected by the end of 2019, and follow-up activities will be initiated from 2020.
Manuel Kugler, Head Priority programme Advanced Manufacturing and Artificial Intelligence, Phone +41 44 226 50 21, manuel.kugler(at)satw.ch