Data infrastructure

The Competence Centre operates a quality-assured data infrastructure hosted at FIZ Karlsruhe. It is built from the contents of the Scopus (Elsevier) and Web of Science (Thomson Reuters) databases.

During loading, the data are checked by a suite of automatic and semi-automatic procedures, and errors are fixed according to defined patterns. Several unification and standardization steps are also run, for example on journal names and country names.
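
A unification step of this kind can be pictured as a lookup against a table of known variants. The following is only a minimal sketch under stated assumptions: the variant table, the function names and the rule that unknown values are set aside for manual review are all hypothetical and are not taken from the actual loading procedures.

```python
import re
from typing import Optional

# Hypothetical variant table; the real correction patterns are not described here.
COUNTRY_VARIANTS = {
    "fed rep ger": "Germany",
    "federal republic of germany": "Germany",
    "deutschland": "Germany",
    "germany": "Germany",
    "usa": "United States",
    "u s a": "United States",
    "united states": "United States",
}


def normalize_key(raw: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", " ", raw.lower())
    return " ".join(cleaned.split())


def unify_country(raw: str) -> Optional[str]:
    """Return the standard country name, or None if the variant is unknown.

    Returning None (an assumption of this sketch) lets unknown values be
    routed to a semi-automatic review step instead of being guessed.
    """
    return COUNTRY_VARIANTS.get(normalize_key(raw))
```

The same pattern would apply to journal names, with a different variant table.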

The schemas of the databases are designed and optimized for bibliometric applications. In addition to the raw data, the databases contain enhanced data and pre-computed indicators.

One particular improvement is the disambiguation of institutional addresses for German institutions, that is, the cleaning and unification of address data. This sub-project is run by I²SOS at Bielefeld University.
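
To illustrate what cleaning and unifying address data involves, the sketch below assigns raw address strings to a unified institution name using simple token rules. It is an illustration only, not the method used in the sub-project: the rule list, the function names and the sample addresses in the usage note are hypothetical, and real disambiguation has to handle far more cases (name changes, mergers, similarly named institutions in the same city).

```python
import re
from collections import defaultdict

# Hypothetical rules: if all required tokens occur in an address,
# the address is assigned to the unified institution name.
RULES = [
    ({"univ", "bielefeld"}, "Bielefeld University"),
    ({"fiz", "karlsruhe"}, "FIZ Karlsruhe"),
]


def tokens(address):
    """Return lowercase word tokens; words starting with 'univ' become 'univ'."""
    words = re.findall(r"[a-zäöüß]+", address.lower())
    return {"univ" if w.startswith("univ") else w for w in words}


def assign_institution(address):
    """Return the unified name for an address, or None if no rule matches."""
    found = tokens(address)
    for required, name in RULES:
        if required <= found:  # every required token is present
            return name
    return None


def group_by_institution(addresses):
    """Group raw address strings under their unified institution name."""
    groups = defaultdict(list)
    for address in addresses:
        groups[assign_institution(address)].append(address)
    return dict(groups)
```

With these rules, "Univ Bielefeld, Fac Sociol, D-33501 Bielefeld, Germany" and "Universität Bielefeld, Bielefeld" both end up under "Bielefeld University", while addresses that match no rule are collected under None for manual inspection.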

To ensure the reproducibility of bibliometric analyses, the databases are set up once a year and old versions are archived.