Mainzelliste as an Open Source Service

Mainzelliste is a web-based first-level pseudonymisation service. It allows for the creation of personal identifiers (PID) from identifying attributes (IDAT), and thanks to the record linkage functionality, this is even possible with poor quality identifying data. The functions are available through a REST interface.

One use case is multi-centre research projects. Here, Mainzelliste acts as a trusted third party that creates uniform pseudonyms, and if necessary a separate one for each connected system. In this way, data on each study subject can be linked across institutions, while legal data protection requirements are met through an informational separation of powers, including pseudonymised data storage.

Mainzelliste was developed as the successor to the so-called “PID Generator” (Professor Klaus Pommerening, University Medical Centre, Mainz) in order to fulfil the requirements of the revised data protection concept of the TMF.

Figure 1 - Components of the Mainzelliste

Functionality

Creation of a Non-Speaking Pseudonym (PID Generator)

For each patient entered, Mainzelliste creates one or more non-speaking personal identifiers (PIDs) that are compatible with the identifiers of the original PID generator. These deterministically generated 8-character strings are suitable for use on the Web as well as for manual data transfer, as they allow up to two typos to be detected [1].

Record Linkage

Each patient should receive exactly one pseudonym, even if the patient is entered several times. Therefore, whenever a patient is added, the database is searched to determine whether this patient already exists. A modular record linkage system, which can be adapted to specific use cases through a configuration file, makes this possible even in the presence of typos or alternative spellings. New compared to the PID generator is, in particular, the option of using custom phonetic codes and string comparisons, so that names from other linguistic backgrounds can also be compared in a fault-tolerant way. Currently, weight-based record linkage is supported, but the modular design allows a custom algorithm to be retrofitted. The automatic matching process is complemented by the option of manually reviewing uncertain assignments.
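The idea of weight-based matching with a manual review zone can be illustrated with a minimal Python sketch. It is a simplified illustration, not the algorithm shipped with Mainzelliste: the field names, the m/u probabilities, both thresholds and the use of difflib as string comparator are all assumptions chosen for the example.

```python
import math
from difflib import SequenceMatcher

# Hypothetical field configuration (illustrative values only).
# m = probability that the field agrees for the same patient,
# u = probability that it agrees by chance for different patients.
FIELDS = {
    "first_name": {"m": 0.95, "u": 0.01, "fuzzy": True},
    "last_name": {"m": 0.95, "u": 0.005, "fuzzy": True},
    "birth_date": {"m": 0.98, "u": 0.001, "fuzzy": False},
}

# Hypothetical thresholds on the normalised score (0..1).
MATCH_THRESHOLD = 0.9      # at or above: treated as the same patient
POSSIBLE_THRESHOLD = 0.6   # between the two: left for manual review


def field_weight(m: float, u: float) -> float:
    """Agreement weight of a field: rarer chance agreement weighs more."""
    return math.log2(m / u)


def similarity(a: str, b: str, fuzzy: bool) -> float:
    """Similarity in [0, 1]; fuzzy fields tolerate typos, others must be equal."""
    a, b = a.strip().lower(), b.strip().lower()
    if fuzzy:
        return SequenceMatcher(None, a, b).ratio()
    return 1.0 if a == b else 0.0


def match_score(rec_a: dict, rec_b: dict) -> float:
    """Weighted average of the per-field similarities."""
    total = 0.0
    weighted = 0.0
    for name, cfg in FIELDS.items():
        w = field_weight(cfg["m"], cfg["u"])
        total += w
        weighted += w * similarity(rec_a[name], rec_b[name], cfg["fuzzy"])
    return weighted / total


def classify(score: float) -> str:
    """Map a score to match, possible_match (manual review) or non_match."""
    if score >= MATCH_THRESHOLD:
        return "match"
    if score >= POSSIBLE_THRESHOLD:
        return "possible_match"
    return "non_match"
```

With these example values, a record that differs from a stored one only by a spelling variant of the last name ("Meier" versus "Meyer") is still classified as a match, whereas a record with a completely different first name but identical last name and birth date falls into the review zone.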

REST-Based Web Interface

A lightweight REST-based [2] interface permits simple connection to different systems, for example registries, biobanks, EDC systems and study management systems. It is this interface that makes use from web browsers via AJAX and JSONP requests possible in the first place. Implementations exist, among others, for the study management system “SecuTrial” from the company iAS.
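How a connected system might talk to such an interface can be sketched in Python as follows. The sketch only builds the HTTP requests and does not send them. The server URL, the API key and the field names are hypothetical, and the resource paths, the header name and the token payload are assumptions that should be checked against the Mainzelliste API document before use.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://mainzelliste.example.org"  # hypothetical server address
API_KEY = "changeme"                           # hypothetical API key


def session_request() -> Request:
    """Step 1 (server side): open a session (assumed path and header)."""
    return Request(
        f"{BASE_URL}/sessions",
        method="POST",
        headers={"mainzellisteApiKey": API_KEY},
    )


def token_request(session_id: str) -> Request:
    """Step 2 (server side): request a token that permits adding one patient."""
    body = json.dumps({"type": "addPatient", "data": {"idtypes": ["pid"]}})
    return Request(
        f"{BASE_URL}/sessions/{session_id}/tokens",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "mainzellisteApiKey": API_KEY,
            "Content-Type": "application/json",
        },
    )


def add_patient_request(token_id: str, idat: dict) -> Request:
    """Step 3 (e.g. from the browser): submit IDAT using the token."""
    query = urlencode({"tokenId": token_id})
    return Request(
        f"{BASE_URL}/patients?{query}",
        data=urlencode(idat).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
```

Each request could then be sent with urllib.request.urlopen. The point of splitting the flow in this way is that the identifying data in the third step can go directly from the user's browser to the pseudonymisation service, without passing through the system that stores the medical data.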

Figure 2 - Example application in a clinical registry.

User and Administrator Interface

The complexity of record linkage remains hidden from users, who enter identifying patient data in a web browser through a streamlined, easy-to-understand user interface.

Possible record linkage errors can be corrected in an administrative user interface (implementation is in progress). Appropriate error messages simplify the process for users and administrators.

Figure 3 - HTML user interface

Backward Compatibility

For networks that still use the PID generator, a migration path to Mainzelliste is available in most cases. For this purpose, the phonetic algorithm by Jörg Michael [3] was reimplemented as part of an evaluation [4].

Merging Different Data Classes in Web Browsers

In some applications related to patient care, it is permitted and desirable for users to see real names instead of pseudonyms. The REST interface of Mainzelliste makes it possible in principle to display these names, but some modern and nearly all older web browsers present various obstacles to a stable implementation [5]. Mainzelliste helps to circumvent these obstacles by allowing access through the Mainz data protection library (implementation is in progress as part of the OSSE project).

Figure 4 - Generating and using a TempID (according to Model A of the old TMF concept).

Temporary Identifiers

In some applications related to patient care, permanent pseudonyms should be hidden from the user. Temporarily valid pseudonyms (TempIDs) are used in their place. They can be created, shared between the servers involved and converted back to identifiable data through the Mainz data protection library (implementation is in progress as part of the OSSE project).
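The principle of a temporarily valid identifier can be sketched in Python as follows. This is a minimal illustration under stated assumptions and not the Mainzelliste implementation: the class name TempIdStore, the in-memory dictionary and the ten-minute lifetime are hypothetical choices.

```python
import secrets
import time


class TempIdStore:
    """Hypothetical in-memory store mapping random TempIDs to permanent PIDs."""

    def __init__(self, ttl_seconds: float = 600.0, clock=time.monotonic):
        self._ttl = ttl_seconds   # assumed lifetime of a TempID
        self._clock = clock       # injectable clock, useful for testing
        self._entries = {}        # temp_id -> (pid, expires_at)

    def issue(self, pid: str) -> str:
        """Create a random TempID that reveals nothing about the PID."""
        temp_id = secrets.token_urlsafe(16)
        self._entries[temp_id] = (pid, self._clock() + self._ttl)
        return temp_id

    def resolve(self, temp_id: str):
        """Return the PID for a valid TempID, or None if unknown or expired."""
        entry = self._entries.get(temp_id)
        if entry is None:
            return None
        pid, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[temp_id]  # expired entries are discarded
            return None
        return pid
```

Only the TempID is passed to the user's browser or to other servers. The permanent pseudonym stays with the service that issued the TempID, and the mapping becomes useless once the lifetime has elapsed.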

Dissemination

Mainzelliste is already used in numerous projects and its application interface has been implemented in various software products for medical research. You can find a current list of usage references on the Bitbucket project web site.

Download

Everything that you need for your own Mainzelliste can be downloaded directly here (you will be redirected to an overview of the available versions). Please see the terms of use further down on this page.

Citation

If you want to refer to Mainzelliste in your publications, please cite the following article, which deals with the general concept and the interface of the program:

Lablans M, Borg A, Ückert F: A RESTful interface to pseudonymization services in modern web applications. BMC Med Inform Decis Mak. 2015 Feb 7;15:2. doi: 10.1186/s12911-014-0123-5.

Mailing List and Contact

In order to stay up-to-date, you can register for our mailing list.

You can reach the developers at info@mainzelliste.de.

Take Part in Bitbucket!

We have decided to make the code for Mainzelliste available in the repository service “Bitbucket” to facilitate collaboration. This service from the company Atlassian supports the distributed version-control systems Git and Mercurial and is available for free in the basic version. Bitbucket supplies all the necessary features for repository management (code browser, forks, commit history, etc.) and offers additional tools through an integrated wiki as well as an issue tracking system.

Click here to access our repository.

Documentation

It is relatively simple to set up the development environment for Mainzelliste and to start the programme locally. Our instructions for developers describe this and provide some further information (the documents are currently available only in German unless stated otherwise).

  • Getting started (PDF file, 74.2 KB): Basic information on installing and running Mainzelliste (English).
  • Instructions for developers (PDF file, 98.4 KB): Describes how to set up a development environment for further developing the programme code.
  • Configuration handbook (PDF file, 418.6 KB): Describes the parameters that are used in the configuration file of Mainzelliste, for example the record linkage parameters.
  • Installation instructions (PDF file, 101.1 KB): Describes the installation of an instance of Mainzelliste on a web server.
  • Instructions for the MDAT admin (PDF file, 180.4 KB): Describes how the Mainzelliste interface can be used by other web applications (for example registry software) to transparently integrate the pseudonymisation.
  • Mainzelliste API (PDF file, 440.3 KB): Formal specification of the interface, serving as a comprehensive overview of the functionality and as a reference for developers implementing the interface on either the client or the server side.

Configuration File

The configuration entries for matching weights and matching thresholds are of critical importance for correct matching. We have already set these to working values based on the extensive experience gained at our institute.

Terms of Use

Mainzelliste - A tool for pseudonymisation

Copyright © 2013 Martin Lablans, Andreas Borg, Frank Ückert

This programme is free software: you can distribute and/or modify it under the terms of the GNU Affero General Public License (AGPL) as published by the Free Software Foundation, either version 3 of the license or a later version. You should have received a copy of the AGPL with the software. If not, please see www.gnu.org/licenses/. If your software interacts with remote users over a computer network, you must make sure that your users have a way to download the software’s source code, including any changes you have made to it. For example, your website (in the case of a web application) could provide a link to an archive with the source code (see also Section 13 of the AGPL).

The following brief summary does not replace the text of the license. In brief, the GNU Affero General Public License means:

  • You are permitted to download the software for free and use it for non-commercial as well as commercial uses.
  • You must retain the license (GNU Affero General Public License) if you distribute the software or a portion of the software. You are NOT permitted to distribute the software in whole or in part under another license.
  • If the software is made available as a service over a network, this is also considered to be distribution.

Read the entire AGPL license here.

Mainzelliste Without Any Hassle

Are you interested in using Mainzelliste for pseudonymisation, for example for your medical research network, but would rather not operate the software yourself? Do you lack the appropriate hardware, or are you unable to act as a Mainzelliste provider yourself? Then we have a solution for you:

In principle, we, the IMBEI at the University Medical Centre, offer to operate Mainzelliste for our cooperation partners. For this, you need to conclude the corresponding contract with the University Medical Centre, which we have already drawn up and used in practice. It defines the subject matter of the contract, timelines, compensation, confidentiality, terms of use, contract termination and many other details.

References

  1. Faldum A., Pommerening K., An optimal code for patient identifiers. Computer methods and programs in biomedicine, 2005. 79: p. 81-8.
  2. Fielding R.T., Architectural Styles and the Design of Network-based Software Architectures. Doctoral dissertation, University of California, Irvine, 2000.
  3. Michael J., Doppelgänger gesucht – Ein Programm für kontextsensitive phonetische Textumwandlung. c’t Magazin für Computertechnik, 1999. 25.
  4. Warnecke T., Borg A., Ückert F., Lablans M., Fehlertolerantes Record Linkage von Patientendaten durch den Phonet-Algorithmus, http://www.egms.de/static/en/meetings/gmds2013/13gmds055.shtml.
  5. Lablans, M., et al. Eine generische Softwarebibliothek zur Umsetzung des TMF-Datenschutzkonzepts A im Webeinsatz. in 55. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie (gmds). 2010.
  6. Lablans M, Borg A, Ückert F: A RESTful interface to pseudonymization services in modern web applications. BMC Med Inform Decis Mak. 2015 Feb 7;15:2. doi: 10.1186/s12911-014-0123-5.