Working Paper

Machine Learning Based Linkage of Company Data for Economic Research: Application to the EBDC Business Panels

Valentin Reich
ifo Institute, Munich, 2024

ifo Working Paper No. 409

This article presents a comprehensive approach to probabilistic linkage of German company data using Machine Learning and Natural Language Processing techniques. The long-running ifo Institute surveys are linked to financial information in the Orbis database, addressing challenges specific to company data linkage, such as corporate structures and linguistic nuances in company names. Compared to a previous linkage, the approach achieves improved match rates and is able to re-evaluate existing matches. This article contributes best-practice advice for company data linkage and serves as documentation for the resulting research dataset.

Keywords: record linkage, company data, Orbis, survey data
JEL Classification: C810, C880