Welcome to iPEHD – the ifo Prussian Economic History Database
The ifo Prussian Economic History Database (iPEHD) is a county-level database covering a rich collection of variables for all counties of Prussia during the 19th century. The Royal Prussian Statistical Office collected these data in a number of censuses over the period 1816-1901, with much county-level information surviving in the archives. These data provide a unique treasure for unprecedented micro-regional empirical research in economic history, analyzing the importance of such factors as education, religion, fertility, and many others for the economic development of Prussia in the 19th century. The service of iPEHD is to provide the data in a digitized and structured way.
“The ifo Prussian Economic History Database (iPEHD) provides a great opportunity for researchers to explore regional economic development during the 19th century.”
Prof. Dr. Ludger Wößmann, Director of the ifo Center for the Economics of Education
The Scope of iPEHD
iPEHD starts with the population census of 1816, the first full-scale census released by the Royal Prussian Statistical Office, which had been founded in 1805. The 1816 census covers the 308 Prussian counties that existed at the time. Further extensive census data are available for 1849, 1864, 1871, and 1882, but, as the following table indicates, many more detailed data were collected in additional years. As the number of counties grew over time, the data cover 574 Prussian counties by 1901.
In total, iPEHD contains more than 1,500 variables and more than half a million data points, all at the county level. These data are drawn from a total of 15 original sources, many of which consist of several volumes.
Year | No. of variables | No. of county observations | No. of data points
1816 | 58 | 308 | 17,864
1819 | 5 | 344 | 1,720
1821 | 22 | 344 | 7,568
1816-1821 | 24 | 456 | 10,944
1829 | 6 | 59 | 354
1849 | 712 | 335 | 238,520
1858 | 6 | 342 | 2,052
1862 | 4 | 346 | 1,384
1864 | 53 | 347 | 18,391
1866a | 1 | 342 | 342
1866b | 11 | 334 | 3,674
1871a | 25 | 453 | 11,325
1871b | 14 | 458 | 6,412
1878 | 5 | 426 | 2,130
1882a | 269 | 464 | 124,816
1882b | 14 | 465 | 6,510
1886a | 156 | 544 | 84,864
1886b | 97 | 518 | 50,246
1892 | 8 | 550 | 4,400
1896 | 15 | 552 | 8,280
1901 | 8 | 574 | 4,592
Sum | 1,513 | | 606,388
Note: Some of the data points may contain missing information.
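The number of data points reported for each source equals the number of variables multiplied by the number of county observations. As a minimal Python sketch of that arithmetic, using three rows copied from the table (the function name `data_points` is only illustrative):

```python
# Rows copied from the table: (year, no. of variables, no. of county observations).
rows = [
    ("1816", 58, 308),
    ("1849", 712, 335),
    ("1901", 8, 574),
]


def data_points(n_variables, n_counties):
    """Data points in one source = variables x county observations."""
    return n_variables * n_counties


points = {year: data_points(v, n) for year, v, n in rows}
for year, total in points.items():
    print(year, f"{total:,}")
```

Because some cells are missing in the sources, these products are upper bounds on the number of usable values.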
Using iPEHD
Before using the data contained in the iPEHD database, it is necessary for the user to understand the structure in which the data are presented in iPEHD. We document the features of the iPEHD database in the different sections of this website (accessible at the bottom), as well as in the following paper:
Becker, Sascha, Francesco Cinnirella, Erik Hornung and Ludger Wößmann, "iPEHD - The ifo Prussian Economic History Database", Historical Methods: A Journal of Quantitative and Interdisciplinary History 47 (2), 2014, 57–66. Working paper version available as CESifo Working Paper 3904.
One of the biggest challenges when analyzing historical data is to ensure comparability over time, where the dimension of the units of observation has to be comparable. Our service facilitates the analysis of data at the county level, holding the administrative boundaries fixed. Thus, before using iPEHD, please make sure to read the merging data section carefully to get familiar with the procedure of combining different census years.
iPEHD stores its data in comma-separated values (csv) format. The raw data are categorized by eight content areas and can be accessed in the raw data section.
The codebook section provides information on the names, definitions, labels, and sources for each variable contained in iPEHD.
The sources section documents the original volumes from which the iPEHD data have been digitized, published by the Royal Prussian Statistical Bureau or its employees.
A lot of research in economic history has used data from iPEHD by now. The publications section documents these papers and, for those of them already published in academic journals, provides ready-made datasets and codes to replicate the tables published in the papers using Stata.
It is always telling to visualize the data on historical maps. Therefore, thematic maps of some of the iPEHD data are provided in the maps section.
iPEHD is certainly not the only project dealing with historical Prussian data at the county level. Other projects provide such services as maps, information on territorial changes, additional data, and other material on Prussian counties. Several of these projects, whose work is highly appreciated and can be viewed as complementary to ours, are listed in the external links section.
The FAQs section provides answers to some frequently asked questions on standard problems encountered by iPEHD users. However, we have to point out that at the end of the day, the best way to structure and use the data will be specific to every single research project. To find the best possible solution to this task is a crucial part of any research project and thus lies in the responsibility of every individual researcher. While providing the service of supplying the historical data in a digitized way and suggesting ways on how to merge the data from different sources, the people behind iPEHD do not have the resources to answer additional questions for specific research projects.
iPEHD History
In 2006, when looking for data to analyze the relationship of literacy and religion with economic outcomes in German history, we stumbled upon the rich county-level data available from the Prussian census of 1871. The following two example pictures provide an impression of what the source volumes look like.
After thorough studies of the data, we were fascinated by the depth and breadth of the historical information that the Royal Prussian Statistical Office had collected and documented. Prussian thoroughness had produced high-quality data at the county level in the 19th century, documenting everything from education, religion, and demographics to economic development (see the following figure for an example).
Prussian economic history in the 19th century proves a fascinating setting to study many of the most fundamental questions in economic history. A country of such high diversity, but with roughly uniform institutional settings, allows answering many important research questions by analyzing the micro-regional data with modern microeconometric methods.
Soon, we recognized the sheer amount of data that were just sitting around in the statistical annals at German state libraries. The quality of this impressive collection of information, remarkable for the 19th century, has generally been regarded as excellent by historians and demographers. And compared to the selective samples which a lot of historical research is restricted to, the full censuses covering the whole population provide a much more reliable picture of the historical setting.
After the original “Was Weber wrong?” paper, which relied mainly on the 1871 census and subsequent data, we explored annals covering the little-known census data from 1816 to 1821. Although considerable effort was needed to make these data ready for research and to ensure their comparability, we soon found them to be very promising and equally reliable.
A third large data digitization project involved the census of 1849. The sheer amount of information provided in the sources was overwhelming.
The censuses of 1816, 1849, and 1871 became the foundation of iPEHD. But, as time went by, we also digitized data from various other censuses to fill in the gaps. Although the collection is far from complete, we find that the data provide a rather comprehensive overview of 19th-century economic history in Prussia.
Thus, we are happy to be able to make the digitized data available to the scientific community and the interested public. iPEHD went online in the summer of 2012 to be freely used by anyone interested.
The collection of these data and their provision to the scientific community is part of the project "Establishment of a leading international center for empirical research on the importance of education for long-term economic development," generously funded by the Leibniz Association under the Pact for Research and Innovation. The project was carried out at the department for Human Capital and Innovation at the ifo Institute – Leibniz Institute for Economic Research at the University of Munich.
Conditions of Use
The data are provided free of charge. We have tried to document the data as well as we possibly could, including references to the original publications from which the data are drawn. By downloading data from iPEHD, you assume full responsibility for their use. We cannot provide assistance on issues of research design, let alone statistical analysis.
If you are fully convinced that you have discovered a digitization error and have checked it against the original sources, we would be grateful to hear about it at iPEHD@ifo.de.
How to Cite the iPEHD Database
When using data from the iPEHD database in your work, please reference it as follows:
Becker, Sascha, Francesco Cinnirella, Erik Hornung and Ludger Wößmann, "iPEHD - The ifo Prussian Economic History Database", Historical Methods: A Journal of Quantitative and Interdisciplinary History 47 (2), 2014, 57–66. Working paper version available as CESifo Working Paper 3904.
Please also send one electronic copy of any work that uses data from the iPEHD database to us at iPEHD@ifo.de.
Data, Publications, FAQs
-
Before using the data contained in the iPEHD database, it is necessary for the user to understand the structure in which the data are presented. Therefore, please make sure to read the merging data section carefully before accessing the raw data. Apart from describing the data structure, that section presents a procedure to combine data from different census years.
iPEHD consists of county-level information gathered from different censuses. The data are currently presented in 76 separate data files, organized by content area, specific topic, and census year. Each data file in iPEHD contains a unique county (Kreis) identifier, the county name, the abbreviated district (Regierungsbezirk) name (rb), and a set of variables of census data.
iPEHD stores its data in comma-separated values (csv) format, which is easily accessible from any statistical software. For example, to open the csv data files in Stata, just type:
    insheet using "xxxxxx.csv"

To give an example of a data file, the following table shows a brief extract of a few variables for the first few counties (in alphabetical order) from the data file "ipehd_1819_indu_fac.csv", which contains data on the number of factories in a county in 1819. For example, the variable "fac1819_brick" documents the total number of brick manufactories in a county in 1819, and the variable "mill1819_water" the total number of water mills.
Extract from an example data file
kreiskey1800 | county | rb | fac1819_brick | fac1819_lime | fac1819_glass | mill1819_water
277 | Achen | AAC | 5 | 10 | 2 | 26
33 | Adelnau | POS | 11 | 6 | 0 | 26
254 | Adenau | KOB | 0 | 1 | 0 | 71
196 | Ahaus | MUN | 11 | 15 | 0 | 20
255 | Ahrweiler | KOB | 0 | 0 | 0 | 51
2 | Allenstein | KON | 5 | 0 | 1 | 31
219 | Altena | ARN | 3 | 13 | 0 | 41
257 | Altenkirchen | KOB | 1 | 0 | 0 | 41
10 | Angerburg | GUM | 4 | 26 | 0 | 5
53 | Angermünde | POT | 13 | 2 | 0 | 28
32 | Anklam | STE | 3 | 0 | 0 | 2
209 | Arnsberg | ARN | 12 | 4 | 0 | 26
67 | Arnswalde | FRA | 7 | 3 | 3 | 29
160 | Aschersleben | MAG | 8 | 5 | 0 | 57
55 | (Nieder-)Barnim | POT | 8 | 0 | 1 | 30
54 | (Ober-)Barnim | POT | 18 | 0 | 0 | 36
190 | Beckum | MUN | 8 | 3 | 0 | 22
Note: Extract from iPEHD data file “ipehd_1819_indu_fac.csv”.
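For readers who do not work in Stata, the same kind of file can be read with Python's standard `csv` module. The following is only a sketch: it assumes a comma delimiter, a header row holding the variable names, and integer counts in every census column, none of which has been checked against the actual download. The helper name `read_counties` is hypothetical, and the three sample rows are copied from the extract.

```python
import csv
import io

# Three rows copied from the extract; with the real download you would pass
# open("ipehd_1819_indu_fac.csv", newline="") instead of this in-memory buffer.
# Assumption: comma delimiter and a header row with the variable names.
sample = io.StringIO(
    "kreiskey1800,county,rb,fac1819_brick,fac1819_lime,fac1819_glass,mill1819_water\n"
    "33,Adelnau,POS,11,6,0,26\n"
    "254,Adenau,KOB,0,1,0,71\n"
    "196,Ahaus,MUN,11,15,0,20\n"
)


def read_counties(handle, text_columns=("county", "rb")):
    """Read a county file into dicts, converting all other columns to int.

    Empty cells become None so that missing information stays visible.
    """
    records = []
    for row in csv.DictReader(handle):
        record = {}
        for name, value in row.items():
            if name in text_columns:
                record[name] = value
            else:
                record[name] = int(value) if value != "" else None
        records.append(record)
    return records


counties = read_counties(sample)
water_mills = {c["county"]: c["mill1819_water"] for c in counties}
print(water_mills)
```

Files whose variables are not whole numbers (wages, for instance) would need `float` instead of `int`.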
The iPEHD data are categorized into the following eight content areas (also accessible through the sidebar in the upper right corner). A zip-file containing all iPEHD data files together can be accessed here:
Complete iPEHD Database (ZIP, 1392 KB)
Education
This area contains, among others, such data as the number of students, teachers, and schools by school type, literacy, and school finance.
Occupation
This area contains, among others, data on the labor force in agriculture, in factories, in manufacturing, in crafts, and in services.
Wages and Income Tax
This area contains data on daily wages of day laborers, on teacher income, and on income taxes.
Industry
This area contains data on a huge number of different factories, technologies, and transportation.
Agriculture
This area contains, among others, such data as livestock, crop yields, soil composition, and the distribution of land.
Population
This area contains data on the population by age, by gender, and by marital status, on birth and deaths, and on population with disabilities.
Religion
This area contains denomination-specific data on population, literacy, education, occupation, and number of churches.
Miscellaneous
This area contains data on the surface area, buildings, municipalities, and residential areas for each county.
Apart from these eight content areas, the Merger File provides information on merger variables necessary to combine data from different census years; see the merging data section for details.
-
iPEHD consists of county-level information gathered from different censuses. The data are currently presented in 76 separate data files, organized by content area, specific topic, and census year. Each data file in iPEHD contains a unique county (Kreis) identifier, the county name, the abbreviated district (Regierungsbezirk) name (rb), and a set of variables of census data. The categorized data can be accessed from the raw data section.
County-level Structure of the Data
Starting after the Congress of Vienna in 1815, Prussia reformed its administrative structure and introduced the county level. At the time, the dimension of a county was meant to follow the borders of previously existing administrative units. The maximum distance to the administrative center was meant to be two to three Prussian miles (roughly 15 to 23 km or 9 to 14 miles), such that every inhabitant could travel there and back within a day. The population size was meant to range between 20,000 in sparsely populated areas and 36,000 in densely populated areas.
Throughout the 19th century, various administrative reforms reshaped the county structure of Prussia. As the population grew over time, it became necessary to divide existing administrative units in order to reduce administrative efforts. Most of these changes were partitions of one county into two or more counties.
Thus, it is usually possible to reconstruct earlier administrative units by aggregating data from later years to the former structure. A drawback of this procedure is that the researcher loses part of the variation provided by having more observations. Still, the procedure appears necessary in order to have intertemporal comparability of the units of observation. The alternative would be to assign the same early data to two or more subsequently parted units, introducing measurement error if observed data were not uniformly distributed in the original area.
A peculiarity of the Prussian county system is the city county. Starting with the introduction of the county level in 1815, the so-called Immediatstädte (immediate towns) each became a county in their own right. As urbanization advanced, an increasing number of cities were detached from their original county and became counties of their own. Thus, the database often contains a Landkreis (rural county) and a Stadtkreis (city county) with similar names. For example, there are six pairs of Landkreis/Stadtkreis information among the 335 county observations in the 1849 classification and 20 pairs among the 458 county observations in the 1874 classification.
The external links section provides links to websites that contain lists of all Prussian counties, as well as more insight into the reforms.
County Identifiers
All data in iPEHD reflect the administrative conditions in place at the date of publication of the census. Since censuses often ordered the counties in different ways, identifiers were assigned reflecting the order of each census. Thus, each county in each census has been assigned a continuous number which is unique within a census but not across censuses. The identifiers are named kreiskeyYYYY, where YYYY represents the four-digit year (see below for additional peculiarities of the 1816-21 data).
The year in the identifier denotes the administrative structure of Prussia, which is not necessarily the same as the census year. In some cases, different identifiers (e.g., kreiskey1871 and kreiskey1874) even had to be assigned to data from the same census year (1871) because the Royal Prussian Statistical Office used different aggregations in different publications of data from the same census.
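Because the year in the identifier can differ from the census year, it can help to read the identifier year directly from a file's column names. The following Python sketch does this with a regular expression; the helper `find_kreiskeys` is hypothetical, and the header shown is a subset of the columns in "ipehd_1819_indu_fac.csv".

```python
import re

# Identifier columns are named kreiskeyYYYY, with YYYY a four-digit year.
KEY_PATTERN = re.compile(r"^kreiskey(\d{4})$")


def find_kreiskeys(column_names):
    """Return {column name: identifier year} for columns named kreiskeyYYYY."""
    found = {}
    for name in column_names:
        match = KEY_PATTERN.match(name)
        if match:
            found[name] = int(match.group(1))
    return found


header = ["kreiskey1800", "county", "rb", "fac1819_brick", "mill1819_water"]
keys = find_kreiskeys(header)
print(keys)
```

Here the 1819 variables are keyed by kreiskey1800, so the identifier year (1800) and the census year (1819) differ.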
Intertemporal Comparisons
If you are interested in intertemporal comparisons and the construction of panel datasets using iPEHD, please make sure to get familiar with the structure of the data first.
One of the biggest challenges of using iPEHD data from different years is to make sure that the dimension of the units of observation is comparable over time. Our county identifiers, together with the merge-county file, provide a service that facilitates such linkage. If you want to use it, please closely follow the nine steps mentioned below. Our suggestion is that, in order to analyze the data using a comparable set of observations, you will need to collapse the data to the earliest set of counties in the data.
However, we have to point out that at the end of the day, the best way to structure and use the data will be specific to every single research project. To find the best possible solution to this task is a crucial part of any research project and thus lies in the responsibility of every individual researcher.
If you want to conduct intertemporal comparisons, we suggest the nine-step procedure below, for which you will need to download the following file:
Merge-county file (CSV, 130 KB)
If your goal is to construct only cross-sections, then follow the procedure only until step 3.
1. Choose datasets from the same census year.
2. Merge all datasets from the same census year using the identifier (e.g. kreiskey1882).
3. Save the cross-section.
4. Use the merge-county file provided on this page.
5. Drop all duplicate and missing observations from the merge-county file according to the identifier in the cross-section (e.g. kreiskey1882): See examples below.
6. Merge the merge-county file with the cross-section using the identifier (e.g. kreiskey1882).
7. Aggregate (sum/mean) all variables in the cross-section to the aggregation level of the earliest census in your analysis using the identifier of the earliest census in your analysis (crucial step!).
8. Repeat steps 1 to 7 for datasets from other census years.
9. Merge the resulting cross-sections using the identifier of the earliest census in your analysis.
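The nine steps can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code distributed with iPEHD: the county keys follow the merge-file extract shown further down this page, but the variables `pop`, `schools`, and `pop1849` and all of their values are invented, and the helper functions are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data: keys follow the merge-file extract on this page,
# but the variables and all of their values are invented.
cross_1874_a = {38: {"pop": 30}, 39: {"pop": 40}, 42: {"pop": 70}}
cross_1874_b = {38: {"schools": 5}, 39: {"schools": 9}, 42: {"schools": 12}}
cross_1849 = {37: {"pop1849": 60}, 40: {"pop1849": 55}}

# Step 4: rows of the merge-county file (kreiskey1901, kreiskey1874, kreiskey1849).
merge_rows = [
    (38, 38, 37), (39, 39, 37),
    (42, 42, 40), (43, 42, 40), (44, 42, 40),
]


def merge_on_key(left, right):
    """Steps 2 and 9: one-to-one merge of two {key: variables} tables."""
    return {k: {**left[k], **right[k]} for k in left if k in right}


def key_map(rows, later_col, earlier_col):
    """Step 5: keep one row per later-year key and skip missing keys."""
    mapping = {}
    for row in rows:
        later, earlier = row[later_col], row[earlier_col]
        if later is None or earlier is None:
            continue
        mapping.setdefault(later, earlier)  # repeated later-year keys are ignored
    return mapping


def aggregate_sum(cross, mapping):
    """Steps 6 and 7: attach the earlier key and sum all variables within it."""
    totals = defaultdict(lambda: defaultdict(int))
    for later_key, variables in cross.items():
        earlier_key = mapping[later_key]
        for name, value in variables.items():
            totals[earlier_key][name] += value
    return {k: dict(v) for k, v in totals.items()}


cross_1874 = merge_on_key(cross_1874_a, cross_1874_b)      # steps 1 to 3
to_1849 = key_map(merge_rows, later_col=1, earlier_col=2)  # steps 4 and 5
agg_1874 = aggregate_sum(cross_1874, to_1849)              # steps 6 and 7
panel = merge_on_key(agg_1874, cross_1849)                 # steps 8 and 9
print(panel)
```

Summing is appropriate for counts such as population. Variables that are shares or averages call for a (weighted) mean instead, which is why step 7 mentions both options.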
Example from the Merger File
In the example below, you will find that the eight illustrative counties observed in 1901 were established from six counties in 1874 and five counties in 1849. Between 1849 and 1874, the 'Elbing Landkreis' had been divided into 'Elbing Stadtkreis' and 'Elbing Landkreis'. Between 1874 and 1901, the 'Danzig Landkreis' had been divided into 'Danzig Niederung', 'Danzig Höhe', and 'Dirschau'.
Kreiskey 1901 | County1901 | Kreiskey 1874 | County1874 | Kreiskey 1849 | County1849
38 | ELBING STADTKREIS | 38 | ELBING STADTKREIS | 37 | ELBING LANDKREIS
39 | ELBING LANDKREIS | 39 | ELBING LANDKREIS | 37 | ELBING LANDKREIS
40 | MARIENBURG IN PREUSSEN | 40 | MARIENBURG IN PREUSSEN | 38 | MARIENBURG IN PREUSSEN
41 | DANZIG STADTKREIS | 41 | DANZIG STADTKREIS | 39 | DANZIG STADTKREIS
42 | DANZIG NIEDERUNG | 42 | DANZIG LANDKREIS | 40 | DANZIG LANDKREIS
43 | DANZIG HOHE | 42 | DANZIG LANDKREIS | 40 | DANZIG LANDKREIS
44 | DIRSCHAU | 42 | DANZIG LANDKREIS | 40 | DANZIG LANDKREIS
45 | PREUSSISCH STARGARD | 43 | PREUSSISCH STARGARD | 41 | PREUSSISCH STARGARD
Note: Extract from the iPEHD merge file “ipehd_merge_county.csv”.
In order to have a comparable set of observations when performing intertemporal comparisons between 1901 and 1849, you will have to aggregate the observations of 'Danzig Niederung', 'Danzig Höhe', and 'Dirschau' to match 'Danzig Landkreis'. Thus, you should always aggregate the data to the aggregation level of the earliest census year in your analysis (step 7).
However, if you would like to perform intertemporal comparisons between e.g. 1874 and 1849, you will need to drop the duplicate entries of 'Danzig Landkreis' from the merger file first (step 5). In addition, you will need to drop entries from the merger file that have missing observations on the county identifier in the respective year. Such missing observations exist because some territories were annexed by Prussia only after the respective census year.
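Step 5 can be illustrated with the eight rows of the extract. The Python sketch below keeps one row per kreiskey1874 and drops rows with a missing identifier. The helper `prepare_merge_rows` is hypothetical, and the extra `annexed` row, including its key values, is invented purely to show how a missing 1849 identifier is handled.

```python
# Rows copied from the merge-file extract: (kreiskey1901, kreiskey1874, kreiskey1849).
extract = [
    (38, 38, 37), (39, 39, 37), (40, 40, 38), (41, 41, 39),
    (42, 42, 40), (43, 42, 40), (44, 42, 40), (45, 43, 41),
]
# Hypothetical extra row: a territory without an 1849 identifier (values invented).
annexed = (999, 998, None)


def prepare_merge_rows(rows, key_index, target_index):
    """Drop rows with a missing identifier, then keep the first row per key."""
    seen = set()
    kept = []
    for row in rows:
        key, target = row[key_index], row[target_index]
        if key is None or target is None:
            continue  # territory not covered in one of the two years
        if key in seen:
            continue  # duplicate entry, e.g. the repeated Danzig Landkreis
        seen.add(key)
        kept.append((key, target))
    return kept


pairs_1874_1849 = prepare_merge_rows(extract + [annexed], key_index=1, target_index=2)
print(pairs_1874_1849)
```

The eight extract rows reduce to six pairs, one per 1874 county. Keeping the first row per key is safe here because every duplicate of a given kreiskey1874 points to the same kreiskey1849.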
As one example of how to merge datasets from 1874 and 1849, Stata code that works through the nine steps of the suggested procedure is available for download.
Peculiarity of the Data from 1816 to 1821
By 1816, Prussia had just started her administrative reform that established the county level. In some parts of the country, the reforms had not been finalized even in 1821. Thus, the data from the censuses in 1816 until after 1821 sometimes reflect old administrative units.
Unfortunately, due to the reform, these old units were subsequently aggregated and then newly divided in order to establish the new counties. This makes it impossible to accurately match the data of (some of) the administrative units from the early censuses to (some) counties in subsequent censuses. We therefore created the identifier kreiskey1800, which aggregates the data to a higher level. You need to use kreiskey1800 to link the 1816-1821 data to later periods.
However, iPEHD also provides a unique identifier that allows merging data from the same census for these cross-sections. These identifiers are named ‘id1816’ and ‘id1819’. In order to merge data from 1816 to other data from 1816, please use id1816. In order to merge data from 1819 or 1821 to other data from 1819 or 1821, please use id1819. In order to merge data from 1816, 1819, or 1821 to data from subsequent censuses, please follow the steps below:
1. Choose datasets from 1816, 1819, or 1821.
2. Merge all datasets from the same census using the identifier (idYYYY).
3. Aggregate (sum/mean) all cross sections using the identifier 'kreiskey1800'.
4. Merge the cross section with aggregated data from subsequent censuses using the identifier 'kreiskey1800'.
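The four steps for the early censuses can be sketched in Python as follows. Every key and value below is invented, and the sketch assumes that each early record carries both id1816 and kreiskey1800; please check the actual files and the codebook before relying on that layout.

```python
# Hypothetical 1816 records: every key and value below is invented.
# Assumption: each early record carries both id1816 and kreiskey1800.
pop_1816 = [
    {"id1816": 1, "kreiskey1800": 10, "pop": 120},
    {"id1816": 2, "kreiskey1800": 10, "pop": 80},
    {"id1816": 3, "kreiskey1800": 11, "pop": 150},
]
school_1816 = [
    {"id1816": 1, "schools": 4},
    {"id1816": 2, "schools": 3},
    {"id1816": 3, "schools": 6},
]
# Later-census data that have already been aggregated to kreiskey1800.
later = {10: {"pop_later": 260}, 11: {"pop_later": 190}}

# Step 2: merge files from the same census on id1816.
schools_by_id = {r["id1816"]: r["schools"] for r in school_1816}
merged = [dict(r, schools=schools_by_id[r["id1816"]]) for r in pop_1816]

# Step 3: aggregate (here: sum) to the level of kreiskey1800.
by_1800 = {}
for r in merged:
    unit = by_1800.setdefault(r["kreiskey1800"], {"pop": 0, "schools": 0})
    unit["pop"] += r["pop"]
    unit["schools"] += r["schools"]

# Step 4: merge with the later data on kreiskey1800.
linked = {k: {**v, **later[k]} for k, v in by_1800.items() if k in later}
print(linked)
```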
-
The codebooks provide additional information for each variable contained in iPEHD. There is one codebook for each year, so that you will find explanations for each variable in the codebook for the corresponding year. A summary codebook that combines all years is also provided; this summary codebook allows a content search of the whole iPEHD.
The codebooks list the variable name (“variable name”), the name of the data file where it can be found (“ipehd datasets”), an English label (“label”), and the original label in German language (“original label”). The German language label is similar to the table headings found in the original sources. The English label leads with the year and is a shortened (direct) translation of the German label; in cases where a translation is not feasible, the original German term was adopted. On top of each set of variables, the codebooks also indicate the source of this set of variables (“source”).
In any case, it is always advisable to consult the original sources for detailed information. They often give helpful insights regarding the exact attributes of the variables. You can also find additional explanations by reading the publications that first used the specific iPEHD data.
1816-1901 Codebook Summary (XLSX, 122 KB)
-
The iPEHD data have been digitized from different sources originally published by the Royal Prussian Statistical Bureau or its employees. This page provides a list of all the volumes used as sources for iPEHD. By now, complete scans of many of these volumes can be found at Google Books.
1816-21
Mützell, Alexander A. (1821-25). Neues Topographisch-statistisch-geographisches Wörterbuch des Preussischen Staats, Vol. 1-6. Halle: Karl August Kümmel.
1829
Preussisches Statistisches Landesamt (1829). Beiträge zur Statistik der Königlichen Preussischen Rheinlande, aus amtlichen Nachrichten zusammengestellt. Aachen: J.A. Mayer.
1849
Statistisches Bureau zu Berlin (1851-55). Tabellen und amtliche Nachrichten über den Preussischen Staat für das Jahr 1849, Vol. 1-6b. Berlin: Statistisches Bureau zu Berlin.
1858
Meitzen, August (1868). Der Boden und die landwirthschaftlichen Verhältnisse des Preussischen Staates, Vol. 1-4. Berlin: Verlag von Paul Parey.
1862
Königlich Preussisches Statistisches Bureau (1863). Die Eisen-, Stein- und Wasserstrassen des preussischen Staates im Jahre 1862, in Zeitschrift des Königlich Preussischen Statistischen Bureaus, Vol. 3, 206–214. Berlin: Verlag der Königlichen Geheimen Ober-Hofbuchdruckerei.
1864
Königliches Statistisches Bureau in Berlin (1867). Die Ergebnisse der Volkszählung und Volksbeschreibung, der Gebäude und Viehzählung, nach den Aufnahmen vom 3. December 1864, resp. Anfang 1865 und die Statistik der Bewegung der Bevölkerung in den Jahren 1862, 1863 und 1864. Preussische Statistik Vol. 10. Berlin: Verlag von Ernst Kuehn.
1866
Meitzen, August (1868). Der Boden und die landwirthschaftlichen Verhältnisse des Preussischen Staates, Vol. 1-4. Berlin: Verlag von Paul Parey.
1871
Königliches Statistisches Bureau (1873-74). Die Gemeinden und Gutsbezirke des Preussischen Staates und ihre Bevölkerung: Nach den Urmaterialien der allgemeinen Volkszählung vom 1. December 1871, Vol. 1-11. Berlin: Verlag des Königlichen Statistischen Bureaus.
Königliches Statistisches Bureau in Berlin (1875). Die Ergebnisse der Volkszählung und Volksbeschreibung im Preussischen Staate vom 1. December 1871. Preussische Statistik Vol. 30. Berlin: Verlag des Königlichen Statistischen Bureaus.
1878
Herrfurth, Ludwig and Conrad Studt (1880). Finanzstatistik der Kreise des preussischen Staates für das Jahr 1877/78. Zeitschrift des Preussischen Statistischen Landesamtes, Ergänzungshefte, Vol. 7. Berlin: Verlag des Königlichen Statistischen Bureaus.
1882
Königliches Statistisches Bureau in Berlin (1884/85). Die Ergebnisse der Berufsstatistik vom 5. Juni 1882 im preussischen Staat. Preussische Statistik Vol. 76 a-c. Berlin: Verlag des Königlichen Statistischen Bureaus.
1886
Königliches Statistisches Bureau in Berlin (1887). Die Ergebnisse der Ermittelung des Ernteertrags im preussischen Staate für das Jahr 1886. Preussische Statistik Vol. 92. Berlin: Verlag des Königlichen Statistischen Bureaus.
Königliches Statistisches Bureau in Berlin (1889). Das gesammte Volksschulwesen im preußischen Staate im Jahre 1886. Preussische Statistik Vol. 101. Berlin: Verlag des Königlichen Statistischen Bureaus.
1892
Neuhaus, Georg (1904). Die ortsüblichen Tagelöhne gewöhnlicher Tagearbeiter in Preußen 1892 und 1901, in Zeitschrift des Königlich Preussischen Statistischen Bureaus, Vol. 44, 310–346. Berlin: Verlag des Königlichen Statistischen Bureaus.
1896
Königliches Statistisches Bureau in Berlin (1897). Die Ergebnisse der Ermittelung des Ernteertrags im preussischen Staate für das Jahr 1896. Preussische Statistik Vol. 147. Berlin: Verlag des Königlichen Statistischen Bureaus.
1901
Neuhaus, Georg (1904). Die ortsüblichen Tagelöhne gewöhnlicher Tagearbeiter in Preußen 1892 und 1901, in Zeitschrift des Königlich Preussischen Statistischen Bureaus, Vol. 44, 310–346. Berlin: Verlag des Königlichen Statistischen Bureaus.
-
A lot of research in economic history has used data from iPEHD by now. These publications are listed below, ordered by the year in which the first working-paper version of the paper was published. The list also provides brief descriptions of and links to the papers; for papers published in academic journals, please always refer to the final published versions, which often provide substantially more robustness analyses than the initial working-paper versions. For those papers already published in academic journals, we also provide ready-made datasets and codes to replicate the tables published in the papers using Stata.
Recent Projects Using Prussian Data
Becker, Sascha O., and Luigi Pascali (2019). Religion, Division of Labor, and Conflict: Anti-semitism in Germany over 600 Years. American Economic Review 109 (5): 1764-1804.
Bauernschuster, Stefan, Anastasia Driva, and Erik Hornung (2019). Bismarck's Health Insurance and the Mortality Decline. Journal of the European Economic Association, forthcoming.
Hornung, Erik (2019). Diasporas, Diversity, and Economic Activity: Evidence from 18th-century Berlin. Explorations in Economic History 73: 101261.
Cinnirella, Francesco, and Ruth Schueler (2018). Nation Building: The Role of Central Spending in Education. Explorations in Economic History 67 (1): 18-39.
Cinnirella, Francesco, and Jochen Streb (2017). The Role of Human Capital and Innovation in Economic Development: Evidence from Post-Malthusian Prussia. Journal of Economic Growth 22 (2): 193-227.
Ashraf, Quamrul, Francesco Cinnirella, Oded Galor, Boris Gershman, and Erik Hornung (2017). Capital-Skill Complementarity and the Emergence of Labor Emancipation. CESifo Working Paper 6423.
Cinnirella, Francesco, and Jochen Streb (2017). Religious Tolerance as Engine of Innovation. CESifo Working Paper 6797.
Cinnirella, Francesco, and Ruth Schueler (2016). The Cost of Decentralization: Linguistic Polarization and the Provision of Education. CESifo Working Paper 5894.
Hornung, Erik (2015). Railroads and Growth in Prussia. Journal of the European Economic Association 13 (4): 699-736.
Hornung, Erik (2014). Immigration and the Diffusion of Technology: The Huguenot Diaspora in Prussia. American Economic Review 104 (1): 84-122.
2014
Education and Religious Participation: City-Level Evidence from Germany’s Secularization Period 1890-1930
Using panel data of advanced-school enrollment and Protestant church attendance in German cities between 1890 and 1930, this paper finds that education is negatively related to church attendance in panel models with fixed effects.
Published version: Becker, Sascha O., Markus Nagler, and Ludger Woessmann (2017). Education and Religious Participation: City-Level Evidence from Germany’s Secularization Period 1890-1930. Journal of Economic Growth 22 (3): 273-311.
Data download: Datasets & Stata codes (ZIP)
2013
Not the Opium of the People: Income and Secularization in a Panel of Prussian Counties
Combining income data with data on Protestant church attendance in Prussian counties for six waves from 1886-1911, the paper finds that – in contrast to the negative cross-sectional association – panel analyses do not confirm a significant relationship between income and church attendance.
Published version: Becker, Sascha O., and Ludger Woessmann (2013). Not the Opium of the People: Income and Secularization in a Panel of Prussian Counties. American Economic Review: Papers & Proceedings 103 (3): 539-544.
Data download: Datasets & Stata codes
2012
iPEHD - The ifo Prussian Economic History Database
This paper documents the ifo Prussian Economic History Database (iPEHD), which provides a rich collection of county-level variables for 19th-century Prussia in a digitized and structured way.
Published version: Becker, Sascha O., Francesco Cinnirella, Erik Hornung, and Ludger Woessmann (2014). iPEHD – The ifo Prussian Economic History Database. Historical Methods 47 (2): 57-66.
2011
Landownership Concentration and the Expansion of Education
Combining data from several censuses that effectively span the entire 19th century (1816, 1849, 1864, 1886, and 1896), as well as data from a 1866 classification of soil composition, this paper finds that landownership concentration, a proxy for the institution of serf labor, has a negative effect on school enrollment which diminishes in the second half of the century.
Published version: Cinnirella, Francesco, and Erik Hornung (2016). Landownership Concentration and the Expansion of Education. Journal of Development Economics 121: 135-152.
Knocking on Heaven’s Door? Protestantism and Suicide
Using data from 1816-21 and 1869-71, this paper finds a substantial positive effect of Protestantism on suicide.
Published version: Becker, Sascha O., and Ludger Woessmann (2018). Social Cohesion, Religious Beliefs, and the Effect of Protestantism on Suicide. Review of Economics and Statistics 100 (3): 377-391.
Data download: Datasets & Stata codes
Does Women's Education Affect Fertility? Evidence from Pre-Demographic Transition Prussia
Combining data from three censuses – 1816, 1849, and 1867 – this paper finds a negative residual effect of women's education on fertility, despite controlling for several demand and supply factors.
Published version: Becker, Sascha O., Francesco Cinnirella, and Ludger Woessmann (2013). Does Women’s Education Affect Fertility? Evidence from Pre-Demographic Transition Prussia. European Review of Economic History 17 (1): 24-44.
Data download: Datasets & Stata codes (ZIP, 46 KB)
2010
The Effect of Investment in Children's Education on Fertility in 1816 Prussia
Using data from the 1816 census, this paper finds a significant negative causal effect of education on fertility – evidence for a child quantity-quality trade-off – already several decades before the demographic transition and shows that it is robust to accounting for spatial autocorrelation.
Published version: Becker, Sascha O., Francesco Cinnirella, and Ludger Woessmann (2012). The Effect of Investment in Children’s Education on Fertility in 1816 Prussia. Cliometrica 6 (1): 29-44.
Data download: Datasets & Stata codes (ZIP, 18 KB)
The Effect of Protestantism on Education before the Industrialization: Evidence from 1816 Prussia
This paper shows that Protestantism led to more schooling already in 1816, before the Industrial Revolution, ruling out that Protestant education just resulted from industrialization.
Published version: Becker, Sascha O., and Ludger Woessmann (2010). The Effect of Protestantism on Education before the Industrialization: Evidence from 1816 Prussia. Economics Letters 107 (2): 224-228.
Data download: Datasets & Stata codes (ZIP, 30 KB)
2009
Education and Catch-up in the Industrial Revolution
This paper combines school-enrollment and factory-employment data from 1816, 1849, and 1882 to show that – in contrast to the state-of-the-art view based on British evidence – basic education is significantly associated with non-textile industrialization in both phases of the Industrial Revolution.
Published version: Becker, Sascha O., Erik Hornung, and Ludger Woessmann (2011). Education and Catch-up in the Industrial Revolution. American Economic Journal: Macroeconomics 3 (3): 92-126.
Data download: Datasets & Stata codes (ZIP, 73 KB)
The Trade-off between Fertility and Education: Evidence from before the Demographic Transition
This paper uses data from the 1849 census and other sources to show that a trade-off between child quantity and quality existed already in the 19th century and that causation between fertility and education runs both ways.
Published version: Becker, Sascha O., Francesco Cinnirella, and Ludger Woessmann (2010). The Trade-off between Fertility and Education: Evidence from before the Demographic Transition. Journal of Economic Growth 15 (3): 177-204.
Data download: Datasets & Stata codes (ZIP, 32 KB)
2008
Luther and the Girls: Religious Denomination and the Female Education Gap in 19th Century Prussia
Using data from the first Prussian census in 1816, among others, this paper shows that a larger share of Protestants in a county’s population decreased the gender gap in basic education.
Published version: Becker, Sascha O., and Ludger Woessmann (2008). Luther and the Girls: Religious Denomination and the Female Education Gap in 19th Century Prussia. Scandinavian Journal of Economics 110 (4): 777-805.
Data download: Datasets & Stata codes (ZIP, 81 KB)
2007
Was Weber Wrong? A Human Capital Theory of Protestant Economic History
This paper uses data from several censuses (Population 1871, Occupation 1882, Education 1886) and additional sources (including the Income Tax Statistics 1877) to show that the higher economic prosperity of Protestant relative to Catholic counties can be accounted for by Protestants' higher literacy (presumably spurred by instruction in reading the Bible), suggesting that explanations based purely on differential work ethics may have limited explanatory power.
Published version: Becker, Sascha O., and Ludger Woessmann (2009). Was Weber Wrong? A Human Capital Theory of Protestant Economic History. Quarterly Journal of Economics 124 (2): 531-596.
Data download: Datasets & Stata codes (ZIP, 69 KB)
iPEHD is certainly not the only project dealing with historical Prussian data at the county level. Other projects provide such services as maps, information on territorial changes, additional data, and other material on Prussian counties. Several of these projects, whose work is highly appreciated and can be viewed as complementary to ours, are listed below.
Galloway Prussia Database 1861 to 1914
In contrast to the focus of iPEHD on data relevant for economic history, the "Galloway Prussia Database 1861 to 1914" provides digitized Prussian census data for demographic analyses. These data provide a lot of information complementary to iPEHD.
Galloway Prussia Database 1861 to 1914
IEG Maps
The maps provided in our thematic maps section were produced with the help of the IEG-Maps project located at the Institute for European History at the University of Mainz. The project homepage provides digital base maps of German and European history, covering themes such as politics, administration, economics, and transportation.
MPIDR Project on GIS Maps
At the Max Planck Institute for Demographic Research (MPIDR) in Rostock, an ongoing research project compiles GIS-libraries with shapefiles of annual cross-sections documenting the changing historical regional administrative boundaries in Germany at the district level since the early 19th century.
Population History GIS Collection
ZBW Project on Digital Statistics of the German Reich (1873-1883)
The ZBW - Leibniz Information Center for Economics has an ongoing project on the "Digitisation of the Statistics of the German Reich - Alte Folge (old sequence) - (1873-1883)". The project aims to make the volumes available for viewing on the internet and, subsequently, accessible and searchable in spreadsheet formats.
Documentations of Counties and Territorial Changes
The following links provide lists of all Prussian counties and insight into territorial reforms (only in German):
Territorial Changes of German Municipalities
Source Volumes
By now, complete scans of many of the original volumes from which the iPEHD data have been digitized, as documented in the source section, can be found at Google Books.
In the questions below, we document answers to a few standard problems that users have run into.
1. Where do the data come from?
The data were originally published in various outlets by the Royal Prussian Statistical Bureau and its staff. We document these under sources. For each specific variable, the codebooks document the specific source from which it stems.
2. How can I get help if I have questions on the underlying data or on how to merge variables from different datasets into one dataset?
To fully understand the underlying data, there is no other way but to consult the original sources. On how to merge variables, please consult the merging data section. However, at the end of the day, the best way to structure and use the data will be specific to every single research project, and it is the task of the researcher to solve this in the best possible way for the specific research project. This is a crucial part of your research, and you will be responsible for how you do it. The ifo Institute and the persons behind the iPEHD do not have the resources to answer any questions on this.
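As a rough illustration of what merging on a shared county identifier can look like, the following sketch joins two small tables and reports the keys that appear in only one of them. It is not the procedure from the merging data section; the function name merge_on_key, the variable names, and all values are hypothetical, and the sketch assumes each key occurs at most once per table.

```python
def merge_on_key(left, right, key):
    """Inner-join two lists of dict rows on a shared key column.

    Returns the merged rows and the sorted keys found in only one input.
    Assumes each key value occurs at most once per table.
    """
    right_by_key = {row[key]: row for row in right}
    merged = [
        {**row, **right_by_key[row[key]]}
        for row in left
        if row[key] in right_by_key
    ]
    left_keys = {row[key] for row in left}
    unmatched = sorted(left_keys ^ set(right_by_key))
    return merged, unmatched


# Hypothetical rows: column names and values are made up for illustration.
education = [
    {"kreiskey1871": "1", "schools": "12"},
    {"kreiskey1871": "2", "schools": "7"},
]
religion = [
    {"kreiskey1871": "2", "protestant_share": "0.8"},
    {"kreiskey1871": "3", "protestant_share": "0.1"},
]

merged, unmatched = merge_on_key(education, religion, "kreiskey1871")
```

Listing the unmatched keys makes it easier to spot counties that drop out of a merge, which is one of the decisions each research project has to make for itself.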
3. When I try to open the CSV (comma-separated values) files, Excel or some other program puts all the variables from one observation (row) into one cell. What can I do?
Since CSV files use commas to separate the values, there can be problems with the formatting for some regional settings. If your computer uses the comma for decimal or list separation, the file may open in a difficult-to-read format in Excel. The easiest way to deal with this problem is to change your computer's region to an English-speaking region.
On Windows, change your computer's region by going to Start > Control Panel, clicking on Regional and Language Options, and then the Formats tab. Here you'll find the Region menu, where you can specify an English-speaking region.
If you want to stick to your own language, click the Customize button in the Formats tab to open the Customize Regional Options dialog, where you'll find the Decimal and List Separator option. Replace the Decimal symbol with a period, the List separator with a comma, and click OK. After restarting Excel, you should be able to create CSV files that use the comma separator.
On Mac OS X, change your computer's region by going to Apple Menu > System Preferences, clicking on International, and then the Formats tab. Here you'll find the Region menu, where you can specify an English-speaking region.
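If changing regional settings is not an option, one alternative is to read the file with a tool that lets you state the separator explicitly. The sketch below uses Python's standard csv module with a comma delimiter, so the result does not depend on the operating system's region. The helper name read_comma_csv, the column names, and the sample values are hypothetical and only illustrate the idea.

```python
import csv
import io


def read_comma_csv(text):
    """Parse CSV text with an explicit comma delimiter.

    The delimiter is fixed here, so the OS regional settings play no role.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter=",")
    return list(reader)


# Hypothetical two-county sample: column names and values are illustrative only.
sample = "kreiskey1871,pop_total\n1,1000.5\n2,2000.0\n"

rows = read_comma_csv(sample)

# Decimal points are parsed with float(), again independent of the locale.
values = [float(row["pop_total"]) for row in rows]
```

For a file on disk, the same reader can be given a file object opened with open(path, newline="") instead of the in-memory string.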
4. Why is the identifier year sometimes different from the census year?
The identifier year usually denotes the administrative structure of Prussia – not necessarily the census year. Sometimes data from the same census were published for different purposes at different times, and the administrative boundaries had been changed in-between. The Royal Prussian Statistical Office always used the county structure effective in the year of publication. Thus, data from the same census were published with different aggregations. As a consequence, in some cases different identifiers (e.g., kreiskey1871 and kreiskey1874) had to be assigned to data from the same census year (1871).
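Before combining two datasets, it can help to check which identifier years each one carries. The sketch below assumes that identifier columns follow the pattern kreiskey plus a four-digit year, as in the two examples above; that naming pattern is an assumption, and the function names and column lists are hypothetical.

```python
import re


def identifier_years(columns):
    """Return the sorted years encoded in columns named like 'kreiskey1871'."""
    years = []
    for name in columns:
        match = re.fullmatch(r"kreiskey(\d{4})", name)
        if match:
            years.append(int(match.group(1)))
    return sorted(years)


def shared_identifier_years(columns_a, columns_b):
    """Return the identifier years that two datasets have in common."""
    return sorted(set(identifier_years(columns_a)) & set(identifier_years(columns_b)))


# Hypothetical column lists for two datasets drawn from the same census year.
census_a = ["kreiskey1871", "pop_total"]
census_b = ["kreiskey1874", "literacy"]

years_a = identifier_years(census_a)
common = shared_identifier_years(census_a, census_b)
```

An empty list of shared years signals that the two datasets use different administrative structures and cannot be matched directly on their identifiers, even if they stem from the same census.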
5. How should I cite the database?
When using data from the iPEHD database, please reference it as follows:
Becker, Sascha, Francesco Cinnirella, Erik Hornung and Ludger Wößmann, "iPEHD - The ifo Prussian Economic History Database", Historical Methods: A Journal of Quantitative and Interdisciplinary History 47 (2), 2014, 57–66. Working paper version available as: CESifo Working Paper 3904 (PDF)
iPEHD Contact
If you have any feedback regarding iPEHD, please contact us solely through this address: iPEHD@ifo.de
People behind iPEHD
iPEHD was initially set up by a small team of scientists at the ifo Institute – Sascha O. Becker (now Monash University), Francesco Cinnirella (now University of Bergamo), Erik Hornung (now University of Cologne), and Ludger Woessmann.
Disclaimer
The authors reserve the right not to be responsible for the topicality, correctness, completeness or quality of the information provided. Liability claims regarding damage caused by the use of any information provided, including any kind of information which is incomplete or incorrect, will therefore be rejected.
Parts of the pages or the complete publication including all offers and information might be extended, changed or partly or completely deleted by the author without separate announcement.