Dr. Semmelweis and the discovery of handwashing

Michael Taylor

2018/05/21

blogdown::shortcode_html('figure', src='/projects/128px-Ignaz_Semmelweis_1860.jpg')

This is Dr. Ignaz Semmelweis, a Hungarian physician born in 1818 and an assistant to the professor of the maternity clinic at the Vienna General Hospital. Semmelweis demonstrated that the incidence of childbed fever could be drastically reduced by appropriate hand washing by medical care-givers.

knitr::opts_chunk$set(cache = TRUE)
# Load in the tidyverse package
library(tidyverse)
# List the files inside survey-data.zip
(filez <- unzip('survey-data.zip', list = TRUE))
##                          Name Length                Date
## 1 yearly_deaths_by_clinic.csv    299 2018-06-26 05:00:00
## 2          monthly_deaths.csv   1753 2018-06-26 05:00:00
# Extract yearly_deaths_by_clinic.csv (first file in the archive) and read it into yearly
yearly <- read_csv(unzip('survey-data.zip', files = filez[1, 1]))
## Parsed with column specification:
## cols(
##   year = col_integer(),
##   births = col_integer(),
##   deaths = col_integer(),
##   clinic = col_character()
## )
# Print out yearly
yearly
## # A tibble: 12 x 4
##     year births deaths clinic  
##    <int>  <int>  <int> <chr>   
##  1  1841   3036    237 clinic 1
##  2  1842   3287    518 clinic 1
##  3  1843   3060    274 clinic 1
##  4  1844   3157    260 clinic 1
##  5  1845   3492    241 clinic 1
##  6  1846   4010    459 clinic 1
##  7  1841   2442     86 clinic 2
##  8  1842   2659    202 clinic 2
##  9  1843   2739    164 clinic 2
## 10  1844   2956     68 clinic 2
## 11  1845   3241     66 clinic 2
## 12  1846   3754    105 clinic 2

Semmelweis found that women who had street births, that is, who gave birth on the way to the hospital and were not admitted to the clinic but still received lying-in benefits, rarely showed any signs of puerperal fever. Below we compute the proportion of deaths among the women giving birth at each clinic.

# Add a new column to yearly with the proportion of deaths per birth
yearly <- yearly %>% mutate(proportion_deaths = deaths / births)

# Print out yearly
yearly
## # A tibble: 12 x 5
##     year births deaths clinic   proportion_deaths
##    <int>  <int>  <int> <chr>                <dbl>
##  1  1841   3036    237 clinic 1            0.0781
##  2  1842   3287    518 clinic 1            0.158 
##  3  1843   3060    274 clinic 1            0.0895
##  4  1844   3157    260 clinic 1            0.0824
##  5  1845   3492    241 clinic 1            0.0690
##  6  1846   4010    459 clinic 1            0.114 
##  7  1841   2442     86 clinic 2            0.0352
##  8  1842   2659    202 clinic 2            0.0760
##  9  1843   2739    164 clinic 2            0.0599
## 10  1844   2956     68 clinic 2            0.0230
## 11  1845   3241     66 clinic 2            0.0204
## 12  1846   3754    105 clinic 2            0.0280
# Plot yearly proportion of deaths at the two clinics
ggplot(yearly, aes(x = year, y = proportion_deaths, color = clinic)) +
  geom_line()

Why is the proportion of deaths consistently so much higher in Clinic 1? Semmelweis saw the same pattern and was puzzled and distressed. The only difference between the clinics was that many medical students served at Clinic 1, while mostly midwife students served at Clinic 2. While the midwives only tended to the women giving birth, the medical students also spent time in the autopsy rooms examining corpses.
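To put a single number on the gap between the clinics, the yearly counts can be pooled over 1841 to 1846. The sketch below uses base R and retypes the counts from the table printed earlier; the names `yearly_counts` and `pooled` are introduced here for illustration only. Note that a pooled proportion (total deaths over total births) is not the same thing as the average of the six yearly proportions.

```r
# Yearly counts copied from the table printed earlier in this post
yearly_counts <- data.frame(
  clinic = rep(c("clinic 1", "clinic 2"), each = 6),
  births = c(3036, 3287, 3060, 3157, 3492, 4010,
             2442, 2659, 2739, 2956, 3241, 3754),
  deaths = c(237, 518, 274, 260, 241, 459,
             86, 202, 164, 68, 66, 105)
)

# Sum births and deaths within each clinic, then divide the totals
pooled <- aggregate(cbind(births, deaths) ~ clinic,
                    data = yearly_counts, FUN = sum)
pooled$proportion_deaths <- pooled$deaths / pooled$births
pooled
```

Over the six years this gives roughly 0.099 for Clinic 1 and 0.039 for Clinic 2, so the pooled death rate in Clinic 1 was about two and a half times that of Clinic 2.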

Semmelweis started to suspect that something on the corpses, spread from the hands of the medical students, caused childbed fever. So in a desperate attempt to stop the high mortality rates, he decreed: Wash your hands! This was an unorthodox and controversial request; nobody in Vienna knew about bacteria at the time.

Let’s load in monthly data from Clinic 1 to see if the handwashing had any effect.

# Extract monthly_deaths.csv (second file in the archive) and read it into monthly
monthly <- read_csv(unzip('survey-data.zip', files = filez[2, 1]))
## Parsed with column specification:
## cols(
##   date = col_date(format = ""),
##   births = col_integer(),
##   deaths = col_integer()
## )
# Add a new column with the proportion of deaths per birth
monthly <- monthly %>% mutate(proportion_deaths = deaths / births)

# Print out the first rows in monthly
head(monthly)
## # A tibble: 6 x 4
##   date       births deaths proportion_deaths
##   <date>      <int>  <int>             <dbl>
## 1 1841-01-01    254     37           0.146  
## 2 1841-02-01    239     18           0.0753 
## 3 1841-03-01    277     12           0.0433 
## 4 1841-04-01    255      4           0.0157 
## 5 1841-05-01    255      2           0.00784
## 6 1841-06-01    200     10           0.05

The effect of hand washing

# Plot the monthly proportion of deaths over time
ggplot(monthly, aes(x = date, y = proportion_deaths)) +
  geom_line() +
  labs(x = "TIME", y = "PROPORTION OF DEATHS")

The effect of hand washing highlighted

# From this date handwashing was made mandatory
handwashing_start <- as.Date('1847-06-01')

# Add a TRUE/FALSE column to monthly called handwashing_started
monthly <- monthly %>% mutate(handwashing_started = date >= handwashing_start)

# Plot monthly proportion of deaths before and after handwashing
monthly %>%
  ggplot(aes(date, proportion_deaths, color = handwashing_started)) +
  geom_line() +
  labs(x = "TIME", y = "PROPORTION OF DEATHS")
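The `handwashing_started` column relies on R comparing `Date` values element by element. As a minimal sketch, the three dates below are illustrative values chosen around the cutoff, not rows taken from the dataset, and `example_dates` is a name made up for this example:

```r
# Cutoff date used in the post
handwashing_start <- as.Date("1847-06-01")

# Illustrative dates around the cutoff (not read from monthly_deaths.csv)
example_dates <- as.Date(c("1847-05-01", "1847-06-01", "1847-07-01"))

# Element-wise comparison returns a logical vector
started <- example_dates >= handwashing_start
started
# [1] FALSE  TRUE  TRUE
```

Because the comparison is `>=`, the month of June 1847 itself falls into the handwashing group.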

More handwashing, fewer deaths?

Again, the graph shows that handwashing had a huge effect. How much did it reduce the monthly proportion of deaths on average?

# Calculate the mean proportion of deaths
# before and after handwashing.
monthly_summary <- monthly %>%
  group_by(handwashing_started) %>%
  summarise(mean_proportion_deaths = mean(proportion_deaths))

# Printing out the summary.
monthly_summary
## # A tibble: 2 x 2
##   handwashing_started mean_proportion_deaths
##   <lgl>                                <dbl>
## 1 FALSE                               0.105 
## 2 TRUE                                0.0211
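The two means in the summary can be turned into an absolute and a relative reduction with a little arithmetic. This is a small sketch that retypes the rounded values printed above; the variable names are introduced here for illustration.

```r
# Rounded group means copied from the summary printed above
mean_before <- 0.105   # handwashing_started == FALSE
mean_after  <- 0.0211  # handwashing_started == TRUE

# Absolute reduction, in percentage points
absolute_reduction <- mean_before - mean_after
round(100 * absolute_reduction, 1)
# [1] 8.4

# Relative reduction, as a percentage of the earlier level
relative_reduction <- absolute_reduction / mean_before
round(100 * relative_reduction)
# [1] 80
```

In other words, the average monthly proportion of deaths dropped by roughly 8.4 percentage points, which is about an 80% reduction relative to the level before handwashing.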

A statistical analysis of Semmelweis handwashing data

Handwashing reduced the proportion of deaths by around 8 percentage points, from about 10% on average before handwashing to just 2% once it was enforced (still a high number by modern standards). To get a feeling for the uncertainty around how much handwashing reduces mortality, we can look at a confidence interval (here calculated using a t-test).

# Calculate a 95% confidence interval using t.test
test_result <- t.test( proportion_deaths ~ handwashing_started, data = monthly)
test_result
## 
##  Welch Two Sample t-test
## 
## data:  proportion_deaths by handwashing_started
## t = 9.6101, df = 92.435, p-value = 1.445e-15
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.06660662 0.10130659
## sample estimates:
## mean in group FALSE  mean in group TRUE 
##          0.10504998          0.02109338
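The printed interval says that handwashing lowered the mean monthly proportion of deaths by somewhere between about 6.7 and 10.1 percentage points. If those numbers are needed programmatically, they can be pulled out of the object that `t.test()` returns. The sketch below is an illustration only: `toy` is simulated data standing in for `monthly`, so its results are not the Semmelweis figures.

```r
# Simulated stand-in for `monthly`; these values are NOT the Semmelweis data
set.seed(1)
toy <- data.frame(
  proportion_deaths = c(rnorm(30, mean = 0.10, sd = 0.03),
                        rnorm(20, mean = 0.02, sd = 0.01)),
  handwashing_started = rep(c(FALSE, TRUE), times = c(30, 20))
)

# Same call shape as in the post: Welch two-sample t-test by group
toy_test <- t.test(proportion_deaths ~ handwashing_started, data = toy)

# Lower and upper bound of the interval for mean(FALSE) - mean(TRUE)
ci <- toy_test$conf.int
ci

# Group means, and the point estimate of the difference
estimates <- toy_test$estimate
diff_estimate <- unname(estimates[1] - estimates[2])
diff_estimate
```

The `conf.int` element also carries the confidence level as an attribute, which can be read with `attr(ci, "conf.level")`.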
sessionInfo()
## R version 3.5.1 (2018-07-02)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 17134)
## 
## Matrix products: default
## 
## locale:
## [1] LC_COLLATE=English_Canada.1252  LC_CTYPE=English_Canada.1252   
## [3] LC_MONETARY=English_Canada.1252 LC_NUMERIC=C                   
## [5] LC_TIME=English_Canada.1252    
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] bindrcpp_0.2.2   rvest_0.3.2      xml2_1.2.0       forcats_0.3.0   
##  [5] stringr_1.3.1    dplyr_0.7.6      purrr_0.2.5      readr_1.1.1     
##  [9] tidyr_0.8.1      tibble_1.4.2     ggplot2_3.0.0    tidyverse_1.2.1 
## [13] RefManageR_1.2.0
## 
## loaded via a namespace (and not attached):
##  [1] tidyselect_0.2.4 xfun_0.3         reshape2_1.4.3   haven_1.1.2     
##  [5] lattice_0.20-35  colorspace_1.3-2 htmltools_0.3.6  yaml_2.1.19     
##  [9] utf8_1.1.4       rlang_0.2.1      pillar_1.2.3     withr_2.1.2     
## [13] foreign_0.8-70   glue_1.2.0       modelr_0.1.2     readxl_1.1.0    
## [17] bindr_0.1.1      plyr_1.8.4       munsell_0.5.0    blogdown_0.7    
## [21] gtable_0.2.0     cellranger_1.1.0 codetools_0.2-15 psych_1.8.4     
## [25] evaluate_0.10.1  labeling_0.3     knitr_1.20       parallel_3.5.1  
## [29] broom_0.4.5      Rcpp_0.12.17     backports_1.1.2  scales_0.5.0    
## [33] jsonlite_1.5     mnormt_1.5-5     hms_0.4.2        digest_0.6.15   
## [37] stringi_1.1.7    bookdown_0.7     grid_3.5.1       rprojroot_1.3-2 
## [41] bibtex_0.4.2     cli_1.0.0        tools_3.5.1      magrittr_1.5    
## [45] lazyeval_0.2.1   crayon_1.3.4     pkgconfig_2.0.1  lubridate_1.7.4 
## [49] rstudioapi_0.7   assertthat_0.2.0 rmarkdown_1.10   httr_1.3.1      
## [53] R6_2.2.2         nlme_3.1-137     compiler_3.5.1
knitr::write_bib(.packages(), "packages.bib") 

References

Wickham, Hadley. 2017. Tidyverse: Easily Install and Load the 'Tidyverse'. https://CRAN.R-project.org/package=tidyverse.