Title: | Tools for visualising epidemiological data |
---|---|
Description: | Static visualisation of patient level linelist data. |
Authors: | Paul Campbell [aut, cre] , Hugo Soubrier [aut] |
Maintainer: | Paul Campbell <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.0.0.9000 |
Built: | 2024-10-25 09:18:47 UTC |
Source: | https://github.com/epicentre-msf/epivis |
Useful when you have overlapping labels on the x-axis.
dodge_x_labs(n.dodge = 2)
n.dodge | passed to ggplot2::guide_axis |
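A minimal sketch of the underlying ggplot2 feature, assuming only that `n.dodge` is forwarded to `ggplot2::guide_axis()` as documented above. The data frame and its site names are made up for illustration, and the sketch calls ggplot2 directly rather than `dodge_x_labs()`.

```r
library(ggplot2)

# Made-up data with long category labels that would overlap on a single row
df <- data.frame(
  site = c("Hospital Alpha (north)", "Hospital Bravo (south)",
           "Hospital Charlie (east)", "Hospital Delta (west)"),
  cases = c(12, 30, 7, 21)
)

# guide_axis(n.dodge = 2) staggers the x-axis labels over two rows
p <- ggplot(df, aes(x = site, y = cases)) +
  geom_col() +
  scale_x_discrete(guide = guide_axis(n.dodge = 2))
```

Printing `p` draws the chart; raising `n.dodge` spreads the labels over more rows.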
Format break labels
label_breaks(breaks, lab_accuracy = 0.1, replace_Inf = TRUE)
breaks | numeric vector of breaks |
lab_accuracy | accuracy of labels, passed to |
replace_Inf | if |
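To illustrate what break labels are for, here is a base R sketch that builds range labels from a vector of breaks and uses them to bin ages. `make_break_labels()` is a hypothetical helper written for this example, not the epivis implementation: it assumes integer ages, ignores `lab_accuracy`, and assumes that replacing `Inf` means showing the open-ended top group as "45+".

```r
# Hypothetical helper, NOT the package's label_breaks()
make_break_labels <- function(breaks, replace_inf = TRUE) {
  lower <- head(breaks, -1)   # left edge of each interval
  upper <- tail(breaks, -1)   # right edge of each interval
  # Assumes integer ages, so [5, 15) is labelled "5-14"
  labs <- paste0(lower, "-", upper - 1)
  if (replace_inf) {
    open_ended <- is.infinite(upper)
    labs[open_ended] <- paste0(lower[open_ended], "+")
  }
  labs
}

age_breaks <- c(0, 5, 15, 45, Inf)
age_labels <- make_break_labels(age_breaks)

# Bin made-up ages into left-closed intervals using those labels
ages <- c(2, 5, 14, 30, 45, 80)
age_group <- cut(ages, breaks = age_breaks, labels = age_labels, right = FALSE)
```

The same breaks and labels can then be reused wherever the age groups are displayed, which keeps axis text consistent with the binning.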
This linelist replicates a measles outbreak over several regions of Southern Chad. It uses realistic distributions and parameters from a measles outbreak.
moissala_measles
A data frame with 5,028 rows and 30 columns:
Epi ID
Site name
Case name
Case sex
Case age
Age units
Age group
Case region of residence
Case sub-prefecture of residence
Date of symptoms onset
Hospitalisation status
Date of hospital admission
Ct value of RT-PCR
Malaria RDT result
Presence of fever upon admission
Presence of rash upon admission
Presence of cough upon admission
Presence of red eyes upon admission
Presence of pneumonia upon admission
Presence of encephalitis upon admission
Middle Upper Arm Circumference (MUAC) upon admission
MUAC category
Vaccination status
Vaccine doses received
Outcome
Date of death
Date of hospital exit
Epidemiological classification
https://epicentre-msf.github.io/gallery/
Helper function to plot epidemic curves with ggplot2 with options for grouping data, facets and proportion lines.
plot_epicurve(
  df,
  date_col,
  group_col = NULL,
  facet_col = NULL,
  prop_col = NULL,
  prop_numer = NULL,
  prop_denom = "non_missing",
  prop_line_colour = "black",
  prop_line_size = 0.8,
  floor_date_week = FALSE,
  label_weeks = FALSE,
  week_start = 1,
  date_breaks,
  date_labels = waiver(),
  date_max = NULL,
  sec_date_axis = FALSE,
  facet_nrow = NULL,
  facet_ncol = NULL,
  facet_scales = "fixed",
  facet_labs = ggplot2::label_wrap_gen(width = 25),
  facet_lab_pos = "top",
  group_na_colour = "grey",
  title = waiver(),
  subtitle = waiver(),
  date_lab = waiver(),
  y_lab = waiver(),
  group_lab = waiver(),
  prop_lab = NULL
)
df | un-aggregated dataframe with a minimum of a date column with a date or POSIX class |
date_col | date variable to plot incidence with. Must be provided. |
group_col | optional grouping variable to be applied to the fill aesthetic of columns |
facet_col | optional faceting variable to split chart into small multiples |
prop_col | optional variable to be used to plot a proportion line on top of the epicurve |
prop_numer | value(s) in the |
prop_denom | value(s) in the |
prop_line_colour | colour of the proportion line. defaults to "black" |
prop_line_size | width of the proportion line. defaults to 0.8 |
floor_date_week | should |
label_weeks | label primary date axis with week numbers? defaults to FALSE |
week_start | day of week defined as the start of the week as integer 1-7 (Monday = 1, Sunday = 7). defaults to 1 (ISO week standard) |
date_breaks | date break intervals passed to |
date_labels | |
date_max | force a maximum date on the date axis. Useful when a week has passed with no incidence and you want to show this on the plot: setting date_max to the current week forces the date axis to extend to that week. |
sec_date_axis | plot a secondary date axis using default calculated ggplot2 date breaks and labels? defaults to FALSE |
facet_nrow | nrow argument passed to |
facet_ncol | ncol argument passed to |
facet_scales | value for the |
facet_labs | facet labeller argument passed to |
facet_lab_pos | facet label position argument passed to strip.position in |
group_na_colour | colour for missing values in |
title | optional title for the plot |
subtitle | optional subtitle for the plot |
date_lab | optional label for the date axis. defaults to |
y_lab | optional label for the Y axis. defaults to |
group_lab | optional label for the group legend. defaults to |
prop_lab | label for the proportion line. There is no default so this should be provided when plotting proportion lines |
a ggplot object
library(dplyr)

df_ebola <- dplyr::as_tibble(outbreaks::ebola_sim_clean$linelist)

# Label missing outcomes explicitly so they appear as their own group.
# Note: fct_explicit_na() is deprecated as of forcats 1.0.0 in favour of
# fct_na_value_to_level().
df_ebola |>
  dplyr::mutate(outcome = forcats::fct_explicit_na(outcome, "Unknown")) |>
  plot_epicurve(
    date_col = date_of_onset,
    group_col = outcome,
    prop_col = outcome,
    prop_numer = "Death",
    prop_denom = c("Death", "Recover"),
    floor_date_week = TRUE,
    date_breaks = "2 weeks",
    sec_date_axis = TRUE,
    date_lab = "Week of onset",
    y_lab = "Incidence",
    group_lab = "Outcome",
    prop_lab = "CFR"
  )
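The quantities behind an epicurve with a proportion line can be sketched in base R. This is an illustration under stated assumptions, not the package's code: it assumes that flooring to the week means snapping each date to the preceding Monday (`week_start = 1`), and that the proportion is the count of `prop_numer` values divided by the count of `prop_denom` values within each week. The toy linelist is made up.

```r
# Toy linelist (made-up values, for illustration only)
linelist <- data.frame(
  date_of_onset = as.Date(c("2024-01-01", "2024-01-03", "2024-01-07",
                            "2024-01-08", "2024-01-10", "2024-01-14")),
  outcome = c("Death", "Recover", NA, "Recover", "Death", "Death")
)

# Floor each date to the Monday of its week.
# format(x, "%u") gives the ISO weekday number: Monday = 1 ... Sunday = 7.
weekday <- as.integer(format(linelist$date_of_onset, "%u"))
linelist$week <- linelist$date_of_onset - (weekday - 1)

numer <- "Death"                 # plays the role of prop_numer
denom <- c("Death", "Recover")   # plays the role of prop_denom

# Weekly incidence and weekly proportion (here, a case fatality ratio)
by_week <- split(linelist$outcome, linelist$week)
weekly <- data.frame(
  week  = as.Date(names(by_week)),
  cases = unname(vapply(by_week, length, integer(1))),
  prop  = unname(vapply(by_week,
                        function(x) sum(x %in% numer) / sum(x %in% denom),
                        numeric(1))),
  row.names = NULL
)
```

Cases with a missing outcome still count towards incidence but drop out of the proportion, because `NA` matches neither the numerator nor the denominator values.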
Function to generate a tile plot exploring the missing values for all observations across all variables of a dataframe.
plot_miss_vis( x, facet = NULL, col_vec = c("#6a040f", "#cce3de"), y_axis_text_size = 8 )
x | a dataframe |
facet | a character value naming the variable used to facet the graph |
col_vec | a vector of length 2 specifying the colours for Missing and Present values, respectively |
y_axis_text_size | a numeric value for the size of the y axis text |
a ggplot tile graph displaying the missing/present values for all variables of the dataframe
# Use simulated measles data
suppressMessages(library(dplyr))

epivis::moissala_measles |>
  filter(site %in% c("Moïssala Hospital", "Bouna Hospital")) |>
  plot_miss_vis(facet = "site")
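The data behind a missingness tile plot is a matrix with one cell per observation and variable. A base R sketch of that matrix follows; the data frame is made up, and the "Missing"/"Present" labels simply mirror the wording used for `col_vec` above rather than any confirmed internal of `plot_miss_vis()`.

```r
# Made-up data frame with some missing values
df <- data.frame(
  id      = 1:4,
  age     = c(3, NA, 27, 41),
  outcome = c("recovered", NA, NA, "dead")
)

# Logical matrix: TRUE where a value is missing
miss <- is.na(df)

# One label per cell, which is what each tile would be coloured by
status <- ifelse(miss, "Missing", "Present")

# Number of missing values per variable
n_missing <- colSums(miss)
```

Rows of `status` correspond to observations and columns to variables, so the matrix maps directly onto the tiles of the plot.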
Plot Age/Sex Pyramids
plot_pyramid(
  df,
  age_col,
  gender_col,
  gender_levels,
  facet_col = NULL,
  make_age_groups = TRUE,
  age_breaks = c(seq(0, 80, 10), Inf),
  age_labels = label_breaks(age_breaks),
  drop_age_levels = FALSE,
  gender_labs = NULL,
  x_lab = waiver(),
  y_lab = waiver(),
  colours = c("#486090FF", "#7890A8FF"),
  show_data_labs = FALSE,
  lab_size = 4,
  lab_in_col = "white",
  lab_out_col = "grey30",
  lab_nudge_factor = 5,
  facet_nrow = NULL,
  facet_ncol = NULL,
  facet_scales = "fixed",
  facet_labs = label_wrap_gen(width = 25),
  facet_lab_pos = "top",
  add_missing_cap = TRUE
)
df | un-aggregated dataframe with a minimum of age and gender variables. |
age_col | age variable name in |
gender_col | gender variable name in |
gender_levels | length 2 character vector with male and female level in |
facet_col | optional faceting variable name to split chart into small multiples. |
make_age_groups | set to TRUE (default) if |
age_breaks | breaks to be used for binning a numerical |
age_labels | break labels to accompany |
drop_age_levels | should age groups with no observations be removed from the chart? Defaults to FALSE. |
gender_labs | optional labels for |
x_lab | optional label for the X axis. |
y_lab | optional label for the Y axis. |
colours | length 2 character vector of colours used for male and female, respectively. |
show_data_labs | show data labels on chart? Defaults to FALSE. |
lab_size | data labels size. |
lab_in_col | data label colour when placed inside a bar. |
lab_out_col | data label colour when placed outside a bar. |
lab_nudge_factor | threshold for moving a data label outside a bar. Defaults to 5. Increasing the number increases the distance from the max value required to move a label outside the bar. |
facet_nrow | nrow argument passed to |
facet_ncol | ncol argument passed to |
facet_scales | facet scales argument passed to |
facet_labs | facet labeller argument passed to |
facet_lab_pos | facet label position argument passed to strip.position in |
add_missing_cap | show missing data counts for |
a ggplot object
suppressMessages(library(dplyr))

df_flu <- outbreaks::fluH7N9_china_2013

plot_pyramid(
  df = df_flu,
  age_col = age,
  gender_col = gender,
  gender_levels = c("m", "f")
)
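The counts an age/sex pyramid displays can be sketched in base R. The approach below (bin ages, tabulate by gender, then negate one gender so its bars extend to the left) is a common way to build a pyramid and is an assumption about the general technique, not a description of how `plot_pyramid()` is implemented. The ages and genders are made up; the breaks match the default `age_breaks` shown in the usage.

```r
# Made-up ages and genders
df <- data.frame(
  age    = c(4, 12, 15, 33, 38, 67, 85, 90),
  gender = c("f", "m", "m", "f", "m", "f", "f", "m")
)

# Bin ages with the default breaks from the usage: 0, 10, ..., 80, Inf
age_breaks <- c(seq(0, 80, 10), Inf)
df$age_group <- cut(df$age, breaks = age_breaks, right = FALSE)

# Count observations in every age group / gender combination,
# keeping empty age groups (compare drop_age_levels = FALSE)
counts <- as.data.frame(table(age_group = df$age_group, gender = df$gender))

# Mirror one gender around zero so the two sets of bars point opposite ways
counts$plot_n <- ifelse(counts$gender == "m", -counts$Freq, counts$Freq)
```

Plotting `plot_n` against `age_group` as horizontal bars, filled by `gender`, gives the familiar pyramid shape.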
Stacked barplot
plot_stacked_bar( df, cols, levels_value, keep_na = TRUE, use_counts = TRUE, flip = FALSE, x_lab = waiver(), caption = TRUE )
df | un-aggregated dataframe (linelist). |
cols | vector of character/factor variable names in |
levels_value | vector of level values to be used for the plotting. |
keep_na | logical, default = |
use_counts | logical, default = |
flip | logical, default = |
x_lab | character name for the x axis |
caption | logical, default = |
a ggplot object
# Use simulated measles data
suppressMessages(library(dplyr))

epivis::moissala_measles |>
  mutate(across(
    c(fever, rash, cough, red_eye, pneumonia, encephalitis),
    ~ as.character(.x)
  )) |>
  plot_stacked_bar(
    cols = c("fever", "rash", "cough", "red_eye", "pneumonia", "encephalitis"),
    levels_value = c(0, 1),
    keep_na = FALSE,
    use_counts = FALSE,
    flip = TRUE
  )
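A base R sketch of the numbers a stacked bar chart of several yes/no variables would show. It assumes that `keep_na = FALSE` drops missing values before tabulating and that `use_counts = FALSE` means proportions are shown instead of counts; both readings are inferred from the argument names and the example above, not confirmed by the documentation. The symptom data is made up.

```r
# Made-up symptom indicators stored as character, as in the example above
symptoms <- data.frame(
  fever = c("1", "1", "0", NA),
  rash  = c("1", "0", "0", "1")
)

levels_value <- c("0", "1")

# For each variable: tabulate the chosen levels (NA is dropped by table()),
# then convert counts to proportions
props <- sapply(symptoms, function(x) {
  prop.table(table(factor(x, levels = levels_value)))
})
```

`props` has one row per level and one column per variable, so each column sums to 1 and corresponds to one stacked bar.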