Package 'epivis'

Title: Tools for visualising epidemiological data
Description: Static visualisation of patient level linelist data.
Authors: Paul Campbell [aut, cre], Hugo Soubrier [aut]
Maintainer: Paul Campbell
License: MIT + file LICENSE
Version: 0.0.0.9000
Built: 2024-10-25 09:18:47 UTC
Source: https://github.com/epicentre-msf/epivis

Help Index


Wrapper function to dodge x-axis labels

Description

Useful when labels overlap on the x-axis.

Usage

dodge_x_labs(n.dodge = 2)

Arguments

n.dodge

passed to ggplot2::guide_axis
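
As a rough illustration of the effect, the sketch below uses plain ggplot2 to stagger overlapping x-axis labels across two rows with ggplot2::guide_axis(n.dodge = 2). It assumes that dodge_x_labs() produces an equivalent guide specification; the toy data frame and its column names are hypothetical.

```r
library(ggplot2)

# Hypothetical toy data with long category names that would overlap
df <- data.frame(
  site = c("First long site name", "Second long site name",
           "Third long site name", "Fourth long site name"),
  n = c(12, 7, 9, 4)
)

# Plain ggplot2: stagger x-axis labels over two rows.
# Assumption: dodge_x_labs(n.dodge = 2) wraps something similar.
p <- ggplot(df, aes(x = site, y = n)) +
  geom_col() +
  guides(x = guide_axis(n.dodge = 2))
```

Printing p draws the chart with the labels alternating between two rows.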


Format break labels

Description

Format break labels

Usage

label_breaks(breaks, lab_accuracy = 0.1, replace_Inf = TRUE)

Arguments

breaks

numeric vector of breaks

lab_accuracy

accuracy of labels, passed to scales::number

replace_Inf

if Inf is your final break, replace with a + sign in the label?
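
To make the idea of break labels concrete, here is a minimal base R sketch that turns a vector of breaks into range labels and swaps a trailing Inf for a + sign. The helper sketch_break_labels() is hypothetical and is not the package's implementation; the exact label format produced by label_breaks() may differ.

```r
# Hypothetical helper, for illustration only (not part of epivis)
sketch_break_labels <- function(breaks, replace_inf = TRUE) {
  lower <- breaks[-length(breaks)]       # left edge of each bin
  upper <- breaks[-1] - 1                # inclusive right edge for integer ages
  labs <- paste0(lower, "-", upper)
  if (replace_inf) {
    labs <- sub("-Inf$", "+", labs)      # e.g. "20-Inf" becomes "20+"
  }
  labs
}

breaks <- c(0, 10, 20, Inf)
labs <- sketch_break_labels(breaks)

# Labels like these can be handed to cut() to bin a numeric variable.
# right = FALSE (left-closed bins) is an assumption made for this sketch.
grp <- cut(c(3, 15, 42), breaks = breaks, labels = labs, right = FALSE)
```

Here labs is c("0-9", "10-19", "20+") and the three ages fall into those bins in order.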


Simulated Measles outbreak in Moïssala (Chad)

Description

This linelist replicates a measles outbreak over several regions of Southern Chad. It uses realistic distributions and parameters from a measles outbreak.

Usage

moissala_measles

Format

moissala_measles

A data frame with 5,028 rows and 30 columns:

id

epi id

site

Site name

case_name

Case name

sex

Case sex

age

Case age

age_unit

Age units

age_group

Age group

region

Case region of residence

sub_prefecture

Case sub-prefecture of residence

date_onset

Date of symptoms onset

hospitalisation

Hospitalisation status

date_admission

Date of hospital admission

ct_value

Ct value of RT-PCR

malaria_rdt

Malaria RDT result

fever

Presence of fever upon admission

rash

Presence of rash upon admission

cough

Presence of cough upon admission

red_eye

Presence of red eyes upon admission

pneumonia

Presence of pneumonia upon admission

encephalitis

Presence of encephalitis upon admission

muac

Mid-Upper Arm Circumference (MUAC) upon admission

muac_cat

MUAC category

vacc_status

Vaccination status

vacc_doses

Vaccine doses received

outcome

Outcome

date_death

Date of death

date_exit

Date of hospital exit

epi_classification

Epidemiological classification

Source

https://epicentre-msf.github.io/gallery/


Plot incidence over time from patient level data

Description

Helper function to plot epidemic curves with ggplot2 with options for grouping data, facets and proportion lines.

Usage

plot_epicurve(
  df,
  date_col,
  group_col = NULL,
  facet_col = NULL,
  prop_col = NULL,
  prop_numer = NULL,
  prop_denom = "non_missing",
  prop_line_colour = "black",
  prop_line_size = 0.8,
  floor_date_week = FALSE,
  label_weeks = FALSE,
  week_start = 1,
  date_breaks,
  date_labels = waiver(),
  date_max = NULL,
  sec_date_axis = FALSE,
  facet_nrow = NULL,
  facet_ncol = NULL,
  facet_scales = "fixed",
  facet_labs = ggplot2::label_wrap_gen(width = 25),
  facet_lab_pos = "top",
  group_na_colour = "grey",
  title = waiver(),
  subtitle = waiver(),
  date_lab = waiver(),
  y_lab = waiver(),
  group_lab = waiver(),
  prop_lab = NULL
)

Arguments

df

un-aggregated dataframe with a minimum of one date column of class Date or POSIX

date_col

date variable to plot incidence with. Must be provided.

group_col

optional grouping variable to be applied to the fill aesthetic of columns

facet_col

optional faceting variable to split chart into small multiples

prop_col

optional variable to be used to plot a proportion line on top of the epicurve

prop_numer

value(s) in the prop_col variable as a single value or vector to be used to calculate the numerator of the proportion calculation

prop_denom

value(s) in the prop_col variable as a single value or vector to be used to calculate the denominator of the proportion calculation. The default, "non_missing", uses the count of all non-missing values in the column.

prop_line_colour

colour of the proportion line. defaults to "black"

prop_line_size

width of the proportion line. defaults to 0.8

floor_date_week

should date_col dates be floored to the start of the week they fall in (Monday for ISO weeks, see week_start)? defaults to FALSE

label_weeks

label primary date axis with week numbers? defaults to FALSE

week_start

day of week defined as the start of the week as integer 1-7 (Monday = 1, Sunday = 7). defaults to 1 (ISO week standard)

date_breaks

date break intervals passed to ggplot2::scale_x_date. defaults to "2 weeks"

date_labels

base::strptime date label code passed to ggplot2::scale_x_date. defaults to "%V" (ISO Week)

date_max

force a date axis max date. Useful for when a week has passed with no incidence and you want to show this on the plot. Setting date_max to the current week will force the date axis to show this week with no incidence.

sec_date_axis

plot a secondary date axis using default calculated ggplot2 date breaks and labels? defaults to FALSE

facet_nrow

nrow argument passed to ggplot2::facet_wrap

facet_ncol

ncol argument passed to ggplot2::facet_wrap

facet_scales

value for the scales argument passed to ggplot2::facet_wrap. Defaults to "fixed".

facet_labs

facet labeller argument passed to ggplot2::facet_wrap. Defaults to ggplot2::label_wrap_gen(width = 25).

facet_lab_pos

facet label position argument passed to strip.position in ggplot2::facet_wrap. defaults to "top". Options are c("top", "bottom", "left", "right")

group_na_colour

colour for missing values in group_col. defaults to "grey"

title

optional title for the plot

subtitle

optional subtitle for the plot

date_lab

optional label for the date axis. defaults to date_col name if not provided

y_lab

optional label for the Y axis. defaults to n if not provided

group_lab

optional label for the group legend. defaults to group_col name if not provided

prop_lab

label for the proportion line. There is no default so this should be provided when plotting proportion lines

Value

a ggplot object

Examples

library(dplyr)
df_ebola <- dplyr::as_tibble(outbreaks::ebola_sim_clean$linelist)

df_ebola |>
  dplyr::mutate(outcome = forcats::fct_explicit_na(outcome, "Unknown")) |>
  plot_epicurve(
    date_col = date_of_onset,
    group_col = outcome,
    prop_col = outcome,
    prop_numer = "Death",
    prop_denom = c("Death", "Recover"),
    floor_date_week = TRUE,
    date_breaks = "2 weeks",
    sec_date_axis = TRUE,
    date_lab = "Week of onset",
    y_lab = "Incidence",
    group_lab = "Outcome",
    prop_lab = "CFR"
  )
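
The weekly binning and the proportion line described in the arguments can be illustrated with a small base R sketch. It floors dates to the start of their week and then computes numerator over denominator per week. The toy vectors are hypothetical, and this is only an approximation of the idea, not the code used inside plot_epicurve.

```r
# Hypothetical toy linelist: onset dates and outcomes
onset <- as.Date(c("2024-10-21", "2024-10-25", "2024-10-27",
                   "2024-10-28", "2024-10-30"))
outcome <- c("Death", "Recover", NA, "Recover", "Death")

# Floor each date to the start of its week (week_start = 1 is Monday)
week_start <- 1
weekday <- as.integer(format(onset, "%u"))   # 1 = Monday ... 7 = Sunday
week <- onset - (weekday - week_start) %% 7

# Proportion per week: cases matching prop_numer over cases matching prop_denom
prop_numer <- "Death"
prop_denom <- c("Death", "Recover")
numer <- tapply(outcome %in% prop_numer, week, sum)
denom <- tapply(outcome %in% prop_denom, week, sum)
prop <- numer / denom

# With prop_denom = "non_missing", the denominator would instead count
# every non-missing value of the column
denom_non_missing <- tapply(!is.na(outcome), week, sum)
```

In this toy data both weeks (starting 2024-10-21 and 2024-10-28) contain one death among two known outcomes, so prop is 0.5 for each.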

Missing data visualisation

Description

Function to generate a tile plot exploring the missing values for all observations across all variables of a dataframe.

Usage

plot_miss_vis(
  x,
  facet = NULL,
  col_vec = c("#6a040f", "#cce3de"),
  y_axis_text_size = 8
)

Arguments

x

a dataframe

facet

a character value of variable to facet the graph

col_vec

a vector of length 2 specifying the color for Missing and Present values respectively

y_axis_text_size

a numeric value for the size of the y axis text

Value

a ggplot tile graph displaying the missing/present values for all variables of the dataframe

Examples

# Use simulated measles data

suppressMessages(library(dplyr))

epivis::moissala_measles |>
  filter(site %in% c("Moïssala Hospital", "Bouna Hospital")) |>
  plot_miss_vis(facet = "site")
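
For readers who want to see what such a tile plot is built from, the sketch below reshapes the result of is.na() into one row per cell and draws it with ggplot2::geom_tile. The toy data frame is hypothetical and the layout (variables on the x axis, observations on the y axis) is an assumption; plot_miss_vis may arrange and style the plot differently.

```r
library(ggplot2)

# Hypothetical toy data with some missing values
df <- data.frame(a = c(1, NA, 3), b = c("x", "y", NA))

# Logical matrix: TRUE where a value is missing
miss <- is.na(df)

# One row per cell (as.vector() unrolls the matrix column by column)
long <- data.frame(
  row = rep(seq_len(nrow(miss)), times = ncol(miss)),
  variable = rep(colnames(miss), each = nrow(miss)),
  status = ifelse(as.vector(miss), "Missing", "Present")
)

# Tile plot using the default colours listed for col_vec
p <- ggplot(long, aes(x = variable, y = row, fill = status)) +
  geom_tile() +
  scale_fill_manual(values = c(Missing = "#6a040f", Present = "#cce3de"))
```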

Plot Age/Sex Pyramids

Description

Plot Age/Sex Pyramids

Usage

plot_pyramid(
  df,
  age_col,
  gender_col,
  gender_levels,
  facet_col = NULL,
  make_age_groups = TRUE,
  age_breaks = c(seq(0, 80, 10), Inf),
  age_labels = label_breaks(age_breaks),
  drop_age_levels = FALSE,
  gender_labs = NULL,
  x_lab = waiver(),
  y_lab = waiver(),
  colours = c("#486090FF", "#7890A8FF"),
  show_data_labs = FALSE,
  lab_size = 4,
  lab_in_col = "white",
  lab_out_col = "grey30",
  lab_nudge_factor = 5,
  facet_nrow = NULL,
  facet_ncol = NULL,
  facet_scales = "fixed",
  facet_labs = label_wrap_gen(width = 25),
  facet_lab_pos = "top",
  add_missing_cap = TRUE
)

Arguments

df

un-aggregated dataframe with a minimum of age and gender variables.

age_col

age variable name in df. Can be either a numeric vector of ages or a character/factor vector of age groups.

gender_col

gender variable name in df with levels indicating male or female.

gender_levels

length 2 character vector with the male and female levels in gender_col, respectively.

facet_col

optional faceting variable name to split chart into small multiples.

make_age_groups

set to TRUE (default) if age_col is numeric and needs to be binned into groups.

age_breaks

breaks to be used for binning a numerical age_col.

age_labels

break labels to accompany age_breaks. Defaults to epivis::label_breaks(age_breaks).

drop_age_levels

should age groups with no observations be removed from the chart? Defaults to FALSE.

gender_labs

optional labels for gender_levels

x_lab

optional label for the X axis.

y_lab

optional label for the Y axis.

colours

length 2 character vector of colours used for male and female, respectively.

show_data_labs

show data labels on chart? Defaults to FALSE.

lab_size

data labels size.

lab_in_col

data label colour when placed inside a bar.

lab_out_col

data label colour when placed outside a bar.

lab_nudge_factor

threshold for moving a data label outside a bar. Defaults to 5. Increasing the number increases the distance from the max value required to move a label outside the bar.

facet_nrow

nrow argument passed to ggplot2::facet_wrap.

facet_ncol

ncol argument passed to ggplot2::facet_wrap.

facet_scales

facet scales argument passed to ggplot2::facet_wrap. Should scales be fixed ("fixed", the default), free ("free"), or free in one dimension ("free_x", "free_y")?

facet_labs

facet labeller argument passed to ggplot2::facet_wrap. Defaults to ggplot2::label_wrap_gen(width = 25).

facet_lab_pos

facet label position argument passed to strip.position in ggplot2::facet_wrap. Defaults to "top". Options are c("top", "bottom", "left", "right").

add_missing_cap

show missing data counts for age_col and gender_col? Defaults to TRUE.

Value

a ggplot object

Examples

suppressMessages(library(dplyr))
df_flu <- outbreaks::fluH7N9_china_2013

plot_pyramid(
  df = df_flu,
  age_col = age,
  gender_col = gender,
  gender_levels = c("m", "f")
)
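
The data preparation behind a pyramid can be sketched in base R: bin numeric ages with the default age_breaks, count cases per age group and gender, and negate one gender's counts so its bars extend to the left. The toy vectors are hypothetical, and both the left-closed bins (right = FALSE) and the choice to mirror the male counts are assumptions of this sketch rather than confirmed package behaviour.

```r
# Hypothetical toy data
ages <- c(4, 17, 33, 85, 36)
gender <- c("m", "f", "m", "f", "f")

# Bin ages using the default breaks from the usage section
age_breaks <- c(seq(0, 80, 10), Inf)
age_group <- cut(ages, breaks = age_breaks, right = FALSE)

# Count cases per age group and gender
counts <- as.data.frame(table(age_group, gender))

# Mirror one gender around zero so the two sides of the pyramid face apart
counts$n <- ifelse(counts$gender == "m", -counts$Freq, counts$Freq)
```

Plotting n against age_group as horizontal bars, filled by gender, gives the familiar pyramid shape.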

Stacked barplot

Description

Stacked barplot

Usage

plot_stacked_bar(
  df,
  cols,
  levels_value,
  keep_na = TRUE,
  use_counts = TRUE,
  flip = FALSE,
  x_lab = waiver(),
  caption = TRUE
)

Arguments

df

un-aggregated dataframe (linelist).

cols

vector of character/factor variables names in df to be displayed in the barplot.

levels_value

vector of level values to be used for the plotting.

keep_na

logical, default = TRUE. Keep NAs in the graph and in the proportions?

use_counts

logical, default = TRUE. Use counts (TRUE) or proportions (FALSE) on the y axis?

flip

logical, default = FALSE. Flip the barplot?

x_lab

character name for the x axis

caption

logical, default = TRUE. Display the plot caption summarising the number of cases?

Value

a ggplot object

Examples

# Use simulated measles data

suppressMessages(library(dplyr))

epivis::moissala_measles |>
  mutate(across(
    c(
      fever,
      rash,
      cough,
      red_eye,
      pneumonia,
      encephalitis
    ),
    ~ as.character(.x)
  )) |>
  plot_stacked_bar(
    cols = c("fever", "rash", "cough", "red_eye", "pneumonia", "encephalitis"),
    levels_value = c(0, 1),
    keep_na = FALSE,
    use_counts = FALSE,
    flip = TRUE
  )