Package: hmatch 0.1.0.9000

Patrick Barks

hmatch: Tools for Cleaning and Matching Hierarchically-Structured Data

Tools for matching raw, potentially messy hierarchical data (e.g. province, county, township) against a reference dataset.

Authors: Patrick Barks [aut, cre], Paul Campbell [ctb]

hmatch_0.1.0.9000.tar.gz
hmatch_0.1.0.9000.zip (r-4.7), hmatch_0.1.0.9000.zip (r-4.6), hmatch_0.1.0.9000.zip (r-4.5)
hmatch_0.1.0.9000.tgz (r-4.6-any), hmatch_0.1.0.9000.tgz (r-4.5-any)
hmatch_0.1.0.9000.tar.gz (r-4.7-any), hmatch_0.1.0.9000.tar.gz (r-4.6-any)
hmatch_0.1.0.9000.tgz (r-4.6-emscripten)

# Install 'hmatch' in R:
install.packages('hmatch', repos = c('https://epicentre-msf.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/epicentre-msf/hmatch/issues

3.47 score 11 stars 27 scripts 15 exports 21 dependencies

Last updated from: 1a57862a84. Checks: 7 NOTE, 2 OK. Indexed: yes.

Target                 Result  Time
linux-devel-x86_64     NOTE    130
source / vignettes     OK      168
linux-release-x86_64   NOTE    137
macos-release-arm64    NOTE    96
macos-oldrel-arm64     NOTE    106
windows-devel          NOTE    101
windows-release        NOTE    85
windows-oldrel         NOTE    78
wasm-release           OK      109

Exports: count_tokens, hcodes_int, hcodes_str, hmatch, hmatch_composite, hmatch_manual, hmatch_parents, hmatch_permute, hmatch_settle, hmatch_split, hmatch_tokens, max_levels, ref_expand, separate_hcode, string_std

Dependencies: cli, cpp11, dplyr, generics, glue, lifecycle, magrittr, pillar, pkgconfig, purrr, R6, rlang, stringdist, stringi, stringr, tibble, tidyr, tidyselect, utf8, vctrs, withr

Readme and manuals

Help Manual

Help page | Topics
Find frequently occurring tokens within a hierarchical column | count_tokens
Dictionary-based recoding of values during hierarchical matching | dictionary_recoding
Create codes to identify each unique combination of hierarchical levels in a reference dataset | hcodes, hcodes_int, hcodes_str
Match sets of hierarchical variables between a raw and reference dataset | hmatch
Implement a variety of hierarchical matching strategies in sequence | hmatch_composite
Manual hierarchical matching | hmatch_manual
Hierarchical matching of parents based on sets of common offspring | hmatch_parents
Hierarchical matching with sequential column permutation to allow for values entered at the wrong hierarchical level | hmatch_permute
Sequential hierarchical matching at each hierarchical level, settling for the highest resolution match that is possible for each row | hmatch_settle
Hierarchical matching, separately at each hierarchical level | hmatch_split
Hierarchical matching with tokenization of multi-term values | hmatch_tokens
Types of hierarchical joins | join_types
Maximum hierarchical levels | max_levels
Raw dataset | ne_raw
Reference dataset | ne_ref
Expand a reference data.frame containing N hierarchical columns to an N-level reference data.frame | ref_expand
Separate a hierarchical code reflecting multiple levels into its constituent parts, with one column for each level | separate_hcode
Specifying hierarchical columns with arguments 'pattern' or 'by' | specifying_columns
String Standardization | string_standardization
String standardization prior to matching | string_std