Package: hmatch 0.1.0.9000

Patrick Barks

hmatch: Tools for Cleaning and Matching Hierarchically-Structured Data

Tools for matching raw, potentially messy hierarchical data (e.g. province, county, township) against a reference dataset.

Authors: Patrick Barks [aut, cre], Paul Campbell [ctb]

hmatch_0.1.0.9000.tar.gz
hmatch_0.1.0.9000.zip (r-4.5), hmatch_0.1.0.9000.zip (r-4.4), hmatch_0.1.0.9000.zip (r-4.3)
hmatch_0.1.0.9000.tgz (r-4.4-any), hmatch_0.1.0.9000.tgz (r-4.3-any)
hmatch_0.1.0.9000.tar.gz (r-4.5-noble), hmatch_0.1.0.9000.tar.gz (r-4.4-noble)
hmatch_0.1.0.9000.tgz (r-4.4-emscripten), hmatch_0.1.0.9000.tgz (r-4.3-emscripten)
hmatch.pdf | hmatch.html
hmatch/json (API)
NEWS

# Install 'hmatch' in R:
install.packages('hmatch', repos = c('https://epicentre-msf.r-universe.dev', 'https://cloud.r-project.org'))
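
Once installed, a first match might look like the sketch below. It assumes that the example datasets `ne_raw` and `ne_ref` (listed in the help manual) name their hierarchical columns with an `adm` prefix, and that `hmatch()` takes the raw data, the reference data, and a `pattern` regular expression selecting those columns. Both points are assumptions here, so check `?hmatch` and `?specifying_columns` before relying on them.

```r
library(hmatch)

# Example datasets shipped with the package
data(ne_raw)
data(ne_ref)

# Assumption: the hierarchical columns in both datasets start with "adm",
# so a single regex passed to `pattern` selects them in raw and reference.
matched <- hmatch(ne_raw, ne_ref, pattern = "^adm")

head(matched)
```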

Bug tracker: https://github.com/epicentre-msf/hmatch/issues

3.43 score, 10 stars, 27 scripts, 15 exports, 22 dependencies

Last updated 1 year ago from: 1a57862a84. Checks: OK: 5, NOTE: 2. Indexed: yes.

Target           Result  Date
Doc / Vignettes  OK      Oct 24 2024
R-4.5-win        NOTE    Oct 24 2024
R-4.5-linux      NOTE    Oct 24 2024
R-4.4-win        OK      Oct 24 2024
R-4.4-mac        OK      Oct 24 2024
R-4.3-win        OK      Oct 24 2024
R-4.3-mac        OK      Oct 24 2024

Exports: count_tokens, hcodes_int, hcodes_str, hmatch, hmatch_composite, hmatch_manual, hmatch_parents, hmatch_permute, hmatch_settle, hmatch_split, hmatch_tokens, max_levels, ref_expand, separate_hcode, string_std

Dependencies: cli, cpp11, dplyr, fansi, generics, glue, lifecycle, magrittr, pillar, pkgconfig, purrr, R6, rlang, stringdist, stringi, stringr, tibble, tidyr, tidyselect, utf8, vctrs, withr

Readme and manuals

Help Manual

Help page | Topics
Find frequently occurring tokens within a hierarchical column | count_tokens
Dictionary-based recoding of values during hierarchical matching | dictionary_recoding
Create codes to identify each unique combination of hierarchical levels in a reference dataset | hcodes hcodes_int hcodes_str
Match sets of hierarchical variables between a raw and reference dataset | hmatch
Implement a variety of hierarchical matching strategies in sequence | hmatch_composite
Manual hierarchical matching | hmatch_manual
Hierarchical matching of parents based on sets of common offspring | hmatch_parents
Hierarchical matching with sequential column permutation to allow for values entered at the wrong hierarchical level | hmatch_permute
Sequential hierarchical matching at each hierarchical level, settling for the highest resolution match that is possible for each row | hmatch_settle
Hierarchical matching, separately at each hierarchical level | hmatch_split
Hierarchical matching with tokenization of multi-term values | hmatch_tokens
Types of hierarchical joins | join_types
Maximum hierarchical levels | max_levels
Raw dataset | ne_raw
Reference dataset | ne_ref
Expand a reference data.frame containing N hierarchical columns to an N-level reference data.frame | ref_expand
Separate a hierarchical code reflecting multiple levels into its constituent parts, with one column for each level | separate_hcode
Specifying hierarchical columns with arguments 'pattern' or 'by' | specifying_columns
String Standardization | string_standardization
String standardization prior to matching | string_std
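
To make the idea behind "string standardization prior to matching" concrete, here is a minimal base-R sketch of standardizing hierarchical values and then joining raw rows to a reference table. It is an illustration only, not the package's implementation: the helper `std_sketch()`, the toy data frames, and the `code` column are all hypothetical, and the real `string_std()` may apply different rules.

```r
# Hypothetical helper (not part of hmatch): lowercase the input, collapse
# runs of non-alphanumeric characters to "_", and trim stray underscores.
std_sketch <- function(x) {
  x <- tolower(x)
  x <- gsub("[^a-z0-9]+", "_", x)
  gsub("^_+|_+$", "", x)
}

# Toy raw data with inconsistent case, spacing, and punctuation
raw <- data.frame(
  id   = 1:3,
  adm0 = c("USA", "usa ", "Canada"),
  adm1 = c("New-York", "new  jersey", "Ontario"),
  stringsAsFactors = FALSE
)

# Toy reference data with a made-up code for each adm0/adm1 combination
ref <- data.frame(
  adm0 = c("USA", "USA", "CAN"),
  adm1 = c("New York", "New Jersey", "Ontario"),
  code = c("us_ny", "us_nj", "ca_on"),
  stringsAsFactors = FALSE
)

# Build one key per row from the standardized hierarchical levels
raw$key <- paste(std_sketch(raw$adm0), std_sketch(raw$adm1), sep = "__")
ref$key <- paste(std_sketch(ref$adm0), std_sketch(ref$adm1), sep = "__")

# Left join: keep every raw row, with NA where no reference row matches
matched <- merge(raw, ref[, c("key", "code")], by = "key", all.x = TRUE)
matched <- matched[order(matched$id), ]
```

In this toy case the first two rows match despite their formatting differences, while the third does not, because "Canada" and "CAN" still differ after standardization. That kind of mismatch is what the dictionary-based recoding and fuzzier strategies listed above are described as addressing.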