USAGE

csvjoin [OPTIONS] CSV1 CSV2 COL1 COL2

DESCRIPTION

csvjoin outputs CSV content based on two CSV files with matching column values. Each CSV input file has a designated column to match on. The values are compared as strings. Columns are counted from one rather than zero.
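The matching rule described above can be illustrated with a short Python sketch. This is not csvjoin's own code: the function name `join_rows` is hypothetical, and joining each matching pair by concatenating the two rows is an assumption about the output layout. The one-based column numbers and the case-insensitive default follow the description and the option list.

```python
import csv
import io


def join_rows(rows1, rows2, col1, col2, case_sensitive=False):
    """Yield row1 + row2 for each pair whose designated cells match as strings.

    col1 and col2 are one-based, mirroring the convention described above.
    Concatenating the two rows is an assumption about the output layout.
    """
    def key(cell):
        # Compare as plain strings; fold case unless asked not to.
        return cell if case_sensitive else cell.lower()

    for row1 in rows1:
        for row2 in rows2:
            if key(row1[col1 - 1]) == key(row2[col2 - 1]):
                yield row1 + row2


# Two small in-memory CSV documents standing in for CSV1 and CSV2.
data1 = list(csv.reader(io.StringIO("id,name\n1,Ada\n2,Grace\n")))
data2 = list(csv.reader(io.StringIO("year,person\n1843,ADA\n1952,grace\n")))

# Match column 2 of the first file against column 2 of the second.
merged = list(join_rows(data1, data2, col1=2, col2=2))
for row in merged:
    print(",".join(row))
```

With `case_sensitive=True` the same inputs produce no matches, because "Ada" and "ADA" differ as strings.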

OPTIONS

The following options are available.

    -allow-duplicates     allow duplicates when searching for matches
    -case-sensitive       make a case sensitive match (default is case insensitive)
    -col1                 column to join on in first CSV file
    -col2                 column to join on in second CSV file
    -contains             match columns based on csv1/col1 contained in csv2/col2
    -csv1                 first CSV filename
    -csv2                 second CSV filename
    -d, -delimiter        set delimiter character
    -delete-cost          deletion cost to use when calculating Levenshtein edit distance
    -examples             display example(s)
    -generate-manpage     generate man page
    -generate-markdown    generate markdown documentation
    -h, -help             display help
    -in-memory            if true, read both CSV files into memory
    -insert-cost          insertion cost to use when calculating Levenshtein edit distance
    -l, -license          display license
    -levenshtein          match columns using Levenshtein edit distance
    -max-edit-distance    maximum edit distance for match using Levenshtein distance
    -o, -output           output filename
    -quiet                suppress error messages
    -stop-words           a colon delimited list of stop words to ignore when matching
    -substitute-cost      substitution cost to use when calculating Levenshtein edit distance
    -trim-leading-space   trim leading space in field(s) for CSV input
    -trimspaces           trim spaces around cell values before comparing
    -use-lazy-quotes      use lazy quotes for CSV input
    -v, -version          display version
    -verbose              output processing count to stderr
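The -levenshtein, -insert-cost, -delete-cost, -substitute-cost and -max-edit-distance options work together. A minimal Python sketch of a weighted Levenshtein distance shows how the three costs combine. It is an illustration only, not csvjoin's implementation: the function names, the default cost of 1 for each operation, and treating a distance equal to the maximum as a match are all assumptions.

```python
def edit_distance(src, dst, insert_cost=1, delete_cost=1, substitute_cost=1):
    """Weighted Levenshtein distance from src to dst (row-by-row dynamic programming).

    The default costs of 1 are an assumption, not csvjoin's documented defaults.
    """
    # Distance from the empty prefix of src to each prefix of dst: insertions only.
    prev = [j * insert_cost for j in range(len(dst) + 1)]
    for i, s in enumerate(src, start=1):
        # Distance from src[:i] to the empty string: deletions only.
        curr = [i * delete_cost]
        for j, d in enumerate(dst, start=1):
            substitute = prev[j - 1] + (0 if s == d else substitute_cost)
            delete = prev[j] + delete_cost
            insert = curr[j - 1] + insert_cost
            curr.append(min(substitute, delete, insert))
        prev = curr
    return prev[-1]


def is_match(a, b, max_edit_distance, **costs):
    """Treat two cells as matching when their distance is within the limit.

    Using <= (rather than <) for the limit is an assumption.
    """
    return edit_distance(a, b, **costs) <= max_edit_distance


print(edit_distance("kitten", "sitting"))
print(is_match("color", "colour", max_edit_distance=1))
```

Raising one cost changes which edits are cheapest. For example, with a substitution cost of 2 a substitution costs the same as a deletion followed by an insertion.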

EXAMPLES

Build a merged CSV file from data1.csv and data2.csv where the value in column 2 of data1.csv matches the value in column 4 of data2.csv, writing the results to merged-data.csv.

csvjoin -csv1=data1.csv -col1=2 \
   -csv2=data2.csv -col2=4 \
   -output=merged-data.csv

csvjoin v0.0.25