Given a CSV file called data.csv, you can select a random sample of rows with the csvrows command using a few options. In this example we assume data.csv has a header row we want to preserve and that the resulting sample will be called sample.csv. The options we use are:
-i selects data.csv as the input source
-o sends the resulting CSV to the file named sample.csv
-header=true indicates the header row should be preserved and not counted as part of the sample
-random sets the number of rows to return in the sample, in this case twenty
Putting it all together:
csvrows -i data.csv -o sample.csv -header=true -random=20
NOTE: If data.csv has fewer than 20 rows, then sample.csv will include all the rows of data.csv in shuffled order.
csvrows reads the entire CSV file into memory, shuffles the rows using Go's rand package to choose which rows to swap, and then writes out the requested number of rows in shuffled order. The randomness is limited by the shuffle and by writing only the first N shuffled rows.