csvsed¶
Description¶
Like sed
, but on a per-column basis
Example:
$ csvsed "Ab[bi].+" "Abby" -E "(B|R)ob.*" "\1ob" -E "(?:Jack|John).*" "John" examples/aliases.csv
id,to,from
1,Abby,Bob
2,Bob,John
3,Abby,John
4,John,Abner
5,Rob,John
6,Jon,Abby
7,Rob,Abby
Examples¶
Multiple expressions¶
Cleaning up currency values¶
remove commas and spaces
replace negative notation, from
(42)
to-42
add a decimal amount for all whole dollar values
$ csvsed '[$, ]' "" examples/ledger.csv \
-c 'revenue,gross' \
-E '\((.+?)\)' '\-\1' \
-E '(?<=\d)(\d{2})$' '\1.00'
id,name,revenue,gross
001,apples,21456.00,3210.45
002,bananas,2442.00,-1234.00
003,cherries,9700.55,-7.90
004,dates,4102765.33,18765.00
005,eggplants,3987.00,501.00
006,figs,30333.00,-777.66
006,grapes,154321.98,-32654.00
Replace entire field with -R/--replace
¶
$ cat examples/rolodex.csv |
csvsed -c phone '(?P<area>\d{3}).*?(?P<x>\d{3}).*?(?P<y>\d{4})' \
'(\g<area>)-\g<x>-\g<y>'
Output:
name,zipcode,phone
Andie,10003,((555)-123-4567
Betty,23456,1-(800)-777-2222
Caren,33033,1((900)-333-1212
Denny,42742,(212)-867-5309
Ellie,90210,(555)-404-2020
Some of the phone numbers have superfluous characters, like country calling code (i.e. “1-“) and unneeded parentheses. Use the -R
option to specify that the replacement pattern should overwrite the entire field:
$ cat examples/rolodex.csv |
csvsed -R -c phone '(?P<area>\d{3}).*?(?P<x>\d{3}).*?(?P<y>\d{4})' \
'(\g<area>)-\g<x>-\g<y>'
name,zipcode,phone
Andie,10003,(555)-123-4567
Betty,23456,(800)-777-2222
Caren,33033,(900)-333-1212
Denny,42742,(212)-867-5309
Ellie,90210,(555)-404-2020