Stochastic Nonsense


Grep With Inclusion of the First Line

When working with CSV or strongly typed files from our dfs, I often want to grep a file but keep the first line of the output, which is the list of field names. grep, unfortunately, doesn't have any notion of line numbers. What I really want is something that essentially does:

head -1 testfile; grep word <(tail -n +2 testfile)

without having to type that over and over. You can kludge this together by putting a script named `grep1` in your personal bin:

#!/bin/bash
# print the header of the last argument (the file), then grep the rest.
# "${@: -1}" is the last positional argument; the space matters, or bash
# reads it as the ${parameter:-default} operator.

# echo ${@:1:$((${#@} - 1))}   # all arguments except the last

head -1 "${@: -1}"
grep "${@:1:$((${#@} - 1))}" <(tail -n +2 "${@: -1}")
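The two parameter expansions do all the work: `"${@: -1}"` grabs the last argument (the file), and `"${@:1:$((${#@} - 1))}"` expands to every argument before it, each preserved as a separate word so grep flags pass through intact. A quick sketch of the slicing with simulated arguments:

```shell
#!/bin/bash
set -- -v --color word testfile    # simulate the script's positional arguments

echo "${@: -1}"                    # last argument: testfile
echo "${@:1:$((${#@} - 1))}"       # everything but the last: -v --color word
```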

And if you create a testfile like so:

$ cat testfile
field1|field2|field3|field4
a|b|c|d
e|f|g|h
word|a|b|d
a|word|b|c
f|word|3|a
f|ord|3|a

Then

$ grep1 word testfile
field1|field2|field3|field4
word|a|b|d
a|word|b|c
f|word|3|a

and

$ grep1 -v word testfile
field1|field2|field3|field4
a|b|c|d
e|f|g|h
f|ord|3|a

work as expected. It isn't perfect: if you ask for context around the matched lines, e.g. -C 3, it's possible to output the header twice. Still, it gets close to what I want without building a custom grep executable. The usual alternative suggestions are sed or awk one-liners, but I prefer this solution since the regex syntax is slightly different for each program.
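For comparison, the awk version people usually suggest is a one-liner: print line 1 unconditionally, and later lines only if they match. A minimal sketch against the testfile above (awk uses POSIX ERE, which is part of the regex-dialect difference mentioned):

```shell
# header plus matching lines; NR is awk's current record (line) number
awk 'NR == 1 || /word/' testfile

# inverted match, the equivalent of grep1 -v
awk 'NR == 1 || !/word/' testfile
```

The pattern lives inside the program text, so moving a regex between grep and awk means adjusting its syntax, which is exactly the friction the wrapper script avoids.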