Brendan Dawes
Analogue + Digital

Deleting Thousands of Lines not Containing a Pattern with Vim

Here's a small demonstration of why I love Vim so much for editing large amounts of data, such as CSV and TSV files.

Let's say I have a fairly large TSV file, in this case my Twitter archive spanning eight years. It's made up of over 24,000 lines. I want to edit this file so it only contains my 2014 tweets. There are many ways to do this using various tools, but here's how easy it is in Vim.

First I move down to the second line as I want to keep the headers at the top. Then I press V to go into visual line selection mode and press G to select all the way down to the bottom of the document.

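Next I press : to enter command-line mode. Because a selection is active, Vim fills in the range '<,'> for me, so whatever I type next only applies to the selected lines. After that range I type the following: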

g!/2014-/d

This tells Vim to work through every line of the current selection and delete any line that doesn't contain the string 2014- (I include the hyphen so it matches the year at the start of a date rather than any other number that happens to contain 2014). Notice the ! after the g: it inverts the match, so the command applies to every line that doesn't match the pattern that follows. The d at the end is the command to run on each of those lines, in this case delete.
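To make the logic concrete, here's a rough Python sketch of what that command does to the selection. It isn't how Vim implements it, just the same idea spelled out: keep everything outside the range, and inside the range drop any line that doesn't match. The function name, the column names and the sample rows are all made up for illustration, and the pattern is treated as a Python regular expression, which behaves the same as Vim's for a plain string like 2014-.

```python
import re


def delete_non_matching(lines, pattern, first=2, last=None):
    """Rough model of Vim's :{first},{last}g!/pattern/d on a list of lines.

    Line numbers are 1-based and inclusive, as in Vim. last=None means
    the end of the buffer. Lines outside the range are always kept.
    """
    if last is None:
        last = len(lines)
    regex = re.compile(pattern)
    kept = []
    for number, line in enumerate(lines, start=1):
        in_range = first <= number <= last
        if in_range and not regex.search(line):
            continue  # the "d": drop lines in the range that don't match
        kept.append(line)
    return kept


# Hypothetical rows standing in for the Twitter archive.
rows = [
    "tweet_id\ttimestamp\ttext",
    "1\t2013-12-31 23:59:00\tLast tweet of the year",
    "2\t2014-01-01 00:01:00\tHappy new year",
    "3\t2015-06-02 10:00:00\tI have 2014 unread emails",
]

result = delete_non_matching(rows, r"2014-")
print("\n".join(result))
```

With these sample rows only the header and the 2014 tweet survive. The 2015 row mentions 2014 in its text but has no hyphen after it, so it is dropped, which is exactly why the hyphen is in the pattern.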

Hitting Return then runs this command on every selected line, deleting anything that doesn't match 2014-. That leaves me with a nice .tsv file containing only the header row and the data I want.