Filtering a huge .CSV

August 27th, 2013

I have a CSV with more than 600k rows, but many of them could be removed from the file because they don’t fulfill some requirements.
I have already tried to use Excel filters to filter and delete those rows, but it’s not working.
Can you please help me do this?
The criterion is basically to delete all rows containing any number (1234567890) and - (hyphen).
Also, I would like to know if there’s any quicker way to put the same formula on all the rows (drag and drop takes years on 600k rows).
Thanks

Answer #1
Easy enough with a batch file but I’ll need to see the exact layout of the data.
Answer #2
You can use this in vim:
Press escape first, then type this:
:g/\d\+.*-/d
First, do _not_ type the “d” at the very end; Vim will then show you what /would/ be deleted. Press q to return to the file, then press colon, arrow up, and add the d at the end.
Download gVim if you’re on Windows.
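
If you would rather script it than open the file in an editor, the same line-by-line filter can be sketched in Python. This is only a rough sketch under a few assumptions: the default pattern mirrors the intent of the vim command above (a digit followed, later on the same line, by a hyphen), the file is UTF-8, no quoted field contains a line break, and the names filter_lines, filter_file and DROP_PATTERN are made up for this example.

```python
import re

# Assumption: mirrors the vim pattern above (a digit followed, later on the
# same line, by a hyphen). Use re.compile(r"[0-9-]") instead to drop every
# row that has a digit or a hyphen anywhere.
DROP_PATTERN = re.compile(r"[0-9].*-")


def filter_lines(lines, pattern=DROP_PATTERN):
    """Yield only the lines that do not match the pattern."""
    for line in lines:
        if not pattern.search(line):
            yield line


def filter_file(src_path, dst_path, pattern=DROP_PATTERN):
    """Copy src_path to dst_path one line at a time, skipping matching rows.

    Returns the number of rows kept. Only one line is held in memory at a
    time, so the size of the file should not matter much.
    """
    kept = 0
    with open(src_path, encoding="utf-8", newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in filter_lines(src, pattern):
            dst.write(line)
            kept += 1
    return kept
```

Calling filter_file("input.csv", "filtered.csv") (both file names are placeholders) writes the surviving rows to a new file and leaves the original untouched, so you can check the result before throwing anything away.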

 
