Tuesday, 17 September 2013

How to see if number in column from file1 is between numbers from columns in file2 using Bash - nested loops?

I would like to do the following on the Bash command line.
I have two files. File1 looks like this:
585 1504 13 10000 10468 ID1
585 3612 114 10468 11447 ID2
585 437 133 11503 11675 ID1
File2 looks like this:
400220 10311 10311
400220 11490 11490
400220 11923 11923
For each number in File2 column 2, I would like to know whether it lies
between any of the number pairs in File1 columns 4 and 5, and to create
File3.txt with the output as follows.
If yes, I want to write column 2 from File2 and column 6 from File1 to
File3. If no, I want to write column 2 from File2 and the string "NoID" to
File3. So for the example data, File3.txt should look like this:
10311 ID1
11490 NoID
11923 NoID
I am used to working in Python, where I would write a script using
nested for loops and if statements, but I would prefer to use Bash for this
(in which I am still a relative beginner). It seems to me that a
similar nested-loop approach combined with awk and other conditional
statements could be the way to go. Can anyone suggest good ideas, maybe
with example syntax?
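For reference, the nested-loop idea described above could be sketched in awk roughly as follows. This is only a minimal sketch under some assumptions that the question does not state: the interval bounds are treated as inclusive, the first matching pair in File1 order wins, and the whole of File1 fits in memory. The variable names (lo, hi, id, label) are purely illustrative.

```shell
# Recreate the sample data from the question.
cat > File1 <<'EOF'
585 1504 13 10000 10468 ID1
585 3612 114 10468 11447 ID2
585 437 133 11503 11675 ID1
EOF
cat > File2 <<'EOF'
400220 10311 10311
400220 11490 11490
400220 11923 11923
EOF

# First pass (NR == FNR): store every File1 pair and its ID in arrays.
# Second pass: compare each File2 value against every stored pair.
# Assumptions: inclusive bounds; the first matching pair wins.
awk '
NR == FNR { n++; lo[n] = $4 + 0; hi[n] = $5 + 0; id[n] = $6; next }
{
    v = $2 + 0
    label = "NoID"
    for (i = 1; i <= n; i++) {
        if (v >= lo[i] && v <= hi[i]) { label = id[i]; break }
    }
    print $2, label
}' File1 File2 > File3.txt

cat File3.txt
```

On the example data this writes the three lines shown above to File3.txt. The inner loop visits every File1 pair for every File2 row, so the running time grows with the product of the two file sizes.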
NB. The actual files contain over 3 million rows of data.
Cheers muchly in advance
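With over 3 million rows, comparing every File2 value against every File1 pair would mean trillions of comparisons, so a sorted sweep may be worth considering instead. The sketch below is hedged on a strong assumption that the question does not confirm: the File1 pairs do not overlap, apart from possibly sharing an endpoint as in the sample. It also treats bounds as inclusive, and it writes File3.txt in sorted order rather than in the original File2 order. The file names File1.sorted and File2.sorted and the helper advance() are made up for illustration.

```shell
# Recreate the sample data from the question.
cat > File1 <<'EOF'
585 1504 13 10000 10468 ID1
585 3612 114 10468 11447 ID2
585 437 133 11503 11675 ID1
EOF
cat > File2 <<'EOF'
400220 10311 10311
400220 11490 11490
400220 11923 11923
EOF

# Sort the pairs by their start (column 4) and the queries by column 2.
sort -k4,4n File1 > File1.sorted
sort -k2,2n File2 > File2.sorted

# Walk both sorted files once. ASSUMPTION: pairs do not overlap, so the
# first pair whose end is >= the value is the only possible match.
awk -v ivals=File1.sorted '
# Read the next pair from the sorted File1; return 0 when none are left.
function advance() {
    if ((getline line < ivals) > 0) {
        split(line, f, " ")
        lo = f[4] + 0; hi = f[5] + 0; id = f[6]
        return 1
    }
    return 0
}
BEGIN { more = advance() }
{
    v = $2 + 0
    # Skip pairs that end before the current value.
    while (more && hi < v) more = advance()
    label = "NoID"
    if (more && v >= lo && v <= hi) label = id
    print $2, label
}' File2.sorted > File3.txt

cat File3.txt
```

Each file is read once after sorting, so the cost is dominated by the two sort calls. If the real pairs can overlap or nest, this single-pointer sweep can miss matches and a different method would be needed.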
