- Comparing two files
- Posted by BradS on January 31st, 2005
All,
I'm trying to compare two files using <comm file1 file2>. This works great: it gives me three
separate columns. Column 1 is text only in file1, column 2 is text only in file2, and column 3 is
text that is in both file1 and file2. My problem is that when the same text occurs multiple times
in file2 and also appears in file1, it shows up in both columns 2 and 3.
My question is: how would I go about getting rid of the extra occurrences in column 2 while still
showing the text in column 3?
Attached is a small example of the output after I compared the two files. The full result is approx.
10,000 lines.
CU1289DLL1
CU1289KSC1
CU1291KSC1
CU1292KSC1
CU1293KSC1
CU1294KSC1
CU1295KSC1
CU1296KSC1
CU1296KSC1
CU1300KSC1
CU1302KSC1
CU1304HUD1
CU1304HUD1
CU1304HUD1
CU1304KSC1
CU1305KSC1
CU1306KSC1
CU1307KSC1
CU1308HUD1
CU1308HUD1
CU1308HUD1
CU1308HUD1
CU1309HRN1
Thanks in advance
B~
- Posted by Stephane CHAZELAS on January 31st, 2005
2005-01-31, 13:58(+00), BradS:
What about running your files through uniq first, so that each
line occurs only once?
comm <(uniq < file1) <(uniq < file2)
for instance (note that you need bash or zsh for that type of
command; ksh may also work, but that depends on your
operating system).
(If your system has /dev/fd/<n> files, you can also do
uniq < file1 | { uniq < file2 | comm /dev/fd/3 -; } 3<&0
if not, you can use a named pipe or a temp file).
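For a shell without <(...) and a system without /dev/fd, the temp file route could look roughly like the sketch below. The function name comm_uniq is made up for illustration, and it assumes mktemp is available (it is common but not required by POSIX). Like the commands above, it assumes both files are already sorted, since uniq only collapses adjacent duplicates.

```shell
# Hypothetical helper (name is illustrative): run comm on two files
# after collapsing adjacent duplicate lines with uniq.
# Assumes both inputs are already sorted and that mktemp exists.
comm_uniq() {
    cu_tmp1=$(mktemp) || return 1
    cu_tmp2=$(mktemp) || { rm -f "$cu_tmp1"; return 1; }

    # Write the de-duplicated copies to the temp files.
    uniq < "$1" > "$cu_tmp1"
    uniq < "$2" > "$cu_tmp2"

    # Compare the copies, remember comm's exit status, then clean up.
    comm "$cu_tmp1" "$cu_tmp2"
    cu_status=$?
    rm -f "$cu_tmp1" "$cu_tmp2"
    return "$cu_status"
}
```

It would be called as: comm_uniq file1 file2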
If:
file1   file2
a       a
a       a
b       a
b       d
c       e
The result would be:
                a
b
c
        d
        e
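One caveat: uniq only removes duplicates that sit next to each other, and comm expects sorted input. If the files are not already sorted, sort -u can do both jobs at once. Below is a rough sketch using a named pipe instead of <(...); the function name comm_sorted_unique is made up for illustration, and it assumes mktemp and mkfifo are available.

```shell
# Hypothetical helper (name is illustrative): sort each file, drop
# duplicate lines, and compare the results with comm.
# Assumes mktemp and mkfifo exist on the system.
comm_sorted_unique() {
    csu_dir=$(mktemp -d) || return 1
    mkfifo "$csu_dir/left" || { rm -rf "$csu_dir"; return 1; }

    # Feed the first file through the named pipe in the background.
    sort -u "$1" > "$csu_dir/left" &

    # Feed the second file on stdin; comm reads the pipe and "-".
    sort -u "$2" | comm "$csu_dir/left" -
    csu_status=$?

    # Let the background sort finish, then remove the pipe.
    wait
    rm -rf "$csu_dir"
    return "$csu_status"
}
```

It would be called as: comm_sorted_unique file1 file2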
--
Stéphane
- Posted by BradS on February 1st, 2005
"Stephane CHAZELAS" <this.address@is.invalid> wrote in message
news:slrncvsiad.4br.stephane.chazelas@spam.is.invalid...
That worked in bash. Thank you!
B~