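Using comm to filter e-mail lists
The comm command finds the lines that two lists have in common and the lines that are unique to each. It is specified by POSIX (documentation), so it is available even on fairly unusual machines.
Take these two lists, for example: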
left.txt
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
right.txt
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
Run this command (the <(...) process substitution needs a shell such as bash or zsh):
$ comm <(sort left.txt) <(sort right.txt)
It produces this output:
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
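Reading from left to right (not top to bottom), the columns are "lines only in the first file", "lines only in the second file", and "lines in both files". Note that the input must be sorted first; otherwise comm's comparison will be wrong. If you do not want duplicate entries in the result, sort with sort -u instead.
$ comm <(sort -u left.txt) <(sort -u right.txt)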
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
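To make the three-column layout easier to see, here is a minimal sketch with made-up data. The addresses, the file names under the temporary directory, and the use of mktemp -d (widely available, but not required by POSIX) are assumptions for illustration. Each tab that comm prints is replaced with "|" so the column offset of every line is visible.

```shell
#!/bin/sh
# Hypothetical sample data; these placeholder addresses are not the ones
# from the lists above.
tmpdir=$(mktemp -d)
printf '%s\n' carol@example.com alice@example.com bob@example.com > "$tmpdir/left.txt"
printf '%s\n' dave@example.com bob@example.com carol@example.com > "$tmpdir/right.txt"

# comm needs sorted input; LC_ALL=C keeps sort and comm on the same collation.
LC_ALL=C sort "$tmpdir/left.txt"  > "$tmpdir/left.sorted"
LC_ALL=C sort "$tmpdir/right.txt" > "$tmpdir/right.sorted"

# Replace each tab with "|" so the column offsets are visible.
columns=$(LC_ALL=C comm "$tmpdir/left.sorted" "$tmpdir/right.sorted" | tr '\t' '|')
printf '%s\n' "$columns"

rm -rf "$tmpdir"
```

With this data the script prints alice@example.com with no prefix (only in the first file), |dave@example.com with one prefix (only in the second file), and ||bob@example.com and ||carol@example.com with two prefixes (in both files).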
You can also use the -1, -2, and -3 options to hide any of the three output columns (again counted from left to right).
For example, this shows the entries that appear in both lists:
$ comm -12 <(sort left.txt) <(sort right.txt)
[email protected]
[email protected]
[email protected]
And this shows the entries that appear in the first file but not in the second, which amounts to filtering out an e-mail blacklist.
$ comm -23 <(sort left.txt) <(sort right.txt)
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
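The blacklist idea can be wrapped in a small helper. This is only a sketch under a few assumptions: filter_blacklist is a hypothetical function name, the sample addresses and file names are invented, and mktemp is assumed to be available. It writes sorted temporary files instead of using <(...), so it also runs in a plain sh without process substitution.

```shell
#!/bin/sh
# filter_blacklist is a hypothetical helper name, not part of comm.
# Usage: filter_blacklist LIST BLACKLIST
# Prints the addresses in LIST that do not appear in BLACKLIST.
filter_blacklist() {
    list_sorted=$(mktemp) || return 1
    black_sorted=$(mktemp) || { rm -f "$list_sorted"; return 1; }
    # Sort both inputs with the same collation and drop duplicates.
    LC_ALL=C sort -u "$1" > "$list_sorted"
    LC_ALL=C sort -u "$2" > "$black_sorted"
    # -2 hides lines unique to the blacklist, -3 hides lines found in both.
    LC_ALL=C comm -23 "$list_sorted" "$black_sorted"
    rm -f "$list_sorted" "$black_sorted"
}

# Placeholder data for illustration only.
workdir=$(mktemp -d)
printf '%s\n' dave@example.com alice@example.com bob@example.com alice@example.com > "$workdir/subscribers.txt"
printf '%s\n' bob@example.com mallory@example.com > "$workdir/blacklist.txt"

kept=$(filter_blacklist "$workdir/subscribers.txt" "$workdir/blacklist.txt")
printf '%s\n' "$kept"

rm -rf "$workdir"
```

With the placeholder data, the blacklisted bob@example.com is dropped and the duplicate alice@example.com is collapsed, leaving alice@example.com and dave@example.com.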