How To: Find duplicate files

Thanks to this site, I found out how to track down duplicate files. I adapted the proposed solution to my own needs. In short, the command gets the size of every file and compares the sizes to find files that share the same size. For those files, an MD5 hash is computed to confirm that their contents are exactly the same.

Intro

Configuration

Command

We set the SEARCH variable, which contains the path where we want to look for duplicate files:

root@host:~# SEARCH=/data; find "$SEARCH" -not -empty -type f -printf '%s\n' | sort -rn | uniq -d | xargs -I{} find "$SEARCH" -type f -size {}c -print0 | xargs -0 md5sum | sort | uniq -w32 --all-repeated=separate

Explanations
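
To make each stage easier to follow, here is a minimal sketch of the same pipeline written one stage per line and wrapped in a function. The function name find_dupes is hypothetical (it is not part of the original command), and the -r option on the second xargs is an addition assumed here so that md5sum is not run when no candidates are found. It relies on GNU find, xargs, md5sum and uniq.

```shell
# Hypothetical helper: same pipeline as the one-liner, one stage per line.
# Usage: find_dupes /path/to/search
find_dupes() {
    # 1. Print the size in bytes of every non-empty regular file, one per line.
    find "$1" -not -empty -type f -printf '%s\n' |
    # 2. Sort the sizes numerically so that equal sizes are adjacent.
    sort -rn |
    # 3. Keep one copy of each size that appears more than once.
    uniq -d |
    # 4. For each such size, list the files of exactly that size (NUL-separated).
    xargs -I{} find "$1" -type f -size {}c -print0 |
    # 5. Hash the candidates; -r (an addition) skips md5sum if there are none.
    xargs -0 -r md5sum |
    # 6. Sort by hash so that identical hashes are adjacent.
    sort |
    # 7. Compare only the first 32 characters (the MD5 hash) and print each
    #    group of repeated hashes, with a blank line between groups.
    uniq -w32 --all-repeated=separate
}
```

Calling find_dupes /data should then behave like the one-liner with SEARCH=/data: files that merely share a size but differ in content are dropped at the last stage, because their hashes differ.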

Creative Commons Licence
This website, http://shebangthedolphins.net, is licensed to the public under a Creative Commons Attribution licence.