How To: Find duplicate files

Thanks to this site I found out how to track down duplicate files. I modified the proposed solution to suit my own needs. In short, the command collects the size of every non-empty file and looks for sizes that occur more than once. Only the files that share a size with another file are then hashed with MD5, to make sure their contents are exactly the same.




We set the SEARCH variable to the path in which we want to look for duplicate files:

root@host:~# SEARCH=/data; find "$SEARCH" -not -empty -type f -printf '%s\n' | sort -rn | uniq -d | xargs -I{} find "$SEARCH" -type f -size {}c -print0 | xargs -0 md5sum | sort | uniq -w32 --all-repeated=separate
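For readability, the same pipeline can be wrapped in a small shell function with one stage per line. This is only a sketch: the name find_dupes is made up for this example, and it assumes the GNU versions of find, xargs and uniq (for -printf, -print0, -r, -w and --all-repeated). The -r flag on the second xargs is an addition, so that md5sum is not started when there is nothing to hash.

```shell
#!/bin/sh
# find_dupes DIR (hypothetical helper name, not part of any tool)
# Prints groups of files with identical content, one "hash  path" line per
# file, with a blank line between groups. Assumes GNU find, xargs and uniq.
find_dupes() {
    search=$1
    # 1. Print the size in bytes of every non-empty regular file.
    find "$search" -not -empty -type f -printf '%s\n' |
        # 2. Keep only the sizes that appear more than once.
        sort -rn |
        uniq -d |
        # 3. For each duplicated size, list the files of exactly that size.
        xargs -I{} find "$search" -type f -size {}c -print0 |
        # 4. Hash only those candidates (-r: do nothing if the list is empty).
        xargs -0 -r md5sum |
        # 5. Group lines whose first 32 characters (the MD5 hash) are equal.
        sort |
        uniq -w32 --all-repeated=separate
}

# Example call (uncomment and adapt the path):
# find_dupes /data
```

Comparing sizes first means that only files with a matching size are read and hashed, which is usually a small part of the tree. Files that have the same size but different content are hashed too, but the last uniq filters them out.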

