De-dup Files from the Terminal
Here's a way to remove duplicate files using a CLI tool.
Install fdupes
There are many tooling options. I have used fdupes to good effect.
Get it:
sudo apt install fdupes
Interactive phases
If you want to delete the duplicate files in the current directory, run this command:
fdupes . -d
This will detect duplicates, then open a terminal UI for interactively deleting them. There are three phases to this process:
- Selection - which files will be operated on
- Tag - mark for deletion (or preservation)
- Execute - do the actual filesystem deletion
First, selection: There are many methods for selection. To see these (and other helpful commands in this interactive mode), type help.
For our example, we'll delete the files that follow the macOS naming for duplicates made on creation/import: that is, the file name gets " 1", " 2", etc. appended just before the extension for the 1st, 2nd, etc. duplicate copy (for example, IMG_1 1.JPG).
(A note here: fdupes is smarter at detecting dupes than just checking file names. It actually compares bytes, which matters because, for example, IMG_1.JPG and IMG_1 1.JPG aren't necessarily duplicates.)
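To make that note concrete, here is a minimal sketch of a content check using cmp. This is a separate standard tool used only to illustrate the idea, not a description of how fdupes is implemented. The temporary directory, the file names, and the is_duplicate helper are all made up for this example.

```shell
# Build a throwaway directory with three small files (hypothetical names).
demo_dir=$(mktemp -d)
printf 'same bytes\n'  > "$demo_dir/IMG_1.JPG"
printf 'same bytes\n'  > "$demo_dir/IMG_1 1.JPG"
printf 'other bytes\n' > "$demo_dir/IMG_1 2.JPG"

# cmp -s exits 0 only when the two files are byte-for-byte identical.
is_duplicate() {
  if cmp -s "$1" "$2"; then echo yes; else echo no; fi
}

copy_1=$(is_duplicate "$demo_dir/IMG_1.JPG" "$demo_dir/IMG_1 1.JPG")
copy_2=$(is_duplicate "$demo_dir/IMG_1.JPG" "$demo_dir/IMG_1 2.JPG")

echo "Is 'IMG_1 1.JPG' a duplicate of IMG_1.JPG? $copy_1"
echo "Is 'IMG_1 2.JPG' a duplicate of IMG_1.JPG? $copy_2"

# Clean up the throwaway directory.
rm -r "$demo_dir"
```

Both copies have the "looks like a duplicate" name, but only one of them has the same contents as the original.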
Selection
OK, on to the selection. At the interactive prompt, I'll type:
sele  1.JPG
This will select any files that have this pattern at the end of their file name. They will be highlighted in the UI. Note the two spaces in this command. The first separates the command, sele, from the file pattern. The second is the first character of the pattern itself.
We could repeat this selection for " 2.JPG", " 3.MOV" etc.
Selecting in this way is faster for this use case than the default method. By default, your duplicates are brought up in sets and you preserve one file per set. But if you have 300 sets, this takes a long time.
Output can look like this (note this is already-tagged output):
Set 1 of 349:
1 [ ] ./IMG_7618.JPEG
2 [-] ./IMG_7618 1.JPEG
Set 2 of 349:
[-] ./IMG_0188 3.MOV
[+] ./IMG_0188 1.MOV
Set 3 of 349:
[+] ./IMG_0177 1.MOV
[-] ./IMG_0177 3.MOV
Set 4 of 349:
[+] ./IMG_6923 1.JPG
[-] ./IMG_6923 2.JPG
...
With the prompt:
( Preserve files [1 - 2, all, help] ):
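If you want to see which file names a suffix would match before typing it at the prompt, a quick name-only check from the shell can help. This is just a sketch with made-up files in a temporary directory. It only looks at names, so it says nothing about whether the contents are actually duplicated.

```shell
# Throwaway directory with names that mimic macOS-style copies (hypothetical).
demo_dir=$(mktemp -d)
touch "$demo_dir/IMG_6923.JPG" "$demo_dir/IMG_6923 1.JPG" \
      "$demo_dir/IMG_7000 1.JPG" "$demo_dir/IMG_0188 1.MOV"

# List names ending in " 1.JPG" (note the leading space in the pattern).
suffix=' 1.JPG'
matches=$(find "$demo_dir" -type f -name "*$suffix" | sort)

# Count the non-empty lines in the listing.
match_count=$(printf '%s\n' "$matches" | grep -c .)

printf '%s\n' "$matches"
echo "$match_count file name(s) end with '$suffix'"

# Clean up the throwaway directory.
rm -r "$demo_dir"
```

In a real photo folder you would point find at that folder instead of the temporary one and skip the touch and rm lines.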
Tagging
Now with a selection made, we want to tag, or mark, that selection for deletion. We do that with this command:
ds
That will put a - sign in front of the selected files, like [-] ./IMG_6923 2.JPG.
Execution
Finally, we get those files deleted. Beware: this command is the one that counts, and files will be removed from the filesystem upon execution.
To delete, type:
prune
(or press the Delete key).
Automatic delete
If you don't want an interactive delete, for example in a simpler case, you can detect duplicates in the current directory and delete them automatically with this command:
fdupes --order=name -d -N .
--order=name will order the files by name, -d will delete found duplicates, and -N (or --noprompt) won't ask you if you're sure (non-interactive): it keeps the first file in each set and deletes the rest.
For other options on automatic processes, use fdupes --help.
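Since -N deletes without asking, it may be worth previewing first. Here is a sketch that uses the -f (--omitfirst) flag, which leaves the first file of each set out of the listing, so what gets printed should correspond to what -d -N would remove when the same --order is used. The temporary directory and file names are made up for illustration, and the script skips the preview if fdupes isn't installed.

```shell
# Throwaway directory: a.txt and b.txt are identical, c.txt differs.
demo_dir=$(mktemp -d)
printf 'hello\n' > "$demo_dir/a.txt"
printf 'hello\n' > "$demo_dir/b.txt"
printf 'world\n' > "$demo_dir/c.txt"

if command -v fdupes >/dev/null 2>&1; then
  # -f (--omitfirst) hides the first file of each set, so only the
  # remaining duplicates are listed. No -d here, so nothing is deleted.
  (cd "$demo_dir" && fdupes --order=name -f .) || echo "fdupes preview failed"
else
  echo "fdupes is not installed; skipping preview"
fi

# The preview is read-only, so all three files should still be present.
remaining=$(find "$demo_dir" -type f | wc -l | tr -d ' ')
echo "files remaining: $remaining"

# Clean up the throwaway directory.
rm -r "$demo_dir"
```

Trying a command on a small scratch directory like this is also a cheap way to check how it behaves before pointing it at real files.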
Now go out and regain that hard drive space!