Deduposaur is a command-line program to help you add files to your personal archive.
Functions:
- It checks the integrity of your archive.
- It helps you process files before adding them to your archive.
- Renames files that you previously processed and decided not to add, adding DELETED to the filename.
- Renames files that already exist in your archive, adding DUPE to the filename.
- Renames files that exist in your archive with different contents, adding CHANGED to the filename.
- Renames files whose contents exist in your archive under a different name or date, adding METADATA to the filename.
- Leaves new files untouched.
- Remembers files that you delete.
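The renaming rules above can be read as a small decision procedure. The sketch below is one plausible reading, not the tool's actual implementation: the `FileInfo` fields, the `Status` enum, the `classify` function, and the order in which the checks run are all assumptions made for illustration.

```rust
use std::collections::HashSet;

/// Metadata for one file. The field names are hypothetical.
#[derive(Debug, Clone, PartialEq, Eq)]
struct FileInfo {
    name: String,
    mtime: i64,     // modification time, seconds since the Unix epoch
    digest: String, // hex digest of the file contents
}

/// How an incoming file relates to the archive (assumed categories).
#[derive(Debug, PartialEq, Eq)]
enum Status {
    New,
    Dupe,
    Changed,
    Metadata,
    Deleted,
}

/// Classifies `file` against the archive and the set of digests of
/// files the user previously chose to delete. The precedence of the
/// checks is an assumption.
fn classify(file: &FileInfo, archive: &[FileInfo], deleted: &HashSet<String>) -> Status {
    let same_content: Vec<&FileInfo> = archive
        .iter()
        .filter(|a| a.digest == file.digest)
        .collect();
    // Same contents, name, and date: an exact duplicate.
    if same_content
        .iter()
        .any(|a| a.name == file.name && a.mtime == file.mtime)
    {
        return Status::Dupe;
    }
    // Same contents, but the name or date differs.
    if !same_content.is_empty() {
        return Status::Metadata;
    }
    // Same name, different contents.
    if archive.iter().any(|a| a.name == file.name) {
        return Status::Changed;
    }
    // Seen before and deliberately not archived.
    if deleted.contains(&file.digest) {
        return Status::Deleted;
    }
    Status::New
}
```

Under these assumptions, a file that matches nothing in the archive and nothing in the deleted set comes back as `Status::New` and would be left untouched.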
$ cargo install deduposaur
First create an empty deduposaur.archive_metadata.json file:
$ cd /my_archive
$ ls
1.jpg 2.jpg 3.jpg 4.jpg 5.jpg
$ touch deduposaur.archive_metadata.json
Then run deduposaur:
$ deduposaur --archive=.
1.jpg is new
2.jpg is new
3.jpg is new
4.jpg is new
5.jpg is new
Verified /my_archive
$
To check your archive, simply run deduposaur again:
$ deduposaur --archive=/my_archive
Verified /my_archive
$
After updating your archive, run deduposaur again and respond to the prompts:
$ cd /my_archive
$ ls
deduposaur.archive_metadata.json 1.jpg 2.jpg 3.jpg 4.jpg 5.jpg
$ mv 2.jpg 2.hawaii.jpg
$ rm 3.jpg
$ echo 'corrupted' > 4.jpg
$ touch 5.jpg
$ deduposaur --archive=.
WARNING 2.jpg is renamed to 2.hawaii.jpg
Accept (y/n) or revert (r)? y
WARNING 3.jpg is deleted
Accept change? (y/n) y
WARNING 4.jpg is changed
Accept change? (y/n) n
WARNING 5.jpg mtime changed 2021-07-10T12:30:00-0700 -> 2021-07-20T15:11:03-0700
Accept (y/n) or revert (r)? r
$ cp /another_backup/4.jpg .
$ deduposaur --archive=.
Verified .
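The warnings in the session above (renamed, deleted, changed, mtime changed) suggest a comparison between the recorded metadata and the files currently on disk. The following is a minimal sketch of such a comparison, assuming each file is described by a name, a modification time, and a content digest. The `FileInfo` and `Change` types and the `diff_archive` function are hypothetical and do not reflect the tool's real data model.

```rust
use std::collections::HashSet;

/// Metadata for one file. The field names are hypothetical.
#[derive(Debug, Clone, PartialEq, Eq)]
struct FileInfo {
    name: String,
    mtime: i64,     // modification time, seconds since the Unix epoch
    digest: String, // hex digest of the file contents
}

/// One difference between the recorded and current state (assumed shape).
#[derive(Debug, PartialEq, Eq)]
enum Change {
    Renamed { from: String, to: String },
    Deleted(String),
    Changed(String),
    MtimeChanged { name: String, old: i64, new: i64 },
    New(String),
}

/// Compares recorded metadata with the current archive contents.
fn diff_archive(recorded: &[FileInfo], current: &[FileInfo]) -> Vec<Change> {
    let recorded_names: HashSet<&str> = recorded.iter().map(|r| r.name.as_str()).collect();
    let mut rename_targets: HashSet<&str> = HashSet::new();
    let mut changes = Vec::new();
    for old in recorded {
        // A file with the same name still exists: compare contents, then date.
        let same_name = current.iter().find(|c| c.name == old.name);
        if let Some(cur) = same_name {
            if cur.digest != old.digest {
                changes.push(Change::Changed(old.name.clone()));
            } else if cur.mtime != old.mtime {
                changes.push(Change::MtimeChanged {
                    name: old.name.clone(),
                    old: old.mtime,
                    new: cur.mtime,
                });
            }
            continue;
        }
        // The name is gone: look for the same contents under an unrecorded name.
        let moved_to = current.iter().find(|c| {
            c.digest == old.digest
                && !recorded_names.contains(c.name.as_str())
                && !rename_targets.contains(c.name.as_str())
        });
        match moved_to {
            Some(cur) => {
                rename_targets.insert(cur.name.as_str());
                changes.push(Change::Renamed {
                    from: old.name.clone(),
                    to: cur.name.clone(),
                });
            }
            None => changes.push(Change::Deleted(old.name.clone())),
        }
    }
    // Anything left over that was neither recorded nor a rename target is new.
    for cur in current {
        let name = cur.name.as_str();
        if !recorded_names.contains(name) && !rename_targets.contains(name) {
            changes.push(Change::New(cur.name.clone()));
        }
    }
    changes
}
```

In this sketch an unchanged file produces no entry, so an empty result corresponds to the "Verified" line in the session.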
First, run deduposaur and it will record metadata of new files and rename known files:
$ cd /new_files
$ ls
1.jpg 2.jpg 3.jpg 4.jpg 5.jpg 6.jpg 7.jpg
$ deduposaur --archive=/my_archive --process=.
Verified /my_archive
Created deduposaur.process_metadata.json
Renamed DUPE.1.jpg - /my_archive/1.jpg
Renamed DUPE.2.jpg - /my_archive/2.hawaii.jpg
Renamed DELETED.3.jpg
Renamed CHANGED.4.jpg - /my_archive/4.jpg
Renamed METADATA.5.jpg - /my_archive/5.jpg
$
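The renamed files in the output above follow a PREFIX.original-name pattern, for example DUPE.1.jpg. A minimal sketch of building such a path with the standard library is shown here. The `prefixed_path` helper is hypothetical, and the real tool may handle edge cases such as name collisions or non-UTF-8 filenames differently.

```rust
use std::path::{Path, PathBuf};

/// Returns `path` with `prefix` and a dot prepended to its file name,
/// or None if the path has no file name or the name is not valid UTF-8.
fn prefixed_path(path: &Path, prefix: &str) -> Option<PathBuf> {
    let name = path.file_name()?.to_str()?;
    Some(path.with_file_name(format!("{}.{}", prefix, name)))
}
```

The result could then be passed to `std::fs::rename` together with the original path to perform the rename.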
Second, go through the files. Move some to your archive and delete the rest.
$ rm DUPE.1.jpg
$ rm DUPE.2.jpg
$ rm DELETED.3.jpg
$ mv CHANGED.4.jpg /my_archive/4.jpg
$ rm METADATA.5.jpg
$ mv 6.jpg /my_archive/
$ rm 7.jpg
$ ls
deduposaur.process_metadata.json
$
Finally, run deduposaur again to update your archive and remember the deleted files.
$ deduposaur --archive=/my_archive --process=.
/my_archive/4.jpg is replaced by 4.jpg
/my_archive/6.jpg is new
Verified /my_archive
METADATA.5.jpg was deleted
7.jpg was deleted
Deleting deduposaur.process_metadata.json since it is now empty.
$ ls
$
- Find a way to make it run faster. Re-validating the archive takes a long time.
  Consider skipping archive validation when the user specifies --process.
- Report duplicates in archive and process dir
- DONE - Integration tests
- DONE - Make tests pass.
- Switch away from libraries with unsafe code:
  - structopt (Why does command-line processing need unsafe code?)
  - serde_json
  - sha2
Metric output format: x/y
x = unsafe code used by the build
y = total unsafe code found in the crate
Symbols:
🔒 = No `unsafe` usage found, declares #![forbid(unsafe_code)]
❓ = No `unsafe` usage found, missing #![forbid(unsafe_code)]
☢️ = `unsafe` usage found
Functions  Expressions  Impls  Traits  Methods  Dependency

0/0        0/0          0/0    0/0     0/0      ❓  deduposaur 0.1.0
1/1        44/90        2/2    0/0     0/0      ☢️  ├── chrono 0.4.19
1/20       10/365       0/2    0/0     5/40     ☢️  │   ├── libc 0.2.124
0/0        0/0          0/0    0/0     0/0      ❓  │   ├── num-integer 0.1.44
0/0        4/10         0/0    0/0     0/0      ☢️  │   │   └── num-traits 0.2.14
                                                    │   │       [build-dependencies]
0/0        0/0          0/0    0/0     0/0      ❓  │   │       └── autocfg 1.1.0
                                                    │   │   [build-dependencies]
0/0        0/0          0/0    0/0     0/0      ❓  │   │   └── autocfg 1.1.0
0/0        4/10         0/0    0/0     0/0      ☢️  │   ├── num-traits 0.2.14
0/0        5/5          0/0    0/0     0/0      ☢️  │   ├── serde 1.0.136
0/0        0/0          0/0    0/0     0/0      ❓  │   │   └── serde_derive 1.0.136
0/0        12/12        0/0    0/0     3/3      ☢️  │   │       ├── proc-macro2 1.0.37
0/0        0/0          0/0    0/0     0/0      🔒  │   │       │   └── unicode-xid 0.2.2
0/0        0/0          0/0    0/0     0/0      ❓  │   │       ├── quote 1.0.18
0/0        12/12        0/0    0/0     3/3      ☢️  │   │       │   └── proc-macro2 1.0.37
0/0        47/47        3/3    0/0     2/2      ☢️  │   │       └── syn 1.0.91
0/0        12/12        0/0    0/0     3/3      ☢️  │   │           ├── proc-macro2 1.0.37
0/0        0/0          0/0    0/0     0/0      ❓  │   │           ├── quote 1.0.18
0/0        0/0          0/0    0/0     0/0      🔒  │   │           └── unicode-xid 0.2.2
1/1        218/218      0/0    0/0     0/0      ☢️  │   └── time 0.1.44
1/20       10/365       0/2    0/0     5/40     ☢️  │       └── libc 0.2.124
0/0        35/78        0/0    0/0     0/0      ☢️  ├── filetime 0.2.16
0/0        0/0          0/0    0/0     0/0      ❓  │   ├── cfg-if 1.0.0
1/20       10/365       0/2    0/0     5/40     ☢️  │   └── libc 0.2.124
0/0        0/0          0/0    0/0     0/0      ❓  ├── hex 0.4.3
0/0        5/5          0/0    0/0     0/0      ☢️  │   └── serde 1.0.136
0/0        5/5          0/0    0/0     0/0      ☢️  ├── serde 1.0.136
0/0        4/7          0/0    0/0     0/0      ☢️  ├── serde_json 1.0.79
0/0        7/7          0/0    0/0     0/0      ☢️  │   ├── itoa 1.0.1
7/9        587/723      0/0    0/0     2/2      ☢️  │   ├── ryu 1.0.9
0/0        5/5          0/0    0/0     0/0      ☢️  │   └── serde 1.0.136
0/0        6/6          0/0    0/0     0/0      ☢️  ├── serde_with 1.13.0
1/1        44/90        2/2    0/0     0/0      ☢️  │   ├── chrono 0.4.19
0/0        0/0          0/0    0/0     0/0      ❓  │   ├── doc-comment 0.3.3
0/0        0/0          0/0    0/0     0/0      ❓  │   ├── hex 0.4.3
0/1        0/1          0/0    0/0     0/0      ❓  │   ├── rustversion 1.0.6
0/0        5/5          0/0    0/0     0/0      ☢️  │   ├── serde 1.0.136
0/0        4/7          0/0    0/0     0/0      ☢️  │   ├── serde_json 1.0.79
0/0        0/0          0/0    0/0     0/0      🔒  │   └── serde_with_macros 1.5.2
0/0        0/0          0/0    0/0     0/0      ❓  │       ├── darling 0.13.4
0/0        0/0          0/0    0/0     0/0      ❓  │       │   ├── darling_core 0.13.4
0/0        0/0          0/0    0/0     0/0      ❓  │       │   │   ├── fnv 1.0.7
0/0        0/0          0/0    0/0     0/0      ❓  │       │   │   ├── ident_case 1.0.1
0/0        12/12        0/0    0/0     3/3      ☢️  │       │   │   ├── proc-macro2 1.0.37
0/0        0/0          0/0    0/0     0/0      ❓  │       │   │   ├── quote 1.0.18
0/0        0/0          0/0    0/0     0/0      🔒  │       │   │   ├── strsim 0.10.0
0/0        47/47        3/3    0/0     2/2      ☢️  │       │   │   └── syn 1.0.91
0/0        0/0          0/0    0/0     0/0      ❓  │       │   └── darling_macro 0.13.4
0/0        0/0          0/0    0/0     0/0      ❓  │       │       ├── darling_core 0.13.4
0/0        0/0          0/0    0/0     0/0      ❓  │       │       ├── quote 1.0.18
0/0        47/47        3/3    0/0     2/2      ☢️  │       │       └── syn 1.0.91
0/0        12/12        0/0    0/0     3/3      ☢️  │       ├── proc-macro2 1.0.37
0/0        0/0          0/0    0/0     0/0      ❓  │       ├── quote 1.0.18
0/0        47/47        3/3    0/0     2/2      ☢️  │       └── syn 1.0.91
8/8        202/202      0/0    0/0     0/0      ☢️  ├── sha2 0.9.9
0/0        6/6          0/0    0/0     0/0      ☢️  │   ├── block-buffer 0.9.0
1/1        292/292      20/20  8/8     5/5      ☢️  │   │   └── generic-array 0.14.5
0/0        5/5          0/0    0/0     0/0      ☢️  │   │       ├── serde 1.0.136
0/0        0/0          0/0    0/0     0/0      🔒  │   │       └── typenum 1.15.0
                                                    │   │       [build-dependencies]
0/0        0/0          0/0    0/0     0/0      ❓  │   │       └── version_check 0.9.4
0/0        0/0          0/0    0/0     0/0      ❓  │   ├── cfg-if 1.0.0
0/1        0/14         0/0    0/0     0/0      ❓  │   ├── cpufeatures 0.2.2
0/0        0/0          0/0    0/0     0/0      🔒  │   ├── digest 0.9.0
1/1        292/292      20/20  8/8     5/5      ☢️  │   │   └── generic-array 0.14.5
0/0        0/0          0/0    0/0     0/0      ❓  │   └── opaque-debug 0.3.0
0/0        0/0          0/0    0/0     0/0      🔒  └── structopt 0.3.26
0/0        1/1          0/0    0/0     0/0      ☢️      ├── clap 2.34.0
0/0        32/32        0/0    0/0     0/0      ☢️      │   ├── ansi_term 0.12.1
0/0        5/5          0/0    0/0     0/0      ☢️      │   │   └── serde 1.0.136
2/2        45/45        0/0    0/0     0/0      ☢️      │   ├── atty 0.2.14
1/20       10/365       0/2    0/0     5/40     ☢️      │   │   └── libc 0.2.124
0/0        0/0          0/0    0/0     0/0      ❓      │   ├── bitflags 1.3.2
0/0        0/0          0/0    0/0     0/0      ❓      │   ├── strsim 0.8.0
0/0        0/0          0/0    0/0     0/0      ❓      │   ├── textwrap 0.11.0
0/0        0/0          0/0    0/0     0/0      ❓      │   │   └── unicode-width 0.1.9
0/0        0/0          0/0    0/0     0/0      ❓      │   ├── unicode-width 0.1.9
0/0        0/0          0/0    0/0     0/0      ❓      │   └── vec_map 0.8.2
0/0        5/5          0/0    0/0     0/0      ☢️      │       └── serde 1.0.136
0/0        7/7          1/1    0/0     0/0      ☢️      ├── lazy_static 1.4.0
0/0        0/0          0/0    0/0     0/0      🔒      └── structopt-derive 0.4.18
0/0        0/0          0/0    0/0     0/0      ❓          ├── heck 0.3.3
0/0        0/0          0/0    0/0     0/0      ❓          │   └── unicode-segmentation 1.9.0
0/0        0/0          0/0    0/0     0/0      🔒          ├── proc-macro-error 1.0.4
0/0        0/0          0/0    0/0     0/0      ❓          │   ├── proc-macro-error-attr 1.0.4
                                                            │   │   [build-dependencies]
0/0        0/0          0/0    0/0     0/0      ❓          │   │   └── version_check 0.9.4
0/0        12/12        0/0    0/0     3/3      ☢️          │   │   ├── proc-macro2 1.0.37
0/0        0/0          0/0    0/0     0/0      ❓          │   │   └── quote 1.0.18
0/0        12/12        0/0    0/0     3/3      ☢️          │   ├── proc-macro2 1.0.37
0/0        0/0          0/0    0/0     0/0      ❓          │   ├── quote 1.0.18
0/0        47/47        3/3    0/0     2/2      ☢️          │   └── syn 1.0.91
                                                            │   [build-dependencies]
0/0        0/0          0/0    0/0     0/0      ❓          │   └── version_check 0.9.4
0/0        12/12        0/0    0/0     3/3      ☢️          ├── proc-macro2 1.0.37
0/0        0/0          0/0    0/0     0/0      ❓          ├── quote 1.0.18
0/0        47/47        3/3    0/0     2/2      ☢️          └── syn 1.0.91

21/44      1564/2168    26/28  8/8     17/52
License: Apache-2.0