About
Genetic Algorithm File Fitter, or just GAFFitter, is a command-line program written in C++ that uses a genetic algorithm to arrange an input list of items or files/directories into volumes of a given capacity (the target), such as CDs or DVDs, so that the total wasted space is minimized. By arranging the input list smartly, GAFFitter packs the given items more tightly and so reduces the number of volumes required to hold them.
Currently, GAFFitter runs on GNU/Linux and other POSIX systems, but it is designed so that it should be easy to extend to non-POSIX operating environments.
Features
There are five key features behind GAFFitter, namely:
- Search by a global meta-heuristic (Genetic Algorithm search).
- The command-line interface provides high integration (via pipe) with other tools, i.e. it works as a "filter".
- Allows the user to enter 'size identifier' pairs directly instead of file/dir names.
- Pretty configurable. GAFFitter has many input parameters to control/adjust its behavior (including GA params).
- It is Free Software! (GPL)
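To make the Genetic Algorithm search in the first feature more concrete, here is a minimal Python sketch of how a GA could fill a single volume. It is not GAFFitter's actual C++ implementation. The function name `ga_fill_volume`, the bit-string encoding, the uniform crossover, the elitism step, the penalty for overfull volumes and the `1/n` mutation rate are all assumptions made for illustration. Only the crossover probability (0.95), tournament size (2) and seed (1) mirror defaults listed in the usage text; the population size and generation count are arbitrary.

```python
import random


def ga_fill_volume(sizes, target, pop_size=30, generations=200,
                   cross_prob=0.95, mutation_prob=None,
                   tournament_size=2, seed=1):
    """Return indices of items whose total size approaches `target` from below."""
    rng = random.Random(seed)
    n = len(sizes)
    if mutation_prob is None:
        # Assumed "auto" rule: on average one gene flips per individual.
        mutation_prob = 1.0 / max(n, 1)

    def fitness(individual):
        total = sum(s for s, bit in zip(sizes, individual) if bit)
        # Feasible selections score their total; overfull ones score negative.
        return total if total <= target else target - total

    def tournament(population):
        # Tournament selection: best of `tournament_size` random individuals.
        return max(rng.sample(population, tournament_size), key=fitness)

    # Each individual is a list of booleans: True means "item is selected".
    population = [[rng.random() < 0.5 for _ in range(n)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)

    for _ in range(generations):
        next_gen = [best[:]]  # elitism: always keep the best individual
        while len(next_gen) < pop_size:
            a, b = tournament(population), tournament(population)
            if rng.random() < cross_prob:
                # Uniform crossover: each gene comes from either parent.
                child = [x if rng.random() < 0.5 else y for x, y in zip(a, b)]
            else:
                child = a[:]
            # Per-gene mutation: flip each bit with probability mutation_prob.
            child = [(not g) if rng.random() < mutation_prob else g
                     for g in child]
            next_gen.append(child)
        population = next_gen
        best = max(population, key=fitness)

    if fitness(best) < 0:
        return []  # no feasible selection was found
    return [i for i, bit in enumerate(best) if bit]
```

Filling several volumes would then amount to calling such a function repeatedly on the items that remain unselected, which is how the per-volume summaries in the examples further down read.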
Usage
Usage: gaffitter -t target [options...] <files>
... | gaffitter - -t target [options...] [files]
General options:
-t <f>, --target <f>
target size (mandatory), f>0.0
-b, --bytes
target, min and max size in bytes
-k, --kb
target, min and max size in kibi bytes (KiB); KB if --si
-m, --mb
target, min and max size in mebi bytes (MiB) [default]; MB if --si
-g, --gb
target, min and max size in gibi bytes (GiB); GB if --si
--si
use powers of 1000 (not 1024) for target, min, max and output sizes
-i <n>, --iter <n>
maximum number of iterations (volumes) [default = "unlimited"]
-v, --verbose
verbose
--min <n>, --min-size <n>
minimum file size [default = none]
--max <n>, --max-size <n>
maximum file size [default = none]
--bs <n>, --block-size <n>
the smallest amount of bytes a file can occupy [default = 1]
--ss, --show-size
print the size of each file
--sb, --show-bytes
also print the sizes in bytes
--su, --show-unselected
print the unselected files
--hsel, --hide-selected
don't print the selected files
--hs, --hide-summary
hide summary line containing sum, difference and number of
selected files
-s, --sort-by-size
sort the output by size, not by name
-n, --no-case
use case-insensitive sorting
-r, --sort-reverse
sort the output in reverse order
--ew <char>, --enclose-with <char>
enclose file names with "char" [default = none]
--dw <char>, --delimit-with <char>
delimit file names (lines) with "char" [default = newline]
--version
print GAFFitter version and exit
-h, --help
print this help and exit
Direct Input options:
--di, --direct-input
switch to direct input mode, i.e., read directly "size identifier"
pairs instead of file names
--di-b, --di-bytes
assume input sizes as bytes
--di-k, --di-kb
assume input sizes as kibi bytes (KiB); KB if --di-si
--di-m, --di-mb
assume input sizes as mebi bytes (MiB); MB if --di-si
--di-g, --di-gb
assume input sizes as gibi bytes (GiB); GB if --di-si
--di-si
use powers of 1000 (not 1024) for input sizes
Genetic Algorithm options:
--ga-s <n>, --ga-seed <n>
GA initialization seed, n>=0 [default = 1]; 0 = random
--ga-rs, --ga-random-seed
use random GA seed (same as --ga-seed 0)
--ga-ng <n>, --ga-num-generations <n>
maximum number of generations, n>0 [default = auto]
--ga-ps <n>, --ga-pop-size <n>
number of individuals, n>tournament_size [default = auto]
--ga-cp <f>, --ga-cross-prob <f>
crossover probability, 0.0<=f<=1.0 [default = 0.95]
--ga-mp <f>, --ga-mutation-prob <f>
mutation probability (per gene), 0.0<=f<=1.0 [default = auto]
--ga-sp <n>, --ga-sel-pressure <n>
selection pressure (tournament size), 2<=n<pop_size [default = 2]
Other search methods
--bf, --brute-force
tries all possible combinations (use carefully!)
--ap, --approximate
local approximation using Best First search (non-optimal but
very fast)
--sp, --split
just split the input when target size is reached (preserves
original order while splitting)
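The warning attached to --brute-force is easier to appreciate with a small sketch. The Python function below (the hypothetical name `brute_force_fill` is not part of GAFFitter) tries every combination of items and keeps the one whose total is largest without exceeding the target. It illustrates the idea only and assumes nothing about how GAFFitter implements it internally.

```python
from itertools import combinations


def brute_force_fill(sizes, target):
    """Try every subset of items; return (indices, total) of the best fit."""
    best_sum, best_subset = 0.0, ()
    indices = range(len(sizes))
    for r in range(1, len(sizes) + 1):
        for subset in combinations(indices, r):
            total = sum(sizes[i] for i in subset)
            # Keep the subset that gets closest to the target from below.
            if best_sum < total <= target:
                best_sum, best_subset = total, subset
    return list(best_subset), best_sum
```

With n items there are 2^n subsets to examine, so the running time doubles with every additional item. This is why exhaustive search is only practical for short input lists.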
Examples
Simple usage
~$ gaffitter -t 700 *.mp3
brahms_4_balladen_op_10_3.mp3
iii_sarka.mp3
...
quatuor_gdur_grave.mp3
[1] Sum: 700.00MiB of 4.55GiB, Diff: 1Bytes, Files: 296/861
francesca_da_rimini_op_32.mp3
sinf_en_re_menor_lento.mp3
...
zapateado_allegro_vivace.mp3
[2] Sum: 700.00MiB of 3.87GiB, Diff: 59Bytes, Files: 43/565
.
.
.
Limiting the number of volumes (-i/--iter option)
~$ gaffitter -t 4.37 -g -i 1 *
beethoven_melos_quartett
beethoven_ludwig_van
bruch_dvorak
...
richard_wagner
wolf_strauss
[1] Sum: 4.37GiB of 21.57GiB, Diff: 1.11KiB, Files: 14/68
Input via stdin (pipes)
~$ find . -type f | gaffitter - -t 1.4 -i 1
./gaffitter
./optimizers/GeneticAlgorithm.o
./Input.o
...
./util/CVS/Repository
[1] Sum: 1.40MiB of 1.49MiB, Diff: 0Bytes, Files: 20/39
Using the --split option
This option is useful when you need to preserve the order of the given files/items: the input is simply split according to the target size. This method usually wastes more space, though.
Suppose you have a few music files and want to generate two volumes of 10MiB each, but preserving the input order is important to you:
~$ gaffitter -t 10 -i 2 --split *
01_track1.mp3 3.73MiB
02_track2.mp3 5.42MiB
[1] Sum: 9.15MiB of 77.28MiB, Diff: 870.00KiB, Files: 2/18
03_track3.mp3 3.37MiB
04_track4.mp3 3.69MiB
[2] Sum: 7.06MiB of 68.13MiB, Diff: 2.94MiB, Files: 2/16
Of course, without this ordering restriction the volumes are filled more tightly:
~$ gaffitter -t 10 -i 2 *
03_track3.mp3 3.37MiB
05_track5.mp3 3.49MiB
11_track11.mp3 3.14MiB
[1] Sum: 10.00MiB of 77.28MiB, Diff: 2.00KiB, Files: 3/18
02_track2.mp3 5.42MiB
13_track13.mp3 4.54MiB
[2] Sum: 9.96MiB of 67.28MiB, Diff: 38.00KiB, Files: 2/15
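The order-preserving behaviour of --split can be sketched in a few lines of Python. This is a rough illustration rather than GAFFitter's own code: the function name `split_in_order` is hypothetical, and skipping items larger than the target is an assumption of the sketch.

```python
def split_in_order(items, target, max_volumes=None):
    """Split (name, size) pairs into volumes without reordering them."""
    volumes, current, used = [], [], 0.0
    for name, size in items:
        if size > target:
            continue  # assumption: an item that can never fit is skipped
        if used + size > target:
            # The next item does not fit: close this volume, start a new one.
            volumes.append(current)
            if max_volumes is not None and len(volumes) == max_volumes:
                return volumes
            current, used = [], 0.0
        current.append(name)
        used += size
    if current:
        volumes.append(current)
    return volumes
```

Fed with the first track sizes from the listings above (3.73, 5.42, 3.37, 3.69 and 3.49 MiB), a target of 10 and a limit of two volumes, the sketch groups tracks 1 and 2, then tracks 3 and 4, as in the --split example.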
More features
~$ gaffitter -t 600 -k --show-size --show-unselected src/*
src/optimizers 447.38KiB
src/DiskUsage.o 104.19KiB
src/util 32.42KiB
src/Input.cc 4.10KiB
src/Optimizer.hh 3.40KiB
src/Input.hh 3.20KiB
src/Params.hh 3.17KiB
src/DiskUsage.hh 2.15KiB
[1] Sum: 600.00KiB of 1.50MiB, Diff: 1Bytes, Files: 8/19
src/gaffitter 546.31KiB
src/Input.o 152.47KiB
src/Params.o 110.88KiB
src/Optimizer.o 96.41KiB
src/Params.cc 10.05KiB
src/CVS 4.55KiB
src/Optimizer.cc 4.41KiB
src/Exception.hh 4.09KiB
src/DiskUsage.cc 3.07KiB
src/gaffitter.cc 2.35KiB
src/Makefile 2.05KiB
[1] <UNSELECTED> Sum: 936.65KiB of 1.50MiB, Files: 11/19
Direct Input
~$ gaffitter -t 3.14 --di --ss '1 id one' '2.4 ID2' '0.3 b' '0.5 foo' '1.23456789 bar'
b 0.3
bar 1.23456789
foo 0.5
id one 1
[1] Sum: 3.03456789 of 5.43456789, Diff: 0.10543211, Items: 4/5
ID2 2.4
[2] Sum: 2.4 of 2.4, Diff: 0.74, Items: 1/1
~$ du * | gaffitter - -t 200 --di -i 1
optimizers
util/CVS
Makefile
Exception.hh
[1] Sum: 200 of 536, Diff: 0, Items: 4/22
~$ du * | gaffitter - -t 200 -k --di --di-k -i 1
optimizers
util/CVS
Makefile
Exception.hh
[1] Sum: 200.00KiB of 536.00KiB, Diff: 0Bytes, Files: 4/22
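The "size identifier" pairs used above can be read as a number followed by free-form text. Here is a small Python sketch of that reading. The function name `parse_pair` is hypothetical, and splitting on the first run of whitespace is an assumption inferred from the examples (it also covers the tab-separated output of du); GAFFitter's own parser may behave differently in corner cases.

```python
def parse_pair(line):
    """Split a direct-input line into (size, identifier)."""
    # Split only once, so the identifier may itself contain spaces.
    size_text, identifier = line.strip().split(None, 1)
    return float(size_text), identifier
```

Under this reading, '1 id one' yields the size 1 and the identifier "id one", which is how the item appears in the first listing.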
Some screenshots can also be found here :)
Integration with other Apps
Creation of ISO9660 image files
Being a filter, GAFFitter can be used for many tasks involving the packing of files and directories. For instance, this shell script creates CD/DVD ISO9660 images using genisoimage. The syntax is as follows:
Usage: gaffitter-image --cd|--dvd|--cd74 [--split] [--iter <n>] [--link] files
where
--cd: create volumes of 700MB (data CD)
--dvd: create volumes of 4.38GB (data DVD)
--cd74: create volumes of 650MB (data CD)
--split: just splits the input (i.e. preserves original order)
--iter n: maximum number of volumes (default = as much as possible)
--link: create temporary hardlinks instead of copying (faster, but
won't work if there are files/dirs on different devices)
Example: gaffitter-image --dvd *
The generated image files will be stored in the current directory and
they will be named as CD0001.iso, ..., CD000n.iso
and DVD0001.iso, ..., DVD000n.iso for CD and DVD images respectively. n is the
number of volumes (bins) used to pack the given list of files/dirs.
Note 1: if you want to use mkisofs instead of genisoimage, please
replace the command genisoimage in variable ISO_CMD with
mkisofs.
Note 2: Because genisoimage/mkisofs merges the given paths instead of
storing files with their full paths, the script has to copy the
selected files/dirs into a temporary directory and then call
genisoimage/mkisofs with the path of this directory. To
minimize this overhead, it is possible to just link the files/dirs (--link
option) instead of copying them, but this won't work if the
files/dirs are on different devices (cross-device).
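The copy-versus-link trade-off in Note 2 can be illustrated with a short Python sketch. It is not the gaffitter-image script itself, which is a shell script: the function name `stage_files` is hypothetical, the sketch handles regular files only, and falling back to a copy on a cross-device error is a design choice of the sketch (the note above says the script's --link option simply won't work in that case).

```python
import errno
import os
import shutil


def stage_files(paths, staging_dir, link=False):
    """Place each file in staging_dir by hard link or by copy."""
    staged = []
    for path in paths:
        dest = os.path.join(staging_dir, os.path.basename(path))
        if link:
            try:
                os.link(path, dest)  # fast: no file data is duplicated
            except OSError as exc:
                if exc.errno != errno.EXDEV:
                    raise
                # Hard links cannot cross devices, so copy instead.
                shutil.copy2(path, dest)
        else:
            shutil.copy2(path, dest)
        staged.append(dest)
    return staged
```

A hard link only adds a second directory entry for the same data, which is why it is faster than copying and also why it cannot span file systems.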
K3B and GAFFitter
Please download this shell script and follow the usage instructions:
Usage: gaffitter-k3b --cd|--dvd|--cd74 [--split] [--iter <n>] files
Where:
--cd: create volumes of 700MB (data CD)
--dvd: create volumes of 4.38GB (data DVD)
--cd74: create volumes of 650MB (data CD)
--split: just splits the input (preserves original order)
--iter n: maximum number of volumes (default = as much as possible)
Example: gaffitter-k3b --dvd *
Note:
- This script should work with filenames containing whitespace
and other unusual characters (except for the backquote
` character, which is currently the delimiter used by this script; it can be changed, however).
- The mentioned script is designed to work with GAFFitter 0.5.1 (and possibly with later versions too).
Nautilus Scripts for gaffitter-k3b
If you use Nautilus, then the following scripts may be useful:
- gaff-k3b-cd
- Packs files/dirs into CD volumes and burns them using K3B
- gaff-k3b-cd-split
- Packs files/dirs into CD volumes and burns them using K3B, but preserves the order
- gaff-k3b-dvd
- Packs files/dirs into DVD volumes and burns them using K3B
- gaff-k3b-dvd-split
- Packs files/dirs into DVD volumes and burns them using K3B, but preserves the order
Installation:
- Set the script files as executable and put them in
~/.gnome2/nautilus-scripts/
Usage:
- Under Nautilus, select the files/dirs to be packed
- Right-click the selection and go to the entry Scripts on the context menu
- Select one of the above mentioned scripts
Note: these scripts require gaffitter-k3b (see gaffitter.sf.net), awk, sed and echo
Resources: K3B website, Nautilus File Manager Scripts
Brasero and GAFFitter
Similar to gaffitter-k3b, there is a shell script called gaff-brasero that integrates GAFFitter with the Brasero CD/DVD burner.
(Thanks to Mark Edgington)
Note: Unlike K3B, Brasero follows symbolic links automatically, and at present it provides no means to disable this behaviour. Also, Brasero can only manage one volume at a time, so for multi-volume output it is called sequentially, once per volume.
Following symbolic links using GNU Disk Usage (du)
GAFFitter doesn't follow symbolic links. However, the same effect can be achieved by using du to obtain the file/dir sizes:
du -Lbs <files/dirs> | gaffitter - --di --di-b <user options>
For example
gaffitter -t 700 --bs 2048 *
is equivalent to
du -Lbs * | gaffitter - --di --di-b -t 700 --bs 2048
except for the symbolic link dereference.
- Resources: GNU du manpage
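The arithmetic behind the -t 700 --bs 2048 example can be sketched as follows. Both helpers are hypothetical names, not GAFFitter functions, and rounding each size up to a whole number of blocks is an assumption based on the description of --block-size as "the smallest amount of bytes a file can occupy". For reference, 2048 bytes is the usual sector size of ISO 9660 CD/DVD file systems.

```python
UNIT_EXPONENT = {"b": 0, "k": 1, "m": 2, "g": 3}


def to_bytes(value, unit="m", si=False):
    """Convert a size to bytes; powers of 1024 by default, 1000 with si."""
    base = 1000 if si else 1024
    return value * base ** UNIT_EXPONENT[unit]


def occupied_bytes(size, block_size=1):
    """Round an integer size in bytes up to a whole number of blocks."""
    return ((size + block_size - 1) // block_size) * block_size
```

Under these assumptions a target of 700 (MiB by default) is 734003200 bytes, or 700000000 bytes with --si, and a 1-byte file counts as 2048 bytes when the block size is 2048.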
License
GAFFitter is licensed under the GNU General Public License (GPL) Version 3 (or later), June 2007








