Latest news

2008-06-04: Released version 0.5.2.


Author

Douglas A. Augusto daaugusto@gmail.com

daaugusto@jabber.org





Last updated on Fri Jul 4 19:08:08 2008.

About

Genetic Algorithm File Fitter, or just GAFFitter, is a command-line program written in C++ that uses a genetic algorithm to arrange an input list of items or files/directories into volumes of a given capacity (the target), such as CDs or DVDs, so that the total wasted space is minimized. By arranging the input list well, GAFFitter packs the items more tightly and thereby reduces the number of volumes needed to hold them.
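
As a rough illustration of the idea, the sketch below uses a tiny genetic algorithm to pick a subset of item sizes whose sum comes as close as possible to the target without exceeding it. It is not GAFFitter's implementation: the bit-string encoding, the fitness function, the elitism step, the mutation heuristic and the names `fitness` and `select_subset` are all assumptions made for this example. Only the tournament size of 2 and the crossover probability of 0.95 are borrowed from the defaults listed under Usage. Sizes are integers (hundredths of a MiB) to avoid floating-point rounding.

```python
import random

def fitness(bits, sizes, target):
    """Sum of the selected sizes; selections that overflow score negative."""
    total = sum(s for b, s in zip(bits, sizes) if b)
    return total if total <= target else target - total

def select_subset(sizes, target, pop_size=40, generations=200,
                  cross_prob=0.95, seed=1):
    """Return indices of a subset whose sum is close to, but not above, target."""
    rng = random.Random(seed)
    n = len(sizes)  # the sketch assumes at least two items
    mut_prob = 1.0 / n  # per-gene mutation probability (assumed heuristic)
    # Random initial population plus the empty selection, which always fits.
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size - 1)]
    pop.append([0] * n)
    best = max(pop, key=lambda ind: fitness(ind, sizes, target))

    def tournament():
        a, b = rng.sample(pop, 2)  # tournament of size 2
        return a if fitness(a, sizes, target) >= fitness(b, sizes, target) else b

    for _ in range(generations):
        new_pop = [best[:]]  # elitism: carry over the best individual so far
        while len(new_pop) < pop_size:
            p1, p2 = tournament(), tournament()
            child = p1[:]
            if rng.random() < cross_prob:  # one-point crossover
                cut = rng.randrange(1, n)
                child = p1[:cut] + p2[cut:]
            # Flip each gene independently with probability mut_prob.
            child = [1 - g if rng.random() < mut_prob else g for g in child]
            new_pop.append(child)
        pop = new_pop
        best = max(pop, key=lambda ind: fitness(ind, sizes, target))
    return [i for i, bit in enumerate(best) if bit]

# Hypothetical sizes in hundredths of a MiB; the target is 10.00 MiB.
sizes = [373, 542, 337, 369, 349, 314, 454]
chosen = select_subset(sizes, target=1000)
print(chosen, sum(sizes[i] for i in chosen))
```

Because the empty selection is always in the initial population and the best individual is carried over, the returned subset never exceeds the target. How close it gets depends on the population size, the number of generations and the seed.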

Currently, GAFFitter runs on GNU/Linux and other POSIX systems, but it is designed so that it should be easy to extend to non-POSIX operating environments.

Features

There are five key features behind GAFFitter, namely:

Usage

Usage: gaffitter -t target [options...] <files>
       ... | gaffitter - -t target [options...] [files]
General options:
  -t <f>, --target <f>
     target size (mandatory), f>0.0
  -b, --bytes
     target, min and max size in bytes
  -k, --kb
     target, min and max size in kibi bytes (KiB); KB if --si
  -m, --mb
     target, min and max size in mebi bytes (MiB) [default]; MB if --si
  -g, --gb
     target, min and max size in gibi bytes (GiB); GB if --si
  --si
     use powers of 1000 (not 1024) for target, min, max and output sizes
  -i <n>, --iter <n>
     maximum number of iterations (volumes) [default = "unlimited"]
  -v, --verbose
     verbose
  --min <n>, --min-size <n>
     minimum file size [default = none]
  --max <n>, --max-size <n>
     maximum file size [default = none]
  --bs <n>, --block-size <n>
     the smallest amount of bytes a file can occupy [default = 1]
  --ss, --show-size
     print the size of each file
  --sb, --show-bytes
     also print the sizes in bytes
  --su, --show-unselected
     print the unselected files
  --hsel, --hide-selected
     don't print the selected files
  --hs, --hide-summary
     hide summary line containing sum, difference and number of
     selected files
  -s, --sort-by-size
     sort the output by size, not by name
  -n, --no-case
     use case-insensitive sorting
  -r, --sort-reverse
     sort the output in reverse order
  --ew <char>, --enclose-with <char>
     enclose file names with "char" [default = none]
  --dw <char>, --delimit-with <char>
     delimit file names (lines) with "char" [default = newline]
  --version
     print GAFFitter version and exit
  -h, --help
     print this help and exit
Direct Input options:
  --di, --direct-input
     switch to direct input mode, i.e., read directly "size identifier"
     pairs instead of file names
  --di-b, --di-bytes
     assume input sizes as bytes
  --di-k, --di-kb
     assume input sizes as kibi bytes (KiB); KB if --di-si
  --di-m, --di-mb
     assume input sizes as mebi bytes (MiB); MB if --di-si
  --di-g, --di-gb
     assume input sizes as gibi bytes (GiB); GB if --di-si
  --di-si
     use powers of 1000 (not 1024) for input sizes
Genetic Algorithm options:
  --ga-s <n>, --ga-seed <n>
     GA initialization seed, n>=0 [default = 1]; 0 = random
  --ga-rs, --ga-random-seed
     use random GA seed (same as --ga-seed 0)
  --ga-ng <n>, --ga-num-generations <n>
     maximum number of generations, n>0 [default = auto]
  --ga-ps <n>, --ga-pop-size <n>
     number of individuals, n>tournament_size [default = auto]
  --ga-cp <f>, --ga-cross-prob <f>
     crossover probability, 0.0<=f<=1.0 [default = 0.95]
  --ga-mp <f>, --ga-mutation-prob <f>
     mutation probability (per gene), 0.0<=f<=1.0 [default = auto]
  --ga-sp <n>, --ga-sel-pressure <n>
     selection pressure (tournament size), 2<=n<pop_size [default = 2]
Other search methods
  --bf, --brute-force
     tries all possible combinations (use carefully!)
  --ap, --approximate
     local approximation using Best First search (non-optimal but
     very fast)
  --sp, --split
     just split the input when target size is reached (preserves
     original order while splitting)
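
To make the unit options concrete, here is a small sketch of how a target value maps to bytes under -b, -k, -m and -g, with and without --si. It only restates the arithmetic described in the help text. The function names `unit_factor` and `target_bytes` are invented for this illustration and are not part of GAFFitter.

```python
def unit_factor(unit, si=False):
    """Bytes per unit for 'b', 'k', 'm' or 'g' (mirrors -b, -k, -m, -g)."""
    base = 1000 if si else 1024  # --si switches to powers of 1000
    return base ** "bkmg".index(unit)

def target_bytes(target, unit="m", si=False):
    """Convert a -t value to bytes; 'm' (MiB) is the documented default."""
    return target * unit_factor(unit, si)

print(target_bytes(700))            # -t 700        (MiB)
print(target_bytes(4.37, "g"))      # -t 4.37 -g    (GiB)
print(target_bytes(700, si=True))   # -t 700 --si   (MB)
```

The gap between the two conventions grows with the unit: about 2.4% for KiB versus KB, but about 7.4% for GiB versus GB.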

Examples

Simple usage

~$ gaffitter -t 700 *.mp3
brahms_4_balladen_op_10_3.mp3
iii_sarka.mp3
...
quatuor_gdur_grave.mp3

[1] Sum: 700.00MiB of 4.55GiB, Diff: 1Bytes, Files: 296/861

francesca_da_rimini_op_32.mp3
sinf_en_re_menor_lento.mp3
...
zapateado_allegro_vivace.mp3

[2] Sum: 700.00MiB of 3.87GiB, Diff: 59Bytes, Files: 43/565

.
.
.

Limiting the number of volumes (-i/--iter option)

~$ gaffitter -t 4.37 -g -i 1 *
beethoven_melos_quartett
beethoven_ludwig_van
bruch_dvorak
... 
richard_wagner
wolf_strauss

[1] Sum: 4.37GiB of 21.57GiB, Diff: 1.11KiB, Files: 14/68

Input via stdin (pipes)

~$ find . -type f | gaffitter - -t 1.4 -i 1
./gaffitter
./optimizers/GeneticAlgorithm.o
./Input.o
...
./util/CVS/Repository

[1] Sum: 1.40MiB of 1.49MiB, Diff: 0Bytes, Files: 20/39

Using the --split option

This option is useful when you need to preserve the order of the given files/items: the input is simply split according to the target size. This method usually wastes more space, though.

Suppose you have a few music files and want to generate two volumes of 10MiB each, but preserving the input order is important to you:

~$ gaffitter -t 10 -i 2 --split *
01_track1.mp3   3.73MiB
02_track2.mp3   5.42MiB

[1] Sum: 9.15MiB of 77.28MiB, Diff: 870.00KiB, Files: 2/18

03_track3.mp3   3.37MiB
04_track4.mp3   3.69MiB

[2] Sum: 7.06MiB of 68.13MiB, Diff: 2.94MiB, Files: 2/16

Of course, without this ordering restriction the volumes are filled more completely:

~$ gaffitter -t 10 -i 2 *
03_track3.mp3   3.37MiB
05_track5.mp3   3.49MiB
11_track11.mp3  3.14MiB

[1] Sum: 10.00MiB of 77.28MiB, Diff: 2.00KiB, Files: 3/18

02_track2.mp3   5.42MiB
13_track13.mp3  4.54MiB

[2] Sum: 9.96MiB of 67.28MiB, Diff: 38.00KiB, Files: 2/15
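
The order-preserving behaviour of --split can be sketched as a single sequential pass: keep adding items to the current volume and start a new one as soon as the next item would not fit. This is only a guess at the logic, checked against the first listing above. The function name `split_in_order` is hypothetical, and the sketch assumes every item fits in an empty volume and that at least one volume is requested.

```python
def split_in_order(items, target, max_volumes=None):
    """Fill volumes sequentially, never reordering the (name, size) pairs."""
    volumes, current, used = [], [], 0.0
    for name, size in items:
        if current and used + size > target:
            volumes.append(current)  # the next item does not fit: close volume
            if max_volumes is not None and len(volumes) == max_volumes:
                return volumes
            current, used = [], 0.0
        current.append(name)
        used += size
    if current:
        volumes.append(current)
    return volumes

# Sizes in MiB, taken from the listings above (only the first five tracks).
tracks = [("01_track1.mp3", 3.73), ("02_track2.mp3", 5.42),
          ("03_track3.mp3", 3.37), ("04_track4.mp3", 3.69),
          ("05_track5.mp3", 3.49)]
for volume in split_in_order(tracks, target=10, max_volumes=2):
    print(volume)
```

With these sizes the sketch groups tracks 1 and 2 into the first volume and tracks 3 and 4 into the second, matching the --split listing.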

More features

~$ gaffitter -t 600 -k --show-size --show-unselected src/* 
src/optimizers  447.38KiB
src/DiskUsage.o 104.19KiB
src/util        32.42KiB
src/Input.cc    4.10KiB
src/Optimizer.hh        3.40KiB
src/Input.hh    3.20KiB
src/Params.hh   3.17KiB
src/DiskUsage.hh        2.15KiB

[1] Sum: 600.00KiB of 1.50MiB, Diff: 1Bytes, Files: 8/19

src/gaffitter   546.31KiB
src/Input.o     152.47KiB
src/Params.o    110.88KiB
src/Optimizer.o 96.41KiB
src/Params.cc   10.05KiB
src/CVS 4.55KiB
src/Optimizer.cc        4.41KiB
src/Exception.hh        4.09KiB
src/DiskUsage.cc        3.07KiB
src/gaffitter.cc        2.35KiB
src/Makefile    2.05KiB

[1] <UNSELECTED> Sum: 936.65KiB of 1.50MiB, Files: 11/19

Direct Input

~$ gaffitter -t 3.14 --di --ss '1 id one' '2.4 ID2' '0.3 b' '0.5 foo' '1.23456789 bar'
b       0.3
bar     1.23456789
foo     0.5
id one  1

[1] Sum: 3.03456789 of 5.43456789, Diff: 0.10543211, Items: 4/5

ID2     2.4

[2] Sum: 2.4 of 2.4, Diff: 0.74, Items: 1/1
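
A sketch of how the "size identifier" pairs in the command above could be read: the first whitespace-separated token is the size and the rest of the string is the identifier, which is why 'id one' may contain a space. The helper `parse_pair` is hypothetical, and GAFFitter's real parser may differ in details such as error handling.

```python
def parse_pair(arg):
    """Split a 'size identifier' string into (size, identifier)."""
    size_text, identifier = arg.split(maxsplit=1)
    return float(size_text), identifier

args = ["1 id one", "2.4 ID2", "0.3 b", "0.5 foo", "1.23456789 bar"]
pairs = [parse_pair(a) for a in args]

# Items reported in the first volume of the output above.
selected = {"b", "bar", "foo", "id one"}
total = sum(size for size, name in pairs if name in selected)

# Sum and Diff for volume [1], with target 3.14.
print(round(total, 8), round(3.14 - total, 8))
```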

~$ du * | gaffitter - -t 200 --di -i 1
optimizers
util/CVS
Makefile
Exception.hh

[1] Sum: 200 of 536, Diff: 0, Items: 4/22

~$ du * | gaffitter - -t 200 -k --di --di-k -i 1
optimizers
util/CVS
Makefile
Exception.hh

[1] Sum: 200.00KiB of 536.00KiB, Diff: 0Bytes, Files: 4/22

Or if you prefer, some screenshots can be found here :)

Integration with other Apps

Creation of ISO9660 image files

Being a filter, GAFFitter can be used for many tasks that involve packing files and directories. For instance, the gaffitter-image shell script creates CD/DVD ISO9660 images using genisoimage. The syntax is as follows:

Usage: gaffitter-image --cd|--dvd|--cd74 [--split] [--iter <n>] [--link] files

where

  --cd:     create volumes of 700MB (data CD)
  --dvd:    create volumes of 4.38GB (data DVD)
  --cd74:   create volumes of 650MB (data CD)
  --split:  just splits the input (i.e. preserves original order)
  --iter n: maximum number of volumes (default = as much as possible)
  --link:   create temporary hardlinks instead of copying (faster, but
            won't work if there are files/dirs on different devices)

Example: gaffitter-image --dvd *

The generated image files are stored in the current directory and are named CD0001.iso, ..., CD000n.iso for CD images and DVD0001.iso, ..., DVD000n.iso for DVD images, where n is the number of volumes (bins) used to pack the given list of files/dirs.
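
The naming scheme amounts to a prefix followed by a volume counter and the .iso extension. The sketch below assumes the counter is zero-padded to four digits, which is what the pattern CD0001.iso suggests. The function `image_names` is made up for this illustration and does not appear in the script.

```python
def image_names(kind, volumes):
    """Names such as CD0001.iso or DVD0003.iso for the given volume count."""
    return [f"{kind}{i:04d}.iso" for i in range(1, volumes + 1)]

print(image_names("DVD", 3))
```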

Note 1: To use mkisofs instead of genisoimage, replace the command genisoimage in the variable ISO_CMD with mkisofs.

Note 2: Because genisoimage/mkisofs merges the given paths instead of storing files with their full paths, the script has to copy the selected files/dirs into a temporary directory and then call genisoimage/mkisofs with the path of that directory. To reduce this overhead, the files/dirs can be hard-linked instead of copied (--link option), but this won't work if the files/dirs reside on different devices.
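
The trade-off between copying and linking can be illustrated with a short Python sketch (the actual script is a shell script, and this is not a translation of it). The function `stage` and its parameters are hypothetical, and for brevity it handles plain files only. A hard link shares the inode of the original file, so no data is copied, but `os.link` fails with an OSError when source and destination are on different devices.

```python
import os
import shutil
import tempfile

def stage(paths, staging_dir, link=False):
    """Place files in staging_dir by hard-linking or copying (files only)."""
    os.makedirs(staging_dir, exist_ok=True)
    staged = []
    for path in paths:
        dest = os.path.join(staging_dir, os.path.basename(path))
        if link:
            os.link(path, dest)       # raises OSError across devices
        else:
            shutil.copy2(path, dest)  # works anywhere, but duplicates the data
        staged.append(dest)
    return staged

# Small demonstration inside a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "a.txt")
    with open(src, "w") as fh:
        fh.write("hello")
    linked = stage([src], os.path.join(tmp, "vol1"), link=True)
    # A hard link and its source share the same inode.
    same_inode = os.stat(src).st_ino == os.stat(linked[0]).st_ino
    print(same_inode)
```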

K3B and GAFFitter

Download the gaffitter-k3b shell script and follow its usage instructions:

Usage: gaffitter-k3b --cd|--dvd|--cd74 [--split] [--iter <n>] files

Where:

  --cd:     create volumes of 700MB (data CD)
  --dvd:    create volumes of 4.38GB (data DVD)
  --cd74:   create volumes of 650MB (data CD)
  --split:  just splits the input (preserves original order)
  --iter n: maximum number of volumes (default = as much as possible)

Example: gaffitter-k3b --dvd *

Note:

  1. This script should work with filenames containing whitespace and other unusual characters, except for the backquote (`) character, which the script currently uses as its delimiter (this can be changed).

  2. The script is designed to work with GAFFitter 0.5.1 (and possibly with later versions too).

Nautilus Scripts for gaffitter-k3b

If you use Nautilus, then the following scripts may be useful:

gaff-k3b-cd
Packs files/dirs into CD volumes and burns them using K3B
gaff-k3b-cd-split
Packs files/dirs into CD volumes and burns them using K3B, but preserves the order
gaff-k3b-dvd
Packs files/dirs into DVD volumes and burns them using K3B
gaff-k3b-dvd-split
Packs files/dirs into DVD volumes and burns them using K3B, but preserves the order

Installation:

Usage:

  1. Under Nautilus, select the files/dirs to be packed
  2. Right-click the selection and go to the entry Scripts on the context menu
  3. Select one of the above mentioned scripts

Note: these scripts require gaffitter-k3b (see gaffitter.sf.net), awk, sed and echo.

Resources: K3B website, Nautilus File Manager Scripts

Brasero and GAFFitter

Similar to gaffitter-k3b, there is a shell script called gaff-brasero that integrates GAFFitter with the Brasero CD/DVD burner. (Thanks to Mark Edgington.)

Note: Unlike K3B, Brasero follows symbolic links automatically, and at present it provides no means to disable this behaviour. Also, Brasero can manage only one volume at a time, so when the output spans multiple volumes Brasero is called once per volume, sequentially.

Following symbolic links using GNU Disk Usage (du)

GAFFitter doesn't follow symbolic links. However, the same effect can be achieved by using du to obtain the file/dir sizes:

du -Lbs <files/dirs> | gaffitter - --di --di-b <user options>

For example

gaffitter -t 700 --bs 2048 *

is equivalent to

du -Lbs * | gaffitter - --di --di-b -t 700 --bs 2048

except that symbolic links are dereferenced.
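
For readers without GNU du, the following Python sketch produces the same kind of "size identifier" lines in bytes while dereferencing symbolic links. It is an approximation, not a drop-in replacement: unlike du -b it does not count the sizes of the directory entries themselves, it does not guard against symlink loops, and a dangling link raises an error. The names `apparent_size` and `direct_input_lines` are made up for this example.

```python
import os
import tempfile

def apparent_size(path):
    """Total bytes of the files under path, dereferencing symbolic links."""
    if not os.path.isdir(path):        # os.path.isdir follows symlinks
        return os.stat(path).st_size   # os.stat follows symlinks as well
    total = 0
    for root, _dirs, files in os.walk(path, followlinks=True):
        for name in files:
            total += os.stat(os.path.join(root, name)).st_size
    return total

def direct_input_lines(paths):
    """One 'size identifier' line per path, sizes in bytes (for --di --di-b)."""
    return [f"{apparent_size(p)} {p}" for p in paths]

# Demonstration: a symlink is reported with the size of its target.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "data.bin")
    with open(target, "wb") as fh:
        fh.write(b"x" * 2048)
    link = os.path.join(tmp, "link.bin")
    os.symlink(target, link)
    link_size = apparent_size(link)
print(link_size)
```

The lines returned by `direct_input_lines` could then be written to the standard input of gaffitter - --di --di-b, in the same way as the du pipeline above.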

License

GAFFitter is licensed under the GNU General Public License (GPL), version 3 (June 2007) or later.

Related software