twf is a Perl script that turns a family tree (documented in what I'll call a
tree walk format flat file) into a groff/dot diagram. The groff/dot output
generated by twf can then be post-processed (using dot or any of the other
tools in the groff tool set) to generate other, more human-readable formats
such as PDF, which show the relationships within a family tree in a more
diagrammatic/pictorial form.
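For readers who have not seen dot input before, the following is a minimal sketch (in Python, not the Perl of the script itself) of how parent-to-child links can be written in the DOT language. The function name to_dot and the edge list are hypothetical and chosen only for illustration; the output that twf actually generates may be structured differently.

```python
def to_dot(edges):
    """Render (parent, child) pairs as a DOT digraph string."""
    def quote(name):
        # DOT double-quoted IDs only need embedded double quotes escaped.
        return '"' + name.replace('"', '\\"') + '"'

    lines = ["digraph family {"]
    for parent, child in edges:
        lines.append(f"    {quote(parent)} -> {quote(child)};")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Relationships taken from Genesis; the edge list itself is illustrative.
    print(to_dot([("Abraham", "Isaac"), ("Isaac", "Jacob")]))
```

Saving that text as, say, family.dot and running `dot -Tpdf family.dot -o family.pdf` should then produce a PDF rendering of the graph.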
An example file (abe.twf) shows the input format used by the twf script; it
depicts the relationships documented in the Book of Genesis in the Bible,
starting from Adam.
Running make in this directory generates a PDF file that uses abe.twf as the
input file and documents a long but minimal line of descent from Adam all the
way to Abraham's family tree, as per the Book of Genesis in the Bible.
An informal grammar that describes the tree walk format using something that resembles YACC/Bison (while borrowing some additional functionality such as the '%%' delimiter and "rule" notation from Perl6/Raku) is shown below:
// a "tree walk format"/twf input file consists of ...
twf     : ( family '\n' )+                    // ... one or more families, listed one per line
family  : ( person %% '|')+                   // a family consists of persons delimited by '|'
family  : include filename.twf                // pull data from the named file (in twf format)
person  : name ( ',' age )? op                // a person has a name and an optional age
person  : '?'                                 // a person's name may be unknown
person  : '-'                                 // birth order of remaining children is unknown
age     : '?' | number '?'?                   // the age of the person may be unknown ...
                                              // ... which is denoted by a '?' or it could be a known number
                                              // which may be suffixed with a '?' to denote if it is doubtful
number  : [0-9]+                              // age (at time of death) is just a whole number
name    : realname ( \s '(' nickname ')' )?   // a person may have a different
                                              // optional earlier name and/or a nickname
realname: [^\,\(\#\|]+                        // names can have anything including spaces ...
nickname: [^\,\(\#\|\)]+                      // ... except some special characters
op      : [\.\!\^]?                           // the '.' operator denotes no descendants
                                              // a cut/'!' operator indicates a back reference
                                              // a forward pointer/'^' operator indicates that
                                              // the person is resolved later, not immediately
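To make the grammar a little more concrete, here is a rough Python sketch of how a single family line could be split into person entries. It is only one reading of the informal grammar above, not the logic of the twf script itself. The names Person, parse_person and parse_family are hypothetical, stripping whitespace around '|' is an assumption, and include lines are not handled.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# One person entry, following the informal grammar. The realname part is
# matched lazily so that a trailing operator is not swallowed by the name.
PERSON_RE = re.compile(
    r"(?P<realname>[^,(#|]+?)"
    r"(?:\s\((?P<nickname>[^,(#|)]+)\))?"
    r"(?:,(?P<age>\?|[0-9]+\??))?"
    r"(?P<op>[.!^]?)"
)


@dataclass
class Person:
    name: Optional[str]            # None for the '?' and '-' placeholders
    nickname: Optional[str] = None
    age: Optional[int] = None      # None when absent or given as '?'
    age_doubtful: bool = False     # True for ages written like '175?'
    op: str = ""                   # '', '.', '!' or '^'
    placeholder: str = ""          # '?' or '-' when the entry is a placeholder


def parse_person(text: str) -> Person:
    text = text.strip()  # assumption: whitespace around '|' is not significant
    if text in ("?", "-"):
        return Person(name=None, placeholder=text)
    match = PERSON_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a valid person entry: {text!r}")
    age_text = match["age"]
    age, doubtful = None, False
    if age_text and age_text != "?":
        doubtful = age_text.endswith("?")
        age = int(age_text.rstrip("?"))
    return Person(match["realname"].strip(), match["nickname"],
                  age, doubtful, match["op"])


def parse_family(line: str) -> List[Person]:
    # A family is a list of person entries delimited by '|'.
    return [parse_person(entry) for entry in line.split("|")]
```

For example, `parse_family("Abraham (Abram),175? | Isaac! | ? | -")` (a made-up line, not taken from abe.twf) yields four entries: a person with a nickname and a doubtful age, a person carrying the cut operator, and the two placeholders.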
Note that in genealogy research the age of the deceased may be unknown, and is sometimes disputed or unbelievable (such as the ages listed for Abraham and his family). Both cases are denoted using '?': a bare '?' for an unknown age, and a '?' suffix on the number for a doubtful one.
When a marriage between relatives occurs (for example, when Isaac marries his cousin Rebekah), the family "tree" is, mathematically speaking, no longer a tree but a more general graph, because some descendants can be reached along more than one path. To handle such cases, the cut/'!' operator denotes that the current referent may already be embedded in the stack due to an earlier reference in the tree, which resolves the looping that could otherwise result. The cut/'!' operator is "borrowed" from Prolog, but since there is no backtracking involved here, it is not used in quite the same sense as in Prolog.
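The following Python sketch illustrates why such a marriage breaks the tree property. It is not how twf implements the cut operator; it only shows that a naive walk reaches the same person twice. The dictionary CHILDREN and the helper names count_paths and is_tree are hypothetical, and the edges list only the people from Genesis needed for the example.

```python
# Parent -> children edges (only the people needed for this illustration).
CHILDREN = {
    "Terah": ["Abraham", "Nahor"],
    "Abraham": ["Isaac"],
    "Nahor": ["Bethuel"],
    "Bethuel": ["Rebekah"],
    "Isaac": ["Jacob"],
    "Rebekah": ["Jacob"],
}


def count_paths(children, start, target):
    """Count the distinct lines of descent from start to target."""
    if start == target:
        return 1
    return sum(count_paths(children, child, target)
               for child in children.get(start, []))


def is_tree(children, root):
    """Return True only if a walk from root reaches each person exactly once."""
    seen = set()
    stack = [root]
    while stack:
        person = stack.pop()
        if person in seen:
            return False  # reached a second time: not a tree
        seen.add(person)
        stack.extend(children.get(person, []))
    return True


if __name__ == "__main__":
    print(count_paths(CHILDREN, "Terah", "Jacob"))  # prints 2
    print(is_tree(CHILDREN, "Terah"))               # prints False
```

Jacob descends from Terah both through Isaac and through Rebekah, so a walk that does not remember whom it has already visited would expand his branch twice.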
In some cases, relationships are complicated enough (for example, see Lot's family tree) that the cut/'!' operator needs a paired forward pointer/'^' reference operator as well. This operator postpones the lookup of the person in the immediately following family entry, deferring it until a later cut/'!' operator resolves it in the stack.
Turning the twf format into the better-known GEDCOM format is left as an
exercise for the reader. The primary (only?) advantage of twf over GEDCOM is
that it is meant to be hand-editable using any text editor of your choice.