tree.hh: an STL-like C++ tree class

Kasper Peeters, kasper.peeters (at) gmail.com

Overview

The tree.hh library for C++ provides an STL-like container class for n-ary trees, templated over the data stored at the nodes. Various types of iterators are provided (post-order, pre-order, and others). Where possible, the access methods are compatible with the STL; where they are not, alternative algorithms are available. The library is available under the terms of the GNU General Public License version 2 or 3.
Documentation is available in the form of a PDF file (also available in the tarball as a LaTeX file). See the test program (included in the distribution) for an example of how to use tree.hh. Also look at the simple example below. There is also some Doxygen-generated documentation.
The tree.hh library is meant for generic n-ary trees. If you are only interested in AVL binary search trees (Adelson-Velskii and Landis), you may want to have a look at the C++ AVL tree template page.

License

The tree.hh code is available under the terms of the GNU General Public License 2 or 3. If you would like to use tree.hh under different conditions, contact me and we will work something out.
If you use tree.hh, please satisfy my curiosity and write me a small email with a bit of explanation of your software and the role of my tree class in it.

Download

Everything (the header file, examples, documentation and all other things referred to on this page) is contained in the tarball
tree-2.65.tar.gz
Feel free to copy the header tree.hh (which is all you need code-wise) into your own source directory as long as you respect the license (see above). The list of changes can be found in the ChangeLog.
See the intro above for links to the documentation. There is a very simple demonstration program available, tree_example.cc (also included in the tarball), which is discussed below. There is also a small test program, test_tree.cc, which makes use of the tree_util.hh utility functions by Linda Buisman; the output should be exactly identical to the test_tree.output file.
The current version works with GNU gcc 3.x and higher, Borland C++ Builder, and Microsoft Visual C++ 7.1 and higher (I no longer support older versions of Visual C++). It is compatible with STLport.

Mailing list

There is a mailing list for tree.hh, which is mostly used for announcements of new releases, but is also open for discussions about tree.hh which are of general interest. To subscribe, please visit the tree-hh mailing list web page.
I also announce major updates on Freshmeat, though not as often as by email.

Projects using tree.hh

The tree.hh library is used in various projects:
Cadabra
A field-theory motivated approach to symbolic computer algebra.
Gnash
Gnash is a GNU Flash movie player. Previously, it was only possible to play flash movies with proprietary software. While there are some other free flash players, none support anything beyond SWF v4. Gnash is based on GameSWF, and supports many SWF v7 features.
Principles of Compiler Design
A course in compiler design at the Simon Fraser University, Canada.
liborigin
A library for reading OriginLab OPJ project files, which is used by QtiPlot and LabPlot, two applications for data analysis and visualisation.
EChem++
A project realizing the idea of a Problem Solving Environment (PSE) in the field of computational electrochemistry. Computer controlled experimental measurements, numerical simulation and analysis of electrochemical processes will be combined under a common user interface.
LZCS
A semistructured document transformation tool. LZCS compresses structured documents by taking advantage of the redundant information that can appear in the structure. The main idea is that frequently repeated subtrees may exist and these can be replaced by a backward reference to their first occurrence. See the accompanying paper for more details.
libOFX
A parser and an API designed to allow applications to very easily support OFX command responses, usually provided by financial institutions for statement downloads.
A genetic programming project
See this paper for more information.
FreeLing
The FreeLing package consists of a library providing language analysis services (such as morphological analysis, date recognition, PoS tagging, etc.).
Let me know about your project if you are using tree.hh, so that I can add it to the list.

Simple example

The following program constructs a tree of std::string nodes, puts some content in it and applies the find algorithm to find the node with content "two". It then prints the content of all the children of this node. You can download the source tree_example.cc if you're too lazy to type it in.
#include <algorithm>
#include <string>
#include <iostream>
#include "tree.hh"

using namespace std;

int main(int, char **)
   {
   tree<string> tr;
   tree<string>::iterator top, one, two, loc, banana;

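   // Build the tree: insert "one" at the top, then hang the other nodes below it.
   top=tr.begin();
   one=tr.insert(top, "one");
   two=tr.append_child(one, "two");
   tr.append_child(two, "apple");
   banana=tr.append_child(two, "banana");
   tr.append_child(banana,"cherry");
   tr.append_child(two, "peach");
   tr.append_child(one,"three");

   // Locate the node holding "two" with the standard find algorithm.
   loc=find(tr.begin(), tr.end(), "two");
   if(loc!=tr.end()) {
      // First pass: a sibling_iterator visits only the direct children of "two".
      tree<string>::sibling_iterator sib=tr.begin(loc);
      while(sib!=tr.end(loc)) {
         cout << (*sib) << endl;
         ++sib;
         }
      cout << endl;
      // Second pass: a normal iterator walks depth-first over everything below "two".
      tree<string>::iterator sib2=tr.begin(loc);
      tree<string>::iterator end2=tr.end(loc);
      while(sib2!=end2) {
         // Indent relative to the direct children of "two", which sit at depth 2.
         for(int i=0; i<tr.depth(sib2)-2; ++i)
            cout << " ";
         cout << (*sib2) << endl;
         ++sib2;
         }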
      }
   }
The output of this program is
apple
banana
peach

apple
banana
 cherry
peach
Note that this example only has one element at the top of the tree (in this case that is the node containing "one") but it is possible to have an arbitrary number of such elements (then the tree is more like a "bush"). Observe the way in which the two types of iterators work. The first block of output, obtained using the sibling_iterator, only displays the children directly below "two". The second block iterates over all descendants of "two", at any depth. In the second output block, the depth member has been used to determine the distance of a given node to the root of the tree; subtracting 2, the depth of the direct children of "two", gives the indentation.
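To make the depth-based indentation concrete, here is a minimal sketch that does not use tree.hh at all. It assumes a hypothetical Node type in which every node stores a pointer to its parent, and hypothetical helpers depth() and indented(); these names are illustrative only and are not part of the library. Counting parent links gives 0 for a node at the top, which is consistent with the example output above.

```cpp
#include <cassert>
#include <string>

// Hypothetical minimal node: only the parent link matters for this sketch.
struct Node {
    std::string data;
    const Node* parent;   // nullptr for a node at the top of the tree
};

// Number of parent links between `n` and the top of the tree (0 for a top node).
int depth(const Node& n) {
    int d = 0;
    for (const Node* p = n.parent; p != nullptr; p = p->parent)
        ++d;
    return d;
}

// Prefix the node content with one space per level below `base_depth`,
// mirroring the inner for-loop of the example program.
std::string indented(const Node& n, int base_depth) {
    assert(depth(n) >= base_depth);
    const std::string::size_type pad =
        static_cast<std::string::size_type>(depth(n) - base_depth);
    return std::string(pad, ' ') + n.data;
}
```

With the example tree, "apple" sits at depth 2 and "cherry" at depth 3, so calling indented with a base depth of 2 leaves "apple" flush left and puts one space in front of "cherry".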

Data structure

The data structure of the tree class is depicted below (see the documentation for more detailed information). Each node contains a pointer to the first and last child element, and each child contains pointers to its previous and next sibling:
           first_child        first_child 
 root_node-+----------node--+----->-------node
           |           |    |               |   
           |           |    |               V   next_sibling
           |           |    |               |
                       |    |             node
                       |    |               |
                       |    |               V   next_sibling
                       |    | last_child    |
                       |    +----->-------node
                       |                        
                       V next_sibling           
                       |                       
                       |     first_child                  
                      node--+----->-------node
                       |    |               |   
                       |    |               V   next_sibling
                       |    |               |
                       |    +-------------node
                       .
                       .
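The linking scheme in the diagram can be sketched in a few lines. This is only an illustration of the idea, not the node type used inside tree.hh: the Node template, the append_child free function and number_of_children are hypothetical names chosen for this sketch, and memory management is left to the caller.

```cpp
#include <cassert>
#include <string>

// Hypothetical node layout mirroring the diagram; not the actual tree.hh type.
template <class T>
struct Node {
    T     data;
    Node* first_child  = nullptr;
    Node* last_child   = nullptr;
    Node* prev_sibling = nullptr;
    Node* next_sibling = nullptr;
    explicit Node(const T& value) : data(value) {}
};

// Link `child` as the new last child of `parent`. Because the parent keeps
// a last_child pointer, this takes constant time regardless of fan-out.
template <class T>
Node<T>* append_child(Node<T>* parent, Node<T>* child) {
    assert(parent != nullptr && child != nullptr);
    child->prev_sibling = parent->last_child;
    child->next_sibling = nullptr;
    if (parent->last_child != nullptr)
        parent->last_child->next_sibling = child;   // extend the sibling chain
    else
        parent->first_child = child;                // first child of this parent
    parent->last_child = child;
    return child;
}

// Count the direct children by following the next_sibling links.
template <class T>
int number_of_children(const Node<T>* parent) {
    assert(parent != nullptr);
    int count = 0;
    for (const Node<T>* c = parent->first_child; c != nullptr; c = c->next_sibling)
        ++count;
    return count;
}
```

Appending "apple", "banana" and "peach" to a "two" node in that order leaves first_child pointing at "apple", last_child at "peach", and "banana" linked to both neighbours through prev_sibling and next_sibling.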
Iterators come in two types. The normal iterator iterates depth-first over all nodes. The beginning and end of the tree can be obtained by using the begin() and end() members. The other type of iterator only iterates over the nodes at one given depth (i.e. over all siblings). One typically uses these iterators to iterate over all children of a node, in which case the [begin,end) range can be obtained by calling begin(iterator) and end(iterator).
Iterators can be converted from one type to the other; this includes the "end" iterators (all intervals are, as usual, closed at the beginning and open at the end).
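The difference between the two kinds of traversal can be sketched on a hand-built node structure. The code below is an assumption-laden illustration rather than the iterator implementation in tree.hh: Node, append_child, next_pre_order, children_of and descendants_of are hypothetical names, and the parent pointer is added here so that the depth-first walk can climb back up.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical node type for illustration; the parent link is an assumption
// added so that a traversal can climb back up the tree.
struct Node {
    std::string data;
    Node* parent       = nullptr;
    Node* first_child  = nullptr;
    Node* last_child   = nullptr;
    Node* prev_sibling = nullptr;
    Node* next_sibling = nullptr;
    explicit Node(const std::string& value) : data(value) {}
};

// Link `child` as the last child of `parent`.
void append_child(Node& parent, Node& child) {
    child.parent       = &parent;
    child.prev_sibling = parent.last_child;
    if (parent.last_child != nullptr) parent.last_child->next_sibling = &child;
    else                              parent.first_child = &child;
    parent.last_child = &child;
}

// One depth-first (pre-order) step inside the subtree rooted at `top`:
// go to the first child if there is one, otherwise to the next sibling,
// otherwise climb until an ancestor below `top` has a next sibling.
// Returns nullptr once the subtree is exhausted.
const Node* next_pre_order(const Node* n, const Node* top) {
    assert(n != nullptr && top != nullptr);
    if (n->first_child != nullptr) return n->first_child;
    while (n != top) {
        if (n->next_sibling != nullptr) return n->next_sibling;
        n = n->parent;
    }
    return nullptr;
}

// Sibling-style traversal: direct children only.
std::vector<std::string> children_of(const Node& parent) {
    std::vector<std::string> out;
    for (const Node* c = parent.first_child; c != nullptr; c = c->next_sibling)
        out.push_back(c->data);
    return out;
}

// Depth-first traversal: every node below `top`, at any depth.
std::vector<std::string> descendants_of(const Node& top) {
    std::vector<std::string> out;
    for (const Node* n = next_pre_order(&top, &top); n != nullptr;
         n = next_pre_order(n, &top))
        out.push_back(n->data);
    return out;
}
```

Built with the same content as the simple example, children_of applied to "two" yields apple, banana, peach, while descendants_of additionally visits cherry between banana and peach, matching the two output blocks shown earlier.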