Infrastructure Perspective: Forest Controls
Overview
Consider the problem of listing all existing software. Software naturally falls into categories. A simple example follows:
+ Linux
  - Administrative
  - Databases
  - Publishing
    o LaTeX
    o Star Office
  - Web
    o Netscape
    o Mozilla
  - Email
+ FreeBSD
+ Windows
+ VMWare
+ DOS
and so on. This can be visualised conceptually as a tree (a forest is, by definition, a collection of unrelated trees), or as a kind of file system. Each node (an inode, if you think in terms of a file system) stores pointers which help us find the actual information. The actual information for each node is stored in static strings, and pointers to them (their s_str_id's) are stored with the node.
This kind of hierarchical data cannot be stored in a relational database in a natural format, so we flatten the tree into a list and store that list in the database. In file system terms, we store the full path name of every file in the database. This flattened information is enough to reconstruct the tree (i.e. given the full path names of all the files, we can reconstruct the directory structure). The Forest Control implements this unflattening algorithm and also helps render the information as a web page.
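The unflattening idea can be sketched in a few lines. This is only an illustration of rebuilding a tree from full path names, not the Forest Control's actual code; the function name unflatten and the use of nested dictionaries are assumptions made for the example.

```python
def unflatten(paths):
    """Rebuild a nested-dictionary tree from full path names such as '/linux/web'.

    Hypothetical helper: each path component becomes a key, and its value is
    the dictionary of its children.
    """
    root = {}
    for path in paths:
        node = root
        for part in path.strip("/").split("/"):
            # Create the child on first sight, then descend into it.
            node = node.setdefault(part, {})
    return root


paths = ["/linux", "/linux/web", "/linux/web/mozilla", "/linux/publish/latex", "/dos"]
tree = unflatten(paths)
```

Note that "/linux/publish" never appears in the list on its own; the sketch still creates it as an intermediate node because it is part of a longer path.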
The faqs table
+-------------+--------------+
| Field       | Type         |
+-------------+--------------+
| doc_node_id | int(11)      |
| path_desc   | varchar(255) |
| keywords    | text         |
| long_num    | int(11)      |
| short_num   | int(11)      |
| desc_text   | text         |
| private     | int(11)      |
+-------------+--------------+
The doc_node_id field is the primary key, which exists for purely technical reasons. The path_desc field contains the full path name of the node. The keywords field contains the keywords for this node, to facilitate searching. The desc_text field contains a short description of the contents of the node. The private field indicates whether this node should be available for public viewing or for techstaff viewing only. The actual data of the node is stored in two static strings, whose ids are short_num and long_num respectively. Except for path_desc, all fields are optional.
How the short_num and long_num fields are used may vary. In our software listing example, the static string keyed under short_num may contain a short description of the data (more detailed than desc_text), and the one keyed under long_num the full description. If we are using the table to represent a FAQ, then short_num's static string could hold the question the FAQ is answering and long_num's the answer. The contents of the table representing the software listing shown above are:
(path_desc), (desc_text)
/linux, Linux
/windows, Windows
/vmware, VMWare
/dos, DOS
/freebsd, FreeBSD
/linux/adm, Administrative
/linux/web, Web
/linux/email, Email
/linux/publish, Publishing
/linux/publish/latex, LaTeX
/linux/web/mozilla, Mozilla
........
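To make the flattened storage concrete, here is a small sketch that loads a few of these rows and reads them back as a list of dictionaries. It uses an in-memory SQLite database from the Python standard library purely as a stand-in; the real database, its connection code, and the exact query are not described here and are assumptions.

```python
import sqlite3

# In-memory SQLite stands in for the real database (an assumption).
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows can then be converted to dictionaries

conn.execute(
    """CREATE TABLE faqs (
           doc_node_id INTEGER PRIMARY KEY,
           path_desc   varchar(255),
           keywords    text,
           long_num    int(11),
           short_num   int(11),
           desc_text   text,
           private     int(11)
       )"""
)

rows = [
    ("/linux", "Linux"),
    ("/linux/web", "Web"),
    ("/linux/web/mozilla", "Mozilla"),
    ("/dos", "DOS"),
]
conn.executemany("INSERT INTO faqs (path_desc, desc_text) VALUES (?, ?)", rows)

# One dictionary per row, sorted by path so that parents precede children.
records = [
    dict(row)
    for row in conn.execute(
        "SELECT path_desc, desc_text FROM faqs ORDER BY path_desc"
    )
]
```

Only path_desc and desc_text are filled in, since every other field is optional.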
How does the Control work?
The Forest Control does the following:
- Calls display.fetch_records(), which returns a list of dictionaries. Usually fetch_records executes an SQL query, modifies each dictionary (if need be), and returns the list of dictionaries.
- Unflattens this list and constructs a tree representation of the data. Each node keeps all the information from its original record.
- Creates an instance of TheNodeControl (some descendant of NodeControl), which was passed in as an argument to the __init__ of this control. NodeControls can have children which are themselves controls, so this is used to create a tree of NodeControls, where each node in the tree is instantiated with the corresponding data.
- Finally, it calls the output method of the top-level instance of TheNodeControl. Each instance of NodeControl is responsible for making sure that its children get called (if need be).
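The steps above can be sketched end to end as follows. Every class here (TreeNode, ListNodeControl, Display, ForestControl) is a simplified, hypothetical stand-in written for this example, and the constructor signatures are assumptions; only the order of the steps follows the description.

```python
class TreeNode:
    """One node of the unflattened tree; record is the row dictionary, if any."""
    def __init__(self, name, record=None):
        self.name = name
        self.record = record
        self.children = {}


def build_tree(records, path_field):
    """Step 2: unflatten the list of dictionaries into a tree."""
    root = TreeNode("")
    for rec in records:
        node = root
        for part in rec[path_field].strip("/").split("/"):
            node = node.children.setdefault(part, TreeNode(part))
        node.record = rec  # the node keeps its whole original record
    return root


class ListNodeControl:
    """Hypothetical NodeControl descendant that emits indented text lines."""
    def __init__(self, tree_node, level=0):
        self.tree_node = tree_node
        self.level = level
        # Step 3: mirror the data tree with a tree of controls.
        self.children = [type(self)(child, level + 1)
                         for child in tree_node.children.values()]

    def output(self, container):
        if self.tree_node.record is not None:
            indent = "  " * (self.level - 1)
            container.append(indent + self.tree_node.record["desc_text"])
        for child in self.children:  # each control calls its own children
            child.output(container)


class Display:
    """Hypothetical display class; a real one would usually run an SQL query."""
    def fetch_records(self, path_field):
        return [
            {path_field: "/linux", "desc_text": "Linux"},
            {path_field: "/linux/web", "desc_text": "Web"},
            {path_field: "/linux/web/mozilla", "desc_text": "Mozilla"},
            {path_field: "/dos", "desc_text": "DOS"},
        ]


class ForestControl:
    def __init__(self, display, node_control_class, path_field="path_desc"):
        self.display = display
        self.node_control_class = node_control_class
        self.path_field = path_field

    def output(self, container):
        records = self.display.fetch_records(self.path_field)  # step 1
        tree = build_tree(records, self.path_field)            # step 2
        top = self.node_control_class(tree)                    # step 3
        top.output(container)                                  # step 4
        return container


lines = ForestControl(Display(), ListNodeControl).output([])
# lines == ["Linux", "  Web", "    Mozilla", "DOS"]
```

A plain Python list plays the role of the output container here; the real containers are presumably page elements.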
What does the NodeControl do?
The output method of a NodeControl does the following:
- Calls self.allow_output to see whether this control should output anything at all.
- If so, it calls node_output. node_output should return the container into which all its children should output. This can be the container that node_output received, a new container (which should have been inserted into the original one), or None.
- If the return value of node_output (or the original container, if allow_output returned false) is not None, the output methods of the children are called with this container as their original container.
- Otherwise, the children don't get to output anything at all.
Hence one call to the output method of the top level NodeControl ends up calling the output methods of the entire tree. All the creativity for a particular page goes into designing the node_output functions.
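A minimal sketch of this protocol, assuming a list as the container and a constructor that takes a name and a list of children (both assumptions; only allow_output, node_output and output are names taken from the description):

```python
class NodeControl:
    """Simplified stand-in showing the allow_output / node_output protocol."""
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

    def allow_output(self):
        return True

    def node_output(self, container):
        """Render this node; return the container for the children, or None."""
        container.append(self.name)
        return container

    def output(self, container):
        if self.allow_output():
            container = self.node_output(container)
        # If allow_output was false, the original container is passed through.
        if container is not None:
            for child in self.children:
                child.output(container)


class NestedNodeControl(NodeControl):
    def node_output(self, container):
        inner = [self.name]      # a new container ...
        container.append(inner)  # ... inserted into the original one
        return inner


class SilentNodeControl(NodeControl):
    def allow_output(self):
        return False             # outputs nothing, children still run


class StopNodeControl(NodeControl):
    def node_output(self, container):
        container.append(self.name)
        return None              # children do not get to output


tree = NestedNodeControl("linux", [
    SilentNodeControl("hidden", [NodeControl("web")]),
    StopNodeControl("publish", [NodeControl("latex")]),
])
page = []
tree.output(page)
# page == [["linux", "web", "publish"]]
```

The three subclasses cover the three outcomes: a new nested container, a control that is skipped while its children still render, and a control that cuts off its subtree.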
What is a CaseNodeControl object?
    class MyNodeControl(CaseNodeControl,
                        faqcontrols.TextOnlyNodeControl,
                        faqcontrols.TextULNodeControl):
        """This is the node control which masquerades as a
        TextULNodeControl at level 1, and as a TextOnlyNodeControl
        at level 2. All the masquerading code is in CaseNodeControl.
        You have to inherit from the other controls for
        typecorrectness."""

        DictClasses = {
            1 : faqcontrols.TextULNodeControl,
            2 : faqcontrols.TextOnlyNodeControl
        }
The class above creates a NodeControl object which, when displaying nodes at level 1, does what TextULNodeControl does, at level 2 does what TextOnlyNodeControl does, and at all other levels does not display anything.
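One way such level-based masquerading could work is sketched below. This is a guess at the mechanism, not the real CaseNodeControl: the base classes are toy stand-ins, the constructor signature and the level attribute are assumptions, and children are left out for brevity.

```python
class NodeControl:
    """Toy base class; the real NodeControl also manages child controls."""
    def __init__(self, text, level):
        self.text = text
        self.level = level

    def allow_output(self):
        return True

    def output(self, container):
        if self.allow_output():
            self.node_output(container)


class TextULNodeControl(NodeControl):
    def node_output(self, container):
        container.append("<ul><li>%s</li></ul>" % self.text)
        return container


class TextOnlyNodeControl(NodeControl):
    def node_output(self, container):
        container.append(self.text)
        return container


class CaseNodeControl(NodeControl):
    """Hypothetical dispatcher: picks behaviour from DictClasses by level."""
    DictClasses = {}

    def allow_output(self):
        # Levels without an entry display nothing.
        return self.level in self.DictClasses

    def node_output(self, container):
        # Borrow node_output from the class registered for this level.
        return self.DictClasses[self.level].node_output(self, container)


class MyNodeControl(CaseNodeControl, TextOnlyNodeControl, TextULNodeControl):
    DictClasses = {1: TextULNodeControl, 2: TextOnlyNodeControl}


out = []
for level, text in [(1, "Linux"), (2, "Web"), (3, "Mozilla")]:
    MyNodeControl(text, level).output(out)
# out == ["<ul><li>Linux</li></ul>", "Web"]
```

Because CaseNodeControl comes first in the list of base classes, its node_output is the one that gets called, and it delegates explicitly to the class chosen for the current level.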
What can I do in fetch_records?
There can be more than one forestcontrol in a page (I don't have an actual instance of such a page). Each of them calls the same fetch_records function, which is part of your display class. To help tell them apart, fetch_records is passed a parameter: the name of the field that holds the path information. So in the previous example, fetch_records is passed the value "node_path" as an argument. You can use this to distinguish different controls. I don't expect to need multiple forestcontrols in one page anytime soon, though.
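A sketch of how a display class might branch on that parameter. The class name, the second field name "software_path", and the in-memory rows are all hypothetical; a real fetch_records would usually run a different SQL query in each branch.

```python
class MyDisplay:
    """Hypothetical display class serving two forestcontrols on one page."""

    def fetch_records(self, path_field):
        if path_field == "node_path":
            # Records for the first forestcontrol, e.g. a FAQ tree.
            rows = [("/linux", "Linux"), ("/linux/web", "Web")]
        elif path_field == "software_path":
            # Records for a second, made-up forestcontrol.
            rows = [("/editors", "Editors"), ("/editors/vi", "vi")]
        else:
            raise ValueError("unknown path field: %r" % path_field)
        # Return a list of dictionaries keyed by the requested path field.
        return [{path_field: path, "desc_text": desc} for path, desc in rows]


display = MyDisplay()
faq_records = display.fetch_records("node_path")
software_records = display.fetch_records("software_path")
```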
A working example