STOP PRESS: VIPER v1.01 released. Visit VIPER VERSION 1.01 for documentation and to download.
This project aims to produce a tool to remove errors in animal pedigree information caused by administrative and data handling faults. Large amounts of animal pedigree and characteristic data are logged and stored during the course of animal breeding studies. However, to be of any use for further programmes or analysis the data needs to be as free of error as possible.
Errors in data storage, such as recording the wrong father for an animal or an unnoticed change in associated gene data, are easy to introduce when hundreds or thousands of individual animals are being dealt with. Unfortunately, while it is relatively easy to process this data to detect that errors exist, finding and correcting their cause is more difficult. For example, it isn't straightforward to know whether an error lies in the pedigree (i.e. the child-parent relationships) or in the characteristics associated with the animals. An animal may be recorded as having a certain characteristic that, on examination, could not have been inherited from its two recorded parents.
The error may arise if the recording of one or both of the parents is wrong, if the recording of the characteristic in the child animal is incorrect, or if the characteristic recorded for one of the parent animals is wrong. To understand what has gone wrong, further examination of the problem animals' relations in the pedigree is necessary. In a text or spreadsheet-based document this quickly becomes tedious and confusing, even when the operations to detect and show errors in the data are available. However, if we were to switch to a more graphical, user-friendly style of displaying the data, it would be easier to follow relationships in the pedigree. If we then added the capability to show interactively where errors occur and what could have caused them, we would have a way of examining the pedigree data and asking questions that would clear up or narrow down errors. Such a way of displaying and interacting with data is called Information Visualisation (IV).
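The kind of inconsistency described above, where a recorded characteristic could not have come from the two recorded parents, can be sketched as a simple check. This is only an illustration under stated assumptions: each characteristic is modelled as a pair of alleles at a single marker, and the animal names, the `pedigree` and `genotypes` layouts, and the function names are all hypothetical rather than taken from VIPER itself.

```python
def is_consistent(child, sire, dam):
    """Return True if the child's genotype could come from these parents.

    Each genotype is a pair of alleles, e.g. ("A", "B"). The child must
    receive one allele from the sire and the other from the dam.
    """
    c1, c2 = child
    return (c1 in sire and c2 in dam) or (c2 in sire and c1 in dam)


def find_errors(pedigree, genotypes):
    """List every child whose genotype conflicts with its recorded parents."""
    errors = []
    for child, (sire, dam) in pedigree.items():
        if not is_consistent(genotypes[child], genotypes[sire], genotypes[dam]):
            errors.append(child)
    return errors


# Hypothetical records: animal -> (sire, dam), and animal -> genotype.
pedigree = {"calf1": ("bull1", "cow1"), "calf2": ("bull1", "cow2")}
genotypes = {
    "bull1": ("A", "A"),
    "cow1": ("A", "B"),
    "cow2": ("B", "B"),
    "calf1": ("A", "B"),
    "calf2": ("B", "B"),  # needs a B from bull1, which bull1 does not carry
}

print(find_errors(pedigree, genotypes))  # ['calf2']
```

The check reports that `calf2` is inconsistent, but not why: the sire record, the sire's genotype, or the calf's genotype could each be at fault, which is exactly the ambiguity discussed above.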
Unlike human family trees, most recorded animal pedigrees have a large degree of in-breeding as scientists and breeders try to encourage certain characteristics through selective breeding. This makes the drawing of animal pedigrees more complex as two individuals may end up being related through two or more routes.
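To make "related through two or more routes" concrete, the sketch below counts the distinct lines of descent from an ancestor to a descendant in a small in-bred pedigree. The animals, the `parents` mapping, and the `count_routes` function are assumptions made for illustration only.

```python
from functools import lru_cache

# Hypothetical in-bred pedigree: animal -> (sire, dam).
# Founders (animals with no recorded parents) are simply absent as keys.
parents = {
    "son": ("founder_bull", "cow_a"),
    "daughter": ("founder_bull", "cow_b"),
    "grandchild": ("son", "daughter"),
}


def count_routes(ancestor, descendant, parents):
    """Count distinct lines of descent from ancestor down to descendant."""

    @lru_cache(maxsize=None)
    def routes(animal):
        if animal == ancestor:
            return 1
        # Sum the routes through each recorded parent; founders contribute 0.
        return sum(routes(p) for p in parents.get(animal, ()))

    return routes(descendant)


print(count_routes("founder_bull", "grandchild", parents))  # 2
```

Here `grandchild` descends from `founder_bull` through both `son` and `daughter`, so a layout that draws each animal once must show two converging paths rather than a simple tree.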
By extending current IV techniques for this type of data, this project will make the interface less complex by interactively showing only selected individuals and their relationships. On top of this, scientists will also wish to view some display of the characteristics associated with the animals, and again the complexity can be reduced by viewing only a handful of characteristics at a time. Even so, one male animal can easily sire dozens of children, who are in turn related to dozens of female parents and may in turn have children of their own, and there may be several characteristics at a time that a scientist is interested in exploring for these animals.
Methods for seamlessly moving from showing one part of a pedigree to another will be developed to help scientists explore massive pedigrees. Once an initial interface is built, a means for exploring errors by asking 'what-if' questions will be developed.
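One possible form of 'what-if' question is: "if this animal's recorded characteristic were ignored, how many inheritance errors would remain?" The sketch below is a hypothetical illustration of that idea, not a description of how the tool answers such questions. It assumes single-marker genotypes stored as allele pairs and treats a missing genotype as compatible with anything; all names and data are invented.

```python
def consistent(child, sire, dam):
    """Check inheritance; a genotype of None is unknown and fits anything."""
    if child is None:
        return True
    c1, c2 = child

    def can_give(parent, allele):
        return parent is None or allele in parent

    return (can_give(sire, c1) and can_give(dam, c2)) or (
        can_give(sire, c2) and can_give(dam, c1)
    )


def count_errors(pedigree, genotypes):
    """Count children whose genotype conflicts with their recorded parents."""
    return sum(
        not consistent(genotypes.get(c), genotypes.get(s), genotypes.get(d))
        for c, (s, d) in pedigree.items()
    )


def what_if_unknown(pedigree, genotypes):
    """For each animal, count the errors left if its genotype is ignored."""
    result = {}
    for animal in genotypes:
        masked = {k: v for k, v in genotypes.items() if k != animal}
        result[animal] = count_errors(pedigree, masked)
    return result


# Hypothetical data: one bull with three calves by different cows.
pedigree = {
    "calf1": ("bull1", "cow1"),
    "calf2": ("bull1", "cow2"),
    "calf3": ("bull1", "cow3"),
}
genotypes = {
    "bull1": ("A", "A"),
    "cow1": ("B", "B"), "cow2": ("A", "B"), "cow3": ("B", "B"),
    "calf1": ("B", "B"), "calf2": ("B", "B"), "calf3": ("A", "B"),
}

print(count_errors(pedigree, genotypes))  # 2
print(what_if_unknown(pedigree, genotypes)["bull1"])  # 0
```

In this invented example, ignoring the genotype of `bull1` clears both errors, whereas ignoring either affected calf clears only one, which points towards the bull's record as the more likely root cause.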
The VIPER project is developing an interactive software tool for the visual exploration of pedigree-genotype datasets, which will support the analysis and resolution of inheritance errors in these datasets. The VIPER visualisation will display pedigree hierarchies with genotype inheritance patterns overlaid. Coupled with a robust genetic inference engine backend, the interface will allow an expert user to identify and explore inheritance inconsistencies, determine their potential root causes, and thus facilitate appropriate data cleaning.