so that my time is effectively used. My new position allows me to put time into OntoVistA, especially where it impacts configuration and implementation of VistA, and where it supports clinical decision support.
On the one hand, there is a LOT of information in VistA. On the other hand, there is a LOT of information in the various extant ontologies, such as SUMO and Cyc. Unfortunately, VistA and the public ontologies don't represent the same information in the same way. VistA is organized as a dynamic database system, with object-oriented references and a multi-dimensional datastore. Ontologies, by their nature, express things in terms of axioms, or always-true statements. What might be stored in a procedural way in a Hospital Information System will probably be stored in a declarative fashion in a knowledge base (KB).
I think I might have a way to mechanically convert some of the information from the database-centric form of FileMan or a network database into the predicate logic form used by a KB. But a mechanical conversion carries the risk of losing meaning. The large size of VistA (tens of thousands of programs and tens of thousands of rows and columns) almost requires a mechanical process; otherwise the encoder will spend hundreds of hours making the same decision repeatedly. There is no requirement that a mechanical process be a simplistic one. It is better to spend the human hours reviewing the mechanical process and making it more robust, so that it reflects the decisions made by untold analysts and designers in the past.
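To make the idea of a mechanical conversion concrete, here is a minimal sketch in Python. It is not FileMan code and not the OntoVistA method. The record layout, the field names, and the functions `record_to_facts` and `render` are all hypothetical, and the output only loosely imitates the parenthesized predicate style used by KBs such as SUMO.

```python
import json
from typing import Any, Dict, List, Tuple

# A fact is (predicate, subject, value). This shape is an assumption
# made for illustration, not a format defined by VistA or any ontology.
Fact = Tuple[str, str, Any]


def record_to_facts(file_name: str, record_id: str,
                    record: Dict[str, Any]) -> List[Fact]:
    """Turn one database-style record into flat declarative facts."""
    subject = f"{file_name}-{record_id}"
    # State what kind of thing the record describes.
    facts: List[Fact] = [("instance", subject, file_name)]
    for field, value in record.items():
        if value is None:
            continue  # an empty field asserts nothing
        predicate = field.lower().replace(" ", "_")
        facts.append((predicate, subject, value))
    return facts


def render(fact: Fact) -> str:
    """Render a fact as a parenthesized predicate expression."""
    predicate, subject, value = fact
    # Class names stay bare symbols; other values are quoted as data.
    shown = value if predicate == "instance" else json.dumps(value)
    return f"({predicate} {subject} {shown})"


if __name__ == "__main__":
    # Hypothetical record; the field names are invented for this example.
    row = {"NAME": "DOE,JOHN", "AGE": 42, "SEX": None}
    for fact in record_to_facts("Patient", "1", row):
        print(render(fact))
```

Even this toy version shows where meaning can leak away: nothing in the output says what units AGE is in, or that the NAME field follows a "LAST,FIRST" convention. Those are the decisions worth spending human review time on.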
Since a lot of the meaning in VistA is established by the programs which use the data,
if a reasoning system is going to be able to use the same data, there is a need to make
some facts and information about those facts (sometimes called metadata) explicit.
Humans have a great ability to extrapolate a lot of information from a few facts;
a reasoning system does not, so what people infer silently has to be written down.
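As a small illustration of making metadata explicit, the sketch below takes a stored value and emits the facts that a program or a person would otherwise "just know" about it. It makes several assumptions: the `FIELD_METADATA` table, the field names, the code values, and the predicate names `code_meaning` and `unit_of` are all invented for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical metadata: knowledge that normally lives only in the
# programs that read the field, or in the heads of their users.
FIELD_METADATA: Dict[str, Dict[str, object]] = {
    "sex": {"codes": {"M": "Male", "F": "Female"}},
    "weight": {"unit": "kilogram"},
}


def explicit_facts(subject: str, field: str,
                   raw_value: object) -> List[Tuple[object, ...]]:
    """Return the stored fact plus any facts about how to read it."""
    meta = FIELD_METADATA.get(field, {})
    facts: List[Tuple[object, ...]] = [(field, subject, raw_value)]
    codes = meta.get("codes")
    if isinstance(codes, dict) and raw_value in codes:
        # Spell out what the stored code stands for.
        facts.append(("code_meaning", field, raw_value, codes[raw_value]))
    if "unit" in meta:
        # Spell out the unit the stored number is measured in.
        facts.append(("unit_of", field, meta["unit"]))
    return facts


if __name__ == "__main__":
    for fact in explicit_facts("Patient-1", "sex", "M"):
        print(fact)
    for fact in explicit_facts("Patient-1", "weight", 80):
        print(fact)
```

The stored value alone ("M", or 80) is enough for a person; the extra facts are what a reasoning system would need in order to use the same data.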
Beyond that, to truly support decision making, the reasoning processes used by people need to become explicit as well. The way the information is used needs to be captured in addition to the way it is stored. I think the potential for infinite regress here is rather daunting. It is easy to describe data, then data about data, then data about the data about the data, endlessly.
The most daunting aspect of this recursive process is the idea that we are still just describing what needs to be done, and haven't actually started doing it yet. I don't like the idea of not being effective.
But poorly planned activities often yield poor results, or even wasted effort. So putting time in at the beginning of a process can have a lot of impact on the entire process.
I know some folks might feel this is just reviewing what is already known, but I think it is worthwhile to look over things and understand what general decisions need to be made before you spend your time focusing on specific details.