OntoVistA: May 2009

Saturday, May 30, 2009

WHY: OntoVistA as a Medical Support System

The nature of medicine is that it has an incredibly broad base of information that is relevant to its practice. There are medical conditions which are influenced by almost every aspect of life. Not only are there conditions that are caused by the presence of environmental factors, genetic factors, and behavioral factors, there are also conditions that are caused by the ABSENCE of the same kinds of factors.

Generally, when one looks for some part of human life to computerize, it is useful to limit the area being covered to the bare minimum, so that you can slowly scale up the computer coverage from a small working base to a more capable system that covers the day-to-day aspects. This is difficult when the subject area is medicine, because almost any interesting subset of medicine is as complex as medicine as a whole. So you find that bringing in a computer to help a small area quickly means that you are using the computer to handle a huge area. It is as if the only types of problems are toy problems and life-critical practical systems. This also means that it is usually easier to teach someone who already knows medicine how to get the computer to work for them than to take a computer specialist and try to explain all the intricate interactions that come with a particular medical specialty. VistA, by the nature of its unique history, relied on teaching clinicians in this way far more than many other systems of comparable size.

VistA started as an effort within the VA hospitals to bring in computers to enhance the provision of care to a largely stable population. The explosion of new veterans after the Vietnam War meant that the methods of healthcare delivery developed after World War II simply could not keep up with the demand.

In non-veteran healthcare systems, a constantly changing patient population only comes to the doctors or hospitals when an acute care need arises. These patients are typically at the end of a disease process, when preventative care is no longer very effective, and some acute distress forces them to seek care. The nature of their visits means that they are more willing to spend money to get "fixed", but it also means that the only time the care system receives money is during the crisis. The impact on the computer systems is that they primarily track billing expenses and itemize care given, with an underlying assumption that there is no need to keep long-term records: when that person is seen again, their needs will be drastically different, because it will be for some other acute distress, this one having gone away after they left the hospital or clinic.

In contrast, within the VA, a large number of the veterans who are seen are seeking care because of an incident that occurred while they were in active military service, and this service-connected problem has a continuing impact on their lives. Also, because the patients have already "invested" in their care by serving in the armed forces, the cost of their care is covered by allocations from the U.S. Congress, which usually come in the form of a fixed amount of money per patient per year. The VA, as a steward of tax money, wisely takes a longer view of healthcare, and treats preventative care as a way of dealing with disease processes before they reach a critical acute incident. Since there is a long-term commitment to these particular veteran patients, the computer systems also reflect the chronic nature of this reality. Once a patient record is established, it will be available for years to guide healthcare providers. Long-term trends as well as cumulative analysis are possible due to the wealth of data kept about a patient. Security and privacy of records are built into the design of the computer system at a deeper level, because there is more information to safeguard. Since many VA facilities are affiliated with a medical school, computer decision support is part of VistA to remind new practitioners of interactions between the various drugs, tests and pre-existing conditions. Since students and residents are by the very nature of their activities transient, checks and balances are built in to allow the established providers to review their work, and to reflect the local standards of practice within the customized behavior of the computer and healthcare system.

Since VistA is tracking both healthy and ill patients, it has to track a wider range of human behaviour. The system has to model why particular activities are in the patient record, rather than just recording what occurred in a limited acute incident. The VA has been at the forefront of standardizing medical vocabularies in numerous areas, such as the National Drug File, LOINC coding for lab tests, the HL7 data transmission protocol, and the description of diagnoses and procedures with specialized nomenclature and coding systems such as the UMLS, ICD-9, CPT, DSM-IV, and SNOMED CT. Each of these systems has a lot in common with modern ontologies used in knowledge engineering and the semantic web. The actual syntax or methodology may take a different form than a typical expert system, but many of the basic operations are shared: classification using generalization and specialization, analysis of data collection processes, and deductive reasoning. And as has been said before, the subset of the world that has to be described or axiomatized is very broad. The area of overlap between ontologizing the world at large and ontologizing the medical world is large enough that almost all of the same issues come up in both efforts.
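To make "classification using generalization and specialization" concrete, here is a minimal sketch in Python. The concept names and the `IS_A` table are made up for illustration; they are not taken from ICD-9, SNOMED CT, or any other real terminology, and real coding systems are far richer than a single-parent hierarchy.

```python
# Hypothetical concept names; not drawn from any real medical terminology.
# Each entry maps a more specific concept to its more general parent.
IS_A = {
    "type 2 diabetes": "diabetes",
    "diabetes": "endocrine disorder",
    "endocrine disorder": "disease",
    "asthma": "respiratory disorder",
    "respiratory disorder": "disease",
}


def ancestors(concept):
    """Return the chain of increasingly general concepts above `concept`."""
    chain = []
    while concept in IS_A:
        concept = IS_A[concept]
        chain.append(concept)
    return chain


def is_kind_of(specific, general):
    """True if `specific` is the same as, or a specialization of, `general`."""
    return specific == general or general in ancestors(specific)
```

A question such as "is this diagnosis some kind of endocrine disorder?" then becomes a call like `is_kind_of("type 2 diabetes", "endocrine disorder")`, which is the sort of query an external reasoning system would answer over a much larger hierarchy.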

An explicit ontology of the medical aspects of VistA would involve developing methods of exposing much of the medical knowledge in the system in a form that external reasoning systems could then use to enhance the support of medical practice already provided by VistA.

Thursday, May 28, 2009

WHY: OntoVistA as a Software Engineering artifact

Solely as a software artifact, VistA is a monumental work. Software that supports an entire hospital, yet can be configured for organizations ranging from a few hundred employees to several thousand, must have a high degree of configurability and flexibility. Couple that flexibility with the limited hardware resources typically available in tax-funded enterprises, as well as the life-saving nature of the work that VistA supports, and you can see why the traditional virtues of small size and quick response time are still valued in the codebase and database.

The dynamic, sparse persistent data of a MUMPS-based system in VistA supports a table-driven approach with multiple parameters in the database used for configuration. Storage of program code fragments in the database also leverages the late-binding execution ability of MUMPS to keep a lot of the design uniform, yet able to handle the complexities of a modern healthcare system.
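The table-driven, late-binding idea can be sketched outside of MUMPS. The Python below is only an analogy under stated assumptions: a plain dict stands in for the sparse persistent database, and the keys, parameter names, and check expressions are all hypothetical rather than anything found in VistA.

```python
# Hypothetical illustration only: not MUMPS and not VistA's actual schema.
# A plain dict stands in for the persistent, sparse database.
database = {
    # Site parameters: only the entries a site actually sets are stored.
    ("PARAM", "site_name"): "Example Clinic",
    ("PARAM", "max_refills"): 5,
    # Code fragments stored as data and evaluated only when needed.
    ("CHECK", "refills"): "value <= params['max_refills']",
    ("CHECK", "nonnegative"): "value >= 0",
}


def get_params(db):
    """Collect all configuration parameters from the database."""
    return {key[1]: val for key, val in db.items() if key[0] == "PARAM"}


def run_check(db, check_name, value):
    """Look up a stored code fragment by name and evaluate it at call time."""
    fragment = db[("CHECK", check_name)]
    env = {"value": value, "params": get_params(db)}
    # Late binding: the fragment text is read from the database at the
    # moment of use, so changing the stored text changes behavior
    # without touching this function.
    return bool(eval(fragment, {"__builtins__": {}}, env))
```

Changing `database[("PARAM", "max_refills")]` or replacing a stored `CHECK` fragment alters what `run_check` does on its next call, which is the property that lets one uniform design be customized per site. Evaluating stored text is only reasonable when the stored fragments are trusted.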

To give an idea of the scope of VistA, there are about 2500 Files (similar to SQL Tables) in a standard installation, each of which may have entries and fields, as well as sub-files with their corresponding sub-entries and sub-fields. Across all the Files, there are over 50,000 fields and subfields (database columns). There are more than 25,000 individual programs, each of them providing some capability related to supporting patient care.
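The File, field, and sub-file nesting described above can be pictured as a small recursive data structure. This is a sketch under my own assumptions: the class names `FileDef` and `FieldDef` and the example entries are hypothetical, not real VistA definitions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FieldDef:
    """One field (roughly, a database column)."""
    name: str


@dataclass
class FileDef:
    """One File (roughly, a table) that may contain nested sub-files."""
    name: str
    fields: List[FieldDef] = field(default_factory=list)
    subfiles: List["FileDef"] = field(default_factory=list)

    def count_fields(self) -> int:
        """Count fields here plus the sub-fields of all nested sub-files."""
        return len(self.fields) + sum(s.count_fields() for s in self.subfiles)


# Hypothetical example; the names are illustrative only.
patient = FileDef(
    name="EXAMPLE PATIENT",
    fields=[FieldDef("NAME"), FieldDef("DATE OF BIRTH")],
    subfiles=[
        FileDef(
            name="EXAMPLE VISIT",
            fields=[FieldDef("VISIT DATE"), FieldDef("CLINIC")],
        ),
    ],
)
```

Summing `count_fields()` over every top-level File is the kind of tally that produces figures like the field counts quoted above.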

As an artifact, VistA is simply too large for any one person to understand fully. Within the Department of Veterans Affairs, there are multiple people who specialize in developing new code, maintaining existing code, configuring existing and new functionality, supporting quality assessment, testing capabilities, as well as the myriad people who can customize the system to their particular way of practicing medicine.

An explicit ontology of the software and computational aspects of VistA would involve developing methods of describing much of the procedural and declarative knowledge in the system in a form that external reasoning systems could then use to enhance the support of the software artifact that is VistA.

Wednesday, May 20, 2009

FAQ: Why did you choose VistA as the basis for this work?

VistA has several specific advantages:

1) It is easily available, in the public domain, and can run
on common, inexpensive hardware. This means that a personal
instance of VistA can be run by participants, if desired.

2) It is extensive. VistA supports the full gamut of healthcare
for adults (veterans) in the United States, using modern and
standardized terminology sets, and covering all aspects of
a hospital. There are a huge number of Fields (over 70K) in a
recent count, with over 20K programs using that data. This
should provide a fertile field to develop knowledge, and recognize
patterns of usage. Since the system is actively in use in over
170 hospitals, there are many documents on the meaning of
the data, and the practice of using it, so less guesswork will
be needed in determining how ambiguities should be interpreted.

3) The resulting artifact/ontology will have value beyond
its use to learn and experiment with ontology modeling.
As there is already a community based around VistA, in both
the private and the public sectors, the artifact, once created, will
be able to be maintained past the modeling phase, and will serve
as a source of information about the usefulness of an ontology
on a long term basis.

FAQ: Can I be involved, even if I haven't had a lot of experience with creating an ontology?

Yes, with the understanding that I do want to be productive.
I have been working with Medical Informatics for over 25
years, and have extensive experience in the practice and
delivery of computer systems that support healthcare.
I am willing to share my knowledge in the general field, and
specifically in the VistA software that I am using as a
basis for the ontology modeling.

I am promoting this effort to build a community which would
be knowledgeable about health ontology, as well as building
an artifact that incorporates a lot of the knowledge within
VistA, and about VistA. I think this will seriously advance
the state of the art, as there is nothing like a working system
to specify and disambiguate issues of meaning.

I welcome involvement by novices and professionals alike.
This should prove to be educational to all of us, for different
reasons, with the understanding that the initial flurry of effort
may be a bit chaotic. However, it should be great fun as well.

FAQ: What ontology technology are you considering using for this effort?

I am planning on using a minimum of three separate
technologies, as I am hoping to "triangulate" on the
implicit meaning of the Files and fields currently
stored in the VistA system. My current
expectation is that I will be using SUMO, OWL, and
CycL. Each of these technologies has different strengths,
and I hope to learn more about their details
with this project. Each set of predicates makes different
distinctions, and I hope to find a combination that will
reflect the actual differentiations which the software makes.

As many of you know, I have been interested in, and involved
with, Cyc for many years, having written the Unofficial FAQ
for Cyc more years in the past than I want to admit in public.

Likewise, I have been a fan of Adam Pease's work on SUMO
since its inception, and have been favorably impressed by
his dedication, and the professionalism of his team, as well
as their generous spirit with the rest of the community.

I have included OWL, primarily on the strength of Protege,
and the work done by Stanford on medical ontologies.
My hope is that this will make it more likely that
the result will improve interoperability.

A careful argument that shows strong benefits for the project
might sway me from this plan, but my current thinking
is that this strategy has the highest degree of utility
for the project and for the community.

FAQ: What are the plans you have for licensing the content of this ontology?

The VistA system is already a work of the government,
which means that it is public domain. While strictly speaking,
an ontology based on the information and knowledge
implicit in the VistA system can be licensed in many
ways, I would like to use a license that supports the upkeep
of the ontology on a long term basis, and which furthers
my goals to make sure that information stored in a VistA
implementation can be shared easily with other systems.
I don't know enough regarding ontology licensing to know
if a software program license is appropriate, or if a license
used for a shared written work is correct. I am willing to
adopt a license that best supports my goals.

FAQ: What are your reasons for creating this ontology?

I expect that as time passes, more systems are going
to need to be defined at the strict level of detail that
is needed when defining a formal ontology of their inner
workings. Especially with computer systems for health
related fields, these definitions are helpful because they
address issues which can enhance the safety and
interoperability of a system. In my opinion, VistA is a
national treasure, in that it has been proven to be effective
in outcomes-based studies, in supporting the delivery
of healthcare that has a positive effect on people's lives.

First Post

This blog is intended to record my excursions into knowledge representation of an existing software system using the current ontology and semantic web tools.

My initial message regarding this project was:

I am looking for collaborators willing to be
involved in some work making the ontology
implicit in the Department of Veterans Affairs
VistA Hospital Information System an explicit
ontology coded in a formal ontology language.

This will be a learning experience, and I expect
that the work will be documented on a website,
probably http://www.vistapedia.net

I am doing this work as a volunteer, but would
not object to collaborators who are able to find
funding through research grants or other means.

If you are interested, please call or write me,
David Whitten, whitten@worldvista.org
(713) 870-3834
