Monday, June 14, 2010

Middle Distance Ontologies: antibodies

I want to push on the whole concept of "Middle Distance Ontology" a bit harder and see how it plays out -- my current plan is to concentrate on the discovery space with two entities: Assay Results and Antibodies.

I'll cover antibodies in this post, assay results in the next.

Now, there are many different perspectives from which to view antibodies, to name a few:
  • As a biologist investigating antibody action

  • As a vendor producing antibodies to meet a specification
  • As a pharmaceutical company procuring antibodies from a vendor to use in an assay

For this exercise, I'll take the perspective of a pharmaceutical company storing/analyzing assay results, since it is the viewpoint I understand best. Antibodies are something that I'm not intimately familiar with, so my approach will be to generate a list of attributes and then evaluate them for inclusion/exclusion/opaque identification.

Here are some antibody attributes I came up with. Many were taken from an interesting white paper from Pierce Biotechnology/Thermo Scientific:
  • Basic Attributes: Primary/Secondary; Monoclonal/Polyclonal; Antigen; Vendor Location; Batch

  • IgG Fragments: IgG Whole Molecule; Gamma Chain of IgG; Fc Fragment of IgG; F(ab')2 Fragment of IgG

  • IgM Fragments: IgM Whole Molecule; Fc5μ Fragment of IgM; Mu Chain of IgM

  • Light Chains of Immunoglobulins

We certainly care about the primary/secondary antibody distinction. It captures both the fact that a secondary antibody was used (i.e., the primary antibody does not carry a fluorescent tag or equivalent) and the characteristics of that secondary antibody when it appears. Interestingly, a quick search turned up a reference to tertiary antibodies, so the principles outlined in the scenario analysis section call for us to provide for these in the design of the core ontology, even if they are unlikely to be used.

In our situation, the factor that couples the primary and secondary (and tertiary) antibodies is their co-occurrence in the assay. A priori, nothing requires there to be anything other than the primary antibody, nor is there any necessity for the antibodies to be able to bind to each other (after all, mistakes happen). We might want to say that the primary, secondary and tertiary antibodies should (must) be able to bind to each other. However, it would be inappropriate to include this constraint in the core ontology, since we are trying to capture the run of an assay and a run may be erroneous.

Therefore, for each experimental run we will have a primary antibody and perhaps one or more secondary (or higher-order) antibodies. In addition, we might multiplex the experiment and run more than one antibody set per "container" per experiment. A quick search for multiple-antibody assays suggested that "multiplex antibody" is the appropriate search term; it returns more than 2,000 hits, which indicates that multiplexing is possible.
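To make the shape of a run concrete, here is a minimal sketch in Python. The names (AntibodySet, AssayRun, container_id and so on) are my own placeholders, not part of any existing schema, and the ordering convention (position 0 is the primary) is an assumption. Note that nothing here checks whether the antibodies can actually bind to each other, since a run may be erroneous.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class AntibodySet:
    """Antibodies that co-occur in one detection chain of a run.

    Assumed convention: index 0 is the primary antibody, index 1 the
    secondary, index 2 the tertiary, and so on.
    """
    antibody_ids: Tuple[str, ...]

    def __post_init__(self):
        # The only structural requirement: a primary antibody must exist.
        if not self.antibody_ids:
            raise ValueError("an antibody set needs at least a primary antibody")

    @property
    def primary(self) -> str:
        return self.antibody_ids[0]

    @property
    def higher_order(self) -> Tuple[str, ...]:
        # Secondary, tertiary, ... antibodies; empty if only a primary was used.
        return self.antibody_ids[1:]


@dataclass
class AssayRun:
    """One experimental run in one container (hypothetical field names)."""
    run_id: str
    container_id: str
    antibody_sets: List[AntibodySet] = field(default_factory=list)

    @property
    def is_multiplexed(self) -> bool:
        # More than one antibody set in the same container.
        return len(self.antibody_sets) > 1
```

A multiplexed run is then just AssayRun("run-1", "well-A1", [AntibodySet(("ab-1", "ab-2")), AntibodySet(("ab-3",))]), and a tertiary antibody costs nothing extra in the design.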

Antigen is something that we (likely) provide or at minimum specify to the vendor. Although it should consist of a unique sequence, understanding its meaning and role within the overall program would require the ability to support an arbitrary level of complexity. This clearly calls for an opaque identifier.

At the vendor level we will need to track some vendor identifier (opaque), shipment information (again opaque) and some sort of vendor lot/group identifier (opaque). The scenario that we wish to support is one in which the vendor ships multiple lots per shipment or spreads one lot across multiple shipments. Tracking the finest-grained vendor location as the opaque identifier at this level protects you from mergers/divestitures and new location startup issues, all of which could be permanently hidden by the use of a coarser-grained identifier (just think if you only had ONE identifier for all of Thermo!).
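The lot/shipment scenario is a many-to-many relationship, which can be sketched as a flat list of receipt records. Again, this is only an illustration under my own assumptions: Receipt and the two helper functions are hypothetical names, and the identifiers are treated as opaque strings that we store but never interpret.

```python
from dataclasses import dataclass
from typing import Iterable, Set


@dataclass(frozen=True)
class Receipt:
    """One (facility, lot, shipment) combination; all three are opaque."""
    facility_id: str   # finest-grained vendor location, not the parent company
    lot_id: str
    shipment_id: str


def lots_in_shipment(receipts: Iterable[Receipt], shipment_id: str) -> Set[str]:
    """All vendor lots that arrived in a given shipment."""
    return {r.lot_id for r in receipts if r.shipment_id == shipment_id}


def shipments_for_lot(receipts: Iterable[Receipt], lot_id: str) -> Set[str]:
    """All shipments across which a given vendor lot was spread."""
    return {r.shipment_id for r in receipts if r.lot_id == lot_id}
```

Because the facility is recorded per receipt, a later merger or divestiture only changes how facilities roll up to a company, which lives outside this record.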

The more specific characteristics of the antibodies (e.g., fragments and chains) do not present distinctions that are important to the analysis of the results from the perspective of a pharmaceutical company.

  • Opaque identifiers
    • Vendor facility

    • Vendor lot/shipment

    • antigen

  • Primary identifiers
    • quantity

    • monoclonal/polyclonal
    • antibody

  • Elided Completely (stored separately)
    • antigen/antibody hierarchy
    • antibody fragments and chains
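
As a rough illustration of where this lands, the summary above could be written down as a single record type. This is a sketch under my own assumptions rather than a finished design: the type and field names are hypothetical, the unit of quantity is left unspecified, and the opaque identifiers are plain strings wrapped in NewType so that they are not accidentally mixed up.

```python
from dataclasses import dataclass
from enum import Enum
from typing import NewType

# Opaque identifiers: stored and compared, never parsed or interpreted here.
FacilityId = NewType("FacilityId", str)
LotId = NewType("LotId", str)
AntigenId = NewType("AntigenId", str)


class Clonality(Enum):
    MONOCLONAL = "monoclonal"
    POLYCLONAL = "polyclonal"


@dataclass(frozen=True)
class AntibodyRecord:
    # Primary identifiers
    antibody_id: str
    clonality: Clonality
    quantity: float            # unit is an assumption; the post does not fix one
    # Opaque identifiers
    vendor_facility: FacilityId
    vendor_lot: LotId
    antigen: AntigenId
    # Deliberately absent: fragments, chains and the antigen/antibody
    # hierarchy, which are stored separately.
```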