Measurement Units Ontology

From Morfeo Wiki

Abstract: this document describes an ontology to represent units of measurement in RDF.

  • Authors: Diego Berrueta and Luis Polo (Fundación CTIC)
  • Contributors: José M. Álvarez (Fundación CTIC)


The issue of representing measurements in ontologies

In the context of the MyMobileWeb project, and more specifically, in the description of the delivery context ontology, we have to represent physical magnitudes using ontologies. This is an awkward problem, because RDF doesn't have any native support to tag a literal value with its unit. In other words, RDF literal nodes can represent discrete numeric values, such as "10" or "9.81", but a literal node doesn't capture the information on which unit is used to express the value.

The same magnitude, such as "length", can be measured using different units, such as "meters", "inches", "feet", "millimeters", and so on. The same length leads to different numeric values when expressed using different units. However, from a conceptual point of view, all of these values are (or should be) actually equivalent.

In a perfect, ideal world, the problem could be avoided by users agreeing on a single unit of measure for each property. To a certain extent, it makes sense that, if users can agree on sharing a common vocabulary (ontology) to model a domain, they should also be able to agree on the units of measure for each property. However, in the real world, this may be too restrictive for some applications, particularly those which want to promote wide usage. Both practical reasons (some scales are more conveniently captured using certain units) and cultural reasons make it impossible to use a single unit for each property. Therefore, a practical solution must be provided to allow users to express measurable qualities in different units.

Literals in RDF

Unfortunately, RDF does not provide a native solution for this problem.

RDF literal nodes capturing strings can be tagged with their natural language. These tags are reflected in the serialization. For instance, a string literal value serialized in RDF/XML can use the xml:lang attribute. Equivalently, in N3, the @lang syntax is available. However, this only applies to string literal values. This feature was not intended to be used to represent units of measure.

Similarly, RDF literals can be annotated with a type specifier. Type annotations give the consumers of an RDF literal a hint about how its lexical value should be understood. For instance, "3.14"^^xsd:double indicates that the lexical value should be interpreted as a numerical value; however, "3.14"^^xsd:string or simply "3.14" should be interpreted as character strings. Some alternatives that use type annotations to encode unit information are discussed later in this document.

Alternatives to represent measurable quantities in RDF

The lack of a canonical way to represent measurable quantities in RDF has led to a number of different "patterns" being used in different ontologies and RDF data repositories. The most representative ones are described below. Each of them strikes a different balance between expressiveness and simplicity.

Pattern A: units as RDF resources

The key idea of this pattern is that each unit is modeled as an RDF resource. Unfortunately, it is not legal in RDF to create a link from a literal node to a resource, i.e., literals cannot be the subject of triples. As a consequence, another resource is needed to tie together the literal value and the unit.

The link between this auxiliary resource and the unit resource determines two variants of this pattern. If the link is an rdf:type predicate, the units are used as classes to type the auxiliary resource. For instance:

#nokia70 :weight #aHundredAndThirtyGrams .
#aHundredAndThirtyGrams rdf:type ucum:kilogram ;
                        :numericalValue "0.130" .

Alternatively, another predicate (different from rdf:type) can be used. For instance:

#nokia70 :weight [
   :numericalValue "0.130" ;
   :measuredIn ucum:kilogram
] .

The :weight property is an owl:ObjectProperty because it relates two resources. It can be defined as follows:

:weight a owl:ObjectProperty ; rdfs:label "Weight" .

Regardless of the variant, the auxiliary resource must be defined (note, however, that the first example uses a named resource, while the second uses a blank node, just for illustration purposes). This additional resource creates an overhead (more resources, more triples) that is a burden on the use of this pattern. However, it is possible to use a shortcut that acts as syntactic sugar for most cases, as described next.

Subpattern: dual modelling

This subpattern was inspired by a proposal for the modeling of lexical properties in SKOS.

This proposal involves creating two properties for each aspect. One of them is a datatype property (i.e., a property which links a resource to a literal), and the other one is an object property (i.e., a property which links a resource to another resource). The second one has the -R suffix. The datatype property is bound to a particular unit of measure:

:weight a owl:DatatypeProperty ;
    rdfs:label "Weight (in kilograms)" .
:weightR a owl:ObjectProperty ;
    rdfs:label "Weight (object property)".

Users who want to use the standard unit may simply use the datatype property. This is very straightforward:

#nokia70 :weight "0.130" .

However, if a user wants to capture the value using a unit different from the standard one, she may use the object property and a slightly more complex syntax:

#nokia70 :weightR [
   :numericalValue "0.287" ;
   :measuredIn ucum:pounds
] .

Pros:

  1. Simplicity: for users who want to use the standard unit, the usage is as simple as it can be (see the first example).
  2. Flexibility: users can use the unit of their choice.
  3. There are "only" two properties for each feature (other proposals involve more than two properties).

Cons:

  1. It is difficult to write a query to get the value of a property, because the user has to be aware of the two different properties (i.e.: :weight OR :weightR).
  2. This proposal uses blank nodes. Many authors discourage the use of blank nodes.
  3. Some asymmetry is introduced. It is harder to use a non-standard unit than the standard unit.

This subpattern can be extended by means of a rule to "normalize" the values using the standard unit, i.e., a rule that creates a statement with the datatype property for each statement that uses the object property. However, this rule is outside the expressiveness of Description Logics (it is a production rule, not a logical rule). It can nevertheless be sketched with the following pseudo-SPARQL query; it is only pseudo-code because SPARQL doesn't allow arithmetic operators in the CONSTRUCT clause:

CONSTRUCT { ?device :weight (?x * conversion_factor) }
WHERE { ?device :weightR ?r .
        ?r :measuredIn ?unit .
        ?r :numericalValue ?x }

This rule removes the first objection above. Unfortunately, it is unwise to make it compulsory for the applications to apply the rule before querying the ontology.
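The normalization rule can also be applied outside the query language. The following is a minimal Python sketch, not part of the original proposal: triples are modeled as plain tuples, the blank node label `_:b1` and the `TO_KILOGRAMS` table (including its unit names and factors) are assumptions made for illustration, and kilograms are assumed to be the standard unit of :weight.

```python
# Conversion factors to the (assumed) standard unit, kilograms.
# The table itself is hypothetical; it is not provided by the ontology.
TO_KILOGRAMS = {
    "ucum:kilogram": 1.0,
    "ucum:gram": 0.001,
    "ucum:pounds": 0.45359237,
}


def normalize_weights(triples):
    """Derive (device, ':weight', kilograms) triples from :weightR statements."""
    # Index every subject so the auxiliary nodes can be looked up quickly.
    nodes = {}
    for s, p, o in triples:
        nodes.setdefault(s, {})[p] = o

    derived = []
    for s, p, o in triples:
        if p != ":weightR":
            continue
        aux = nodes.get(o, {})
        unit = aux.get(":measuredIn")
        value = aux.get(":numericalValue")
        # Skip statements whose unit is unknown or whose value is missing.
        if unit in TO_KILOGRAMS and value is not None:
            kilograms = round(float(value) * TO_KILOGRAMS[unit], 3)
            derived.append((s, ":weight", kilograms))
    return derived


triples = [
    ("#nokia70", ":weightR", "_:b1"),
    ("_:b1", ":numericalValue", "0.287"),
    ("_:b1", ":measuredIn", "ucum:pounds"),
]
normalized = normalize_weights(triples)
```

With the sample data, the derived statement gives #nokia70 a :weight of 0.13 (kilograms), which matches the value used in the datatype-property example.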

Pattern B: datatype subproperties for each unit of measure

This proposal uses two sets of properties, each one of them addressing a different abstraction level. The first set contains abstract properties that just capture the semantics of the magnitude being measured. Examples of properties in the first set are: weight, length, etc.

The second set contains properties that capture both the semantics of the magnitude AND the unit of measurement. Examples of properties in the second set are: weightInKilograms, lengthInMeters, etc. Properties of this set are subproperties of the properties in the first set.

:weight a owl:DatatypeProperty ;
   rdfs:label "Weight" .
:weightInKilograms rdfs:subPropertyOf :weight ;
   rdfs:label "Weight measured in Kilograms" ;
   :measuredIn ucum:kilograms .
:weightInPounds rdfs:subPropertyOf :weight ;
   rdfs:label "Weight measured in Pounds" ;
   :measuredIn ucum:pounds .

Therefore, if we have these explicit triples:

#nokia70 :weightInKilograms "0.130".
#nokia95 :weightInPounds "0.340".

Then we can infer:

#nokia70 :weight "0.130".
#nokia95 :weight "0.340".
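The inference above is the usual rdfs:subPropertyOf entailment. As a rough illustration only (a real application would rely on an RDFS reasoner), it can be sketched in Python with triples modeled as tuples; the function name and the dictionary-based property hierarchy are assumptions of this sketch.

```python
# Property hierarchy from the example: child property -> parent property.
SUBPROPERTY_OF = {
    ":weightInKilograms": ":weight",
    ":weightInPounds": ":weight",
}


def infer_superproperties(triples, subproperty_of):
    """Return the triples entailed by following subPropertyOf links upwards."""
    inferred = set()
    for s, p, o in triples:
        seen = {p}
        parent = subproperty_of.get(p)
        # Walk up the hierarchy; 'seen' guards against accidental cycles.
        while parent is not None and parent not in seen:
            inferred.add((s, parent, o))
            seen.add(parent)
            parent = subproperty_of.get(parent)
    return inferred - set(triples)


explicit = [
    ("#nokia70", ":weightInKilograms", "0.130"),
    ("#nokia95", ":weightInPounds", "0.340"),
]
entailed = infer_superproperties(explicit, SUBPROPERTY_OF)
```

Note that the entailed :weight triples no longer say which unit each value is expressed in, so "0.130" and "0.340" cannot be compared directly.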

This proposal has some advantages:

  1. Given a triple which uses a property from the second set, we immediately know which unit of measure the value is expressed in. We do not have to check the type of the literal value (as in Pattern C). This makes it possible to write SPARQL queries such as the following:
 SELECT *
 WHERE { ?x rdf:type mymw:MobilePhone .
         ?x :weightInKilograms ?weight }
 ORDER BY ?weight
  2. It is easy to extend the abstract properties from the first set in order to introduce a new unit of measure for a magnitude, while at the same time, existing properties from the second set can be re-used.
  3. Even if the same magnitude is expressed with different units of measure, we can still retrieve all the information using a SPARQL query:
 SELECT ?device ?weightInKilograms ?weightInPounds
 WHERE { ?device rdf:type mymw:MobilePhone .
         OPTIONAL { ?device :weightInKilograms ?weightInKilograms }
         OPTIONAL { ?device :weightInPounds ?weightInPounds } }

However, there are some drawbacks:

  1. This proposal introduces a potentially huge vocabulary. For each feature, we need at least two properties (one in each set). But each new unit of measure introduces a new property in the second set.
  2. It is not clear whether having two different levels of abstraction is really useful. What value does it provide to know that the "length" is "10" if we do not know the unit of measure? It can be argued, however, that at least we know that somebody is providing information on this feature.

Pattern C: units as different XML Schema types

In this proposal, we define sub-types of the XML Schema numerical types, particularly of xsd:double. These new sub-types are to be defined in an XML Schema file (i.e.: not in the ontology). The following are examples of such sub-types: mymw:meters, mymw:kilograms, mymw:pounds. Consider the following definition:

<xsd:simpleType name="kilograms"><xsd:restriction base="xsd:double"/></xsd:simpleType>

Once defined, these new types can be used to "type" the literal nodes in RDF:

#nokia70 :weight "0.130"^^mymw:kilograms.

Pros:

  1. The most important advantage of this proposal is its simplicity. To some extent, we emulate the semantics of the xml:lang attribute for string literals by assigning semantics to the sub-types.
  2. There is no need for new resources, nor blank nodes, nor external rules to infer new properties.
  3. There aren't duplicated properties for the same feature.

Cons:

  1. The semantics is completely implicit in the sub-type. For instance, from a logical point of view, we can't know which magnitude is represented by a sub-type (how do we know that some quantity measured in kilograms is a weight?).
  2. In RDF, the types are just syntactical: they specify how the lexical form of the literal should be parsed into a value (all literals have a lexical form, which is always a string). For instance, when we write "10"^^xsd:integer, we tell the RDF application that the string "10" should be taken as a number. But this has no impact on the validation of the model. For instance, the literal "foo"^^xsd:integer is still syntactically valid RDF.
  3. From a semantic perspective, it can be argued that mymw:meters is not a sub-type of xsd:double. Types and classes are different things, and the sub-typing mechanism in XML Schema was envisioned with a different purpose (to constrain the universe of valid values, not to "add semantics" to the super-type).

Pattern D: parseable XML Schema type

This pattern was proposed by Joshua Tauberer. Instead of defining a number of different datatypes, one for each unit of measure, this pattern proposes to define just one new datatype. The unit is represented by its symbol, inside the literal value. For instance, "130 g"^^ex:quantityAndSymbol. Applications that find a literal value typed with this type (ex:quantityAndSymbol) should "parse" the literal value, split the figure and the symbol, and use the figure as appropriate.
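To make the "parse" step concrete, here is a minimal Python sketch. The lexical form assumed here (a decimal figure, whitespace, then a unit symbol) is an assumption of this sketch, since the pattern as described does not fix a grammar; `Quantity`, `parse_quantity` and `add_quantities` are hypothetical names.

```python
import re
from typing import NamedTuple


class Quantity(NamedTuple):
    figure: float  # the numeric part of the literal
    symbol: str    # the unit symbol, kept as an opaque string


# Assumed lexical form: optional sign, decimal figure, whitespace, symbol.
_LEXICAL = re.compile(
    r"^\s*([+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?)\s+(\S+)\s*$"
)


def parse_quantity(lexical):
    """Split a literal such as '130 g' into its figure and its unit symbol."""
    match = _LEXICAL.match(lexical)
    if match is None:
        raise ValueError(f"not a quantity literal: {lexical!r}")
    return Quantity(float(match.group(1)), match.group(2))


def add_quantities(a, b):
    """Sum two quantities; unit conversion is out of scope for this sketch."""
    if a.symbol != b.symbol:
        raise ValueError("cannot add quantities with different unit symbols")
    return Quantity(a.figure + b.figure, a.symbol)


weight = parse_quantity("130 g")
total = add_quantities(weight, parse_quantity("20 g"))
```

An application that does not implement a step like `parse_quantity` only sees the opaque string "130 g", which is why it cannot, for example, sum two such literals.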

Pros:

  1. Avoids having many different datatypes for different units.
  2. The unit information is always attached to the figure. This is very important from an information point of view (it is impossible to "lose" the unit information). Even applications that are not aware of this pattern will keep the figure and the unit symbol together, as they are part of the same literal.
  3. The literal value can be read and understood by humans.
  4. Very simple and intuitive for people trained in sciences.

Cons:

  1. Applications must be modified to make them aware of the special datatype ex:quantityAndSymbol. If they are not aware of this datatype, they won't be able to interpret the literal values, which will be useless to them. For instance, they won't be able to sum two literals.
  2. There isn't any semantic connection between the unit symbol inside a literal (such as the "m" in a length literal) and the description of the unit as a resource (#meter).

Related work

  • At W3C:
  • Near W3C:
  • Outside W3C:
  • Misc:
    • Can the same techniques be used to represent amounts of money (i.e., currencies)?

Measurement Units Ontology (MUO)

This section describes an ontology of units of measurement that has been defined as part of the MyMobileWeb project.

This ontology is distributed as two files. Current snapshots are available at:

The ontology can also be found at the MyMobileWeb source code SVN repository. You can browse the repository, but to do serious development, please read the instructions on how to check-out the code.

Description of the ontology

This ontology is split into two blocks.

  1. On the one hand, the definitions of the classes and properties. These have been inspired by previous work on metrology and upper-level ontologies such as DOLCE. For a detailed description of the design of this ontology, we refer the reader to MUO's design section. These classes and properties provide the essential vocabulary to define the semantics of measurements in domain ontologies (for instance, in the delivery context ontology). The ontology mainly models three disjoint kinds of entities:
    1. Units of measurement (class UnitOfMeasurement).
    2. Physical qualities that can be measured (class PhysicalQuality).
    3. Common prefixes for units of measurement (class Prefix).
  2. On the other hand, we provide a number of instances for the classes above. These instances have been automatically extracted from UCUM using a procedure described in a section below. The main objective of this set of instances is to coin URIs for the most common units of measure, physical qualities and prefixes, which can be shared and reused in domain ontologies. Every unit of measurement is linked to a physical quality (e.g. meter is linked to length).

Design of MUO

The aim of this section is to explain the design of the MUO ontology. The design of MUO consists of:

  1. A clear interpretation of concepts involved in the measurement of magnitudes by aligning them to the DOLCE ontology.
  2. The usage of the formal framework for qualities developed by Masolo and Borgo to adapt the semantic pattern of Pattern A for the MyMobileWeb project.

The Descriptive Ontology for Linguistic and Cognitive Engineering (DOLCE) is a foundational ontology developed in the Laboratory for Applied Ontology. DOLCE has a clear cognitive bias, in that it aims at capturing the ontological categories underlying natural language and human commonsense. It is an ontology of particulars: its domain of discourse is restricted to them. Universals (properties) are used to organize and characterize the particulars, but are not part of the domain of discourse. In this sense it is an ontology of first-order entities. More information can be found on its wiki page; DOLCE's treatment of qualities is discussed in [1].

The following are the DOLCE concepts relevant to our approach to modeling physical properties of devices in the Delivery Context Ontology, such as the width of the display or the color of the screen:

  • Endurants are entities that are wholly present (i.e. every part of the entity) at any time they are present (e.g. devices, a law, some energy, a person, etc.). As a consequence, endurants can 'genuinely' change in time, in the sense that the same endurant as a whole can have incompatible properties and descriptions at different times (e.g. "the color of my car is red at time t, and after painting it is blue at time t+2").
  • Qualities are the basic entities we can perceive or measure: shapes, colors, sizes, sounds, smells, as well as weights, lengths, electrical charges... In DOLCE, qualities are treated as particulars. Qualities inhere in entities: every entity (including qualities themselves) comes with certain qualities, which exist as long as the entity exists (e.g. weight and space are qualities inherent to every physical object, because physical objects are composed of matter). Qualities can be general, e.g. “height, speed”, or individual, e.g. “the height and width of the display of my mobile device”. They can change through time.
  • Qualia (in singular form, quale) are the values of individual qualities, and they describe the position of an individual quality within a certain quality space, e.g. “the value of the weight of my device”. Technically they are the abstraction of individual qualities from time and from their hosts (endurants). So when we say that two devices (my device and Diego's device) have exactly the same weight, we mean that their weight qualities, which are distinct (one is the specific weight of my device and the other is the specific weight of Diego's device), have the same position in the weight space (an atomic region), that is, they have the same weight quale.
  • Regions or Quality Spaces, correspond to different ways of “organizing” qualia. In our case, the relevant quality spaces are the measurement-units which can be used as reference metrics ("measurement spaces") for other spaces, such as the values for individual qualities, e.g. “ the kilogram unit can be used to measure the value of the weight of my device”. They are "counted by" some number.

Interpreting Systems of Units

The aim of this subsection is to provide a clear semantic framework to interpret physical properties and measurement units by aligning them to the concepts we have seen in the previous section. Physical properties/quantities such as mass, weight, speed, etc. are kinds of properties that can be quantified, i.e. that can be measured or calculated. The concept of physical quantity is similar to our notion of quality: both represent entities that can be measured or perceived. Moreover, quantities are ascribed to objects or phenomena in the same way as qualities are inherent to endurants.

The science of measurement, metrology, as we have done with qualities, distinguishes between:

  • A physical quantity in the general sense: a kind of physical property ascribed to phenomena that can be quantified for a particular phenomenon (e.g. length and electrical charge), understood as a general physical quality under our approach.
  • A physical quantity in the particular sense: a quantifiable property ascribed to a particular phenomenon (e.g. the weight of my device). These properties are similar, if not identical, to our individual qualities.

The value of a physical quantity Q is expressed as the product of a numerical value {Q} and a physical unit [Q].

Q = {Q} x [Q]

Measurement units are standards for the measurement of physical properties or quantities (physical qualities in our ontology). Every unit is related to a particular kind of property. For instance, the meter unit is uniquely related to the length property. From a purely physical perspective, a unit is a particular physical quantity, defined and adopted by convention, with which other particular quantities are compared. Under our ontological approach, units are abstract spaces used as reference metrics for quality spaces, such as physical qualia, and they are counted by some number. For instance, weight-units define some quality spaces for the weight-quality where specific weights of objects, like devices or persons, are located by means of comparisons with the proper weight-value of the selected weight-unit. A quale is considered to be the value of an individual physical property, similarly to the formula above for the values of physical quantities. In this sense, the quality value (the quale) is taken as a combination of a numerical value {Q} and a unit [Q].

In the MUO ontology, a class, QualityValue, has been created to represent the values of qualities. Instances of this class are related to 1) exactly one unit, suitable for measuring the physical quality (meters for length, grams for weight, etc.), by means of the functional property muo:measuredIn, and 2) a number, which expresses the relationship between the value and the unit, by means of the rdf:value property. For instance, to represent the width-value of the display of my mobile phone we use the pattern shown in the figure below.
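As a textual illustration of the QualityValue pattern just described, the following Python sketch builds a Turtle-like snippet in the style of the earlier examples. Only muo:QualityValue, muo:measuredIn and rdf:value come from the description above; the subject `#myPhone`, the property `:displayWidth`, the numeric value and the xsd:float typing are hypothetical choices made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualityValue:
    value: float      # the number, attached with rdf:value
    measured_in: str  # exactly one unit, attached with muo:measuredIn


def to_turtle(subject, prop, quality_value):
    """Serialize one quality value as a blank node attached to 'subject'."""
    return (
        f"{subject} {prop} [\n"
        f"   rdf:type muo:QualityValue ;\n"
        f'   rdf:value "{quality_value.value}"^^xsd:float ;\n'
        f"   muo:measuredIn {quality_value.measured_in}\n"
        f"] ."
    )


# Hypothetical data: a display that is 0.05 meters wide.
snippet = to_turtle("#myPhone", ":displayWidth", QualityValue(0.05, "ucum:meter"))
print(snippet)
```

Keeping the number and the unit in the same node mirrors the idea that a quale is the combination of a numerical value and a unit.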

However, an in-depth ontological analysis has to be done to correctly formalize the different kinds of measurement units and the relationships between them. In MUO, this task has been done by creating a hierarchy of measurement units (see picture below). This classification takes into account whether a unit is a composite one (derived units) or a non-composite one (base units).

[Figure: Value formal pattern and classification of measurement units]

Base Units of Measurement

Base units are units that have not been derived from any other unit. In turn, base units can be used to derive other measurement units. For instance, a kilometer is derived from the base unit meter. The International System of Units (SI), the modern form of the metric system, recognizes several base units for base physical qualities (quantities in the SI) assumed to be mutually independent.

Derived Units of Measurement

Other physical qualities (quantities in the SI), called derived physical qualities, are defined in terms of the base qualities via a system of dimensional equations. The derived units for derived qualities are obtained from these equations combined with the base units. In the formal representation of physical qualities and associated units in the MUO ontology, we have defined a relational property muo:derivesFrom to express the relationship between the derived unit and the units it is derived from. We distinguish between two different kinds of derived units:

  • Simple Derived Units. Units that are derived from exactly one base unit. There are two main possibilities. On the one hand, there are units that are derived by adding a prefix to the unit (some of these prefixes are also known as metric prefixes and belong to the SI). A prefix is a name or associated symbol (e.g. kilo) that precedes a unit of measure (e.g. meter) to form a decimal multiple or submultiple (e.g. kilometer). Prefixes are commonly used to reduce the number of zeroes in numerical equivalencies. Derived units obtained by the application of a prefix measure the same physical quality as their base unit. For instance, meters and kilometers are both used to measure the length quality. This can be represented in MUO with the following semantic pattern:
:kilometer rdf:type muo:SimpleDerivedUnit ;
     muo:derivesFrom     ucum:meter ;
     muo:modifierPrefix  ucum:kilo.

On the other hand, there is another kind of simple derived unit that is also obtained from exactly one base unit but measures a different physical quality. These units are obtained by changing the exponent of the unit in the dimensional equation. For instance, this is how square meters are derived from meters. This exponent is represented in MUO with the datatype property muo:dimensionalSize:

:square-meter rdf:type muo:SimpleDerivedUnit ;
     muo:derivesFrom       ucum:meter ;
     muo:dimensionalSize   "2"^^xsd:float.

Combining these two patterns, we can represent units that are obtained from a prefix and that have a dimensional size different from 1, for instance, the unit square kilometer:

:square-kilometer rdf:type muo:SimpleDerivedUnit ;
     muo:derivesFrom     ucum:meter ;
     muo:modifierPrefix  ucum:kilo;
     muo:dimensionalSize "2"^^xsd:float.
  • Complex Derived Units. Units that are derived from two or more measurement units (i.e. a derived unit which is defined by means of more than one unit in its dimensional equation). For instance, the complex derived unit meter per second squared is defined by the dimensional equation \frac{m}{s^2} with two units: m and s^2. In MUO, the representation of this unit is the following (note that we need to define an auxiliary unit):
:metre-per-second-squared rdf:type muo:ComplexDerivedUnit ;
      muo:derivesFrom  ucum:meter ;
      muo:derivesFrom  :second-squared.
          
:second-squared rdf:type muo:SimpleDerivedUnit ;
      muo:derivesFrom     ucum:second ;
      muo:dimensionalSize "2"^^xsd:float.
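For simple derived units, the prefix and the dimensional size together determine how many base-power units one derived unit is worth. The Python sketch below works through that arithmetic under the assumption that the prefix is applied before the exponent, as in the usual reading of "square kilometer" as (km)^2. The function name and the `KILO` constant are hypothetical, and this numeric interpretation is not something the ontology itself encodes.

```python
KILO = 1000.0  # decimal factor of the "kilo" prefix


def factor_to_base_power(prefix_factor=1.0, dimensional_size=1.0):
    """Scale of a simple derived unit relative to its base unit raised to
    'dimensional_size', assuming the prefix is applied before the exponent."""
    return prefix_factor ** dimensional_size


kilometer_in_meters = factor_to_base_power(prefix_factor=KILO)
square_meter_in_square_meters = factor_to_base_power(dimensional_size=2.0)
square_kilometer_in_square_meters = factor_to_base_power(KILO, 2.0)
```

Under this assumption one kilometer is 1000 meters and one square kilometer is 1,000,000 square meters. Complex derived units are deliberately left out, because muo:derivesFrom alone does not say whether a component unit appears in the numerator or the denominator of the dimensional equation.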

Representing Measurements in MUO

In the Delivery Context ontology, it seems reasonable to simplify the way to represent the values of properties of mobile devices or other entities involved. In this subsection, we will take the complete approach of DOLCE for representing physical qualities and adapt it to the semantic pattern of Pattern A. This simplification will be made using the EQlS (Endurants-Qualia-Spaces) system developed by Masolo and Borgo in their ontological framework for the logical treatment of qualities and qualia [1].

Physical qualities are in one-to-one correspondence with quale kinds, e.g. the weight of my device [an individual physical quality related with the dimension of weight] is associated with exactly one value measured with a particular unit [a quale located in a specific position of a structured space]. Our proposal is to leave out quality entities in MUO without losing expressive power of our model by:

  • Introducing a new temporalized property between endurants and qualia: muo:qualityValue, which denotes the quale of a quality at time t. Qualities are mapped to qualia only when they are present, and if they are present, they are necessarily mapped to qualia. Given a time t, a quality is mapped to only one quale (or quality value). In this sense, the muo:qualityValue property reifies the relationships between an endurant and its individual qualities at a time t (dol:has-quality) and the values of these qualities at time t (dol:has-quale). We can now define, for an endurant e, the qualities of kind Q_i as pairs \langle e, Q^{\mid e \mid}_i \rangle, where Q^{\mid e \mid}_i is the set of quality values of kind muo:QualityValue that are associated to e:
    • Formal definition of the set of quality values of a specific endurant: Q^{\mid e \mid}_i = \{q \mid QualityValue_i(q) \wedge \exists t \, (muo:qualityValue(e,q,t))\}
  • Maintaining the measured relation introduced in the subsection on measurement units to express how a quality value (quale) is measured with a particular unit. From a technical point of view, a quality value is associated to exactly one measurement space, i.e. the combination of a measurement unit and a numerical value. These spaces provide a clear structure to compare and evaluate qualia and, in consequence, their corresponding individual qualities.

In conclusion, the muo:qualityValue property can be used to represent physical quality values of endurants (devices in the context of the DC ontology and the MYMW project). Moreover, we can refine MUO ontology if needed in our ontologies or semantic models, allowing the creation of subproperties of muo:qualityValue referring to specific physical qualities such as length, mass, electric charge.

Alignment of MUO concepts to DOLCE

DC updating policies and quality values

The relationship between a quality and its associated quale is time-dependent (e.g. the weight of my mobile phone [quality at time t] is a particular weight value [quale]). Some qualities, such as the length of the display, are static: they do not change over time and can be considered constants. Other qualities, such as the energy level of the battery, are dynamic, and we have considered them as discrete or continuous variables that change through time. This issue is important because the DC model will be queried to obtain the current values of some of a device's properties (e.g. what is the energy level of the battery?), for appropriate content adaptations or for making decisions or suggestions based on user preferences. However, with our approach this information is trivially implicit in the semantic model, as the quality values of entities will be updated according to the DC updating policies. Therefore it is not necessary to use a temporal extension for our Description Logics encoding of the EQlS system. OWL is still rich enough for these representational aspects.

Usage of MUO

The aim of this section is to provide the reader some guidelines to use MUO. This section contains guidance for three tasks:

  1. Defining a new unit. There may be some cases when you want to use a new unit that hasn't been defined yet.
  2. Applying the MUO framework to the design of an ontology. This task is only useful for people who design ontologies.
  3. Applying the MUO framework to represent a measurable quantity. This is the most common use case, where you are using an ontology defined by other people, and you just need to represent a measurable quantity.

How to define new units in MUO

Follow these steps:

  • Currently, MUO has a lot of unit instances already defined in the UCUM namespace. First of all, make sure that the unit you need is not already defined in UCUM. Check the documentation to search for the unit you are looking for.
  • Please notice that we are considering some extensions to the UCUM unit set. If the unit you want to add is included in this set of potential extensions, please let us know.
  • If you finally need to create your own measurement unit, you have to decide which class it belongs to. In MUO, there are three classes of measurement units: Base Units, Simple Derived Units and Complex Derived Units. Use these guidelines to decide which one fits you best:
    1. Use a base unit if you cannot re-use any other unit in the definition of your new unit.
    2. Use a simple derived unit if you can define your new unit in terms of exactly one unit. For instance, this applies to the derivation of a new unit by using a prefix on an existing unit (e.g. kilometers from meters), or to the derivation of a new unit by changing the dimensional exponent of a base unit (e.g. square meters from meters).
    3. Use a complex derived unit if your unit is defined in terms of more than one unit (e.g.: \frac{m}{s^2}).
  • Decide the URI you want to use to identify your unit. Remember that you should define your URI in a namespace owned by you (please do not use the MUO namespace!).
  • Declare your unit as an instance of the appropriate class, chosen with the guidelines above. For instance, we are going to define a base unit of mass called "Akebono". This (fictional) unit is named after Akebono Taro.
  <http://example.org/Akebono> rdf:type muo:BaseUnit ;
      rdfs:label "Akebono"@en .
  • We recommend adding a new statement to express the relationship between the unit and the physical quality it measures. For instance:
  <http://example.org/Akebono> muo:measuresQuality ucum:mass .
  • Note that we do not have the means to describe the conversion formulae between different units for the same physical quality. For instance, we know that an "Akebono" is roughly 235 kilograms, but we cannot capture this information in the definition of the unit in OWL.
  • No more work is needed for base units. However, for derived units, you should capture the relationship between your new unit and the unit(s) it is derived from. The property muo:derivesFrom is used for this purpose. For instance, we are going to define a simple derived unit for area. There is a base unit called "hand" which is used to measure the height of a horse. We define a simple derived unit for areas as follows:
  <http://example.org/Square_hand> rdf:type muo:SimpleDerivedUnit ;
     rdfs:label           "Square hand"@en ;
     muo:derivesFrom      ucum:Hand ;
     muo:dimensionalSize  "2"^^xsd:float ;
     muo:measuresQuality  ucum:area .
  • As a final example, we will define a complex derived unit. Acceleration can be measured in meters per second squared (m/s^2). To define a new complex derived unit we just point to the units that are part of its dimensional equation, as follows:
  <http://example.org/Meters_per_second_squared> rdf:type muo:ComplexDerivedUnit ;
     rdfs:label           "Meters per second squared"@en ;
     muo:derivesFrom      muo:meter ;
     muo:derivesFrom      <http://example.org/Second_squared> ;
     muo:measuresQuality  ucum:acceleration .
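The choice between the three classes depends only on how many existing units the new unit is defined from. A minimal Python sketch of that rule follows; the helper name choose_unit_class is hypothetical and is not part of MUO:

```python
def choose_unit_class(derives_from):
    """Return the MUO class name for a new unit.

    `derives_from` lists the units (as prefixed names or URIs) that appear
    in the definition of the new unit. Illustrative helper, not a MUO API.
    """
    distinct_units = set(derives_from)
    if not distinct_units:
        return "muo:BaseUnit"            # no other unit is re-used
    if len(distinct_units) == 1:
        return "muo:SimpleDerivedUnit"   # exactly one unit (prefix or exponent)
    return "muo:ComplexDerivedUnit"      # more than one unit


print(choose_unit_class([]))                      # muo:BaseUnit
print(choose_unit_class(["ucum:Hand"]))           # muo:SimpleDerivedUnit
print(choose_unit_class(["muo:meter",
                         "http://example.org/Second_squared"]))  # muo:ComplexDerivedUnit
```

The three calls correspond to the "Akebono", "Square hand" and "Meters per second squared" examples above.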

How to design your ontology to use MUO

Here is some advice on how to design your ontology to take advantage of MUO:

  • Wherever possible, use the URIs of MUO (notably, the URIs of the instances provided with MUO) to refer to units of measure and physical qualities.
    • For instance, if you define a datatype property to measure the weight of a person, and you want to say that the weight must be described in kilograms, then you can include an annotation property to say so. Example:
  :weightOfAPerson rdf:type owl:DatatypeProperty ;
     rdfs:label          "Weight of a person (in Kg)"@en ;
     muo:preferredUnit   ucum:Kilogram ;
     rdfs:domain         foaf:Person ;
     rdfs:range          xsd:float .
  • If you want the users of your ontology to be able to enter the values of a measurable property in different units, then you have to define a relational (object) property (more details can be found in MUO's design section). We strongly recommend making this property a subproperty of muo:qualityValue, although this is not a requirement. Note that if you define your property as a subproperty of muo:qualityValue, your property will have muo:QualityValue as its range. Example:
   :weightValue rdfs:subPropertyOf muo:qualityValue ;
      rdfs:label     "Weight of a person"@en ;
      rdfs:domain    foaf:Person .
  • Sometimes you need properties to describe the value of a quality for different kinds of objects (for instance, the weight of a Person, of a Device and of a Car). Our recommendation is to use the same property for all these objects, because the measurable dimension is the same quality (weight). If you want to specify the domain of this property, you can use the union of these classes, or a common superclass.
  • If you want to combine the simplicity of a datatype property with a fixed unit and the flexibility of an object property that allows the value to be expressed in other units, you can define both properties. We do not provide all the details here; please take a look at the MyMobileWeb delivery context ontology for an example of how to apply this pattern.

How to represent the value of a measurable property using MUO

In this section, we describe the scenario in which you want to assert the measurable value of a certain property for a resource. For instance, you want to indicate that the area of Spain is 504,782 km^2. We assume that the property (#area) has already been defined by the ontology designer, and that the unit of measurement (square kilometers, #Sq_Km) has also been defined.

The MUO framework requires an auxiliary resource. In this example, a blank node is used, but a named resource could have been used as well.

The domain resource (in this case, the resource that represents Spain, #Spain) must be linked to the auxiliary resource using the domain property (in this case, #area). Then, two properties must be set for the auxiliary resource. The first one is the numerical value of the measure. Although not required, it is highly recommended that the literal value is typed with the xsd:double XML Schema datatype. The second one is a link between the auxiliary resource and the unit of measurement. In N3 syntax:

#Spain #area [
           muo:numericalValue "504782"^^xsd:double ;
           muo:measuredIn #Sq_Km
           ] .

The same RDF triples can be viewed as a graph. Note that, in addition to the 3 triples from the example above, some additional links are visible in the image:

Using MUO to assert the area of Spain

Of course, this information can be represented with more than one unit of measurement. For instance:

#Spain #area [
           muo:numericalValue "504782"^^xsd:double ;
           muo:measuredIn #Sq_Km
           ] ;
       #area [
           muo:numericalValue "194897"^^xsd:double ;
           muo:measuredIn #Sq_Miles
           ] .
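The pattern above always produces three triples per measurement: one from the domain resource to the auxiliary resource, and two describing the auxiliary resource. As an illustration only, the following Python sketch generates those triples as plain tuples; the function name and the blank-node labelling scheme are assumptions made for this example, and no RDF library is involved:

```python
import itertools

# Counter used to give each auxiliary resource a fresh blank-node label.
_blank_node_ids = itertools.count(1)


def quality_value_triples(subject, prop, value, unit):
    """Return the three triples that attach one measured value to `subject`."""
    node = f"_:qv{next(_blank_node_ids)}"      # auxiliary resource (blank node)
    literal = f'"{value}"^^xsd:double'         # typed literal, as recommended
    return [
        (subject, prop, node),
        (node, "muo:numericalValue", literal),
        (node, "muo:measuredIn", unit),
    ]


# The area of Spain expressed in two different units.
triples = quality_value_triples("#Spain", "#area", 504782, "#Sq_Km")
triples += quality_value_triples("#Spain", "#area", 194897, "#Sq_Miles")
for s, p, o in triples:
    print(s, p, o, ".")
```

Each call creates its own auxiliary resource, so the two measurements never share a numerical value or a unit.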

How to use MUO from Protégé Editor

OWL supports the sharing and reuse of ontologies by making it possible for one ontology to import another ontology. When an ontology imports another ontology, all of the class, property and individual definitions that are in the imported ontology are available for use in the importing ontology. We provide here some guidelines to use the MUO ontology with any other ontology A that needs to be extended with measurement units. In particular, we explain how the owl:imports mechanism can be applied with the Protégé Editor.

  1. Importing MUO: Open the ontology A with the Protégé Editor and go to the Ontology Browser widget in the "Metadata tab". Then press the button "import ontology". Now specify where the MUO ontology will be obtained from. As MUO is already published on the web, it is enough to write its URI: http://purl.oclc.org/NET/muo/muo-vocab.owl. Protégé retrieves it from the Web and imports all the axioms of MUO into the ontology A. You are ready to use MUO with ontology A.
  2. Visual configuration of Protégé: When importing an ontology with Protégé, some minor adjustments have to be made manually to enable the full exploitation of the ontology from the user interface. Basically, the problem is that there is no widget in the Individual Editor ("Individual Tab") that allows the user to apply the properties of MUO. These widgets need to be activated. Protégé supports customizing the forms that the user sees for class creation, etc. So, you can navigate to the "Forms tab", where you can easily replace or remove widgets you don't want to appear, and activate the widgets for MUO properties.
  3. Remove import: It is possible to remove the import of the MUO ontology, if necessary. Navigate to the "Metadata tab" again and press the button "remove the import". After this operation, MUO is no longer imported by ontology A. However, all the axioms built upon the vocabulary of MUO (usually instance definitions, such as "the height of an individual") are part of ontology A, and therefore they are not removed when this operation is performed.

More information about how to use the Protégé Editor can be found at http://protege.stanford.edu/.

MUO Basic Instances: UCUM

In order to make references to units of measure, we need to assign a URI to each one. Note that the symbols of the units (such as "m" for meters or "N" for newtons) are not suitable for the following reasons: 1) not every unit has a symbol; 2) some symbols are difficult to represent in ASCII; 3) symbols may vary between languages.

Fortunately, there are standardized codes for the units of measure. The one that seems most appropriate is UCUM, which consolidates other partial codes, such as ISO 1000, ISO 2955-1983 and others.

UCUM doesn't provide URIs for the units, only a normalized code. However, it is easy to create URIs: just add a prefix to the UCUM code. Although UCUM codes are 7-bit ASCII strings, they may contain characters that are problematic in URIs, such as the slash character (as in "m/s") or the percent sign.
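To illustrate why such characters need care, here is a minimal Python sketch that builds a URI from a UCUM code by percent-encoding it. The prefix is a hypothetical placeholder, and this only illustrates the "prefix plus code" idea; it is not necessarily the scheme used for the published MUO instances:

```python
from urllib.parse import quote, unquote

# Hypothetical namespace prefix, chosen only for this example.
PREFIX = "http://example.org/ucum-code/"


def uri_for_ucum_code(code):
    """Build a URI from a UCUM code, escaping every reserved character."""
    # safe="" makes quote() escape "/" as well, which it leaves alone by default.
    return PREFIX + quote(code, safe="")


print(uri_for_ucum_code("m/s"))  # http://example.org/ucum-code/m%2Fs
print(uri_for_ucum_code("%"))    # http://example.org/ucum-code/%25

# The original code can be recovered from the last path segment.
print(unquote("m%2Fs"))          # m/s
```

Without the escaping, "m/s" would introduce an extra path segment and a bare "%" would produce an invalid URI.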

We provide a number of instances. By doing so, we expect to coin shared and re-usable URIs for the most commonly used units of measurement, unit prefixes and physical qualities. A sample of the automatically extracted instances:

In addition to these instances, we also extracted some subclasses of UnitOfMeasurement from UCUM. They are derived from different classifications of the units in the UCUM listings. Please note that they are not disjoint. However, in the following we will focus on the instances.

The instances have been automatically created from the XML distribution of UCUM using an XSLT 2.0 stylesheet (both files are available in the source repository). To apply the stylesheet, an XSLT 2.0 processor is required. Recent releases of Saxon support XSLT 2.0. Sample usage:

$ java -jar /usr/local/saxon/saxon9.jar \
       -s:ucum-essence.xml \
       -xsl:ucum-to-ontology.xsl \
       -o:ucum-instances.owl

The command above will read the file "ucum-essence.xml", which contains a distribution of UCUM in XML, and will produce the file "ucum-instances.owl", which contains definitions for the instances of UnitOfMeasurement, PhysicalQuality and Prefix.

Namespace and URI design decisions

  • URIs have been normalized: spaces have been replaced with underscores, and some special characters, such as "º" or "'" have been removed or replaced. Even if non-ASCII characters can be used in URIs (this is perfectly consistent with the W3C recommendations, read more on IRIs), we have decided to replace them with similar characters.
  • Physical qualities, units of measurement and prefixes are kept in different namespaces:
  • To coin a URI for each unit of measurement, two components have been concatenated: first, the name of the physical quality being measured; second, the actual name of the unit of measurement. Note that just using the name of the unit is not enough, as there are some units with the same name for different kinds of properties (e.g., "seconds" is a unit to measure time *and* angles; therefore, we have angle-seconds and time-seconds). Unfortunately, there are some (different) units with the same name for the same kind of property. A notable example is "foot": there is the American one and the British one. In these cases, a number is appended to the URI.
  • It has been suggested that the URIs for the units of measurement could also be derived from the code of the unit in UCUM. Although this can lead to shorter URIs for some units, we opted for the name of the unit for the sake of readability of the URIs, and to avoid as much as possible the need to URL-escape some characters.
  • A slash-namespace is used for the instances. This makes it possible for semantic web agents to retrieve the definitions of each resource (unit, kind of property) individually, i.e., to avoid the burden of retrieving the definition of all the 270 units when the application is interested in just one of them. As a drawback, a slightly more complex Apache configuration is required to publish the ontology (more information in the Recipes document from the W3C).
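The naming rules above (normalize the names, concatenate quality and unit, append a number on collisions) can be sketched in Python as follows. This is only an approximation under stated assumptions: the hyphen separator, the numbering that starts at 2 and the function names are hypothetical, and the actual XSLT stylesheet may behave differently:

```python
import re
import unicodedata


def normalize_name(name):
    """Keep ASCII only, drop apostrophes, replace whitespace with underscores."""
    # Decompose accented characters, then drop everything that is not ASCII.
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_name = decomposed.encode("ascii", "ignore").decode("ascii")
    ascii_name = ascii_name.replace("'", "")
    return re.sub(r"\s+", "_", ascii_name.strip())


def coin_local_name(quality, unit, taken):
    """Concatenate quality and unit names; append a number if already taken."""
    base = f"{normalize_name(quality)}-{normalize_name(unit)}"  # separator is an assumption
    candidate, n = base, 1
    while candidate in taken:
        n += 1
        candidate = f"{base}{n}"
    taken.add(candidate)
    return candidate


taken = set()
print(coin_local_name("time", "second", taken))         # time-second
print(coin_local_name("plane angle", "second", taken))  # plane_angle-second
print(coin_local_name("length", "foot", taken))         # length-foot
print(coin_local_name("length", "foot", taken))         # length-foot2
```

Prefixing the unit name with the quality keeps the two "second" units apart, while the numeric suffix handles the rarer case of two units sharing both quality and name.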

Other design decisions

  • Two different datatypes are used for symbols. Most of them are just plain strings ("m", "ft"...). However, some symbols are rdf:XMLLiterals, because they contain some markup ("m<sup>2</sup>"). Note that the tags ("sup", "sub") belong to the default namespace, although it could be argued that they should belong to the XHTML namespace.

MUO Services

(FIXME: add a description of the MUO services here)

See our demo: MUO Services Mashup Demo

Other issues

Proposed additions to the units of measurement ontology

Here is a list of the proposed additions to the MUO ontology. Some items are further discussed in the sections below.

  • New dimensions:
    • Resolution
    • Color
  • New units:
    • Pixel
    • RGB/CMYK/...

Proposed addition: Pixel as unit of measurement

This is a very controversial issue. In order to organize the discussion, the problem has been split into several questions. Arguments are provided for each question. Note that a negative answer to the first question makes the other questions irrelevant.

  • Are pixels a unit of measurement?
    • Yes! When the size of an image, or the resolution of a screen, is given in pixels, some measurable quality of the image (or the screen) is being described.
    • No! Counting pixels isn't different from counting apples (or whatever), and no one would say that apples are a unit of measure. The size in pixels of an image is just a number. Moreover, if pixels were a unit of measurement, there should exist a "pattern pixel", i.e., a pixel which is taken as the pattern.
  • Are pixels a unit of measurement for length?
    • Yes! (add here Nacho's argumentation)
    • No! Every unit of measurement of a dimension should be convertible to others using a fixed formula. For instance, meters can be transformed to miles, inches and feet using a well-known formula that is valid in every context. However, pixels cannot be converted to meters with a fixed formula. Therefore, they aren't a unit of length. They are measuring another dimension. Moreover, if pixels were a measure for length, it would be possible to answer this question: what is the distance between the Moon and the Earth in pixels? As there isn't any answer for this question, pixels are not a unit for length.
  • Are pixels a unit of measurement for a dimension other than length?
    • Yes! Pixels are measuring a measurable quality of images or screens, but not their length. A given screen, for instance, has a certain width in centimeters, and a certain (X) in pixels. It is not clear which name (X) should have (maybe "resolution" [2] [3]?), but in any case it is not length. Length and (X) can be transformed into each other using a conversion ratio that depends on the context. The same holds for mass and volume, which are different dimensions but may be converted into each other with a ratio that depends on the context (in this case, the "density" of the substance).
    • No! See the "pros" of the previous/next question.
  • Are pixels a unit of measurement without dimension?
    • Yes! Pixels are an abstract/logical unit, such as percentage, and thus, they have no dimension.
    • No! See the "pros" for the two questions above. Pixels are either a measure of length or other dimension, but in any case, definitely there is a dimension that is being measured.

Limitations

  • Conversion formulae between measurement units are not included in the current version of the ontology, although they are present in the original source (the XML distribution of UCUM). We are still studying the best way to include this information in the ontology. However, this is a low-priority task for us, because we focus on providing a semantic framework and a set of instances, and not on the computational aspects.
  • The multiplier factors of the unit prefixes (mega, kilo, centi, milli...) are captured as string values ("1e6", "1e3", "1e-2", "1e-3"...). We are considering converting these values into valid xsd:decimal values (note that scientific notation is not allowed in xsd:decimal values).
  • The names (rdfs:label) of the units are only available in English. Translations are welcome.
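Regarding the prefix multipliers, converting the current string values into a lexical form without an exponent is straightforward. A minimal Python sketch, in which the function name is hypothetical:

```python
from decimal import Decimal


def multiplier_to_xsd_decimal(text):
    """Rewrite a multiplier such as "1e-3" in plain decimal notation."""
    # Decimal parses scientific notation exactly; the "f" format
    # prints the value in fixed-point notation, without an exponent.
    return format(Decimal(text), "f")


for factor in ["1e6", "1e3", "1e-2", "1e-3"]:
    print(factor, "->", multiplier_to_xsd_decimal(factor))
# 1e6 -> 1000000
# 1e3 -> 1000
# 1e-2 -> 0.01
# 1e-3 -> 0.001
```

Using Decimal instead of float avoids rounding artefacts for very small prefixes.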

Conversion between units of measurement

In the context of MUO and the MyMobileWeb project, we are modeling units of measurement and measurable quantities in a formal way using ontologies.

Unfortunately, OWL does not provide support to model, in an operative way, the mathematical functions that are required to describe the conversions between different units of measurement. Therefore, we have chosen to avoid any mention of conversion functions in MUO.

This wiki page considers some alternatives to implement conversions between units of measurement in an operative way.

External web service

An external web service can be defined with a standard interface to provide a service specialized in conversions between units of measurement. Essentially, the web service interface would have three input parameters and one output:

  • Input:
    • Source unit of measurement.
    • Source numerical value.
    • Desired output unit of measurement.
  • Output:
    • Numerical value of the conversion.

The numerical values would be floating-point values. The source and destination units of measurement may be identified by their URIs.

Note that several public services on the web already provide this functionality (although their interface may be different from the one described above). For instance, Google can convert between a number of units of measurement (see example).
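The core of such a service, with three inputs and one output, can be sketched as a single Python function. The unit URIs and the lookup table below are hypothetical placeholders, and the sketch only covers units related by a constant factor (it would not handle, for example, temperature scales with an offset):

```python
# Each unit URI maps to (physical quality, factor to the reference unit).
# The URIs are placeholders invented for this example.
FACTORS = {
    "http://example.org/unit/kilogram": ("mass", 1.0),
    "http://example.org/unit/pound": ("mass", 0.45359237),
    "http://example.org/unit/meter": ("length", 1.0),
    "http://example.org/unit/foot": ("length", 0.3048),
}


def convert(source_unit, value, target_unit):
    """Convert `value` from `source_unit` to `target_unit` (both given as URIs)."""
    source_quality, source_factor = FACTORS[source_unit]
    target_quality, target_factor = FACTORS[target_unit]
    if source_quality != target_quality:
        raise ValueError(
            f"cannot convert {source_quality} into {target_quality}")
    # Go through the reference unit of the shared physical quality.
    return value * source_factor / target_factor


kilograms = convert("http://example.org/unit/pound", 10.0,
                    "http://example.org/unit/kilogram")
print(kilograms)
```

Identifying units by URI lets the service reject conversions between different physical qualities instead of silently returning a meaningless number.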

Rule language

Some rule languages have the expressive power to model and to operate with mathematical functions. Therefore, they can perform conversions between units of measurement. However, a complete rule engine and a set of rules built on top of the MUO ontology may be overkill for many applications.

Query language (extended SPARQL)

Another option is to perform the conversions when required by a query to the model. Unfortunately, the current version of the standard query language for RDF (SPARQL) does not provide enough expressive power to write such queries (notably, functions are not permitted in the projection clause). However, there are some proposed extensions to SPARQL that can be used to obtain the desired expressivity in the query.

An example query follows:

CONSTRUCT {
  ?person :weightInKilograms (?z*0.45359237)
}
WHERE {
  ?person rdf:type foaf:Person .
  ?person :weightInPounds ?z
}

We are evaluating different proposed extensions to SPARQL to see if they can provide the necessary expressivity.

Future work

In the future, we would like to link these instances to their definitions in Wikipedia/DBpedia. For instance, we would like to add links such as:

ucum:Meter owl:sameAs dbpedia:Meter .

In this way, we would be able to obtain multilingual labels and definitions for our instances, and we would align the MUO ontology with the Linking Open Data initiative.

Another item in our "to-do" list is to provide high-quality human-readable documentation for the ontology.

Slides and public presentations

  • A brief presentation of the MUO ontology, and how to use it in the context of the MyMobileWeb project.
  • Slides used in the presentation at the SWIG (Semantic Web Interest Group) f2f meeting at Mandelieu (France), Oct 21st 2008.


References