Thursday, December 10, 2009

XML Schema: Element References

Consider an XML-based language consisting of elements (here: nodes) and references to elements (here: nodeRefs). In the following I want to discuss whether it is possible to create an XML schema for such a language which

  • allows reference dependencies to be expressed
  • has a base type for an arbitrary number of node types; a node reference is a node too!
  • is aware of referential integrity

Here is an example of an XML document based on a language which consists of two different kinds of nodes and a node reference:
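A document of this shape could look like the following sketch. The element names root, nodeA, nodeB and nodeRef, as well as the id values, are assumptions chosen for illustration only.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical instance: two kinds of nodes plus references to them -->
<root>
  <nodeA id="a1">
    <!-- nodes may contain other nodes -->
    <nodeB id="b1"/>
    <!-- a reference is a node too, but it carries ref instead of id -->
    <nodeRef ref="b2"/>
  </nodeA>
  <nodeB id="b2">
    <nodeRef ref="a1"/>
  </nodeB>
</root>
```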

A node either has a unique id (and can therefore be referenced) or a reference to an id. This property can be expressed with an attributeGroup:
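A minimal sketch of such an attributeGroup, assuming the xs prefix is bound to http://www.w3.org/2001/XMLSchema. The group name nodeAttributes and the type xs:NCName are assumptions. The group only declares both attributes as optional; which of the two a given node may actually carry has to be decided by the types that use the group.

```xml
<!-- Fragment of a schema; names and types are illustrative -->
<xs:attributeGroup name="nodeAttributes">
  <!-- optional: a node without an id simply cannot be referenced -->
  <xs:attribute name="id" type="xs:NCName" use="optional"/>
  <!-- optional: only node references are meant to carry a ref -->
  <xs:attribute name="ref" type="xs:NCName" use="optional"/>
</xs:attributeGroup>
```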

The type of id is not xs:ID because it must be possible to omit the id. In this case the node cannot be referenced. The second reason is that references are nodes too. The type xs:ID would force all nodes to have an id, even node references.

In general there are multiple types of nodes which can be referenced. The reference itself is a node too. So we need a common base type for node definitions and node references: an abstract node. In this example a node can have child nodes.
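One way to sketch this base type is an abstract complex type together with an abstract element that serves as the head of a substitution group. The names node and nodeAttributes are assumptions, the schema is assumed to have no target namespace, and xs is assumed to be bound to the XML Schema namespace.

```xml
<!-- Abstract base type shared by node definitions and node references -->
<xs:complexType name="node" abstract="true">
  <xs:sequence>
    <!-- child nodes: any member of the "node" substitution group -->
    <xs:element ref="node" minOccurs="0" maxOccurs="unbounded"/>
  </xs:sequence>
  <!-- assumed attribute group declaring the optional id and ref -->
  <xs:attributeGroup ref="nodeAttributes"/>
</xs:complexType>

<!-- Abstract head element: it can never appear in a document itself -->
<xs:element name="node" type="node" abstract="true"/>
```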

The node definition optionally has a unique id which enables it to be referenced. The ref attribute is prohibited. Because the node definition type is a restriction, the child element has to be declared again! The node definition is abstract because there are several concrete types of node definitions.
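As a sketch, the node definition could be derived by restriction as follows. The type name nodeDef is an assumption, and the base type node is assumed to declare an optional id, an optional ref and a repeatable child element node.

```xml
<!-- Abstract node definition: keeps the inherited id, forbids ref -->
<xs:complexType name="nodeDef" abstract="true">
  <xs:complexContent>
    <xs:restriction base="node">
      <!-- a restriction has to repeat the content model it keeps -->
      <xs:sequence>
        <xs:element ref="node" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
      <!-- attributes are inherited unless restricted; ref is removed -->
      <xs:attribute name="ref" use="prohibited"/>
    </xs:restriction>
  </xs:complexContent>
</xs:complexType>
```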

In this example there are two different types of concrete node definitions:
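A minimal sketch of two concrete node definitions, assuming an abstract base type nodeDef and an abstract head element node. The names nodeA and nodeB are placeholders, and both types are left empty here; a real language would presumably add its own attributes or content.

```xml
<!-- First concrete node definition -->
<xs:complexType name="nodeA">
  <xs:complexContent>
    <xs:extension base="nodeDef"/>
  </xs:complexContent>
</xs:complexType>
<xs:element name="nodeA" type="nodeA" substitutionGroup="node"/>

<!-- Second concrete node definition -->
<xs:complexType name="nodeB">
  <xs:complexContent>
    <xs:extension base="nodeDef"/>
  </xs:complexContent>
</xs:complexType>
<xs:element name="nodeB" type="nodeB" substitutionGroup="node"/>
```

Because both elements join the substitution group of node, they may appear wherever the abstract node element is referenced.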

A node reference has no child elements! The id attribute is omitted because node references cannot be referenced. Every reference therefore points directly to a concrete node (no need for transitivity).
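The node reference could be sketched as a second restriction of the abstract base type. The name nodeRef is an assumption, and the base type node is assumed to have an optional id, an optional ref and only optional child elements, so that an empty content model is a valid restriction.

```xml
<!-- Node reference: no children, no id, mandatory ref -->
<xs:complexType name="nodeRef">
  <xs:complexContent>
    <xs:restriction base="node">
      <!-- no content model is repeated, so the element must be empty -->
      <xs:attribute name="id" use="prohibited"/>
      <xs:attribute name="ref" type="xs:NCName" use="required"/>
    </xs:restriction>
  </xs:complexContent>
</xs:complexType>
<xs:element name="nodeRef" type="nodeRef" substitutionGroup="node"/>
```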

The root element encapsulates an arbitrary number of nodes and defines the referential integrity constraints.
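A sketch of such a root element, assuming the hypothetical element names root, nodeA, nodeB and nodeRef and a schema without a target namespace. It uses xs:key, which requires every selected element to carry the field, so under this constraint the id is effectively mandatory on nodeA and nodeB. Selector expressions in XML Schema 1.0 do not support predicates, so the selection cannot be limited to nodes that have an id.

```xml
<xs:element name="root">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="node" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
  <!-- every id of a concrete node must be present and unique -->
  <xs:key name="nodeKey">
    <xs:selector xpath=".//nodeA | .//nodeB"/>
    <xs:field xpath="@id"/>
  </xs:key>
  <!-- every ref must match the id of some concrete node -->
  <xs:keyref name="nodeRefKey" refer="nodeKey">
    <xs:selector xpath=".//nodeRef"/>
    <xs:field xpath="@ref"/>
  </xs:keyref>
</xs:element>
```

Since the key selects only nodeA and nodeB, a reference can never resolve to another reference.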

There is one flaw: as far as I can tell, XML Schema does not allow a keyref whose referenced constraint is a unique constraint instead of a key. I don't understand that! It would make sense to allow the specification of attributes which have to be unique within a document only for those elements on which they are actually present. @W3C: What is the reason for that restriction!?

Here is the complete example.xsd:

Of course there are several other ways to model an XML schema for such a language. Here is another one (which doesn't match the starting example.xml exactly but shows a different approach with choices):
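A choice-based variant could be sketched roughly like this. It is not the original listing; the type name nodeContainer and the element names are assumptions, and xs is assumed to be bound to the XML Schema namespace.

```xml
<!-- One shared type; the allowed children are listed in a choice -->
<xs:complexType name="nodeContainer">
  <xs:choice minOccurs="0" maxOccurs="unbounded">
    <xs:element name="nodeA" type="nodeContainer"/>
    <xs:element name="nodeB" type="nodeContainer"/>
    <!-- references are empty elements with a mandatory ref -->
    <xs:element name="nodeRef">
      <xs:complexType>
        <xs:attribute name="ref" type="xs:NCName" use="required"/>
      </xs:complexType>
    </xs:element>
  </xs:choice>
  <xs:attribute name="id" type="xs:NCName" use="optional"/>
</xs:complexType>
```

This variant needs neither abstract types nor substitution groups, but every new kind of node has to be added to the choice by hand.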

If you use JAXB to unmarshal an XML file (based on an XSD), I discourage the use of choice elements. They are mapped to multiple Java instance variables which are null if not used. Instead I recommend using abstract elements, which are mapped to abstract classes. This is a more OO-like way.
