ERPGenie.COM: Enterprise Resource Planning Portal

Extensible Markup Language (XML)

XML is a subset of SGML (Standard Generalized Markup Language), which is defined in ISO 8879:1986. XML is formally and historically associated with SGML. It was originally developed by the SGML Editorial Review Board, which later became the XML Working Group (XML WG) organized by the World Wide Web Consortium (W3C). The XML WG receives input from the XML SIG (Special Interest Group). Since February 10, 1998, the XML 1.0 specification has been a published W3C Recommendation; the W3C is the main standards organization for the Web.

Details

XML (Extensible Markup Language) is a language used to describe a class of data objects, the XML documents. XML also partially describes the behavior of programs processing them.

XML is a subset of the Standard Generalized Markup Language (SGML) that is designed to make it easy to interchange structured documents over the Internet.

XML may become the web markup language of the future, replacing the current HTML standard. This is mainly due to two XML features:

  • In contrast to HTML, XML just provides document structure. The visual formatting takes place independently. Formatting languages like CSS (Cascading Style Sheets) or XSL (Extensible Stylesheet Language) may be used together with XML for display on a client's browser screen. XSL itself is an XML application.
  • XML enables the definition of an application-specific markup language. By doing so, documents using a particular markup can be classified according to their semantic contents. This could be used, for instance, to enable more advanced searches over semantically classified documents.

These advantages make it the language of choice for e-Business.

Application Areas

As XML supports a general data model, the breadth of possible applications seems unrestricted. However, in combination with the Internet, a number of particularly suitable application types may be emphasized.

Any data structures can be represented (in a textual format) in XML documents. This also holds for nested data structures. For interchange, the data is stored in an XML document. Two types of electronic data interchange can exist:

  • Application-specific EDI. This is supported by the XML concept of document types. The application needs to know the specific document type.
  • General data exchange. A general data processor can process the description of the document type of the document containing the data. Thus, it knows how to analyze the data in the document. A typical way to apply this would be a transfer of database tables and their contents. The table schema can be mapped to a document type.
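
The table transfer mentioned in the last bullet can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table name PRODUCTS, the ROW element and the column names are hypothetical, and the rows simply reuse the product data of Figure 1.

```python
import xml.etree.ElementTree as ET

def table_to_xml(table_name, columns, rows):
    """Serialize a table as XML: one ROW element per row, one child per column."""
    root = ET.Element(table_name)
    for row in rows:
        row_el = ET.SubElement(root, "ROW")
        for column, value in zip(columns, row):
            ET.SubElement(row_el, column).text = str(value)
    return ET.tostring(root, encoding="unicode")

# Hypothetical table schema and contents (assumption, not from a real database).
columns = ["PRODNUM", "PRODUCTNAME", "PRICE"]
rows = [
    ("24336-5", "Solar Cell Calculator", "14.50"),
    ("24336-1", "Simple Calculator", "7.95"),
]

xml_text = table_to_xml("PRODUCTS", columns, rows)
print(xml_text)
```

A receiving program that knows the corresponding document type can rebuild the rows by reading each ROW element and its children.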

Electronic commerce applications have EDI at their core. XML document types are used to define the financial domain, while the XML document serves as a container for the transactional data. There are two basic types:

  • Retail. On the client end of the commercial transaction is an unknown customer with unknown financial assurances.
  • Business to Business (B2B). The system keeps confidential customer records that identify the customer. These records are established during the first transaction, which requires complex financial assurances. For subsequent transactions, it suffices to use the customer reference for identification.

Another large application area of interest is the handling of metadata.

Metadata is data about data, generally in the form of a classification of the underlying data (for example, the author of a document). A classic example of this is a repository, which is often part of application development environments.

Basic Concepts

 Figure 1 shows a sample XML document. It represents a product list which includes two items.

<?xml version="1.0" encoding="UTF-8"?>
<GROUP>
  <GROUPNAME> Calculators </GROUPNAME>
  <ITEM>
    <PRODUCTNAME> Solar Cell Calculator </PRODUCTNAME>
    <DESCRIPTION>
      <PARAGRAPH> Basic mathematical functions included. </PARAGRAPH>
    </DESCRIPTION>
    <PRICING>
      <PRODNUM> 24336-5 </PRODNUM> <PRICE>14.50 $ </PRICE>
    </PRICING>
  </ITEM>
  <ITEM>
    <PRODUCTNAME> Simple Calculator </PRODUCTNAME>
    <PRICING>
      <PRODNUM> 24336-1 </PRODNUM> <PRICE> 7.95 $ </PRICE>
    </PRICING>
  </ITEM>
</GROUP>

 Figure 1. A simple product list with two items.

The document is split up into two parts: the XML declaration in the upper part and the document contents in the lower part. The XML declaration states that the document meets the requirements of XML version 1.0 and uses the character encoding UTF-8.

Within the product list, the first item is a solar cell calculator, the second a simple calculator. Markup is used for the pricing as well: it contains the product number and the price.

In this very simple example, the document is structured by markup tags, each occurring as a pair of start and end tags. In XML terminology, a corresponding pair of tags, including the content in between, forms an element.

For instance, the group element contains the whole document contents. The first item element contains all the information about the solar cell calculator.

In terms of XML, such concrete elements are instances of their element type. For instance, the two items of Figure 1 are instances of the item element type. Element types are defined by the element type declaration, which is described later.

Elements can be nested. For example, the first (and also second) item element is contained in the group element.

Document structures are organized in a tree constructed from elements and other XML objects.
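
As a rough illustration of this tree view, the following Python sketch parses the product list of Figure 1 with the standard library module xml.etree.ElementTree and prints one line per element, indented by nesting depth. The helper name outline is made up for this example.

```python
import xml.etree.ElementTree as ET

# The product list of Figure 1 (XML declaration omitted for brevity).
PRODUCT_LIST = """<GROUP>
<GROUPNAME> Calculators </GROUPNAME>
<ITEM>
  <PRODUCTNAME> Solar Cell Calculator </PRODUCTNAME>
  <DESCRIPTION>
    <PARAGRAPH> Basic mathematical functions included. </PARAGRAPH>
  </DESCRIPTION>
  <PRICING>
    <PRODNUM> 24336-5 </PRODNUM> <PRICE>14.50 $ </PRICE>
  </PRICING>
</ITEM>
<ITEM>
  <PRODUCTNAME> Simple Calculator </PRODUCTNAME>
  <PRICING>
    <PRODNUM> 24336-1 </PRODNUM> <PRICE> 7.95 $ </PRICE>
  </PRICING>
</ITEM>
</GROUP>"""

def outline(element, depth=0):
    """Return (depth, tag) pairs for an element and all of its descendants."""
    pairs = [(depth, element.tag)]
    for child in element:
        pairs.extend(outline(child, depth + 1))
    return pairs

root = ET.fromstring(PRODUCT_LIST)
for depth, tag in outline(root):
    print("  " * depth + tag)

# Elements of one type can be collected wherever they occur in the tree.
product_names = [item.findtext("PRODUCTNAME").strip() for item in root.iter("ITEM")]
```

The group element is the root of the tree, the two item elements are its children, and the paragraph element sits three levels below the root.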

In contrast to HTML, for example, the set of element names is not fixed. The application designer can define his own element types. The decision about what element types to use is application-specific.

The next step in this scenario is to formally declare, for a given document, the element types which are being used. This is done in a document type definition (DTD), which is supplied by the document type declaration. In addition, the DTD defines which elements may be part of others, and in which order they appear. The concept is explained in more detail in the next section.

The XML concept of the document type declaration is very flexible. Parts of the definition can be drawn from different sources. As an example, consider an article about microprocessor technology: it could use an article document type together with the application-specific type. The resulting document then complies with definitions of both document types.

Another reason to split up the DTD is that a classification into generic and specific DTDs may encourage DTD reuse. A generic DTD can be regarded as a framework where specifically prepared insertion points enable the DTD author to specialize the framework for a specific application. An example of this is an application framework centered around purchasing. In this context, an item element is declared which is industry-specific.

XML defines the concepts of well-formedness and validity. Well-formed documents have a correct markup structure, that is, every start tag has a corresponding end tag and the elements are properly nested. Valid documents are well-formed documents which also comply with the document type declaration, that is, the elements and other document parts are declared and used as declared.

A program which reads an XML document and gives an application access to its content and structure is called an XML processor. A validating XML processor additionally matches the document against the document type declaration to check for validity.
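
The difference between a well-formed and a malformed document can be tried out with a small Python sketch. It relies on the standard library parser xml.etree.ElementTree, which is non-validating: it rejects documents that are not well-formed but does not check validity against a document type declaration. The function name is_well_formed is made up for this example.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

good = "<ITEM><PRODUCTNAME>Simple Calculator</PRODUCTNAME></ITEM>"
# End tags in the wrong order: the elements overlap instead of nesting.
bad_nesting = "<ITEM><PRODUCTNAME>Simple Calculator</ITEM></PRODUCTNAME>"
# The ITEM start tag has no corresponding end tag.
missing_end = "<ITEM><PRODUCTNAME>Simple Calculator</PRODUCTNAME>"

for sample in (good, bad_nesting, missing_end):
    print(is_well_formed(sample), sample)
```

Checking validity as well would require a validating processor, which is outside the scope of this sketch.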

Document Type Definition (DTD)

A document type declaration for the domain to which the document belongs is necessary for the XML processor to validate the document. In Figure 2, the new document is shown. It adds a document type declaration to the (extended) document of Figure 1.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE GROUP [
<!ELEMENT GROUP (GROUPNAME, ITEM+)>
<!ELEMENT GROUPNAME (#PCDATA)>
<!ELEMENT ITEM (PRODUCTNAME?, DESCRIPTION?, PRICING?, ITEM*)>
<!ATTLIST ITEM ITEMLINK CDATA #REQUIRED>
<!ELEMENT PRODUCTNAME (#PCDATA)>
<!ELEMENT DESCRIPTION (PARAGRAPH | IMG)*>
<!ELEMENT IMG EMPTY>
<!ATTLIST IMG
  SRC CDATA #REQUIRED
  HEIGHT CDATA #REQUIRED
  WIDTH CDATA #REQUIRED>
<!ELEMENT PARAGRAPH (#PCDATA | EMPHASIS)*>
<!ELEMENT EMPHASIS (#PCDATA)>
<!ELEMENT PRICING (PRODNUM, PRICE)>
<!ELEMENT PRODNUM (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>
]>

<GROUP>
  <GROUPNAME> Calculators </GROUPNAME>
  <ITEM ITEMLINK="6C41394A">
    <PRODUCTNAME> Solar Cell Calculator </PRODUCTNAME>
    <DESCRIPTION>
      <PARAGRAPH> Basic mathematical functions included. </PARAGRAPH>
      <IMG SRC="C:\images\calc.bmp" HEIGHT="100" WIDTH="150"/>
    </DESCRIPTION>
    <PRICING>
      <PRODNUM> 24336-5 </PRODNUM> <PRICE>14.50 $ </PRICE>
    </PRICING>
  </ITEM>
  <ITEM ITEMLINK="9930442B">
    <PRODUCTNAME> Simple Calculator </PRODUCTNAME>
    <PRICING>
      <PRODNUM> 24336-1 </PRODNUM> <PRICE> 7.95 $ </PRICE>
    </PRICING>
  </ITEM>
</GROUP>

 Figure 2. The product list of Figure 1, extended and preceded by a document type declaration.

The document type declaration follows the XML declaration, which declares the document to be XML version 1.0, and precedes the document contents.

The document type declaration starts by naming the root element type (after the keyword !DOCTYPE), which is a distinguished element type, because one of its instances comprises the whole document contents. In this case, the root element is a group element.

What follows are the declarations of the different element types which specify two things at the same time:

  • The declaration of the element type. The syntax is given by the keyword !ELEMENT followed by the element name.
  • The definition of the element type. This describes how an element is constructed from other elements. In XML terminology, this is called a content model. For instance, the group element contains a group name element, followed by a non-empty list of item elements (symbolized by the ‘+’ character).

Apart from the ‘+’ character, the ‘?’ character is used for an element that may optionally appear in its parent element (for example, the description element). Moreover, the item element has child elements of type item marked by a ‘*’ character, which specifies a possibly empty element list. In the example above, this enables an item to refer to related items, for example, to specify items within a common price range.

Another special character is the selection character ‘|’ , which represents a choice between alternatives. For the description element, it states that a description is built from a possibly empty list of paragraphs and images in arbitrary combination.
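
The occurrence characters behave much like the operators of a regular expression applied to the sequence of child element names. The following Python sketch makes that analogy concrete for three content models of Figure 2. The regular expressions are translated by hand and the function name children_match is made up; this illustrates the notation and is not a DTD validator.

```python
import re

# Content models from Figure 2, hand-translated into regular expressions.
# In the matched sequence, every child element name is followed by one space.
CONTENT_MODELS = {
    "GROUP": r"GROUPNAME (ITEM )+",                                      # (GROUPNAME, ITEM+)
    "ITEM": r"(PRODUCTNAME )?(DESCRIPTION )?(PRICING )?(ITEM )*",        # optional parts, then ITEM*
    "DESCRIPTION": r"(PARAGRAPH |IMG )*",                                # (PARAGRAPH | IMG)*
}

def children_match(parent, child_names):
    """Check a list of child element names against the parent's content model."""
    sequence = "".join(name + " " for name in child_names)
    return re.fullmatch(CONTENT_MODELS[parent], sequence) is not None

print(children_match("GROUP", ["GROUPNAME", "ITEM", "ITEM"]))      # '+' allows several items
print(children_match("GROUP", ["GROUPNAME"]))                      # '+' requires at least one item
print(children_match("ITEM", ["PRICING", "PRODUCTNAME"]))          # ',' fixes the order
print(children_match("DESCRIPTION", ["IMG", "PARAGRAPH", "IMG"]))  # '|' with '*' allows any mix
```

The comma in a content model fixes the order of the children, which is why a pricing element in front of the product name is rejected.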

After the item element type declaration, an !ATTLIST ITEM declaration follows. It declares a list of attributes belonging to the specified element. Here, the item element has an attribute itemlink which is of type CDATA (character data), that is, an arbitrary text.

The attribute declaration for element img declares three attributes: src, height and width. These are all of type CDATA and carry the default declaration #REQUIRED. This means that for all img elements, the values of these three attributes must be specified. Other default declarations fix the value of the attribute (keyword #FIXED), provide a default value (no keyword), or leave the author free to provide a value or omit it (#IMPLIED).
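
The effect of the default declarations can be observed with a small Python sketch that uses the standard library module xml.parsers.expat. The attributes CURRENCY, UNIT and NOTE on the price element are hypothetical and not part of Figure 2; they are declared here only to show one default value, one #FIXED value and one #IMPLIED attribute. Expat is a non-validating parser, so it supplies declared defaults from the internal subset but would not complain about a missing #REQUIRED attribute.

```python
import xml.parsers.expat

# Hypothetical attribute declarations (assumption): CURRENCY has a default value,
# UNIT is #FIXED, NOTE is #IMPLIED. None of them is written in the PRICE start tag.
DOCUMENT = """<!DOCTYPE PRICING [
<!ELEMENT PRICING (PRICE)>
<!ELEMENT PRICE (#PCDATA)>
<!ATTLIST PRICE
  CURRENCY CDATA "USD"
  UNIT CDATA #FIXED "each"
  NOTE CDATA #IMPLIED>
]>
<PRICING><PRICE>14.50</PRICE></PRICING>"""

def attributes_seen(document):
    """Collect the attributes the parser reports for each element start tag."""
    seen = {}
    parser = xml.parsers.expat.ParserCreate()

    def on_start(name, attrs):
        seen[name] = dict(attrs)

    parser.StartElementHandler = on_start
    parser.Parse(document, True)
    return seen

attrs = attributes_seen(DOCUMENT)
print(attrs["PRICE"])
```

The application receives the default and fixed values as if they had been written in the document, while the #IMPLIED attribute is simply absent.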

In the above example, a simple way is outlined to link document contents to data in database tables. The itemlink attribute could be used to store a key value which identifies a unique table row of a corresponding table.
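
A minimal Python sketch of this link, using the standard library modules sqlite3 and xml.etree.ElementTree, could look as follows. The table name items, its columns and the stock figures are invented for the example; only the ITEMLINK values are taken from Figure 2.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Shortened version of the document contents of Figure 2.
ITEMS = """<GROUP>
<GROUPNAME> Calculators </GROUPNAME>
<ITEM ITEMLINK="6C41394A"><PRODUCTNAME> Solar Cell Calculator </PRODUCTNAME></ITEM>
<ITEM ITEMLINK="9930442B"><PRODUCTNAME> Simple Calculator </PRODUCTNAME></ITEM>
</GROUP>"""

# Hypothetical table (assumption): item_key holds the ITEMLINK value,
# stock is made-up data that exists only in the database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE items (item_key TEXT PRIMARY KEY, stock INTEGER)")
db.executemany("INSERT INTO items VALUES (?, ?)",
               [("6C41394A", 12), ("9930442B", 40)])

def stock_for(item):
    """Look up the table row identified by the item's ITEMLINK attribute."""
    row = db.execute("SELECT stock FROM items WHERE item_key = ?",
                     (item.get("ITEMLINK"),)).fetchone()
    return row[0] if row else None

stock_by_name = {
    item.findtext("PRODUCTNAME").strip(): stock_for(item)
    for item in ET.fromstring(ITEMS).iter("ITEM")
}
print(stock_by_name)
```

The document stays free of database details; it carries only the key, and the application decides which table the key refers to.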

XSL (Extensible Stylesheet Language)

XSL is a language for expressing stylesheets. With these stylesheets an author can manipulate XML elements. XSL is being developed as part of the W3C Style Sheets Activity.

XSL consists of two parts:

  • A language for transforming XML documents, and
  • An XML vocabulary for specifying formatting semantics.

XSL lets authors format XML elements. It uses the syntax of XML, allowing XML specialists to style their documents and data without necessarily learning either CSS or DSSSL. The most powerful aspect of XSL is its ability to map a single data source onto multiple display targets--that is, more than one implementation of an XML or HTML document on-screen--and to style the data in each target.

XSL goes through two main steps in formatting the data. First, the XML document is parsed into a source tree. An XSL processor then matches the patterns of the template rules against the elements of the source tree and, based on the actions specified in those rules, constructs a result tree. The result tree can then be written out, for example as an HTML document for display in a Web client. XSL is capable of data reordering, so the result tree can look entirely different from the source tree.
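
The idea of building a differently shaped result tree from a source tree can be imitated in plain Python. The sketch below is not XSL and uses no XSL processor; it is a hand-written stand-in, based on xml.etree.ElementTree, that turns the product list into an HTML table and reorders the data on the way. The function name transform and the chosen HTML layout are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Source document: a shortened version of the product list of Figure 1.
SOURCE = """<GROUP>
<GROUPNAME> Calculators </GROUPNAME>
<ITEM><PRODUCTNAME> Solar Cell Calculator </PRODUCTNAME>
<PRICING><PRODNUM> 24336-5 </PRODNUM> <PRICE>14.50 $ </PRICE></PRICING></ITEM>
<ITEM><PRODUCTNAME> Simple Calculator </PRODUCTNAME>
<PRICING><PRODNUM> 24336-1 </PRODNUM> <PRICE> 7.95 $ </PRICE></PRICING></ITEM>
</GROUP>"""

def transform(group):
    """Build an HTML result tree from the GROUP source tree."""
    html = ET.Element("html")
    body = ET.SubElement(html, "body")
    ET.SubElement(body, "h1").text = group.findtext("GROUPNAME").strip()
    table = ET.SubElement(body, "table")
    for item in group.findall("ITEM"):  # plays the role of a rule matching ITEM
        row = ET.SubElement(table, "tr")
        # Reordering: the price comes first in the result, although it is last in the source.
        for path in ("PRICING/PRICE", "PRODUCTNAME", "PRICING/PRODNUM"):
            ET.SubElement(row, "td").text = item.findtext(path).strip()
    return html

result = transform(ET.fromstring(SOURCE))
html_text = ET.tostring(result, encoding="unicode")
print(html_text)
```

In real XSL, the loop body would be expressed declaratively as a template rule, and the same source document could be paired with several stylesheets to serve several display targets.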
