Importing data from an XML file

Description

This process demonstrates how to import data from XML documents using the Read XML operator. The operator's parameters are set through its Import Configuration Wizard, and attribute values are extracted from the XML document using XPath location paths.

Input

The experiment uses population data in XML format from the World Bank Open Data website. The data set is available at http://data.worldbank.org/indicator/SP.POP.TOTL in various formats, including XML.

Figure 3.6. A small excerpt of the World Bank: Population (Total) data set used in the experiment.

<?xml version="1.0" encoding="utf-8"?>
<Root xmlns:wb="http://www.worldbank.org">
  <data>
    <record>
      <field name="Country or Area" key="ABW">Aruba</field>
      <field name="Item" key="SP.POP.TOTL">Population (Total)</field>
      <field name="Year">1960</field>
      <field name="Value">54208</field>
    </record>
    <record>
      <field name="Country or Area" key="ABW">Aruba</field>
      <field name="Item" key="SP.POP.TOTL">Population (Total)</field>
      <field name="Year">1961</field>
      <field name="Value">55435</field>
    </record>
    <record>
      <field name="Country or Area" key="ABW">Aruba</field>
      <field name="Item" key="SP.POP.TOTL">Population (Total)</field>
      <field name="Year">1962</field>
      <field name="Value">56226</field>
    </record>
    <!-- ... -->
  </data>
</Root>
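
The XPath location paths mentioned in the description can be tried outside RapidMiner as well. The following is a minimal sketch in Python, not the Read XML operator's own implementation: it uses the standard library's xml.etree.ElementTree module, which supports only a subset of XPath. The function name read_records and the column names in the resulting rows are hypothetical, chosen for illustration; the sample holds the first two records of the excerpt above, with the closing data tag in place so that it parses.

```python
import xml.etree.ElementTree as ET

# First two records of the excerpt in Figure 3.6 (well-formed copy).
SAMPLE = """<?xml version="1.0" encoding="utf-8"?>
<Root xmlns:wb="http://www.worldbank.org">
  <data>
    <record>
      <field name="Country or Area" key="ABW">Aruba</field>
      <field name="Item" key="SP.POP.TOTL">Population (Total)</field>
      <field name="Year">1960</field>
      <field name="Value">54208</field>
    </record>
    <record>
      <field name="Country or Area" key="ABW">Aruba</field>
      <field name="Item" key="SP.POP.TOTL">Population (Total)</field>
      <field name="Year">1961</field>
      <field name="Value">55435</field>
    </record>
  </data>
</Root>"""


def read_records(xml_bytes):
    """Turn each <record> element into one row (a dict of column values).

    Hypothetical helper: the column names below are illustrative only.
    """
    root = ET.fromstring(xml_bytes)
    rows = []
    # "./data/record" selects every record element; each one becomes a row.
    for record in root.findall("./data/record"):
        country = record.find("field[@name='Country or Area']")
        rows.append({
            "country": country.text,
            "country_code": country.get("key"),
            # Assumes Year and Value are present and numeric, as in the sample;
            # a full data set may need handling for empty values.
            "year": int(record.findtext("field[@name='Year']")),
            "population": int(record.findtext("field[@name='Value']")),
        })
    return rows


# Parse from bytes so the encoding declaration in the sample is honoured.
rows = read_records(SAMPLE.encode("utf-8"))
for row in rows:
    print(row)
```

Here one path selects the elements that become rows, and further paths, evaluated relative to each record, pick out the individual values by the name attribute of the field element.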

Output

An ExampleSet that contains data imported from the XML document.

Figure 3.7. Metadata of the resulting ExampleSet.


Figure 3.8. A small excerpt of the resulting ExampleSet.


Interpretation of the results

Video

Workflow

import_exp4.rmp

Keywords

importing data
XML

Operators

Read XML