Large-scale administration and information gathering using CIM information modelling.

System Data Layer

The System Data Layer extracts and transforms the information held in a system's configuration sources. These sources are typically flat files, directory structures, command output (such as that of ifconfig) and databases (such as registries), and they can be read through various interfaces, from ordinary file access functions to instrumentation systems like Windows WMI. The purpose of this layer is to gather all this data and homogenize it as CIM instances.

Extraction mechanisms

In the case of Windows WMI, the information is already consolidated as CIM classes, so it is simply encoded as a simplified XML rendition of CIM (miniCIM). This rendition, as discussed in the modelling layer, represents only attribute-value pairs and key references, since instance names and key properties are derived from an XML Schema transformed from the original schema.

Flat files, in contrast, must be parsed and represented as CIM instances (both normal instances and associations). This is achieved using Martel, a Python module for text-to-XML parsing which, given a grammatical description of the configuration text files, outputs a structured XML file. This XML file is then transformed by a very simple XSLT template that outputs miniCIM instances.

The degree to which the flat file structure is represented in the grammar can vary. Since most configuration files have very similar structures (line-based, section-based), a general grammar can often be used. The same approach also offers the flexibility needed for complicated or very version-dependent formats, which can be given a more detailed grammar.
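
As an illustration of how far a general grammar can go, the following is a minimal sketch of a generic line-based parser written with the Python standard library only. It is not Martel and not the grammar used by this layer; the helper name lines_to_xml, the element names "doc", "line" and "comment", and the sample input are all hypothetical.

```python
import xml.etree.ElementTree as ET

def lines_to_xml(text, field_names, root_tag="doc", comment_prefix="#"):
    """Turn whitespace-delimited lines into a flat XML tree (hypothetical helper)."""
    root = ET.Element(root_tag)
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines carry no information
        if line.startswith(comment_prefix):
            # Keep the comment text, without its prefix
            ET.SubElement(root, "comment").text = line[len(comment_prefix):].strip()
            continue
        # Split into at most len(field_names) parts; the last field keeps the rest of the line
        parts = line.split(None, len(field_names) - 1)
        record = ET.SubElement(root, "line")
        for name, value in zip(field_names, parts):
            ET.SubElement(record, name).text = value
    return root

# Invented sample input in a hosts-like format
sample = "# hosts\n127.0.0.1 localhost loopback\n"
tree = lines_to_xml(sample, ["address", "hostname", "aliases"])
print(ET.tostring(tree, encoding="unicode"))
```

Only the list of field names changes from one line-based format to another, which is the sense in which such a grammar is general. Formats with sections or continuation lines would need a richer description.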

Directory structures and command output can be trivially redirected to a file and processed as flat files. In rare cases, interactive legacy administration tools can also be screen-scraped or automatically accessed using tools like Expect. XML-format files can be directly transformed using a custom XSLT stylesheet.
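
The redirection step can be sketched as follows, using only the Python standard library. The helper names capture_command and capture_tree are hypothetical, and a stand-in command is used so that the sketch runs on any host; on a real system the command might be ifconfig.

```python
import os
import subprocess
import sys
import tempfile

def capture_command(cmd, out_path):
    """Run cmd (a list of arguments) and save its standard output to out_path."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    with open(out_path, "w") as handle:
        handle.write(result.stdout)
    return out_path

def capture_tree(top, out_path):
    """Write one relative path per line for every file found under top."""
    with open(out_path, "w") as handle:
        for dirpath, _dirs, files in os.walk(top):
            for name in sorted(files):
                full = os.path.join(dirpath, name)
                handle.write(os.path.relpath(full, top) + "\n")
    return out_path

workdir = tempfile.mkdtemp()

# Stand-in for a real command such as ["ifconfig"]
cmd_file = capture_command(
    [sys.executable, "-c", "print('eth0 UP')"],
    os.path.join(workdir, "cmd.txt"),
)

# A tiny directory structure created only for the demonstration
etc_dir = os.path.join(workdir, "etc")
os.makedirs(etc_dir)
open(os.path.join(etc_dir, "inetd.conf"), "w").close()
tree_file = capture_tree(etc_dir, os.path.join(workdir, "tree.txt"))
```

Both resulting files are ordinary text and can then go through the same flat-file parsing as any configuration file.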

A domain example: inetd.conf

The next listing is an example using the inetd.conf network configuration file found on various Unix-like systems. This sample is from a Debian Linux system; it is line-oriented and its fields are whitespace-delimited. Commented lines are indicated by an initial '#' character, but if the initial characters are '#<off>#', the line must be interpreted not as commentary but as a disabled service.

# /etc/inetd.conf:  see inetd(8) for further informations.
#
#
# Internet server configuration database
#
#
# <service_name><sock><proto><flags><user><server_path><args>
#
#:INTERNAL: Internal services

#echo	stream	tcp	nowait	root	internal
ftp 	stream 	tcp	nowait	root	/usr/sbin/tcpd	/usr/sbin/proftpd
#<off># sgi_fam/1-2 	stream	rpc/tcp	wait	root	/usr/sbin/famd
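
The line conventions just described can be sketched with the Python standard library alone. This is an illustrative approximation, not the Martel grammar used by this layer: the function name classify is hypothetical, and the field names are borrowed from the Martel grammar presented in this section.

```python
# Field names follow the Martel grammar in this section; "args" takes the rest of the line
FIELDS = ("name", "socktype", "proto", "flags", "user", "args")
OFF_PREFIX = "#<off>#"

def classify(line):
    """Return (kind, data) for one inetd.conf line (hypothetical helper)."""
    stripped = line.strip()
    if not stripped:
        return ("blank", None)
    # The disabled-service prefix must be tested before the plain comment prefix
    if stripped.startswith(OFF_PREFIX):
        kind, body = "off", stripped[len(OFF_PREFIX):]
    elif stripped.startswith("#"):
        return ("commentary", stripped[1:].strip())
    else:
        kind, body = "service", stripped
    # Five whitespace-delimited fields, then the remainder of the line
    parts = body.split(None, len(FIELDS) - 1)
    return (kind, dict(zip(FIELDS, parts)))

ftp = classify("ftp \tstream \ttcp\tnowait\troot\t/usr/sbin/tcpd\t/usr/sbin/proftpd")
fam = classify("#<off># sgi_fam/1-2 \tstream\trpc/tcp\twait\troot\t/usr/sbin/famd")
echo = classify("#echo\tstream\ttcp\tnowait\troot\tinternal")
```

The order of the tests matters: a disabled service line also begins with '#', so it has to be recognized before the general commentary case.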

The following listing is a Martel program that can parse the file above. Line 2 defines an item as a run of non-whitespace characters followed by whitespace. Line 8 defines the structure of the document as a succession of commentaries, disabled service lines, blank lines and service lines.

 1 import Martel; from xml.sax import saxutils
 2 def Item(name): return Martel.Group(name,Martel.Re(r"\S+\s+"))
 3 fields=Item("name")+Item("socktype")+Item("proto")+Item("flags")+Item("user")+Martel.ToEol("args")
 4 offline=Martel.Re(r"#<off>#\s*")+Martel.Group("off",fields)
 5 commentary=Martel.Re("#")+Martel.Group("com",Martel.ToEol())
 6 serviceline=Martel.Group("service",fields)
 7 blank=Martel.Str("\n")
 8 format=Martel.Group("inetd",Martel.Rep(Martel.Alt(offline, blank, commentary, serviceline)))
 9 parser = format.make_parser()
10 parser.setContentHandler(saxutils.XMLGenerator())
11 parser.parseFile(open("inetd.conf"))

Line 5 defines a commentary as any group of characters between '#' and a newline (to be grouped in a <com> element). In line 4, a disabled service line is defined as a normal service line (l. 6) preceded by '#<off>#'. Both consist of a series of fields (l. 3), which are Items, that is, runs of non-whitespace characters separated by whitespace characters (l. 2).

Applying this program to the original flat file yields this XML file:

<?xml version="1.0" encoding="UTF-8"?>
<doc>
   <commentary> <com>#</com>echo	stream	tcp	nowait	root	internal </commentary>
   <line><id>ftp</id><ws> </ws><id>stream</id><ws> </ws><id>tcp</id><ws> </ws>
         <id>nowait</id><ws> </ws><id>root</id><ws> </ws><id>/usr/sbin/tcpd</id>
   <ws> </ws><id>/usr/sbin/proftpd</id></line>
   .........
   <off> <com>#</com><off>#<ws> </ws>
      <line><id>sgi_fam/1-2</id><ws> </ws><id>stream</id><ws> </ws><id>rpc/tcp</id>
      <ws> </ws><id>wait</id><ws> </ws><id>root</id><ws> </ws>
      <id>/usr/sbin/famd</id><ws> </ws><id>fam</id></line>
   </off>
   </doc>

This XML file is processed by an XSLT template which assigns values to the properties of a miniCIM instance (miniCIM being an abbreviated XML syntax for CIM instances). Each field is identified by its position in the line, as a legacy program reading the original file would have to do. The result of parsing one line of the file is shown below:

<CIM>
   <CIM_InetdService namespace="dc=udc">
      <SystemCreationClassName>
          CIM_ComputerSystem
      </SystemCreationClassName>
      <SystemName>shalmaneser</SystemName>
      <CreationClassName>CIM_InetdService</CreationClassName>
      <Name>ftp</Name>
      <SocketType>stream</SocketType>
      <Protocol>tcp</Protocol>
      <User>root</User>
      <Command>/usr/sbin/tcpd</Command>
   </CIM_InetdService>
</CIM>
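
The positional mapping can also be sketched in Python with xml.etree.ElementTree, here standing in for the XSLT template, which is not reproduced in this section. The element and property names are taken from the listing above. The assignment of field positions to properties is an assumption based on the column header comment in the sample inetd.conf, and the names POSITION_TO_PROPERTY and to_minicim are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed mapping from field position to miniCIM property, following the
# <service_name><sock><proto><flags><user><server_path><args> header of the
# sample file; the flags field (position 3) has no property in the listing.
POSITION_TO_PROPERTY = {
    0: "Name",
    1: "SocketType",
    2: "Protocol",
    4: "User",
    5: "Command",
}

def to_minicim(fields, system_name, namespace="dc=udc"):
    """Build a miniCIM instance element from the positional fields of one line."""
    cim = ET.Element("CIM")
    inst = ET.SubElement(cim, "CIM_InetdService", namespace=namespace)
    # Key properties that identify the instance and its hosting system
    ET.SubElement(inst, "SystemCreationClassName").text = "CIM_ComputerSystem"
    ET.SubElement(inst, "SystemName").text = system_name
    ET.SubElement(inst, "CreationClassName").text = "CIM_InetdService"
    # Remaining properties are filled purely by field position
    for position, prop in sorted(POSITION_TO_PROPERTY.items()):
        if position < len(fields):
            ET.SubElement(inst, prop).text = fields[position]
    return cim

fields = "ftp stream tcp nowait root /usr/sbin/tcpd /usr/sbin/proftpd".split()
instance = to_minicim(fields, "shalmaneser")
print(ET.tostring(instance, encoding="unicode"))
```

Keeping the mapping in a single table makes the dependence on field order explicit, which is the same fragility a legacy program reading the file would have.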

Following similar steps, configuration sources can be extracted and collated in CIM format for their representation, consumption and persistence.