Accessing data from storage system using XAM API



If you are familiar with POSIX which is a collection of standards which enables portability of applications across OS platforms. It provide a standardized API for an OS, eg:- functions like 'open' to open a file, 'fdopen' to open stream for a file descriptor, fork to create a process etc. It defines file descriptor as a per-process, unique, nonnegative integer used to identify an open file for the purpose of file access. The Unix systems programming API is standardized by a large number of POSIX standards. Most operating system APIs share common functionalities like listing directories, renaming file etc.

XAM stands for "eXtensible Access Method". The XAM API will allow application developers to store content on a class of storage systems known as “fixed-content” storage systems.

The data centers have evolved into a complex ecosystem of SANs, fiber channels, iSCSI, caching technologies etc. Certain storage systems are efficiently designed to store the documents that donot change,  which is useful for preservation or archival. Digital imaging storage systems which us used to archive XRAY DICOM images, ie. documents that are not edited and to be kept for a long time need can use fixed storage. The applications accessing these solutions need to be written using only those vendor-specific interfaces. What if one changes their infrastructure? The application has to be rewritten. This scenario is an analogy to the need for JDBC/ODBC driver specifications as a standard API to interact with different vendors. XAM API specified by SNIA is the storage industry's proposed solution to the multi-vendor interoperability problem.

Consider EMC Centera, the data written to it is fixed in nature. Most common systems uses a file location (address) based approach to store and retrieve content. Centera has a flat address scheme called content address. When the BLOB object is stored, it is given "claim" check derived from its content which is a hash value (128-bit) known as Content Address (CA). The application needn't know the physical location of the data stored. The associated metadata like filename etc is added into an XML called C-Clip Descriptor File (CDF) with the file's content address. This makes the system capable of WORM. When one tries to modify/update a file, the new file will be kept separately, enabling versioning and unique access through its fingerprint checksum. Also, there is an essential attribute for retention that tracks the expiry of the file. The file can't be deleted until time surpasses the defined value.

This is a basic vendor specific architecture. They provides the SDK/API to acces the data. So the industry came up with standard in which XAM API provides the capability to access these systems independently.

XAM has concepts like SYSTEMS, XSETS, properties and streams ("blobs"), and XUIDs.

SYSTEM/XSYSTEM - is the repository
XSET - the content
XUID - identifier for the XSET

All these elements include metadata as key value pairs. There are participating and non-participating fields. Participating fields contribute to the naming of the XSET known as XUID. If XSET is retrieved from a SYSTEM and a participating field is modified, it will receive a different XUID. XUID created must be formed by using the content of the property or stream (e.g. run an MD5 hash over the value).

So a vendor specific library implements the XAM Application Programming Interface (API) which is known as XAMLibrary connecting to the Vendor Interface Module (VIM), which internally communicates to the storage system. Usually, VIMs are the native interfaces. Once the application load XAMLibrary and connect to the source, it needs to open the XSET by providing the XUID which will open up the content as a stream, called XStream. The XStream object is used to read the transported data which can be a query result or a file.




A sample code to read the data,

// connect
String xri = “snia-xam://"
XSystem sys = s_xam.connect(xri);
//open the XSet
XSet xset = sys.openXSet( xuid, XSet.MODE_READ_ONLY);
//getting metdata value for the key
String value = xset.getString( “com.example.key” );
XStream stream = xset.openStream( “com.example.data”,
XStream.MODE_READ_ONLY);
stream.close();
xset.close();
sys.close()

XAM fields have type information described using MIME types.
eg:-“application/vnd.snia.xam.xuid" for the an 80-element byte array, xam_xuid.

The XAM URI (XRI) is the XAM resource identifier specification associated to a vendor identifying the storage.

As XAM generates a globally unique name (address), the objects may move around freely in time, changing their physical or technological location so as to enable a transparent information lifecycle management (ILM). So the metadata information and retention can be externalized ie, even if the application is gone ie. retired, the information management can be handled through an API. This makes the data center more adaptive to organizational information management.

And, unlike the file systems, XAM tags the information with metadata and provides a technology independent namespace, allowing software to interpret the content independent of the application.


Architecture specification


No comments:

Post a Comment