Archive for 18. April 2008

Turkey Hunting

I was able to get out for opening day of Maryland wild turkey season this morning. I didn’t get a turkey, but it was a great morning to be in the woods. In Maryland I hunt at McKee-Beshers Wildlife Management Area, which is a wonderful area for anyone who enjoys the great outdoors. It’s nearly 2,000 acres of habitat in Poolesville, MD, along the Potomac River. If you live in or near Maryland you should check it out sometime. I saw geese with their goslings, wood and mallard ducks, and a doe who almost walked into my lap trying to figure out what my turkey decoys were doing under “her” oak tree.

VDI (Virtual Data Integration)

What is virtual integration, and when is it appropriate to use? Virtual data integration (VDI) is the use of a software layer that the reporting or end-user delivery application interfaces with, in place of accessing a data repository directly. The software layer provides a mapping between the conceptual data model that the user works with and the physical model(s) and underlying data stores. I say “model(s)” because one of the most powerful aspects of VDI is the ability to make multiple data stores (whether transactional, ODS, data warehouses/marts, or others) look like one integrated repository. Below are the pros and cons of using VDI versus physical data integration.
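To make the idea of a mapping layer concrete, here is a minimal sketch in Python using the standard sqlite3 module. Everything in it is an assumption for illustration: the two in-memory databases stand in for a transactional system and a data mart, and the table names, column names, and the customer_sales function are hypothetical. It is not how any particular VDI product works, only the general shape of the idea.

```python
import sqlite3

# Two independent physical stores (hypothetical), each with its own schema.
# orders_db stands in for a transactional system, mart_db for a data mart.
orders_db = sqlite3.connect(":memory:")
orders_db.execute("CREATE TABLE ord (cust_no INTEGER, amt REAL)")
orders_db.executemany(
    "INSERT INTO ord VALUES (?, ?)", [(1, 120.0), (2, 75.5), (1, 30.0)]
)

mart_db = sqlite3.connect(":memory:")
mart_db.execute("CREATE TABLE dim_customer (customer_key INTEGER, full_name TEXT)")
mart_db.executemany(
    "INSERT INTO dim_customer VALUES (?, ?)", [(1, "Acme Corp"), (2, "Globex")]
)


def customer_sales():
    """Conceptual view 'customer_sales': rows of (customer_name, total_sales).

    The mapping from conceptual fields to physical tables and columns lives
    here, not in the report. Both sources are read at query time; nothing is
    copied or staged in between.
    """
    names = dict(mart_db.execute("SELECT customer_key, full_name FROM dim_customer"))
    totals = orders_db.execute(
        "SELECT cust_no, SUM(amt) FROM ord GROUP BY cust_no ORDER BY cust_no"
    )
    return [(names.get(cust_no, "unknown"), total) for cust_no, total in totals]


report_rows = customer_sales()
```

The report only ever asks for customer_sales and never sees that the names and the amounts come from two different stores with different key names.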

Pros:

        Shortens implementation time since data does not have to be physically integrated

        Provides short-term benefit to the business while buying time for the technical team to integrate or retire the existing redundant and fragmented data architecture

        Can reduce data storage requirements

        Source data updates are available immediately to end-users

        Eliminates cost and time associated with extraction and movement of data (e.g., no batch window)

        Business rules can be modified “on the fly” 

Cons:

        Data is not accessible if a source system is unavailable

        Complex transformations and aggregations can impact report response time

        Updates to source systems can result in out-of-sync data on reports

        Complex match-merge routines may not be possible outside of batch processing

        Accessing large data sets can impact report response time

        Query processing can impact source system performance 
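Two of the trade-offs listed above, immediate visibility of source updates and dependence on source availability, can be seen in a small sketch. This is an illustration under stated assumptions, not a benchmark: the in-memory SQLite database, the sales table, and the function names are all hypothetical, and closing the connection is only a stand-in for a source system outage.

```python
import sqlite3

# Hypothetical source system with a small sales table.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE sales (region TEXT, amount REAL)")
source.executemany(
    "INSERT INTO sales VALUES (?, ?)", [("east", 10.0), ("west", 20.0)]
)


def virtual_total():
    # Virtual integration: the source is queried every time the report runs.
    return source.execute("SELECT SUM(amount) FROM sales").fetchone()[0]


# Physical integration: the data is copied once, as a batch load would do.
snapshot = source.execute("SELECT region, amount FROM sales").fetchall()


def physical_total():
    # The report reads the copy and never touches the source system.
    return sum(amount for _, amount in snapshot)


# A new sale arrives in the source after the batch load.
source.execute("INSERT INTO sales VALUES ('east', 5.0)")
fresh = virtual_total()   # includes the new row
stale = physical_total()  # does not, until the next load

# Simulate an outage by closing the source connection.
source.close()
try:
    virtual_total()
    virtual_available = True
except sqlite3.ProgrammingError:
    # sqlite3 raises ProgrammingError when a closed connection is used.
    virtual_available = False

still_available = physical_total()  # the copy keeps working during the outage
```

The virtual report sees the new sale right away but fails when the source goes away; the physical copy is the reverse.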

In my mind, the key differentiator is the volume of data being analyzed. If end users are accessing relatively few records, a VDI solution is viable. But when large data volumes need to be accessed, such as in a data mining or long-term trending exercise, performance may become an insurmountable issue, from both the end user's and the source system owner's perspective.
