Dataupia

I spoke last week with Jim McManus, VP of Channels and Alliances at Dataupia, and wanted to share my impressions of the company and its primary offering, the Satori Server.  Dataupia is based in Cambridge, MA, and has received venture funding from Polaris Venture Partners, Fairhaven Capital Partners, and Valhalla Partners, including a $16M infusion last fall.  The company was founded in 2005 and made the Satori Server generally available in May 2007.  Foster Hinshaw, the CEO, is known as the “Father of the Data Warehouse Appliance” and founded Netezza before leaving to start Dataupia.  They list several customer success stories on their web site, and currently have 4 customers according to Jim.  They recently issued a press release with Subex announcing the deployment of a 150TB Oracle OSS system on Satori Server.

The Satori Server is their flagship product, and is touted as a true data warehouse appliance (think Netezza and Teradata).  It includes server, storage, and “optimization software” bundled into one package.  The hardware is built from commodity off-the-shelf components, but is engineered to handle data-intensive operations.  The “optimization software” includes a scaled-back version of the open source PostgreSQL relational database management system and a vanilla Red Hat Linux operating system.  The system is configured in a shared-nothing, or MPP (massively parallel processing), architecture.  They currently store data in a row-oriented format, but are considering offering column-based storage.  According to Jim, the system is currently targeted at operational reporting, which lends itself to row-based processing.  The Satori Server's performance gains rely heavily on a dynamic aggregation capability that is defined at load time.
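
To make the shared-nothing and load-time aggregation ideas a bit more concrete, here is a minimal Python sketch.  It is not Dataupia's implementation: the node count, the CRC32 hash, the `customer` distribution key, and the monthly total are all assumptions I picked for illustration.  Each row is sent to exactly one node based on a hash of its key, and that node updates its own summary as the row arrives.

```python
import zlib
from collections import defaultdict

NUM_NODES = 4  # hypothetical number of nodes in the appliance


def node_for(key: str, num_nodes: int = NUM_NODES) -> int:
    """Pick the owning node from a stable hash of the distribution key."""
    return zlib.crc32(key.encode("utf-8")) % num_nodes


def load(rows, num_nodes: int = NUM_NODES):
    """Distribute rows across nodes and update a per-node aggregate while loading."""
    detail = [[] for _ in range(num_nodes)]                   # row storage per node
    totals = [defaultdict(float) for _ in range(num_nodes)]   # load-time aggregates
    for row in rows:
        n = node_for(row["customer"], num_nodes)
        detail[n].append(row)
        totals[n][(row["customer"], row["month"])] += row["amount"]
    return detail, totals


def monthly_total(totals, customer: str, month: str) -> float:
    """Answer a summary query from the owning node's aggregate, without scanning detail."""
    n = node_for(customer, len(totals))
    return totals[n].get((customer, month), 0.0)
```

Because a given customer's rows always land on the same node, a summary query for that customer touches one node's aggregate only.  That is the appeal of the approach, and also why the choice of aggregates at load time matters so much.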

The differentiator for Dataupia is their “Omniversal Transparency”: basically, they support queries in Oracle, DB2, and SQL Server natively.  As an example, take a data warehouse sitting on an Oracle database.  The data warehouse might have a large number of tables, most of them reference or smaller-volume tables, alongside several large “fact tables”.  Once the Satori Server is connected to the network, the Oracle DBA can map the larger-volume table(s) to it and migrate their data.  After the migration is complete, queries run as normal within Oracle, with processing on the larger tables handled within the Satori Server.  The benefits of this approach are significant, as it requires no change to applications currently built to access the existing RDBMS.  The DBA can handle most of the application monitoring from the existing RDBMS console, although Dataupia does include a console for configuring and monitoring the Satori Server directly.
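
The division of labor described above can be pictured with a small Python sketch.  This is only my mental model, not Dataupia's mechanism: the `Catalog` class, its method names, and the table names are hypothetical.  The point is simply that the application keeps naming the same tables, and a mapping maintained by the DBA decides where each one is processed.

```python
from dataclasses import dataclass, field


@dataclass
class Catalog:
    """Hypothetical record of which tables the DBA has mapped to the appliance."""
    offloaded: set = field(default_factory=set)

    def map_table(self, name: str) -> None:
        """Mark a table as living on the appliance (names compared case-insensitively)."""
        self.offloaded.add(name.lower())

    def plan(self, tables):
        """Split the tables a query references into (host RDBMS, appliance) lists."""
        host = [t for t in tables if t.lower() not in self.offloaded]
        appliance = [t for t in tables if t.lower() in self.offloaded]
        return host, appliance


catalog = Catalog()
catalog.map_table("SALES_FACT")  # the large fact table moves to the appliance
host, appliance = catalog.plan(["sales_fact", "region_dim"])
print("host RDBMS:", host, "| appliance:", appliance)
```

Nothing in the calling application changes in this picture, which is the property that makes the approach attractive for an existing warehouse.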

Dataupia provides a number of benchmark results on their web site, including load times (70 MB/sec), refresh rates, and drill-down capabilities (24 months of detail data accessed in 5 sec).  They have not yet released TPC-H results (planned for this summer), which would allow for a more direct comparison with their competitors in this space.  A potential concern is the reliance on data aggregation to boost performance results, which brings scalability into question.  A heavy reliance on aggregates can result in significantly increased data sizes to support a mixed reporting environment, particularly one with a heavy ad-hoc user base.  The Subex announcement provides one scalability data point, but I’d like to see some additional production implementations before eliminating this concern.
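
To show why aggregates worry me, here is a rough Python sketch of how quickly they multiply when ad-hoc users may group by any combination of dimensions.  The dimensions and data are made up, and I am assuming one aggregate per group-by combination, which is a worst case rather than a description of what Satori Server actually builds.

```python
from itertools import combinations, product


def aggregate_row_counts(rows, dimensions):
    """Return {group-by tuple: distinct key count} for every non-empty subset of dimensions."""
    counts = {}
    for size in range(1, len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            counts[dims] = len({tuple(row[d] for d in dims) for row in rows})
    return counts


# Toy detail data: every combination of 2 regions, 3 products, and 2 months.
detail = [
    {"region": r, "product": p, "month": m}
    for r, p, m in product(["east", "west"], ["a", "b", "c"], ["2008-01", "2008-02"])
]

counts = aggregate_row_counts(detail, ["region", "product", "month"])
print(len(detail), "detail rows;", sum(counts.values()), "aggregate rows in", len(counts), "aggregates")
```

With three dimensions there are seven possible group-by combinations, and in this deliberately dense toy data set they add up to 35 aggregate rows against 12 detail rows.  Real data is sparser, but the number of combinations doubles with each added dimension, so the concern grows with the breadth of the reporting mix.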

Dataupia offers consulting services, which consist of a one-week jumpstart engagement that hooks up the Satori Server, performs the initial configuration, migrates selected data, and runs a number of tests to ensure everything is working properly.  They rely on their system integrator partners to handle any required work outside the scope of this jumpstart effort.

They have a well-defined partnership strategy, with three categories:

- Solution Partners that extend their product in areas such as business intelligence and data mining

- System Integrators that work with customers to implement the Satori Server within the larger data warehouse initiative

- Technology Partners that provide complementary hardware or software components that are integrated into the product

In summary, a customer with an existing legacy system should strongly consider this solution, due to the enormous cost savings associated with not having to migrate data integration, reporting, or other existing applications.  This represents not only a labor cost savings but also an opportunity cost savings, since the analytical business processes already in place are largely undisturbed.  The greenfield situation is not as clear, due to the lack of industry benchmarks and the fact that Satori Server is not a standalone solution.  In either case, scalability needs to be thoroughly tested, as this is a new product with a small installed customer base.
