I am currently taking vendor briefings and developing survey questions for a research study that will kick off shortly. The survey-based research, entitled “Integration Technologies for Cloud Services”, will be followed later this year by an EMA Radar Report on a similar topic.
There can no longer be any doubt that Cloud computing has come into its own as the biggest game changer since the Internet. Nevertheless, as with any radical change in application architecture, it also introduces a host of new management challenges. One significant challenge relates to integration across the various permutations of on-premises and public Cloud application environments.
In part to explore possible solutions, I attended Pervasive IntegrationWorld in Austin last week. During the primarily customer-focused event, Pervasive offered analysts a glimpse into the kinds of integration-related challenges its customers are solving with Pervasive solutions.
I did not attend the event last year, primarily because EMA’s BI team was covering (and continues to cover) Pervasive’s Data Migration and ETL tools. However, from my perspective, the company’s application-related message may be equally (if not more) compelling.
Here’s why. During one of the keynotes, a Pervasive exec spoke of an “Integration ecosystem” encompassing connectivity, data quality, Big Data, and security. I found this definition to be particularly apt as a new way of thinking about both integration and Big Data. In fact, it appears that Big Data has now overflowed the BI bucket and is spilling into a host of adjacent areas, including APM. Big Data analysis has become an important tool for understanding and managing any system that generates massive amounts of data. And that last phrase definitely describes today’s complex application ecosystems.
As part of the customer interviews for my “APM for Cloud Services Radar Report”, published earlier this year, I spoke with a Netuitive customer who is using Netuitive to process a billion—yes, that’s billion with a “b”—metrics per day. Netuitive helps him make sense of the massive amounts of log file data, SNMP and WMI metrics, transaction tracing and network sniffing metrics, etc., thrown off by production mobile application execution in a carrier environment.
Everyone talks about application complexity, but the scope of this complexity is difficult to comprehend without a detailed understanding of modern application deployments and real-time application execution. While BI and APM are similar in that both “crunch” massive amounts of data to sift out important information, there is one big difference. For production APM, “Big Data” has to be analyzed in near real time. There’s no time to call in a DBA to run batch ETL or a BI expert to create and run reports on an OLAP cube. Data collection and analysis must be automated and must happen on the fly, as the data arrives.
With these and similar challenges always in the back of my mind, one of the most interesting presentations was by Mike Hoskins, Pervasive’s CTO. Pervasive has consistently invested heavily in R&D over the years, and Mike’s topics included Pervasive’s innovations in the areas of Big Data and Hadoop.
Pervasive has succeeded in creating a massively parallel processing system with its DataRush product, capable of scaling across multiple cores with near-100% processing utilization on each core. The only other vendor I am aware of that has invested so heavily in this type of core utilization is Citrix, which re-wrote its NetScaler code to scale across multiple cores with near-100% utilization. Regarding the Pervasive solution, Hoskins stated, “We have solved the multi-core programming problem”, enabling Pervasive data transformation or analytic applications to run at full bore on all cores, scaling from desktop to server to Hadoop cluster with no changes to the code.
Pervasive is also investing in the next generation of Hadoop by building replacement layers for faster computation inside Hadoop. The resulting marriage of DataRush and Hadoop is intended to deliver “effortless scaling”, creating a foundation for “Big ETL”.
My thought is this. What about going beyond “Big ETL” to “Big APM”? What would have to happen for Hadoop to become the basis for analyzing the Big Data underlying today’s most sophisticated application deployments? Applying Hadoop to APM would require near real-time processing, which has not been a focus of traditional BI systems but which might very well be possible given the ability to process data at “multi-core” scale.
The more I think about this idea, the more I like it. But while I have heard APM vendors messaging on “Big Data”, I haven’t heard of any that are leveraging Hadoop in this way.
I have to tell you, however, that I have spoken with several enterprise IT organizations toying with the idea of building Hadoop-based enterprise management analytics on their own (specifically for CMDB/CMS and application dependency mapping). If anybody out there is doing this, whether vendor or IT team, please get in touch and I’ll include a shout-out (anonymous if necessary) in a follow-up blog post. Meanwhile, it is certainly food for thought, as APM-on-Hadoop may well be the “killer app” that extends beyond BI to “EI”, or Enterprise Intelligence.