Platform for “machine data” Splunk aims to climb the value chain
By Tony Baer, Principal Analyst, Enterprise Solutions, Ovum
Splunk, which specializes in delivering a data platform for “machine data,” is approaching a turning point. The explosion of sensor data – part of the Big Data phenomenon – is pulling the company in different directions. With a base as the data platform for IT systems management and security programs, Splunk could expand to other forms of machine data, such as data from smart public infrastructure.
Or, as implied by the recruitment of key product executives from SAP and Oracle, it could venture higher up the value chain, developing more business-focused solutions around this competency. Either way, Splunk must choose its targets carefully. As a $150–$200m company, it can’t be all things. Splunk is already promoting itself as an operational intelligence platform that provides quick visibility of trends from low-level data. However, Ovum believes that the company could get more mileage in the market by positioning itself as an analytics platform that focuses on the velocity aspect of Big Data: Fast Data.
So far, so good
Splunk has a strong base with large enterprises, having penetrated at least half of the Fortune 100. Since going public in April, Splunk has proven an exception in a year of disappointing technology IPOs, such as those of Facebook and Zynga. Its shares have not only held their 90% first-day gains, but have risen an additional 10%. Results for Q2, announced at the end of August, slightly topped analyst estimates.
Splunk’s base has been the data center, where its data platform ingests and analyzes log files generated by servers, storage systems, network nodes, and mobile devices, along with software applications and databases. Growth has been driven by rising traffic to and from websites, as organizations seek to understand the performance of their Internet operations while ensuring security.
Growing out from data center roots
Splunk stores log files that are emitted by devices and software programs managed primarily by corporate data centers. These log files have traditionally been stored in silos, maintained by the respective application, operating system, or hardware provider, and are a natural complement to clickstream analytics.
Those log files tell the story of what happens to customer transactions; for instance, analysis can detect whether an abandoned shopping cart on an e-commerce site coincided with an internal problem or resulted from the user’s conscious decision. Likewise, log files can pinpoint security breaches, which has become a common use case for Splunk customers.
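To make the shopping-cart example concrete, the sketch below checks whether each abandoned cart was preceded by a server error within a short time window. It is only an illustration of the kind of correlation described above, not Splunk’s method: the sample events, field names, the `classify_abandonments` function, and the 60-second window are all assumptions invented for this example.

```python
from datetime import datetime, timedelta

# Hypothetical, hand-written sample events (assumed for illustration only).
ABANDONED_CARTS = [
    {"session": "s1", "time": datetime(2012, 9, 1, 10, 0, 30)},
    {"session": "s2", "time": datetime(2012, 9, 1, 11, 15, 0)},
]
SERVER_ERRORS = [
    {
        "host": "web-01",
        "time": datetime(2012, 9, 1, 10, 0, 10),
        "message": "payment gateway timeout",
    },
]


def classify_abandonments(carts, errors, window=timedelta(seconds=60)):
    """Label each abandoned cart by whether a server error was logged
    within `window` before the abandonment (an assumed heuristic)."""
    results = {}
    for cart in carts:
        # True if any error occurred at or before the abandonment,
        # and no more than `window` earlier.
        near_error = any(
            timedelta(0) <= cart["time"] - err["time"] <= window
            for err in errors
        )
        results[cart["session"]] = (
            "possible internal problem" if near_error else "likely user decision"
        )
    return results


if __name__ == "__main__":
    for session, label in classify_abandonments(ABANDONED_CARTS, SERVER_ERRORS).items():
        print(session, label)
```

A time-window join of this kind only shows coincidence, not cause; in practice an analyst would still need to confirm that the error actually affected the session in question.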
Splunk’s value-add is providing a common platform that accepts this log file data and indexes it, making it searchable. This has spawned a partner ecosystem and an active OEM business. For instance, the Splunk database is embedded in appliances from Cisco, F5 Networks, Juniper Networks, Coradiant, and others. It is also embedded in management tools from providers such as RightScale, Double-Take Software, NetMRI, and IBM Tivoli. Splunk also maintains partnerships with Microsoft, for instrumenting all levels of the Windows and Microsoft Office stack, and with VMware, for providing visibility into server virtualization.
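The general idea of indexing log data to make it searchable can be sketched with a toy inverted index, which maps each token to the log lines that contain it. This is a generic textbook technique shown under simplifying assumptions, not a description of Splunk’s actual index format; the sample log lines and the `tokenize`, `build_index`, and `search` functions are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical sample log lines (assumed for illustration only).
LOG_LINES = [
    "2012-09-01 10:00:10 web-01 ERROR payment gateway timeout",
    "2012-09-01 10:00:12 fw-02 WARN denied connection from 10.0.0.5",
    "2012-09-01 10:00:15 web-01 INFO checkout completed",
]


def tokenize(text):
    """Split text into lowercase tokens of letters, digits, '_' and '.'."""
    return re.findall(r"[a-z0-9_.]+", text.lower())


def build_index(lines):
    """Build an inverted index: token -> set of line numbers containing it."""
    index = defaultdict(set)
    for line_no, line in enumerate(lines):
        for token in tokenize(line):
            index[token].add(line_no)
    return index


def search(index, lines, query):
    """Return the lines that contain every token of the query (AND search)."""
    terms = tokenize(query)
    if not terms:
        return []
    # Intersect the posting sets of all query tokens.
    hits = set.intersection(*(index.get(term, set()) for term in terms))
    return [lines[i] for i in sorted(hits)]


if __name__ == "__main__":
    idx = build_index(LOG_LINES)
    for line in search(idx, LOG_LINES, "web-01 error"):
        print(line)
```

Because the query is tokenized the same way as the log lines, a search for `web-01 error` matches only lines containing `web`, `01`, and `error` together, without scanning every line at query time.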
So where does Splunk go from here?
Splunk has several options. It could start actively pursuing machine data originating outside the data center, such as data from power grids or other utilities, transportation networks, weather stations, manufacturing, or supply chains. That would require targeting new verticals; as a $150–$200m company, Splunk needs to aim carefully.
Instead, Splunk is doubling down on what it knows best – the data center. But it is also looking to go higher up the value chain. Following the IPO, the company recruited key product executives from SAP and Oracle.
Today, Splunk’s products are intuitive only to application developers, systems admins, or systems operators. The interface is search-based, which requires domain knowledge of log file descriptors. A semantic metadata layer atop Splunk’s indexes would be necessary to provide more abstracted views that are meaningful to business users. Given that the company’s new senior vice president of products is an SAP veteran who oversaw integration of the Business Objects acquisition (and its Universe metadata layer), such a direction would not be surprising.
Splunk is not the first to pursue this. Presenting higher-level views of infrastructure data has long proven an elusive goal for systems management providers such as CA, IBM Tivoli, and HP. They have sought to transform their monitoring consoles into dashboards that show the availability of business processes such as order-to-cash or available-to-delivery, but have had little success.
For Splunk to get there, it must connect to the right data; that will require partnerships with the IBMs, SAPs, and Oracles of the world to gain visibility into application metadata that could provide context for buckets of log files. To its credit, Splunk has pulled off similar feats with infrastructure providers that have traditionally guarded their turf jealously; it must do so again with a new level of partner. The next piece of the puzzle involves learning to target higher up in the IT organization, which will require changes in how Splunk goes to market.
Operational intelligence = Fast Data
With machine data playing a big role in the explosion of data, it makes sense for Splunk to ride the Big Data wave. From the technology side, it has introduced adapters to read Hadoop logs, just as it already does with commercial SQL database platforms. It has also begun developing tools for utilizing Hadoop as an extended data store for Splunk data with connectors that stream or archive data that would otherwise be purged. With the new version 5.0 release, Splunk has added the ability to federate its indexes, which improves performance and scalability.
Ovum believes that Splunk’s own positioning as a platform for “operational intelligence” at the systems or device level provides a strategic tie-in to Big Data: it performs rapid analytics across larger samples of operational data. While not a real-time platform in the strict sense, Splunk can provide current snapshots by aggregating and making sense of variably structured log data. Splunk has a good opportunity to distinguish itself by mastering the velocity side of Big Data, which Ovum defines as Fast Data.