My Splunk Origin Story
A World Without Splunk
In my pre-Splunk days, I spent significant time leading the vision for standards and automation in our company's large, distributed IBM WebSphere Network Deployment environment. Even though we used standard build tools and a mature change process, significant entropy and deviation crept into the environment as requirements arrived from tuning, the business, infrastructure, security, and compliance.
As a result, we were unable to recognize the scope of impact when it came to security vulnerabilities or third-party compliance violations. Even worse, we spent far too many staff-hours trying to replicate issues between the production and quality assurance environments because we had no easy way to identify the contributing configuration differences.
It’s a Bird, It’s a Plane, It’s Splunk!
Given the challenge of aggregating and correlating disparate data, my searching eventually led me to Splunk.
I quickly grew acclimated to Splunk. The Search Tutorial walked me through the install. The documentation was easy to read and like nothing I had ever seen – I didn't need a PhD to understand it, and I immediately saw how I would get value. After some time playing and working with the developers of the [then beta] add-on for WebSphere (the latest is at https://splunkbase.splunk.com/app/2789/), we were up and running with WebSphere's configuration files populating in Splunk. We were finally able to compare environments to each other, to themselves over time, or against the entire infrastructure thanks to Splunk's search processing language, field extraction, and native processing of XML. Furthermore, Splunk's schema-on-the-fly meant that as WebSphere's XML object structure changed, we could make minor tweaks and adjustments without having to rebuild a database, re-code a custom solution, or wait for an updated product from the vendor.
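To give a flavor of what this looked like in practice, a search along these lines can surface hosts whose JVM heap setting differs from the rest of an environment. This is only a sketch: the index, sourcetype, source path, and XML field path below are hypothetical stand-ins for whatever your WebSphere add-on actually extracts.

```spl
index=websphere sourcetype="websphere:config" source="*server.xml"
| spath
| rename "server.processDefinitions.jvmEntries{@maximumHeapSize}" AS max_heap
| stats latest(max_heap) AS max_heap BY host
| eventstats dc(max_heap) AS distinct_settings
| where distinct_settings > 1
```

The idea is simply that spath extracts fields from the XML at search time (schema-on-the-fly), stats reduces to the latest value per host, and eventstats lets you compare each host against the population, so any drift shows up as more than one distinct value.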
Most importantly, the model around data aggregation with forwarders eliminated the risk of entropy inherent in other solutions. What we saw in Splunk was what was actually set on the infrastructure, regardless of changes. No one had to manually update records to match documented changes. In fact, because Splunk's view of the configuration was read-only, it satisfied audit and change control concerns that other products presented. Lastly, because Splunk is the platform for machine data, regardless of structure, we were able to correlate problems and configuration with data from the Java runtime (JMX), system metrics from the operating system (CPU, memory, disk, etc.) for both Windows and AIX, and both JVM and application logs. The compounding value from these otherwise disparate data sources was astonishing.
And There Was Much Rejoicing!
Thanks to Splunk’s dashboarding capabilities, I was able to create dashboards that dynamically presented configuration discrepancies between two JVMs. This addressed our challenge of identifying which discrepancies were contributing to unexpected runtime behavior. Another dashboard showed which JVMs were missing, or had a non-standard value for, a custom property selected from a dynamically populated drop-down of all JVM custom properties that existed throughout the infrastructure. This satisfied our challenge of understanding the scope of impact for vulnerabilities and compliance issues addressed by such properties. Given the small and infrequent volume of configuration changes, this solution could be implemented with the free Splunk license. Using the free license obviously has limitations on functionality, but I mention this because it’s worth highlighting how much value we were able to get with such a small Enterprise license.
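As a rough sketch of the second dashboard's logic (field names hypothetical): given events carrying each JVM's custom properties, the search below keeps only JVMs whose value for the selected property, supplied here by a $prop_name$ dashboard token from the drop-down input, is in the minority across the infrastructure.

```spl
index=websphere sourcetype="websphere:config" property_name="$prop_name$"
| stats latest(property_value) AS value BY jvm_name
| eventstats count AS jvms_with_value BY value
| eventstats max(jvms_with_value) AS majority
| where jvms_with_value < majority
```

This treats the majority value as the de facto standard. JVMs that lack the property entirely produce no events here at all; catching those as well would take something like a lookup of all known JVMs joined against the results.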
The Cliff Hanger
Unfortunately, I left that company (to become a full-time Splunk admin!) before the solution was fully deployed. But that’s why I wanted to share it with you. I’d love to hear how others are able to demonstrate amazing value by adding configuration data to the operating system and log data that they are already Splunking. Please build on this solution and share it in the comments…or even create your own app or Data Models for Splunk! Show the rest of us how you’ve found your own way to “make machine data accessible, usable and valuable to everyone!”