Currently, we collect our application logs and send them through a Kafka broker to Splunk for daily monitoring. The volume is more than 80 GB/day, and we are paying a heavy amount to keep these logs in Splunk (90-day retention period).
We would like to reduce the volume of logs stored in Splunk to something that is financially viable for our budget. We are also exploring whether it would be better to keep the logs in a separate store in another tool, such as S3, and query that from Splunk to avoid storing the logs directly in Splunk.
2 Answers
Splunk used to be able to search Hadoop file systems, but that ability hasn’t been supported for a few years. Now, the only place Splunk will search is its own database.
If you run Splunk in AWS or Google Cloud then consider using SmartStore to off-load some of your storage to S3 (or the GCP equivalent). That can save some money.
To reduce your storage, try reducing your ingestion. Bring in only the data that fits your use cases. Don’t add debug logs to Splunk. Trim the rest to remove unneeded text, especially in Windows event logs.
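As a rough illustration of that advice, here is a minimal Python sketch of a filter you could run on events before they reach Splunk (for example in whatever consumes from Kafka). The field names `level` and `message`, the set of levels to drop, and the boilerplate pattern are all assumptions for the example; adjust them to your own log format.

```python
import re
from typing import Iterable, Iterator, Optional

# Assumed rules for this sketch: drop these levels entirely.
DROP_LEVELS = {"DEBUG", "TRACE"}

# Assumed example of descriptive text that repeats in every event and
# adds volume without adding information.
BOILERPLATE = re.compile(r"\s*This event is generated when.*", re.DOTALL)


def reduce_event(event: dict) -> Optional[dict]:
    """Return a trimmed copy of the event, or None if it should be dropped."""
    if event.get("level", "").upper() in DROP_LEVELS:
        return None
    trimmed = dict(event)  # copy so the caller's event is not modified
    trimmed["message"] = BOILERPLATE.sub("", event.get("message", ""))
    return trimmed


def reduce_stream(events: Iterable[dict]) -> Iterator[dict]:
    """Yield only the events worth indexing, with unneeded text removed."""
    for event in events:
        reduced = reduce_event(event)
        if reduced is not None:
            yield reduced
```

Measuring the byte size of events before and after a filter like this on a day's sample is a cheap way to estimate how much ingestion you would actually save.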
Another option is a third-party tool called Cribl Stream, which sits between your sources and your Splunk indexers. It gives you an easy-to-use interface for reducing data. The Windows pack alone will reduce data volumes by 20-25% with no dropped data. If you choose to drop data, you can reduce your volume even more.
Stream also has an easy-to-use feature that lets you clone your data stream: it writes the full stream of logs to an object store such as AWS S3 or Azure Blob, and then writes the smaller, processed copy to Splunk.
You can then use Stream Replay to search for data in the object store and restore it to Splunk on demand. You get a smaller daily volume in Splunk, plus the safety of knowing you can bring back anything from storage very quickly and with little effort.
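To make the clone-and-replay idea concrete, here is a small generic Python sketch of the pattern. It is not Cribl's API or configuration; `route`, `replay`, `archive`, `index_sink`, `keep`, and `match` are hypothetical names, and the archive is any writable text stream standing in for an object store.

```python
import json
from typing import Callable, Iterable, Iterator, List, TextIO, Tuple


def route(
    events: Iterable[dict],
    archive: TextIO,
    index_sink: List[dict],
    keep: Callable[[dict], bool],
) -> Tuple[int, int]:
    """Write every event to the archive; forward only events where keep(event) is true.

    Returns (number archived, number forwarded).
    """
    archived = forwarded = 0
    for event in events:
        # Full-fidelity copy: one JSON document per line.
        archive.write(json.dumps(event) + "\n")
        archived += 1
        # Reduced copy: only what the indexer needs day to day.
        if keep(event):
            index_sink.append(event)
            forwarded += 1
    return archived, forwarded


def replay(archive_lines: Iterable[str], match: Callable[[dict], bool]) -> Iterator[dict]:
    """Re-read archived lines and yield the matching events for re-ingestion."""
    for line in archive_lines:
        event = json.loads(line)
        if match(event):
            yield event
```

The point of the design is that the decision about what to index becomes reversible: anything filtered out of the daily stream still exists in cheap storage and can be pulled back when an investigation needs it.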
My team got massive value from using Cribl Stream with Splunk. It changed everything we did with data and opened up a ton of new options we never had before.
The community version is free up to 1TB per day so you can get fast value with minimal investment.