Basic Log Storage Calculations
Sizing a log management system requires knowing the number of devices being monitored and the anticipated event rate for each class of system. In many customer engagements, Professional Services time may be required to measure the event rates of all the monitored devices. This is important because there are too many variables to predict the average or peak Events Per Second (EPS) of any given system. I would caution any customer that if the vendor they are working with gives them “magic” calculations and pricing without gathering the necessary information about customer-specific speeds and feeds, they can expect to spend a lot more money later, once the vendor gets its foot in the door. Basically, poor planning will result in unavoidable OpEx costs later.
EPS is one metric used by many log management and SIEM vendors to determine such factors as licensing, storage and peak system loads. Another variable used could be Events Per Day (EPD), especially when it relates to storage sizing and license enforcement. This is why it’s imperative that accurate device counts and product types are audited when planning a centralized log management or SIEM solution.
As an example, a PIX firewall logging via Syslog at the notification level could generate anywhere from 1 to 20 EPS, depending on location, exposure to untrusted networks, number of filters, 3DES services (SSH or IPsec), proper configuration and many other factors specific to each configuration and the networks it is intended to protect. If that same firewall were logging at the informational or debugging level, it would generate between 3 and 5 times the events that notification-level logging would generate.
Next, event size is crucial in properly designing log management, as every device vendor will have a different log format, event size, transport mechanism, logging levels, etc. The difference between a 300 byte message and a 700 byte message is significant when you are capturing 1000 EPS or more (at 1000 EPS, ~26 GB/day vs ~60 GB/day). Syslog messages in accordance with RFC 3164 may not be larger than 1024 bytes, but structured or “normalized” event data can reach upwards of 5,000 bytes (with padding and fragmentation). In some cases, when a vendor tells you they normalize all of the event data, this simplifies the sizing and capacity planning because every event message, regardless of vendor, will be a consistent size (not to mention easier to read, search, index, etc.). Some vendors even allow customers to normalize the events and reserve a field within the normalization schema to attach the original “raw” event. This is great for litigation and forensics but significantly increases the storage requirements! As an example, if the normalized event becomes 1500 bytes (regardless of whether the raw event was only 600 bytes), the final event size, with the 600 byte “raw” event attached, would be somewhere around 2 KB.
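To make the effect of event size concrete, here is a minimal Python sketch of the arithmetic in the paragraph above. The function name daily_volume_bytes is my own illustration, not part of any vendor tool, and it assumes a constant event rate and decimal gigabytes (1 GB = 1,000,000,000 bytes).

```python
SECONDS_PER_DAY = 86_400


def daily_volume_bytes(eps: float, event_size_bytes: int) -> float:
    """Uncompressed bytes written per day at a constant event rate (assumption)."""
    return eps * SECONDS_PER_DAY * event_size_bytes


# Compare the two message sizes from the text at 1000 EPS.
for size in (300, 700):
    gigabytes = daily_volume_bytes(1000, size) / 1e9
    print(f"{size}-byte events at 1000 EPS: {gigabytes:.1f} GB/day")
```

The same function can be fed a normalized size plus an attached raw event (for example 1500 + 600 bytes) to see what keeping the original message costs per day.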
One way of measuring the event rates and event sizes for Syslog is to use a protocol analysis tool such as Wireshark, EtherPeek or tcpdump to capture the events on the sending or receiving host, or off of a spanned port on a layer 2 switch. Filter the capture for only UDP 514. The analysis does not need to capture any payload and can be run for 24 hours. Once the capture is complete, take the total count of UDP 514 packets and divide that number by 86,400 (the number of seconds in a day); that should give you a rough average of the total events per second (EPS).
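The division described above can be sketched in a few lines of Python. This assumes each UDP 514 packet carries exactly one Syslog message; the function name average_eps and the packet count in the example are made up for illustration.

```python
def average_eps(event_count: int, capture_seconds: int = 86_400) -> float:
    """Average events per second over a capture window (default: 24 hours)."""
    if capture_seconds <= 0:
        raise ValueError("capture_seconds must be positive")
    return event_count / capture_seconds


# Hypothetical 24-hour capture that saw 8,640,000 UDP 514 packets.
print(average_eps(8_640_000))
```

Passing a shorter capture_seconds lets you reuse the same calculation for a one-hour sample, although a short window is less likely to reflect the daily average.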
Additionally, calculating the EPS generated by a log file is much easier since you just need to count the number of lines captured to a log file in a 24 hour period and, again, divide that number by the number of seconds in a day (86,400). Then you would multiply either EPD or EPS by the message size to determine storage.
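For log files, the line count and the average message size can be taken in one pass. The sketch below assumes one event per line and a file that covers exactly the period you pass in (24 hours by default); log_stats and the example path are hypothetical names, not part of any product.

```python
def log_stats(log_path: str, period_seconds: int = 86_400):
    """Return (events, average EPS, average bytes per event) for one log file."""
    events = 0
    total_bytes = 0
    # Binary mode so that sizes are counted in bytes, not decoded characters.
    with open(log_path, "rb") as log_file:
        for line in log_file:
            events += 1
            total_bytes += len(line)
    average_size = total_bytes / events if events else 0.0
    return events, events / period_seconds, average_size


# Example usage (the path is an assumption, substitute your own file):
# events, eps, size = log_stats("/var/log/messages")
# print(f"EPD={events}, EPS={eps:.2f}, average size={size:.0f} bytes")
```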
My explanation of log management sizing has always been:
RAW event = ~600 bytes
NORM event = ~1500 bytes
DAY = 86,400 seconds
EPS = Events Per Second
EPD = Events Per Day
SIZE = Amount in bytes
DISK = Disk space requirements
COMPRESS = Assume 10:1 ratio
First, we must determine the EPD, therefore:
EPS x DAY = EPD
(e.g. 1000 EPS x 86,400 seconds = 86,400,000 EPD or 86.4 MEPD)
Then we must determine how much disk space that will yield depending on whether they are RAW events or normalized events:
EPD x RAW = SIZE
(e.g. 86.4 MEPD x 600 = 51,840,000,000 bytes)
~ or ~
EPD x NORM= SIZE
(e.g. 86.4 MEPD x 1500 = 129,600,000,000 bytes)
Then we need to compress those events with 10:1 compression to get an approximate daily disk requirement. To do this we divide the daily uncompressed size (SIZE) by 10:
SIZE / COMPRESS = DISK (RAW)
(e.g. 51.84 GB / 10 = 5,184,000,000 bytes)
~ or ~
SIZE / COMPRESS = DISK (NORM)
(e.g. 129.6 GB / 10 = 12,960,000,000 bytes)
Finally, we determine the annual required disk space by multiplying the daily disk requirement by 365:
DISK (RAW) x 365 = YEAR
(e.g. 5,184,000,000 x 365 = 1,892,160,000,000 bytes or ~1.9 Terabytes)
~ or ~
DISK (NORM) x 365 = YEAR
(e.g. 12,960,000,000 x 365 = 4,730,400,000,000 bytes or ~4.7 Terabytes)
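The four steps above can be sketched in Python as follows. The constants mirror the definitions in this post (600 and 1500 byte events, 10:1 compression); the function name storage_estimate is my own, and real compression ratios and event sizes will vary by product and data.

```python
SECONDS_PER_DAY = 86_400   # DAY
RAW_EVENT_BYTES = 600      # RAW (approximate)
NORM_EVENT_BYTES = 1_500   # NORM (approximate)
COMPRESSION_RATIO = 10     # COMPRESS, assumed 10:1
DAYS_PER_YEAR = 365


def storage_estimate(eps: float, event_bytes: int) -> dict:
    """Walk through EPD -> SIZE -> DISK -> YEAR for one event size."""
    epd = eps * SECONDS_PER_DAY        # events per day
    size = epd * event_bytes           # uncompressed bytes per day
    disk = size / COMPRESSION_RATIO    # compressed bytes per day
    year = disk * DAYS_PER_YEAR        # compressed bytes per year
    return {"EPD": epd, "SIZE": size, "DISK": disk, "YEAR": year}


for label, event_bytes in (("RAW", RAW_EVENT_BYTES), ("NORM", NORM_EVENT_BYTES)):
    result = storage_estimate(1000, event_bytes)
    print(f"{label}: {result['YEAR'] / 1e12:.2f} TB/year")
```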
Once the EPS has been determined for each category of device, multiply it by the number of monitored devices in that category and sum the results across all categories to get an estimated total average EPS.
To determine the storage required for this total EPS, convert it to EPD and use the formula described above, which is:
EPD * RAW / 10 * 365 = YEAR (compressed)
~ or ~
EPD * NORM / 10 * 365 = YEAR (compressed)
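As a rough sketch of the whole process, the snippet below sums EPS over a device inventory and applies the combined formula. The inventory (device categories, counts and per-device EPS) is entirely made up for illustration, and the 600 byte, 1500 byte and 10:1 figures are the same assumptions used throughout this post.

```python
SECONDS_PER_DAY = 86_400

# Hypothetical inventory: category -> (device count, average EPS per device).
inventory = {
    "firewall": (4, 20.0),
    "windows_server": (50, 2.0),
    "switch": (30, 0.5),
}


def total_eps(devices: dict) -> float:
    """Sum of (device count x per-device EPS) across all categories."""
    return sum(count * eps for count, eps in devices.values())


def yearly_bytes(eps: float, event_bytes: int, compress: int = 10) -> float:
    """EPD * event size / compression ratio * 365, as in the formula above."""
    epd = eps * SECONDS_PER_DAY
    return epd * event_bytes / compress * 365


eps = total_eps(inventory)
print(f"Total average EPS: {eps}")
print(f"RAW:  {yearly_bytes(eps, 600) / 1e12:.2f} TB/year")
print(f"NORM: {yearly_bytes(eps, 1500) / 1e12:.2f} TB/year")
```

Keep in mind this uses average EPS only; peak rates need to be sized separately.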
I hope this is useful in determining the amount of storage required. I have a handy calculator you can use once you determine the EPS for all the different event source types at http://www.netcerebral.com/?p=125#more-125