The following form assumes you have done the preliminary math of determining your number of devices and the total anticipated Events Per Second (EPS) you will be collecting from all of your logging devices. The calculator uses EPS to determine the Events Per Day (EPD) and the amount of raw and normalized log data you will generate daily, and then uses the retention and compression values you set to determine the required amount of storage as well as the IOPS required.
Stay tuned and I will be building another calculator that will allow you to specify number of devices by device type, which will give you your estimated EPS (needed as a starting point for this calculator).
This is a tutorial I posted on Anti-Online back in 2006 – just thought I’d update it and pass it along. It makes me laugh when I see some of this old scripting “Kung Fu” I had to do with Grep, Awk, Sed in order to do something that takes seconds with a good CLM or SIEM tool!
DISCLAIMER: This is a tutorial of sorts that takes you through a day-to-day problem and solution that I was often faced with in my Security Planning / Operations role for a large Telecommunications company. I am not making any assumption as to where in the curve people reading this will be situated and I don’t even guarantee this will be a good read. In fact, given my exposure and expertise of the tools used in this article, I may be missing the plot and some may find an easier, softer way of doing what I was tasked to do. Having said all of this, for those I’ve confused, sorry, I tried to provide links for further reading. For those I’ve disgusted with my simplicity or seeming Lamer approach, well, like you, I’m always learning and I’m open to criticism and advice.
Why is it when you Google for something you absolutely need you can never find it? Well, case in point, I had a Squid proxy server left over from a decommissioning project that was still seeing tons of traffic when it shouldn’t be seeing any! The Linux server was locked down using sudo and no one knew the root password, so we had very few choices as to what programs we could run to view activity. The server was flaky and Netstat would never finish outputting the current activity. So the server folks approached me and asked if there was any way to find out which unique internal IP addresses were connecting to the five pre-configured proxy ports (8080, 8082, 8084, 8086, 8888).
As it turns out, the Squid admin user had access to the Tcpdump application and could run it against Eth0. I got him to run Tcpdump and output it to a dump file for three hours’ worth of activity during the lunch hour web traffic spike. This produced a 470MB text file that I had to SFTP from his server to my Linux box.
Alrighty then! What do I do with a honkin’ text file that repeats the same info endlessly? We have hits from employees and internal servers hitting the proxy ports, the proxy itself establishing connections to the web, the foreign sites replying to the proxy and then, finally, the proxy returns the data to the corporate host. One conversation from an internal host connecting to the homepage of their favorite security tutorial site could warrant four times the number of HTTP flows. I needed to strip out extraneous information and narrow down the million+ lines of data to something sensible. So, I started thinking of the commands that would be required so that eventually I could write a shell script.
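As a rough illustration of that filtering step, here is a minimal Python sketch rather than the Grep/Awk/Sed commands from the original tutorial. It assumes the dump contains tcpdump's usual one-line text format with numeric addresses (for example `IP 10.1.2.3.51234 > 10.0.0.5.8080:`); the function name and the sample lines are made up for the example. Keeping only packets whose destination port is one of the five proxy ports drops the proxy-to-web, web-to-proxy and proxy-to-client legs of each conversation.

```python
import re
from collections import Counter

# The five proxy ports named in the text.
PROXY_PORTS = {8080, 8082, 8084, 8086, 8888}

# Assumed line shape (tcpdump text output with numeric addresses):
#   12:01:33.123456 IP 10.1.2.3.51234 > 10.0.0.5.8080: Flags [S], ...
LINE_RE = re.compile(
    r"\bIP (\d{1,3}(?:\.\d{1,3}){3})\.(\d+) > "
    r"(\d{1,3}(?:\.\d{1,3}){3})\.(\d+):"
)


def unique_clients(lines, ports=PROXY_PORTS):
    """Count packets per source IP that were sent to one of the proxy ports."""
    hits = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # not an IPv4 TCP/UDP line in the assumed format
        src_ip, _src_port, _dst_ip, dst_port = match.groups()
        if int(dst_port) in ports:
            hits[src_ip] += 1
    return hits


# Made-up sample lines covering the different legs of a proxied request.
sample = [
    "12:01:33.123456 IP 10.1.2.3.51234 > 10.0.0.5.8080: Flags [S], seq 1, length 0",
    "12:01:33.223456 IP 10.0.0.5.8080 > 10.1.2.3.51234: Flags [S.], seq 2, length 0",
    "12:01:34.000001 IP 10.1.2.3.51240 > 10.0.0.5.8888: Flags [S], seq 3, length 0",
    "12:01:35.000002 IP 10.9.9.9.40000 > 10.0.0.5.8082: Flags [S], seq 4, length 0",
    "12:01:36.000003 IP 10.0.0.5.33000 > 93.184.216.34.80: Flags [S], seq 5, length 0",
]

for ip, count in sorted(unique_clients(sample).items()):
    print(ip, count)
```

For a 470MB file you would pass the open file object instead of a list, so the lines are streamed rather than loaded into memory.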
Many of the competing log management and SIEM tools on the market these days use some variation of the Events Per Second (EPS) metric to determine the licensing, sizing and storage requirements for a scalable solution. Unfortunately, none of the devices that are to be monitored have a specification for the amount of logging that will be generated per second (or volume per day, for that matter!) by the device. Moreover, many devices of the same type from the same vendor will generate varying amounts of log volume daily, and it’s more of an art than a science to determine the total volume all of the corporate devices will generate daily.
Determining EPS isn’t a problem for existing log management or SIEM customers looking to upgrade to a new solution as they can generate reports from the old log management/SIEM tool and provide a break-down of device type and the daily volumes generated by each device category. However, prospects looking for a proposal for a net-new solution are plagued with the following tasks to properly design a log management or SIEM solution:
- Complete inventory of all assets they plan on monitoring
- Determining average, sustained event rates expressed as an EPS metric
- Understanding how logging levels impact the volume of logs that are generated
- Retention periods, storage options, use cases, regulatory requirements, ad infinitum
Fortunately, once you have a device count and can determine the EPS generated on average by each of the different device categories you need to monitor, the math to determine the licensing, storage, system performance and archiving needs is easy. My post “Basic Log Storage Calculations” http://www.netcerebral.com/?p=208 can assist in the sizing, as this post is geared more towards guessing the EPS averages for each device type.
In my roles as a presales SE selling log management and SIEM, we were often asked by prospects for budgetary quotes, proposals and architecture with little to no empirical data. In most cases the best we could get out of the prospect was an itemized inventory of the number and types of systems they would like to monitor. Without an understanding of the log volumes generated by devices, unique to every customer’s environment, we had to come up with a system of determining the EPS for the different device classes and using this as a starting point for calculating daily storage (EPS * Event_Size * 86400 / Compression Ratio, where 86400 is the number of seconds in a day).
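To make the daily storage formula concrete, here is a minimal Python sketch. It uses 86,400 seconds per day, and the input numbers (1,000 EPS, 300-byte events, 10:1 compression, 90 days of retention) are placeholders chosen for the example, not field averages. The function names are my own.

```python
SECONDS_PER_DAY = 86_400  # 24 * 60 * 60


def daily_storage_bytes(eps, event_size_bytes, compression_ratio):
    """EPS * Event_Size * seconds per day / Compression Ratio."""
    return eps * event_size_bytes * SECONDS_PER_DAY / compression_ratio


def retained_storage_bytes(daily_bytes, retention_days):
    """Storage needed to keep retention_days of compressed logs online."""
    return daily_bytes * retention_days


# Placeholder inputs, for illustration only.
eps = 1_000             # sustained events per second
event_size = 300        # average event size in bytes
compression = 10        # a 10:1 compression ratio
retention = 90          # days of retention

daily = daily_storage_bytes(eps, event_size, compression)
total = retained_storage_bytes(daily, retention)
print(f"Events per day: {eps * SECONDS_PER_DAY:,}")
print(f"Daily storage:  {daily / 1e9:.3f} GB")
print(f"{retention}-day storage: {total / 1e9:.1f} GB")
```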
The list below is an example of lessons learned in the field from actual customer environments and a document provided by SANS (sponsored by NitroSecurity – now McAfee) called “Benchmarking Security Information Event Management (SIEM)” (found at http://www.sans.org/reading_room/analysts_program/eventMgt_Feb09.pdf). With the information we collected we devised a list, which is a cross-section of averages per event source.
I hope you find this helpful:
It’s been a while since I had to put my SANS Incident Handling hat on or did root-cause analysis and Network Forensics on an actual attack this close to home. December 13th, 2011 marks the day that 144 websites mapped to the same IP address hosted by HostPapa were injected with a number of files that replaced their home pages with that of some script kiddy: website defacement on a large scale. Admittedly, netcerebral.com was one of the 144, as were two other sites that I manage part-time.
The attacks appeared to originate from Kuwait (inconclusive) and when I traced the names of the attackers, their email addresses and the “Muslim Hackers” they were sending “GR33T5” to, it became evident that this was “bragging rights” under the shroud of “hacktivism”. In fact, the hackers went as far as to list all 144 websites hacked at the same HostPapa IP address on www.zone-h.org, a public attribution site for website defacements where hackers brag and place “mirrors” of the defacements as proof of their misconduct.
The hackers jointly go by the alias of “7rb-team” and, according to zone-h.org, have successfully defaced 3,414 homepages since December 2nd, 2011 (and are currently still active with almost 100 defacements daily, in January 2012 alone).
Since HostPapa has not provided the access logs for the date of the attack (they had been requested but HostPapa doesn’t keep archives) we are left to assume the attack vector that was used to inject the PHP code into the websites. I have narrowed it down to either a SQL Injection or PHP URL Inclusion. The sites all had “wp__” ID tags on the WP core, no .htaccess files, out-dated WP PHP plugins and a number of other vulnerabilities inherent to WP (themes are another possibility). I suspect the attackers used recon scanning to detect the open vulnerabilities on the site and then exploited one of them to write files to the root of the virtual directory.
Once the PHP shell was injected, they connected remotely and ran the Syrian Shell which automated the creation of all “index.htm” files and downloaded all of the other artifacts that I found on the site.
The service provider detected the mass infection across the customer’s sites a day after the attacks and shut down the sites. They opened a ticket and notified one of the billing contacts that the site had been shut down and instructed us to back up the site so they could wipe it away and we could then manually restore the site. Fortunately, I had backups that I had done months prior to the attack, but some of the newer posts were missing. The other issue is that, while I had backups of the site directories and MySQL for each, the attackers had injected files into the home root directory that needed to be cleaned up as well (directories such as /cgi-bin, /cpanel, etc. were all infected).
I eventually decided to back up the entire site with all three domains, then download and unzip the backups on my local PC, where I had Apache, PHP and MySQL running in a VM sandbox. I went through the painstaking task of removing 50+ occurrences of “index.htm” (the defacement page) and 5 instances of PHP shell kit code that had been injected in the root of the parent website. Next, I dropped all of the tables in two of the databases (the third site doesn’t use a DB) and restored from backup in phpMyAdmin. Once the sites were functioning the way I wanted, I upgraded the WP core, updated all plugins and then installed the WDS Security plugin, which found additional vulnerabilities that I cleaned up on both sites.
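Hunting down every copy of the defacement page by hand is tedious, so here is a minimal Python sketch of how the search could be scripted against an unpacked backup. It is not what I ran at the time. The function name is hypothetical and the demo builds a throwaway directory tree instead of touching a real site. It only lists candidates, so each hit can be reviewed before anything is deleted.

```python
import tempfile
from pathlib import Path


def find_named_files(root, name="index.htm"):
    """Return every regular file under root whose file name is exactly `name`."""
    return sorted(p for p in Path(root).rglob(name) if p.is_file())


# Demo on a temporary tree that mimics an unpacked site backup.
with tempfile.TemporaryDirectory() as tmp:
    site = Path(tmp)
    (site / "cgi-bin").mkdir()
    (site / "index.htm").write_text("defaced")
    (site / "cgi-bin" / "index.htm").write_text("defaced")
    (site / "index.php").write_text("<?php // legitimate page ?>")

    for path in find_named_files(site):
        # Print paths relative to the site root for easier review.
        print(path.relative_to(site))
```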
The Evil Script
One other advantage of having the VM sandbox is that after I made a backup and export of the sanitized site, I reverted the VM snapshot back to when the site was infected and played around with the Syrian Shell (not recommended in a prod environment!) and could replicate what the attackers did once they had the PHP file uploaded to the site.
When you open up the code in an editor the first line of the code reads:
# syrian shell is a php evil script , please use it against Israel Only
Apparently the attackers didn’t read this line and showed no discrimination about who their targets were going to be.
The malicious script also comes with a GNU Public License disclaimer with more preamble about attacking Israel and then proceeds to allow the attacker to configure their own password for the shell.
The script then immediately starts to list privileged functions such as:
- Get Real IP Address
- Open Base Directory
- Base64 Encode/Decode
- Safe Mode (Read-Only)
- Search and Count a File Name (such as index.html)
- Suicide (aborts and deletes shell)
- CMD Shell (Win/Linux)
- Index Changer (supports multiple CMS tools)
- Get Passwords (reads /etc/passwd, domainalias and shadow files)
- System Info (runs netstat, arp, routes, ls, etc)
- MD5 Password Hashing
- Database Tools (Oracle, MS SQL, MySQL and PostgreSQL)
To be fully convinced that I was no longer at risk, I upgraded to the latest WP v3.3.1 on all sites, updated all plugins, disabled any that weren’t in use, created .htpasswd and .htaccess files, installed the IP Filter plugin to block a list of bad IP addresses and installed WDS Security on all sites (and corrected any issues detected by WDS). I have since started to automate backups of the MySQL database and WP files so next time I get hacked, I can simply drop all the tables in the DB and restore from a backup.
I have definitely learned a valuable lesson in how vulnerable PHP/WP is and will stay on top of the site with updates, etc.
While we may have a great library of RFP responses, every new RFP has those challenging questions that will require creative writing. They’re always scenarios that no vendor (including the competitor) can address because prospects are looking to combine functionality from multiple security projects and have the SIEM tool save them budget or provide a “Swiss Army Knife” solution. These sorts of questions require clarification, advanced technical writing skills and consume hours while you ponder the final response.
An average RFP will be between 100 and 200 questions in length and take each resource approximately 20 minutes per question to research, cut and paste the answer, embellish, format and correct grammar before moving to the next question. With that in mind, it’s no wonder a 100-question RFP would take an SE 30-40 hours to complete, much to the sales rep’s chagrin. Final formatting, cover letters and waiting for the 5-10% of responses that have been farmed out to engineering, marketing or sales teams usually adds another 16-20 hours, thus totalling more than a week’s worth of work for the SE.
Now, if you don’t have an in-house, dedicated RFP response team, extensive knowledgebase or boilerplates, and you have limited SE resources that are busy with customer meetings, demos or POCs, now you have to increase the response time given that the SEs will only have time in the evenings to work on the RFP.
Having written RFPs in the past, I know it takes months to put together the requirements and agree on what product features you seek from vendors and then do your homework so you know what vendors to include in the response – yet the response deadline is usually between 7 to 15 days.
To streamline and provide appropriate resource coverage here are some things we tend to do:
- Assess the product fit and decline to respond if the RFP is clearly influenced with competitor differentiators
- Immediately ask for an extension (sometimes this is best done by the SE manager – never say you are too busy!)
- If working with a channel partner, ask them to assist in the responses (provide them with boiler-plates, past RFP responses)
- Evaluate the schedules of your SE team and determine who has the most cycles to contribute to the bulk of the work
- Consistently update a central repository of RFP knowledge with any unique questions discovered during an RFP
- Build response templates and distribute to the SE team immediately
- Get the Sales Rep involved during the RFP event, providing cover letters, company background, perhaps some of the easy technical questions
- Farm out a portion of the RFP to other SE organizations outside of your region
- Seek out internal project management teams that may be dedicated to RFP responses – they may be able to answer the easy questions, manage formatting, printing, binding and can manage the resource deadlines
- Establish and maintain relationships with some of the prospect’s technical owners, getting them to assist in wording, proposals or additional clarification questions after the question deadline (this is a gamble)
I was using the cloud service by EditGrid but they went offline. Use the three calculators I built below instead.
NOTE: EDITGRID IS NO LONGER IN BUSINESS…
Select the “click to edit” button at the top of the spreadsheet to start entering data. Select the drop-down button in the top left corner for features such as full-screen, download as excel and info related to EditGrid.
To use, just enter the total quantity of each device type into the “Device Quantity” column. The “Per Device EPS” column provides industry averages for the events per second (EPS) rate from each device type, and you can replace the values with your own. Next, modify the values next to the text highlighted in red under the “Event Capacity Planning” section to finish your planning.
You may want to do this separately for every remote site you plan on aggregating events from, to model the bandwidth and storage planning.
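The arithmetic behind the spreadsheet is just quantity times per-device EPS, summed per site. Here is a minimal Python sketch of that idea. The device types, quantities and per-device EPS values are placeholders for illustration only, not the averages from the spreadsheet, and the function name is my own.

```python
def total_eps(inventory, eps_per_device):
    """Sum quantity * per-device EPS over every device type in one inventory."""
    return sum(qty * eps_per_device[device] for device, qty in inventory.items())


# Placeholder per-device EPS rates (replace with your own figures).
eps_per_device = {"firewall": 50.0, "windows_server": 2.0}

# One inventory per site, so bandwidth and storage can be modelled separately.
sites = {
    "hq": {"firewall": 2, "windows_server": 40},
    "branch": {"firewall": 1, "windows_server": 5},
}

for site, inventory in sites.items():
    print(site, total_eps(inventory, eps_per_device))

print("all sites", sum(total_eps(inv, eps_per_device) for inv in sites.values()))
```

A device type missing from the rate table raises a `KeyError`, which is deliberate here: silently treating an unknown device as zero EPS would understate the estimate.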
Cloud Computing describes systems that provide computation, software, and data access services without requiring end-user knowledge of or dependence on the system’s physical location and configuration.
As an example, take an online vacation reservation system that may be a hosted cloud model such as Software as a Service (SaaS), in which your business would host an application that consists of a web front-end, database, storage and billing services.
While the cloud provider provides an Application Programming Interface (API) and access to the various components through traditional interfaces such as SSH, FTP or SOAP, there is limited access to the underlying systems, as they are usually multi-tenant, with multiple customers sharing the same system for their applications. This creates challenges for monitoring and controlling the security controls governing your application.
Cloud providers will provide SLAs and frequent security reports but there is no visibility into who is administering the systems hosting your application or what vulnerabilities may be present that will allow attackers to successfully compromise the systems using SQL injection or Cross-site scripting attacks.
Cloud providers will usually allow you to conduct third-party web application penetration testing against your own URL but will not allow you to monitor their servers nor will they send you events from their network security devices (IDS/IPS, firewalls, etc), which would allow real-time correlation and threat mitigation. Essentially, you lose control of your sensitive data and who may be accessing the systems in adherence to your security policies.
With the rise of Botnets, Scareware, Phishing, Brand theft, social network vulnerabilities and many other forms of evolving malware, Cloud Computing companies that will be most successful will be those that offer security monitoring services with logical segregation that uses context regarding your business, such as:
- Real-time threat feeds
- Lists of nefarious IP addresses
- Countries of concern
- Export control
- Software vulnerabilities
- Geo-spatial disparity
- Customer activity profiling
- Privileged user accountability
- Perimeter threat baselining
- Terminated employee monitoring
With this context information correlated with real-time events gathered from all of the control points between the cloud components, customers could receive real-time alerts from the cloud and would access a GUI to drill-down and conduct post-analysis of threats and then create their own dashboards or reports regarding attackers, application issues and administration accountability.
This model would alleviate the loss of visibility by placing applications into the cloud and ensure your auditors have access to the security and compliance data they need during an assessment.
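To give a feel for what correlating context information with real-time events means in practice, here is a minimal Python sketch that tags a single event using two of the context lists above (nefarious IP addresses and countries of concern). The event fields, tag names and list contents are all assumptions made up for the example; the addresses come from ranges reserved for documentation.

```python
import ipaddress


def context_tags(event, bad_networks, countries_of_concern):
    """Return the context tags that apply to a single event dictionary."""
    tags = []
    src = ipaddress.ip_address(event["src_ip"])
    if any(src in network for network in bad_networks):
        tags.append("nefarious_ip")
    if event.get("country") in countries_of_concern:
        tags.append("country_of_concern")
    return tags


# Hypothetical context lists; real ones would come from threat feeds.
bad_networks = [ipaddress.ip_network("203.0.113.0/24")]
countries_of_concern = {"XX"}  # placeholder country code

events = [
    {"src_ip": "203.0.113.45", "country": "XX", "action": "login"},
    {"src_ip": "198.51.100.7", "country": "CA", "action": "login"},
]

for event in events:
    print(event["src_ip"], context_tags(event, bad_networks, countries_of_concern))
```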
Wikipedia states that Network Forensics is “…proven techniques to collect, fuse, identify, examine, correlate, analyze, and document digital evidence from multiple, actively processing and transmitting digital sources for the purpose of uncovering facts related to the planned intent, or measured success of unauthorized activities meant to disrupt, corrupt, and or compromise system components as well as providing information to assist in response to or recovery from these activities…”
This business case requires a number of different tools, the most important of which is an enterprise-class Security Information and Event Management (SIEM) tool, which becomes the epicenter of all investigations and workflow. The SIEM must have some mandatory features which I will cover later in this article. But first, I would like to tell you how it’s done without SIEM.
In a previous job as a Network Security Specialist, I was in charge of tapping the wire for employee investigations and handling the data with chain-of-custody. This served as a daunting task as I would start my data captures with Open-Source software and use the spread-sheet kung-fu method of mapping all of the user activity and log data into digitally-signed archives, pending possible litigation. I established all of the guidelines and processes with support from our Legal and Corporate Fraud teams and built the procedures around the following processes:
Being tasked with selecting a Security Information and Event Management (SIEM) tool for your organization can be a bit overwhelming. I’ve been there and chosen poorly (in my last life)! The questions you need to ask the SIEM vendor you are buying from are limitless as every customer’s needs are different and the business drivers range from “check-box” compliance to actual enterprise incident handling and response.
Numerous customers have approached me with what they thought were straight Log Management (LM) requirements, since they have only ever had the luxury of manual log review using the “Grep”, “Awk”, “Sed” approach or “spreadsheet Kung Fu”, while others have the budget and want to “boil the oceans”. There are hurdles with both approaches: the former may be the way to “grow” into a mature concept such as a SIEM tool, while the latter will never be outgrown.
In fact, before you can perform real-time analysis on all of the logs to detect threats as they occur, you need to capture all of the event data from the plethora of heterogeneous event sources and store the logs in a centralized location. Therefore, I believe log management is an essential part of SIEM because, with the right tool, 100% of your logs are readily available with automated archiving and retention. Additionally, since you have mandated all of the logs from the various technologies to be sent to your central facility, the teams that manage the devices will need an easy-to-use tool that will allow them to do their day-to-day tasks such as troubleshooting network issues, application development debugging, long-term investigations and possibly reviewing the last six months of an employee’s activity for HR or litigation purposes.
Regardless, you should have a strong command of what it is you need SIEM for and use vehicles such as Request For Information (RFI) or Request For Proposal (RFP) to rate each vendor on the top mandatory requirements vs. the “nice-to-have’s”. For this purpose, I have compiled a list of questions that you may find useful when creating your vendor ratings criteria. Here are what I believe to be the 70+ essential requirements for the ultimate SIEM and Log Management tool: