Event Log Convergence = Business Intelligence
I have come across many prospects over the last 15 years that are only trying to acquire a SIEM solution to satisfy a compliance requirement, or what we call in the industry, “check-box purchasing” – they have a minimum set of requirements specific to only one business unit or compliance mandate that is completely siloed from the rest of the organization.
Here is how this conversation usually goes:
Client: “we would like a SIEM tool that will help us monitor our 200+ Windows Servers for PCI-DSS compliance”
Me: “What other event sources are you going to be monitoring with the solution?”
Client: (Stunned look) “we only need to monitor our servers.”
Me: “PCI-DSS requirement 10 states you have to monitor the logs from all of your security devices and servers that are deemed critical assets.”
Client: “our department is only responsible for the servers we listed.”
Me: “To get value out of a SIEM solution and monitor all 12 PCI requirements you need audit logs from all of your devices and contextual information regarding your network, asset and vulnerability data – and that will just get you started.”
Client: “Perhaps we need to increase the scope – we’ll get back to you.”
While the Centralized Log Management (CLM) and Security Information and Event Management (SIEM) vendors will be lined up around the block to influence the sale, the vendor you choose should be a trusted advisor. They should be interested in providing you the most value from your investment and in helping you design a solution that satisfies many business problems and goes beyond a traditional security-centric SIEM. This is why you will need to identify key device types and the value that can be derived by cross-correlating the log data with business context to align monitoring with your governance, security and compliance initiatives.
The SIEM Value Derived from Heterogeneous Device Logging
While each of the event sources you collect events from will provide distinct reporting and alerting value, combining many different “types” of event sources will yield immediate intelligence about the business and help analysts establish baselines of threat activity. Another benefit is that incidents can be prioritized by business value and threat classification, and the additional context can help reduce the plethora of false-positives and false-negatives that plague every CLM solution.
Additionally, the multitude of the various technologies have their own management and reporting solutions that become “silos of information” that only the black-belts responsible for each of the device types are able to decipher. This makes security intelligence and investigations near impossible when an analyst has to request log data from the owners or log in to many different systems to find the evidence to support their cause. Essentially, they would have to piece together the clues and manually normalize the data using a technique called “Spreadsheet Kung Fu”, which would be fraught with assumptions and inference.
Below is a list of different device types and the value that can be derived from each when correlated together. The list isn’t exhaustive and I’m not suggesting you need every one mentioned to successfully deploy a SIEM, but the more data feeds you can correlate, the more intelligence you will have available in the future to expand and grow with your business:
Vulnerability Management / Asset Inventory: Vulnerability data is important in that it helps the SIEM tool identify and “model” the context about what the business deems critical infrastructure. Often, the SIEM tool will allow customers to automatically import vulnerability or code-level pen-test data and populate an asset and network inventory that customers can then logically group and tag with business or compliance weightings, allowing the correlation engine to raise or lower threat priorities and only alert on incidents that are important to the business. This critical component differentiates SIEM from a simple CLM solution. While you can configure a CLM to alert on thresholds such as an IPS reporting an attack on a PCI asset, it does not have any concept of target asset susceptibility, and thus, even a SQL injection attack targeting a system with no database will raise an alarm. Without business context, every incident becomes a fire drill!
NMS / ITSM / Change Management: This is another example of valuable business context. Network Management systems aggregate data from multiple event sources while also providing key details regarding IP address assignment, network device configurations and network issues which may impact business continuity or which may have origins related to an attack on a different part of the infrastructure. IT System Management solutions usually include asset inventories, patching history, key performance indicators (KPIs), service level agreements (SLAs) and other valuable meta-data that can be injected into the correlation process as every event is inspected. Change Management systems are crucial real-time intelligence sources, which can provide correlation logic regarding systems slated for changes and provide deltas for those systems that were modified outside of a change window. Consider the following: if you extract from a change management system a list of all PCI assets that will be changed or perhaps rebooted during the nightly change window (and store it in memory on the correlation engine), then there is no need for the SIEM to alert you when these assets are rebooted. This raises the priority of, and adds certainty to, the alerts that remain: those indicating that other PCI assets, without documented exemptions, have been modified or rebooted as well.
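The change-window logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the host names, the 01:00 to 04:00 window and the `classify_reboot` helper are all hypothetical, and a real correlation engine would pull the schedule from the change management system rather than hard-code it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RebootEvent:
    host: str
    hour: int  # hour of day (0-23) at which the reboot was logged


# Hypothetical extract from a change management system: PCI assets approved
# for change tonight, plus the change window as start/end hours.
SCHEDULED_HOSTS = {"pci-db-01", "pci-web-02"}
WINDOW_START, WINDOW_END = 1, 4  # 01:00 up to, but not including, 04:00


def classify_reboot(event: RebootEvent) -> str:
    """Return 'suppress' for a documented change, 'alert' for anything else."""
    in_window = WINDOW_START <= event.hour < WINDOW_END
    if event.host in SCHEDULED_HOSTS and in_window:
        return "suppress"
    return "alert"


sample_events = [
    RebootEvent("pci-db-01", 2),   # scheduled host, inside the window
    RebootEvent("pci-app-07", 2),  # not on the change list
    RebootEvent("pci-web-02", 14), # scheduled host, outside the window
]
for event in sample_events:
    print(event.host, event.hour, classify_reboot(event))
```

Only the first sample event is suppressed; the other two stand out precisely because the documented reboots have been filtered away.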
Operating Systems: Correlating logs from *nix, Windows, Mainframe and Midrange systems will help answer some very important questions about the overall compliance of your systems and valuable insights on what your users are doing. Managing the profiles on event collection across a cross-section of different OSs will be challenging as every vendor has a different logging mechanism (and in some cases, multiple mechanisms such as AIX syslog and AIX PR flat-file) but the benefits are endless, especially to meet compliance or governance mandates. Insider Threat, Login Success/Failure, Privileged User Monitoring, User Attribution, Session Correlation, Sensitive Data Monitoring, Generic Account Monitoring, Malware / Rootkit Infections, Local Account Creation, System Modifications, Application Logging, Vulnerability Reporting and many other use cases can be satisfied just by having verbose logs streaming in from all of your operating systems. Without OS logs you will not be able to answer many compliance questions such as “who added new users?”, “has my host-based security been disabled?” or “has someone cleared my security logs?” to name just a few.
Authentication Devices: Enterprise Directories, Identity and Access Management (IAM), Two-Factor Authentication and local host authentication are all valuable feeds that provide important clues regarding user and IP address attribution. Say for example I have a user that logs into the Windows Domain as “John.Doe”, who then opens a Secure Shell to a Unix system and logs in as “root” and from there they log into a custom internal web application as “GenericUser_1” – the log events from all three systems will only contain the destination username, which is impossible to attribute back to “John.Doe”. The common denominator in all three log entries is the fact that the user logged into the domain first and there should be a session history for each entry that the correlation engine would then interpret as “John.Doe logged into UNIX as Root” and “John.Doe logged into a custom app as GenericUser_1”. The logic is much more complex than the example I have given and the problem is bigger, especially when you consider most users have many aliases that may be associated with their actual human identity. By incorporating all of the user identities and attributes as context into the correlation engine, a user’s many aliases would then be mapped back to a unique identifier, similar to how IdM/IAM solutions work. The obvious benefits with this approach are the number of man hours that will be saved when analysts or forensic investigators are trying to generate a report on all activities associated with a particular user. This eliminates the need to generate a different report for every permutation of a user’s aliases or missing a user’s activities when they were logged in with generic account names.
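As a rough sketch of the session chaining in the “John.Doe” example, the Python below carries the originating identity from host to host. The host names and the `attribute_sessions` function are assumptions made for illustration, and the logic is deliberately naive: it assumes one active user per host and ignores session timeouts, which a real correlation engine could not do.

```python
def attribute_sessions(events):
    """Walk login events in time order and tag each with the originating identity.

    Each event is (source_host, target_host, target_user). A login whose
    source host has no known owner is treated as the start of a chain.
    """
    owner_by_host = {}  # host -> human identity currently active on it
    attributed = []
    for source, target, user in events:
        identity = owner_by_host.get(source, user)
        owner_by_host[target] = identity
        attributed.append((identity, target, user))
    return attributed


# Hypothetical events mirroring the example in the text.
login_events = [
    ("console", "ws-17", "John.Doe"),              # Windows domain login
    ("ws-17", "unix-01", "root"),                  # SSH from the workstation
    ("unix-01", "intranet-app", "GenericUser_1"),  # web app login from Unix
]

for identity, target, user in attribute_sessions(login_events):
    print(f"{identity} logged into {target} as {user}")
```

Every line of output begins with John.Doe, even though two of the three raw log entries only ever mention “root” and “GenericUser_1”.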
L3+ Services (DNS / DHCP / WINS): An improperly configured Domain Name System (DNS) could reveal sensitive information about your hosts and IP address assignments to attackers performing recon attacks. The value of having this information tied into a correlation engine is the ability to track attack history. If, for example, an external IP performs a recon scan against your DNS and a week later comes back trying to penetrate the system, the SIEM will remember it and raise the priority of an alert. Dynamic Host Configuration Protocol (DHCP) also assists with historical investigations and reporting, as users and hostnames are mapped to different IP addresses across different parts of the infrastructure (LAN/VPN/Wi-Fi) and correlation should be able to map the names to IP addresses. This information should include both dates and times so that an analyst need only search for a hostname and the SIEM can return query results that show which IP address that host had on different dates and times. This is especially important when the employee under investigation works in a call center where they are given a different workstation every day, or is a teleworker who connects through VPN multiple times a day.
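The hostname-to-IP history described above amounts to a lookup over time-bounded lease records. The following Python is a minimal sketch with made-up lease data; the `ip_history` and `host_at` helpers are hypothetical names, and a real SIEM would build this table from the DHCP and VPN logs themselves.

```python
from datetime import datetime

# Hypothetical lease records: (hostname, ip, lease_start, lease_end)
LEASES = [
    ("cc-laptop-12", "10.1.4.20", datetime(2024, 3, 1, 8, 0), datetime(2024, 3, 1, 17, 0)),
    ("cc-laptop-12", "172.16.9.5", datetime(2024, 3, 1, 19, 0), datetime(2024, 3, 1, 21, 0)),
    ("cc-laptop-31", "10.1.4.20", datetime(2024, 3, 2, 8, 0), datetime(2024, 3, 2, 17, 0)),
]


def ip_history(hostname):
    """Return (start, end, ip) tuples for a hostname, oldest lease first."""
    return sorted((start, end, ip) for host, ip, start, end in LEASES if host == hostname)


def host_at(ip, when):
    """Return the hostname that held an IP address at a given moment, if any."""
    for host, lease_ip, start, end in LEASES:
        if lease_ip == ip and start <= when < end:
            return host
    return None


for start, end, ip in ip_history("cc-laptop-12"):
    print(f"cc-laptop-12 held {ip} from {start} to {end}")
print(host_at("10.1.4.20", datetime(2024, 3, 2, 9, 0)))
```

The same address, 10.1.4.20, resolves to a different laptop depending on the day, which is why the timestamp has to be part of every query.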
Network Routers/Switches: These essential network devices can provide important statistics for security incident correlation. Layer2/3 switches can assist as lightweight anomaly detection for deviations in “normal” baselines over hourly or daily averages. Monitoring critical applications and servers also requires an understanding of the entire underlying Layer2/3 infrastructure – a properly modeled asset inventory will help security teams with visibility in all tiers and dependencies of the application/service stack. For compliance reporting, these devices provide an audit trail for network configuration changes, network errors (e.g. spanning tree loops, flapping interfaces) and also critical events such as router/routing protocol attacks and port mirroring (used by sniffers). Finally, poorly configured devices with no AAA or passwords configured on VTY ports will be evident in the network device logs.
Network Flow Aggregation: IP Flow data (NetFlow, J-Flow, sFlow) provides meta-data regarding all hosts communicating on your network, session handshakes, network statistics, bandwidth utilization, routing information and much more. However, the IP Flow data is usually unidirectional (one-way) and the voluminous flow exports can saturate a network. Many customers utilize a Flow Aggregation tool that will capture all of the flow data, aggregate it into bidirectional conversations and include details such as the amount of data transferred during a network session, filter out the chatter and add context regarding zones, subnets and other important data. The SIEM tool can then utilize this information to monitor bandwidth thresholds, detect network protocol anomalies, profile and baseline “normal” application behavior and detect zero-day malware, botnets and insider threats. Flow data correlated with authentication attribution data can solve many problems with who was on the network at the time of an attack.
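To make the “aggregate into bidirectional conversations” step concrete, here is a small Python sketch. The record layout (source, destination, ports, byte count) and the `aggregate_flows` function are simplifying assumptions; actual NetFlow, J-Flow and sFlow exports carry many more fields, including the protocol, which a real aggregator would include in the key.

```python
from collections import defaultdict


def aggregate_flows(flows):
    """Merge one-way flow records into bidirectional conversations.

    Each flow is (src_ip, src_port, dst_ip, dst_port, byte_count). Sorting
    the two endpoints gives both directions of a session the same key.
    """
    conversations = defaultdict(lambda: {"bytes": 0, "flows": 0})
    for src_ip, src_port, dst_ip, dst_port, byte_count in flows:
        key = tuple(sorted([(src_ip, src_port), (dst_ip, dst_port)]))
        conversations[key]["bytes"] += byte_count
        conversations[key]["flows"] += 1
    return dict(conversations)


# Hypothetical one-way records: a request, its reply, and an unrelated flow.
sample_flows = [
    ("10.0.0.5", 51000, "10.0.1.9", 443, 1200),
    ("10.0.1.9", 443, "10.0.0.5", 51000, 48000),
    ("10.0.0.6", 51001, "10.0.1.9", 443, 300),
]

for endpoints, totals in aggregate_flows(sample_flows).items():
    print(endpoints, totals)
```

The request and its reply collapse into one conversation with a combined byte count, which is the figure a bandwidth threshold or baseline would actually be compared against.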
VPN (Remote Access / SSL / Site-to-Site): A majority of businesses allow their employees to access the company intranet and applications from home and have replaced costly Frame-Relay or ATM circuits to their remote sites with Internet connectivity and VPN. Additionally, out-sourcing and providing VPN connectivity for partners for such purposes as managed services, call centers, financial services, etc., has added additional complexity to what are considered “private” networks and compliance mandates dictate you must watch every ingress point into your environment. Some of the value that SIEM brings to these increasingly complex relationships is the ability to map DHCP addresses (private) to Internet addresses, users to aliases and business “zone” information to corporate policies. Populating the SIEM with an understanding of what networks are deemed “partner” or “remote access” can help establish policy enforcement for critical threats such as “why is my supplier trying to access my PCI assets?” or “why is my remote site failing IPSEC authentication, while originating from the wrong public IP?”. Furthermore, if you are asked by HR to put an employee, who just tendered their resignation, on a watch list and report on all of their activity, you will need to monitor everything they do from inside the network as well as what’s done through remote access.
Firewalls / Proxies: These event sources provide the “low hanging fruit” for security monitoring. Firewalls and proxies provide tons of information regarding traffic and bandwidth consumption that is entering (ingress) or leaving (egress) your network. In many cases, this source data will reveal malware beacons or APT bots trying to call C&C networks, network user non-compliance (P2P, IM, bad sites, etc.) or large attachments being sent to competitors or countries of concern. However, part of the problem with firewall and proxy data (as with most network devices) is that it does not have any context about the environment it is protecting such as username (unless you have NTLM transparent auth on your proxy), susceptibility of target assets, history of attack or the IP/domain reputation of the source or target addresses. SIEM can enrich the firewall and proxy data with context about the enterprise, users and their roles, asset criticality, false-positive reduction for top “N” reporting and assist in pinpointing origins of attack or misconfigurations. Additionally, when monitoring critical servers, your auditors may want to know which ones are communicating directly with the internet or which administrators have been downloading nefarious content onto production servers – all of these answers can easily be available ad hoc with a SIEM tool.
Anti-Virus / Anti-Spam / Anti-Malware: these tools essentially are used to keep bad stuff out of the organization and assist in independently fingerprinting, identifying and quarantining the threats against your users and systems. Back in the early days of the IT security evolution, anti-virus was in most cases the only line of defense that a company would employ to ward off the attack vectors because there was little known about network security. While all three of these technologies have their own centralized management and reporting systems that can generate alerts, monitoring these events in a silo will provide little value, especially with blended or multi-vector attacks plaguing the organization. Correlating these events with other network and system activities allows a SIEM to pinpoint where attacks originated during a post-mortem and escalate severity when a high-profile asset has had its anti-virus disabled (as one example). Another important element is that the SIEM provides an advanced reporting engine, again, because it provides a context about the users, systems and compliance mandates that your security devices do not. If an analyst sees that over the last 6 months the same employee has repeatedly disabled their end-point protection or AV so they can install software, the analyst can easily generate a report to present to the employee’s manager.
Data Leak/Loss Prevention (DLP): one fact that has plagued DLP is the high number of false-positives / false-negatives that are generated by a poorly tuned system. Many of the tools offer the ability to “auto-magically” discover and classify sensitive data but in reality, customers must do this modeling to confidently and accurately protect their intellectual property. As such, the SIEM is a great tool to not only correlate user permissions with events and raise or lower severity based on a user’s role, but it provides a great reporting tool to understand how well the DLP solution is performing and track sensitive data correlated against context such as “countries of concern”, “large email attachments” and “after-hours access to sensitive data”. This context can all be added to the bigger picture if data loss is only a small portion of the overall investigation. The SIEM would identify “who” leaked the data, from “what” critical asset, “when” it started to occur and “how” they accessed the data in the first place – a complete post-mortem and evidence trail.
End-point Protection / HIDS / NAC: while end-point protection generally has an accompanying management server that collects all of the update and violation events and can dynamically pass enforcement back to the clients, again, this is done as an autonomous “island of defense” – only aware of enforcement and monitoring for well-defined policies. The value that a SIEM brings to this point solution is client violation history, enforcement reporting that can be correlated with other events associated to that user or workstation as well as a single-pane of glass to detect persistent threats across multiple business units. As an added advantage, some SIEM tools allow bidirectional integration with end-point protection technologies that can allow the Security Operations Center (SOC) to enforce manual, “on-demand” quarantine actions and then correlation rule actions could automate the business policy enforcement. For example, suppose the user’s role in AD does not provide them the privilege to access a medical application, but the user (logged into the domain) has obtained generic login credentials and is attempting to access the application. Correlation will compare the business rules against the user’s role and if the user does not have the necessary permissions, SIEM could automatically invoke quarantine enforcement through the end-point protection manager and block the host session.
Network Intrusion Detection / Network Intrusion Prevention (NIDS/NIPS): While IDS and IPS technology has come a long way, now with a whole new “Next Generation” category on Gartner’s Magic Quadrant, one issue that still plagues this technology is its understanding of the target’s susceptibility to an attack. Most IDS/IPS (including WAF) vendors claim almost 80-100% false-positive reduction as part of their marketing metrics and this usually refers to their signatures and their accuracy in detecting the attack vector during the reassembly of packet chain data. The false-positives I am referring to are at the IDS/IPS reporting/alerting layer and the manager’s understanding of the destination host. Having the NIDS/NIPS reporting to a SIEM tool will overlay the necessary intelligence that most NIDS/NIPS vendors lack. Consider this: An attacker on the Internet uses Nessus with 10,000+ attack vectors to recon scan a host in your DMZ. Of the 10,000 attacks, only 10 of them match the target’s susceptibility and maybe only one of them finds an actual vulnerability on the target. The SIEM tool should provide some formula to compare each high priority event generated by the NIDS/NIPS against an asset inventory that lists the OS, applications, listening ports, business criticality and known vulnerabilities and confidently tag each event with its own risk rating that calculates all of these elements together into one number. As an example, if the NIPS detects a SQL Injection attack against a critical PCI asset that has no vulnerabilities, no database installed and port 1433 isn’t listening on the asset, the vendor event priority is lowered to a 2 (because you still want to report at some point about “all attacks against my PCI assets”).
Additionally, many SIEM tools and NIPS vendors allow scripted integration so that if a seemingly benign attack passes through the NIPS and the SIEM sees that the attack originated from a suspicious IP or country of concern, the SIEM can respond by instructing the NIPS to automatically block the attack and notify someone to investigate.
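The risk rating discussed in the NIDS/NIPS section above could take many forms; every SIEM vendor has its own formula. The Python below is one invented example, not any vendor’s method: the 3/3/4 weights, the floor of 2 for critical assets, the `Asset` fields and the placeholder vulnerability identifier are all assumptions chosen so that the SQL injection scenario from the text comes out at priority 2.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    name: str
    criticality: int  # 1 (low) to 5 (business critical)
    listening_ports: set = field(default_factory=set)
    applications: set = field(default_factory=set)
    vulnerabilities: set = field(default_factory=set)


def risk_rating(vendor_priority, asset, target_port, required_app, vuln_id=None):
    """Scale a vendor priority (1-10) by how susceptible the target looks."""
    score = 0  # susceptibility out of 10 (illustrative weights)
    if target_port in asset.listening_ports:
        score += 3
    if required_app in asset.applications:
        score += 3
    if vuln_id is not None and vuln_id in asset.vulnerabilities:
        score += 4
    # Critical assets keep a floor so the event still appears in reports.
    floor = 2 if asset.criticality >= 4 else 1
    return max(floor, vendor_priority * score // 10)


# Hypothetical inventory entries; "EXAMPLE-VULN-1" is a placeholder identifier.
pci_web = Asset("pci-web-01", criticality=5, listening_ports={443}, applications={"nginx"})
pci_db = Asset("pci-db-01", criticality=5, listening_ports={1433},
               applications={"mssql"}, vulnerabilities={"EXAMPLE-VULN-1"})

# SQL injection reported by the NIPS at vendor priority 9 against each asset.
print(risk_rating(9, pci_web, 1433, "mssql", "EXAMPLE-VULN-1"))  # no DB, port closed
print(risk_rating(9, pci_db, 1433, "mssql", "EXAMPLE-VULN-1"))   # fully susceptible
```

The same NIPS event drops to the reporting floor on the web server and keeps its full priority on the database server, which is the distinction a threshold-only CLM cannot make.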
Packet Inspection / Payload Capture: the genesis of these two different technologies had similar roots, in that, network engineering and telecommunications service providers originally used hardware and software to “passively” or “promiscuously” record network traffic for the purposes of customer usage billing, network error detection, protocol analysis and many other engineering related use cases. As network security evolved, commercial and open-source vendors started to apply the concepts of capturing and inspecting the network traffic data to create niches within the IT Security realm. These newly created opportunities included:
- Intrusion Detection/Prevention (comparing known malicious payload signatures against packet chains)
- Lawful Intercept (using a court order to access a user or business’s data stream)
- Stateful Packet Inspection (inspecting headers of TCP packets for spoofing, malformation and other anomalies)
- Deep Packet Inspection (advanced analysis of complete TCP packets for cybersecurity, policy compliance and forensics)
- Network Behaviour Anomalies (NBA baselines normal traffic and user activities and then compares observed behaviour against them to spot nefarious activity)
Many of today’s competing SIEM vendors recognize the immense value in combining this network intelligence with traditional log data to help eliminate false-positives/negatives, provide forensic “replay” of incidents and application-layer context to real-time correlation. Branded as Next Generation SIEM (NG-SIEM) or SIEM 2.0 by some vendors, this technology requires well architected meta-indexing, compression, storage, retention and retrieval components and seamless integration at the SIEM tier to be a viable investigation and workflow solution. While some SIEM vendors are making acquisitions in this arena, others are collaborating with “best-of-breed” solutions to provide customers with a vast selection of technologies that suit their needs.
Web Servers: to begin with, web logs can grow incredibly large depending on what type of web application is being presented and to what audience. Web servers that are sitting inside a DMZ that’s public-facing are constantly subjected to reconnaissance attacks from attackers globally and, depending on the logging format configured, can get very verbose and chatty. The second point that plagues most web servers (especially inside a DMZ) is that they are deployed with the notion that they can be deleted and re-deployed using orchestration – supporting the concept that web servers are “sacrificial lambs” and when they are compromised, the IT ops team will remove the infected system or virtual machine and replace it with a new image. Both of these issues alone dictate that the logs need to be centralized and offloaded from the web servers immediately to free up storage space and have a copy of the logs in case the system needs to be wiped. The typical use cases associated with correlating web server data with other event sources are:
- Correlating geographic data with countries of concern
- Correlating data with bad IP/domain reputation feeds
- Recorded history of attack
- Regular Expression analysis of SQL injection or cross-site scripting attempts
- IT operational monitoring of load-balancer effectiveness
- Statistical or heuristic analysis of network utilization
- Application “trace” and session replay for either forensics or application performance.
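As one example from the list above, regular expression analysis of requested URLs might look like the Python sketch below. The two patterns are deliberately simple and are my own illustration rather than a production rule set; real attacks use many encodings and evasions that these would miss, and `classify_request` is a hypothetical helper name.

```python
import re
from urllib.parse import unquote_plus

# Illustrative patterns only; real detection rules need far more care.
SQLI = re.compile(
    r"(\bunion\b.+\bselect\b|\bor\b\s+1\s*=\s*1|;\s*drop\s+table)",
    re.IGNORECASE,
)
XSS = re.compile(r"(<\s*script|javascript:|onerror\s*=)", re.IGNORECASE)


def classify_request(url):
    """Label a requested URL as 'sqli', 'xss' or 'clean'."""
    decoded = unquote_plus(url)  # undo %XX and '+' encoding before matching
    if SQLI.search(decoded):
        return "sqli"
    if XSS.search(decoded):
        return "xss"
    return "clean"


sample_urls = [
    "/item?id=1%20OR%201=1",
    "/search?q=%3Cscript%3Ealert(1)%3C/script%3E",
    "/index.html",
]
for url in sample_urls:
    print(classify_request(url), url)
```

Decoding before matching matters here, because attackers routinely URL-encode payloads and the raw log line would otherwise slip past both patterns.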
Mail Servers: if you ask a Governance Officer or CISO about the top things that keep them up at night, they’ll likely tell you that protecting their company’s intellectual property and detecting or stopping information exfiltration to competitors is near the top of the list. Mail servers, and particularly the Simple Mail Transfer Protocol (SMTP), have long been one of the most popular methods by which insiders leak corporate data out of an organization, either maliciously (intended) or accidentally (e.g. as the victim of spear-phishing). Data Leakage Prevention (DLP) technology can detect and prevent this, but many organizations, wary of the ROI or of the significant data classification effort required by DLP solutions, are slow to adopt this method of stopping exfiltration. This is one area where the logs from the mail server, correlated with events from Intrusion Prevention Devices, file system sensitive data, privileged user activities, or even the size or name of attachments, can provide a DLP-like solution to detect exfiltration. While your company may not have all of the newer technologies to detect and block exfiltration, the SIEM can definitely assist in the post-mortem after an incident by showing all of the events that led up to the incident itself. For example, a mail server logs an email received from a domain with a bad reputation score. Hours later, a proxy detects the recipient’s workstation downloading from a malicious website; hours after that, the workstation copies large files off the engineering share after work hours, and multiple emails with .ZIP attachments are sent to a bad domain or competitor. This illustrates how the combined events from multiple devices assist in pinpointing the source of the exfiltration and provide investigators with the name of the infected user, the domain names involved, the sizes and names of the attachments, etc.
Other use case examples for mail server logs include generating top large attachment reports and detecting employees sending out resumes, abusive email “flaming” by employees, breaches or brute-force attacks against mail services, etc.
Subscription-based Feeds (Threat/Reputation Intelligence/Geo-spatial): Gartner defines “Threat Intelligence” as:
Threat intelligence is evidence-based knowledge, including context, mechanisms, indicators, implications and actionable advice, about an existing or emerging menace or hazard to assets that can be used to inform decisions regarding the subject’s response to that menace or hazard.
To put this into context regarding SIEM, Threat Intelligence is one of many services provided by SIEM vendors, partners or Security as a Service (SaaS) providers, which enriches real-time log data with threat scores regarding the source, destination or attack vector. This statistical information has been gathered by research against IP address / domain reputations, malware, phishing, viruses or advanced persistent threats in the wild. The information is then stored in aggregate databases that collect terabytes of information from honeypots, sinkholes or open source intelligence such as SANS that customers can subscribe to. Alternately, vendor created databases are populated with statistics gathered from customers (e.g. IPS/Firewall customers) combined with internal / independent researchers that vendors hire to analyze and classify zero-day threats that are emerging.
Additionally, information regarding vulnerability CVEs and GPS coordinates can enrich log data through subscriptions with the SIEM vendor or through integration with third party sources as well. Services that provide geographical mapping of IP addresses to GPS coordinates are not always accurate, but they can generally identify the country that the attackers originated from, allowing customers to alert on attacks from foreign countries whether they are obvious attack vectors or seemingly legitimate access, but from a country of concern.
Physical Security (card readers, biometrics): In many organizations, there is a functional segregation between logical IT security and physical building access. While in my experience this has been mostly a separation of duties or a political matter, some industry verticals or compliance initiatives (such as NERC CIP, or others protecting critical infrastructure that provides government, energy or financial resources) call for the two to be monitored together. If accountability and compensating controls need to be monitored to identify who is accessing a secure location and an audit trail of everything they do on logical systems needs to be reviewed, there needs to be physical and logical convergence. One such example would be monitoring for lost or stolen authentication credentials. If the SIEM detects that there has been a physical card swipe and a successful login through VPN by the same user (from a different geo-location) within the same timeframe, then it is possible that the user’s access card was stolen, someone hijacked their VPN credentials or they are sharing their credentials with other users.
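The card swipe versus VPN login check can be expressed as a simple join on user and time. The Python below is a sketch under assumptions: the one-hour window, the city-level location strings and the `conflicting_access` function are all hypothetical, and a real rule would work from geo-coded source addresses rather than neat city names.

```python
from datetime import datetime, timedelta


def conflicting_access(badge_swipes, vpn_logins, window=timedelta(hours=1)):
    """Pair badge swipes and VPN logins by the same user that fall within the
    time window but originate from different locations."""
    conflicts = []
    for b_user, b_time, b_city in badge_swipes:
        for v_user, v_time, v_city in vpn_logins:
            same_user = b_user == v_user
            close_in_time = abs(b_time - v_time) <= window
            if same_user and close_in_time and b_city != v_city:
                conflicts.append((b_user, b_time, b_city, v_time, v_city))
    return conflicts


# Hypothetical events: (user, timestamp, city)
badge_swipes = [("jdoe", datetime(2024, 3, 1, 9, 5), "Toronto")]
vpn_logins = [
    ("jdoe", datetime(2024, 3, 1, 9, 30), "Lisbon"),    # 25 minutes later, elsewhere
    ("asmith", datetime(2024, 3, 1, 9, 10), "Lisbon"),  # different user
    ("jdoe", datetime(2024, 3, 1, 22, 0), "Lisbon"),    # outside the window
]

for conflict in conflicting_access(badge_swipes, vpn_logins):
    print(conflict)
```

Only the first VPN login is flagged. Whether that means a stolen card, hijacked VPN credentials or shared credentials is still for an analyst to determine.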
Applications (ERP / DB / home-grown): with growing frequency, businesses are using a hybrid approach to delivering content and applications, with disparate components of the application stack being hosted onsite or within the cloud. Pinpointing attacks becomes increasingly difficult in this approach because, in order to identify the source of the attacks and the vectors used, IT Security needs logs from Web Servers, Java Application Logs, Firewalls, Reverse-Proxies, Databases and Authentication Brokers. Without all of these components, IT Security investigations are not seeing the complete picture and automated monitoring by the SIEM tool doesn’t have enough residual threat data to detect SQL Injections, Cross-Site Scripting, Man-in-the-Middle or Insider Threats. By collecting and correlating logs from all of these different elements, an IT security monitoring campaign can look for use cases such as:
- Attacks against the application (SQL Injection, XSS, etc.)
- Key performance indicators of the overall stack by monitoring all dependent components (BCP)
- Role violations and data exfiltration by privileged users
- Generic username application access and auditing
- System-level security violations that threaten applications
- PCI / SOX compliance monitoring (e.g. app server is now accepting unsecured Telnet sessions)
- Monitoring legacy, home-grown applications with no logging ability using logs inferred from other dependent components (e.g. firewalls, DB queries, authentication services)
- Activity patterns that imply fraudulent transactions
- Users accessing applications outside of permitted geographies
The more diverse the event source types that report to your SIEM, the more situational intelligence, baselines and heuristics you’ll gather, and the more business cases you can address with real-time correlation. With all of these various devices and applications reporting to the correlation engine, you will have better building blocks to create dashboards, reports and automate the incident handling process. While this may seem obvious to some, one of the biggest challenges companies face in making the SIEM work for them is getting past the tunnel vision and understanding that SIEMs are designed to handle millions of disparate events and that correlation rules are hungry for multi-faceted log sources.
If you only have a few types of event logs reporting to your SIEM, you may be missing the “big picture”.