Monitoring your IT environment can be a daunting task, depending on your current security posture, the resources you have available (tools, people, and knowledge), and the scope of the environment (single site, multi-site, on-premises, cloud, hybrid, etc.). It is very easy to get overwhelmed and lose track of what we are really trying to protect. Understanding what makes up your environment, what your most important assets are (the “Crown Jewels”), and what main threats your organization faces is critical to developing any type of plan to protect your organization. What do you have to protect? Do you know what your adversaries want to take from you or do to you? Have you identified the methods and vectors they will most likely use against you?
A lot of time and effort goes into answering those questions correctly. Many frameworks and assessments have been developed to help organizations articulate the answers so they can move forward with a solid security plan. Today we are not going to discuss that process, but rather some of the basic things an organization should be monitoring regardless of those answers. The nice thing about this list is that it is not solely tied to potential Indicators of Compromise (IOCs); it can also surface general administrative and operations information that might indicate misconfigurations or other problems in your environment. Obviously this list is not all-inclusive and every environment is different, but this is a good place to start.
“These go to Eleven!”….but seriously… there’s only 10
New Users in the Environment (especially local accounts)
Any new account created in Active Directory/LDAP or locally on a device is definitely worth noting. If you have an account management process, the new accounts you see should mostly match up with what that process produced. This can also be coupled with watching what groups or permissions are being assigned to existing accounts. Pay special attention to local accounts being created, especially if your standard policy is to use some type of central authentication.
|Windows Server 2008 R2+ Log Event ID||Simplified Explanation|
|4720, 4722, 4725, 4726||User account was created/enabled/disabled/deleted|
|4728, 4732, 4756||User account was added to a security group (global, local, universal)|
|4740||User account locked out|
|Linux Log Event||Simplified Explanation|
|sudo: john : TTY=pts/2 ; PWD=/home/john ; USER=root ; COMMAND=/usr/sbin/adduser nigel.tuffnel||Admin john created user account(nigel.tuffnel)|
|groupadd: group added to /etc/group: name=nigel.tuffnel, GID=1008||User added to group|
|useradd: new user: name=nigel.tuffnel, UID=1008, GID=1008, home=/home/nigel.tuffnel, shell=/bin/bash||Another log showing “useradd”|
|Cisco Log Event||Simplified Explanation|
|May 15 15:35:45.049: %PARSER-5-CFGLOG_LOGGEDCMD: User:admin logged command:username john privilege 15 secret *||User john added on a Cisco Catalyst switch|
|Wed May 15 10:23:57 2019:type=update:id=10.10.10. configure terminal ; username john role network-admin password ******** (SUCCESS)||User john added on Cisco Nexus switch|
|Juniper Log Event||Simplified Explanation|
|May 15 14:08:51 FW-JBC mgd: UI_CFG_AUDIT_SET: User ‘admin’ set: [system login user john class] <unconfigured> -> “Admins”||User john added on Juniper|
|May 15 14:08:51 FW-JBC mgd: UI_CFG_AUDIT_SET: User ‘admin’ set: [system login user john authentication] <unconfigured> -> “plain-||User john password set on Juniper|
|May 15 14:08:53 FW-JBC mgd: UI_CMDLINE_READ_LINE: User ‘admin’, command ‘set system login user john class Admins authentication||Output of the full command from Juniper|
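As a rough sketch of how one of these entries could be turned into something you can alert on, the Linux `useradd` line from the table above can be picked apart with a regular expression. The function name and the returned dictionary layout are just for illustration, and the pattern assumes your syslog messages look like the sample; check it against your own distribution before relying on it.

```python
import re

# Pattern modeled on the "useradd: new user: ..." sample line in the table above.
USERADD_RE = re.compile(
    r"useradd: new user: name=(?P<name>[^,]+), UID=(?P<uid>\d+), GID=(?P<gid>\d+)"
)

def parse_useradd(line):
    """Return name/uid/gid if the line records a new local user, else None."""
    match = USERADD_RE.search(line)
    if match is None:
        return None
    return {
        "name": match["name"],
        "uid": int(match["uid"]),
        "gid": int(match["gid"]),
    }

sample = (
    "useradd: new user: name=nigel.tuffnel, UID=1008, GID=1008, "
    "home=/home/nigel.tuffnel, shell=/bin/bash"
)
event = parse_useradd(sample)
print(event)
```

Anything that comes back non-empty can then be compared against your account management records.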
Monitoring Logins with Emergency/Local Accounts
So, hopefully your organization has emergency local accounts on its devices. I have greater hope that these are also documented and stored securely for when real emergencies come (and they will). My even greater hope is that no one (or at most a very small number of administrators, like 1-2 people) knows the credentials until they are needed.
Starting from this assumption, it would be very odd to see any login with one of these accounts, especially in a non-emergency situation. Monitoring for login attempts from these accounts might therefore reveal either nefarious actions or poor account hygiene. It could also be a script or job running with the account, which would be very bad practice as well. This is like turning on the lights in a kitchen and watching the cockroaches scatter. Monitor for those accounts that should not be in use!
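A minimal sketch of that check, assuming your login events have already been normalized into dictionaries with a `user` field. The account names, field names, and sample events below are all hypothetical placeholders.

```python
# Hypothetical break-glass account names; replace with your documented list.
EMERGENCY_ACCOUNTS = {"breakglass", "local-admin"}

def emergency_logins(events, watchlist=EMERGENCY_ACCOUNTS):
    """Return events whose 'user' matches a watched account (case-insensitive)."""
    watched = {name.lower() for name in watchlist}
    return [e for e in events if e.get("user", "").lower() in watched]

# Made-up sample events, successful or not: any attempt is worth a look.
events = [
    {"user": "john", "host": "sw-01", "result": "success"},
    {"user": "BreakGlass", "host": "fw-01", "result": "failure"},
]
hits = emergency_logins(events)
print(hits)
```

Failed attempts are kept on purpose, since someone guessing at an emergency account is just as interesting as someone using it.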
Port Security Violations
Port security violations let you know when an unexpected device is plugged into a switch port that has not been configured for it (as long as you have the proper port security configuration on the switch). There are many possible causes. Oftentimes it is a misconfiguration or a lack of coordination when a device is installed or moved. But, depending on the environment and how easily folks can connect devices to the network, it is also a very easy path for nefarious individuals to travel down. So you should have some type of port security in place and, for the love of everything sacred, disable all unused ports and put them in something other than VLAN 1. Network device port hygiene is very easy, and it adds a layer of frustration for someone trying to do something bad.
|Cisco Log Event||Simplified Explanation|
|May 1 04:03:29.915: %PORT_SECURITY-2-PSECURE_VIOLATION: Security violation occurred, caused by MAC address 001c.0e44.0a3f on port GigabitEthernet1/0/3.||Port security violation for MAC address ending in 0a3f on port Gi1/0/3|
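To alert on this, the offending MAC address and port can be pulled out of the Cisco message shown above. This is only a sketch built around that one sample line; the function name is made up, and other platforms or software versions may word the message differently.

```python
import re

# Built from the sample %PORT_SECURITY-2-PSECURE_VIOLATION line in the table above.
PSECURE_RE = re.compile(
    r"%PORT_SECURITY-2-PSECURE_VIOLATION: .*?MAC address "
    r"(?P<mac>[0-9a-fA-F]{4}\.[0-9a-fA-F]{4}\.[0-9a-fA-F]{4}) "
    r"on port (?P<port>\S+?)\.?$"
)

def parse_port_violation(line):
    """Return the MAC and port from a port security violation line, else None."""
    match = PSECURE_RE.search(line.strip())
    if match is None:
        return None
    return {"mac": match["mac"].lower(), "port": match["port"]}

sample = (
    "May 1 04:03:29.915: %PORT_SECURITY-2-PSECURE_VIOLATION: Security violation "
    "occurred, caused by MAC address 001c.0e44.0a3f on port GigabitEthernet1/0/3."
)
violation = parse_port_violation(sample)
print(violation)
```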
New Devices in the Environment
This can be similar to the port security events above (and may be caused by the same thing), but it can be captured in a few different ways. Periodic scans of IP ranges, or alerts from IDS/network monitoring devices on new MAC/IP addresses, can make administrators aware of new or undocumented devices.
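One simple way to sketch the comparison is a set difference between the addresses you observed and the addresses in your inventory. The function names and sample addresses are illustrative, and this assumes you already have both lists from a scan or monitoring tool.

```python
def normalize_mac(mac):
    """Reduce a MAC to 12 lowercase hex digits so different notations compare equal."""
    return "".join(ch for ch in mac.lower() if ch in "0123456789abcdef")

def unknown_devices(observed, inventory):
    """Return observed MACs (normalized, sorted) that are not in the inventory."""
    known = {normalize_mac(m) for m in inventory}
    return sorted({normalize_mac(m) for m in observed} - known)

# Made-up example: one documented device, one stranger.
inventory = ["00:1C:0E:44:0A:3F"]
observed = ["001c.0e44.0a3f", "aa-bb-cc-dd-ee-ff"]
new_macs = unknown_devices(observed, inventory)
print(new_macs)
```

Normalizing first matters because switches, scanners, and asset databases rarely agree on how a MAC address is written.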
Failed Logins to Crown Jewels
This one takes knowing a bit about your environment and what data is most important to your organization. It might also be the most important item in this list! What is the most critical information in your environment?
For years people have defaulted to identifying the domain controllers (DC) as the most important, without really taking into account the environment and the data inside of it. If you pop a DC you have the keys to the kingdom. But what if the really valuable data is confidential documents or other sensitive client data? What if that data resides on a system that has different protections and vulnerabilities than your domain controllers? I surely hope it is not on your domain controller. If it is, we have a whole lot of other issues to deal with.
The point being, an attacker might not have to get domain admin (DA) to get what they need. Since you know that this specific data is the most valuable to the organization, I would recommend adding an increased level of monitoring to it and to the system that it resides on. You can get very specific with the logging and alerting, depending on how often the data is accessed.
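As a sketch of what that increased monitoring could look like on Windows systems, failed logons can be counted per host and account for just the systems you care about. Event ID 4625 (an account failed to log on) is not in the tables in this post, and the host name, field names, and threshold are hypothetical.

```python
from collections import Counter

FAILED_LOGON = 4625          # Windows failed logon event; not listed in this post's tables
CROWN_JEWELS = {"docs-01"}   # hypothetical host holding the sensitive data

def failed_logons(events, hosts=CROWN_JEWELS, threshold=3):
    """Count failed logons per (host, user) on watched hosts; keep those at or over threshold."""
    counts = Counter(
        (e["host"], e["user"])
        for e in events
        if e["event_id"] == FAILED_LOGON and e["host"] in hosts
    )
    return {key: n for key, n in counts.items() if n >= threshold}

# Made-up sample: three failures on the crown jewel, one elsewhere.
events = [{"event_id": 4625, "host": "docs-01", "user": "john"}] * 3
events.append({"event_id": 4625, "host": "print-01", "user": "john"})
alerts = failed_logons(events)
print(alerts)
```

A lower threshold is usually tolerable here because the watch list is small.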
Monitoring Interactive Login with Service Accounts
In Windows environments, please do not use service accounts to do things that you should be doing with your regular user accounts. Likewise, your service accounts should not be logging in locally or remotely (via terminal services). Most places enforce this with a GPO that blocks the group containing the service accounts from logging on this way. However, it is worth monitoring just in case (match up your service accounts with these event IDs).
|Windows 2008 R2+ Log Event ID||Simplified Explanation|
|4624 Type 2||This is an Interactive logon (requiring local access and input of credentials).|
|4624 Type 10||This is a Remote Interactive logon (similar to type 2, but utilizing Remote Desktop (RDP) or Terminal Services).|
|4624 Type 11||This is a CachedInteractive logon (this occurs when an interactive logon to a domain account is attempted when the domain controller is not accessible).|
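Matching your service accounts against these event IDs can be sketched as a small filter. The logon types come from the table above; the event dictionary layout and account names are assumptions for illustration.

```python
# Logon types from the table above.
INTERACTIVE_LOGON_TYPES = {
    2: "Interactive",
    10: "RemoteInteractive",
    11: "CachedInteractive",
}

def service_account_interactive_logons(events, service_accounts):
    """Return (user, logon type name) for interactive 4624 logons by service accounts."""
    accounts = {a.lower() for a in service_accounts}
    return [
        (e["user"], INTERACTIVE_LOGON_TYPES[e["logon_type"]])
        for e in events
        if e["event_id"] == 4624
        and e["logon_type"] in INTERACTIVE_LOGON_TYPES
        and e["user"].lower() in accounts
    ]

# Made-up events: type 5 is a normal service logon and should be ignored.
events = [
    {"event_id": 4624, "user": "svc-backup", "logon_type": 5},
    {"event_id": 4624, "user": "svc-backup", "logon_type": 10},
    {"event_id": 4624, "user": "john", "logon_type": 2},
]
findings = service_account_interactive_logons(events, ["svc-backup"])
print(findings)
```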
Blocked Remote Access Attempts
I find this to be somewhat useful, but it might be too noisy or not viable if the organization isn’t controlling its data flows in a way that generates these types of logs efficiently. Ideally, there are only certain allowed remote access data flows in the environment. If this is the case, any attempts to connect outside of these documented data flows over SSH, RDP, etc. could be considered interesting. This will not necessarily scale well in large organizations, but if you couple it with the “Crown Jewels” principle, you could monitor for this type of traffic in critical areas of your infrastructure.
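Coupling this with the “Crown Jewels” idea could look roughly like the following: take the firewall deny records, keep only remote access ports, and keep only destinations inside the networks you consider critical. The subnet, field names, and sample records are hypothetical, and the default ports for SSH (22) and RDP (3389) are assumed.

```python
import ipaddress

REMOTE_ACCESS_PORTS = {22: "SSH", 3389: "RDP"}
# Hypothetical critical subnet; replace with the networks holding your crown jewels.
CRITICAL_NETS = [ipaddress.ip_network("10.0.20.0/24")]

def blocked_remote_access(denied, nets=CRITICAL_NETS):
    """Return (src, dst, service) for denied remote access attempts into critical networks."""
    hits = []
    for rec in denied:
        service = REMOTE_ACCESS_PORTS.get(rec["dst_port"])
        dst = ipaddress.ip_address(rec["dst"])
        if service and any(dst in net for net in nets):
            hits.append((rec["src"], rec["dst"], service))
    return hits

# Made-up deny records.
denied = [
    {"src": "10.0.9.44", "dst": "10.0.20.15", "dst_port": 3389},
    {"src": "10.0.9.44", "dst": "10.0.30.7", "dst_port": 22},
    {"src": "10.0.9.44", "dst": "10.0.20.15", "dst_port": 443},
]
suspicious = blocked_remote_access(denied)
print(suspicious)
```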
NTLMv1 Authentication
There are a lot of reasons why you should not still be using NTLMv1 in your environment. There are numerous blogs and articles on its vulnerabilities and on the many well-known attacks that can be perpetrated using it (e.g. Pass the Hash). Firstly, if you are not enforcing NTLMv2/Kerberos at this point, you should be monitoring this to see what is actually using NTLMv1 and being very persuasive with the system owners to get them to update to best practices. Secondly, since it is a widely used vector, it might be good to know if someone unexpected has dropped into your environment and is throwing around some NTLMv1 packets.
|Windows 2008 R2+ Log Event ID||Simplified Explanation|
|4624, Parameter 11 (Authentication Package) = NTLM, Parameter 15 = NTLM V1||Successful account logon with NTLM V1|
|4624, Parameter 11 (Authentication Package) = NTLM, Parameter 15 = LM||Successful account logon with LM|
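The two rows above can be sketched as a single check. The field names `AuthenticationPackageName` and `LmPackageName` are how these two parameters are usually labeled in the 4624 event data, but treat them as an assumption and confirm what your log collector actually emits.

```python
LEGACY_PACKAGES = {"NTLM V1", "LM"}

def is_legacy_ntlm_logon(event):
    """True for a successful logon (4624) that used NTLM with an NTLM V1 or LM package."""
    return (
        event.get("EventID") == 4624
        and event.get("AuthenticationPackageName") == "NTLM"
        and event.get("LmPackageName") in LEGACY_PACKAGES
    )

# Made-up events for illustration.
old = {"EventID": 4624, "AuthenticationPackageName": "NTLM", "LmPackageName": "NTLM V1"}
new = {"EventID": 4624, "AuthenticationPackageName": "NTLM", "LmPackageName": "NTLM V2"}
kerberos = {"EventID": 4624, "AuthenticationPackageName": "Kerberos", "LmPackageName": "-"}
print([is_legacy_ntlm_logon(e) for e in (old, new, kerberos)])
```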
Domain Lookups to Known Hacking Tool Domains
This is a little bit of a long shot, but it could pay off depending on your environment and the skill level of whoever might be trying to do something bad inside of it. If someone is using Kali Linux, for instance, and forgot to disable automatic updates, you might be able to pick up on DNS resolution for updates to the Kali repositories. Likewise, this could just be fed into whatever type of DNS blacklist/alerting you might have set up based on threats specific to your organization.
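A minimal sketch of the matching logic, assuming you can export queried names from your DNS logs. The watch list entry is only an illustration based on the Kali example above; build the real list from threats specific to your organization.

```python
# Illustrative watch list; matches the domain itself and any subdomain.
WATCHLIST = {"kali.org"}

def matches_watchlist(qname, watchlist=WATCHLIST):
    """True if the queried name equals a watched domain or is a subdomain of one."""
    name = qname.rstrip(".").lower()
    return any(name == d or name.endswith("." + d) for d in watchlist)

# Made-up query names.
queries = ["http.kali.org.", "notkali.org", "example.com"]
flagged = [q for q in queries if matches_watchlist(q)]
print(flagged)
```

Matching on a leading dot avoids false hits on unrelated names that merely end with the same letters.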
Alerts from AppLocker
AppLocker can be used to block unapproved applications from running or being installed on Windows machines. If you are not using it already, there are a lot of good use cases for it. It can also be set to audit mode, so you do not need to worry about blocking mission-critical applications on servers while you build up your whitelist. You could also use it strategically, only placing it on servers of importance (or with known weaknesses). I would also recommend some further monitoring of the cert store that it utilizes, so that if someone does try to do something funny with it (e.g. load certs for a bad application, thus bypassing its protection), you are aware that it might be happening.
|Windows 2008 R2+ Log Event ID||Simplified Explanation|
|8003, 8006||Audit mode that says .exe/.dll or .msi/script was allowed to run, but would have been blocked if the policy was enforced|
|8004, 8007||.exe/.dll or .msi/script was blocked by policy|
|8002, 8005||.exe/.dll or .msi/script was allowed to run. This is useful when trying to troubleshoot and see what is being allowed to run|
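For triage, the six event IDs above can be kept in a small lookup table. The IDs and meanings are taken from the table; the wording of the outcome strings and the function name are my own shorthand.

```python
# Event ID -> (file type, outcome), taken from the table above.
APPLOCKER_EVENTS = {
    8002: (".exe/.dll", "allowed"),
    8003: (".exe/.dll", "audit: would have been blocked"),
    8004: (".exe/.dll", "blocked"),
    8005: (".msi/script", "allowed"),
    8006: (".msi/script", "audit: would have been blocked"),
    8007: (".msi/script", "blocked"),
}

def describe(event_id):
    """Return a short description for an AppLocker event ID, or None if unknown."""
    entry = APPLOCKER_EVENTS.get(event_id)
    if entry is None:
        return None
    file_type, outcome = entry
    return f"{file_type} {outcome}"

print(describe(8004))
print(describe(8006))
```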
The Wrap Up
I think this list is a good jumping-off point for starting to monitor and understand what is going on in your environment. This list is by no means all-inclusive, and there are other and better things to monitor depending on the specifics of every organization. The purpose here was to lay out some solid recommendations that hopefully your organization is already following. Ideally, the majority of these can be logged, collected, and alerted on through some type of Security Information and Event Management (SIEM) tool. The real power there comes from being able to do correlation, trending, and real-time alerting in a way that is tailored to your specific needs, technologies, and environment.