Blog Series – Using ChatGPT for cyber defence part 2
Helping SOC analysts write log queries
Whether it’s for creating detection rules, building reports and dashboards or transforming threat hunting ideas into queries, SOC analysts are often faced with having to create log/SIEM queries.
This can take up a lot of time and often involves searching for ideas or existing queries that can be used as a starting point. This isn’t really what ChatGPT is designed to do, but there have been plenty of conversations around using it to write code, so let’s see if ChatGPT can help us by asking it to create queries in Kusto (KQL) and then running them in a test environment.
Example 1 – Creating a detection to look for .lnk files
A frequent method of Initial Access is via .lnk file delivery. Let’s ask ChatGPT to write a Kusto query that we can use in Microsoft Log Analytics or Sentinel:
The query it returned doesn’t actually run, because it passes a field called ‘user’ to the ‘project’ operator and no such field exists.
If we simply remove ‘user’ it works:
| where EventID == 4688
| where CommandLine contains ".lnk"
| project TimeGenerated, Computer, CommandLine
That returns results in our test environment after opening a .lnk file with PowerShell.
The query is useful as it only requires Windows Event Logs (Process logs) to be in Log Analytics (i.e. not Sysmon, etc.), so it is easy enough for most organisations to do.
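For reference, here is a minimal sketch of the complete query. It assumes the process creation events are collected into the SecurityEvent table, which the fragment above does not name. It also swaps the invalid ‘user’ field for Account, which is our substitution rather than part of ChatGPT’s answer, and the one-day look-back window is an assumption too.

```kusto
// Assumption: Event ID 4688 (process creation) lands in the SecurityEvent table
SecurityEvent
| where TimeGenerated > ago(1d)            // assumed look-back window
| where EventID == 4688
| where CommandLine contains ".lnk"
// Account is our stand-in for the invalid 'user' field
| project TimeGenerated, Computer, Account, CommandLine
```

If the table or column names differ in your workspace, adjust them before relying on the results.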
It’s interesting that the code provided doesn’t actually run. It’s possible that it’s using older KQL syntax but it’s a consistent problem we run into, and in some cases the queries it returns have some bigger errors.
Let’s try some other ideas:
Example 2 – Finding .lnk files delivered by email that haven’t been blocked
Write a kusto query to find .lnk files delivered by email
The above query doesn’t work either: the ‘Attachments’ field it uses is not supported. Instead you need something like:
| where FileName contains ".lnk"
To ignore emails that are blocked you need something similar to:
| where DeliveryAction != "Blocked"
instead of ‘IsBlocked == False’ as suggested.
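Putting the two corrected lines together might look like the sketch below. It assumes the Microsoft Defender for Office 365 tables EmailAttachmentInfo and EmailEvents are available in the workspace; the source does not name the tables, so treat them as an assumption. Using endswith rather than contains is our tightening, and the Timestamp column may be TimeGenerated in your environment.

```kusto
// Assumption: Defender for Office 365 email tables are ingested
EmailAttachmentInfo
| where FileName endswith ".lnk"           // tighter than contains; our choice
| join kind=inner (
    EmailEvents
    | where DeliveryAction != "Blocked"    // keep only mail that was not blocked
    | project NetworkMessageId, RecipientEmailAddress, DeliveryAction, Subject
  ) on NetworkMessageId, RecipientEmailAddress
| project Timestamp, SenderFromAddress, RecipientEmailAddress, FileName, Subject, DeliveryAction
```

Joining on both the message ID and the recipient avoids duplicate rows when one message is sent to several mailboxes. As noted below, this will not see a .lnk file wrapped inside a .zip or .iso.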
It’s probably not the best example as it’s a hard query to make and typically .lnk files are delivered inside something else like a .zip or a .iso, so I decided to leave this here and try something else.
Back to detecting .lnk files. Knowing that attackers hide PowerShell scripts as .lnk files, I tried again and asked ChatGPT to help me write a query to detect PowerShell scripts ‘hiding’ as .lnk files.
The above query doesn’t work. It was easy to fix though, and I came up with this query:
| where EventID == 4688
| where CommandLine contains ".lnk" and CommandLine contains "powershell"
| project TimeGenerated, Computer, CommandLine
This query was able to detect me running a .lnk file with PowerShell. It’s not a great query but it does work for at least one of my tests.
I then asked it to enhance the above query by identifying who was logged on to the computer at the time. The result came back with more errors; for example, line 4 should read ‘| join SecurityEvent’.
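One way such a join could be written is sketched below. This is not ChatGPT’s output or the fix I ended up with; it is a hedged illustration that assumes logon events (Event ID 4624) are collected in the same SecurityEvent table and that the most recent interactive logon on the same computer is a fair proxy for ‘who was logged on’. The names lnkProcesses, ProcessTime and LogonTime are made up for the example.

```kusto
// Assumption: 4688 and 4624 events are both in SecurityEvent
let lnkProcesses = SecurityEvent
    | where TimeGenerated > ago(1d)
    | where EventID == 4688
    | where CommandLine contains ".lnk" and CommandLine contains "powershell"
    | project ProcessTime = TimeGenerated, Computer, CommandLine;
lnkProcesses
| join kind=inner (
    SecurityEvent
    | where TimeGenerated > ago(1d)
    | where EventID == 4624
    | where LogonType in (2, 10, 11)       // interactive, remote interactive, cached interactive
    | project LogonTime = TimeGenerated, Computer, Account, LogonType
  ) on Computer
| where LogonTime <= ProcessTime           // only logons before the process started
// keep the latest qualifying logon for each process event
| summarize arg_max(LogonTime, Account, LogonType) by ProcessTime, Computer, CommandLine
```

A logon that happened before the one-day window would be missed, so the window may need widening in practice.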
I decided to move on as I wasn’t getting very far and tried another approach by asking:
write a Kusto query to detect PowerShell scripts that connect to internet
The first answer was below:
On first pass, it looks like a reasonable starting point (it projects ‘User’ again, which is not valid; removing it makes the query work). After testing it, though, I realised that the logs for Event ID 4688 don’t show the PowerShell command line, at least not in my test environment using these scripts. So the query runs and finds nothing. That wasn’t what I expected, as I had been running test PowerShell scripts that should match the query.
This is the first lesson – even if the output seems to work it doesn’t mean the thing you’re trying to do will! Ensure you have good test cases to test your queries and detection logic! This is true for any detection you create, regardless of where or how you made it.
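Before trusting an empty result, it can help to check whether the data could support the detection at all. The sketch below counts, per computer, how many PowerShell process creation events exist and how many of them carry a command line. It assumes the SecurityEvent table and its NewProcessName and CommandLine columns; the column names ProcessEvents and WithCommandLine are made up for the example.

```kusto
// Assumption: Event ID 4688 is collected into SecurityEvent
SecurityEvent
| where TimeGenerated > ago(1d)
| where EventID == 4688
| where NewProcessName endswith "powershell.exe"
| summarize ProcessEvents = count(),
            WithCommandLine = countif(isnotempty(CommandLine))
    by Computer
```

A zero in WithCommandLine may indicate that command-line auditing is not enabled on that host. A non-zero value with no detection hits may mean the commands ran inside an already open PowerShell session, so they never appeared on a process command line.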
What was interesting, however, was the command lines it suggested. I wasn’t familiar with ‘Invoke-RestMethod’ and this made me want to review what other PowerShell commands could transfer files and hence have the potential to be used maliciously.
So, I asked ChatGPT:
List all commands that can be used in PowerShell to transfer files
Which of these allows transfer of files over the internet?
Here we have a range of incomplete results that require further analysis. The conclusion here is that it wasn’t great at writing the actual queries but did give me some ideas on how to improve the detection logic (i.e., ensuring all possible methods of transferring files with PowerShell are covered).
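The idea of covering several transfer methods in one detection can be sketched as below. The list is illustrative and deliberately incomplete: it is not the list ChatGPT returned, and the variable name transferCommands is made up for the example. It again assumes the SecurityEvent table, and it only matches when the command appears on the process command line.

```kusto
// Illustrative, incomplete list of PowerShell ways to move files over the network
let transferCommands = dynamic([
    "Invoke-WebRequest", "Invoke-RestMethod", "Start-BitsTransfer",
    "Net.WebClient", "DownloadFile", "DownloadString"
]);
SecurityEvent
| where TimeGenerated > ago(1d)
| where EventID == 4688
| where NewProcessName endswith "powershell.exe" or NewProcessName endswith "pwsh.exe"
| where CommandLine has_any (transferCommands)
| project TimeGenerated, Computer, Account, CommandLine
```

Keeping the commands in a single list makes it easy to extend the detection as new methods turn up in the documentation.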
I made various attempts to get a full list of all possible commands, but ChatGPT reminded me that it’s a ‘natural language model and there are too many options…’. It did keep coming up with commands I didn’t know about, but to get the true list I did indeed need to take its advice and RTFM!
This needs to be remembered when using ChatGPT – use it for what it was designed for or expect inconsistent, incomplete responses.
I had one last attempt:
As I wasn’t having much success, I tried a different approach: can it help the SOC by explaining what log sources/types are needed for a query to work? Below is a sample output. It is useful but vague: I tried asking it why my Event ID 4688 logs didn’t show the PowerShell command lines I was looking for and I couldn’t get an answer. But then again, I couldn’t get an answer to that question from traditional search engines after twenty minutes of trying!
So far, it’s clear that asking ChatGPT to write log queries isn’t going to provide you with accurate answers, but it is still useful if you can fix the queries and use ChatGPT more as a way of exploring ideas and looking for things you may have missed. You need to test everything it does create as it can also create working queries that don’t do what you need, so be careful!
A better way of using it may be to ask it to help you test detections/log queries. This is often tricky and analysts can be great at Kusto queries and poor at creating test scripts, so I looked into that.
Here, I pasted my Kusto query/detection logic into ChatGPT and asked it to create a sample PowerShell script that I could use to test it. Below are a couple of example results:
The results needed modifying, but they are a useful starting point if you’re really not sure where to begin. After some modification I was able to create a script to copy a file on my test machine using PowerShell, run the script and then check the file was copied. I then ran my detection query in Log Analytics to see if I could find the evidence.
I decided to leave this here and pick this aspect up in more detail later and have one more go at getting help writing detections:
Write some Kusto to threat hunt for rundll32.exe abuse
That didn’t provide any meaningful results, so I tried:
How to threat hunt for malicious macros in word documents
How to detect macros in word documents using kusto
Not much luck with these either.
I changed approach and asked:
I want to do some threat hunting using Kusto for suspicious emails, can you give me some example queries
Here are two of the results:
I then asked:
Add to 1. emails with attachments or external url links
As usual the above query doesn’t work and needs a lot of work to fix.
What have I learned from this?
ChatGPT is not great at writing queries, but it is useful for helping me find ideas for new detections, as well as improvements to existing detections. It also pointed me in the right direction so I could create the query myself. It also seems to have the potential to help write test logic for existing detections; more on that in another blog.
I had one final idea – ask it something that it’s better suited to answering…
generate a threat hunting hypothesis based on any Mitre ATT&CK initial access method
Pretty good starting point that can be refined and improved so I kept going with this approach:
Again, reasonable starting point and something I will pick up in a later blog.
Rob Demain, CEO