Various online datasets can help you practice threat hunting. The Boss of the SOC (BOTS) dataset v1, released by Splunk, is one of them. I deployed a Splunk instance on a local VM, loaded the dataset, and began the challenge (be aware that, depending on your Splunk version, there might be some compatibility issues).
First off, explore the data to learn which types of logs you have access to. The following Splunk search shows a table of the available sourcetypes with the number of events recorded for each:
index=botsv1 earliest=0 | stats count by sourcetype
You can also use the Splunk metadata command, restricted to the dataset's index:
| metadata type=sourcetypes index=botsv1
Scenario 1: Web defacement
The first scenario consists of investigating a web defacement at Wayne Enterprises, on the site http://www.imreallynotbatman.com/
Question 101
What is the likely IPv4 address of someone from the Po1s0n1vy group scanning imreallynotbatman.com for web application vulnerabilities?
Known facts: The target is imreallynotbatman.com and the attackers are looking for web application vulnerabilities.
Consequences: Examine the HTTP logs, in stream:http for example, and get the source IP addresses of the requests.
index=botsv1 sourcetype="stream:http" imreallynotbatman.com | stats count by src_ip
Two IP addresses are identified: 40.80.148.42 and 23.22.63.114, with a lot more events associated with the first IP.
Hypothesis: The IP address with the highest number of requests could be the one performing the scan. What response status codes, other than the successful 200, are returned for the requests initiated by each IP address?
index=botsv1 sourcetype="stream:http" imreallynotbatman.com NOT status=200 | stats count by src_ip, status
When examining the results, the IP 40.80.148.42 generates 14600 redirection messages, client or server error responses, while the IP 23.22.63.114 “only” generates 412 such events.
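To make the "redirection, client or server error" grouping concrete, here is a minimal Python sketch of the same idea outside Splunk: count non-200 responses per source IP, bucketed by status class. The event list is hypothetical sample data, not taken from the dataset, and `status_class` is a helper introduced only for this illustration.

```python
from collections import Counter

# Hypothetical (src_ip, status) pairs standing in for stream:http events.
events = [
    ("40.80.148.42", 404),
    ("40.80.148.42", 302),
    ("40.80.148.42", 500),
    ("40.80.148.42", 200),
    ("23.22.63.114", 303),
    ("23.22.63.114", 200),
]


def status_class(status: int) -> str:
    """Map an HTTP status code to its class, e.g. 404 -> '4xx'."""
    return f"{status // 100}xx"


# Mirror `NOT status=200 | stats count by src_ip, status`, grouped by class.
counts = Counter(
    (src_ip, status_class(status))
    for src_ip, status in events
    if status != 200
)

for (src_ip, klass), count in sorted(counts.items()):
    print(src_ip, klass, count)
```

A scanner typically produces a large share of 4xx responses because it probes many paths that do not exist, which is why the error count per IP is a useful signal here.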
Hypothesis: We can cross-reference with the logs generated by Suricata and see what types of alerts are raised for these IP addresses.
index=botsv1 sourcetype="suricata" imreallynotbatman.com event_type="alert" src_ip="40.80.148.42" | stats count by alert.signature | sort - count
For 40.80.148.42, many alerts related to scanning for web application vulnerabilities are found. Running the same search for IP 23.22.63.114 returns no results.

Conclusion: We can deduce that the IP address scanning imreallynotbatman.com for web application vulnerabilities is 40.80.148.42
Question 102
What company created the web vulnerability scanner used by Po1s0n1vy? Type the company name.
Hypothesis: The name of the web vulnerability scanner could be revealed in the User-Agent used by the attacker
index=botsv1 sourcetype="stream:http" imreallynotbatman.com src_ip=40.80.148.42 | table src_headers
By analysing the content of the HTTP header, we discover that the attacker uses a web vulnerability scanner provided by Acunetix.
Question 103
What content management system is imreallynotbatman.com likely using?
The previous query also answers this question: Joomla is the content management system used by imreallynotbatman.com

Question 104
What is the name of the file that defaced the imreallynotbatman.com website? Please submit only the name of the file with extension
Hypothesis: The attacker could have provided the malicious file by tricking the web server into downloading the file to its folder. In that case, since the web server initiates the connection, its IP address would be in the source field of the request.
First, identify the IP address of the server.
index=botsv1 sourcetype="stream:http" src_ip="40.80.148.42" imreallynotbatman.com | stats count by dest_ip
The IP address of the server seems to be 192.168.250.70, given the number of requests received by this IP. A second IP address also emerges, 192.168.250.40, which we can save for later.

Then, filter the logs to identify requests where the server initiates the connection:
index=botsv1 sourcetype="stream:http" src_ip="192.168.250.70" | table request
Here is the file defacing the webserver

The file is also present when examining the Fortigate logs…
index=botsv1 sourcetype="fgt_utm" srcip="192.168.250.70" | table url,catdesc

… and the Suricata logs:
index=botsv1 sourcetype="suricata" src_ip="192.168.250.70" event_type=http

Question 105
This attack used dynamic DNS to resolve to the malicious IP. What fully qualified domain name (FQDN) is associated with this attack?
The FQDN, prankglassinebracket.jumpingcrab.com, is revealed by looking at the full event in fgt_utm, stream:http or suricata.

Question 106
What IPv4 address has Po1s0n1vy tied to domains that are pre-staged to attack Wayne Enterprises?
Still looking at the full Suricata event generated, we notice that the IP address associated with the FQDN is recorded: 23.22.63.114
We can confirm it by looking at the DNS logs:
index=botsv1 sourcetype="stream:dns" prankglassinebracket.jumpingcrab.com
The machine with IP 192.168.250.20 sends a DNS query to Google's public DNS server, 8.8.8.8, for the IP address of prankglassinebracket.jumpingcrab.com, and the host address returned is 23.22.63.114

Question 108
What IPv4 address is likely attempting a brute force password attack against imreallynotbatman.com?
Hypothesis: Brute force login attempts are generally performed by issuing HTTP POST requests that target the login form. The form_data of the associated HTTP request can contain keywords such as username=admin, passwd or password.
index=botsv1 sourcetype="stream:http" dest_ip=192.168.250.70 form_data=*passwd* http_method=POST imreallynotbatman.com | table form_data, src_ip
This query shows the brute force attempts for IP address 23.22.63.114
Question 109
What is the name of the executable uploaded by Po1s0n1vy?
Hypothesis: Uploading files via a webserver generally uses an HTTP POST method. The infected server would be 192.168.250.70, and the file could be uploaded with a common windows extension such as .exe. We can also exclude the requests made by the web vulnerability scanner identified earlier.
index=botsv1 sourcetype="stream:http" dest_ip=192.168.250.70 http_method=POST src_headers NOT "acunetix" *.exe
The executable uploaded by Po1s0n1vy is 3791.exe
Question 110
What is the MD5 hash of the executable uploaded?
Hypothesis: The MD5 hash might be seen in other log events. If this file was executed, it would generate a Process Creation event in Sysmon log
index=botsv1 sourcetype="XmlWinEventLog:Microsoft-Windows-Sysmon/Operational" EventCode=1 CommandLine="3791.exe"

Question 111
GCPD reported that common TTPs (Tactics, Techniques, Procedures) for the Po1s0n1vy APT group, if initial compromise fails, is to send a spear phishing email with custom malware attached to their intended target. This malware is usually connected to Po1s0n1vys initial attack infrastructure. Using research techniques, provide the SHA256 hash of this malware.
Using a search engine, we can look for malware hashes associated with IP 23.22.63.114. This leads to the following ThreatMiner page, which lists several MD5 hashes associated with this IP address, including the one for 3791.exe: https://www.threatminer.org/host.php?q=23.22.63.114

Following the link for the hash that several analysis engines flag as malicious gives access to the file metadata, including its SHA256 value.

Question 112
What special hex code is associated with the customized malware discussed in Question 111?
This hex code can be found in the community section of VirusTotal when searching for the hash of that file:
53 74 65 76 65 20 42 72 61 6e 74 27 73 20 42 65 61 72 64 20 69 73 20 61 20 70 6f 77 65 72 66 75 6c 20 74 68 69 6e 67 2e 20 46 69 6e 64 20 74 68 69 73 20 6d 65 73 73 61 67 65 20 61 6e 64 20 61 73 6b 20 68 69 6d 20 74 6f 20 62 75 79 20 79 6f 75 20 61 20 62 65 65 72 21 21 21
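The hex code is plain ASCII, so it can be decoded to read the hidden message. A minimal sketch in Python, using only the standard library and the hex string quoted above:

```python
# Hex string copied from the VirusTotal community comment quoted above,
# split into chunks for readability (each chunk ends with a space).
hex_code = (
    "53 74 65 76 65 20 42 72 61 6e 74 27 73 20 42 65 61 72 64 20 "
    "69 73 20 61 20 70 6f 77 65 72 66 75 6c 20 74 68 69 6e 67 2e 20 "
    "46 69 6e 64 20 74 68 69 73 20 6d 65 73 73 61 67 65 20 "
    "61 6e 64 20 61 73 6b 20 68 69 6d 20 74 6f 20 62 75 79 20 "
    "79 6f 75 20 61 20 62 65 65 72 21 21 21"
)

# bytes.fromhex skips the spaces between byte pairs.
decoded = bytes.fromhex(hex_code).decode("ascii")
print(decoded)
# Steve Brant's Beard is a powerful thing. Find this message and ask him to buy you a beer!!!
```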
Question 114
What was the first brute force password used?
We need to extract the password from the form data and sort the results by time in order to see the first brute force event:
index=botsv1 sourcetype="stream:http" form_data="*username=admin*" http_method=POST | rex field=form_data "passwd=(?<password>\w+)" | sort _time | table _time, password

The first password is 12345678
Question 115
One of the passwords in the brute force attack is James Brodsky’s favorite Coldplay song. We are looking for a six character word on this one. Which is it?
We should start with a list of six-character Coldplay song titles and look for them in the form data. The pattern is anchored after the six letters so that longer passwords are not matched on their first six characters:
index=botsv1 sourcetype=stream:http http_method=POST | rex field=form_data "(?i)passwd=(?<password>[a-zA-Z]{6})(?:&|$)" | search password IN (yellow, violet, sparks, shiver, clocks, square, always, ghosts) | table src_ip, password
The password is yellow
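One subtlety with a `{6}` quantifier is worth illustrating: without an anchor after it, the pattern also matches the first six letters of a longer password. The Python sketch below shows the difference on hypothetical form_data strings (shaped like a login POST body, but not copied from the dataset); the `extract` helper is introduced only for this example.

```python
import re

# Hypothetical form_data values; the field layout is an assumption.
samples = [
    "username=admin&task=login&passwd=yellow&option=com_login",
    "username=admin&task=login&passwd=trouble&option=com_login",
    "username=admin&task=login&passwd=12345678&option=com_login",
]

# Unanchored: stops after six letters, even if more letters follow.
unanchored = re.compile(r"passwd=(?P<password>[a-zA-Z]{6})", re.IGNORECASE)
# Anchored: the six letters must be followed by '&' or the end of the string.
anchored = re.compile(r"passwd=(?P<password>[a-zA-Z]{6})(?:&|$)", re.IGNORECASE)


def extract(pattern, values):
    """Return the captured password for every value the pattern matches."""
    return [m.group("password") for v in values if (m := pattern.search(v))]


loose = extract(unanchored, samples)
strict = extract(anchored, samples)
print(loose)   # ['yellow', 'troubl']
print(strict)  # ['yellow']
```

In this case the truncated match is harmless because it is not in the candidate list, but anchoring keeps the extracted field equal to the real password.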
Question 116
What was the correct password for admin access to the content management system running “imreallynotbatman.com”?
Hypothesis: A successful password would be used more than once, and would generate a 200 status code.
index=botsv1 sourcetype=stream:http http_method=POST form_data="*username=admin*" dest_ip=192.168.250.70 | rex field=form_data "passwd=(?<password>\w+)" | stats count by password | sort - count
The password batman appears twice. We can check whether submitting it in the login form results in a successful connection.

12 successful connections were made to the Administration page using that password, from IP address 40.80.148.42

Question 117
What was the average password length used in the password brute forcing attempt?
index=botsv1 sourcetype=stream:http http_method=POST dest_ip=192.168.250.70 src_ip=23.22.63.114 | rex field=form_data "passwd=(?<password>\w+)" | eval length=len(password) | stats avg(length) as average_length | eval rounded=round(average_length,0)
The passwords had a mean length of 6.
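The same extract, measure and average steps can be sketched in Python to show what the search computes. The form_data strings below are hypothetical samples, so the resulting figure only illustrates the arithmetic and is not the dataset's answer.

```python
import re
from statistics import mean

# Hypothetical form_data values from brute force POST requests.
form_data = [
    "username=admin&passwd=12345678&task=login",
    "username=admin&passwd=batman&task=login",
    "username=admin&passwd=yellow&task=login",
    "username=admin&passwd=abc&task=login",
]

# Equivalent of: rex field=form_data "passwd=(?<password>\w+)"
passwords = [
    m.group(1) for fd in form_data if (m := re.search(r"passwd=(\w+)", fd))
]

# Equivalent of: eval length=len(password) | stats avg(length)
average_length = mean(len(p) for p in passwords)
rounded = round(average_length)

print(average_length, rounded)  # 5.75 6
```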
Question 118
How many seconds elapsed between the time the brute force password scan identified the correct password and the compromised login?
We can use the Splunk transaction command, which automatically adds a duration column for the difference between the first and last events of the transaction
index=botsv1 sourcetype=stream:http http_method=POST uri="/joomla/Administrator/index.php" dest_ip=192.168.250.70 | rex field=form_data "username=(?<username>admin).*passwd=(?<password>batman)" | transaction password | eval rounded_duration=round(duration,2) | table rounded_duration, eventcount
92.17 seconds elapsed between the scan and the compromised login
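The duration field added by transaction is the difference between the latest and earliest event timestamps in the group. A minimal Python sketch of that calculation follows; the two timestamps are illustrative values chosen so that the gap matches the 92.17 seconds reported above, not values copied from the dataset.

```python
from datetime import datetime

# Illustrative events sharing the same password; timestamps are assumptions.
events = [
    {"_time": "2016-08-10T21:46:33.689", "src_ip": "23.22.63.114"},
    {"_time": "2016-08-10T21:48:05.858", "src_ip": "40.80.148.42"},
]

# Sort the timestamps, then subtract the earliest from the latest.
times = sorted(datetime.fromisoformat(e["_time"]) for e in events)
duration = (times[-1] - times[0]).total_seconds()
rounded_duration = round(duration, 2)

print(rounded_duration, len(events))  # 92.17 2
```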
Question 119
How many unique passwords were attempted in the brute force attempt?
The following query extracts the password from each request and counts the distinct values:
index=botsv1 sourcetype=stream:http http_method=POST uri="/joomla/Administrator/index.php" src_ip=23.22.63.114 dest_ip=192.168.250.70 | rex field=form_data "passwd=(?<password>\w+)" | stats dc(password) as count