Abstract: The field of computer security offers solutions that capture events in the form of security logs. A security log file records events that occur in an operating system or between devices. Such log files (e.g. access, system, firewall and anti-virus logs) gather vast collections of data. The data captured in security log files can offer valuable intelligence if smart analysis is applied. Proprietary logging applications process this information and generate network health intelligence; however, the mechanisms behind those analyses are not fully known. With the volume of data ranging from 15 TB to 50 TB, security logs can be difficult to analyse without expensive automation tools. Organizations are constantly experiencing cyber-attacks, and the security logs that their networks generate could provide early warning of imminent threats and shed light on how to predict and prevent them. Manual analysis of security logs is not feasible because of the volume of generated data. Machine learning (ML) is well suited to organizing, visualizing and producing usable intelligence from network logs. ML can be used to learn the behaviour of the network and predict network threats. Training appropriate ML models to understand network behaviour, and combining their results with time series analysis principles, can lift the burden of data crunching. The aim of this paper is to collect security logs from a live environment and apply the full ML pipeline to uncover abnormal incidents that lead to a breach. The findings will provide insight into the types of security logs that have to be examined and the ML models that work best for log analysis protocols.
Keywords: event management, machine learning, network security logs, time series analysis, security operations center, early detection, anomaly detection
1. Introduction
Computer networks deploy various defensive mechanisms such as Intrusion Detection Systems (IDS), anti-virus software and firewalls. These defensive mechanisms are aimed at protecting the organisation's resources from cyber-attacks. Traditional defensive mechanisms have proven to be ineffective against more sophisticated attacks, such as Advanced Persistent Threats and evasive malware (Saunders, 2020). This is because such mechanisms are either signature-based IDS, which detect only known attacks (Masdari & Khezri, 2020), or Host Intrusion Detection Systems (HIDS), which monitor system-related events. A system monitored with a HIDS may be compromised, leading to...