Real-time Threat Detection with Machine Learning
Ateleris worked with a US security company to develop a new machine-learning-based cybersecurity algorithm that detects malicious login attempts to cloud services.
The US security company develops and sells IT security products, with a focus on “Threat Detection & Mitigation”. With the increasing use of cloud architectures, detecting suspicious or malicious login attempts has become essential. Ateleris was tasked with engineering a new machine-learning-based detection method that reduces false alarms and increases the detection rate of actual threats.
To that end, Ateleris replaced the simple, static rules for detecting malicious login attempts with a flexible machine learning model (gradient boosting) that assesses each login based on all available data over time. An intensive feature engineering phase preceded the implementation of the gradient boosting algorithm to identify the most informative features for the model.
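To illustrate the idea, the following is a minimal sketch in Python, not the model used in the project. It assumes a hypothetical `Login` record and two hypothetical history-based features (failed attempts in the last hour, login from a country not seen before), and it implements the simplest form of gradient boosting: decision stumps fitted to the residuals of a squared-error loss. A production system would more likely rely on an established gradient boosting library.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Login:  # hypothetical record layout, for illustration only
    time: datetime
    country: str
    success: bool


def features(history, login, window=timedelta(hours=1)):
    """Turn one login plus the same user's earlier logins into numeric features."""
    recent = [h for h in history if login.time - h.time <= window]
    failed_recent = sum(1 for h in recent if not h.success)
    seen_country = any(h.country == login.country for h in history)
    return [float(failed_recent), 0.0 if seen_country else 1.0]


def fit_stump(X, residuals):
    """Find the one-feature threshold split with the lowest squared error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue  # not a real split
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or err < best[0]:
                best = (err, j, t, lm, rm)
    return None if best is None else best[1:]


def fit_boosting(X, y, rounds=20, lr=0.3):
    """Gradient boosting for squared loss: each stump fits the current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]  # negative gradient
        stump = fit_stump(X, residuals)
        if stump is None:  # no feature can be split any further
            break
        j, t, lm, rm = stump
        stumps.append(stump)
        pred = [p + lr * (lm if row[j] <= t else rm) for row, p in zip(X, pred)]
    return base, lr, stumps


def score(model, row):
    """Higher score means the login looks more like the malicious examples."""
    base, lr, stumps = model
    return base + sum(lr * (lm if row[j] <= t else rm) for j, t, lm, rm in stumps)


# Toy training set: [failed attempts in last hour, new country]; label 1 = malicious
X = [[0, 0], [1, 0], [0, 0], [5, 1], [8, 1], [6, 0]]
y = [0, 0, 0, 1, 1, 1]
model = fit_boosting(X, y)
```

Unlike a static rule such as "alert after five failed attempts", the thresholds here are learned from labeled data, and further features can be added without rewriting the decision logic.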
Because training and validation required vast quantities of data, efficient approaches had to be developed to extract it from the data lakes: terabytes of data had to be loaded, processed, and searched for patterns. For this purpose, specialized low-level data connectors were developed that use memory optimizations, parallelization strategies, and efficient data structures to load the login data from the AWS data lake and to clean and preprocess it for analysis.
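The general pattern of chunked, parallel loading and cleaning can be sketched as follows. This is an illustration only and does not reflect the actual connectors: the JSON-lines input format, the field names `user`, `ts`, and `success`, and all function names are assumptions. Records are stored as compact tuples rather than dictionaries to keep memory use low.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def parse_chunk(lines):
    """Parse and clean one chunk of JSON-lines records, skipping bad rows."""
    cleaned = []
    for line in lines:
        try:
            rec = json.loads(line)
            # Compact tuple: (user, timestamp, success flag)
            cleaned.append((rec["user"], int(rec["ts"]), bool(rec.get("success", False))))
        except (ValueError, KeyError, TypeError):
            continue  # malformed JSON or missing/invalid fields
    return cleaned


def chunks(iterable, size):
    """Yield successive lists of at most `size` items."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def load_parallel(lines, chunk_size=1000, workers=4):
    """Clean chunks on a thread pool and yield records in input order.

    Note: Executor.map submits all chunks up front, so this sketch does not
    bound memory; a real connector would limit the number of chunks in flight.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for cleaned in pool.map(parse_chunk, chunks(lines, chunk_size)):
            yield from cleaned


sample = [
    '{"user": "a", "ts": 1, "success": true}',
    "not json",
    '{"user": "b"}',
    '{"user": "c", "ts": "2"}',
]
records = list(load_parallel(sample, chunk_size=2))
```

Threads mainly pay off when the work per chunk is I/O-bound, such as fetching objects from remote storage; for CPU-bound parsing in Python, a process pool would be the more natural choice.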
After successful small-scale field tests, the client’s engineering team integrated our algorithm into their live production system, which collects and processes over 4 million login requests per day. With the new machine-learning-based approach, false positives (false alerts) dropped by 50-60%, while the detection rate of actual attacks increased significantly.
Key Technologies/Terms
- Feature Engineering and Data Wrangling
- Gradient Boosting
- C#, .NET, Python
- Cybersecurity