Source: Deep Learning on Medium
Decoding the AI Defense System Behind Alibaba Cloud Web Application Firewall (WAF)
Web applications are exposed to many types of attacks and bear the brunt of injection and cross-site attacks. Web Application Firewalls (WAFs) are designed to detect and block attacks targeted at web applications and are continuously updated to protect them from new security threats. A range of machine learning algorithms and models has been proposed and applied to security products such as WAFs. However, most of these algorithms rely on supervised learning: they build a classification model for a specific attack type from labeled positive and negative samples.
Like traditional security detection technologies, machine learning algorithms alone cannot resolve the trade-off between false negatives and false positives, or strike a balance between coverage and detection performance. The security field still faces the challenges of open-loop problem spaces and asymmetric positive and negative spaces.
Gartner recognized Alibaba Cloud WAF as a “Niche Player” in its 2019 Magic Quadrant for Web Application Firewalls (WAF).
This article describes how the AI kernel of the intelligent defense system of Alibaba Cloud WAF resolves such challenges. The AI kernel of Alibaba Cloud WAF provides core machine intelligence capabilities, along with refined, personalized, and intelligent protection to minimize security threats for customers. AI-driven intelligent security systems are on the rise and bring greater benefits.
Alibaba Cloud WAF — AI Kernel Technology
The intelligent defense system of Alibaba Cloud WAF has a built-in AI kernel, which is different from algorithms or rules that focus only on attack detection. Based on intelligent security concepts such as layered traffic management and targeted protection, the AI kernel divides traffic into three layers: white, gray, and black. Each layer is served by different machine intelligence models, such as the Active Defense Model, Anomaly Detection Model, Locate-Then-Detect (LTD) Model, Fault Warning Model, False Negative Detection Model, and False Positive Detection Model.
The intelligent models at different layers perform their specific functions, are self-consistent, and are fully linked to form a set of decision-making intelligence against basic threats at the application layer. Protection rules or models are automatically created by machine intelligence for each specific website based on the services of the website.
Therefore, the number of custom defense systems is equal to the number of websites, and such a large number of defense systems form a refined and personalized intelligent security system to protect against hacker attacks.
Active Defense Model
This model uses Alibaba Cloud’s in-house traffic model learning algorithm to automatically learn the legitimate traffic of each domain name. It learns and describes the legitimate access traffic of every website through unsupervised learning, and the server automatically creates whitelist rules for traffic that is allowed to pass the firewall. Millions of such rules are created online to protect websites.
Anomaly Detection Model
Based on the concept of targeted protection, this model uses anomaly detectors to identify the gray traffic of every website by checking request segments and time series. The server automatically creates millions of models for detecting gray traffic.
Locate-Then-Detect (LTD) Model
The LTD model detects attacks through machine vision and deep learning. It consists of two deep neural networks: the Payload Locating Network (PLN) and the Payload Classification Network (PCN). The PLN and PCN are combined to accurately locate a malicious payload and identify its type. Drawing on the feature extraction capability of deep learning, the LTD model generalizes better and identifies more attack variants. It combines object detection and attention mechanisms to address the interpretability problem of deep learning in cyberattack detection. This work was accepted at IJCAI 2019, a top-tier AI academic conference.
The AI kernel of Alibaba Cloud WAF also provides the fault warning model, active detection model for false negatives, and active detection model for false positives.
The AI kernel of Alibaba Cloud WAF is designed with major technological innovations in its layered traffic management and targeted protection features. It is a general intelligent security system applicable to application-layer attack detection and a wide range of other security scenarios.
Introduction to Alibaba Cloud WAF
Based on the big data and intelligent computing capabilities of cloud security, Alibaba Cloud WAF defends against SQL injection, cross-site scripting (XSS), common web server plug-in vulnerabilities, Trojan upload, unauthorized access to core resources, and other common Open Web Application Security Project (OWASP) attacks. It protects websites or applications by identifying the malicious features of service traffic, and forwards normal and secure traffic to the origin server. Alibaba Cloud WAF also protects website or app servers from intrusions, safeguards core data, and prevents server performance degradation due to attacks.
Based on the powerful computing and data processing capabilities of Alibaba Cloud, Alibaba Cloud WAF improves the detection rate and reduces the false positive rate through the industry-leading AI deep learning method. It uses a client SDK to access model collection and big data analysis capabilities to process important requests in quasi-real-time. Alibaba Cloud WAF also supports synchronous delivery and updates of automatic alert and global response rules.
Common application scenarios of Alibaba Cloud WAF include security protection of websites or web apps in the finance, e-commerce, O2O, gaming, governance, and insurance fields.
Alibaba Cloud WAF solves the following application security problems.
- Prevents breach of core data on websites caused by injection attacks.
- Prevents Challenge Collapsar (CC) attacks, a form of HTTP flood, and safeguards website availability by blocking massive numbers of malicious requests.
- Prevents Trojan upload and webpage tampering to ensure website credibility.
- Provides virtual patches that enable a quick fix for newly discovered vulnerabilities.
Located at the network egress or ingress, Alibaba Cloud WAF combines the intelligent protection engine, expert protection rules, active defense and detection engine, and cloud threat intelligence capability to identify web attacks and malicious web requests in real-time.
Alibaba Cloud WAF also provides real-time defense based on predefined protection policies to ensure the security and availability of websites and apps.
Main Technologies of Alibaba Cloud WAF
- Dual-engine detection based on regular expression and AI.
- Anti-crawler based on the real-time anti-bot model and algorithm.
- Threat intelligence based on big data and one-click blocking of millions of blacklisted crawler IP addresses.
- Data breach protection
- Storage and intelligent retrieval of massive logs
Features and Benefits of Alibaba Cloud WAF
Alibaba Cloud WAF meets a range of requirements of both on-premises and off-premises users ranging from web security, CC attack protection, application-layer load balancing, and throttling, to service security and data risk control. Alibaba Cloud WAF integrates a series of technological innovations based on the traditional WAF architecture.
In addition to web attack protection, CC attack protection, and webpage tampering prevention, Alibaba Cloud WAF provides the following new features:
- Service security protection, covering malicious queries and seat occupation through online ticketing systems, forum spam posts, malicious registration, and high-risk payment.
- Client SDK security linkage, with no need to modify the server logic.
- Attack detection through semantic analysis and deep learning in neural networks.
- Anomaly detection based on probability analysis of request content types.
- Analysis and tracing of targeted hacker threat intelligence.
- Malicious crawler protection.
- Detection of and protection from sensitive information breaches.
- Linkage of tens of millions of databases that store malicious IP address entries.
- Data risk control to protect mobile numbers, bank cards, and identity information.
- Performance analysis of website services.
- Storage and custom analysis of massive access logs and attack logs.
- WAF app store to allow one-click enablement of the security features of third-party SaaS providers.
- Cloud-based access and central management across cloud environments.
Alibaba Cloud WAF is available as SaaS, with access nodes all over the world. Its international edition supports global synchronization of settings and smart access from nearby nodes in any country.
Innovative Methods of Threat Detection and Interception
1) Real-time Analysis and Interception Based on Deep Learning
Alibaba Cloud WAF uses convolutional neural networks that treat the text in HTTP requests as images and are trained on samples of different attack types, which eliminates the need to manually extract and maintain features. Adding more samples further improves the detection capability of the models.
The discrete GPU processing platform reduces latency to less than 1.5 ms through model optimization and inference engine optimization. In normal cases, latency is more than 5 ms for common platforms.
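As a rough illustration of treating request text as an image, the sketch below packs the bytes of a request into a fixed-size grid of pixel intensities that a convolutional network could take as input. The article does not describe the actual encoding, so the function name `request_to_image`, the 16x16 default grid, and the zero padding are all assumptions.

```python
def request_to_image(text: str, width: int = 16, height: int = 16) -> list[list[int]]:
    """Encode request text as a height x width grid of byte values (0-255).

    Hypothetical encoding: the text is truncated or zero-padded to exactly
    width * height bytes, then cut into rows.
    """
    size = width * height
    data = text.encode("utf-8", errors="replace")[:size]
    data = data + bytes(size - len(data))  # pad with zero bytes
    return [list(data[row * width:(row + 1) * width]) for row in range(height)]


image = request_to_image("GET /?q=<script>", width=4, height=4)
for row in image:
    print(row)
```

Each row of the grid is a run of consecutive characters, so a payload such as `<script>` shows up as a local pattern, which is the kind of structure convolutional filters are good at picking up.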
2) Data Risk Control and Service Security Defense
Alibaba Cloud WAF uses real-time response script injection to insert scripts into responses, which eliminates the need to modify your service logic for access.
Big data risk control and bot detection capabilities are also integrated in the cloud.
3) Intelligent CC Attack Protection
Alibaba Cloud WAF creates baseline models for the normal traffic of all users and detects abnormal traffic in real-time to identify CC attacks. Alibaba Cloud WAF also automatically creates regular expression rules for generating and delivering decision-making actions. This simplifies the previous experience-based configuration of CC attack protection rules, makes it easier for you to complete the learning process, and eliminates false positives and false negatives.
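The idea of a traffic baseline can be sketched with a simple statistical model. The example below is a minimal illustration, not the product's algorithm: it assumes request counts are collected per fixed interval, and the function names and the threshold of three standard deviations are hypothetical choices.

```python
from statistics import mean, pstdev


def build_baseline(counts):
    """Summarize normal traffic as (mean, standard deviation) of request counts."""
    return mean(counts), pstdev(counts)


def is_abnormal(count, baseline, threshold=3.0):
    """Flag an interval whose request count is far above the baseline."""
    mu, sigma = baseline
    if sigma == 0:
        # Perfectly flat history: anything above it counts as a deviation.
        return count > mu
    return (count - mu) / sigma > threshold


# Requests per interval observed during normal operation (made-up numbers).
baseline = build_baseline([100, 110, 90, 105, 95])
print(is_abnormal(104, baseline))  # within normal variation
print(is_abnormal(500, baseline))  # far above the baseline
```

Only the upper side is checked because a CC attack shows up as a surge in requests; a real system would also have to account for daily and weekly traffic cycles.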
4) Analysis of Abnormal Requests Based on the Hidden Markov Model
Alibaba Cloud WAF normalizes and maps the text of request parameters in normal traffic, builds hidden Markov model (HMM) probability models for string distribution and string length, and intercepts in real time any requests whose parameters deviate from the normal probability, identifying them as attacks.
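The probability-based approach can be illustrated with a simplified stand-in. The sketch below uses a plain first-order Markov chain over normalized characters rather than a hidden Markov model, and it models only the character distribution, not string length. The class name `MarkovModel`, the normalization scheme, and the smoothing are assumptions made for illustration.

```python
import math
from collections import defaultdict


def normalize(ch):
    """Map letters to 'a' and digits to '0'; keep other characters as they are."""
    if ch.isalpha():
        return "a"
    if ch.isdigit():
        return "0"
    return ch


class MarkovModel:
    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))
        self.vocab = {"$"}

    def train(self, values):
        """Count symbol-to-symbol transitions in legitimate parameter values."""
        for value in values:
            seq = ["^"] + [normalize(c) for c in value] + ["$"]
            self.vocab.update(seq[1:])
            for a, b in zip(seq, seq[1:]):
                self.counts[a][b] += 1

    def avg_log_prob(self, value):
        """Average log-probability per transition, with add-one smoothing."""
        seq = ["^"] + [normalize(c) for c in value] + ["$"]
        slots = len(self.vocab) + 1  # one extra slot for unseen symbols
        total = 0.0
        for a, b in zip(seq, seq[1:]):
            row = self.counts.get(a, {})
            total += math.log((row.get(b, 0) + 1) / (sum(row.values()) + slots))
        return total / (len(seq) - 1)


model = MarkovModel()
model.train(["alice", "bob42", "carol", "dave7"])
print(model.avg_log_prob("erin9"))         # close to the training data
print(model.avg_log_prob("' OR '1'='1"))   # much less likely under the model
```

A request would be intercepted when its score falls below a threshold learned from normal traffic. Because the model is trained only on legitimate values, it needs no attack samples.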
5) Semantic Analysis and Interception Engine
False positives often occur when SQL injection attacks and XSS attacks are detected based on keyword-specific regular expressions, which are not effective in detecting and preventing advanced attacks, such as comment deformation and string syntax deformation.
The semantic analysis and interception engine combines semantic and syntactic analysis of SQL and XSS statements with threat level analysis to detect and intercept advanced deformation attacks.
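To show why token-level analysis is more robust than keyword regular expressions, here is a toy sketch. It is not the engine described above: it only strips inline SQL comments, tokenizes the payload, and looks for one classic pattern, an `OR x = x` tautology. The names `tokenize` and `has_tautology` are hypothetical.

```python
import re

COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)   # inline /* ... */ comments
TOKEN = re.compile(r"\w+|[^\s\w']")             # words or single symbols; quotes dropped


def tokenize(payload):
    """Remove comments, lowercase, and split the payload into tokens."""
    cleaned = COMMENT.sub(" ", payload).lower()
    return TOKEN.findall(cleaned)


def has_tautology(payload):
    """Return True if the token stream contains `or X = X`."""
    tokens = tokenize(payload)
    for i in range(len(tokens) - 3):
        if tokens[i] == "or" and tokens[i + 2] == "=" and tokens[i + 1] == tokens[i + 3]:
            return True
    return False


print(has_tautology("1' OR/**/1=1"))   # comment used instead of whitespace
print(has_tautology("' oR 'a'='a"))    # mixed case, string operands
print(has_tautology("name=Oregon"))    # benign value that contains "or"
```

A keyword pattern such as `or\s+1=1` misses the first two payloads, while a bare match on `or` would flag the benign third one. Working on tokens avoids both problems for this narrow case; a production engine would need a real SQL and HTML grammar.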
6) Behavior Analysis Engine
Traditional WAF detection engines identify attacks based on specific attack features and fail to detect exceptions at the service layer, such as ticket brushing, red envelope snatching, and malicious seat occupation.
The behavior analysis engine of Alibaba Cloud WAF defines and identifies key behaviors in requests. The engine identifies service-layer exceptions by analyzing behavior context information, such as behavior distribution, historical features of individual behaviors, behavior jump probability, stay duration, and distribution features across time and regions. According to actual online service tests, verification codes and slider pop-ups are no longer needed in 99.8% of scenarios, which improves user experience.
7) Global Distributed Throttling
The traditional token bucket mechanism is an effective throttling method for individual devices. However, it does not work for distributed throttling of off-premises services and apps across devices, clusters, and regions. The global distributed throttling system scalably manages resources on a global scale, with low latency, based on the distributed protocols and the scheme of estimation, lease, and action.
The system is designed with the “match” interface and “action” interface to minimize the impact on user experience due to throttling. Throttling can be based on user traffic value or wait time.
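The article does not spell out the protocol behind the estimation, lease, and action scheme, so the following is only a generic sketch of the lease idea: a coordinator owns the global quota and hands out small leases, and each node spends its lease locally without contacting the coordinator on every request. The classes `Coordinator` and `Node`, the single time window, and the absence of lease expiry are all simplifying assumptions.

```python
class Coordinator:
    """Holds the global request quota for one time window."""

    def __init__(self, global_limit):
        self.remaining = global_limit

    def lease(self, requested):
        """Grant up to `requested` tokens from what is left of the quota."""
        granted = min(requested, self.remaining)
        self.remaining -= granted
        return granted


class Node:
    """A forwarding node that throttles locally using leased tokens."""

    def __init__(self, coordinator, lease_size):
        self.coordinator = coordinator
        self.lease_size = lease_size
        self.tokens = 0

    def allow(self):
        if self.tokens == 0:
            # Local lease exhausted: ask the coordinator for a new one.
            self.tokens = self.coordinator.lease(self.lease_size)
        if self.tokens == 0:
            return False  # global quota used up
        self.tokens -= 1
        return True


coordinator = Coordinator(global_limit=5)
node_a = Node(coordinator, lease_size=3)
node_b = Node(coordinator, lease_size=3)
decisions = [node_a.allow() for _ in range(4)] + [node_b.allow() for _ in range(3)]
print(decisions)
```

The lease size is the main tuning knob in this sketch: larger leases mean fewer round trips to the coordinator but a less even split of the quota between nodes.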
8) Interception Based on the Cloud and SDK Technologies
Deployed at the gateway end, traditional WAFs cannot directly obtain information from clients to implement strong authentication. Alibaba Cloud WAF works with security SDKs and integrates terminal fingerprints, cloud-based threat detection, and man-machine interaction (through manual slider dragging and verification code input) to provide strong authentication and communication tunnel encryption.
9) Cache-Free Detection
Traditional WAFs consume a large amount of memory in high-concurrency scenarios because they cache the data being inspected. Alibaba Cloud WAF avoids data caching by keeping only a snapshot of the state machine's status during the detection process. It supports deep detection of more than 1 GB of data, compared to a detection depth of less than 100 MB for commercial WAFs.
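The principle of keeping only the state of a state machine, rather than the data itself, can be sketched with a streaming pattern matcher. The example below is an assumption-laden illustration, not the product's engine: it matches a single byte pattern with the Knuth-Morris-Pratt automaton, and the class name `StreamMatcher` is hypothetical.

```python
def build_failure(pattern: bytes) -> list[int]:
    """KMP failure table: longest proper prefix that is also a suffix."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


class StreamMatcher:
    """Scans a stream chunk by chunk; memory use is independent of stream size."""

    def __init__(self, pattern: bytes):
        if not pattern:
            raise ValueError("pattern must not be empty")
        self.pattern = pattern
        self.fail = build_failure(pattern)
        self.state = 0      # number of pattern bytes matched so far
        self.found = False

    def feed(self, chunk: bytes) -> bool:
        for byte in chunk:
            while self.state and byte != self.pattern[self.state]:
                self.state = self.fail[self.state - 1]
            if byte == self.pattern[self.state]:
                self.state += 1
            if self.state == len(self.pattern):
                self.found = True
                self.state = self.fail[self.state - 1]
        return self.found


matcher = StreamMatcher(b"<script")
print(matcher.feed(b"abc<scr"))    # pattern only partly seen so far
print(matcher.feed(b"ipt>alert"))  # match completes across the chunk boundary
```

Between chunks the matcher holds a single integer, so a payload split across packets is still detected without buffering earlier chunks.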
10) Real-Time Response Script Injection Technology for Modifying Traffic Data
Alibaba Cloud WAF uses its detection engine to modify the content of processed traffic based on HTML tags by dynamically inserting new elements and replacing existing traffic data. This allows you to modify your business logic and insert executable code without modifying the server code.
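A minimal sketch of tag-based response rewriting is shown below. It inserts a script element just before the closing `</body>` tag of an HTML response. The function name `inject_script` and the URL `/waf/sdk.js` are hypothetical, and a real engine would rewrite the response stream rather than a complete string.

```python
import re
from html import escape

BODY_CLOSE = re.compile(r"</body\s*>", re.IGNORECASE)


def inject_script(page: str, script_url: str) -> str:
    """Insert a <script> tag before the last </body>, or append it if none exists."""
    tag = f'<script src="{escape(script_url, quote=True)}"></script>'
    matches = list(BODY_CLOSE.finditer(page))
    if not matches:
        return page + tag
    cut = matches[-1].start()
    return page[:cut] + tag + page[cut:]


original = "<html><body><p>hi</p></BODY></html>"
print(inject_script(original, "/waf/sdk.js"))
```

Because the change happens in the response on its way to the browser, the origin server's code stays untouched, which matches the goal stated above.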
11) Active Defense Model
Alibaba Cloud WAF optimizes protection by actively learning the traffic of domain names to determine which types of traffic can be added to a whitelist. Alibaba Cloud WAF normalizes the valid URLs and parameters in user traffic and expresses them through regular expressions that are automatically generated based on a model.
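The following sketch shows one way such whitelist rules could be derived from observed traffic. It is an assumption-based illustration rather than the product's learning algorithm: for each path and parameter it picks the narrowest of three hypothetical character classes that covers every observed value and records the observed length range as a regular expression.

```python
import re
from collections import defaultdict
from urllib.parse import parse_qsl, urlsplit

# Candidate character classes, from narrowest to widest (hypothetical choice).
CLASSES = [r"[0-9]", r"[A-Za-z0-9]", r"[A-Za-z0-9_.\-]"]


def narrowest_class(values):
    """Return the first class that matches every value, or None."""
    for cls in CLASSES:
        if all(re.fullmatch(cls + "*", v) for v in values):
            return cls
    return None


def learn_whitelist(urls):
    """Build {(path, parameter): compiled regex} from legitimate URLs."""
    samples = defaultdict(list)
    for url in urls:
        parts = urlsplit(url)
        for name, value in parse_qsl(parts.query, keep_blank_values=True):
            samples[(parts.path, name)].append(value)
    rules = {}
    for key, values in samples.items():
        cls = narrowest_class(values)
        if cls is not None:
            lo = min(len(v) for v in values)
            hi = max(len(v) for v in values)
            rules[key] = re.compile(f"{cls}{{{lo},{hi}}}")
    return rules


def is_whitelisted(rules, url):
    """True only if every parameter is known and matches its learned rule."""
    parts = urlsplit(url)
    params = parse_qsl(parts.query, keep_blank_values=True)
    # Requests without parameters are left to other layers in this sketch.
    return bool(params) and all(
        (parts.path, name) in rules and rules[(parts.path, name)].fullmatch(value)
        for name, value in params
    )


rules = learn_whitelist(["/item?id=12", "/item?id=4051"])
print(rules[("/item", "id")].pattern)
print(is_whitelisted(rules, "/item?id=777"))
print(is_whitelisted(rules, "/item?id=1%20OR%201=1"))
```

Traffic that matches a learned rule can bypass deeper inspection, while anything that does not match falls through to the gray and black layers for further analysis.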
Innovative Method of Security Event Analysis
1. Hacker Tracing
Alibaba Cloud WAF analyzes the attack chain of an individual hacker and captures real-world targeted web attacks by persistently tracking the hacker's attack sessions and paths.
2. Big Data System Linkage and Malicious IP Address Intelligence System
Alibaba Cloud WAF analyzes the features of traffic logs to mine intelligence about malicious IP addresses, such as proxy, crawler, and zombie IP addresses. The malicious IP address intelligence system and the cloud protection engine are linked for collaborative defense.
3. Full Log Storage, Analysis, and Retrieval
Based on the Apsara big data infrastructure, Alibaba Cloud WAF fully stores processed data at the petabyte level with user consent, quickly customizes real-time analysis and reports based on custom statistic statements, and imports the stored data as a data source to users’ security data analysis systems.
4. Customer Service Quality Analysis
Alibaba Cloud WAF analyzes the actual status and quality of customer services based on the service-returned values, latency, and access distribution. This provides performance optimization recommendations to customers.
5. Live Dashboard
Alibaba Cloud WAF provides a live dashboard based on big data analysis and 3D data presentation and rendering, which shows the interception and alert status. The dashboard allows users to determine security threats in real-time. The dashboard can be projected by using a web client or YunOS smart device.
Technical Architecture Innovations
- Large-scale distributed forwarding clusters at the application layer
- In-depth application-layer defense system
- Central online and offline security management
- Management APIs
- Separation of the security and forwarding planes, and a service sandbox
Alibaba Cloud WAF: Awards and Recognitions
- Alibaba Cloud is the only manufacturer in China to be included in the 2019 Gartner Magic Quadrant for WAF and was included in the 2018 Gartner report for the Asia-Pacific region.
- According to Frost & Sullivan, Alibaba Cloud WAF has had the largest market share among cloud WAFs in Greater China for two consecutive years.
- Alibaba Cloud WAF was ranked number one by the CNCERT for innovative network security products in 2018.
- The anti-bot capability of Alibaba Cloud WAF enters the first division of Forrester’s global technology evaluation.
- Alibaba Cloud WAF was granted the “Cloud Security Product and Service of the Year” title at the FreeBuf 2016 Internet Security Technology & Innovation Summit.
- Alibaba Cloud WAF won the Apsara Award and Top Cloud Connect Award of Alibaba Cloud in 2017 and 2018.
About the Judges
Jin Xiangyu, Founder of the Sec-UN Website and Founder of Threat Intelligence
As AI enters a new stage of application, the industry pays more attention to how well services are implemented than to the basic technologies and platforms. Alibaba Cloud integrates AI into its mature WAF product to extend the rule- and feature-based detection function of traditional WAFs to anomaly detection, attack detection, fault warning, active vulnerability detection, and active false positive detection.
Alibaba Cloud WAF turns passive defense into active defense and is widely adopted by users of Alibaba Cloud. The AI-driven intelligent defense system of Alibaba Cloud WAF is one of the substantive innovations in the cybersecurity field and one of my recommended projects for WitAwards this year.
Hui Zhibin, Director of the Internet Research Center of Shanghai Academy of Social Sciences and Chief Researcher at Cyber Research Institute
Based on layered traffic management and targeted protection, the AI kernel of Alibaba Cloud WAF is designed with innovative technology to automatically create intelligent protection rules or models that are adapted to the services of protected websites. Alibaba Cloud is the only manufacturer in China to be included in the 2019 Gartner Magic Quadrant for WAF and was included in the 2018 Gartner report for the Asia-Pacific region as well. Alibaba Cloud leads the cloud computing market in China with the most extensive external attack scenarios.
According to Frost & Sullivan, Alibaba Cloud WAF has had the largest market share among cloud WAFs in Greater China for two consecutive years. The AI-driven intelligent defense system of Alibaba Cloud WAF can continuously evolve its learning capability and has positive prospects for technical application, which deserves the attention of the industry.