Building your own network protection with Hybrid WAF and iptables

23 Apr 2025

In this article I’ll introduce several types of firewalls, specifically IPS/IDS, WAF and iptables, and show how to implement your own WAF and iptables rules with Python.

Animated representation of firewall (taken from: exa.net.uk)

Code in this article is publicly available right here:

GitHub - BitR13x/AI-WAF: Web Firewall that is powered by AI and signatures

Theory comes first: what is a firewall?

It is usually a hardware or software-based system that monitors all incoming and outgoing traffic and, based on a defined set of security rules, accepts, rejects, or drops that traffic. A web application firewall (WAF) is simply a firewall for web applications.

To protect the internal network from unauthorized traffic, we need a firewall.
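To make "a defined set of security rules" concrete, here is a minimal sketch of first-match rule evaluation. The `Rule` class, the `decide` function and the sample rules are hypothetical names made up for illustration; real firewalls such as iptables match on many more fields.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass(frozen=True)
class Rule:
    action: str                 # "accept", "reject" or "drop"
    source: str                 # CIDR block the rule applies to
    port: Optional[int] = None  # None matches any destination port


def decide(rules, src_ip, dst_port, default="drop"):
    """Return the action of the first matching rule, or the default policy."""
    for rule in rules:
        in_network = ip_address(src_ip) in ip_network(rule.source)
        port_matches = rule.port is None or rule.port == dst_port
        if in_network and port_matches:
            return rule.action
    return default


# Hypothetical policy: SSH only from the internal range, HTTPS from anywhere.
RULES = [
    Rule("accept", "10.0.0.0/8", 22),
    Rule("reject", "0.0.0.0/0", 22),
    Rule("accept", "0.0.0.0/0", 443),
]

print(decide(RULES, "10.1.2.3", 22))
print(decide(RULES, "8.8.8.8", 80))
```

Order matters in this sketch: the internal SSH rule has to come before the catch-all reject, otherwise it would never be reached.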

How about IPS/IDS?

Intrusion Detection System (IDS): looks for signatures of known attack types, or for activity that deviates from a defined baseline of normal behavior, and reports it.

Intrusion Prevention System (IPS): works the same way, but it can also block the offending packet or request.
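The difference can be sketched in a few lines. This is only an illustration: the `SIGNATURES` tuple and the `inspect` function are hypothetical, and a real engine such as Suricata works on parsed protocol data rather than raw strings.

```python
# Hypothetical, tiny signature set used only for this example.
SIGNATURES = ("union select", "<script", "../")


def matches_signature(payload: str) -> bool:
    lowered = payload.lower()
    return any(signature in lowered for signature in SIGNATURES)


def inspect(payload: str, mode: str, alerts: list) -> bool:
    """Return True if the traffic is allowed through.

    mode="ids": suspicious traffic is reported but still passes.
    mode="ips": suspicious traffic is reported and blocked.
    """
    if not matches_signature(payload):
        return True
    alerts.append(payload)  # both modes report the detection
    return mode == "ids"    # only the IPS actually stops the traffic


alerts = []
print(inspect("/?id=1 UNION SELECT password", "ids", alerts))
print(inspect("/?id=1 UNION SELECT password", "ips", alerts))
print(len(alerts))
```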

Example of an open-source IPS/IDS: Suricata

Interesting sources related to IPS/IDS:
Suricata cheatsheet
nDPI: Deep Packet Inspection
Intrusion-Detection-CICIDS2017: in-depth analysis of the CICIDS2017 dataset

Comparison of IPS/IDS and firewall:

The line between them is blurry, so I will not go deeper: you could argue that a WAF or a set of iptables rules that detects and blocks attacks is already more than just a firewall.

Let’s move on: how do we implement this with iptables?

By combining iptables and Python, you can create a custom intrusion detection and prevention system (IDS/IPS) that can intercept network packets before they reach the target service.

Using Python libraries such as NetfilterQueue or Scapy, you can:

Analyze packets in real time
Detect suspicious patterns
Perform an automated action (e.g. dropping a packet or blocking an IP address).

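For example, here is a Python script that drops every packet whose source IP address is not listed in whitelist.txt:

from netfilterqueue import NetfilterQueue
from scapy.all import IP


def load_whitelist(path="./whitelist.txt"):
    # One IP address per line; blank lines are ignored
    with open(path) as file:
        return {line.strip() for line in file if line.strip()}


def process_packet(packet):
    scapy_packet = IP(packet.get_payload())
    # Re-read the file on every packet so edits apply immediately (simple, but slow)
    if scapy_packet.src not in load_whitelist():
        print(f"❌ Dropping Packet: {scapy_packet.summary()}")
        packet.drop()
        return  # a packet must receive exactly one verdict

    packet.accept()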


# Start Netfilter Queue
nfqueue = NetfilterQueue()
nfqueue.bind(1, process_packet)

print("Monitoring Traffic in Real-Time")
try:
    nfqueue.run()
except KeyboardInterrupt:
    print("\nStopping Firewall...")
    nfqueue.unbind()

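It works through NFQUEUE, an iptables target that hands packets to a userspace program, which returns a verdict for each one. In simple terms, you get access to the packet from Python. You need an iptables rule that sends traffic to the queue number the script binds to (1 here); use -I to add the rule, and the same command with -D to remove it afterwards:

sudo iptables -I INPUT -p tcp --dport 80 -j NFQUEUE --queue-num 1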

And you can do pretty neat stuff with this.
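One example of what the analyze, detect, act loop could look like is a per-source rate limit. The sketch below is an assumption of mine, not code from the repository: `RateTracker` and `block_command` are hypothetical names, the thresholds are arbitrary, and the command is only built, not executed.

```python
from collections import defaultdict, deque


class RateTracker:
    """Counts packets per source IP inside a sliding time window."""

    def __init__(self, max_packets=100, window_seconds=10.0):
        self.max_packets = max_packets
        self.window_seconds = window_seconds
        self._seen = defaultdict(deque)  # source IP -> recent timestamps

    def is_flooding(self, src_ip, now):
        timestamps = self._seen[src_ip]
        timestamps.append(now)
        # Forget packets that have fallen out of the window.
        while timestamps and now - timestamps[0] > self.window_seconds:
            timestamps.popleft()
        return len(timestamps) > self.max_packets


def block_command(src_ip):
    """Build (but do not run) an iptables rule that drops traffic from src_ip."""
    return ["iptables", "-I", "INPUT", "-s", src_ip, "-j", "DROP"]


tracker = RateTracker(max_packets=3, window_seconds=1.0)
for moment in (0.0, 0.1, 0.2, 0.3):
    if tracker.is_flooding("203.0.113.7", moment):
        print(" ".join(block_command("203.0.113.7")))
```

Inside process_packet you could call is_flooding with scapy_packet.src and time.time(), then pass the command list to subprocess.run as root. Returning a list instead of a shell string avoids shell injection through a crafted address.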

Implementation of WAF

We need to intercept HTTP(S) requests before they are delivered to the target server. This gives us the ability to analyze the content of the request and decide if it contains potentially malicious code.

I implemented this step as a reverse proxy written in Python on top of the standard library’s http.server module (HTTPServer); the mitmproxy library is a possible alternative.

The reverse-proxy acts as an intermediary between the user and the server. All traffic goes through this proxy first, which then decides whether to forward the request.

(The whole code is in the repository mentioned in the introduction)

This is how we forward the request:

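def do_GET(self, body=True):
  url = f'{protocol}{hostname}{self.path}'
  # verify=False skips TLS certificate checks; only acceptable for a trusted upstream
  resp = requests.get(url, headers=dict(self.headers), verify=False)
  # Send a status line and end the headers before writing the body
  self.send_response(resp.status_code)
  self.end_headers()
  self.wfile.write(resp.content)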

If we want to check the URL, we simply do something like this:

def do_GET(self, body=True):
  url = f'{protocol}{hostname}{self.path}'
  if verify_url(self.path): # we know that hostname is safe (from init)
    resp = requests.get(url, headers=dict(self.headers), verify=False)
    # Send a status line and end the headers before writing the body
    self.send_response(resp.status_code)
    self.end_headers()
    self.wfile.write(resp.content)
  else:
    self.send_error(403, "Not allowed")

Signature-based

A signature-based WAF is a security mechanism designed to detect known patterns or signatures of malicious activity.

I’ve collected rules from coreruleset, user-agents and Suricata.

I then check whether any of those strings appears in the request or response. The code calls re.search on the escaped line, which amounts to a literal substring match:

    def __get_files_in_dir(self, dir_path: str) -> list:
        if os.path.isdir(dir_path):
            return os.listdir(dir_path)
        else:
            return []

    def __search_string_in_file(self, file_path: str, string: str) -> bool:        
        with open(file_path, "r") as f:
            for line in f:
                # empty line or comment
                if line[0] == "#" or line[0] == "\n":
                    continue
                
                # remove the trailing newline (the last line of a file may not have one)
                line = line.rstrip("\n")

                if re.search(re.escape(line), string):
                    return True # packet dangerous

        return False

    def __verify_signature(self, string: str, var: str) -> bool:
        files = self.__get_files_in_dir(self.signatures_paths[var])
        if len(files) > 0:
            for file in files:
                file_path = os.path.join(relative_path(self.signatures_paths[var]), file)
                if self.__search_string_in_file(file_path, string):
                    return False
        
        # returning True if path does not exist!
        return True
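The functions above re-read every signature file for each request. One possible optimization, sketched here under my own assumptions, is to load all signatures into memory once at startup. `load_signatures` and `is_clean` are hypothetical helpers and not part of the repository; they keep the same rules for comments and empty lines.

```python
from pathlib import Path


def load_signatures(directory) -> set:
    """Read every file in the directory once and collect its signature lines."""
    signatures = set()
    root = Path(directory)
    if not root.is_dir():
        return signatures  # missing directory means no signatures
    for path in sorted(root.iterdir()):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for line in text.splitlines():
            # Skip empty lines and comments, as in the original check.
            if line and not line.startswith("#"):
                signatures.add(line)
    return signatures


def is_clean(text: str, signatures: set) -> bool:
    """Return True when no known signature occurs in the text."""
    return not any(signature in text for signature in signatures)
```

The trade-off is that edits to the signature files only take effect after a reload, whereas the per-request version picks them up immediately.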

AI-based

An AI-powered web application firewall uses machine learning to detect and block web attacks in real time.

I used fwaf-dataset and PayloadsAllTheThings to train two models using PyTorch.

I didn’t create a new model, but fine-tuned distilbert-base-uncased (you can use Llama or any other model suited to classification).

    def verify_url(self, url: str) -> bool:
        inputs = self.tokenizer(url,
            truncation=True,
            padding=True,
            return_tensors="pt",
            max_length=500
        ).to(self.device)

        with torch.no_grad():
            outputs = self.url_model(**inputs)

        probabilities = torch.nn.functional.softmax(outputs.logits, dim=-1)
        predicted_class_index = torch.argmax(probabilities, dim=-1).item()

        predicted_class_name = self.url_labels[predicted_class_index]
        # .cpu() is needed before .numpy() when the model runs on a GPU
        logging.info(f"Probabilities for each class: {probabilities.cpu().numpy()}: {predicted_class_name}")

        if probabilities.max().item() < self.probability_catch:
            # We are not that certain
            return True

        return predicted_class_name == "goodqueries"
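The decision logic at the end of verify_url can be isolated from the model. Here is a minimal sketch in plain Python, without PyTorch, of the same idea: apply softmax to the logits, allow the request when the model is not confident enough, and otherwise follow the predicted label. The `allow_request` name, the 0.8 threshold and the "badqueries" label are assumptions for illustration; only "goodqueries" appears in the code above.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def allow_request(logits, labels, threshold=0.8, safe_label="goodqueries"):
    probabilities = softmax(logits)
    best = max(range(len(labels)), key=probabilities.__getitem__)
    if probabilities[best] < threshold:
        # Not confident enough: let the request through (fail open).
        return True
    return labels[best] == safe_label


LABELS = ["goodqueries", "badqueries"]  # "badqueries" is an assumed label name
print(allow_request([0.1, 0.2], LABELS))
print(allow_request([0.0, 5.0], LABELS))
```

Failing open keeps false positives low but lets uncertain attacks through. A stricter deployment could return False in the low-confidence branch, or hand those requests to the signature check.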

Main disadvantages

Complexity: setting up and maintaining a firewall can be time-consuming and difficult
Limited flexibility for developers
Limited adaptability: signature detection is rule-based and only catches known patterns, while the AI model’s verdicts cannot be fully trusted

Possible upgrade to this project

One option is anomaly-based detection.
Once we gather enough data to define normal network flow, we can take action on anomalies.
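As a rough illustration of the idea, the sketch below builds a baseline from observed request rates and flags values that lie far from the mean (a z-score test). The function names, the threshold of 3 standard deviations and the sample numbers are all assumptions; a real system would track several features, not a single rate.

```python
from statistics import mean, stdev


def build_baseline(samples):
    """Summarize normal traffic as (mean, sample standard deviation)."""
    return mean(samples), stdev(samples)


def is_anomalous(value, baseline, z_threshold=3.0):
    mu, sigma = baseline
    if sigma == 0:
        # Perfectly constant baseline: any deviation counts as an anomaly.
        return value != mu
    return abs(value - mu) / sigma > z_threshold


# Hypothetical requests-per-minute measurements from a quiet period.
baseline = build_baseline([100, 110, 90, 105, 95])
print(is_anomalous(104, baseline))
print(is_anomalous(400, baseline))
```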

Abstract example of statistically represented network flow in graph

Conclusion

Creating a comprehensive security system for a server environment is not only possible, but also highly effective when using a combination of multiple layers of protection, including WAF and iptables.

A good security architecture can significantly reduce the risk of successful cyber-attacks such as XSS, XXE, CSRF, DDoS or SQL injection.

If you enjoyed this article, clap and follow me! Thanks for reading, and I wish you the best of luck on your journey👋.

Building your own network protection with Hybrid WAF and iptables was originally published in System Weakness on Medium, where people are continuing the conversation by highlighting and responding to this story.