CensorLab Reference Documentation

This document covers general usage of CensorLab, the configuration file format, the Python Censor Language (PyCL) API, and the CensorLang DSL.


General Usage

CensorLab is a censorship emulation testbed that intercepts network packets and processes them through configurable layers with optional Python scripts or ML models for custom censorship logic.

Installation

Docker is the fastest way to get started: no Rust toolchain or system dependencies are required.

git clone https://github.com/SPIN-UMass/censorlab.git
cd censorlab
git submodule update --init

# Interactive shell
bash docker/censorlab.sh --shell

# Run directly (NFQ mode — requires sudo for live interception)
sudo bash docker/censorlab.sh -c demos/dns_blocking/censor.toml nfq

# Run directly (PCAP mode — no special permissions needed)
bash docker/censorlab.sh -c demos/dns_blocking/censor.toml pcap traffic.pcap 192.168.1.100

The Docker wrapper auto-detects NFQ vs PCAP mode and configures networking and capabilities accordingly. The image is automatically rebuilt when source files change.

Build from Source

You need a Rust toolchain. Nix users can run nix develop for a complete environment.

git clone https://github.com/SPIN-UMass/censorlab.git
cd censorlab
git submodule update --init

# Build (release mode recommended for performance)
cargo build --release

# Build with wire mode support
cargo build --release --features wire

# Set required network capabilities
sudo ./set_permissions.sh

# Run tests
cargo test --verbose

The set_permissions.sh script grants CAP_NET_ADMIN and CAP_NET_RAW capabilities to the binary, which are required for packet interception.

Pre-built VM

Pre-built VM images with everything pre-installed are available on the VM Info page. They are useful for classroom environments or fully isolated setups.

Environment Setup

For accurate packet data, disable hardware offloading on the network interface CensorLab will use:

sudo ethtool -K eth0 tso off gro off gso off lro off

Replace eth0 with your actual interface name.

Running CensorLab

CensorLab has three execution modes: NFQ (netfilter queue), PCAP (offline analysis), and Wire (inline bridge). Each mode is selected as a subcommand.

Quick Start

# Run with a Python script in NFQ mode
censorlab -p censor.py nfq

# Run with a full configuration file
censorlab -c censor.toml nfq

# Analyze a saved packet capture
censorlab -c censor.toml pcap capture.pcap 192.168.1.100

Global Options

Flag                      Description
-c, --config-path <PATH>  Path to the TOML configuration file
-p, --program <PATH>      Path to a censor script (overrides the one in config)
-v, --verbosity <LEVEL>   Log verbosity: trace, debug, info, warn, error (default: info)
--ipc-port <PORT>         Port for IPC commands

NFQ Mode

The most common mode. CensorLab hooks into the Linux netfilter queue to intercept live traffic. It automatically creates iptables rules on startup and removes them on shutdown.

censorlab -c censor.toml nfq [OPTIONS] [INTERFACE]

Flag                          Default          Description
--client-ip <IP>              (auto-detected)  IP address considered the "client" for direction calculation
--no-dir-action <ACTION>      ignore           Action for traffic without a determinable direction
--iptables-table <TABLE>      raw              iptables table to intercept at
--iptables-chain-in <CHAIN>   PREROUTING       iptables chain for inbound packets
--iptables-chain-out <CHAIN>  OUTPUT           iptables chain for outbound packets
--queue-num-in <NUM>          0                NFQUEUE number for inbound packets
--queue-num-out <NUM>         1                NFQUEUE number for outbound packets
--force-iptables                               Force rule insertion even if conflicting NFQUEUE rules exist
[INTERFACE]                   (auto-detected)  Network interface to use for sending packets

CensorLab determines packet direction by comparing source/destination addresses against the client IP. Traffic from the client IP is considered client-to-WAN; traffic to it is WAN-to-client.
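
The comparison described above can be sketched in plain Python. This is an illustration only, not CensorLab's implementation: the function name is hypothetical, and the 1 / -1 / 0 encoding is borrowed from the packet.direction attribute documented in the PyCL reference.

```python
import ipaddress

def packet_direction(src: str, dst: str, client_ip: str) -> int:
    """Classify a packet relative to the client IP.

    Returns 1 for client-to-WAN, -1 for WAN-to-client, and 0 when neither
    address is the client (direction cannot be determined).
    """
    client = ipaddress.ip_address(client_ip)
    if ipaddress.ip_address(src) == client:
        return 1
    if ipaddress.ip_address(dst) == client:
        return -1
    return 0

print(packet_direction("192.168.1.100", "93.184.216.34", "192.168.1.100"))  # 1
print(packet_direction("93.184.216.34", "192.168.1.100", "192.168.1.100"))  # -1
print(packet_direction("10.0.0.5", "10.0.0.6", "192.168.1.100"))            # 0
```

Traffic that falls into the third case is what --no-dir-action applies to.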

PCAP Mode

Analyzes a saved packet capture file offline. CensorLab processes each packet through the censor pipeline and logs what actions it would have taken.

censorlab -c censor.toml pcap <PCAP_PATH> <CLIENT_IP>

Argument     Description
<PCAP_PATH>  Path to the .pcap file to analyze
<CLIENT_IP>  Client IP address for direction calculation

Wire Mode

In wire mode, CensorLab sits between two network interfaces (WAN and client) and forwards packets between them. It can drop, delay, or modify packets inline. This mode requires the wire feature flag at build time.

cargo build --release --features wire
censorlab -c censor.toml wire <WAN_INTERFACE> <CLIENT_INTERFACE> [OPTIONS]

Argument / Flag         Default  Description
<WAN_INTERFACE>                  WAN-side network interface
<CLIENT_INTERFACE>               Client-side network interface
--wan-packets <NUM>     1        Max packets from WAN before polling client
--client-packets <NUM>  1        Max packets from client before polling WAN

Packet Processing Pipeline

Packets flow through a layered processing pipeline. Each layer can allow, drop, reset, or ignore packets before they reach the censor script:

Network Input (NFQ / Wire / PCAP)
  → Ethernet layer (MAC allowlist/blocklist)
    → ARP handling
      → IP layer (IP allowlist/blocklist)
        → ICMP handling
          → TCP/UDP layer (port allowlist/blocklist)
            → Censor script (Python or CensorLang)
              → Action (Allow / Drop / Reset / Ignore)

If a packet matches a blocklist at any layer, the configured action is taken immediately and the packet does not reach the censor script. Allowlists work inversely — only listed values pass through.
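
As a rough model of the blocklist behaviour described above, the sketch below walks the layers in pipeline order and stops at the first matching list. It is a simplification under stated assumptions: the layer names, the one-value-per-layer lookup, and the function name are all hypothetical, and allowlists are left out. The action strings are the ones listed under Actions in the configuration reference.

```python
# Layer order follows the pipeline diagram; each blocklist is (values, action).
LAYERS = ("ethernet", "ip", "transport")

def first_blocklist_action(packet_keys, blocklists):
    """Return the action of the first matching blocklist, or "None".

    packet_keys maps a layer name to the value extracted from the packet
    (a MAC address, an IP address, a port). "None" means no list matched
    and the packet would continue on to the censor script.
    """
    for layer in LAYERS:
        values, action = blocklists.get(layer, (frozenset(), "None"))
        if packet_keys.get(layer) in values and action != "None":
            return action
    return "None"

blocklists = {
    "ip": ({"198.51.100.1"}, "Drop"),
    "transport": ({80}, "Reset"),
}
print(first_blocklist_action({"ip": "198.51.100.1", "transport": 80}, blocklists))  # Drop
print(first_blocklist_action({"ip": "192.0.2.7", "transport": 443}, blocklists))    # None
```

In the first call the IP layer matches before the port is ever examined, which is why the result is Drop rather than Reset.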

Demos

The demos directory contains example scenarios, each with a censor.toml and associated scripts/models:

Demo                 Description
dns_blocking/        DNS poisoning: inject forged responses for blocked domains
http_blocking/       Block HTTP requests by keyword in the Host header
https_blocking/      Block HTTPS connections (simple)
https_blocking_tls/  Block HTTPS by TLS ClientHello SNI
ip_blocking/         Block traffic to/from specific IP addresses
quic_blocking/       Block QUIC connections by SNI
shadowsocks_gfw/     Detect Shadowsocks-like encrypted proxy traffic via entropy heuristics
mega_gfw/            Comprehensive GFW emulation combining 7 censorship techniques
model/               ML model-based traffic classification (ONNX)
null/                No-op passthrough (baseline)
print/               Debug logging of all packets

# DNS blocking
censorlab -c demos/dns_blocking/censor.toml nfq

# HTTP blocking
censorlab -c demos/http_blocking/censor.toml nfq

# Comprehensive GFW emulation (DNS + HTTP + TLS + QUIC + IP + SSH + entropy)
censorlab -c demos/mega_gfw/censor.toml nfq

Paths in censor.toml are relative to the TOML file itself, including paths to censor scripts and ML models.
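
The relative-path rule can be pictured with a small sketch. The helper name is hypothetical and the script name censor.py is only an illustrative entry; the point is that the TOML file's directory, not the current working directory, is the base.

```python
from pathlib import PurePosixPath

def resolve_from_config(config_path: str, entry: str) -> PurePosixPath:
    """Resolve a path found in censor.toml against the TOML file's directory."""
    return PurePosixPath(config_path).parent / entry

print(resolve_from_config("demos/dns_blocking/censor.toml", "censor.py"))
# demos/dns_blocking/censor.py
```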


Configuration Reference

CensorLab is configured via TOML files, passed with the -c flag. Paths to scripts and models within the config are resolved relative to the config file location.

Configuration file vs. censor script: The TOML configuration file (censor.toml) handles link-, network-, and transport-layer filtering: MAC addresses, IP addresses, ports, and IP:port pairs can be allowed or blocked without writing any code. The censor script (censor.py or censor.cl) handles application-layer logic, such as parsing DNS queries, inspecting TLS handshakes, running ML models, or any other custom packet analysis. The TOML file references the censor script via the [execution] section.

[execution]

Controls which scripting engine to use and global execution parameters.

Key           Type     Default   Description
mode          string   "Python"  Execution mode: "Python" or "CensorLang"
script        string   (none)    Path to the censor script (relative to config file)
hash_seed     integer  1337      Hash seed for Python VM reproducibility
reset_repeat  integer  5         Number of times to repeat sending a TCP RST packet

Layer Filtering

CensorLab processes packets through a layered pipeline. Each layer can have allowlists and blocklists that act before the script is invoked.

[ethernet]

Key        Type        Description
unknown    action      Action for unknown Ethernet frame types
allowlist  list block  MAC address allowlist
blocklist  list block  MAC address blocklist

[arp]

Key     Type    Description
action  action  Action for all ARP traffic (default: "None")

[ip]

Key        Type        Description
unknown    action      Action for unknown IP next-header protocols
allowlist  list block  IP address allowlist
blocklist  list block  IP address blocklist

IP addresses are strings like "192.168.1.1" or "::1".

[icmp]

Key     Type    Description
action  action  Action for all ICMP traffic (default: "None")

[tcp] / [udp]

Key                Type        Description
port_allowlist     list block  Port number allowlist
port_blocklist     list block  Port number blocklist
ip_port_allowlist  list block  IP:port pair allowlist
ip_port_blocklist  list block  IP:port pair blocklist

Port lists contain integers (e.g. [80, 443]). IP:port pairs are strings like "10.0.0.1:80" or "[::1]:443".

List Block Format

Allowlist and blocklist sections share a common format:

[ip.blocklist]
list = ["192.168.1.1", "10.0.0.0"]
action = "Drop"

Key     Type           Default  Description
list    array          []       Values to match against
action  action string  "None"   Action to take on match

[models.<name>]

Defines ONNX models available to censor scripts.

Key   Type    Description
path  string  Path to the .onnx.ml model file (relative to config file)

Actions

Actions control what happens to a packet at a given processing layer.

Action    Description
"None"    Continue processing through subsequent layers
"Ignore"  Forward the packet immediately, skip further processing
"Drop"    Silently drop the packet
"Reset"   Send TCP RST packets in both directions (only valid for TCP-layer lists)

"Reset" is only valid on [tcp] lists. Using it on [ethernet], [arp], [ip], [icmp], or [udp] sections produces a validation error at startup.

Example Configuration

[execution]
mode = "Python"
script = "censor.py"
hash_seed = 1337
reset_repeat = 5

[ip.blocklist]
list = ["198.51.100.1", "203.0.113.0"]
action = "Drop"

[tcp.port_blocklist]
list = [80]
action = "Reset"

[tcp]
ip_port_allowlist = { list = ["10.0.0.1:443"] }

[tcp.ip_port_blocklist]
list = ["93.184.216.34:443"]
action = "Reset"

[udp.port_blocklist]
list = [53]
action = "Drop"

[udp]
ip_port_allowlist = { list = [] }

[models.classifier]
path = "models/classifier.onnx.ml"

Note: "Reset" is only valid on [tcp] lists. Using it on [ip], [ethernet], [arp], [udp], or [icmp] sections will produce a validation error at startup. Use "Drop" for non-TCP layers.
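
The rule in the note can be checked ahead of time with a few lines of Python. This is a sketch of such a check, not CensorLab's own validator: the function name is hypothetical, and the configuration is assumed to be already parsed into nested dictionaries that mirror the TOML sections.

```python
NON_TCP_SECTIONS = ("ethernet", "arp", "ip", "icmp", "udp")

def find_invalid_resets(config: dict) -> list:
    """Return dotted paths of non-TCP settings whose action is "Reset"."""
    problems = []
    for section in NON_TCP_SECTIONS:
        body = config.get(section, {})
        # Section-level actions: `action` ([arp], [icmp]) and `unknown` ([ethernet], [ip]).
        for key in ("action", "unknown"):
            if body.get(key) == "Reset":
                problems.append(f"{section}.{key}")
        # List blocks such as [ip.blocklist] or [udp.port_blocklist].
        for key, value in body.items():
            if isinstance(value, dict) and value.get("action") == "Reset":
                problems.append(f"{section}.{key}")
    return problems

config = {
    "ip": {"blocklist": {"list": ["198.51.100.1"], "action": "Reset"}},
    "tcp": {"port_blocklist": {"list": [80], "action": "Reset"}},
    "udp": {"port_blocklist": {"list": [53], "action": "Drop"}},
}
print(find_invalid_resets(config))  # ['ip.blocklist']
```

Only the [ip.blocklist] entry is reported; the [tcp.port_blocklist] entry is allowed to use "Reset".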


PyCL (Python Censor Language) Reference

PyCL is CensorLab's Python scripting API, powered by RustPython. Scripts run per-connection inside an embedded Python VM.

Lifecycle

  1. Connection init: On the first packet for a new connection, the entire script file is executed. Use this for initialization (global variables, imports, etc.).
  2. Per-packet processing: For every packet (including the first), the process(packet) function is called.
  3. Return value: The return value of process() determines the action for that packet.
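
The three lifecycle steps can be emulated in ordinary Python to see how per-connection state behaves. This is a sketch only: CensorLab runs scripts in RustPython, and the ConnectionVM class, the on_packet function, and the connection keys are hypothetical names used for illustration.

```python
SCRIPT = """
packets_seen = 0  # module-level state, initialised once per connection

def process(packet):
    global packets_seen
    packets_seen += 1
    if packets_seen > 2:
        return "drop"
"""

class ConnectionVM:
    def __init__(self, source):
        self.namespace = {}
        exec(source, self.namespace)              # step 1: run the whole script once

    def handle(self, packet):
        return self.namespace["process"](packet)  # steps 2 and 3: call process()

connections = {}

def on_packet(conn_key, packet):
    """Create a fresh script namespace for new connections, then call process()."""
    if conn_key not in connections:
        connections[conn_key] = ConnectionVM(SCRIPT)
    return connections[conn_key].handle(packet)

print([on_packet("conn-1", None) for _ in range(3)])  # [None, None, 'drop']
print(on_packet("conn-2", None))                      # None
```

The second connection starts from a fresh packets_seen counter, which is the practical meaning of "scripts run per-connection".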

Imports

from rust import Packet, Model, regex

This brings all core types into scope. Protocol-specific modules are imported separately:

from dns import parse as parse_dns
from tls import parse_client_hello, parse_client_hello_message
from quic import parse_initial

Return Values

The process() function should return one of:

Return value  Effect
None          Allow the packet (continue processing)
"allow"       Same as None
"drop"        Drop the packet silently
"reset"       Send TCP RST in both directions (falls back to "drop" for non-TCP)
bytes         Inject a forged UDP response (e.g., DNS poisoning) while allowing the original packet through

Returning raw bytes triggers packet injection — CensorLab constructs a UDP response packet with the returned bytes as payload and sends it back to the client, while the original packet is forwarded normally. This is used for techniques like DNS poisoning where the censor races a forged response against the legitimate one. See dns.craft_response() below.

Any unrecognized string is treated as allow with a warning logged.
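
The return-value rules above can be summarised as a small dispatch function. This is a sketch of the documented behaviour rather than CensorLab's code; the function name and the (action, injected_payload) result shape are assumptions made for illustration.

```python
import logging

def interpret(result, is_tcp):
    """Map a process() return value to (action, injected_payload)."""
    if isinstance(result, (bytes, bytearray)):
        # Forged response is injected; the original packet is still allowed.
        return ("allow", bytes(result))
    if result is None or result == "allow":
        return ("allow", None)
    if result == "drop":
        return ("drop", None)
    if result == "reset":
        # RST only makes sense for TCP; other protocols fall back to drop.
        return ("reset" if is_tcp else "drop", None)
    logging.warning("unrecognized return value %r, allowing packet", result)
    return ("allow", None)

print(interpret("reset", is_tcp=False))   # ('drop', None)
print(interpret(b"\x00\x01", is_tcp=False))  # ('allow', b'\x00\x01')
```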

Packet

The packet object is passed to process() and provides read-only access to packet metadata and payload.

Top-level Attributes

Attribute                    Type               Description
packet.timestamp             float or None      Unix timestamp of the packet capture
packet.direction             int                1 = client-to-WAN, 0 = unknown, -1 = WAN-to-client
packet.payload               bytes              Transport-layer payload bytes
packet.payload_len           int                Length of the payload in bytes
packet.payload_entropy       float              Shannon entropy of the payload, scaled 0.0–1.0
packet.payload_avg_popcount  float              Average number of set bits per byte (0.0–8.0)
packet.ip                    IpPacket           IP layer metadata
packet.tcp                   TcpPacket or None  TCP metadata (None if not TCP)
packet.udp                   UdpPacket or None  UDP metadata (None if not UDP)
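
To build intuition for payload_entropy and payload_avg_popcount, the following sketch computes comparable values in plain Python. It rests on two assumptions that the table does not spell out: that the 0.0–1.0 scaling is byte-level Shannon entropy (in bits) divided by 8, and that an empty payload yields 0.0. The function names are hypothetical.

```python
import math
from collections import Counter

def payload_entropy(payload: bytes) -> float:
    """Byte-level Shannon entropy, divided by 8 to land in the range 0.0 to 1.0."""
    if not payload:
        return 0.0
    n = len(payload)
    bits = -sum((c / n) * math.log2(c / n) for c in Counter(payload).values())
    return bits / 8.0

def payload_avg_popcount(payload: bytes) -> float:
    """Average number of set bits per byte, in the range 0.0 to 8.0."""
    if not payload:
        return 0.0
    return sum(bin(b).count("1") for b in payload) / len(payload)

print(payload_entropy(bytes(range(256))))   # every byte value once: maximum entropy
print(payload_entropy(b"aaaaaaaa"))         # a single repeated byte: minimum entropy
print(payload_avg_popcount(b"\xff\x00"))    # 4.0
```

Well-encrypted or compressed payloads tend toward an entropy near 1.0 and an average popcount near 4.0, which is why both features appear in the entropy-based demos.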

IpPacket

Accessed via packet.ip.

Attribute     Type  Description
.src          str   Source IP address
.dst          str   Destination IP address
.header_len   int   IP header length in bytes
.total_len    int   Total IP packet length
.ttl          int   Time-to-live / hop limit
.next_header  int   IP protocol number (6=TCP, 17=UDP)
.version      int   IP version (4 or 6)

IPv4-only (return None on IPv6):

Attribute     Type          Description
.dscp         int or None   Differentiated Services Code Point
.ecn          int or None   Explicit Congestion Notification
.ident        int or None   Identification field
.dont_frag    bool or None  Don't Fragment flag
.more_frags   bool or None  More Fragments flag
.frag_offset  int or None   Fragment offset
.checksum     int or None   Header checksum

IPv6-only (return None on IPv4):

Attribute       Type         Description
.traffic_class  int or None  Traffic class
.flow_label     int or None  Flow label
.payload_len    int or None  Payload length

TcpPacket

Accessed via packet.tcp. Returns None if the packet is not TCP.

Attribute    Type      Description
.src         int       Source port
.dst         int       Destination port
.seq         int       Sequence number
.ack         int       Acknowledgement number
.header_len  int       TCP header length in bytes
.urgent_at   int       Urgent pointer
.window_len  int       Window size
.flags       TcpFlags  TCP flags object

Method:

Method            Returns  Description
.uses_port(port)  bool     True if either src or dst equals port

TcpFlags

Accessed via packet.tcp.flags. All attributes are bool.

Attribute  Description
.fin       FIN flag
.syn       SYN flag
.rst       RST flag
.psh       PSH flag
.ack       ACK flag
.urg       URG flag
.ece       ECE flag
.cwr       CWR flag
.ns        NS flag

UdpPacket

Accessed via packet.udp. Returns None if the packet is not UDP.

Attribute  Type  Description
.src       int   Source port
.dst       int   Destination port
.length    int   Total UDP datagram length (header + payload)
.checksum  int   UDP checksum

Method:

Method            Returns  Description
.uses_port(port)  bool     True if either src or dst equals port

DNS Module

Parse DNS packets from raw payload bytes, and optionally craft forged DNS responses for injection.

from dns import parse as parse_dns, craft_response

def process(packet):
    udp = packet.udp
    if udp and udp.uses_port(53):
        dns = parse_dns(packet.payload)
        # Only forge answers to queries, never to responses
        if not dns.query:
            return
        for question in dns.questions:
            if "example.com" in question.qname:
                # Inject a forged DNS response pointing to a decoy IP
                return craft_response(packet.payload, "10.10.10.10")

Functions

Function                              Returns    Description
parse(bytes)                          DnsPacket  Parse a DNS packet from raw bytes
craft_response(query_bytes, ip)       bytes      Craft a forged DNS A-record response from a query, with default TTL of 300
craft_response(query_bytes, ip, ttl)  bytes      Same as above, with a custom TTL value

Returning the bytes from craft_response() causes CensorLab to inject the forged response while allowing the original query through — the same technique used by the GFW for DNS poisoning.
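
For readers curious about what a forged A-record response contains on the wire, here is a standalone sketch built with the standard library. It is not the implementation behind craft_response(): the function name craft_a_response is hypothetical, only the first question is answered, and compressed names inside the question are not handled.

```python
import socket
import struct

def craft_a_response(query: bytes, ip: str, ttl: int = 300) -> bytes:
    """Answer the first question of `query` with a single A record."""
    if len(query) < 12:
        raise ValueError("query shorter than a DNS header")
    txid, flags, qdcount = struct.unpack("!HHH", query[:6])
    if qdcount < 1:
        raise ValueError("query has no question")
    # Walk the first question name: length-prefixed labels ending in a zero byte.
    pos = 12
    while query[pos] != 0:
        pos += 1 + query[pos]
    question = query[12:pos + 1 + 4]          # name + QTYPE + QCLASS
    # QR=1 (response), RA=1, RCODE=0; keep the query's RD bit.
    resp_flags = 0x8080 | (flags & 0x0100)
    header = struct.pack("!HHHHHH", txid, resp_flags, 1, 1, 0, 0)
    answer = (
        b"\xc0\x0c"                           # compression pointer to the name at offset 12
        + struct.pack("!HHIH", 1, 1, ttl, 4)  # TYPE=A, CLASS=IN, TTL, RDLENGTH=4
        + socket.inet_aton(ip)                # 4-byte IPv4 address
    )
    return header + question + answer

# A minimal query for example.com (ID 0x1234, recursion desired).
query = (struct.pack("!HHHHHH", 0x1234, 0x0100, 1, 0, 0, 0)
         + b"\x07example\x03com\x00" + struct.pack("!HH", 1, 1))
response = craft_a_response(query, "10.10.10.10")
print(socket.inet_ntoa(response[-4:]))  # 10.10.10.10
```

Copying the transaction ID and question from the query is what lets the forged answer be accepted by a resolver that is still waiting for the legitimate one.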

DnsPacket

Returned by parse(bytes).

Attribute             Type                  Description
.id                   int                   Query ID
.query                bool                  True if this is a query (vs response)
.opcode               str                   Opcode (e.g. "StandardQuery")
.authoritative        bool                  Authoritative answer flag
.truncated            bool                  Truncation flag
.recursion_desired    bool                  Recursion desired flag
.recursion_available  bool                  Recursion available flag
.authenticated_data   bool                  Authenticated data flag
.checking_disabled    bool                  Checking disabled flag
.response_code        str                   Response code (e.g. "NoError")
.questions            list[Question]        Question records
.answers              list[ResourceRecord]  Answer records
.nameservers          list[ResourceRecord]  Authority records
.additional           list[ResourceRecord]  Additional records
.opt                  Record or None        OPT pseudo-record

Question

Attribute        Type  Description
.qname           str   Queried domain name
.prefer_unicast  bool  Unicast response preferred
.qtype           str   Query type (e.g. "A", "AAAA")
.qclass          str   Query class (e.g. "IN")

ResourceRecord

Attribute          Type   Description
.name              str    Record name
.multicast_unique  bool   Multicast unique flag
.cls               str    Record class (e.g. "IN")
.ttl               int    Time-to-live in seconds
.data              tuple  Record data as a tagged tuple (see below)

The .data attribute returns a tuple whose first element is the record type string:

Type     Format
A        ("A", "1.2.3.4")
AAAA     ("AAAA", "::1")
CNAME    ("CNAME", "alias.example.com")
MX       ("MX", preference, "mail.example.com")
NS       ("NS", "ns1.example.com")
PTR      ("PTR", "host.example.com")
SOA      ("SOA", primary_ns, mailbox, serial, refresh, retry, expire, minimum_ttl)
SRV      ("SRV", priority, weight, port, "target.example.com")
TXT      ("TXT", [b"text data", ...])
Unknown  ("UNKNOWN",)
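
Because the first tuple element is always the type tag, record data can be handled by branching on data[0]. The helper below is a hypothetical example of that pattern, using only the tuple shapes listed in the table; it runs on plain tuples, so it can be tried outside CensorLab.

```python
def summarize(data):
    """Render a ResourceRecord .data tuple as a short human-readable string."""
    kind = data[0]
    if kind in ("A", "AAAA", "CNAME", "NS", "PTR"):
        return f"{kind} {data[1]}"
    if kind == "MX":
        _, preference, host = data
        return f"MX {preference} {host}"
    if kind == "SRV":
        _, priority, weight, port, target = data
        return f"SRV {target}:{port} (priority {priority}, weight {weight})"
    if kind == "TXT":
        return "TXT " + " ".join(chunk.decode("utf-8", "replace") for chunk in data[1])
    # SOA and UNKNOWN are reported by tag only in this sketch.
    return kind

print(summarize(("A", "1.2.3.4")))                # A 1.2.3.4
print(summarize(("MX", 10, "mail.example.com")))  # MX 10 mail.example.com
print(summarize(("UNKNOWN",)))                    # UNKNOWN
```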

TLS Module

Parse TLS ClientHello messages from TCP payload.

from tls import parse_client_hello

def process(packet):
    tcp = packet.tcp
    if tcp and tcp.dst == 443 and packet.payload_len > 0:
        try:
            hello = parse_client_hello(packet.payload)
            if hello.sni and "blocked.com" in hello.sni:
                return "reset"
        except Exception:
            # Payload is not a parseable ClientHello; let the packet through
            pass

Functions

Function                           Description
parse_client_hello(bytes)          Parse from a full TLS record (with record header)
parse_client_hello_message(bytes)  Parse from the handshake message only (no record header)

Both return a ClientHelloInfo.

ClientHelloInfo

Attribute             Type         Description
.sni                  str or None  Server Name Indication hostname
.alpn                 list[str]    ALPN protocol names (e.g. ["h2", "http/1.1"])
.client_version       int          Legacy TLS version from ClientHello (e.g. 0x0303)
.supported_versions   list[int]    Supported TLS versions from extension
.cipher_suites_count  int          Number of cipher suites offered
.extensions_count     int          Number of extensions present

QUIC Module

Parse QUIC Initial packets from UDP payload. This decrypts the Initial packet using the QUIC v1 key derivation and extracts the embedded TLS ClientHello.

from quic import parse_initial

def process(packet):
    udp = packet.udp
    if udp and udp.uses_port(443) and packet.payload_len > 0:
        try:
            info = parse_initial(packet.payload)
            if info.sni and "blocked.com" in info.sni:
                return "drop"
        except Exception:
            # Payload is not a parseable QUIC Initial packet; let it through
            pass

QuicInitialInfo

Returned by parse_initial(bytes).

Attribute  Type         Description
.version   int          QUIC version number
.dcid      bytes        Destination Connection ID
.scid      bytes        Source Connection ID
.sni       str or None  SNI from embedded TLS ClientHello
.alpn      list[str]    ALPN from embedded TLS ClientHello

Regex

Byte-level regular expression matching using Rust's regex engine.

from rust import regex

re = regex(r"Host:\s+example\.com")

def process(packet):
    if re.is_match(packet.payload):
        return "reset"

regex(pattern) -> Regex

Method            Returns  Description
.is_match(bytes)  bool     True if the pattern matches anywhere in the byte string
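
Patterns can be prototyped outside CensorLab with Python's standard re module before being moved into a censor script. This is only an approximation: Rust's regex syntax differs from Python's in places (for example, it has no backreferences or lookaround), and the wrapper below is a hypothetical stand-in for .is_match(). Note the use of search rather than match, since .is_match() matches anywhere in the payload.

```python
import re

host_re = re.compile(rb"Host:\s+example\.com")

def is_match(payload: bytes) -> bool:
    """Approximate Regex.is_match(): True if the pattern occurs anywhere."""
    return host_re.search(payload) is not None

print(is_match(b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"))  # True
print(is_match(b"GET / HTTP/1.1\r\nHost: example.org\r\n\r\n"))  # False
```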

Model

Evaluate ONNX models defined in the configuration file.

from rust import Model

def process(packet):
    features = [float(packet.payload_len), packet.payload_entropy]
    # Pad to expected input size
    features += [0.0] * (90 - len(features))
    result = model.evaluate("classifier", features)
    if result[0] > 0.5:
        return "drop"

The model variable is automatically available in the process() scope after initialization.

model.evaluate(name, data) -> list[float]

Parameter  Type         Description
name       str          Model name as defined in [models.<name>]
data       list[float]  Input feature vector (must match model's expected input size)

Returns a list of floats from the model's probability output.
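
Since the input vector must match the model's expected size exactly, a small helper that converts and pads features keeps scripts tidy. The helper name is hypothetical, and the default size of 90 simply follows the example above; the real size depends on the model in use.

```python
def build_features(values, size=90):
    """Convert values to floats and zero-pad them to the model's input size."""
    features = [float(v) for v in values]
    if len(features) > size:
        raise ValueError(f"{len(features)} features exceed input size {size}")
    return features + [0.0] * (size - len(features))

features = build_features([1460, 0.97, 4.1])
print(len(features), features[:4])  # 90 [1460.0, 0.97, 4.1, 0.0]
```

Inside process(), the result would be passed as the second argument to model.evaluate().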

Complete PyCL Example

from rust import Packet, Model, regex
from dns import parse as parse_dns
from tls import parse_client_hello

blocked_domains = ["example.com", "blocked.org"]
http_re = regex(rb"Host:\s+blocked\.org")

def process(packet):
    # Block DNS queries for blocked domains
    udp = packet.udp
    if udp and udp.uses_port(53):
        try:
            dns = parse_dns(packet.payload)
            for q in dns.questions:
                for domain in blocked_domains:
                    if domain in q.qname:
                        return "drop"
        except Exception:
            pass
        return

    # Block TLS connections to blocked domains
    tcp = packet.tcp
    if tcp and tcp.dst == 443 and not tcp.flags.syn and packet.payload_len > 0:
        try:
            hello = parse_client_hello(packet.payload)
            if hello.sni:
                for domain in blocked_domains:
                    if domain in hello.sni:
                        return "reset"
        except Exception:
            pass

    # Block HTTP by Host header
    if tcp and tcp.uses_port(80):
        if http_re.is_match(packet.payload):
            return "reset"

    # Drop high-entropy traffic (possible encrypted tunnel)
    if packet.payload_entropy > 0.95 and packet.payload_len > 200:
        result = model.evaluate("classifier", [
            float(packet.payload_len),
            packet.payload_entropy,
            packet.payload_avg_popcount,
        ] + [0.0] * 87)
        if result[0] > 0.8:
            return "drop"

CensorLang Reference

CensorLang is a linear, register-based DSL for writing censor programs. It is designed for machine-generated censorship strategies (e.g. via genetic programming) but can also be written by hand.

Overview

  • Programs execute top-to-bottom, one instruction per line
  • The first RETURN instruction that executes determines the action for that packet
  • If no RETURN executes, the packet is allowed
  • Each connection gets its own set of registers, preserved across packets

Configuration

Set mode = "CensorLang" in the [execution] section:

[execution]
mode = "CensorLang"
script = "censor.cl"

CensorLang programs also respect these environment settings (from the internal program config):

Setting                 Default  Description
field_default_on_error  true     Return 0/false instead of erroring when accessing a field from the wrong protocol
relax_register_types    false    Allow writing values into register banks of a different type

Syntax

Each line has the form:

[if CONDITION:] OPERATION

A line is either unconditional (OPERATION) or conditional (if CONDITION: OPERATION).

Operations

RETURN

Returns an action for the current packet. The first RETURN that executes wins.

RETURN allow
RETURN allow_all
RETURN terminate

Action     Effect
allow      Allow this packet, continue evaluating future packets
allow_all  Allow this and all future packets for the connection
terminate  Drop this and all future packets for the connection

Actions are case-insensitive: ALLOW, Allow, and allow are all accepted.

COPY

Copy a value into a register.

COPY <value> -> <register>

Example:

COPY field:tcp.payload.len -> reg:i.0
COPY 3.14 -> reg:f.0
COPY True -> reg:b.0

Arithmetic: ADD, SUB, MUL, DIV, MOD

Perform arithmetic on two values and store the result.

ADD <value>, <value> -> <register>
SUB <value>, <value> -> <register>
MUL <value>, <value> -> <register>
DIV <value>, <value> -> <register>
MOD <value>, <value> -> <register>

Division and modulo by zero return zero (no error).

Type coercion: when operand types differ, the result is promoted (bool → int → float).
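
The two rules above (zero on division by zero, promotion along bool → int → float) can be modelled in Python as follows. This is a sketch with explicit assumptions: two bool operands are treated as ints, and integer DIV and MOD use Python's floor semantics, since the rounding behaviour for negative operands is not specified here. The function names are hypothetical.

```python
def promote(a, b):
    """Promote both operands to the wider type along bool -> int -> float."""
    if isinstance(a, float) or isinstance(b, float):
        return float(a), float(b)
    # Assumption: two bools are treated as ints so that arithmetic is defined.
    return int(a), int(b)

def arith(op, a, b):
    a, b = promote(a, b)
    if op == "ADD":
        return a + b
    if op == "SUB":
        return a - b
    if op == "MUL":
        return a * b
    if op in ("DIV", "MOD"):
        if b == 0:
            return type(a)(0)          # division and modulo by zero yield zero
        if op == "MOD":
            return a % b
        return a / b if isinstance(a, float) else a // b
    raise ValueError(f"unknown operation: {op}")

print(arith("ADD", True, 2))   # 3
print(arith("MUL", 2, 0.5))    # 1.0
print(arith("DIV", 7, 0))      # 0
print(arith("MOD", 7.5, 0))    # 0.0
```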

Bitwise/Logic: AND, OR, XOR

Logical operations on two values, stored as a bool.

AND <value>, <value> -> <register>
OR  <value>, <value> -> <register>
XOR <value>, <value> -> <register>

Values are converted to bool before the operation (0 / 0.0 / false → false, everything else → true).

NOOP

No operation. Removed by the optimizer.

NOOP

MODEL

Placeholder for model evaluation (reserved, not yet fully integrated in CensorLang).

MODEL

Values

Values can appear as operation inputs or condition operands.

Syntax           Type    Example
Integer literal  int     42, -1, 0
Float literal    float   3.14, 0.5, -1.0
True / False     bool    True
field:<path>     varies  field:tcp.payload.len
reg:<type>.<N>   varies  reg:f.0, reg:i.3, reg:b.1

Registers

Registers are per-connection persistent storage, organized into three typed banks:

Prefix   Type   Default value
reg:f.N  float  0.0
reg:i.N  int    0
reg:b.N  bool   false

The default bank size is 16 registers per type. Writing a value to a register of the wrong type is an error unless relax_register_types is enabled.

Fields

Fields read packet metadata. Accessing a field from the wrong protocol (e.g. field:tcp.seq on a UDP packet) produces an error, or returns 0/false if field_default_on_error is enabled.

Environment

Field                  Type  Description
field:env.num_packets  int   Number of packets processed on this connection

General

Field            Type   Description
field:timestamp  float  Packet capture timestamp

IP (all versions)

Field                Type  Description
field:ip.header_len  int   IP header length in bytes
field:ip.total_len   int   Total IP packet length
field:ip.hop_limit   int   TTL / hop limit

IPv4-specific

Field                  Type  Description
field:ip4.dscp         int   Differentiated Services Code Point
field:ip4.ecn          int   Explicit Congestion Notification
field:ip4.ident        int   Identification field
field:ip4.dont_frag    bool  Don't Fragment flag
field:ip4.more_frags   bool  More Fragments flag
field:ip4.frag_offset  int   Fragment offset
field:ip4.checksum     int   Header checksum

IPv6-specific

Field                    Type  Description
field:ip6.traffic_class  int   Traffic class
field:ip6.flow_label     int   Flow label
field:ip6.payload_len    int   Payload length

TCP

Field                  Type  Description
field:tcp.seq          int   Sequence number
field:tcp.ack          int   Acknowledgement number
field:tcp.len          int   Total TCP segment length (header + payload)
field:tcp.header.len   int   TCP header length
field:tcp.payload.len  int   TCP payload length
field:tcp.urgent_at    int   Urgent pointer
field:tcp.window_len   int   Window size

TCP Flags

Field               Type  Description
field:tcp.flag.fin  bool  FIN flag
field:tcp.flag.syn  bool  SYN flag
field:tcp.flag.rst  bool  RST flag
field:tcp.flag.psh  bool  PSH flag
field:tcp.flag.ack  bool  ACK flag
field:tcp.flag.urg  bool  URG flag
field:tcp.flag.ece  bool  ECE flag
field:tcp.flag.cwr  bool  CWR flag
field:tcp.flag.ns   bool  NS flag

UDP

Field               Type  Description
field:udp.length    int   Total UDP datagram length
field:udp.checksum  int   UDP checksum

Transport Payload

Field                            Type   Description
field:transport.payload.entropy  float  Shannon entropy of payload (0.0–1.0)

Conditions

Conditions are comparisons between two values:

if <value> <operator> <value>: OPERATION

Comparison Operators

Operator  Aliases  Description
<         lt       Less than
<=        le       Less than or equal
>         gt       Greater than
>=        ge       Greater than or equal
==        eq       Equal
!=        ne       Not equal

Logic Operators

Logic operators treat both operands as booleans.

Operator  Aliases      Description
&&        and, op_and  Logical AND
||        or, op_or    Logical OR
^         xor, op_xor  Logical XOR
nand      op_nand      Logical NAND
nor       op_nor       Logical NOR
xnor      op_xnor      Logical XNOR
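
The truth tables behind these operators can be written out in Python. The sketch below is illustrative rather than the CensorLang evaluator: the dictionary and function names are hypothetical, and Python's bool() is used for the operand conversion, which treats 0, 0.0, and False as false.

```python
ALIASES = {
    "and": "&&", "op_and": "&&",
    "or": "||", "op_or": "||",
    "xor": "^", "op_xor": "^",
    "op_nand": "nand", "op_nor": "nor", "op_xnor": "xnor",
}

OPERATORS = {
    "&&": lambda a, b: a and b,
    "||": lambda a, b: a or b,
    "^": lambda a, b: a != b,
    "nand": lambda a, b: not (a and b),
    "nor": lambda a, b: not (a or b),
    "xnor": lambda a, b: a == b,
}

def logic(op, left, right):
    """Evaluate a logic operator after converting both operands to bool."""
    name = ALIASES.get(op, op)
    return OPERATORS[name](bool(left), bool(right))

print(logic("nand", 1, 0.0))    # True
print(logic("op_xnor", 5, 2))   # True
print(logic("&&", True, 0))     # False
```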

Compiler Optimizations

CensorLang programs are optimized at load time:

  • Constant folding: ADD 2, 3 -> reg:i.0 becomes COPY 5 -> reg:i.0
  • Dead code elimination: Writes to registers that are never read are removed
  • NOOP stripping: All NOOP instructions are removed
  • Always-true condition elimination: if 1 == 1: RETURN terminate becomes RETURN terminate
  • Always-false condition elimination: Lines with always-false conditions are removed entirely
  • Unreachable code removal: Code after an unconditional RETURN is truncated

Complete CensorLang Example

COPY field:tcp.payload.len -> reg:i.0
COPY field:transport.payload.entropy -> reg:f.0
if field:tcp.flag.syn == True: RETURN allow
if reg:i.0 > 200: COPY True -> reg:b.0
if reg:f.0 > 0.95: COPY True -> reg:b.1
if reg:b.0 && reg:b.1: RETURN terminate

This program:

  1. Copies payload length and entropy into registers
  2. Allows SYN packets immediately
  3. Flags packets with payload > 200 bytes
  4. Flags packets with entropy > 0.95
  5. Terminates connections where both flags are set (likely encrypted tunnel traffic)
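
One consequence of per-connection registers is worth seeing in action: because reg:b.0 and reg:b.1 persist across packets, the two flags can be set by different packets of the same connection. The Python emulation below illustrates this, assuming registers are never reset between packets; the packet dictionary keys and function names are hypothetical.

```python
def new_registers(size=16):
    """One register set per connection, using the documented defaults."""
    return {"i": [0] * size, "f": [0.0] * size, "b": [False] * size}

def run_program(regs, pkt):
    """Line-by-line translation of the CensorLang example above."""
    regs["i"][0] = pkt["payload_len"]        # COPY field:tcp.payload.len -> reg:i.0
    regs["f"][0] = pkt["entropy"]            # COPY field:transport.payload.entropy -> reg:f.0
    if pkt["syn"]:
        return "allow"
    if regs["i"][0] > 200:
        regs["b"][0] = True
    if regs["f"][0] > 0.95:
        regs["b"][1] = True
    if regs["b"][0] and regs["b"][1]:
        return "terminate"
    return "allow"                           # no RETURN executed: packet is allowed

regs = new_registers()
print(run_program(regs, {"payload_len": 300, "entropy": 0.50, "syn": False}))  # allow
print(run_program(regs, {"payload_len": 50, "entropy": 0.99, "syn": False}))   # terminate
```

The second packet is small, yet the connection is terminated because the length flag was latched by the first packet. If both conditions should hold for a single packet, the flags would need to be cleared at the top of the program.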