Skip to main content

Free 30-min security demo  — We'll scan your real code and show live findings, no commitment Book Now

Offensive360
Academy Insecure Deserialization
Advanced · 20 min

Insecure Deserialization

See how deserializing untrusted data can lead to RCE and learn safe alternatives.

1 Why Deserialization is Dangerous

Deserialization converts a byte stream or string back into an object. When you deserialize attacker-controlled data, the deserialization process itself can execute arbitrary code — before any application logic runs.

# CRITICAL — never deserialize untrusted data with pickle
import pickle, base64
data = base64.b64decode(request.cookies['session'])
obj = pickle.loads(data)  # RCE if data is malicious

Python's pickle, Java's native serialization, PHP's unserialize(), and Ruby's Marshal.load() are all capable of executing code during deserialization through magic methods (__reduce__, readObject, __wakeup).

2 Safe Alternatives

The fundamental rule: never deserialize untrusted data with a format that supports arbitrary object instantiation.

Use data-only formats instead:

  • JSON — safe by default; only creates dicts, lists, strings, numbers
  • JSON Schema validation — add type/structure enforcement after parsing
  • Protobuf / MessagePack — typed binary formats that do not execute code
import json

# SAFE — JSON only creates primitive Python structures
session = json.loads(request.cookies['session'])
user_id = int(session['user_id'])  # validate types explicitly

If you must use a rich serialization format, sign or encrypt the data and verify the signature before deserializing.

3 HMAC Signing for Serialized Data

When you need to store structured data client-side (e.g., session cookies), sign it with an HMAC so the server can verify it has not been tampered with:

import json, hmac, hashlib, base64

SECRET = b'your-server-secret-key'

def encode(data: dict) -> str:
    payload = base64.b64encode(json.dumps(data).encode()).decode()
    sig = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}.{sig}"

def decode(token: str) -> dict:
    try:
        payload, sig = token.rsplit('.', 1)
        expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(sig, expected):
            raise ValueError("Invalid signature")
        return json.loads(base64.b64decode(payload))
    except Exception:
        raise ValueError("Invalid token")

Use hmac.compare_digest (not ==) to prevent timing attacks.

Knowledge Check

0/3 correct
Q1

Why is Python's pickle module dangerous for deserializing user-supplied data?

Q2

Which data format is safest for deserializing untrusted input?

Q3

If you must store structured data in a client-side cookie, what is the correct way to protect it?

Code Exercise

Replace Pickle with JSON

This function uses pickle to deserialize a session cookie — a critical RCE vulnerability. Replace it with JSON deserialization and add basic type validation.

python