eBPF Jul 14, 2025 14 min read

eBPF for Cloud-Native Security: How Kernel Probes Replace HIDS in K8s Environments


Traditional host-based intrusion detection was designed for a world where workloads lived on fixed machines. You installed an agent, it watched the filesystem and network interfaces, and it stayed there for years. That model collapses when a pod can spin up, process a million requests, and disappear in under sixty seconds — leaving no persistent artifact for a kernel module or HIDS agent to anchor to.

eBPF changes the architecture of this problem. Not because it's newer, but because it operates at the right level of abstraction: the kernel event stream itself, not the filesystem or the process list snapshot that was already stale by the time your HIDS polled it.

What eBPF Actually Does at the Kernel Level

eBPF — Extended Berkeley Packet Filter — lets you attach verified programs to kernel instrumentation points without modifying kernel source or loading a kernel module. The BPF verifier checks your program statically before execution: it verifies bounded loops, rejects unsafe memory access patterns, and ensures the program terminates. This is what makes eBPF programs production-safe in a way that kernel modules are not.

The attachment points relevant to runtime security are kprobes, uprobes, and kernel tracepoints. A kprobe attaches to the entry or return of an arbitrary kernel function, for example security_file_open or tcp_connect. A tracepoint is a static hook point defined in the kernel source, preferred over a kprobe when one is available because its name and arguments change far less often across kernel versions than the internal functions a kprobe targets. Uprobes attach to the entry or return of user-space functions, which is useful for hooking a libc wrapper such as execve in a specific binary without the overhead of tracing every syscall.

The data path from probe to user space runs through eBPF maps — shared memory structures that kernel-side programs write to and user-space daemons read from. For event streams, a perf ring buffer or BPF ring buffer is the typical transport: kernel side writes a fixed-size event struct (process ancestry, syscall number, arguments, cgroup namespace), user space drains it asynchronously. At high event rates, the ring buffer design means you shed events rather than blocking the kernel — a deliberate tradeoff that keeps probe overhead bounded.
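The shed-rather-than-block behavior is easier to see in a toy model. The sketch below is a user-space simulation in Python, not kernel code: the event layout, the class name LossyRingBuffer, and the field choices are assumptions made for illustration, and a real agent would use a BPF ring buffer or perf buffer instead.

```python
import struct
from collections import deque

# Hypothetical fixed-size event layout: pid, ppid, syscall number, cgroup id.
# "<IIIQ" = three little-endian uint32 values plus one uint64, 20 bytes total.
EVENT_FORMAT = "<IIIQ"
EVENT_SIZE = struct.calcsize(EVENT_FORMAT)


class LossyRingBuffer:
    """Bounded buffer that drops new events when full instead of blocking."""

    def __init__(self, capacity_bytes):
        self.capacity_events = capacity_bytes // EVENT_SIZE
        self.slots = deque()
        self.dropped = 0

    def produce(self, pid, ppid, syscall_nr, cgroup_id):
        """Producer side (stands in for the kernel probe). Never waits."""
        if len(self.slots) >= self.capacity_events:
            self.dropped += 1  # shed the event and count the loss
            return False
        record = struct.pack(EVENT_FORMAT, pid, ppid, syscall_nr, cgroup_id)
        self.slots.append(record)
        return True

    def drain(self):
        """Consumer side (stands in for the user-space daemon)."""
        events = []
        while self.slots:
            events.append(struct.unpack(EVENT_FORMAT, self.slots.popleft()))
        return events
```

A buffer sized for three events accepts the first three writes and rejects the rest until drain() runs. The dropped counter is what an agent would export so operators can tell when the buffer is undersized.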

CO-RE and BTF: Why Portability Across Kernel Versions Actually Works Now

One legitimate criticism of early eBPF security tools was the per-kernel-version compilation requirement. If your probe read a field at a hardcoded offset in a kernel struct, you had to recompile for every kernel version your fleet ran. That was operationally untenable at scale.

BTF (BPF Type Format) solves this. BTF embeds type information directly into the kernel binary, and CO-RE (Compile Once – Run Everywhere) uses that type information to perform relocations at load time rather than compile time. Your eBPF program is compiled against one kernel's BTF type definitions; at load time, the BPF loader (libbpf) reads the running kernel's BTF and rewrites field offsets to match. The result: a single compiled eBPF object that runs correctly across kernel versions from 5.4 through 6.x without recompilation, provided each kernel was built with BTF enabled (CONFIG_DEBUG_INFO_BTF).
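As a rough illustration of load-time relocation, the sketch below models the idea in Python. It is a toy, not libbpf: the offsets, the two kernel layouts, and the relocate function are invented for the example, and real BTF carries full type information rather than a flat name-to-offset table.

```python
# Hypothetical struct layouts for two kernel builds: field name -> byte offset.
# The numbers are made up; only the fact that they differ matters here.
KERNEL_A_BTF = {"task_struct": {"pid": 1256, "tgid": 1260, "real_parent": 1272}}
KERNEL_B_BTF = {"task_struct": {"pid": 2336, "tgid": 2340, "real_parent": 2352}}

# A "compiled" program records which field each instruction wants,
# not where that field lives in memory.
PROGRAM_RELOCATIONS = [
    {"insn": 0, "struct": "task_struct", "field": "pid"},
    {"insn": 3, "struct": "task_struct", "field": "real_parent"},
]


def relocate(relocations, target_btf):
    """Resolve each symbolic field access to an offset on the target kernel."""
    patched = {}
    for reloc in relocations:
        fields = target_btf.get(reloc["struct"])
        if fields is None or reloc["field"] not in fields:
            raise LookupError(
                f"{reloc['struct']}.{reloc['field']} not found in target BTF"
            )
        patched[reloc["insn"]] = fields[reloc["field"]]
    return patched
```

The same PROGRAM_RELOCATIONS list resolves to different offsets on each layout, which is the property that removes the per-kernel build. A missing field fails at load time rather than reading the wrong memory at run time.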

This matters operationally because K8s node pools in practice run a heterogeneous kernel version mix — especially during rolling upgrades. A tool that requires per-kernel compilation adds CI complexity and creates gaps during the update window when the new kernel is running but the recompiled probe hasn't deployed yet. CO-RE eliminates that gap.

The DaemonSet Deployment Model and Why Sidecars Were Always Wrong for This

The alternative to eBPF-based runtime detection, which several earlier-generation tools used, was the sidecar: inject a container into each pod that intercepts syscalls or reads /proc. The appeal was namespace isolation — the sidecar has visibility scoped to its pod. The operational cost was real: every pod gets a second container, memory overhead scales linearly with pod count, the sidecar needs privileges that most admission policies rightly block by default, and the sidecar's lifecycle needs to be coordinated with the application container's startup ordering.

eBPF-based detection runs as a privileged DaemonSet at the node level: one agent per node, not one agent per pod. The agent attaches probes at the kernel level and gets visibility into all containers running on that node by tagging each event with the cgroup it came from. The agent then resolves that cgroup to a container ID, pod name, and K8s namespace using the cgroup hierarchy that the kernel maintains. You get full multi-tenant visibility from a single agent process, with no per-pod injection footprint.
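One way the cgroup-to-pod mapping might look in user space is sketched below. The two path shapes reflect common kubelet layouts (systemd and cgroupfs drivers), but exact formats vary by runtime and version, so treat the regular expressions, the parse_cgroup_path and enrich helpers, and the pod_index lookup table as assumptions for illustration.

```python
import re

# Pod UID after "pod"; the systemd cgroup driver writes "_" where the UID has "-".
POD_UID_RE = re.compile(
    r"pod([0-9a-f]{8}[-_][0-9a-f]{4}[-_][0-9a-f]{4}[-_][0-9a-f]{4}[-_][0-9a-f]{12})"
)
# 64-hex container ID as the last path component, optionally with ".scope".
CONTAINER_ID_RE = re.compile(r"([0-9a-f]{64})(?:\.scope)?$")


def parse_cgroup_path(path):
    """Extract pod UID and container ID from a cgroup path, or return None."""
    pod = POD_UID_RE.search(path)
    container = CONTAINER_ID_RE.search(path)
    if not pod or not container:
        return None
    return {
        "pod_uid": pod.group(1).replace("_", "-"),
        "container_id": container.group(1),
    }


def enrich(event, pod_index):
    """Attach pod name and namespace using a hypothetical UID -> metadata table."""
    ids = parse_cgroup_path(event["cgroup_path"])
    if ids is None:
        return {**event, "pod": None, "namespace": None}
    meta = pod_index.get(ids["pod_uid"], {})
    return {**event, **ids, "pod": meta.get("name"), "namespace": meta.get("namespace")}
```

Here pod_index stands in for whatever cache the agent keeps of pod metadata. Events from host processes that are not in a pod cgroup come back with pod set to None rather than being dropped.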

We're not saying the DaemonSet model is without tradeoffs. Node-level agents require more careful RBAC scoping, and a misconfigured privileged DaemonSet has a broader blast radius than a misconfigured sidecar. The point is that for K8s environments running hundreds of pods per node, the operational math of sidecars doesn't work, and it never worked at that density.

What You Can Actually Detect: MITRE ATT&CK Mapping in Practice

The value of syscall-level visibility is that it covers the MITRE ATT&CK for Containers techniques that matter most in practice. Consider a container escape scenario where an attacker exploits a misconfigured privileged container to write to the host filesystem. The sequence produces a distinct syscall pattern: an open with a path outside the container root (relative to the mount namespace), followed by a write, with process metadata showing that the writer runs in a container's PID namespace while the target file belongs to a different mount namespace. A filesystem-based HIDS misses this whenever the attacker avoids the specific files the HIDS watches.

Credential access through the Kubernetes API (T1552.007, Unsecured Credentials: Container API, in ATT&CK for Containers), which often precedes lateral movement within a K8s cluster, produces network syscall patterns: unexpected connect calls from a non-API-client container toward the API server endpoint, often accompanied by credential material accessed from mounted ServiceAccount token paths. An eBPF probe on tcp_connect combined with a tracepoint on openat for paths matching /var/run/secrets/kubernetes.io/serviceaccount/ gives you this correlation in a single detection rule without any network tap or sidecar.
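A correlation rule of this kind could be sketched as follows. This is a user-space illustration over already-decoded events, not a production rule: the event dictionary shape, the API server address, the 30-second window, and the allowlist are all assumptions, and only the token path prefix comes from the text above.

```python
TOKEN_PREFIX = "/var/run/secrets/kubernetes.io/serviceaccount/"
API_SERVER = ("10.96.0.1", 443)  # assumed ClusterIP; differs per cluster
WINDOW_SECONDS = 30.0  # hypothetical correlation window
ALLOWED_API_CLIENTS = {"kube-system/coredns"}  # hypothetical allowlist


def detect_token_then_api_connect(events):
    """Flag containers that read a ServiceAccount token and then connect
    to the API server within WINDOW_SECONDS."""
    last_token_read = {}  # container_id -> timestamp of latest token open
    alerts = []
    for ev in sorted(events, key=lambda e: e["ts"]):
        if ev["workload"] in ALLOWED_API_CLIENTS:
            continue  # known API clients are expected to do this
        if ev["type"] == "openat" and ev["path"].startswith(TOKEN_PREFIX):
            last_token_read[ev["container_id"]] = ev["ts"]
        elif ev["type"] == "connect" and (ev["daddr"], ev["dport"]) == API_SERVER:
            read_ts = last_token_read.get(ev["container_id"])
            if read_ts is not None and ev["ts"] - read_ts <= WINDOW_SECONDS:
                alerts.append(
                    {
                        "container_id": ev["container_id"],
                        "workload": ev["workload"],
                        "token_read_ts": read_ts,
                        "connect_ts": ev["ts"],
                    }
                )
    return alerts
```

Neither event alone raises an alert; only the token read followed by the connect from the same container does. In practice the allowlist carries most of the tuning burden, since many legitimate controllers talk to the API server.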

Process injection (T1055) maps to ptrace, process_vm_writev, and memfd_create syscalls from unexpected parent processes. These are straightforward kprobe targets. The process ancestry tree — parent PID chain back to container entry point — is the disambiguating signal between legitimate tooling and attacker activity.
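To make the ancestry check concrete, here is a minimal Python sketch over decoded events. The syscall names come from the paragraph above; the allowlist of debugger names, the parents and comms lookup tables, and the function names are hypothetical.

```python
INJECTION_SYSCALLS = {"ptrace", "process_vm_writev", "memfd_create"}
EXPECTED_ANCESTORS = {"gdb", "strace", "dlv"}  # hypothetical tooling allowlist


def ancestry(pid, parents, comms, max_depth=32):
    """Return process names from pid up the parent chain, nearest first."""
    chain = []
    seen = set()
    while pid in comms and pid not in seen and len(chain) < max_depth:
        seen.add(pid)  # guards against cycles in a stale process table
        chain.append(comms[pid])
        pid = parents.get(pid)
    return chain


def is_suspicious(event, parents, comms):
    """True when an injection-related syscall has no expected tool in its ancestry."""
    if event["syscall"] not in INJECTION_SYSCALLS:
        return False
    chain = ancestry(event["pid"], parents, comms)
    return not any(name in EXPECTED_ANCESTORS for name in chain)
```

A ptrace call issued by a debugger, or by a child of one, passes; the same call from a shell spawned under a web server does not. Matching on process name alone is easy to spoof, so a real rule would likely also check the binary path or image digest.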

Overhead: What "Under 2ms P99" Actually Means

eBPF probes are not free. Each kprobe or tracepoint adds a small number of instructions to every kernel call site hit. The overhead budget you have is roughly: the probe executes in kernel context, reads from task struct (process metadata), writes to an eBPF map, and returns. On a modern CPU with warm L1 cache, this is 50-200 nanoseconds per syscall event for a minimal probe.

The P99 latency overhead figure of under 2ms that appears in the platform stats refers to the tail-end impact on application syscalls — measured as the delta in P99 request latency for a representative HTTP workload under continuous tracing. This figure comes from a benchmark configuration running a Go HTTP service at 10k req/s on a 4-core node, with all five core syscall families traced simultaneously (execve, openat, connect, accept, clone). Your actual numbers will vary by workload syscall density — services doing heavy filesystem I/O will see proportionally more overhead than services that are mostly compute-bound.
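A back-of-envelope calculation shows how per-probe cost turns into per-request cost. The 200 ns upper bound and the 10k req/s rate come from the text; the count of 40 traced syscalls per request is an assumption chosen only to make the arithmetic concrete, and the function names are hypothetical.

```python
def probe_overhead_us(syscalls_per_request, ns_per_probe_hit):
    """Added latency per request in microseconds, assuming probes run serially."""
    return syscalls_per_request * ns_per_probe_hit / 1000.0


def cpu_fraction(requests_per_second, syscalls_per_request, ns_per_probe_hit):
    """Fraction of one CPU core spent executing probes."""
    probe_ns_per_second = requests_per_second * syscalls_per_request * ns_per_probe_hit
    return probe_ns_per_second / 1e9


# Assumed workload: 40 traced syscalls per request at the 200 ns upper bound.
per_request_us = probe_overhead_us(syscalls_per_request=40, ns_per_probe_hit=200)
core_share = cpu_fraction(10_000, 40, 200)
```

Under these assumptions the probes add 8 microseconds per request and consume about 8% of one core at 10k req/s. This counts only in-kernel probe execution; user-space draining and enrichment are extra and are not modeled here.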

What this means practically: for most production K8s workloads, the eBPF agent overhead is within the noise of normal request latency variance. Because the BPF verifier only accepts programs whose runtime is provably bounded, there's no risk of a buggy probe spinning a CPU core. The worst case is a program that is rejected at load time or exits early with an error, not an infinite loop in kernel context.

The Gap That SBOM and Static Scanning Don't Fill

Runtime detection via eBPF operates in a completely separate detection layer from SBOM generation and static container image scanning. It's worth being explicit about why both are necessary rather than treating them as substitutes.

Consider a mid-size SaaS platform running 90 K8s nodes across two regions — roughly 600 pods in steady state. Their static scanner flagged 340 CVEs in the previous sprint's CI builds. Their SBOM export covers every package in every container image, and their SCA tool correlates those packages against the NVD. What static analysis cannot tell you is whether a memory-corruption vulnerability in a dependency is reachable from the network entry point of the specific service where it lives — because static reachability analysis at that fidelity requires taint tracking across service boundaries, which no static tool does reliably at container scale.

What eBPF runtime detection adds is the behavioral ground truth: which binaries actually execute, which network connections actually happen, which file paths are actually opened. Combined with SBOM data, you get a filter that says "this CVE is in a library, and the library's vulnerable symbol was never called in production over the past 30 days" — a statement you cannot make from static analysis alone. That's the correlation signal that makes the 340-CVE backlog tractable.
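The filter described here can be approximated with a simple join between scanner findings and runtime observations. The sketch below works at library granularity (was the library ever loaded by that service), which is coarser than the symbol-level check the paragraph describes; the record shapes, the triage function, and the placeholder CVE identifiers are all hypothetical.

```python
def triage(cve_findings, loaded_libraries):
    """Split findings by whether the affected library was observed at runtime.

    cve_findings: list of dicts with "cve", "service", and "library" keys.
    loaded_libraries: set of (service, library) pairs seen by the runtime agent.
    Returns (observed, unobserved) lists of CVE identifiers in input order.
    """
    observed, unobserved = [], []
    for finding in cve_findings:
        key = (finding["service"], finding["library"])
        if key in loaded_libraries:
            observed.append(finding["cve"])
        else:
            unobserved.append(finding["cve"])
    return observed, unobserved


# Placeholder identifiers, not real CVEs.
findings = [
    {"cve": "CVE-EXAMPLE-1", "service": "cart", "library": "libxml2.so.2"},
    {"cve": "CVE-EXAMPLE-2", "service": "cart", "library": "libpng16.so.16"},
    {"cve": "CVE-EXAMPLE-3", "service": "web", "library": "libxml2.so.2"},
]
seen_at_runtime = {("cart", "libxml2.so.2")}
observed, unobserved = triage(findings, seen_at_runtime)
```

Only the first finding lands in the observed list, because the same library in a different service was never seen loading. The unobserved list is a deprioritization signal rather than a dismissal, since absence of evidence depends on how long the observation window was.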

The combination isn't just about efficiency. A runtime behavioral anomaly that doesn't match any known CVE pattern — novel exploitation of a logic bug, credential stuffing via a legitimate dependency — is visible to eBPF tracing and invisible to any static scanner. The two layers cover different threat models and need to run in parallel, not in sequence.

