Local Setup - 2026 [On Arch Linux]
As an Infrastructure engineer, my terminal isn’t just a shell — it’s my control plane. Production debugging, Kubernetes firefighting, ASG scaling checks, S3 hotfixes, SSH pivots — everything happens here. This post is generated from my actual ~/.zshrc on Arch Linux: what I really run, not a wishlist.
This setup has evolved over six years of continuous refinement, shaped by daily use across different environments & roles. It has traveled with me from Windows → Ubuntu → macOS → Arch (CachyOS), picking up useful patterns & improvements along the way.
Every function & shortcut exists because it made something faster, clearer, or more repeatable. Over time, the terminal naturally became a production cockpit - a place where context, tooling, & muscle memory come together.
The stack is intentionally simple & well-tested: Zsh, Oh My Zsh, Powerlevel10k, tmux, kubectl, k9s, fzf, AWS CLI, plus a growing set of custom functions. Nothing flashy — just efficient, consistent, & built to keep me moving quickly. Like most DevOps systems, it never really ends up finished; it just keeps growing :).
1. Shell (Zsh), theme (Powerlevel10k), & plugins
Shell is zsh; theme is Powerlevel10k. Oh My Zsh loads the theme & plugins:
export ZSH="$HOME/.oh-my-zsh"
ZSH_THEME="powerlevel10k/powerlevel10k"
plugins=(
git
history-substring-search
fzf
kubectl
kubectx
helm
docker
docker-compose
terraform
aws
systemd
sudo
archlinux
ssh-agent
)
source $ZSH/oh-my-zsh.sh
2. SSH & SSM: secure access
SSM (recommended)
For hardened or private environments I use AWS Systems Manager instead of opening port 22. ssm_fuzzy lists running EC2 instances in a region (default $AWS_DEFAULT_REGION, fallback ap-south-1), optionally filters by a query string, then uses fzf with a small preview (InstanceId, Name, IP, AZ). Selecting one runs aws ssm start-session for that instance. Fallback without fzf: numbered list + read.
ssm_fuzzy() {
local REGION="${AWS_DEFAULT_REGION:-ap-south-1}"
local QUERY=""
for arg in "$@"; do
case "$arg" in
--region=*) REGION="${arg#*=}" ;;
*) QUERY="$arg" ;;
esac
done
local INSTANCES
INSTANCES=$(aws ec2 describe-instances \
--region "$REGION" \
--filters "Name=instance-state-name,Values=running" \
--query 'Reservations[].Instances[].[
InstanceId,
Tags[?Key==`Name`]|[0].Value,
PrivateIpAddress,
Placement.AvailabilityZone
]' \
--output text)
[[ -z "$INSTANCES" ]] && echo "No running instances found." && return 1
[[ -n "$QUERY" ]] && INSTANCES=$(echo "$INSTANCES" | grep -i "$QUERY")
[[ -z "$INSTANCES" ]] && echo "No instances matched '$QUERY'" && return 1
local SELECTED=""
if command -v fzf >/dev/null && [[ -t 1 ]]; then
SELECTED=$(echo "$INSTANCES" | \
awk '{printf "%-20s %-30s %-15s %-10s\n", $1, $2, $3, $4}' | \
fzf --prompt="Select EC2 for SSM > " \
--preview "echo {} | awk '{print \"InstanceId: \" \$1 \"\nName: \" \$2 \"\nIP: \" \$3 \"\nAZ: \" \$4}'")
else
nl -w2 -s'. ' <<< "$INSTANCES"
printf 'Select number: '   # zsh's read -p means "read from coprocess", so prompt explicitly
read -r idx
SELECTED=$(echo "$INSTANCES" | sed -n "${idx}p")
fi
[[ -z "$SELECTED" ]] && return 0
local INSTANCE_ID
INSTANCE_ID=$(echo "$SELECTED" | awk '{print $1}')
echo "Starting SSM session to $INSTANCE_ID in $REGION"
aws ssm start-session --region "$REGION" --target "$INSTANCE_ID"
}
SSH (legacy only)
SSH remains only for legacy setups that are still running; it’s not recommended for production access (prefer SSM above). Where SSH is still needed, I don’t type ssh -i ~/.ssh/<key-prod> ubuntu@10.x.x.x. I use sshprod 10.x.x.x. One key per environment, with a small helper:
__ssh_with_key() {
local key="$1"
shift
if [[ "$1" == *"@"* ]]; then
ssh -i "$key" "$@"
else
ssh -i "$key" ubuntu@"$@"
fi
}
# Replace <key-dev>, <key-stag>, <key-prod>, <key-aws> with your key paths
sshdev() { __ssh_with_key ~/.ssh/<key-dev> "$@"; }
sshstag() { __ssh_with_key ~/.ssh/<key-stag> "$@"; }
sshprod() { __ssh_with_key ~/.ssh/<key-prod> "$@"; }
sshmum() { __ssh_with_key ~/.ssh/<key-aws> "$@"; }
Enforces the right key per env & a default user so I can’t accidentally hit prod with a dev key.
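The only subtle part is the user-defaulting branch. Pulled out as a standalone helper (hypothetical name, purely for illustration), the dispatch logic is easy to sanity-check:

```shell
# Mirrors the branch inside __ssh_with_key: pass an explicit user@host
# through untouched, otherwise prepend the default ubuntu user.
# (__resolve_target is a hypothetical name, not part of my rc.)
__resolve_target() {
  if [[ "$1" == *"@"* ]]; then
    echo "$1"
  else
    echo "ubuntu@$1"
  fi
}

__resolve_target 10.0.3.7        # → ubuntu@10.0.3.7
__resolve_target admin@10.0.3.7  # → admin@10.0.3.7
```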
3. Incident mode: tmux war room
One command: incident. It creates a session, splits panes, & starts k9s + kubectl + a shell.
incident() {
local SESSION="incident"
tmux new-session -d -s $SESSION -n INCIDENT
tmux split-window -v -t $SESSION
tmux split-window -h -t $SESSION:.1
tmux resize-pane -t $SESSION:.0 -y 20
tmux send-keys -t $SESSION:.0 "k9s" C-m
tmux send-keys -t $SESSION:.1 "kubectl get pods" C-m
tmux send-keys -t $SESSION:.2 "echo 'Incident shell ready'" C-m
tmux select-pane -t $SESSION:.2
tmux attach -t $SESSION
}
Top: k9s. Bottom left: pod list. Bottom right: shell. No manual pane setup during an incident.
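One refinement worth layering on (a sketch, not in my rc): tmux new-session fails if the session name already exists, so a small guard can reattach instead of erroring mid-incident:

```shell
# Reuse an existing war room instead of failing on a duplicate session name.
# (incident_attach is a sketch; incident() is the function above.)
incident_attach() {
  if tmux has-session -t incident 2>/dev/null; then
    tmux attach -t incident
  else
    incident
  fi
}
```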
4. Kubernetes helpers
These are not in the default Oh My Zsh kubectl plugin — they’re custom helpers for debugging & cleanup.
Pods with node (pod name + node name):
kpn() {
# custom-columns is steadier than awk '{print $1, $7}' here: newer kubectl
# renders RESTARTS as "3 (5m ago)", which shifts the -o wide fields
kubectl get pods -o custom-columns=NAME:.metadata.name,NODE:.spec.nodeName
}
All crashing pods (restart count > 0). Current namespace by default; -A or --all-namespaces for all:
kcrash() {
if [[ "$1" == "-A" ]] || [[ "$1" == "--all-namespaces" ]]; then
kubectl get pods -A -o wide | awk 'NR==1 || $5+0>0'
else
kubectl get pods -o wide | awk 'NR==1 || $4+0>0'
fi
}
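One thing worth knowing about the $5+0>0 trick: newer kubectl renders the RESTARTS column as e.g. 7 (3m ago), which adds extra whitespace-separated fields, but awk's numeric coercion still reads the leading digits of $5, so the filter holds. A quick check against fabricated rows (not real cluster output):

```shell
# Fabricated `kubectl get pods -A`-style rows to exercise the kcrash filter.
sample='NAMESPACE NAME READY STATUS RESTARTS AGE
default api-7f9 1/1 Running 0 2d
default worker-x 0/1 CrashLoopBackOff 7 (3m ago) 2d'
crashed=$(echo "$sample" | awk 'NR==1 || $5+0>0')
echo "$crashed"   # header + the worker-x row only
```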
Pods not ready (anything not Running or Completed). Add -A for all namespaces:
knotready() {
if [[ "$1" == "-A" ]] || [[ "$1" == "--all-namespaces" ]]; then
kubectl get pods -A -o wide | awk 'NR==1 || ($4!="Running" && $4!="Completed")'
else
kubectl get pods -o wide | awk 'NR==1 || ($3!="Running" && $3!="Completed")'
fi
}
Pods in bad states (ImagePullBackOff, CrashLoopBackOff, Error, Pending, ErrImagePull). -A for all namespaces:
kbad() {
if [[ "$1" == "-A" ]] || [[ "$1" == "--all-namespaces" ]]; then
kubectl get pods -A -o wide | awk 'NR==1 || $4~/ImagePullBackOff|CrashLoopBackOff|Error|Pending|ErrImagePull/'
else
kubectl get pods -o wide | awk 'NR==1 || $3~/ImagePullBackOff|CrashLoopBackOff|Error|Pending|ErrImagePull/'
fi
}
Pods sorted by restart count (worst first). -A for all namespaces:
krestarts() {
local wide
if [[ "$1" == "-A" ]] || [[ "$1" == "--all-namespaces" ]]; then
wide=$(kubectl get pods -A -o wide)
echo "$wide" | head -1
echo "$wide" | tail -n +2 | sort -k5,5 -rn  # no -t' ': runs of spaces would create empty fields & shift the key
else
wide=$(kubectl get pods -o wide)
echo "$wide" | head -1
echo "$wide" | tail -n +2 | sort -k4,4 -rn  # no -t' ': runs of spaces would create empty fields & shift the key
fi
}
Logs from the previous container instance (handy after a crash). The Oh My Zsh kubectl plugin has kl but nothing for --previous:
klprev() {
kubectl logs "$1" --previous "${@:2}"
}
Delete evicted pods in current namespace (or -A for all). Safe cleanup after node pressure:
kevict() {
if [[ "$1" == "-A" ]] || [[ "$1" == "--all-namespaces" ]]; then
kubectl get pods -A | awk 'NR>1 && $4~/Evicted/{print $1, $2}' | while read -r ns name; do kubectl delete pod -n "$ns" "$name"; done
else
kubectl get pods | awk 'NR>1 && $3~/Evicted/{print $1}' | while read -r name; do kubectl delete pod "$name"; done
fi
}
Even with all of this, I still use k9s a lot — but k9s helps me see what’s happening; the shell is where I encode what needs to happen next & where I can dig into things at ground level.
5. AWS: ASG & EC2
EC2 by name tag:
ec2_find() {
local NAME="$1"
local REGION="${2:-${AWS_DEFAULT_REGION:-ap-south-1}}"
if [[ -z "$NAME" ]]; then
echo "Usage: ec2_find <name-substring> [region]"
return 1
fi
aws ec2 describe-instances \
--region "$REGION" \
--filters "Name=tag:Name,Values=*${NAME}*" \
--query 'Reservations[].Instances[].{
ID: InstanceId,
Name: Tags[?Key==`Name`]|[0].Value,
State: State.Name,
IP: PrivateIpAddress,
AZ: Placement.AvailabilityZone
}' \
--output table
}
ASG instances (region from $AWS_DEFAULT_REGION, fallback ap-south-1):
asg_instances() {
local ASG="$1"
local REGION="${2:-${AWS_DEFAULT_REGION:-ap-south-1}}"
if [[ -z "$ASG" ]]; then
echo "Usage: asg_instances <asg-name> [region]"
return 1
fi
aws ec2 describe-instances \
--region "$REGION" \
--filters "Name=tag:aws:autoscaling:groupName,Values=$ASG" \
--query 'Reservations[].Instances[].{
InstanceId: InstanceId,
Name: Tags[?Key==`Name`]|[0].Value,
PrivateIP: PrivateIpAddress,
State: State.Name,
LaunchTime: LaunchTime
}' \
--output table
}
ASG fuzzy (fzf):
asg_fuzzy() {
local REGION="${AWS_DEFAULT_REGION:-ap-south-1}"
local WATCH=""
for arg in "$@"; do
case "$arg" in
--watch) WATCH=1 ;;
--region=*) REGION="${arg#*=}" ;;
esac
done
while true; do
local ASG_LIST
ASG_LIST=$(aws autoscaling describe-auto-scaling-groups --region "$REGION" \
--query 'AutoScalingGroups[].AutoScalingGroupName' --output text | tr '\t' '\n')
[[ -z "$ASG_LIST" ]] && echo "No ASGs in $REGION" && return 1
local SELECTED
if command -v fzf >/dev/null && [[ -t 1 ]]; then
SELECTED=$(echo "$ASG_LIST" | fzf --prompt="ASG > " \
--preview "aws autoscaling describe-auto-scaling-groups --region $REGION --auto-scaling-group-names {} \
--query 'AutoScalingGroups[0].[DesiredCapacity,MinSize,MaxSize,length(Instances)]' --output text 2>/dev/null || echo {}")
else
echo "$ASG_LIST" | nl -w2 -s'. '
printf 'Select number: '   # zsh's read -p means "read from coprocess", so prompt explicitly
read -r idx
SELECTED=$(echo "$ASG_LIST" | sed -n "${idx}p")
fi
[[ -z "$SELECTED" ]] && return 0
asg_instances "$SELECTED" "$REGION"
[[ -z "$WATCH" ]] && return 0
echo "Refreshing in 5s (Ctrl-C to exit)..."
sleep 5
done
}
To recap asg_fuzzy: fzf over ASG names, a preview showing desired/min/max/instance count, & an optional --watch that re-runs the whole flow every 5s. Ctrl-C exits; in watch mode it drops you back at the ASG list. Region default is the same as elsewhere.
6. S3: edit in place
s3edit() {
local S3PATH="$1"
if [[ -z "$S3PATH" ]]; then
echo "Usage: s3edit s3://bucket/key"
return 1
fi
if [[ "$S3PATH" != s3://* ]]; then
echo "Invalid S3 path. Must start with s3://"
return 1
fi
local TMPFILE
TMPFILE=$(mktemp /tmp/s3edit.XXXXXX)
echo "Downloading $S3PATH ..."
if ! aws s3 cp "$S3PATH" "$TMPFILE"; then
echo "Download failed."
rm -f "$TMPFILE"
return 1
fi
${EDITOR:-vim} "$TMPFILE"
echo "Uploading back to $S3PATH ..."
aws s3 cp "$TMPFILE" "$S3PATH"
rm -f "$TMPFILE"
echo "Done."
}
Download → edit with $EDITOR (vim fallback) → upload back. No manual copy/paste.
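One guard I’d consider adding (a sketch, not what’s in my rc): checksum the file before & after editing & skip the upload when nothing changed, so an aborted edit can’t rewrite the object:

```shell
# Like s3edit, but only uploads when the file actually changed.
# (s3edit_safe is a hypothetical name, not in my rc.)
s3edit_safe() {
  local S3PATH="$1" TMPFILE BEFORE AFTER
  [[ "$S3PATH" == s3://* ]] || { echo "Usage: s3edit_safe s3://bucket/key"; return 1; }
  TMPFILE=$(mktemp /tmp/s3edit.XXXXXX)
  aws s3 cp "$S3PATH" "$TMPFILE" || { rm -f "$TMPFILE"; return 1; }
  BEFORE=$(cksum < "$TMPFILE")    # checksum before editing
  ${EDITOR:-vim} "$TMPFILE"
  AFTER=$(cksum < "$TMPFILE")     # checksum after editing
  if [[ "$BEFORE" == "$AFTER" ]]; then
    echo "No changes; skipping upload."
  else
    aws s3 cp "$TMPFILE" "$S3PATH"
  fi
  rm -f "$TMPFILE"
}
```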
Design Philosophy Behind This Setup
Speed — Less typing. Faster execution.
Context Awareness — Git, K8s, AWS context visible at all times.
Incident Efficiency — Pre-built layouts & crash detection.
Reduced Human Error — Environment-based SSH keys, region defaults, IAM-based SSM.
Console Independence — Almost no dependency on AWS Console, Kubernetes Dashboard, or S3 UI. Everything happens in the terminal.
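The context-awareness point is mostly Powerlevel10k doing the work: it ships kubecontext & aws prompt segments (those are real segment names; the list below is an illustrative subset, not my actual generated ~/.p10k.zsh):

```shell
# Right-hand prompt: kube context & AWS profile visible at all times.
# (Illustrative subset of a p10k-configure-generated ~/.p10k.zsh.)
typeset -g POWERLEVEL9K_RIGHT_PROMPT_ELEMENTS=(
  status        # exit code of the last command
  kubecontext   # current kubectl context/namespace
  aws           # active AWS profile
  time
)
```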
A Note on How I Use AI (More on This Soon)
Beyond the shell itself, I rely on AI for a large part of my day-to-day work - from coding & refactoring to incident RCA, log exploration, & pattern detection during outages. I also use AI for metrics-driven analysis across databases & applications by connecting AI agents directly to the observability & APM stack from the terminal, turning raw telemetry into actionable insights much faster than manual dashboards ever could.
That said, this deserves a deep dive of its own — how AI fits into a DevOps workflow, how agents reason over metrics & traces, & how this changes incident response. I've intentionally kept that for a separate post.