Last updated: March 22, 2026

Claude vs GPT-4 for Shell Scripting 2026

Shell scripts are unforgiving. A missing quote, an unset variable check, or a rm -rf without a guard can cause real damage. This comparison tests Claude and GPT-4o on five real-world scripting tasks.

Test 1: Safe File Processing Script

Prompt:

Write a Bash script that processes all .log files in a directory.
For each file: count lines, find unique IPs (format X.X.X.X), write results to a CSV.
The script should: use strict mode, handle missing directory argument,
not fail if no .log files exist, be POSIX-portable where possible.

Claude Output

#!/usr/bin/env bash
set -euo pipefail
IFS=$'\n\t'

usage() {
    echo "Usage: $(basename "$0") <log-directory>" >&2
    echo "  Processes .log files and writes results to results.csv" >&2
    exit 1
}

[[ $# -ne 1 ]] && usage

LOG_DIR="${1}"
OUTPUT_CSV="${LOG_DIR}/results.csv"

if [[ ! -d "${LOG_DIR}" ]]; then
    echo "Error: '${LOG_DIR}' is not a directory" >&2
    exit 1
fi

# Write CSV header
echo "filename,line_count,unique_ip_count,unique_ips" > "${OUTPUT_CSV}"

# Collect .log files; handle case where none exist
shopt -s nullglob
log_files=("${LOG_DIR}"/*.log)
shopt -u nullglob

if [[ ${#log_files[@]} -eq 0 ]]; then
    echo "No .log files found in '${LOG_DIR}'" >&2
    echo "Empty report written to ${OUTPUT_CSV}"
    exit 0
fi

for log_file in "${log_files[@]}"; do
    filename="$(basename "${log_file}")"
    line_count=$(wc -l < "${log_file}")

    # Extract and deduplicate IPs
    unique_ips=$(grep -oE '\b([0-9]{1,3}\.){3}[0-9]{1,3}\b' "${log_file}" \
        | sort -u \
        | tr '\n' '|' \
        | sed 's/|$//')

    ip_count=$(grep -oE '\b([0-9]{1,3}\.){3}[0-9]{1,3}\b' "${log_file}" \
        | sort -u \
        | wc -l)

    # Quote the IP list in case it contains special chars
    printf '%s,%d,%d,"%s"\n' \
        "${filename}" "${line_count}" "${ip_count}" "${unique_ips}" \
        >> "${OUTPUT_CSV}"
done

echo "Results written to ${OUTPUT_CSV} (${#log_files[@]} files processed)"

Claude uses set -euo pipefail, IFS=$'\n\t', shopt -s nullglob for the empty directory case, properly quoted variables throughout, and printf for safe CSV output (not echo). The IP regex is standard and portable.

GPT-4o Output

GPT-4o produces a functional script but:

Omits IFS setting (word splitting issues on filenames with spaces)
Uses ls *.log instead of glob expansion (fails on spaces in names)
Missing shopt -s nullglob (script errors if no .log files exist)
Uses echo for CSV output without quoting

Verdict: Claude — significantly safer for production use.

Test 2: Deployment Script with Rollback

Prompt:

Write a deployment script that:
Pulls a Docker image (IMAGE_TAG from env)
Runs smoke tests (curl check) against the new container
If smoke tests pass, stops the old container and starts the new one
If smoke tests fail, removes the new container and exits non-zero
Logs all actions with timestamps

Claude Output

#!/usr/bin/env bash
set -euo pipefail

: "${IMAGE_TAG:?IMAGE_TAG must be set}"
: "${SERVICE_NAME:?SERVICE_NAME must be set}"
: "${PORT:=8080}"

SMOKE_URL="http://localhost:${PORT}/health"
NEW_CONTAINER="${SERVICE_NAME}-new"
OLD_CONTAINER="${SERVICE_NAME}"
LOG_FILE="/var/log/deploy-${SERVICE_NAME}.log"

log() {
    local level="${1}"; shift
    printf '[%s] [%s] %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "${level}" "$*" \
        | tee -a "${LOG_FILE}"
}

cleanup_new() {
    log "WARN" "Removing failed new container ${NEW_CONTAINER}"
    docker rm -f "${NEW_CONTAINER}" 2>/dev/null || true
}
trap cleanup_new ERR

log "INFO" "Pulling image ${IMAGE_TAG}"
docker pull "${IMAGE_TAG}"

log "INFO" "Starting new container ${NEW_CONTAINER}"
docker run -d \
    --name "${NEW_CONTAINER}" \
    --network bridge \
    -p "127.0.0.1:$((PORT + 1)):${PORT}" \
    "${IMAGE_TAG}"

log "INFO" "Waiting for container to start"
sleep 3

log "INFO" "Running smoke tests against ${SMOKE_URL}"
SMOKE_PORT=$((PORT + 1))

for attempt in 1 2 3; do
    if curl -sf --max-time 5 "http://localhost:${SMOKE_PORT}/health" > /dev/null; then
        log "INFO" "Smoke test passed (attempt ${attempt})"
        break
    fi
    if [[ ${attempt} -eq 3 ]]; then
        log "ERROR" "Smoke tests failed after 3 attempts"
        exit 1
    fi
    log "WARN" "Smoke test attempt ${attempt} failed, retrying in 5s"
    sleep 5
done

log "INFO" "Stopping old container ${OLD_CONTAINER}"
docker stop "${OLD_CONTAINER}" 2>/dev/null || true
docker rm "${OLD_CONTAINER}" 2>/dev/null || true

log "INFO" "Renaming ${NEW_CONTAINER} to ${OLD_CONTAINER}"
docker rename "${NEW_CONTAINER}" "${OLD_CONTAINER}"

trap - ERR
log "INFO" "Deployment complete"

Key details Claude gets right: trap cleanup_new ERR for automatic rollback on any failure, publishing the new container to a different port for smoke testing (doesn’t interfere with running service), ${VAR:?message} for required env vars, and || true on cleanup commands so they don’t abort.

GPT-4o wrote a similar script but forgot the trap and tested against the production port instead of an isolated one.

Test 3: Argument Parsing

Prompt:

Write a Bash script with long options: --input FILE (required), --output DIR (default: /tmp),
--verbose (flag), --format [json|csv|tsv] (default: json). Show usage on --help.

#!/usr/bin/env bash
set -euo pipefail

VERBOSE=false
FORMAT="json"
OUTPUT="/tmp"
INPUT=""

usage() {
    cat <<EOF
Usage: $(basename "$0") --input FILE [OPTIONS]

Options:
  --input FILE      Input file (required)
  --output DIR      Output directory (default: /tmp)
  --format FORMAT   Output format: json, csv, tsv (default: json)
  --verbose         Enable verbose output
  --help            Show this help

EOF
    exit "${1:-0}"
}

while [[ $# -gt 0 ]]; do
    case "${1}" in
        --input)    INPUT="${2}"; shift 2 ;;
        --output)   OUTPUT="${2}"; shift 2 ;;
        --format)   FORMAT="${2}"; shift 2 ;;
        --verbose)  VERBOSE=true; shift ;;
        --help)     usage 0 ;;
        *)          echo "Unknown option: ${1}" >&2; usage 1 ;;
    esac
done

[[ -z "${INPUT}" ]] && { echo "Error: --input is required" >&2; usage 1; }
[[ ! -f "${INPUT}" ]] && { echo "Error: '${INPUT}' is not a file" >&2; exit 1; }
[[ "${FORMAT}" =~ ^(json|csv|tsv)$ ]] || {
    echo "Error: invalid format '${FORMAT}'" >&2; exit 1
}

${VERBOSE} && echo "Processing ${INPUT} → ${OUTPUT} (format: ${FORMAT})" >&2

Both Claude and GPT-4o handle argument parsing well. Claude’s version adds input validation (file existence check, format enum validation) without being asked — GPT-4o skips these unless specified.

Test 4: Process Monitor

Prompt:

Write a script that monitors a process by name, restarts it if it crashes,
with a maximum of 5 restarts in 60 seconds (circuit breaker pattern). Log everything.

Claude implements the circuit breaker correctly using an array of timestamps and checking how many falls within the window. GPT-4o uses a simple counter without the time window — it would restart 5 times quickly then refuse to restart after a single recovery.

Claude’s full process monitor:

#!/usr/bin/env bash
set -euo pipefail

PROCESS_NAME="${1:?Usage: $0 <process-name>}"
MAX_RESTARTS=5
WINDOW_SECONDS=60
LOG_FILE="/var/log/monitor-${PROCESS_NAME}.log"

log() {
    printf '[%s] %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$*" | tee -a "${LOG_FILE}"
}

# Ring buffer of restart timestamps
restart_times=()

is_running() {
    pgrep -x "${PROCESS_NAME}" > /dev/null 2>&1
}

count_restarts_in_window() {
    local now
    now=$(date +%s)
    local cutoff=$(( now - WINDOW_SECONDS ))
    local count=0
    for ts in "${restart_times[@]:-}"; do
        [[ -n "${ts}" && "${ts}" -gt "${cutoff}" ]] && (( count++ )) || true
    done
    echo "${count}"
}

start_process() {
    log "INFO" "Starting ${PROCESS_NAME}"
    "${PROCESS_NAME}" &
    log "INFO" "${PROCESS_NAME} started (PID $!)"
}

log "INFO" "Monitor starting for process: ${PROCESS_NAME}"

while true; do
    if ! is_running; then
        log "WARN" "${PROCESS_NAME} is not running"

        recent=$(count_restarts_in_window)
        if [[ "${recent}" -ge "${MAX_RESTARTS}" ]]; then
            log "ERROR" "Circuit open: ${recent} restarts in ${WINDOW_SECONDS}s. Backing off 120s."
            sleep 120
            restart_times=()  # Reset after backoff
            continue
        fi

        start_process
        restart_times+=("$(date +%s)")
    fi
    sleep 5
done

This is a meaningful difference: the time-windowed restart counter prevents the monitor from permanently locking out a process that had a burst of failures but is now stable, while still protecting against rapid crash loops.

Test 5: Parallel Task Runner

Prompt:

Write a Bash script that runs up to N jobs in parallel, reads commands from stdin,
captures each job's exit code and output, and prints a summary at the end.

Claude’s approach uses process substitution and a job queue:

#!/usr/bin/env bash
set -euo pipefail

MAX_PARALLEL="${1:-4}"
declare -A job_pids=()
declare -A job_outputs=()
declare -A job_statuses=()

run_job() {
    local job_id="${1}"
    local cmd="${2}"
    local tmpfile
    tmpfile=$(mktemp)

    if eval "${cmd}" > "${tmpfile}" 2>&1; then
        job_statuses["${job_id}"]="OK"
    else
        job_statuses["${job_id}"]="FAIL:$?"
    fi
    job_outputs["${job_id}"]="${tmpfile}"
}

job_id=0
running=0

while IFS= read -r cmd; do
    [[ -z "${cmd}" || "${cmd}" == \#* ]] && continue

    run_job "${job_id}" "${cmd}" &
    job_pids["${job_id}"]=$!
    (( job_id++ ))
    (( running++ ))

    # Wait if at capacity
    if [[ "${running}" -ge "${MAX_PARALLEL}" ]]; then
        wait -n 2>/dev/null || wait  # wait -n requires Bash 4.3+
        (( running-- ))
    fi
done

# Wait for remaining jobs
wait

# Print summary
echo ""
echo "=== Job Summary ==="
pass=0; fail=0
for id in $(seq 0 $(( job_id - 1 ))); do
    status="${job_statuses[${id}]:-UNKNOWN}"
    output_file="${job_outputs[${id}]:-}"
    if [[ "${status}" == "OK" ]]; then
        echo "[PASS] Job ${id}"
        (( pass++ ))
    else
        echo "[FAIL] Job ${id} (${status})"
        [[ -f "${output_file}" ]] && cat "${output_file}"
        (( fail++ ))
    fi
    [[ -f "${output_file}" ]] && rm -f "${output_file}"
done
echo "Result: ${pass} passed, ${fail} failed out of ${job_id} jobs"

GPT-4o tends to use xargs -P for parallelism, which is simpler but doesn’t capture per-job output or exit codes separately. Claude’s version maintains full observability of each job.

Summary Scorecard

Test	Claude	GPT-4o	Notes
Safe file processing	Excellent	Good	Claude adds nullglob, IFS, printf
Deployment + rollback	Excellent	Good	Claude’s trap/port isolation is correct
Argument parsing	Excellent	Good	Claude validates without being asked
Process monitor (circuit breaker)	Excellent	Partial	GPT-4o misses time-windowed counter
Parallel task runner	Excellent	Good	Claude maintains per-job observability
POSIX portability notes	Yes	Sometimes	Claude proactively notes bash-isms
Destructive operation guards	Always	Usually	Claude never writes unguarded rm -rf

Overall: Claude is the stronger shell scripting tool for production scripts. The gap is clearest in safety-critical patterns: trap-based cleanup, proper glob expansion, time-windowed circuit breakers, and guarded destructive operations. GPT-4o produces functional scripts that usually work, but lacks Claude’s habit of adding defensive layers unprompted.

For inline autocomplete in editors, GitHub Copilot remains convenient — it’s fast and context-aware within a file. But for generating complete, production-ready scripts from a prompt, Claude is the better choice in 2026.

Built by theluckystrike — More at zovo.one

Claude vs GPT-4 for Shell Scripting 2026

Test 1: Safe File Processing Script

Claude Output

GPT-4o Output

Test 2: Deployment Script with Rollback

Claude Output

Test 3: Argument Parsing

Test 4: Process Monitor

Test 5: Parallel Task Runner

Summary Scorecard

Related Reading

Claude Sonnet vs GPT-4o for Code Review Accuracy Comparison

Related Articles