Error Handling

Advanced ~30 min read

Production scripts must handle errors gracefully. A script that silently continues after a failure can cause serious damage. This lesson teaches you to catch errors early, clean up properly, and provide meaningful feedback when things go wrong!

The set Options

Bash's set builtin enables critical error-handling behaviors.

#!/usr/bin/env bash
# Error Handling Demo

echo "=== Error Handling Basics ==="
echo ""

# Demo 1: Exit codes
echo "--- Exit Codes ---"

# Success command
ls /tmp >/dev/null 2>&1
echo "ls /tmp: exit code = $?"

# Failed command (directory doesn't exist)
ls /nonexistent_dir 2>/dev/null
echo "ls /nonexistent_dir: exit code = $?"

echo ""

# Demo 2: The || and && operators
echo "--- Conditional Execution ---"

# Run second command only if first fails
ls /nonexistent 2>/dev/null || echo "Directory not found (|| triggered)"

# Run second command only if first succeeds
ls /tmp >/dev/null && echo "Directory exists (&& triggered)"

echo ""

# Demo 3: Custom die function
die() {
    echo "ERROR: $*" >&2
    # In real script: exit 1
    return 1
}

echo "--- Die Function ---"
# Simulating error
die "Something went wrong" 2>&1 || true
echo ""

# Demo 4: Checking command existence
echo "--- Command Existence Check ---"

check_command() {
    if command -v "$1" &>/dev/null; then
        echo "✓ $1 found"
        return 0
    else
        echo "✗ $1 NOT found"
        return 1
    fi
}

check_command "bash"
check_command "grep"
check_command "fake_command_xyz"

echo ""

# Demo 5: Input validation
echo "--- Input Validation ---"

validate_file() {
    local file="$1"

if [[ -z "$file" ]]; then
        echo "Error: No file specified" >&2
        return 1
    fi

if [[ ! -f "$file" ]]; then
        echo "Error: File not found: $file" >&2
        return 1
    fi

if [[ ! -r "$file" ]]; then
        echo "Error: File not readable: $file" >&2
        return 1
    fi

echo "✓ File valid: $file"
    return 0
}

# Create test file
test_file="/tmp/error_test.txt"
echo "test" > "$test_file"

validate_file "$test_file"
validate_file "/nonexistent/file.txt" 2>&1 || true
validate_file "" 2>&1 || true

rm -f "$test_file"

echo ""

# Demo 6: Safe defaults
echo "--- Safe Default Values ---"

# Using ${var:-default} pattern
config_file="${CONFIG_FILE:-/etc/default.conf}"
timeout="${TIMEOUT:-30}"
verbose="${VERBOSE:-false}"

echo "config_file: $config_file"
echo "timeout: $timeout"
echo "verbose: $verbose"

echo ""
echo "=== Key Takeaways ==="
echo "1. Always check exit codes"
echo "2. Use || and && for conditional execution"
echo "3. Validate inputs before using them"
echo "4. Provide meaningful error messages"
echo "5. Use default values with \${var:-default}"

Output

Click Run to execute your code

Essential Set Options

Option	Effect	Why Use It
`set -e`	Exit on error	Stops script when any command fails
`set -u`	Exit on undefined variable	Catches typos and missing variables
`set -o pipefail`	Pipeline returns rightmost error	Catches errors in pipelines
`set -E`	ERR trap inherited by functions	Error handlers work in functions

The Standard Header: Start every production script with:

#!/usr/bin/env bash
set -euo pipefail

This catches most common errors automatically.

Exit Codes

Every command returns an exit code. Use them to detect and communicate errors.

# Standard exit codes
# 0   - Success
# 1   - General error
# 2   - Misuse of command
# 126 - Permission denied
# 127 - Command not found
# 128+N - Fatal error signal N

# Check exit code with $?
command
if [[ $? -ne 0 ]]; then
    echo "Command failed"
fi

# Better: use if directly
if ! command; then
    echo "Command failed"
fi

# Define custom exit codes
readonly EXIT_SUCCESS=0
readonly EXIT_ERROR=1
readonly EXIT_USAGE=2
readonly EXIT_CONFIG=3

die() {
    echo "ERROR: $*" >&2
    exit "${EXIT_ERROR}"
}

# Return specific codes
validate_config() {
    [[ -f "$config" ]] || return $EXIT_CONFIG
    return $EXIT_SUCCESS
}

# Check specific failure
if ! validate_config; then
    case $? in
        $EXIT_CONFIG) echo "Configuration error" ;;
        *)            echo "Unknown error" ;;
    esac
fi

Documenting Exit Codes: Always document your script's exit codes in the header comments. This helps users understand and script around your tool.

The trap Command

Use trap to execute cleanup code when your script exits or receives signals.

#!/usr/bin/env bash
# Trap Command Demo

echo "=== Trap Command Demo ==="
echo ""

# Demo 1: Basic EXIT trap
echo "--- EXIT Trap ---"

demo_exit_trap() {
    # Define cleanup function
    cleanup() {
        echo "  [TRAP] Cleanup executed!"
    }

# Set trap
    trap cleanup EXIT

echo "  Running some code..."
    echo "  Exiting function..."

# Cleanup runs automatically when function exits
}

# Run in subshell to isolate trap
(demo_exit_trap)

echo ""

# Demo 2: Cleanup temporary files
echo "--- Temp File Cleanup ---"

demo_temp_cleanup() {
    local temp_file
    temp_file=$(mktemp)
    echo "  Created temp file: $temp_file"

# Trap to remove temp file
    trap "rm -f '$temp_file'; echo '  [TRAP] Removed temp file'" EXIT

# Use the temp file
    echo "test data" > "$temp_file"
    echo "  Wrote to temp file"

# File is automatically cleaned up on exit
}

(demo_temp_cleanup)

echo ""

# Demo 3: ERR trap for error handling
echo "--- ERR Trap ---"

demo_err_trap() {
    # Error handler
    on_error() {
        echo "  [TRAP] Error occurred on line $1!"
    }

# Note: ERR trap needs set -E to work in functions
    trap 'on_error $LINENO' ERR

echo "  Running commands..."
    true  # Success
    echo "  First command succeeded"

# Simulate an error (but don't exit)
    false || echo "  Second command failed (caught with ||)"
}

demo_err_trap

echo ""

# Demo 4: Multiple signals
echo "--- Multiple Signal Handlers ---"

show_signals() {
    echo "  Common signals you can trap:"
    echo "    EXIT  - Script exits (any reason)"
    echo "    ERR   - Command returns non-zero"
    echo "    INT   - Ctrl+C (SIGINT)"
    echo "    TERM  - kill command (SIGTERM)"
    echo "    HUP   - Terminal hangup"
    echo ""
    echo "  Signals you CANNOT trap:"
    echo "    KILL (9)  - Cannot be caught"
    echo "    STOP (19) - Cannot be caught"
}

show_signals

echo ""

# Demo 5: Stack-based cleanup
echo "--- Stack-based Cleanup ---"

demo_cleanup_stack() {
    # Array to hold cleanup commands
    declare -a cleanup_stack=()

# Function to add cleanup task
    push_cleanup() {
        cleanup_stack+=("$1")
        echo "  Added cleanup: $1"
    }

# Function to run all cleanup tasks (in reverse order)
    run_cleanup() {
        echo "  [TRAP] Running cleanup stack..."
        local i
        for ((i=${#cleanup_stack[@]}-1; i>=0; i--)); do
            echo "    Executing: ${cleanup_stack[i]}"
            eval "${cleanup_stack[i]}" 2>/dev/null || true
        done
    }

trap run_cleanup EXIT

# Add some cleanup tasks
    push_cleanup "echo 'Task 1: Done'"
    push_cleanup "echo 'Task 2: Done'"
    push_cleanup "echo 'Task 3: Done'"

echo "  Simulating script work..."
}

(demo_cleanup_stack)

echo ""

# Demo 6: Trap best practices
echo "--- Best Practices ---"
cat << 'EOF'
  1. Always trap EXIT for cleanup:
     trap cleanup EXIT

2. Create temp files safely:
     temp=$(mktemp)
     trap "rm -f '$temp'" EXIT

3. Handle interrupts gracefully:
     trap 'echo "Interrupted"; exit 130' INT

4. Use trap -p to show current traps:
     trap -p

5. Reset a trap with:
     trap - SIGNAL
EOF

echo ""
echo "=== Trap Demo Complete ==="

Output

Click Run to execute your code

Common Trap Patterns

# Cleanup on exit (any exit)
cleanup() {
    rm -f "$temp_file"
    echo "Cleanup complete"
}
trap cleanup EXIT

# Handle specific signals
trap 'echo "Interrupted!"; exit 130' INT
trap 'echo "Terminated!"; exit 143' TERM

# Error handler
on_error() {
    local line="$1"
    local code="$2"
    echo "Error on line $line: exit code $code" >&2
}
trap 'on_error ${LINENO} $?' ERR

# Multiple cleanup actions
declare -a cleanup_tasks=()

add_cleanup() {
    cleanup_tasks+=("$1")
}

run_cleanup() {
    for task in "${cleanup_tasks[@]}"; do
        eval "$task" || true  # Continue even if cleanup fails
    done
}
trap run_cleanup EXIT

# Usage
temp_file=$(mktemp)
add_cleanup "rm -f '$temp_file'"

temp_dir=$(mktemp -d)
add_cleanup "rm -rf '$temp_dir'"

Common Signals

Signal	Number	Cause
`EXIT`	N/A	Any script exit (normal or error)
`ERR`	N/A	Command returns non-zero
`INT`	2	Ctrl+C (interrupt)
`TERM`	15	Termination request
`HUP`	1	Terminal closed

Error Recovery Patterns

Sometimes you want to handle errors gracefully rather than exit immediately.

# Retry with backoff
retry() {
    local max_attempts=${1:-3}
    local delay=${2:-1}
    local cmd="${@:3}"
    local attempt=1

    while [[ $attempt -le $max_attempts ]]; do
        if eval "$cmd"; then
            return 0
        fi

        echo "Attempt $attempt failed, retrying in ${delay}s..." >&2
        sleep "$delay"
        ((attempt++))
        ((delay *= 2))  # Exponential backoff
    done

    echo "All $max_attempts attempts failed" >&2
    return 1
}

# Usage
retry 3 2 curl -sf "https://api.example.com/health"

# Try/catch pattern
try() {
    local result
    result=$("$@" 2>&1) && echo "$result" || {
        echo "Error: $result" >&2
        return 1
    }
}

# Fallback chain
get_config() {
    cat "$1" 2>/dev/null ||
    cat "$HOME/.config/app.conf" 2>/dev/null ||
    cat "/etc/app/default.conf" 2>/dev/null ||
    echo "default_value"
}

# Conditional error handling
set +e  # Temporarily disable exit-on-error
risky_command
status=$?
set -e  # Re-enable

if [[ $status -ne 0 ]]; then
    handle_error $status
fi

Disabling set -e: Use set +e sparingly and only for specific commands. Always re-enable with set -e immediately after.

Assertion Functions

Build reusable assertion functions for common validation patterns.

# Assertion library
assert_not_empty() {
    local name="$1"
    local value="$2"
    [[ -n "$value" ]] || die "Assertion failed: $name must not be empty"
}

assert_file_exists() {
    local file="$1"
    [[ -f "$file" ]] || die "Assertion failed: File not found: $file"
}

assert_directory_exists() {
    local dir="$1"
    [[ -d "$dir" ]] || die "Assertion failed: Directory not found: $dir"
}

assert_command_exists() {
    local cmd="$1"
    command -v "$cmd" &>/dev/null ||
        die "Assertion failed: Command not found: $cmd"
}

assert_root() {
    [[ $EUID -eq 0 ]] || die "This script must be run as root"
}

# Usage
main() {
    assert_command_exists "curl"
    assert_file_exists "$config_file"
    assert_not_empty "API_KEY" "$API_KEY"

    # Script continues only if all assertions pass
}

Common Mistakes

1. Not checking command success

# Wrong - continues even if cd fails!
cd /nonexistent
rm -rf *  # Deletes from wrong directory!

# Correct
cd /nonexistent || exit 1
rm -rf *

# Or with set -e
set -e
cd /nonexistent  # Script exits here

2. Ignoring pipeline errors

# Wrong - only checks last command
cat missing.txt | grep pattern
echo "Status: $?"  # Shows 0 or 1 from grep, not cat!

# Correct - use pipefail
set -o pipefail
cat missing.txt | grep pattern  # Now fails if cat fails

3. Not cleaning up on error

# Wrong - temp file left behind on error
temp=$(mktemp)
process_data > "$temp"  # If this fails, temp remains
rm "$temp"

# Correct - trap ensures cleanup
temp=$(mktemp)
trap "rm -f '$temp'" EXIT
process_data > "$temp"
# temp is removed even on error

Exercise: Robust File Processor

Task: Create a script with comprehensive error handling!

Requirements:

Use set -euo pipefail
Validate input file exists
Use trap for cleanup
Provide meaningful error messages
Return appropriate exit codes

Show Solution

#!/usr/bin/env bash
set -euo pipefail

# Exit codes
readonly EXIT_SUCCESS=0
readonly EXIT_ERROR=1
readonly EXIT_USAGE=2

# Globals
temp_file=""

# Cleanup handler
cleanup() {
    [[ -n "$temp_file" && -f "$temp_file" ]] && rm -f "$temp_file"
}
trap cleanup EXIT

# Error handler
on_error() {
    echo "ERROR: Script failed at line $1" >&2
}
trap 'on_error $LINENO' ERR

# Utility functions
die() {
    echo "ERROR: $*" >&2
    exit $EXIT_ERROR
}

usage() {
    echo "Usage: $0 "
    exit $EXIT_USAGE
}

# Validation
validate_input() {
    local file="$1"

    [[ -n "$file" ]] || die "Input file required"
    [[ -f "$file" ]] || die "File not found: $file"
    [[ -r "$file" ]] || die "File not readable: $file"
}

# Main processing
process_file() {
    local input="$1"
    temp_file=$(mktemp) || die "Failed to create temp file"

    echo "Processing: $input"

    # Simulate processing
    wc -l "$input" > "$temp_file"
    cat "$temp_file"

    echo "Processing complete"
}

# Entry point
main() {
    [[ $# -eq 1 ]] || usage

    validate_input "$1"
    process_file "$1"

    exit $EXIT_SUCCESS
}

main "$@"

Summary

set -euo pipefail: Essential trio for catching errors automatically
Exit Codes: Use meaningful codes (0=success, non-zero=error types)
trap EXIT: Always clean up temporary files and resources
trap ERR: Add error context with line numbers
Assertions: Validate preconditions early with clear messages
Recovery: Implement retry logic for transient failures

What's Next?

Good error handling needs good Logging. Next, you'll learn to implement logging functions with levels, timestamps, and proper output handling!

Previous Best Practices

Next Logging

The set Options

Essential Set Options

Exit Codes

The trap Command

Common Trap Patterns

Common Signals

Error Recovery Patterns

Assertion Functions

Common Mistakes

1. Not checking command success

2. Ignoring pipeline errors

3. Not cleaning up on error

Exercise: Robust File Processor

Summary

What's Next?

Enjoying these tutorials?