90 Snippets - Counting words 🫥

90 Projects towards mastery - Day 08 👾

Hey fellow programmers, engineers, and all the tech enthusiasts!
It's Astro again, with another solved problem!

You can find the full code snippet on GitHub

Embracing the Challenge: My Tech Journey Milestone

For tech enthusiasts, every line of code we write is a step forward in our journey, and each milestone reached is a testament to our growth. Today, I want to share with you a recent achievement that has marked a significant point in my coding adventure: my experience with a set of powerful bash scripts.

The Milestone: Taming Text with Bash

Recently, I embarked on a challenging yet enlightening endeavor. The goal was to create a collection of bash scripts that would not only provide practical solutions but also test my understanding of the intricate world of scripting. One of these scripts, txtproc, proved to be a cornerstone in my journey.

The Journey Begins: Understanding the Task

At the outset, the task seemed straightforward: create a script that counts the occurrences of each word in a given text file. Little did I know that this seemingly simple task would unfold into a comprehensive learning experience.

The script, aptly named txtproc, had multiple facets. It required me to read a text file, process its content, and manipulate strings efficiently. This journey was about more than just solving a problem; it was about embracing the nuances of bash scripting.

Unveiling the Script

Let's dive into the heart of the bash script that counts word occurrences in a text file. I'll break it down into smaller, digestible code snippets and provide a clear explanation for each one. So, grab your favorite coding beverage, sit back, and let's explore the script together.

Setting the Stage: Colors and Formatting

RED_BOLD='\033[1;31m'
GREEN_BOLD='\033[1;32m'
YELLOW_BOLD='\033[1;33m'
BLUE_BOLD='\033[1;34m'
MAGENTA_BOLD='\033[1;35m'
CYAN_BOLD='\033[1;36m'
WHITE_BOLD='\033[1;37m'
RESET='\033[0m'

The script starts by defining a set of ANSI color codes for formatting output. Printed with echo -e, they make messages easier to read and give the script's interface some personality.
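One thing to watch with color codes is that they show up as raw escape sequences when output is piped or redirected to a file. Here is a minimal sketch of one way around that, assuming a hypothetical helper named say (it is not part of txtproc) that only colors output when stdout is a terminal:

```shell
# Sketch only: `say` is a hypothetical helper, not part of the original script.
GREEN_BOLD='\033[1;32m'
RESET='\033[0m'

# Print a message in the given color, or as plain text when stdout is not a terminal.
say() {
    local color="$1" message="$2"
    if [ -t 1 ]; then
        # %b expands the backslash escapes stored in the color variables.
        printf '%b%s%b\n' "$color" "$message" "$RESET"
    else
        printf '%s\n' "$message"
    fi
}

say "$GREEN_BOLD" "NAME"
```

Run in a terminal, this prints NAME in bold green; piped into another command, it prints plain text.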

Guiding Users: Displaying the Manual

function manual() {
    echo -e "$GREEN_BOLD""NAME""$RESET"
    echo -e "\t$0 - Count the number of occurrences of each word in a text file"
    # ... (continues)
}

Here, I've defined a function that displays a detailed manual explaining the script's purpose and usage. This provides users with clear instructions on how to interact with the script effectively.

Sharing the Version: Displaying Script Information

if [[ "$1" == "-v" || "$1" == "--version" ]]; then
    echo -e "$GREEN_BOLD""txtproc (ASTRO coreutil) 1.0""$RESET"
    # ... (continues)
fi

The script accommodates curious users by offering a way to retrieve version information using the -v or --version flags. This level of detail showcases my commitment to user-friendly interfaces.

Providing Help: Handling Help Requests

if [[ "$1" == "-h" || "$1" == "--help" ]]; then
    manual
    # ... (continues)
fi

By incorporating -h or --help flags, the script shows its helpful side. It calls the manual function to provide users with comprehensive guidance, empowering them to navigate the script confidently.
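The two flag checks above could also be folded into a single case statement. A minimal sketch, assuming a hypothetical classify_arg helper that only reports which action the first argument selects (the action names are placeholders, not output from txtproc):

```shell
# Sketch only: classify_arg and the action names it prints are hypothetical.
classify_arg() {
    case "$1" in
        -v|--version) echo "version" ;;  # user asked for version information
        -h|--help)    echo "help" ;;     # user asked for the manual
        *)            echo "run" ;;      # anything else is treated as an input file
    esac
}

classify_arg --help
classify_arg notes.txt
```

A case statement keeps the short and long spelling of each flag on one line, which makes it easy to add more flags later.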

Ensuring Input Validity: Argument Validation

if [[ $# -lt 1 || $# -gt 2 ]]; then
    echo -e "$RED_BOLD""Error: Invalid number of arguments""$RESET"
    # ... (continues)
fi

As a conscientious coder, I'm aware that user input can be unpredictable. By checking the number of arguments, I'm ensuring that the script runs smoothly and gracefully handles incorrect usage.
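A common convention for checks like this is to send the error message to stderr and signal failure with a non-zero status, so the message never ends up mixed into the word counts. A minimal sketch of the same argument-count rule, assuming a hypothetical check_arg_count function:

```shell
# Sketch only: check_arg_count is a hypothetical name, not part of the original script.
# Succeeds when given one or two arguments, mirroring the check above.
check_arg_count() {
    if [ "$#" -lt 1 ] || [ "$#" -gt 2 ]; then
        echo "Error: Invalid number of arguments" >&2  # errors go to stderr
        return 1
    fi
    return 0
}

check_arg_count input.txt output.txt && echo "argument count ok"
```

In a full script this would be called as check_arg_count "$@" || exit 1.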

Checking the File: Validating Input Files

if [[ ! -f "$1" ]]; then
    echo -e "$RED_BOLD""Error: $1 is not a file""$RESET"
    # ... (continues)
fi

The script confirms that the first argument exists and is a regular file before proceeding. The -f test fails for missing paths and for directories, so the script stops early instead of passing a bad path to the rest of the pipeline.
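A file can exist and still be unreadable, for example because of its permissions. As a small extension (an assumption on my part, not something the original script does), the check could also use the -r test. The function name check_input_file is hypothetical:

```shell
# Sketch only: check_input_file is a hypothetical helper; the -r test is an addition.
check_input_file() {
    if [ ! -f "$1" ]; then
        echo "Error: $1 is not a file" >&2
        return 1
    fi
    if [ ! -r "$1" ]; then
        echo "Error: $1 is not readable" >&2
        return 1
    fi
    return 0
}

# /etc is a directory, so the -f test rejects it.
check_input_file /etc 2>/dev/null || echo "rejected: not a regular file"
```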

Embarking on the Journey: Word Occurrence Count

if [[ $# -eq 1 ]]; then
    cat "$1" | tr -s '[:space:]' '\n' | sort | uniq -c | sort -nr
    exit 0
fi

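At the heart of the script, this snippet runs when only an input file is given. cat feeds the file to tr, which puts each whitespace-separated word on its own line; sort groups identical words together; uniq -c counts each group; and sort -nr lists the most frequent words first. Words are compared exactly as written, so "The" and "the" are counted separately and punctuation stays attached to its word.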
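To see what each stage contributes, here is the same pipeline wrapped in a function that reads from standard input, followed by a variant that lowercases the text and strips punctuation first. The function names are hypothetical, and the normalizing variant is an assumption about what a reader might want, not the original script's behavior:

```shell
# Sketch only: count_words and count_words_normalized are hypothetical names.

# The pipeline from the script, reading from standard input.
count_words() {
    tr -s '[:space:]' '\n' |  # one word per line, squeezing runs of whitespace
        sort |                # put identical words on adjacent lines
        uniq -c |             # collapse each run into "count word"
        sort -nr              # highest counts first
}

# Variant (an addition): fold case and drop punctuation before counting.
count_words_normalized() {
    tr '[:upper:]' '[:lower:]' | tr -d '[:punct:]' | count_words
}

printf 'The cat saw the dog. The end.\n' | count_words
printf 'The cat saw the dog. The end.\n' | count_words_normalized
```

With the first call, "The" and "the" get separate counts and "dog." keeps its period. With the second, all three spellings of "the" are counted together. Note that stripping punctuation also removes apostrophes, so "don't" becomes "dont".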

Navigating Output Paths: Handling Output Files

if [[ $# -eq 2 && -f "$2" ]]; then
    echo -e "$RED_BOLD""Error: $2 already exists""$RESET"
    echo -n -e "$WHITE_BOLD""Do you want to overwrite it? (y/n): ""$RESET"
    read -r CHOICE
    if [[ "$CHOICE" == "y" || "$CHOICE" == "Y" ]]; then
        echo -e "$YELLOW_BOLD""Overwriting file $2...""$RESET"
        # ... (continues)
    elif [[ "$CHOICE" == "n" || "$CHOICE" == "N" ]]; then
        echo -e "$YELLOW_BOLD""Exiting...""$RESET"
        exit 0
    else
        echo -e "$RED_BOLD""Error: Invalid choice""$RESET"
        exit 1
    fi
    exit 1
fi

The script becomes an adept navigator when it comes to output files. It ensures that files are overwritten only with user consent, thus preventing accidental data loss.
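The prompt logic can also be pulled into a small function whose exit status carries the answer. A minimal sketch, assuming a hypothetical confirm_overwrite helper that returns 0 for yes, 1 for no, and 2 for anything else:

```shell
# Sketch only: confirm_overwrite and its return codes are hypothetical.
confirm_overwrite() {
    local choice
    # The prompt goes to stderr so it is not captured with the script's output.
    printf 'Do you want to overwrite %s? (y/n): ' "$1" >&2
    read -r choice
    case "$choice" in
        y|Y) return 0 ;;
        n|N) return 1 ;;
        *)   echo "Error: Invalid choice" >&2; return 2 ;;
    esac
}

# Feed an answer on stdin to try it without typing.
if printf 'y\n' | confirm_overwrite counts.txt; then
    echo "would overwrite counts.txt"
fi
```

The caller can then decide what each status means, for example exiting with 0 on "no" and 1 on an invalid answer, as the snippet above does.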

Bringing It All Together: Main Logic

echo -e "$YELLOW_BOLD""Creating file $2...""$RESET"
cat "$1" | tr -s '[:space:]' '\n' | sort | uniq -c | sort -nr > "$2"
echo -e "$GREEN_BOLD""File $2 created successfully""$RESET"
exit 0

The grand finale: the script's main logic. It processes input files, counts word occurrences, and generates meaningful outputs. The combination of commands demonstrates my mastery over data manipulation and text processing.

Reveling in the Victorious Moment

After days spent writing, refining, and debugging, the script finally worked. It permitted me to not only tally word frequencies but also to grasp the poetry within bash scripting. With each line of code, I stitched together a knowledge quilt that would warm me on cold coding nights.

Reflecting on the Odyssey

As this milestone shimmers in my rearview mirror, I'm reminded that coding ventures are more than a string of characters. They are voyages of the mind and heart, fraught with challenges that morph into triumphs. The txtproc script stands as a testament to my perseverance, a beacon that guides me towards embracing novel hurdles and surmounting them.

So, fellow coders, remember that within every script lies a chapter of discovery. Within every challenge lies a lesson. It's these milestones that shape our narratives, reminding us that coding isn't just about syntax – it's about exploration, invention, and the joy of scripting our own stories.
