Seriously, have you tried using tail -f to watch a log file in real time? It feels like you’ve suddenly got superpowers, doesn’t it?
In Pagers and More in Linux | Filtering Content Part 1, we learned how to tame the flood of information in the terminal. We took that chaotic rush of text and learned to view it calmly with less, peek at the start with head, and check the latest updates with tail. We essentially learned how to read the data without going crazy.
But a hacker’s job isn’t just to read; it’s to find the needle in the haystack. It’s about spotting the one weak entry in a list of thousands, the one anomalous username in a log file. Just viewing the data isn’t enough. We need to dissect it, rearrange it, and carve out exactly what we need.
That’s what this article is all about. We’re moving from being passive observers to active surgeons of data. Today, we’re diving deeper into the art of filtering content in Linux by learning how to manipulate the output itself. We’ll cover five incredibly powerful commands: sort, cut, tr, wc, and column.
Think of it like this: in Part 1, we got the book and learned how to turn the pages. Now, we’re going to get out our highlighters, scissors, and a calculator to really analyze the contents.
So, top up your coffee (or chai, whatever fuels your hacking sessions ☕️), and let’s get our hands dirty.
Bringing Order to Chaos with sort
Let’s start with a very common problem. You have a list of things—usernames, files, IP addresses—and they’re all jumbled up. Trying to find anything specific is a pain. The sort command is our elegant solution.
By default, sort takes text input and reorders it alphabetically.
Let’s create a small file to play with. You can use a text editor like nano or just use this quick echo command:
echo -e "kali\nparrot\narch\nfedora\nubuntu\ndebian" > distros.txt
Now, if we cat this file, it will show the distros in the order we entered them. But what if we pipe it to sort?
cat distros.txt | sort
Look at that. A perfectly ordered, alphabetical list:
arch
debian
fedora
kali
parrot
ubuntu
Simple, clean, and incredibly effective.
But sort has a few more tricks up its sleeve.
- Reverse Order (-r): What if you want it in descending order? Just add the -r flag for reverse.
cat distros.txt | sort -r
- Numeric Sorting (-n): This is a super important one. By default, sort treats everything as text. So, if you sort the numbers 1, 10, and 2, it will order them as 1, 10, 2 (because “10” comes before “2” alphabetically). That’s usually not what we want. The -n flag tells sort to interpret the values as numbers.
# Create a file with numbers
echo -e "10\n2\n1\n100\n5" > numbers.txt
# Incorrect alphabetical sort
sort numbers.txt
# Correct numeric sort
sort -n numbers.txt
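Running both makes the problem obvious: the alphabetical sort prints 1, 10, 100, 2, 5, while the numeric sort prints 1, 2, 5, 10, 100.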
See the difference? For any kind of data analysis, -n is essential.
- Unique Lines (-u): Often, your data will have duplicate entries. The -u flag is a lifesaver; it tells sort to only show one instance of each line.
echo -e "kali\nparrot\nparrot\nkali\nkali" > duplicates.txt
sort -u duplicates.txt
This command first sorts the list and then removes any duplicates it finds. It’s a quick and dirty way to get a unique list of items.
Imagine you’ve extracted a list of usernames from a log file. Using sort -u instantly gives you a clean list of every unique user who has accessed the system. Powerful stuff.
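As a quick sketch, assuming a hypothetical users.txt with one extracted username per line:
# users.txt is a hypothetical file of extracted usernames, one per line
sort -u users.txt > unique_users.txt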
One more powerful use case is sorting by a specific column with the -k flag. Think about the output of ls -l. If you want to find the largest files, you need to sort by the 5th column (file size).
# -k5 specifies the 5th column
# -n means sort numerically
# -r reverses the result to show biggest first
ls -l /etc/ | tr -s ' ' | sort -k5 -n -r | head -n 5
This command chain is a perfect example of what we’re learning: it lists the contents of /etc/, squeezes the spaces, sorts the result by the 5th column (size) in reverse numerical order, and then shows you just the top 5 largest files. This is a daily-driver command for any sysadmin.
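One subtlety worth knowing: -k5 actually sorts from the 5th field to the end of each line. That works fine here because the numeric comparison only reads the leading number, but if you ever need to sort strictly on the 5th field alone, write -k5,5.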
Making the cut: Extracting Specific Data
Okay, so our data is sorted. But what if it’s in columns? Think about the output of ls -l. You have permissions, owner, group, file size, date, and filename all in one line. Often, you only want one of those pieces. That’s the job of cut: it slices each line into fields based on a delimiter (set with -d) and hands you only the fields you ask for (set with -f).
The classic example is the /etc/passwd file, where every field is separated by a colon and the first field is the username:
# -d':' sets the delimiter to a colon
# -f1 selects the first field
cut -d':' -f1 /etc/passwd
Boom! You’ve just extracted a clean list of all users on the system. How cool is that?
Let’s take a real-world security example. Imagine you’re looking at a web server’s access log. A typical line might look like this:
192.168.1.10 - - [05/Sep/2025:10:30:01 +0000] "GET /login.php HTTP/1.1" 200 1482
What if you want a list of all IP addresses that have accessed your server? The IP address is the first field, and the delimiter is a space.
# Assuming your log is in access.log
# -d' ' specifies a space delimiter
# -f1 gets the first field (the IP)
cut -d' ' -f1 access.log | sort -u
In one line, you’ve just extracted every unique IP address from potentially thousands of log entries. You can now use this list to check for malicious IPs or see who your most frequent visitors are.
You can select multiple fields too:
- -f1,3 will get the 1st and 3rd fields (username and user ID).
- -f1-3 will get a range: the 1st, 2nd, and 3rd fields.
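For example, on /etc/passwd, where field 1 is the username and field 3 is the numeric user ID:
# Grab the username and UID from each line
cut -d':' -f1,3 /etc/passwd
Note that cut keeps the input delimiter between the fields it prints, so you’ll see output like root:0.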
What about commands that use spaces as delimiters, like ls -l? Well, it gets a bit tricky because ls -l can have a variable number of spaces. This is where other tools, which we’ll see later, can be better. But for consistently delimited files (like CSVs or the /etc/passwd file), cut is your go-to tool. It’s fast, simple, and does one job perfectly.
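For instance, pulling the second column out of a hypothetical users.csv is a one-liner:
# users.csv is a hypothetical comma-separated file
cut -d',' -f2 users.csv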
Transforming Characters with tr
Next up is tr, which stands for “translate.” This command is a bit different. It doesn’t really care about lines or fields; it cares about individual characters. You can use it to replace or delete specific characters in a stream of text.
The basic syntax is tr [characters-to-find] [characters-to-replace-with].
- Changing Case: A classic use is to convert text from lowercase to uppercase or vice versa.
echo "Hello World" | tr 'a-z' 'A-Z'
# Output: HELLO WORLD
- Replacing Characters: Maybe you have some data separated by hyphens, but you need it to be separated by spaces. Easy.
echo "2025-09-05" | tr '-' ' '
# Output: 2025 09 05
This can be really useful for reformatting data to be fed into another command that expects a different delimiter.
- Deleting Characters (-d): This is where it gets interesting for hacking. Sometimes you have weird characters in your text that are messing things up. You can use tr with the -d flag to delete them.
# Remove all the letter 'o's
echo "Hello World" | tr -d 'o'
# Output: Hell Wrld
- Squeezing Characters (-s): The -s flag is for “squeezing” repeating characters. If you have multiple spaces in a row, this will condense them down to a single space. This is a fantastic way to clean up messy command output before using cut.
echo "this  has   too    many spaces" | tr -s ' '
# Output: this has too many spaces
Remember how ls -l had a variable number of spaces? If you pipe its output through tr -s ' ' first, you get clean, single-space-delimited columns, perfect for cut!
ls -l | tr -s ' ' | cut -d' ' -f9
# This will now reliably give you the filenames
tr is like a find-and-replace for your terminal, but on a character-by-character level.
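One gotcha worth knowing: tr maps character sets, not strings. tr 'ab' 'xy' doesn’t replace the substring “ab”; it turns every a into x and every b into y:
echo "banana" | tr 'ab' 'xy'
# Output: yxnxnx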
For a pentester, tr is fantastic for cleaning up wordlists. Many password-cracking tools work best with clean lists. You can strip out all punctuation and numbers from a file like this:
# -d deletes characters
# [:punct:] and [:digit:] are character classes
cat some_messy_wordlist.txt | tr -d '[:punct:][:digit:]' > clean_wordlist.txt
This takes a messy list and outputs a clean one containing only letters, which can be much more effective for certain attacks.
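You can also chain this with what we learned earlier. Here’s a sketch, using the same hypothetical wordlist, that lowercases everything, strips the junk, and drops duplicates in one pass:
# Lowercase, strip punctuation and digits, then dedupe
cat some_messy_wordlist.txt | tr 'A-Z' 'a-z' | tr -d '[:punct:][:digit:]' | sort -u > clean_wordlist.txt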
Counting Things with wc
Ever needed a quick headcount? wc (word count) is the command for the job. It can count lines, words, and characters in a file.
Run it on our distros.txt file from earlier:
wc distros.txt
The output will be something like: 6 6 38 distros.txt. This means: 6 lines, 6 words, and 38 characters.
Most of the time, you only care about one of those things. So, you’ll use a flag:
- -l: Count lines. This is probably the one you’ll use the most.
- -w: Count words.
- -c: Count bytes (which matches the character count for plain ASCII text).
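For instance, to count just the lines in our distros.txt:
wc -l distros.txt
# Output: 6 distros.txt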
Let’s find out how many users are on our system. We already know how to get the list of usernames with cut. Now we just pipe that list into wc -l.
cut -d':' -f1 /etc/passwd | wc -l
This command chain gives you a single number: the total number of users registered in the file.
This is useful, but in security, you’ll often use wc as a counter after a search. For instance, how many times has a specific IP address, say 182.74.201.2, tried to access your server? We can use grep (which we’ll cover later) to find the lines and wc to count them.
# Search for the IP in the log file, then count the matching lines
# -F treats the pattern as a literal string, so the dots aren't regex wildcards
grep -F '182.74.201.2' access.log | wc -l
This instantly tells you the frequency of an event, which is fundamental to log analysis and incident response. If you see that number suddenly spike, you might be under attack.
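The same handful of tools combine to answer related questions. For example, to count how many distinct IP addresses appear in the log at all:
# Extract the IPs, dedupe them, then count the survivors
cut -d' ' -f1 access.log | sort -u | wc -l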
This is the essence of the Linux command line philosophy: small tools, each doing one job well, chained together to achieve a complex result.
Making Output Pretty with column
One last command for today, and it’s a handy one for readability. Presentation matters, right? Sometimes, you’ll extract data and it’s all jumbled and misaligned, making it hard to read. The column command is your personal data stylist; it takes messy, space-separated text and formats it into beautiful, clean, newspaper-style columns.
The most common way to use it is with the -t flag, which tells it to create a table.
Let’s revisit our /etc/passwd example. We know it’s separated by colons. If we use tr to replace the colons with spaces, the output is still a bit messy.
# Replacing colons with spaces gives us the right data, but poor alignment
cat /etc/passwd | tr ':' ' '
Now, let’s pipe that same output into column -t.
cat /etc/passwd | tr ':' ' ' | column -t
Look at that difference! The output is now a perfectly formatted table. Everything is aligned, making it incredibly easy to read and understand.
Handling Different Delimiters
But what if you don’t want to use tr or sed first? The column command is smart enough to handle different delimiters on its own using the -s flag. You can tell it what the separator character is, and it will build the table accordingly.
Let’s try our /etc/passwd example again, but this time, we’ll do it in a single, more efficient step.
# -s specifies the separator, -t creates the table
column -s':' -t /etc/passwd
This gives you the same beautiful output but with a cleaner, more direct command. This is fantastic for quickly viewing CSV (Comma-Separated Values) files or any other consistently delimited file.
For example, if you had a file data.csv that looked like this:
Name,Role,ID
Alice,Admin,101
Bob,User,102
You could view it as a proper table with:
column -s',' -t data.csv
Another great use case is for cleaning up the output of system commands. For example, the mount command shows you all the mounted filesystems, but its default output is a bit jumbled. Pipe it to column -t and it’s a different story.
mount | column -t
Suddenly, the information is perfectly aligned and easy to scan, allowing you to quickly check mount points and options. It’s a simple trick that makes your life on the command line much more pleasant.
This is fantastic for making sense of complex, delimited data on the fly without needing to open a spreadsheet application. It’s the final touch that makes your terminal output look professional.
What’s on the Horizon?
And there you have it! We’ve just added five more surgical instruments to our command-line toolkit. You can now sort data, cut out the pieces you need, transform characters, count the results, and format it all beautifully.
Practice combining these with the commands from Part 1. For example:
ls -l /usr/bin | sort -k5 -n -r | head -n 10
Can you figure out what that command does? (It lists the 10 largest files in /usr/bin.)
We’re getting really close to being able to perform some seriously advanced data-fu. In the next and final part of our “Filtering Content” mini-series, we’re bringing out the heavy artillery: sed and awk. These are the ultimate text-processing powerhouses, allowing you to edit streams of text and perform complex actions based on patterns.
Stay tuned, keep practicing, and as always, happy hacking!
FAQs
1. When should I use cut instead of other tools like awk?
cut is best for simple, fixed-column data where the delimiter is consistent (like a colon or a comma). It’s extremely fast and straightforward. For more complex tasks, like when columns are separated by a variable number of spaces or you need to perform actions on the data (not just extract it), awk is the more powerful choice.
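As a quick illustration, awk splits on runs of whitespace by default, which is exactly what trips up cut with ls -l:
# awk handles ls -l's variable spacing without any tr preprocessing
ls -l | awk '{print $9}'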
2. How can I sort a file based on a specific column?
You can use the -k flag with sort. For example, sort -k3 -n filename will sort the file numerically (-n) based on the contents of the third column (-k3). This is incredibly useful for sorting the output of commands like ls -l by file size.
3. What’s a practical cybersecurity use case for tr?
A common use case is data sanitization. Imagine you’re analyzing a file that has a mix of uppercase and lowercase letters, but you need everything to be consistent for searching. You can pipe the file through tr 'A-Z' 'a-z' to make everything lowercase before further processing. It’s also used to remove non-printable or “bad” characters from data that might otherwise break other scripts or tools.
4. How can wc -l be used in scripting?
In shell scripting, wc -l is frequently used to check if a command produced any output. For example, you might run a grep command to search for an error in a log file, pipe the result to wc -l, and if the count is greater than zero, you know an error was found and can trigger an alert.
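A minimal sketch of that pattern, assuming a hypothetical app.log:
# app.log and the 'ERROR' pattern are hypothetical stand-ins
count=$(grep 'ERROR' app.log | wc -l)
if [ "$count" -gt 0 ]; then
  echo "Alert: found $count error lines in app.log"
fi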
5. Is the column command available on all Linux systems?
The column command is part of the util-linux package and is available on the vast majority of modern Linux distributions. However, you might encounter very old or minimalistic systems (like some embedded devices) where it might not be installed by default.
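If you’re not sure whether your system has it, a quick check:
# Prints the path to column if it's installed; prints nothing otherwise
command -v column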