Lab 11: Password Cracking (optional)

Due Date: Monday 03/21/22 11:59PM CST

Lab Overview

In this lab, we will cover the basics of password cracking, and hopefully convince you to change all of your passwords. For this lab you’ll need to use a Kali Linux VM. Grab the latest release here. You can just use it as a Live CD (a common use case with Kali), no need to setup a virtual harddisk (but be careful if you want to save your work).

Lab Description

If one were to build a multi-user systems that requires logins, a first stab might be to store passwords for users in a file (in the clear) and perform a simple comparison operation when the user types in their password. This has the advantage of being very fast. However, for this system to be secure, the password file must always be protected. The assumption is that only a root user can read or modify the password file. However, what if the system administrator becomes a disgruntled employee and decides to trash other employees’ passwords? What if the sysadmin is compromised or hacked? Now everyone’s password will be disclosed. What we need is a way to store passwords such that (1) we can still authenticate users and (2) even if the stored passwords are disclosed, the password itself is not revealed.

Hashing Passwords

This is where hashing (one-way functions) come in. The basic idea here is that we don’t store a password at all. Instead, we hash the password using a strong, cryptographic hash function, e.g. SHA-2 or MD6 (see SEED Ch. 22), and store the hash value in our system or database. When a user attempts a login, the password they type is hashed using the same algorithm, and if the output matches what we stored, they are successfully authenticated. The security of this scheme is vastly better than before, but as you’ve probably already guessed, its strength relies on the assumption that given a hash value, an attacker cannot derive the password that produced that value. This is equivalent to saying that the hash function is truly a one-way function. If this isn’t true, our security is no better than before. Luckily, we have plenty of hash functions that meet this requirement (another requirement is that our hash function is very unlikely to generate collisions, which is a harder requirement to meet. If there are collisions, that means that two or more passwords could produce the same hash).

Brute-Force Cracking

Hash functions may make it seem as though there’s no hope for an attacker. Of course, a hacker can just guess passwords until he’s right. This is a brute force approach. Imagine an organization that implements a password policy that requires users to create passwords that are between 4 and 8 alphabetic characters in length. No numbers or symbols. This turns out to be about 217 billion possibilities (you should do the math on your own). That’s peanuts for today’s systems. So an attacker can just write a program to guess every possibility, and will likely find the correct answer in a few seconds.

Thankfully, this is easy to defeat. We can simply make our system either (1) limit the number of login attempts and/or (2) rate limit the attempts, for example allowing only one login attempt per second. This defeats the above attack, because to try the same number of combinations would take almost 7,000 years.

Getting the Hash

However, what if the attacker knows the hash. For example, if an attacker can perform a privilege escalation attack on a Linux system, he or she can read /etc/shadow, which stores these hashes. Then the attacker is back in business, because the same brute forcing techniques can work. However, reasonable password lengths and using numbers and symbols will make brute force attacks take a much longer time.

Task 1: Playing the Fool

On your Kali VM, create a new user with a short password by running the following:

useradd foo 
passwd foo
... type in 'foo' to the prompts ...

This creates a user named foo with the password foo. Take a look at how Linux stores the password entry:

grep foo /etc/shadow

Normally you’d have to do this with sudo, but the default user on Kali is root. Notice that the output is delimited by both ‘:’ and ‘$’ characters. The ‘:’ characters delimit entries in the password file, so the first entry is the username, the second is the hashed password, and the rest are things like timestamps, expiration dates, etc. The password itself is generated by the crypt() program. This program uses ‘$’ to delimit attributes of the hashes it generates. The first attribute is the type of hash algorithm (see here). 6 means SHA-512. The second attribute (in this case kpm6CJMA) is a salt, which we’ll come back to later. The last component is the hash value itself, which is stored in a base64-like encoding.

Since we know the password, we can recreate the entry above by using Python’s API to crypt. Try running the following:

python3 -c 'import crypt; print(crypt.crypt("foo", "$6$kpm6CJMA$"))'

The first argument is the password, and the second is a salt. You should get the same result as what appears in /etc/shadow. Of course, in this case we knew the password, so it was easy.

Salts

Even if we hash passwords, there’s a potential here for leaking information. What if two users choose the same password? This is actually pretty likely (1) because humans aren’t that original and (2) many systems have millions of users, increasing the likelihood that there are duplicates. This means if we can somehow crack one person’s password, we get a free lunch for another. To prevent this, we use salts. These are essentially random numbers appended to a password before the hash is computed, ensuring that duplicate passwords still produce distinct hashes. A salt is conceptually very similar to a nonce, which you’ll recognize from the encryption lab. To authenticate users with salts, we store the salt in the clear along with the hash, i.e. if the password is P, the salt S, and the hash function H, we store S || H(S || P). Now that we know about salts, we can proceed to the next task.

Task 2: Cracking

Write a Python program to try every three-letter possibility to crack foo’s password. Your program should take as its argument a user name, and it should open /etc/shadow, search for the username, and brute force its password (using 3-letter possibilities). Call your program crack.py. Report how long it takes to crack the password by using the following:

 time ./crack.py foo

How long does it take if you include numerals in addition to letters?

Dictionary Attacks

Of course, we can reduce the size of our search space dramatically by using what we know about human nature, i.e., that people are pretty predictable and have limited memories. We can thus guide our search by using a dictionary, a word list that contains things like common words, common passwords, etc. We can then try out our dictionary words first, and only if this fails fall back to an exhaustive brute-force search.

To strengthen passwords, many users start with dictionary words, but augment them with symbols and numbers. For example, I might start with the password seedlabs, but augment it with numbers to produce s33dl4bs. This is not a dictionary word, but its easy enough to write programs that take dictionary words and produce hybrid combinations like this, essentially mutating our wordlist in pre-defined ways.

Task 3: Cracking with John

Let’s do the same exercise, but using a mature tool, called John the Ripper, which employs a hierarchical approach like we outlined above. John comes pre-installed in your Kali image. You can see how to use it by running it without any arguments:

john

Running this is easy as giving it the file we want and letting it go to work:

time john -users:foo /etc/shadow

How did the time compare to your method?

Rainbow Tables

Even with the dictionary and hybrid attacks above, it’s still a computationally intensive approach. While we can speed things up with GPUs and custom cracking ASICs, we can also observe that a good deal of the time in cracking is spent computing the hashes. We can cut down on this time by pre-computing a table of hashes for our dictionaries and then instead perform straight comparisons. These tables are called rainbow tables. They trade off memory for performance, as they can grow quite large (often many GB).

If you’re interested in exploring rainbow tables, check out RainbowCrack.

Task 4: Dictionary Attacks with Transformation Rules

Now let’s run some experiments with dictionary attacks. We’re going to use HashCat for this exercise. This is another program for password cracking that has a nice language for generating mutation rules that can be applied to transform words in the dictionary. HashCat also comes pre-installed on Kali. You can run hashcat -h to see the help screen.

Let’s imagine a scenario where we’ve compromised the hashed passwords for a criminal organization in Chicago. We know that this criminal organization is involved in betting for Chicago sports teams. We also know from lurking on the sports team’s organizational Slack channel that passwords in the organization’s internal systems have to be at least 8 characters long and must include at least one number.

Let’s challenge our tools a bit by creating a new user:

useradd bar
passwd bar

This time, make the password whit3s0cks1994. Now, on the face of it, this is a pretty strong password. It’s 14 characters long, and includes numerals. A naive brute-force search faces 36^14 possibilities, which is pretty daunting. Even with john, it’s going to take a while. Give it a try and go for a coffee:

john -users:bar /etc/shadow

Still waiting? You can see why password length (and using mixed symbols) really does matter. However, our inside knowledge can help us to reduce the search space considerably. With this knowledge, we have all we need to perform a dictionary attack. We know that this organization probably has some serious sports fans, so why don’t we include the Chicago-area sports teams in a word list? We’ll also throw in some easy ones as well.

for i in bears cubs whitesocks fire blackhawks abcd qwert password; do echo $i >> wordlist; done

Let’s see if we can use this wordlist with hashcat. We first need to tell hashcat a hash type to apply. From the man page, we can see that we need type 1800 (for sha512_crypt, which is what is used for /etc/shadow).

grep bar /etc/shadow > crackme # we don't care about the other users
hashcat --force -m 1800 crackme wordlist

This finishes quickly, but it fails saying “Exhausted,” because nothing in the wordlist is in the file. The user has added his birthdate to the end and has applied a letter-number substitution scheme. Let’s use HashCat’s rule system to specify some transformations to apply to the password candidates in the wordlist before it hashes them.

So what do we know? We know that sports teams are probably a good bet. We know that users have to have numbers in their passwords. What are some easy numbers? password1, password2, etc. are always good bets. Also 1234 appended to the end. We’ll also try caps, lowercase, etc. We can encode those like so in a hashcat rules file:

cat > rules
:
u
l
c
$1
$1 $2 $3
$1 $2 $3 $4
(hit ctrl^d)

Each line here represents a transformation rule, and each rule will be applied to every word in our wordlist before hashing. Thus if we have N words in our dictionary and M rules, we will attempt NM password hashes. The first rule (:) means “apply no transformation,” meaning the unchanged word. The second rule (u) means make the whole word upper-case. The third (l) means make the whole word lower-case. The fourth rule (c) means capitalize the first letter, and make the rest lower-case. The next three rules use the append ($) operator. Here, $1 means “append a 1 to the end of the word,” and $1 $2 means “append 12 to the end of the word.” We can now try this with hashcat:

hashcat --force -m 1800 crackme wordlist -r rules

Still no beans. We need to consider some other simple numbers. A really common one is to append your birth year to the end of a password. This is easy enough to encode. For example, if someone was born on 1994:

$1 $9 $9 $4

So really, all we need to do here is write a script that can create a rule for each birthday for anyone. There’s another nice transformation rule we can use, in the form of sXY, where X is a pattern to search for and Y is its replacement. We can use this to guess easy substitutions. For example:

sa4 se3 sl1 so0

This will replace the letter ‘a’ with ‘4’, and so on. People familiar with l33tsp34k will often use this in passwords. Let’s go ahead and add this to our rules:

echo "sa4 se3 sl1 so0" >> rules

OK, back to birthdays. We need to autogenerate rules for every year (say since 1920 and 2020), with and without number substitution:

for i in $(seq 1920 2020); do echo $i | sed 's/\([0-9]\)/$\1/g' >> rules; done;

That gives us the birth year suffixed to the word. Now with substitution as well:

for i in $(seq 1920 2020); do echo "$(echo $i | sed 's/\([0-9]\)/$\1/g')" "sa4 se3 so0 sl1" >> rules; done;

Now let’s give it a try again:

hashcat --force -m 1800 /etc/shadow wordlist -r rules

Bingo! You should see the password appear in hashcat’s output now. Keep in mind this is running on a VM. It would go a bit faster on physical hardware, especially with more cores. HashCat really prefers to use a GPU cluster, an FPGA, or some other OpenCL accelerator to make the hashing blazingly fast.

Task 5: Putting it Together

I’ve put a salted UNIX password here. Can you crack it? Hint: can you find anything out about me from my website?

Takeaway: Staying Safe

Here are some insights you should probably take away from this lab:

Use Strong Passwords! Length is a priority here. Short passwords are easy to crack. If nothing else, use a very long password (the max allowed). Try to avoid dictionary words, try to use symbols and numbers, but again, length is key!
Do not use the same password in multiple places. This limits impact if a password is cracked. Think to yourself: how do you know the strength of the website’s password storage?
Do not store passwords in the clear. Ever.
Do not derive passwords from publicly available information.
Use two-factor authentication (2FA). This ideally means combining something you know (password) with something you have (a physical artifact) or a biosignature. Ideally use 2FA for all services you use, and consider getting a physical authentication token (e.g., a YubiKey). If you must settle, setup 2FA with an authenticator app (e.g. Google Authenticator). It’s not as ideal, but receiving 2FA verification over SMS or E-mail is better than not having it at all.
Use a password manager. This solves the first three points above for you, since you only have to remember one (hopefully very strong) password (even better use 2FA to access your PM). Good password managers will verify the strength of your passwords, give you ways to autogenerate random passwords with rulesets, will have autofill capabilities, integration between devices, will encrypt your saved passwords, will make sure you don’t have duplicates, will scan your passwords from known leaked databases, and will notify you when accounts you’ve saved passwords for have been breached. Consider KeyPass, 1Password, Keeper, or Dashlane.

Handin

Please write your lab report according to the description. Please also list the important code snippets followed by your explanation. You will not receive credit if you simply attach code without any explanation. Upload your answers as a PDF to blackboard

Other Tools

Ophcrack
L0phtCrack
Cain and Abel (Windows only)
Hydra (for web logins)
Aircrack-ng can be used for WEP password cracking
crunch, a wordlist generator
CUPP, Common User Password Profiler