Every year the private digital security company NordPass publishes a list of the most popular passwords across 30 countries. And as always, the current list from 2022 also contains shockingly simple ones. The top five are: “password,” “123456,” “123456789,” “guest” and “qwerty.”
Needless to say, these are weak passwords—but what makes a good one? Most people know a few rules of thumb: it should be as long as possible, contain special characters and not be a simple word. You should also change it regularly, choose a different password for each user account and never write it down. Meeting all these requirements at the same time seems almost impossible. And once you have found a good password, a website may not accept it: either it is too short, contains an illegal character—or is somehow too long. PayPal, for example, does not allow passwords longer than 20 characters. These restrictions make password selection extremely frustrating for most users.
For their secure password requirements, many Internet service providers rely on 2003 guidelines published by the U.S. National Institute of Standards and Technology that recommend passwords with as large a mix of special characters, uppercase letters and lowercase letters as possible. Bill Burr, a former NIST employee, created these guidelines but has since told the Wall Street Journal that he regrets many of these recommendations. That’s because forcing people to change passwords and requiring them to use special characters often lead them to choose easy-to-remember (and therefore insecure) passwords that follow a particular scheme or pattern. For example, “password1” is no more secure than “password.” Thus, NIST has now revised its guidelines, but not all providers have followed suit. Very often, you are forced to use special characters, numbers, and uppercase and lowercase letters in a password.
How Are Passwords Cracked?
To learn how to choose a secure password, you need to understand how hackers do their work. The simplest approach is to systematically try all possible password combinations in what is known as a brute-force attack. Fortunately, it is rarely possible to log in to an online provider, such as an e-mail service provider, in this way. Most websites nowadays have an integrated security mechanism that suppresses further log-in attempts in the event of multiple incorrect entries. You then have to confirm your identity in another way (for example, by clicking on a link in an e-mail) or wait several minutes before you can try to log in again.
That’s why such brute-force attacks are usually carried out in what are called “offline attacks,” in which a hacker has stolen a list of log-in credentials from a website. Almost all providers encrypt passwords rather than store their users’ log-in data in plain text. The attacker therefore receives a list of usernames and an encrypted string, which cannot be used to log in to the website. But with a few tricks, a hacker can still get hold of the passwords.
To protect log-in information, a great many providers use so-called hash functions to encrypt their users’ access data. These functions convert a string of characters of any length (such as a password) into a code (“hash”) with a fixed length, such as 32 characters. The special thing about hash functions: they are virtually irreversible. While it is easy to compute the hash of any input (at least for a computer), it is virtually impossible to recover the original input from the hash. So when a user logs in to a website, they type in the password, whereupon the website computes the hash from it and matches it with the stored user data (the username and the hash of the password). In this way, users’ passwords remain protected.
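The log-in check described above can be sketched in a few lines of Python. This is only an illustration: SHA-256 from the standard library is an assumed choice of hash function, and the `stored_users` table, the user name and the password are invented for the example. Real services generally use deliberately slow, purpose-built password-hashing functions rather than a single fast hash.

```python
import hashlib

def hash_password(password: str) -> str:
    """Return the SHA-256 hash of a password as a fixed-length hex string."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

# Hypothetical user table: the site keeps only the hash, never the plain text.
stored_users = {"alice": hash_password("correct horse")}

def check_login(username: str, password: str) -> bool:
    """Hash the typed password and compare it with the stored hash."""
    stored = stored_users.get(username)
    return stored is not None and hash_password(password) == stored

# Inputs of any length produce hashes of the same length (64 hex characters).
print(len(hash_password("a")), len(hash_password("a much longer password")))
print(check_login("alice", "correct horse"), check_login("alice", "wrong guess"))
```

The site never needs to reverse the hash: it only repeats the same one-way computation on whatever the user types.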
If an attacker is able to capture a list of user data and associated hashes, however, there is still the possibility of a brute-force attack. You can let a computer create various passwords and convert them into hashes using known hash functions. These can then be compared with the hashes in the captured list. If there is a match, then you have most likely found the corresponding password. (There is a small chance that the password is still incorrect, because hash functions can return the same hash for different inputs.) An attacker can take as much time as they want to do this because the list is in their possession, and they can work with it offline.
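A toy version of such an offline attack might look like the following. It assumes the stolen hashes were produced with plain SHA-256 and that the password consists of at most three lowercase letters; both assumptions are chosen only to keep the example fast.

```python
import hashlib
from itertools import product
from string import ascii_lowercase

def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def brute_force(target_hash: str, max_length: int, alphabet: str = ascii_lowercase):
    """Try every string over `alphabet` with up to `max_length` characters."""
    for length in range(1, max_length + 1):
        for letters in product(alphabet, repeat=length):
            candidate = "".join(letters)
            if sha256_hex(candidate) == target_hash:
                return candidate
    return None  # the password lies outside the searched space

leaked_hash = sha256_hex("cab")  # stands in for one entry of a stolen list
print(brute_force(leaked_hash, max_length=3))  # prints "cab"
```

Nothing limits the number of guesses here, which is exactly why a stolen list is more dangerous than an online log-in form.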
A hacker then has the choice of spending a lot of computational effort or a lot of computer memory to crack the hashes. In the first case, the attacker works from user to user, each time generating hashes for all possible password combinations and comparing them with that user’s hash. In the second case, the attacker generates the hashes for all possible passwords in advance and stores them in a huge table that can be matched against the stolen user data. Both approaches create difficulties: one needs either extremely powerful computers or huge amounts of storage space.
Clever Tricks Crack Complex Passwords
Rainbow tables offer a compromise solution. Basically they provide a way to group passwords and their associated hashes so that you can store less content. To extract the desired content from the table requires some computational effort but much less than if you were to compute the hashes for each user individually.
The idea behind this approach is as follows: Once you have computed a hash, such as “920ECF10,” from a possible password, say, “password,” you can transform it back into the representation of a possible password using the function f(920ECF10) = kjhsedn. This new password, “kjhsedn,” is assigned to the same group as “password.” Then you calculate the hash of “kjhsedn”— for example, FB107E70—and apply f to it again, resulting in a new possible password, which you assign to the same group.
This process is repeated n times (where n should be as large as possible) to obtain a group with n possible passwords. Then you choose a password not yet assigned and create a new group with n contents in the same way. You repeat this step until you have covered all the password possibilities you want to consider (for example, all combinations of eight lowercase letters). In the rainbow table, you only store the initial password (in our example, “password”) and the hash of the nth password generated from it. The rainbow table is thus much smaller than if you were to store the hashes of all password combinations.
Now if you want to crack a hash from a stolen list, you have to do a little math. With f you convert the hash into a password string and calculate a hash from it again until you eventually land on a hash in the rainbow table. Then you know the group to which a stolen password belongs. By calculating the hashes for all character combinations in this group, you will eventually find the corresponding password. This way, the computational effort is significantly lower than if you compute the corresponding hashes for all character strings.
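The grouping and look-up steps can be sketched as follows. The code follows the simplified description given here, with a single function f (called `reduce_fn` below), SHA-256 as an assumed hash function and a toy space of four lowercase letters. The starting passwords and the group length are arbitrary choices, and practical rainbow tables vary f from step to step so that groups merge less often.

```python
import hashlib
from string import ascii_lowercase

PASSWORD_LENGTH = 4   # toy space: four lowercase letters
CHAIN_LENGTH = 50     # the n from the text: passwords per group

def hash_fn(password: str) -> str:
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

def reduce_fn(hash_hex: str) -> str:
    """The function f: turn a hash back into *some* candidate password."""
    number = int(hash_hex, 16)
    letters = []
    for _ in range(PASSWORD_LENGTH):
        number, index = divmod(number, 26)
        letters.append(ascii_lowercase[index])
    return "".join(letters)

def walk_chain(start: str):
    """Yield the n (password, hash) pairs of the group that begins at `start`."""
    password = start
    for _ in range(CHAIN_LENGTH):
        hashed = hash_fn(password)
        yield password, hashed
        password = reduce_fn(hashed)

def build_table(starts):
    """Store only each group's final hash and the password it started from."""
    table = {}
    for start in starts:
        last_hash = list(walk_chain(start))[-1][1]
        table.setdefault(last_hash, []).append(start)
    return table

def crack(target_hash: str, table):
    """Apply f and the hash repeatedly until a stored final hash turns up,
    then regenerate that group and search it for the target hash."""
    current = target_hash
    for _ in range(CHAIN_LENGTH):
        for start in table.get(current, []):
            for password, hashed in walk_chain(start):
                if hashed == target_hash:
                    return password
        current = hash_fn(reduce_fn(current))
    return None  # the password is not covered by any stored group

table = build_table(["pass", "word", "qwer"])
secret = list(walk_chain("word"))[20][0]  # a password hidden inside one group
print(len(table), "table entries cover", 3 * CHAIN_LENGTH, "hashes")
print(crack(hash_fn(secret), table) == secret)  # prints True
```

Only three entries are stored, yet every password inside the three groups can be recovered, at the price of recomputing part of a group during the look-up.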
To prevent such attacks, many websites add a random string of characters called a “salt” to the input of the hash functions. These characters are added to a person’s password before the hash is formed. Each user receives a different salt, which is stored in the access data list. Thus, when a hacker tries to steal the user data, even though that hacker obtains the associated salts, cracking passwords is much more difficult. Instead of creating hashes for all possible passwords and matching them against the list, the attacker is forced to go through each user’s information individually. The hacker must add the user-specific salt to each possible password in order to calculate the hash from it. This extra layer of protection also prevents the use of a rainbow table. The introduction of a salt increases the computing time for a brute-force attack many times over.
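A minimal sketch of salting, again assuming SHA-256 and a 16-byte random salt placed in front of the password (the salt length and its position are illustrative choices, not something the guidelines above prescribe):

```python
import hashlib
import secrets

def hash_with_salt(password: str, salt: bytes) -> str:
    """Hash the salt followed by the password."""
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()

def register(password: str):
    """Return the (salt, hash) pair a site would store for a new user."""
    salt = secrets.token_bytes(16)  # random and different for every user
    return salt, hash_with_salt(password, salt)

def verify(password: str, salt: bytes, stored_hash: str) -> bool:
    """Recompute the salted hash and compare it with the stored value."""
    return secrets.compare_digest(hash_with_salt(password, salt), stored_hash)

salt_a, hash_a = register("qwerty")
salt_b, hash_b = register("qwerty")
print(hash_a == hash_b)                   # same password, different stored hashes
print(verify("qwerty", salt_a, hash_a))   # the right password still checks out
```

Because two users with the same password end up with different hashes, a table of hashes computed in advance no longer matches anything in the stolen list.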
Special Characters Do Not Always Help
Brute-force attacks are inefficient. They are not especially challenging for today’s computers, however. After all, computers can sometimes calculate millions of passwords per second. If you have chosen a six-character password that consists only of lowercase letters (such as “qwerty”), the computer has to check 26⁶, or 308,915,776, combinations. (There are 26 possibilities for each of the six letters.) A computer would need only a few seconds to crack that password.
By choosing a longer password and expanding the choice of characters, the space of all possible password combinations also grows rapidly. For an eight-character combination of 26 uppercase letters, 26 lowercase letters, 10 numbers and 32 special characters (such as “p4$sW0Rd”), a hacker would now have to sift through a space of (26 + 26 + 10 + 32)⁸ = 94⁸ ≈ 6.10 × 10¹⁵ possible combinations. That’s about 20 million times as many possibilities as in the case of “qwerty” and would theoretically take 20 million times as long.
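The arithmetic behind these figures is easy to reproduce. In the sketch below, the guessing rate is purely an assumed value for illustration; actual rates depend heavily on the hardware and on the hash function being attacked.

```python
def search_space(alphabet_size: int, length: int) -> int:
    """Number of distinct passwords of the given length over the alphabet."""
    return alphabet_size ** length

GUESSES_PER_SECOND = 100_000_000  # assumed rate, for illustration only

lowercase_six = search_space(26, 6)
mixed_eight = search_space(26 + 26 + 10 + 32, 8)

print(lowercase_six)                      # 308915776
print(f"{mixed_eight:.2e}")               # 6.10e+15
print(f"ratio: {mixed_eight / lowercase_six:,.0f}")
print(f"worst case, six lowercase letters: {lowercase_six / GUESSES_PER_SECOND:.1f} s")
print(f"worst case, eight mixed characters: {mixed_eight / GUESSES_PER_SECOND / 86_400:,.0f} days")
```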
Thus, experts classify the security of passwords based on the size of the space containing the chosen combination of characters. And because this is a subfield of computer science, one does not consider the size of the space in decimal notation (“The space contains quadrillions of combinations”) but in binary notation, which consists only of 0’s and 1’s (“The size of the space is a string of 32 1’s and 0’s”).
This value is referred to as the entropy of a password and is measured with the unit “bit.” For example, a four-digit PIN is contained in a space of size 10⁴, which corresponds to 10011100010000 in binary notation, so the entropy is 14 bits. The entropy of a space of size N (the length of N in binary notation) is given by log₂ N, rounded up to the next integer if the result is fractional. Thus, “qwerty” has an entropy of log₂(308,915,776) ≈ 28.2, or 29 bits. On the other hand, “p4$sW0Rd” has an entropy of log₂(6.10 × 10¹⁵) ≈ 52.4, or 53 bits.
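The three entropy values can be checked with a short calculation. The function name and the layout of the example list are of course just choices made for this sketch.

```python
import math

def entropy_bits(alphabet_size: int, length: int) -> float:
    """log2 of the number of possible passwords: length * log2(alphabet size)."""
    return length * math.log2(alphabet_size)

examples = [("four-digit PIN", 10, 4), ("qwerty", 26, 6), ("p4$sW0Rd", 94, 8)]
for label, alphabet_size, length in examples:
    bits = entropy_bits(alphabet_size, length)
    print(f"{label}: log2 = {bits:.1f}, rounded up to {math.ceil(bits)} bits")

# The binary representation of 10**4 has 14 digits, matching the PIN example.
print(bin(10**4)[2:], (10**4).bit_length())
```

Every additional character adds log₂ of the alphabet size to the total, so length raises the entropy in steady steps while a larger alphabet only raises the size of each step.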
But it’s not enough to just look for high entropy. In practice, passwords such as “p4$sW0Rd” or “qwerty” have a much worse entropy than the theoretically calculated one. This is because hackers usually do not carry out pure brute-force attacks. Instead they use “dictionaries.” Password dictionaries contain not only common words of a language but also data sets with popular password combinations derived from stolen data. NordPass uses such lists to determine the most popular passwords of the year in different countries, for example. Meanwhile there are numerous easily accessible programs such as John the Ripper that are used to carry out such dictionary attacks.
Hackers combine common names and words with strings of characters and numbers. They also make popular substitutions, such as turning an A into a 4. “If you replace an E [with] a 3..., don’t think that’s clever, because it’s not,” says computer scientist Mike Pound in a video on the YouTube channel Computerphile. Attacks can easily crack this type of replacement with available computing power.
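A heavily simplified sketch shows why such substitutions add little. The word list, the substitution table and the suffixes below are tiny, made-up stand-ins for the much larger rule sets that real cracking tools apply, and SHA-256 is again an assumed hash function.

```python
import hashlib
from itertools import product

# Each letter maps to itself plus a few popular look-alike replacements.
SUBSTITUTIONS = {"a": "a4@", "e": "e3", "i": "i1!", "o": "o0", "s": "s$5"}
SUFFIXES = ("", "1", "123", "!")

def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def variants(word: str):
    """Yield every spelling of `word` under the substitution table."""
    choices = [SUBSTITUTIONS.get(ch, ch) for ch in word.lower()]
    for combo in product(*choices):
        yield "".join(combo)

def dictionary_attack(target_hash: str, wordlist):
    """Try each dictionary word in all its variants, with common suffixes."""
    for word in wordlist:
        for candidate in variants(word):
            for suffix in SUFFIXES:
                if sha256_hex(candidate + suffix) == target_hash:
                    return candidate + suffix
    return None

wordlist = ["dragon", "password", "monkey"]  # stand-in for a real dictionary
print(sum(1 for _ in variants("password")))   # 54 spellings of one word
print(dictionary_attack(sha256_hex("p4$sw0rd1"), wordlist))  # prints "p4$sw0rd1"
```

The word “password” yields only 54 spellings under this table, so a disguised dictionary word costs the attacker a few hundred guesses rather than the quadrillions suggested by its theoretical entropy.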
Password Managers Offer the Best Protection
So what does the most secure password look like? To exploit the full amount of entropy, one should choose a random string from the space of all possibilities. For an eight-character password with special characters, numbers, and lowercase and uppercase letters, such a choice could be, say, “MA9^Wc7f”. But it will be difficult to remember such a password—especially if you are supposed to choose a different (and similarly strong) password for each of your other accounts.
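Drawing such a string is straightforward with a cryptographically secure random source. The sketch below uses Python's `secrets` module; the alphabet is assumed to be the 94 printable ASCII characters counted earlier (52 letters, 10 digits and 32 special characters).

```python
import secrets
import string

# 26 lowercase + 26 uppercase + 10 digits + 32 special characters = 94
ALPHABET = string.ascii_letters + string.digits + string.punctuation

def random_password(length: int = 8) -> str:
    """Pick each character independently and uniformly from the alphabet."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

print(len(ALPHABET))
print(random_password())    # different on every run
print(random_password(16))  # longer passwords cost nothing extra to generate
```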
Password managers, programs that generate random strings and store a person’s passwords, offer a solution to this problem. Of course, one should be especially careful when choosing a provider and check what security mechanisms are used. How are the passwords encrypted? How good is the random generator that the program uses?
Once you have found a trustworthy provider, you still need to remember a “master password” used to gain access to all log-in data. A secure password is still necessary—but only one. You should be able to remember it well enough to access the password manager at any time and thus have access to all of your accounts. A popular strategy is the “correct-horse-battery-staple” method: Choose a sequence of four to five words separated by a special character. Because of the many characters, the entropy is quite large—especially if you decide, for example, to put another special character in the middle of a word, such as: “bat%tery.” This prevents direct dictionary attacks.
The security of the password increases if you choose the words randomly. To do this, you can use the “diceware method.” For instance: You roll a die five times in a row, giving you five numbers from 1 to 6. Then you look up the corresponding word in a diceware word list. The list contains several thousand words numbered by five die numbers between 1 and 6. This way, you get a password that is easy to remember and cannot be cracked directly by a dictionary or brute-force attack.
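The dice procedure can be imitated in software, as in the sketch below. Note the assumptions: `PLACEHOLDER_WORDS` is a dummy list generated on the spot so that the example runs on its own, and a published diceware list would take its place in practice. The default of five words and the hyphen separator are likewise just example settings.

```python
import secrets
from itertools import product

def roll_key(rolls: int = 5) -> str:
    """Simulate five die rolls, for example '41523'."""
    return "".join(str(secrets.randbelow(6) + 1) for _ in range(rolls))

# Placeholder only: a real diceware list assigns an actual word to each key.
PLACEHOLDER_WORDS = {
    "".join(key): f"word{index:04d}"
    for index, key in enumerate(product("123456", repeat=5))
}

def diceware_passphrase(word_count: int = 5, separator: str = "-",
                        words=PLACEHOLDER_WORDS) -> str:
    """Look up one word per simulated series of rolls and join the words."""
    return separator.join(words[roll_key()] for _ in range(word_count))

print(len(PLACEHOLDER_WORDS))  # 6**5 = 7776 possible keys
print(diceware_passphrase())
```

Five rolls select one of 6⁵ = 7,776 words, so each randomly chosen word contributes about 12.9 bits of entropy, and five words together give roughly 64 bits even if the attacker knows the list.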
The End of Passwords?
Meanwhile there are increasing efforts to get rid of passwords altogether. This has been a recurring theme for several years, but in May 2023, tech giant Google took the first step: allowing users to log in with passkeys that merely require biometric data. Another option is to receive a code on an authenticated device (such as via SMS to a cell phone) and then type it in on the website. The user-specific data (for example, the biometric characteristics) are stored locally on the user’s device—so providers such as Google do not have direct access to it. Microsoft, Apple and other major companies announced in 2022 that they are also looking at secure password alternatives.
Until the time comes when we no longer have to remember passwords (if that day ever arrives), password managers likely offer the most protection. Even if you have a sophisticated system that results in a different password for each user account, the mere fact that you can remember that pattern means it is likely not sophisticated enough.
This article originally appeared in Spektrum der Wissenschaft and was reproduced with permission.