Publishable Humanly Usable Secure Password Creation Schemas

Yüklə 0,7 Mb.

Pdf görüntüsü

tarix	08.08.2018
ölçüsü	0,7 Mb.
	#61628

Publishable Humanly Usable Secure Password Creation Schemas*

Manuel Blum and Santosh Vempala

CMU, Pittsburgh PA 15213 and Georgia Tech, Atlanta GA 30332

mblum@cs.cmu.edu

, vempala@cc.gatech.edu

Abstract

What can a human compute in his/her head that a powerful adver-

sary cannot infer? To answer this question, we define a model of

human computation and a measure of security. Then, motivated

by the special case of password creation, we propose a collection

of well-defined password-generation methods. We show that our

password generation methods are humanly computable and, to a

well-defined extent, machine uncrackable. For the proof of secu-

rity, we posit that password generation methods are public, but

that the human’s privately chosen seed is not, and that the adver-

sary will have observed only a few input-output pairs. Besides

the application to password generation, our proposed Human

Usability Model (HUM) will have other applications.

Introduction

Passwords are maps from challenges (e.g., website names)

to responses (strings of characters). The ideal password is

hard to crack, i.e., an adversary that knows the password

generation method, but not the human’s privately chosen

seed, and that knows only a few challenge-response pairs,

will

provably (as opposed to probably) have little clue

about the response to a new challenge. Even the knowledge

of a half dozen challenge-response pairs will give a combi-

nation supercomputer-human adversary little chance to

respond correctly.

Constructing a password is a trivial problem for a com-

puter. All it needs to do is store a table containing a com-

pletely random string for each challenge of interest. It can

then respond to any challenge by table look-up

In this

case, an adversary who knows all but one challenge-

response pair has negligible information for responding to

the missing challenge.

But this does not work for humans [Shay et al., 2014].

Indeed, people commonly use passwords that are too sim-

ple [Bonneau, 2012; Cranor, 2014] and/or they use the

same password for many different websites [Ives et al.,

2004] and/or they use password vaults that, if cracked,

expose all their passwords [Li et al., 2014]; see [Blocki,

2014] for a detailed discussion and references. People are

mostly unwilling to make the effort to generate and memo-

rize random responses to 100 challenges, so this password

generation method is

not humanly usable. The motivating

problem for this paper is the following:

Are there any humanly usable, secure and publishable

password generation methods?

Informally, human usability has two requirements:

Preprocessing Time (PT1): Any initial long-term

memorization should take at most 1 hour, preferably less

than 20 minutes; future rehearsals should take at most 2

minutes each for a total of no more than 1 hour over the

user’s lifetime.

Processing Time (PT2): Generation of a 10-character

password should be done entirely in the user's head

(without the use of paper, pencil, phone), and should

take at most 30 seconds, preferably less than 20 seconds.

With these stringent requirements, what can a human

do? What level of security can be achieved for password

generation?

In the next section, we formally define the space of

allowable password schemas. Following that, we give

several candidate schemas. Then we introduce our Hu-

man Usability Model (HUM) to define and quantify the

space of human algorithms. The subsequent section de-

fines security parameters. Armed with these measures,

we analyze the proposed schemas. This will let us state

our (existence) theorem for good password schemas. As

a note of caution, we also demonstrate a natural schema

that is not humanly usable (an impossibility theorem).

Password schemas: definitions and desiderata

Passwords are maps from challenges to responses. We

also use the word password (as is commonly done) to de-

note the response to a challenge

. With this definition, a

challenge is typically a website name, e.g. AMAZON,

PHUSPCS, pronounced “phoospics” as in “toothpicks”.

Hopefully, this dual use of the word will not be cause for confusion.

AMEX, EBAY; the associated response is the password.

To login to a website such as AMAZON, for example, the

user must enter -- besides the user’s login name -- the

password response to the challenge AMAZON.

A password schema consists of a sample space of al-

lowable challenges called the dictionary and a set of in-

structions for transforming challenges in the dictionary

into passwords. That set of instructions typically has two

stages, of which the second is the processing stage for

transforming challenges into passwords (about which we

say more later). The first is the preprocessing stage,

which explains:

what must be memorized (e.g., a random map from

26 characters to 10 digits), and

what operations are to be performed on the chal-

lenge (e.g., a direct map from the characters in the

challenge to the corresponding digits in the re-

sponse) to create the password.

Both 1 and 2 above must include explicit directions for

memorizing and for operating on challenges so that a hu-

man can do these in a well-defined acceptable amount of

time.

The goal is to have a schema for creating and generating

passwords that is 1) humanly usable, 2) self-rehearsing, 3)

secure, 4) publishable and 5) analyzable. Here are the

meanings of these terms, in reverse order:

Analyzable means that the schema is so explicitly well-

defined that a Turing Machine, in practice a computer

without access to the web, can follow its directions. By

comparison, most currently recommended schemas for

constructing passwords are "soft": they are not algo-

rithms. “Take a sentence and concatenate the first letters

of all words to create the password,” while good advice as

far as it goes, invariably does not say how the sentence is

to be chosen; on the rare occasions when it does say how,

it does not say it in a way that a Turing Machine can gen-

erate the password. Precise (Turing Machine) algorithms

are necessary for security and human usability to be ana-

lyzed.

Publishable means that the schema for generating and

recalling passwords is publicly available for everyone to

know. In particular, the adversary knows exactly how the

user creates responses (passwords) to challenges (website

names); the only unknown to the adversary is the particu-

lar random choices made by the user in the preprocessing

stage. This selection is private (to the user).

Note that analyzable and publishable are not the same:

many schemas have been published that are not sufficiently

precise to be analyzable. Schemas that are analyzable are

not in general published.

Secure means that an adversary cannot generate a user’s

password to a new website, a website whose user password

she has not seen.

Security for passwords schemas is measured very differ-

ently from security for most (other) cryptographic sche-

mas. With most cryptographic schemas, security can be

based on the fact that the adversary has limited, typically

polynomial time bounded, computational power. In pass-

word schemas the adversary is a human with a supercom-

puter, while the user is a human without a computer. As

the adversary has vastly greater power than the human us-

er, proofs of security are generally information theoretic:

the adversary is assumed to be capable of computing any-

thing computable.

In this work, complexity considerations do arise, but on-

ly in the rather severe limits they place on the user. Even

finite automata are too powerful to model the computations

that a human can do in his head. The only satisfactory

proof that a human can compute something in a given

amount of time is to see a human do it. Equivalently, we

can specify the steps that the human must perform to gen-

erate passwords, estimate the time for each, then put this

all together to estimate the time it will take the human to

do his part.

Self-rehearsing means that in the process of computing

passwords for frequently entered websites, the user gets all

the practice he needs to compute passwords for infrequent-

ly entered websites. In other words, computing the pass-

words for commonly visited sites rehearses both the algo-

rithm (schema) and the maps that are used to construct the

passwords for the other sites.

In our experience, at typical rates of password genera-

tion, all 22 letters except Z, X, Q, J are rehearsed suffi-

ciently often to be remembered without additional rehears-

als. These are the self-rehearsed letters. As for the Z, X, Q,

J maps above, they can either be deliberately rehearsed, or

all have the same map. In addition, only lines of code in

the password algorithm that are used for the most frequent-

ly visited websites will be rehearsed often enough to be

remembered without additional rehearsals.

Humanly usable means that a human can learn the sche-

ma, then use it to generate passwords, and that all this can

be done satisfying certain time requirements (given earlier)

for preprocessing and processing. In a later section, we

will present a Human Usability Model. It defines a set of

operations that humans can perform and gives measures for

the costs of these operations.

The Dictionary: A password-generation schema must

define the sample space of (allowable) challenges. Exam-

ples include:

All 3-letter strings of letters, with every 3-letter string

equally likely.

A standard English dictionary of 10,000 or so words,

each with its probability of occurrence in ordinary

English text.

The top 500 currently most popular websites, each

with its probability of being logged into.

Sample schemas

Most of our schemas will involve implicit or explicit char-

acter-to-character maps. We write letters of the challenge

as capitals A, B, C… and their map values (A)f, (B)f,

(C)f… as lower-case letters (A)f=a, (B)f=b, (C)f=c…. We

use (A)f instead of f(A) because human computation re-

quires first retrieving A and then applying f.

Our first set of schemas use a letter-to-digit map.

Digit Schema 1 (DS1)

Preprocessing: Memorize a random letter-to-digit map f

(we discuss the important issue of how to do this later).

Processing: Apply the map to the distinct letters of the

challenge in the order they appear.

Example: f(AMAZON) = (A)f (M)f(Z)f (O)f (N)f = amzon

(Note: the lower-case letters represent digits; we write f

after the letter it is applied to; we skipped the second A).

Caution: short challenges, or ones with many repeated

letters will lead to short responses, e.g., f(AAA) = (A)f = a.

This is not a favorite schema.

In what follows, we use and

to denote addition and

multiplication mod 10.

Digit Schema 2 (DS2)

Preprocessing: Memorize a letter-to-digit map f.

Processing:

For a challenge

1. Output

2. For

Output

Remark 1: A variant of the above schema is to use a ran-

dom-looking starting point for processing. This can be

done by first computing a starting location

and starting at that location with

, then wrapping around the chal-

lenge when needed to get b

…b

n-1

Digit Schema 3 (DS3)

Preprocessing: Memorize a random letter-to-digit map f

and a random digit permutation g.

Processing: For a challenge

1. Output b

= ((A

n-1

)f )g

2. For

Output

i

= (b

i-1

)f )g

The next few schemas do not use any arithmetic.

Word Schema 1 (WS1)

Preprocessing: Memorize a letter-to-word map, i.e., pick

a word for each starting letter A-Z. For this we suggest that

the user first choose a topic (e.g., “Animals”), then gener-

ate a list of 10 or more words from that topic for each start-

ing letter (using the internet if he wishes) picking one of

those 10 words at random as his map value. If he cannot

find 10 words for some letters, he can widen the topic (e.g.,

“Animals and Birds”). For rare letters such as Q and X, he

can simply pick a random word. Let f map a letter to the

first TWO consonants in the word following the starting

letter. E.g., if A

AARDVARK, then (A)f=RD.

Processing: Given a challenge, apply the map f to the

first k distinct letters of the challenge, or to all letters of the

challenge if it is of length less than k. We suggest k=4.

Example: Suppose

A AARDVARK

M MONGOOSE

Z ZORILLO

O ORANGUTAN

Then, f(AMAZON) = (A)f (M)f (Z)f (O)f=RDNGRLRN.

Letter Permutation Schema 1 (LP1)

Preprocessing: Pick a random permutation of the 26

letters A-Z, and memorize the sequence. For this we sug-

gest the user recite the permutation to a tune, e.g., the

standard alphabet song:

ABCD EFG

HIJK LMNOP

QRS and TUV

W and X, Y and Z…

For example, if the random permutation is

XBUDVOWTNRPGAHFJSZYLIMQKEC, then the tune

could go:

XBUD VOW

TNRP GAHFJ

SZY and LIM

Q and K, E and C…

Let f map a letter to the next letter in this sequence, with

the last mapping to the first.

Processing: Given a challenge, apply the map f to each

distinct character of the challenge.

Note: The user does not have to make a pass through the

challenge to determine which letters have not occurred

earlier. On challenge MAMAMIA, for example, the user

will recall, on the second and third occurrences of M and

A, that he already printed (M)f and (A)f respectively.

Caution: It is advised that each challenge in the user’s

dictionary should contain at least 4 distinct characters.

Remark 2: For the above schemas that output characters, a

random-looking start could be obtained by fixing a rule

such as: “start one past the second vowel” or “start at the

vowel before the last consonant”.

Remark 3: Many real-life websites require that passwords

contain special characters and/or both capital and lower-

case letters. To comply with this, we suggest that the user

employ a fixed rule for all passwords, such as: “capitalize

the first letter and terminate the password with @1”. The

purpose of this is not to increase the security but to have

self-rehearsing methods for making passwords legal.

Letter Permutation Schema 2 (LP2)

Preprocessing: Pick a random word with a great many

different letters. In a table of the 26 letters, mark off those

letters that appear in the word. Pick another random word

that uses many unmarked letters (and possibly some

marked ones) and mark off those letters in the table. Pick a

third word that uses many unmarked letters and mark those

letters off the table. Memorize the three words in the string

word1-word2-word3, and a final string4 consisting of let-

ters (if any) that do not appear in any word, e.g., QXZ if all

letters except Q, X and Z appear in the three chosen words.

Let f map each letter in the challenge to whichever conso-

nant follows its first occurrence in the sequence word1-

word2-word3-string4.

Processing: Apply f to the distinct letters of the chal-

lenge. If a letter is a repeat, map it to the next letter in the

sequence, wrapping around the unused letters if needed.

Example: ANTIVIRAL-MOHAWKBUG-ESCAPED are

the three words and FJQXYZ is string4. Then,

f(AMAZON) = (A)f(M)f((A)f)f(Z)f (O)f(N)f = NHTFHT.

The above schemas can be extended to multiple maps

(letter-to-digit or letter-to-letter), allowing for proportional-

ly more security, needing proportionally more prepro-

cessing, but only slightly more processing. We call these

multi-map schemas, and provide one illustration.

[MMS1]

Preprocessing. Memorize m letter-to-letter maps,

Processing. Given a challenge of length n, compute t = n

(mod m), apply the t

map to the challenge.

Remark 4: There are several ways the user might choose a

meta-map to decide which of his maps to apply to a chal-

lenge. E.g., he could use a letter-to-digit map and compute

to determine which map to use.

All the above schemas can be applied by making one or

at most two passes over the challenge. In contrast, the fol-

lowing schema appears to require substantially more mem-

orization and far too many passes.

Random Matrix Schema (RM)

Preprocessing: Memorize an n x n matrix M of random-

ly chosen digits and a random letter-to-digit map f.

Processing: given a challenge

for

Output

The Human Usability Model (HUM)

We model the human as a finite state machine with both

long-term and short-term memory. The long-term memory

is written in the preprocessing stage, when the human

stores the function(s) that must be memorized. That long-

term memory is used in read-only mode together with a

tiny, short-term, read-write memory in the processing

stage to transform any given challenge to its response. The

state diagram of the machine that does this, i.e., accesses

both types of memory in order to transform challenge into

response,

is described in part in the preprocessing stage

and in part in the processing stage of each password sche-

ma. The state diagram must be learnable in minutes, not

hours, from a description of the code and a few examples.

It must also be completely self-rehearsing in that every line

of code defining that state diagram must be used in a sub-

stantial fraction (quarter? half? the larger the fraction, the

more self-rehearsing is the code) of challenge-response

computations. A password schema that chooses its starting

location in the challenge by randomly hashing the first

letter of the challenge to the start location would not be

self-rehearsing and therefore would not be permissible.

More on Long- and Short-term Memory

For the purpose of transforming challenges into responses,

human memory consists of a large long-term relatively

permanent memory and a small short-term working

memory.

1. Long-term memory is unbounded except for

the (substantial, proportional) time and effort needed to

acquire it (up to a concentrated hour or so),

the relatively small time and effort (in spaced rehears-

als) required to maintain it (measured in minutes over a

lifetime), and

the length of one’s (mentally healthy) life.

2. Long-term memory (for our purpose) consists of se-

quences of text. The storage of these texts may be based on

visual, auditory, sensory, motor, experiential and other

memories, all of which can play a role in computing and

remembering passwords. Since this work is concerned

with transforming challenge strings into response strings,

only the memory required to store strings of characters

(rather than, say, visual images), and the rules/password

schemas for transforming them are described here.

3. Just as humans typically store their memory of the al-

phabet A, B… Z in a singly-linked list, which can typically

only be recited left-to-right from the starting point and a

few other entry points in the list, the input challenge words

are streamed into the machine looking much like a singly-

linked list in long-term memory. The human has direct

access to the input characters in the order they arrive.

Short-term memory enables going back at most one char-

acter. The human also has pointer access to the first, se-

cond, last and last-but-one characters of the input se-

quence.

4. Working memory is severely bounded. In password

computations, it typically consists of 2 characters (2 digits,

2 letters, or 1 letter and 1 digit) and a pointer into one or at

most two sequences, these being typically the challenge

itself and at most one additional sequence.

5. Working memory fades quickly. Items in working

memory that are needed but not recalled for more than 2 or

3 operations must be reconstructed.

We are now ready to define our measure of the cost (or

perhaps more precisely, the effort) of human computation.

HUM Processing Complexity = total number of reads

from and writes to working memory.

The total combined Preprocessing Complexity (PC1)

and Processing Complexity (PC2) is a vector: HUM =

Complexity of some natural operations

● Working memory for letters and digits is a stack of two

elements (letters or digits). The cost of accessing the top

(most recent) entry is 1, and that of accessing the bottom

entry is 2.

● Retrieving a sequence from recently retrieved long-term

memory, i.e., retrieving a pointer to the beginning of the

sequence, has Cost=1.

● Following a pointer into a recently-retrieved sequence in

long-term memory, moving it 1 step to the right, resetting

the pointer to start/start+1 or to end/end-1 has Cost=1.

● Operations of =, + and x (mod 2,3,4,5,9,10) on two sin-

gle-digit numbers. Cost = number of digits created (not

necessarily all written) during the operation.

Example: 4+3 (mod 10) = 7 has cost 1; 4+9 (mod 10) = 3

has cost 2.

● Operations of =, + and x (mod 11) on two single-digit

numbers has Cost = number of digits created (not neces-

sarily written) before applying the mod operation + 1 for

the mod operation on numbers > 11.

Example: 4+3 (mod 11) has cost 1; 4+9 (mod 11) has cost

● Operations of =, + and x (mod 2,3,4,5,9) on two single-

digit numbers are treated similarly.

● Applying a map such as a character-to-digit map is the

same as following a pointer and has Cost =1.

Remark 1

: The HUM only indicates proportional effort for

a human, the actual time will vary with the human. This is

analogous to asymptotic complexity, which counts the

number of operations, while the running time depends on

the machine executing the algorithm and its current load.

An important difference between human and machine

computation counts is that, since challenges typically have

length n ≈ 12 or less, care must be taken to estimate all

constants, not just accept their big

Ο or big Ω dependence

on n.

Remark 2

: A major reason for specifying the operations

and their costs is to determine, albeit roughly, the expected

human usability of proposed password schemas. These

computed values should be confirmable experimentally.

Cost of memorization

How are we to measure the cost of preprocessing, i.e. long-

term memorization? First, fix an upper bound on how

much one can memorize in a single sitting and how many

sittings one is willing to do per day. While this can vary

from person to person, most people can memorize one 7-

digit phone number in a single sitting. An efficient way to

retain the number in long-term memory is the following:

taking T = 15 minutes, a phone number memorized at time

t should be rehearsed at times t+T, t+2T, t+4T, t+8T,…, a

roughly “doubling” schedule [Pimsleur 1967; Woźniak &

Gorzelańczyk 1994]. These purposeful rehearsals of a

phone number can stop once the required time-between-

rehearsals exceeds the expected time-between-successive-

uses of the phone number.

Suppose the cost of memorizing one 7-digit phone num-

ber is C, and the cost of rehearsing a 7-digit phone number

is c, c « C. Then we observe that n phone numbers can be

memorized at a cost

of Cn + cn·(log n). If no more than n

phone numbers are ever memorized, the next sitting will

require just c·(log n) time, a big drop from (C + c·(log n)),

and this c·(log n) cost will then decrease slowly to zero.

Security measures

The human usability of a password schema is measured by

its preprocessing and postprocessing (processing) com-

plexities PC1 and PC2. We define additional measures Q

and K to evaluate the security of password schema.

1. Quality Q is formally defined by the password game

below. Informally, Q is the expected number of randomly

chosen challenges and associated responses that an adver-

sary must see in order to have sufficient information to

To see this, suppose a new phone number is memorized in each of sit-

tings k = 1…n, and that in sitting k in which we memorize the k

tele-

phone number, that sitting’s work also includes the log k rehearsals, each

costing c, sufficient to retain all previously memorized phone numbers.

The cost for this k

sitting will be C + c·(log k). That is at most Cn +

cn·log n total for all n phone numbers, reckoned on the sitting in which

we memorize our last (n

) phone number.

generate the correct response in one of at most 10 tries to

the next randomly chosen challenge. Our Q’s are typically

in the range of 6 to 10. By comparison, cryptographic ze-

ro–knowledge protocols have Q = 10

n-1

for integer pass-

words of length n.

2. Exponent K is an integer defining a probability, typical-

for an integer response to a challenge and

for a response consisting of letters and digits. This

is the probability that an adversary (Alice) can guess the

correct response to a new challenge once she has seen (in

previously observed challenge-response pairs) all but one

of the characters in the current challenge. For example,

suppose the schema maps letter strings (challenges) of

length n to digit strings (responses) of length 2n by hashing

each letter in the challenge to a pair of randomly chosen

digits. Further suppose that on challenge EBAY, the ad-

versary has previously seen E, B, A, but not Y. Then K =

2, indicating that the correct response to EBAY is any one

of 10

= 100 equally likely possibilities.

The Password Game (for determining Q)

Recall that the design of a Password schema includes in its

specification a sample space of permissible challenges, the

dictionary. The game is played between the User (he) of

the schema and the Adversary (she) under the watchful eye

of a trusted Judge. There are no winners or losers, just a

determination of Q.

Until the Adversary's 10 responses to a challenge is

guaranteed to contain the correct one:

•

The Judge selects a challenge at random (with

repetition) from the dictionary.

•

User and Adversary independently and privately

supply responses to the Judge. Here,

the User supplies the unique correct response

to the judge;

the Adversary (without seeing the User’s re-

sponse) guesses/computes 10 possible respons-

es, which she supplies to the Judge.

If none of the 10 responses are correct, then the

Judge gives the adversary the correct response.

End Until

Return the expected number of challenges, Q, up to and

including the one on which the Adversary supplied a

correct response.

Analysis of human usability and security for

select schemas

[DS1]

Usability.

Postprocessing: This schema requires the user to apply his

map to the first letter of the challenge, output the value,

and shift his “pointer” to the next letter of the challenge,

each operation costing 1 unit in our model. Thus, for an n-

letter challenge, the Human Usability Measure is HUM =

#(apply map, output, shift pointer in challenge)·n = 3n,

where #(x, y, z) denotes the number of steps to perform

operations x, y, z in this order. The average time it takes to

do this is about n seconds on an n letter challenge. This

comes to 10 seconds for a 10-letter challenge, which is

well within the desired 20 seconds (and the required 1 mi-

nute) per password to satisfy requirement PT2.

Preprocessing: The preprocessing complexity for memo-

rizing a letter-to-digit map, while significant, can be made

less than 1 hour by use of an appropriate memorization

technique. Specifying a technique that upper bounds the

human’s memorization or computation time is an im-

portant aspect of human computability, akin to specifying

an algorithm to achieve a certain run time.

After the user has tossed a 10-sided die 26 times to cre-

ate a random map from 26 letters to 10 digits – an amount

of time that we do not count toward our maximum 3-hour

requirement – he can memorize his map. This he does by

using a 26 row x 10 column table in which each entry pre-

scribes a visual way to memorize the associated letter to

digit map. For example, row W column 6 shows how to

remember that the chosen map takes W 6:

The authors of this paper can do an initial memorization

(up to the first successful recall of the entire map) in a con-

centrated 15-30 minutes using this table. Once the user has

completed an initial memorization of the material, he can

set a timer to rehearse his map using the [Pimsleur, 1967;

Woźniak & Gorzelańczyk, 1994; Blocki et al., 2013] rec-

ommended doubling schedule, which rehearses at intervals

of 15 min, 30 min, 1 hr, 2 hrs, 4 hrs etc. The first day is

hell, but after that the rehearsals are short and (relatively)

painless. Assuming each rehearsal takes 2 minutes, the

total time spent over the user’s lifetime is at most an addi-

tional 45 minutes, which comes to a total of one hour. This

satisfies the PT1 time requirement.

Security. The adversary can respond correctly to a chal-

lenge only if she has seen all letters in the challenge in pre-

vious challenges. If she hasn’t seen even one letter, the

chance of guessing the response to a single letter of the

challenge is at most 1/10, thus K=1. The quality measure Q

will be the expected number of challenge-response pairs up

to (and including) the first challenge all of whose letters

have appeared in earlier challenges. This value depends on

the dictionary. The table below shows the Q value (expec-

tation), its standard deviation and its 90% value (i.e., the

length of challenge at which the probability that Q is at

least this value is at least 0.9) for three dictionaries: a large

English dictionary, the top 500 domain names and random

7-letter strings. All estimates of Q were done using

100,000 trials with repetition.

Dictionary Q

STD

90%-value

English

50K

6.23 1.80 4

Top 500 domains

6.60

2.03

Random

7-letter

7.54 1.58 6

Table 1. Q values

For a dictionary of random strings, Q depends only slightly

on the length of the challenge, as shown in the next table.

1 2 3 4 5 6 7 8 9 10

7.09 8.89 9.12 8.82 8.39 7.96 7.54 7.12 6.97 6.91

Table 2. Q for random strings

[DS2]

Usability. Preprocessing for this schema is the same as

[DS1], requiring the user to memorize a letter-to-digit map.

Processing involves applying this map, adding mod 10,

outputting and shifting pointer. More precisely,

HUM = go to last letter, apply map, go to first letter, apply

map, add mod 10, output; then repeat n-1 times: (apply

map, add mod 10, output, shift pointer)

= 6.5 + (n-1)(4.5) = 4.5n+2.

Here we have counted 1.5 per addition mod 10, since some

additions create 2 digits while others create one, and both

types are of the same number in expectation.

Security. From knowledge of the schema and a challenge-

response pair, the adversary can recover all the map values

of letters in that challenge:

. Thus the

parameters K and Q are the same for [DS2] as for [DS1].

[DS3]

Usability. Preprocessing: This schema needs the user to

memorize both a letter-to-digit map as before (15-30 min

up front, 1 hour total over the user’s lifetime), and also a

random permutation on the digits 0-9. He does this by

choosing a random cyclic permutation of the digits, i.e., a

single 10-digit number

with no repeated digits. The map

takes each digit to the next digit, and the last digit to the

first. This is as hard as memorizing a single 10-digit phone

number, and takes 2 minutes up front and less than 10 min

over the course of the user’s lifetime. We note that with

only the 10-digit number memorized, access to an input

digit takes longer initially – about 5 steps on average. The

user has to scan multiple digits to find the right one. Each

time he does this, however, he rehearses his map: this will

soon give random access to the digit map.

The processing complexity is HUM = #(go to last letter,

apply f, go to first letter, apply f, add mod 10, switch to g,

apply g, output) + #(select next letter, apply f, add mod 10,

switch to g, apply g, output, shift pointer)·(n-1)

= 8.5 + (n-1)(7.5) = 7.5n+1.

The “switch to g” is for computing g after the user has

memorized the map as a linked list but not yet as a random

access hash. Its cost would initially be 5 but would drop

eventually to 1.

Security. A computationally unbounded adversary can

consider all possible cycles on 10 digits (9! =362,880 of

them), and decode the user’s maps as follows: whenever

she notices an inconsistency in the values of f for a particu-

lar g, that g is eliminated. The quality Q is the number of

challenges up to and including those for which the surviv-

ing f, g pairs

all give the same response. In our evaluations,

the Q for [DS3] is about the same as for [DS1] and [DS2].

Nevertheless, [DS3] provides more security in at least one

sense: the adversary needs to see multiple challenge-

response pairs before she can determine the value of f for

even a single letter.

[WS1]

Usability. Preprocessing: The user has to create a map

from letters to words of a topic, perhaps aided by a word

generator, and then rehearse his map. Creating the map

takes 30mins or so, but after that, the map is surprisingly

easy to rehearse and to remember since the starting letter

together with the topic gives a strong clue to the word.

Budgeting 2min per rehearsal, with spaced repetition at

doubling intervals, we get a total of at most 45min over the

user’s lifetime.

Processing: Recalling that we process k letters of the chal-

lenge and output 2k letters, the processing complexity is

HUM = #(apply f, output two consonants, shift pointer on

the challenge)·k = 4k = 2n. This is the lowest complexity

per output character of all the schemas we discuss.

Security. As in the case of [DS1], an adversary has to have

seen each of the first k characters of a challenge in previ-

ous challenges to be able to respond correctly. However,

each challenge-response pair reveals only k values, where

k is typically less than the length of the challenge. Using

k=4 (responses of length 8) gives a significantly higher

value of the quality Q compared to [DS1, DS2, DS3] that

use the entire challenge (Table 1).

Dictionary Q

STD

90%-value

English

50K

7.25 2.02 4

Top 500 domains

7.56

2.24

Random

4-letter

8.82 2.21 6

Table 3. Q values for 4-letter challenges

The security parameter K is between 1 and 3. If there is

one letter in the new challenge that the adversary has not

seen, he has to map it to two letters. If these two letters

were both completely random then his chance would be

only

However, since we

output consecutive consonants of a word, and consonants

are not equally likely, the true probability is a bit higher.

[LP1]

Usability. Preprocessing: Memorizing a random letter

permutation by singing it to a tune takes 5 mins initially

and 1 min per rehearsal, giving a total of 20 additional

minutes over the user’s lifetime.

Processing.

HUM = #(apply map, output, shift pointer)·n = 3n.

We remark that while we have charged a cost of 1 for ap-

plying the “next-letter” map, this could initially take signif-

icantly longer, as the user would have to recite (sing) an

expected 13 letters to compute his hash. With time, how-

ever, he will get faster. This schema will be truly self-

rehearsing if and only if the user recites the entire permuta-

tion each time he generates a password.

Security. The quality Q is the same as in previous schemas

that use all n letters of a challenge (Table 1).

[LP2]

Usability. This schema has the distinct advantage of very

low preprocessing complexity --- once the user has chosen

his three words, he has practically got them already memo-

rized! The processing complexity is similar to LP1, con-

verging to HUM = 3n, after sufficient practice.

Security. The quality Q is potentially lower, since an ad-

versary who knows the schema, after seeing some letters,

could try to guess the user’s words, which come from a

significantly smaller set of possibilities than all letter per-

mutations.

[RM]

Usability. This schema, introduced primarily to demon-

strate the tradeoff between usability and security, requires

a huge preprocessing effort, random access memorization

of an n x n matrix of random digits, for n about 8, 9, 10,

with the user learning a pointer to a value in each position

(i,j) of the matrix. Unlike all the previous schemas, which

we have successfully implemented on ourselves, we have

not done or even tried to do this one. We do believe it is

possible, given routine memory feats such as learning the

digits of up to hundreds of digits. The processing com-

plexity, at least for the natural human algorithm, can be

computed as follows: for each output digit, make one pass

over the challenge: apply the matrix map, apply the letter-

to-digit map, multiply mod 10, add mod 10, then shift the

pointer into the challenge. Thus, for an n-digit output,

HUM = #((apply M, retrieve f, apply f, multiply, add, shift

pointer)· , output, shift M pointer)·

Here we counted 2 per multiplication since multiplying

two single digits typically results in two digits. This im-

plementation uses n passes over a challenge of length n.

The resulting processing complexity is the highest of all

the schemas we have considered. Is there a faster human

algorithm? In a later section, we will prove that the number

of passes needed by any human algorithm grows linearly

with the length of the challenge.

Security. The adversary has to know the entire matrix M

and the user’s map on the letters of the next challenge, to

be able to output the correct response with probability

more than 1/10. Each challenge-response pair gives n

equations. Thus Q is at least n. An unbounded adversary

could keep track of all

possible letter-to-digit maps f,

and eliminate them as she finds inconsistencies. A rough

lower estimate of Q is n + (26/n) = 11.25 for n=8.

[Multi-map schemas]

Usability. The preprocessing complexity is m times the

preprocessing complexity of a single map. For letter-to-

digit maps he can memorize a single map from letters to

m-digit numbers. For letter-to-word maps, he can pick dif-

ferent topics for each map.

While processing, first the user has to compute the map

number t. Then HUM = n+1 if he uses

. If

he uses

, then HUM=5 for identify-

ing the map. The remaining processing cost is that for a

single map schema. For example, for [DS1], whose pro-

cessing cost for a single map is 3n, the processing cost of a

multi-map schema would be 3n+5 with the latter rule for t.

Security. This schema has a Q with expected value that is

nearly m times the Q-value for a single map. To see this,

suppose for argument that it takes exactly Q challenge-

response pairs for a single map. If we think of the k maps

as k bins and each challenge as a ball thrown into a random

bin, then the game stops when one of the bins reaches Q

balls. With k bins, we stop as soon as one of them reaches

Q balls, which will happen slightly before all k of them do.

Schema Preprocessing Processing

HUM Q

DS1 15+45

6-7

DS2 15+45

4.5n+2

6-7

DS3 17+55

7.5n+1

6-7

WS1 30+45

(=16)

7-8

LP1 5+20

6-7

LP2 1+1

(eventually) 3-4

MWS1 m(30+45)

3n+5

6·m

RM ??

+ 2n

11

Table 4. Comparison of Schemas

The ultimate proof that a schema is humanly usable is to

see a human do it within the prescribed constraints. To do

this, and to validate the HUM, we tested ourselves on most

of these schemas. These results are in the tables and charts

below. Just as a computer’s timing depends on what other

processes it is executing, a human’s speed also depends on

several factors such as energy level, distraction etc. To

avoid these confounds, we timed for all the schemas in one

sitting, using 11 challenges in random order.

Table 5. Timings for H2

We compared these observed timings to their HUM val-

ues by plotting the data for [DS1], [DS2], [DS3] and

[WS1] against the length of the challenge (Figure 1). The

HUM for [WS1] is 8 for a challenge with at least k=4 dis-

tinct letters, i.e., constant.

Using the same challenges, we timed 8 other human sub-

jects on [DS1] and [LP1], with each schema timed in a

separate sitting. The average timings are shown in Figure

2. The two schemas have comparable timings, consistent

with their HUM values. We expect them to get even closer

as the users rehearse their letter permutations.

A human password schema theorem…

Based on the previous section, we assert the following.

THEOREM 1. There exist publishable PASSWORD

SCHEMAS that are

1. WELL-DEFINED (i.e., implementable unambiguously

on a Turing machine),

2. MACHINE UNCRACKABLE to a well-defined extent

by unbounded Turing machines (i.e., information- theoreti-

cally, based on the Q value), and

3. HUMANLY USABLE as witnessed by a feasible human

algorithm with a bounded HUM, and demonstrated by at

least one normal dedicated human being with at most 1

hour of preprocessing, and at most 30 seconds of pro-

cessing (i.e., shown empirically).

…and an impossibility theorem

Here we show that any human algorithm that implements

the [RM] schema has to make a prohibitively large number

of passes over the challenge, even for some simple M.

THEOREM 2. There exist

matrices

such that for

any

, any algorithm that computes

by making

at most passes over any given -digit challenge must

use a working memory of size at least

digits.

The theorem is tight: we can compute n/k digits of output

in each pass using k digits of working memory.

However,

since the working memory allowed in our human usability

model is a small constant (2 digits), at least

passes are

needed to compute

for a challenge of length .

Proof of Thm 2. Let M be the matrix that "reverses" chal-

lenge

, i.e.,

. First we

prove the bound for any 1-pass algorithm. A 1-pass algo-

rithm that scans only once cannot output till it sees the

last digit of . Consider the working memory contents S at

that point. The algorithm is able to compute from just S

and

. From we can reconstruct since M is full rank.

N DS1 DS2 DS3 WS1 LP1 LP2

CMU

3

3 6 16

5 5 6

APPLE

4 6 22

6 10

GMAIL

4 9 19

6 8 18

DELTA

4 6 14

6 7 19

CHASE

5 7 16

6 14

AMAZON

7 10 24 7 15 17

GATECH

6 8 23

6 14

PAYPAL

5 8 20

7 9 17

DROPBOX 7

8 11 22 6 14 18

PNCBANK 7

6 13 25 7 18 21

WELLFARGO

10 18 29 5 17 35

Figure 1. Timing comparison in one sitting

Figure 2. Average times of 8 participants

So S and

together determine , which means S must

have at least n-1 digits, and along with reading

(which

is essential), this is a total of n digits of memory.

To analyze multi-pass algorithms, we take the challenge

to be a random digit sequence. For a proof by contradic-

tion, suppose that the algorithm uses only digits of

memory for

. Then by the above argument, after

one pass, the algorithm can output at most digits of ,

and these must be a subset of the s-prefix of . In the se-

cond pass, consider the point at which the digit

is out-

put. After this, due to the space restriction, at most s-1 dig-

its can be output in the second pass, so at most a 2s-long

prefix of is output after two passes. This conclusion

clearly holds if

is not output in the second pass. We

repeat this argument, using

in the i

pass, con-

cluding

that

k+1

passes

are

needed. ■

We remark that for a random matrix M, the above proof

technique shows that with probability at least 9/10 (over

the choice of M), any 1-pass algorithm needs memory of

size at least n. This is because the last entry of the first row

is nonzero with probability 9/10; in this case, the algorithm

can start its output only after reading the last digit, at which

point it needs as much memory as its total output.

This lower bound technique applies to many natural

matrices. For other matrices, a human algorithm making

only one or two passes can compute the product C

·

M. E.g.,

DS2 uses the matrix with 1’s on and below the diagonal

and zeros above it. This raises two questions for future

research: for which matrices can C

·

M be computed by a

human algorithm? Which of these gives the most security?

Further considerations

Using letter frequencies. The frequency of letters in Eng-

lish words is highly skewed, and the first 10 letters by fre-

quency make up more than 70% of all occurrences. Thus,

an easier memorization option that works nearly as well as

a complete map is to learn a map for just the first ten most

frequent letters: E, T, A, O, I, N, S, H, R, D. In the case of

a letter-to-digit map, the corresponding digits could be read

off as a single phone number. For a letter-to-letter map, we

recommend mapping only to consonants, then inserting

letters to make the 10 consonants form words and sounds

that are more familiar and therefore easier to remember.

Changing passwords. Passwords sometimes have to be

changed. How should a user who is using one of the pass-

word generation methods described here change his pass-

word(s)? The simplest, oft-suggested method to change

passwords is to attach an extra character at the beginning

or end of the password and cycle through its chosen range

of values, e.g., 0,1,2,0,1,2…. This has a serious problem: a

new password that differs from the old one in only one

digit will not be particularly secure.

A more effective method to change passwords is to set

challenge equal to one of challenge0, challenge1, … chal-

lenge9 (or 0challenge, 1challenge, …). These challenge

will turn into passwords that provably look quite different

from each other. This is in fact a desideratum for password

generation methods.

Acknowledgements

We are deeply grateful to Jeremiah Blocki, Lenore Blum,

Anupam Datta, Adam Kalai, V. V. Rao and Haofeng

Zhang for helpful discussions, and to many others for try-

ing out some of these schemas. This work was partially

supported by the NSF.

References

Blocki, J. 2014. Usable Human Authentication: A Quantitative

Treatment. Ph.D. Thesis. CMU-CS-14-108.pdf

Blocki, J., Blum M. & Datta, A. 2013. Naturally Rehearsing

Passwords. ASIACRYPT (2): 361-380.

Bonneau, J. 2012. The science of guessing: analyzing an anony-

mized corpus of 70 million passwords. IEEE Symposium on Secu-

rity and Privacy (SP), 538–552.

Bonneau, J., Herley, B., Oorschot, P. C. v., and Stajano, F. The

Quest to Replace Passwords: A Framework for Comparative

Evaluation of Web Authentication Schemes 2012. IEEE Sympo-

sium on Security and Privacy, 553 – 567.

Cranor, L.F. What’s wrong with your pa$$w0rd? 2014. TED talk,

http://www.ted.com/talks/lorrie_faith_cranor_what_s_wrong_wit

h_your_pa_w0rd/transcript?language=en

Li, Z., He, W., Akhawe, D. and Song, D. 2014. The Emperor's

New Password Manager: Security Analysis of Web-based Pass-

word Managers. USENIX Security 2014: 465-479.

Ives, B., Walsh, K. R., and Schneider, H. 2004. The Domino

Effect of Password Reuse. Communications of the ACM, April

2004, 47(4), 75-78.

Kruger, H., Steyn, T., Medlin, B. and Drevin, L. 2008. An empir-

ical assessment of factors impeding effective password manage-

ment. Journal of Information Privacy and Security, 4(4):45–59.

Pimsleur, P. 1967. A Memory Schedule. The Modern Language

Journal, 51(2), 73—75.

Shay, R., Komanduri, S., Durity, A. L., Huh, P. (S.), Mazurek, M.

L., Segreti, S. M., Ur, B., Bauer, L., Christin, N. and Cranor, L. F.

2014. Can long passwords be secure and usable? CHI 2014:

2927-2936.

Woźniak, P. A., & Gorzelańczyk, E. J. 1994. Optimization of

repetition spacing in the practice of learning. Acta Neurobiologi-

ae Experimentalis, 54, 59-62.

Yüklə 0,7 Mb.

Dostları ilə paylaş: