How to generate a verification code/number?

Given these requirements, how would you generate such a number?

@Haaked: The code has to be numerical because the user types it with its phone.

@matt b: On the first step, the code is displayed on a Web page, the second step is to call and type in the code. I don't know the user's phone number.

Followup : I've found several algorithms to check the validity of numbers (See this interesting Google Code project : checkDigits).

18.7k 12 12 gold badges 66 66 silver badges 94 94 bronze badges asked Sep 5, 2008 at 16:28 5,950 8 8 gold badges 34 34 silver badges 35 35 bronze badges

The algorithm in the phone can be easily be reverse engineered. So it can't be used as an anti fraud or spam measure.

Commented Feb 3, 2019 at 18:51

9 Answers 9

After some research, I think I'll go with the ISO 7064 Mod 97,10 formula. It seems pretty solid as it is used to validate IBAN (International Bank Account Number).

The formula is very simple:

Take a number : 123456
Apply the following formula to obtain the 2 digits checksum : mod(98 - mod(number * 100, 97), 97) => 76
Concat number and checksum to obtain the code => 12345676
To validate a code, verify that mod(code, 97) == 1

mod(12345676, 97) = 1 => GOOD
mod(21345676, 97) = 50 => BAD !
mod(12345678, 97) = 10 => BAD !

Apparently, this algorithm catches most of the errors.

Another interesting option was the Verhoeff algorithm. It has only one verification digit and is more difficult to implement (compared to the simple formula above).

3,956 4 4 gold badges 38 38 silver badges 59 59 bronze badges answered Sep 5, 2008 at 18:30 5,950 8 8 gold badges 34 34 silver badges 35 35 bronze badges

If hostile users are expected (that seems implied by the question), it's going to be very easy for a user to generate a valid ID using this algorithm.

Commented Jul 9, 2009 at 12:42

Hostile users were not really an issue in this application. Still, I added some logic to "mix" and "unmix" bits in order to make it less straightforward to guess next numbers.

Commented Jul 14, 2009 at 0:10

When calculating a checksum with an IBAN-like length, keep in mind that the intermediate sum often does not fit in a signed 32-bit integer.

Commented Jul 4, 2011 at 22:17

The checksum could be a 1 digit. so, in that case you should prefix cero. e.g. for 256 , the code will be 25609 . don't confuse with 2569

Commented Oct 11, 2013 at 20:23

@nick Johnson i don't think hostile users are implied in the question. also, check digits are not to prevent hostile users but to reduce data entry errors.

Commented Oct 12, 2016 at 9:57

For 1M combinations you'll need 6 digits. To make sure that there aren't any accidentally valid codes, I suggest 9 digits with a 1/1000 chance that a random code works. I'd also suggest using another digit (10 total) to perform an integrity check. As far as distribution patterns, random will suffice and the check digit will ensure that a single error will not result in a correct code.

Edit: Apparently I didn't fully read your request. Using a credit card number, you could perform a hash on it (MD5 or SHA1 or something similar). You then truncate at an appropriate spot (for example 9 characters) and convert to base 10. Then you add the check digit(s) and this should more or less work for your purposes.

answered Sep 5, 2008 at 16:33 Kyle Cronin Kyle Cronin 78.7k 45 45 gold badges 151 151 silver badges 167 167 bronze badges

You want to segment your code. Part of it should be a 16-bit CRC of the rest of the code.

If all you want is a verification number then just use a sequence number (assuming you have a single point of generation). That way you know you are not getting duplicates.

Then you prefix the sequence with a CRC-16 of that sequence number AND some private key. You can use anything for the private key, as long as you keep it private. Make it something big, at least a GUID, but it could be the text to War and Peace from project Gutenberg. Just needs to be secret and constant. Having a private key prevents people from being able to forge a key, but using a 16 bit CR makes it easier to break.

To validate you just split the number into its two parts, and then take a CRC-16 of the sequence number and the private key.

If you want to obscure the sequential portion more, then split the CRC in two parts. Put 3 digits at the front and 2 at the back of the sequence (zero pad so the length of the CRC is consistent).

This method allows you to start with smaller keys too. The first 10 keys will be 6 digits.

answered Sep 5, 2008 at 18:34 Jim McKeeth Jim McKeeth 38.6k 25 25 gold badges 124 124 silver badges 197 197 bronze badges

Does it have to be only numbers? You could create a random number between 1 and 1M (I'd suggest even higher though) and then Base32 encode it. The next thing you need to do is Hash that value (using a secret salt value) and base32 encode the hash. Then append the two strings together, perhaps separated by the dash.

That way, you can verify the incoming code algorithmically. You just take the left side of the code, hash it using your secret salt, and compare that value to the right side of the code.

answered Sep 5, 2008 at 16:44 58.9k 14 14 gold badges 91 91 silver badges 115 115 bronze badges

I must have a reasonnable number of possible combinations (let's say 1M)
The code must be as short as possible, to avoid errors from the user

Well, if you want it to have at least one million combinations, then you need at least six digits. Is that short enough?

answered Sep 5, 2008 at 16:33 Chris Upchurch Chris Upchurch 15.5k 6 6 gold badges 52 52 silver badges 66 66 bronze badges

When you are creating the verification code, do you have access to the caller's phone number?

If so I would use the caller's phone number and run it through some sort of hashing function so that you can guarantee that the verification code you gave to the caller in step 1 is the same one that they are entering in step 2 (to make sure they aren't using a friend's validation code or they simply made a very lucky guess).

About the hashing, I'm not sure if it's possible to take a 10 digit number and come out with a hash result that would be < 10 digits (I guess you'd have to live with a certain amount of collision) but I think this would help ensure the user is who they say they are.

Of course this won't work if the phone number used in step 1 is different than the one they are calling from in step 2.

answered Sep 5, 2008 at 16:42 140k 66 66 gold badges 284 284 silver badges 349 349 bronze badges

Assuming you already know how to detect which key the user hit, this should be doable reasonably easily. In the security world, there is the notion of a "one time" password. This is sometimes referred to as a "disposable password." Normally these are restricted to the (easily typable) ASCII values. So, [a-zA-z0-9] and a bunch of easily typable symbols. like comma, period, semi colon, and parenthesis. In your case, though, you'd probably want to limit the range to [0-9] and possibly include * and #.

I am unable to explain all the technical details of how these one-time codes are generated (or work) adequately. There is some intermediate math behind it, which I'd butcher without first reviewing it myself. Suffice it to say that you use an algorithm to generate a stream of one time passwords. No matter how mnay previous codes you know, the subsequent one should be impossibel to guess! In your case, you'll simply use each password on the list as the user's random code.

Rather than fail at explaining the details of the implementation myself, I'll direct you to a 9 page article where you can read up on it youself: https://www.grc.com/ppp.htm

answered Sep 5, 2008 at 16:50 Rob Rolnick Rob Rolnick 9,579 2 2 gold badges 30 30 silver badges 17 17 bronze badges

It sounds like you have the unspoken requirement that it must be quickly determined, via algorithm, that the code is valid. This would rule out you simply handing out a list of one time pad numbers.

There are several ways people have done this in the past.

Make a public key and private key. Encode the numbers 0-999,999 using the private key, and hand out the results. You'll need to throw in some random numbers to make the result come out to the longer version, and you'll have to convert the result from base 64 to base 10. When you get a number entered, convert it back to base64, apply the private key, and see if the intereting numbers are under 1,000,000 (discard the random numbers).
Use a reversible hash function
Use the first million numbers from a PRN seeded at a specific value. The "checking" function can get the seed, and know that the next million values are good. It can either generate them each time and check one by one when a code is received, or on program startup store them all in a table, sorted, and then use binary search (maximum of compares) since one million integers is not a whole lot of space.

There are a bunch of other options, but these are common and easy to implement.