22nd February 2021

Mysteries of the NHI number

At Rapid Rēhita, we try hard to make sure our enrolment form catches mistakes new patients make.

One common error in paper forms is poor transcription: copying things down wrongly. This is especially likely when writing something which isn't a word but a code or number, as anyone who's had to ask a second time for a phone number knows.

Luckily, the designers behind the NHI number format knew this was likely to be a problem with an arbitrary designator copied and recopied in many different places and circumstances. So they incorporated into NHI numbers a feature which people in New Zealand general practices -- even those who work with NHI numbers every day -- might not know about: a checksum.

The last digit of an NHI number is actually generated from all the earlier characters using a defined algorithm. If someone swaps two characters around or enters an incorrect one, we can catch it by following the same algorithm ourselves. If this gives a different digit from the one provided, something was entered incorrectly.

We do this in our form -- if you move forward to the NHI number entry on this demonstration form you'll see that an NHI number which is correct, like ZAC5361, will be accepted, while ZAC3561 will result in a message asking you to check that you've entered the number correctly.

How does this work? Here's the algorithm, which we reimplement for form validation; it's actually surprisingly simple. We'll follow along with the values for ZAC5361 at each step:

Check that the NHI number is in the correct format: three letters followed by four digits. ZAC5361 passes this test.
Convert each of the first six characters to a numerical value. A letter takes its place in the restricted alphabet used in NHI numbers (ABCDEFGHJKLMNPQRSTUVWXYZ), so that A is equal to one, B to two, and so on up to Z, which is 24. A digit has a value equal to itself, so that 1 is, unsurprisingly, equal to one. (I and O aren't used in NHI numbers because of the possibility of confusion with 1 and 0.) For ZAC5361, this gives us:
```
24, 1, 3, 5, 3, 6
```
Multiply each value by a weight that counts down from the left of the NHI number: the first by seven, the second by six, and so on, down to two for the sixth:
```
  24 * 7, 1 * 6, 3 * 5, 5 * 4, 3 * 3, 6 * 2
```
Add all these values together:
```
  168 + 6 + 15 + 20 + 9 + 12 = 230
```
Take this sum modulo 11; that's a fancy way of saying divide by 11 and keep what you might have called the remainder in school. If the remainder is zero, the number is invalid:
```
  230 % 11 = 10
```
Otherwise, subtract the remainder from 11. If the result is ten, use zero to represent it (so that all NHI numbers are the same length):
```
  11 - 10 = 1
```
Now compare this number, the calculated checksum, to the provided checksum -- the last digit of the NHI number. If they're the same, the number is a valid NHI number. If not, there was probably a transcription error! This really sensible system must catch a lot of errors.
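The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than our production code: the names `char_value`, `expected_check_digit` and `is_valid_nhi` are made up for this example, and it assumes the input is already upper case with no spaces.

```python
import re
from typing import Optional

# Restricted NHI alphabet: no I or O, so A=1, B=2, ... Z=24.
NHI_ALPHABET = "ABCDEFGHJKLMNPQRSTUVWXYZ"
# Old-format shape: three letters from the alphabet above, then four digits.
NHI_PATTERN = re.compile(r"[A-HJ-NP-Z]{3}[0-9]{4}")


def char_value(ch: str) -> int:
    """Return a digit's own value, or a letter's 1-based alphabet position."""
    if ch in "0123456789":
        return int(ch)
    return NHI_ALPHABET.index(ch) + 1


def expected_check_digit(first_six: str) -> Optional[int]:
    """Return the checksum for six leading characters, or None if invalid."""
    weights = range(7, 1, -1)  # 7, 6, 5, 4, 3, 2
    total = sum(char_value(ch) * w for ch, w in zip(first_six, weights))
    remainder = total % 11
    if remainder == 0:
        return None  # a remainder of zero means the number is invalid
    return (11 - remainder) % 10  # a result of ten is written as zero


def is_valid_nhi(nhi: str) -> bool:
    """Check the format, then compare calculated and provided checksums."""
    if not NHI_PATTERN.fullmatch(nhi):
        return False
    check = expected_check_digit(nhi[:6])
    return check is not None and check == int(nhi[6])


print(is_valid_nhi("ZAC5361"))  # True
print(is_valid_nhi("ZAC3561"))  # False
```

Swapping the 5 and the 3, as in ZAC3561, changes the weighted sum from 230 to 228, so the calculated checksum becomes 3 rather than 1 and the number is rejected.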

An unexpected problem

You might have heard, though, that New Zealand will be moving to a new NHI format soon, because we're running out of numbers. (Don't worry! The new format has a checksum too, and our systems are all ready to validate NHI numbers in the new format as well as the old.)

But did you ever stop to wonder why? You'd think seven characters drawn from a set of 36 (26 letters and 10 digits) could give you 78,364,164,096 (36⁷) possibilities. That seems more than enough.

But it's not that simple. We actually only have 24 letters, and only six of the seven characters in an NHI number are free choices, since the last is the checksum. An NHI number also has to be in the format AAANNNX, where A is a letter, N a digit, and X the checksum. In other words, we could more precisely formulate the possibilities as one of 13,824 (24³) different combinations of letters followed by one of 1,000 (10³) different combinations of digits: 13,824,000 in total, or only around fourteen million possible numbers.
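The arithmetic is easy to check for yourself. Here is a short Python sketch; the variable names are just for illustration.

```python
# Seven positions, each any of 26 letters or 10 digits.
naive = 36 ** 7

# Old-format NHI: three letters from a 24-letter alphabet, then three free
# digits. The seventh character is the checksum, so it adds no new choices.
letter_combinations = 24 ** 3
digit_combinations = 10 ** 3
realistic = letter_combinations * digit_combinations

print(f"{naive:,}")                # 78,364,164,096
print(f"{letter_combinations:,}")  # 13,824
print(f"{realistic:,}")            # 13,824,000
```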

The problem is that having a checksum brings this number down even further. Remember that any NHI number whose calculated sum is divisible by 11 is invalid. That rules out about one in every eleven and brings us down to 12,567,273 possible NHI numbers. Only it's a bit worse than that. A system like Rapid Rēhita's relies on rigorous automated tests to check that it always gets the right result with a variety of valid and invalid numbers. It's a bit difficult to get real NHI numbers to do this with, so any NHI number starting with Z is reserved for testing. Once you exclude those, there are only 12,043,636 valid NHI numbers left to assign.
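If you'd like to count these for yourself, you don't need to loop over all fourteen million candidates. The Python sketch below instead tracks how many partial numbers land on each remainder modulo 11 as it works through the six positions. It counts under the rules exactly as described above, where only a sum divisible by 11 rules a number out, and it assumes that "reserved for testing" covers every number whose first letter is Z. The function name `count_valid_prefixes` is made up for this example.

```python
from collections import Counter

NHI_ALPHABET = "ABCDEFGHJKLMNPQRSTUVWXYZ"  # A=1 ... Z=24, no I or O


def count_valid_prefixes(first_letters: str = NHI_ALPHABET) -> int:
    """Count six-character prefixes whose weighted sum is not divisible by 11.

    Each such prefix has exactly one matching checksum digit, so this is
    also the number of valid NHI numbers under the rules described above.
    """
    letter_values = range(1, len(NHI_ALPHABET) + 1)
    first_values = [NHI_ALPHABET.index(ch) + 1 for ch in first_letters]
    digit_values = range(10)
    # Values allowed at each position, paired with weights 7 down to 2.
    positions = [
        (first_values, 7), (letter_values, 6), (letter_values, 5),
        (digit_values, 4), (digit_values, 3), (digit_values, 2),
    ]
    # remainders[r] = number of partial prefixes whose sum so far is r mod 11.
    remainders = Counter({0: 1})
    for values, weight in positions:
        updated = Counter()
        for r, n in remainders.items():
            for v in values:
                updated[(r + v * weight) % 11] += n
        remainders = updated
    return sum(n for r, n in remainders.items() if r != 0)


all_valid = count_valid_prefixes()
# Assumption: every number whose first letter is Z is set aside for testing.
assignable = count_valid_prefixes(NHI_ALPHABET.replace("Z", ""))
print(f"{all_valid:,} valid, {assignable:,} of them not starting with Z")
```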

It makes a lot more sense that we might run out of NHI numbers by 2025 when you see how few there are available once you've limited the possibilities by taking into account the restricted format and the need to calculate a checksum. So why did the original implementation of NHI numbers limit itself so?

Because forms are hard. People struggle to fill them out accurately, and they will make mistakes. It's much better to have checkable inputs where those mistakes can be corrected than to rely on levels of accuracy which people have never been able to achieve. And so the designers of the NHI system deliberately limited themselves to a quickly recognisable format which could be used to check for errors rather than having an essentially unlimited number of possibilities, because an unlimited number of possibilities also means that you've got no way to spot inevitable mistakes.

At Rapid Rēhita, we take this into account. Our enrolment forms are full of validation measures like those for NHIs. It's more work for us, but it leads to dramatically fewer errors in completed forms, saving practices time and hassle.
