4.21. European VAT Numbers

Updated on June 7, 2022

Problem

You've been given the job of implementing an online order form for a business in the European Union.

European tax law stipulates that when a VAT-registered business (your customer) located in one EU country purchases from a vendor (your company) in another EU country, the vendor must not charge VAT (Value-Added Tax). If the buyer is not VAT-registered, the vendor must charge VAT and remit it to the local tax office. The vendor must use the buyer's VAT registration number as proof to the tax office that no VAT is due. For the vendor, this means it is very important to validate the buyer's VAT number before proceeding with a tax-exempt sale.

Typing mistakes by the customer are the most common cause of invalid VAT numbers. To make the ordering process faster and friendlier, use a regular expression to validate the VAT number immediately, while the customer is filling out your online order form. You can do this with some client-side JavaScript or in the CGI script on your web server that receives the order form. If the number does not match the regular expression, the customer can correct the typo right away.

Solution

To keep the implementation simple, this solution is split into two parts. First we strip out spaces and punctuation. Then we validate what remains.

Strip whitespace and punctuation

Retrieve the VAT number entered by the customer and store it in a variable. Before checking whether the number is valid, replace all matches of this regular expression with an empty replacement text:

[-. ]
Regex options: None
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby

Recipe 3.14 shows how to perform this initial replacement. We've assumed that customers enter no punctuation other than hyphens, dots, and spaces. Any other extraneous characters will be caught by the upcoming check.
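As a minimal sketch of this stripping step in Python (the function name strip_punctuation is a hypothetical helper, not something defined by the recipe), the replacement could look like this:

```python
import re

# Character class from the recipe: a hyphen, a dot, or a space.
PUNCTUATION = re.compile(r"[-. ]")


def strip_punctuation(vat_number: str) -> str:
    """Delete hyphens, dots, and spaces from the customer's input."""
    return PUNCTUATION.sub("", vat_number)


print(strip_punctuation("DE 123.456.789"))  # DE123456789
```

Any other stray character, such as a slash, is deliberately left in place so that the validation step rejects the input.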

Validate the number

With whitespace and punctuation stripped, this regular expression checks whether the VAT number is valid for any of the 27 EU countries:

^(
(AT)?U[0-9]{8} |                              # Austria
(BE)?0[0-9]{9} |                              # Belgium
(BG)?[0-9]{9,10} |                            # Bulgaria
(CY)?[0-9]{8}L |                              # Cyprus
(CZ)?[0-9]{8,10} |                            # Czech Republic
(DE)?[0-9]{9} |                               # Germany
(DK)?[0-9]{8} |                               # Denmark
(EE)?[0-9]{9} |                               # Estonia
(EL|GR)?[0-9]{9} |                            # Greece
(ES)?[0-9A-Z][0-9]{7}[0-9A-Z] |               # Spain
(FI)?[0-9]{8} |                               # Finland
(FR)?[0-9A-Z]{2}[0-9]{9} |                    # France
(GB)?([0-9]{9}([0-9]{3})?|[A-Z]{2}[0-9]{3}) | # United Kingdom
(HU)?[0-9]{8} |                               # Hungary
(IE)?[0-9]S[0-9]{5}L |                        # Ireland
(IT)?[0-9]{11} |                              # Italy
(LT)?([0-9]{9}|[0-9]{12}) |                   # Lithuania
(LU)?[0-9]{8} |                               # Luxembourg
(LV)?[0-9]{11} |                              # Latvia
(MT)?[0-9]{8} |                               # Malta
(NL)?[0-9]{9}B[0-9]{2} |                      # Netherlands
(PL)?[0-9]{10} |                              # Poland
(PT)?[0-9]{9} |                               # Portugal
(RO)?[0-9]{2,10} |                            # Romania
(SE)?[0-9]{12} |                              # Sweden
(SI)?[0-9]{8} |                               # Slovenia
(SK)?[0-9]{10}                                # Slovakia
)$
Regex options: Free-spacing, case insensitive
Regex flavors: .NET, Java, XRegExp, PCRE, Perl, Python, Ruby

The above regular expression uses free-spacing mode to make it easy to edit later. Every now and then, new countries join the European Union, and member countries change their rules for VAT numbers. Unfortunately, JavaScript does not support free-spacing. In JavaScript, you're stuck putting everything on one line. The ↵ symbols below mark line breaks inserted only to fit the page; they are not part of the regex:

^((AT)?U[0-9]{8}|(BE)?0[0-9]{9}|(BG)?[0-9]{9,10}|(CY)?[0-9]{8}L|↵
(CZ)?[0-9]{8,10}|(DE)?[0-9]{9}|(DK)?[0-9]{8}|(EE)?[0-9]{9}|↵
(EL|GR)?[0-9]{9}|(ES)?[0-9A-Z][0-9]{7}[0-9A-Z]|(FI)?[0-9]{8}|↵
(FR)?[0-9A-Z]{2}[0-9]{9}|(GB)?([0-9]{9}([0-9]{3})?|[A-Z]{2}[0-9]{3})|↵
(HU)?[0-9]{8}|(IE)?[0-9]S[0-9]{5}L|(IT)?[0-9]{11}|↵
(LT)?([0-9]{9}|[0-9]{12})|(LU)?[0-9]{8}|(LV)?[0-9]{11}|(MT)?[0-9]{8}|↵
(NL)?[0-9]{9}B[0-9]{2}|(PL)?[0-9]{10}|(PT)?[0-9]{9}|(RO)?[0-9]{2,10}|↵
(SE)?[0-9]{12}|(SI)?[0-9]{8}|(SK)?[0-9]{10})$
Regex options: Case insensitive
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby

Follow Recipe 3.6 to add this regular expression to your order form.
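As one possible illustration, here is a sketch in Python that combines both steps of the solution. The function name is_valid_vat is an assumption made for this example; the pattern itself is the free-spacing regex shown above, compiled with Python's re.VERBOSE and re.IGNORECASE flags.

```python
import re

# Free-spacing regex from the recipe. re.VERBOSE makes Python ignore
# the whitespace and the "#" comments inside the pattern.
VAT_REGEX = re.compile(r"""
^(
(AT)?U[0-9]{8} |                              # Austria
(BE)?0[0-9]{9} |                              # Belgium
(BG)?[0-9]{9,10} |                            # Bulgaria
(CY)?[0-9]{8}L |                              # Cyprus
(CZ)?[0-9]{8,10} |                            # Czech Republic
(DE)?[0-9]{9} |                               # Germany
(DK)?[0-9]{8} |                               # Denmark
(EE)?[0-9]{9} |                               # Estonia
(EL|GR)?[0-9]{9} |                            # Greece
(ES)?[0-9A-Z][0-9]{7}[0-9A-Z] |               # Spain
(FI)?[0-9]{8} |                               # Finland
(FR)?[0-9A-Z]{2}[0-9]{9} |                    # France
(GB)?([0-9]{9}([0-9]{3})?|[A-Z]{2}[0-9]{3}) | # United Kingdom
(HU)?[0-9]{8} |                               # Hungary
(IE)?[0-9]S[0-9]{5}L |                        # Ireland
(IT)?[0-9]{11} |                              # Italy
(LT)?([0-9]{9}|[0-9]{12}) |                   # Lithuania
(LU)?[0-9]{8} |                               # Luxembourg
(LV)?[0-9]{11} |                              # Latvia
(MT)?[0-9]{8} |                               # Malta
(NL)?[0-9]{9}B[0-9]{2} |                      # Netherlands
(PL)?[0-9]{10} |                              # Poland
(PT)?[0-9]{9} |                               # Portugal
(RO)?[0-9]{2,10} |                            # Romania
(SE)?[0-9]{12} |                              # Sweden
(SI)?[0-9]{8} |                               # Slovenia
(SK)?[0-9]{10}                                # Slovakia
)$
""", re.VERBOSE | re.IGNORECASE)


def is_valid_vat(raw: str) -> bool:
    """Strip hyphens, dots, and spaces, then validate what remains."""
    stripped = re.sub(r"[-. ]", "", raw)
    return VAT_REGEX.match(stripped) is not None


print(is_valid_vat("DE 123.456.789"))  # True
print(is_valid_vat("atu12345678"))     # True
print(is_valid_vat("DE 123.456.78"))   # False: one digit short
```

Keeping the pattern in free-spacing form means a country can be added or changed by editing a single line.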

Discussion

Strip whitespace and punctuation

To make VAT numbers easier for humans to read, people often type them in with extra punctuation that splits the digits into groups. For instance, a German customer might enter his VAT number DE123456789 as DE 123.456.789.

Writing a single regular expression that matches VAT numbers from 27 countries in every possible notation is an impossible job. Since the punctuation is only there for readability, it is much easier to strip all the punctuation first and then validate the resulting bare VAT number.

The regular expression [-. ] matches a single character that is a hyphen, a dot, or a space. Replacing all matches of this regular expression with nothing deletes the punctuation characters commonly used in VAT numbers.

TIP

VAT numbers consist only of letters and digits. Instead of using [-. ] to remove only common punctuation, you could use [^A-Z0-9] to strip out all invalid characters. Apply the case insensitive option with this class if lowercase letters should be kept.
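A small sketch of this alternative in Python follows. The helper name strip_invalid is hypothetical, and the use of re.IGNORECASE is an assumption made here so that lowercase letters are kept rather than deleted.

```python
import re

# Negated class from the tip: anything that is not a letter or digit.
# re.IGNORECASE keeps lowercase letters from being treated as invalid.
INVALID_CHARS = re.compile(r"[^A-Z0-9]", re.IGNORECASE)


def strip_invalid(vat_number: str) -> str:
    """Remove every character that cannot appear in a VAT number."""
    return INVALID_CHARS.sub("", vat_number)


print(strip_invalid("de 123/456_789"))  # de123456789
```

The trade-off is that typos such as a stray slash are silently discarded instead of being reported back to the customer.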

Validate the number

The two regular expressions for validating the number are identical. The only difference is that the first one uses free-spacing syntax to make the regular expression more readable and to label the countries. JavaScript does not support free-spacing unless you use the XRegExp library. The other flavors give you the choice.
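To illustrate that the two forms are the same regex, the following Python sketch derives a single-line pattern from a free-spacing one and checks that both accept and reject the same inputs. It uses a three-country subset to stay short. The helper to_single_line is hypothetical and assumes the pattern contains no literal "#" or whitespace, which holds for the VAT regex in this recipe.

```python
import re

# Three-country subset of the recipe's free-spacing regex.
FREE_SPACING = r"""
^(
(AT)?U[0-9]{8} |          # Austria
(DE)?[0-9]{9} |           # Germany
(NL)?[0-9]{9}B[0-9]{2}    # Netherlands
)$
"""


def to_single_line(pattern: str) -> str:
    """Drop '#' comments and all whitespace from a free-spacing pattern."""
    parts = []
    for line in pattern.splitlines():
        code = line.split("#", 1)[0]         # discard the trailing comment
        parts.append("".join(code.split()))  # discard all whitespace
    return "".join(parts)


SINGLE_LINE = to_single_line(FREE_SPACING)
print(SINGLE_LINE)
# ^((AT)?U[0-9]{8}|(DE)?[0-9]{9}|(NL)?[0-9]{9}B[0-9]{2})$

verbose = re.compile(FREE_SPACING, re.VERBOSE | re.IGNORECASE)
compact = re.compile(SINGLE_LINE, re.IGNORECASE)
for candidate in ["DE123456789", "atu12345678", "NL123456789B0", ""]:
    # Both forms must agree on every input.
    assert bool(verbose.match(candidate)) == bool(compact.match(candidate))
```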

The regex uses alternation to accommodate the VAT numbers of all 27 EU countries. The essential formats are shown in Table 4-3.

 

Table 4-3. EU VAT number formats

Country          VAT number format
Austria          U99999999
Belgium          0999999999
Bulgaria         999999999 or 9999999999
Cyprus           99999999L
Czech Republic   99999999, 999999999, or 9999999999
Germany          999999999
Denmark          99999999
Estonia          999999999
Greece           999999999
Spain            X9999999X
Finland          99999999
France           XX999999999
United Kingdom   999999999, 999999999999, or XX999
Hungary          99999999
Ireland          9S99999L
Italy            99999999999
Lithuania        999999999 or 999999999999
Luxembourg       99999999
Latvia           99999999999
Malta            99999999
Netherlands      999999999B99
Poland           9999999999
Portugal         999999999
Romania          99, 999, 9999, 99999, 999999, 9999999, 99999999, 999999999, or 9999999999
Sweden           999999999999
Slovenia         99999999
Slovakia         9999999999

Strictly speaking, the two-letter country code is part of the VAT number. However, people often omit it, since the billing address already indicates the country. The regular expression accepts VAT numbers with and without the country code. If you want the country code to be mandatory, remove the question mark that follows each country code group in the regular expression. If you do, mention that you require the country code in the error message that tells the user the VAT number is invalid.
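A hedged sketch of a mandatory-country-code variant in Python, limited to three countries for brevity, is shown here. Note that only the question marks that follow the country code groups are removed. The one after ([0-9]{3}) in the United Kingdom alternative belongs to the number format and has to stay.

```python
import re

# Three-country subset with the country code made mandatory.
REQUIRED_CODE = re.compile(
    r"^((DE)[0-9]{9}"
    r"|(GB)([0-9]{9}([0-9]{3})?|[A-Z]{2}[0-9]{3})"
    r"|(NL)[0-9]{9}B[0-9]{2})$",
    re.IGNORECASE,
)

print(bool(REQUIRED_CODE.match("DE123456789")))     # True
print(bool(REQUIRED_CODE.match("123456789")))       # False: no country code
print(bool(REQUIRED_CODE.match("GB123456789012")))  # True: 12-digit UK form
```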

If you accept orders only from certain countries, you can leave out the countries that don't appear in the country selection on your order form. When you delete an alternative, make sure to also delete the | operator that separates it from the next or previous alternative. If you don't, you end up with || in your regular expression. The || inserts an alternative that matches the empty string, which means your order form will accept a missing VAT number as a valid VAT number.
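The effect of a leftover || can be demonstrated with a short Python sketch. Both patterns are two-country subsets made up for this example; the second one simulates an alternative that was deleted without its | operator.

```python
import re

# Correct subset: Austria and Germany only.
GOOD = re.compile(r"^((AT)?U[0-9]{8}|(DE)?[0-9]{9})$", re.IGNORECASE)

# Faulty subset: an alternative was deleted but its "|" was left behind.
BAD = re.compile(r"^((AT)?U[0-9]{8}||(DE)?[0-9]{9})$", re.IGNORECASE)

print(GOOD.match("") is not None)  # False: an empty field is rejected
print(BAD.match("") is not None)   # True: an empty field is accepted
```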

The 27 alternatives are grouped together. The group is placed between a caret and a dollar sign, which anchor the regular expression to the beginning and end of the string you're validating. The entire input must validate as a VAT number.

If you're searching for VAT numbers in a larger body of text, replace the anchors with \b word boundaries.
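A sketch of such a search in Python, using a three-country subset and a hypothetical helper named find_vat_numbers, might look like this:

```python
import re
from typing import List

# Subset of the recipe's regex with \b in place of the ^ and $ anchors.
VAT_IN_TEXT = re.compile(
    r"\b((AT)?U[0-9]{8}|(DE)?[0-9]{9}|(NL)?[0-9]{9}B[0-9]{2})\b",
    re.IGNORECASE,
)


def find_vat_numbers(text: str) -> List[str]:
    """Return every substring that looks like a VAT number."""
    return [match.group(0) for match in VAT_IN_TEXT.finditer(text)]


sample = "Invoices: DE123456789 and NL123456789B01, ref 42."
print(find_vat_numbers(sample))  # ['DE123456789', 'NL123456789B01']
```

With the full 27-country regex, expect many false positives in running text: the Romanian alternative, for example, matches any standalone number of 2 to 10 digits, including the "42" above.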

Variations

The benefit of using one regular expression to check all 27 countries is that you only need to add one regex validation to your order form. You could enhance your order form by using 27 separate regular expressions instead. First, check the country that the customer specified in the billing address. Then look up the appropriate regular expression for that country in Table 4-4.

Table 4-4. EU VAT number regular expressions

Country          VAT number regular expression
Austria          ^(AT)?U[0-9]{8}$
Belgium          ^(BE)?0[0-9]{9}$
Bulgaria         ^(BG)?[0-9]{9,10}$
Cyprus           ^(CY)?[0-9]{8}L$
Czech Republic   ^(CZ)?[0-9]{8,10}$
Germany          ^(DE)?[0-9]{9}$
Denmark          ^(DK)?[0-9]{8}$
Estonia          ^(EE)?[0-9]{9}$
Greece           ^(EL|GR)?[0-9]{9}$
Spain            ^(ES)?[0-9A-Z][0-9]{7}[0-9A-Z]$
Finland          ^(FI)?[0-9]{8}$
France           ^(FR)?[0-9A-Z]{2}[0-9]{9}$
United Kingdom   ^(GB)?([0-9]{9}([0-9]{3})?|[A-Z]{2}[0-9]{3})$
Hungary          ^(HU)?[0-9]{8}$
Ireland          ^(IE)?[0-9]S[0-9]{5}L$
Italy            ^(IT)?[0-9]{11}$
Lithuania        ^(LT)?([0-9]{9}|[0-9]{12})$
Luxembourg       ^(LU)?[0-9]{8}$
Latvia           ^(LV)?[0-9]{11}$
Malta            ^(MT)?[0-9]{8}$
Netherlands      ^(NL)?[0-9]{9}B[0-9]{2}$
Poland           ^(PL)?[0-9]{10}$
Portugal         ^(PT)?[0-9]{9}$
Romania          ^(RO)?[0-9]{2,10}$
Sweden           ^(SE)?[0-9]{12}$
Slovenia         ^(SI)?[0-9]{8}$
Slovakia         ^(SK)?[0-9]{10}$

Implement Recipe 3.6 to validate the VAT number against the selected regular expression. That will tell you whether the number is valid for the country the customer claims to reside in.

The main benefit of using separate regular expressions is that you can force the VAT number to start with the correct country code without asking the customer to type it in. When the regular expression matches the provided number, check the contents of the first capturing group. Recipe 3.9 explains how to do this. If the first capturing group is empty, the customer did not type the country code at the start of the VAT number. You can then add the country code before storing the validated number in your order database.
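The lookup and the capturing-group check described above can be sketched in Python as follows. The dictionary COUNTRY_PATTERNS, the function normalize_vat, and the decision to store numbers in uppercase are all assumptions made for this example; only three entries from Table 4-4 are included.

```python
import re
from typing import Optional

# A few per-country regexes from Table 4-4, keyed by country code.
COUNTRY_PATTERNS = {
    "AT": r"^(AT)?U[0-9]{8}$",
    "DE": r"^(DE)?[0-9]{9}$",
    "NL": r"^(NL)?[0-9]{9}B[0-9]{2}$",
}


def normalize_vat(country: str, vat_number: str) -> Optional[str]:
    """Validate against the country's regex and ensure the country code.

    Returns the number with its country code, or None if it is invalid
    or the country is not in the table.
    """
    pattern = COUNTRY_PATTERNS.get(country)
    if pattern is None:
        return None
    match = re.match(pattern, vat_number, re.IGNORECASE)
    if match is None:
        return None
    # In Python, an optional group that did not take part in the match
    # is reported as None rather than as an empty string.
    if match.group(1) is None:
        return country + vat_number.upper()
    return vat_number.upper()


print(normalize_vat("DE", "123456789"))    # DE123456789
print(normalize_vat("DE", "de123456789"))  # DE123456789
print(normalize_vat("NL", "123456789"))    # None
```

For Greece, a real table would also need to decide which of the two accepted prefixes to prepend.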

Two country codes are permitted for Greek VAT numbers. GR is the ISO country code for Greece, but EL has long been the standard for Greek VAT numbers.

See Also

The regular expression merely checks whether the number looks like a valid VAT number. This is enough to weed out honest mistakes. A regular expression obviously cannot check whether the VAT number is assigned to the business placing the order. The European Union provides a web page at http://ec.europa.eu/taxation_customs/vies/vieshome.do where you can check which business a particular VAT number belongs to, if any.

The techniques used in the regular expressions in this recipe are discussed in Chapter 2. Recipe 2.3 explains character classes. Recipe 2.5 explains anchors. Recipe 2.8 explains alternation. Recipe 2.9 explains grouping. Recipe 2.12 explains repetition.

 
