How to Create a Domain Name Validator in Python

The problem

Create a domain name validator mostly compliant with RFC 1035, RFC 1123, and RFC 2181.

The following rules apply:

  • A domain name may contain subdomains (levels), hierarchically separated by the . (period) character
  • A domain name must not contain more than 127 levels, including the top level (TLD)
  • A domain name must not be longer than 253 characters (the RFC specifies 255, but 2 characters are reserved for the trailing dot and the null character for the root level)
  • Level names must be composed of lowercase and uppercase ASCII letters, digits and the - (minus sign) character
  • Level names must not start or end with the - (minus sign) character
  • Level names must not be longer than 63 characters
  • The top level (TLD) must not be fully numerical
  • A domain name must contain at least one subdomain (level) apart from the TLD
  • Top-level validation must be naive, i.e. TLDs that do not exist in the IANA registry are still considered valid as long as they adhere to the rules given above

The validation function accepts a string with the full domain name and returns a boolean value indicating whether the domain name is valid.
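The rules above can be sketched as a checker that reports which rule a name breaks, which can help when debugging a validator. This is a minimal illustration, not one of the solutions in this article: the names `ALLOWED` and `rule_violations` and the sample inputs are assumptions made for the example.

```python
import string

# Characters permitted inside a level name (ASCII letters, digits, minus sign).
ALLOWED = set(string.ascii_letters + string.digits + "-")

def rule_violations(domain):
    """Return a list of human-readable rule violations (empty if valid)."""
    problems = []
    if len(domain) > 253:
        problems.append("longer than 253 characters")
    levels = domain.split(".")
    if len(levels) < 2:
        problems.append("needs at least one level besides the TLD")
    if len(levels) > 127:
        problems.append("more than 127 levels")
    for level in levels:
        if not 1 <= len(level) <= 63:
            problems.append(f"level {level!r} must be 1 to 63 characters long")
        elif not set(level) <= ALLOWED:
            problems.append(f"level {level!r} contains a disallowed character")
        elif level.startswith("-") or level.endswith("-"):
            problems.append(f"level {level!r} starts or ends with '-'")
    if levels[-1].isdigit():
        problems.append("TLD is fully numerical")
    return problems

# Hypothetical sample inputs, chosen only to exercise the rules.
for name in ("example.com", "localhost", "example.123", "-bad.example"):
    print(name, rule_violations(name))
```

A name is valid exactly when the returned list is empty, so `not rule_violations(name)` gives the boolean the task asks for.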


validate('aoms') == False
validate('') == True
validate('') == True
validate('AMAZON.COM') == True
validate('') == True
validate('') == False
validate('') == False
validate('[email protected]') == False
validate('') == False

The solution in Python

Option 1:

import re

def validate(domain):
    # bool() turns the match object (or None) into the required boolean.
    return bool(re.match(r'''
        (?=^.{,253}$)          # max. length 253 chars
        (?!^.+\.\d+$)          # TLD is not fully numerical
        (?=^[^-.].+[^-.]$)     # doesn't start/end with '-' or '.'
        (?!^.+(\.-|-\.).+$)    # levels don't start/end with '-'
        (?:[a-z\d-]            # uses only allowed chars
        {1,63}(\.|$))          # max. level length 63 chars
        {2,127}                # min. 2, max. 127 levels
        $                      # nothing may follow the last level
        ''', domain, re.X | re.I))

Option 2:

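def validate(domain):
    # Overall length: 1 to 253 characters.
    if len(domain) > 253 or len(domain) == 0:
        return False
    els = domain.split('.')
    # At least one level besides the TLD, at most 127 levels in total.
    if len(els) > 127 or len(els) < 2:
        return False
    for x in els:
        # Each level is 1 to 63 characters long.
        if len(x) > 63 or len(x) == 0:
            return False

        # A level must not start or end with '-'.
        if not x[0].isalnum() or not x[-1].isalnum():
            return False

        # Only ASCII letters, digits and '-' are allowed.
        for ch in x:
            if ch != '-' and not (ord(ch) < 128 and ch.isalnum()):
                return False

    # The TLD must not be fully numerical.
    if els[-1].isnumeric():
        return False
    return True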
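The loop above pairs `isalnum()` with an explicit code-point check (`ord(...) < 128`) because Python's string predicates are Unicode-aware and would otherwise accept non-ASCII letters and digits. A small sketch of the difference follows; the helper name `is_ascii_alnum` and the sample characters are illustrative assumptions, not part of the original solution.

```python
def is_ascii_alnum(ch):
    """True only for ASCII letters and digits."""
    return ord(ch) < 128 and ch.isalnum()

# '\u00e9' (e with acute accent) and '\u0663' (Arabic-Indic digit three)
# both count as alphanumeric for str.isalnum(), yet neither is allowed
# in a level name under the rules above.
for ch in ('a', 'Z', '7', '\u00e9', '\u0663', '-'):
    print(repr(ch), ch.isalnum(), is_ascii_alnum(ch))
```

The same caveat applies to `isdigit()` and `isnumeric()`, which is why the character check has to run before the numeric TLD test can be trusted.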

Option 3:

import re

def validLevel(lvl):
    # A level must not start or end with '-' and may only contain
    # ASCII letters, digits and '-', 1 to 63 characters in total.
    return not bool(re.search(r'^-|-$', lvl)) and bool(re.match(r'[a-zA-Z0-9-]{1,63}$', lvl))

def validate(domain):
    lst = domain.split('.')
    return (len(domain) <= 253
            and 2 <= len(lst) <= 127
            and not lst[-1].isdigit()
            and all(validLevel(lvl) for lvl in lst))

Test cases to validate our solution

test.describe('Domain name validator tests')
test.expect(not validate('aoms'))
test.expect(validate(''))
test.expect(validate(''))
test.expect(validate('AMAZON.COM'))
test.expect(validate(''))
test.expect(not validate(''))
test.expect(not validate(''))
test.expect(not validate('[email protected]'))
test.expect(not validate(''))
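Outside a dedicated test framework, the length and level limits can also be exercised with plain assertions. The sketch below uses constructed inputs (names such as `example.com` are assumptions, not the original test data) and includes a compact `validate`, written along the lines of the split-and-check approach, only so that the block runs on its own. `LEVEL_RE` is a hypothetical helper name.

```python
import re

# One level: ASCII letters, digits and '-', 1 to 63 characters.
LEVEL_RE = re.compile(r'[a-zA-Z0-9-]{1,63}')

def validate(domain):
    levels = domain.split('.')
    return (len(domain) <= 253
            and 2 <= len(levels) <= 127
            and not levels[-1].isdigit()
            and all(LEVEL_RE.fullmatch(lvl) and lvl[0] != '-' and lvl[-1] != '-'
                    for lvl in levels))

# Level length: 63 characters is the maximum.
assert validate('a' * 63 + '.com')
assert not validate('a' * 64 + '.com')

# Level count: 127 one-character levels are exactly 253 characters long.
assert validate('.'.join(['a'] * 127))
assert not validate('.'.join(['a'] * 128))

# Total length: 253 characters pass, 254 do not.
assert validate('.'.join(['a' * 63] * 3 + ['a' * 61]))
assert not validate('.'.join(['a' * 63] * 3 + ['a' * 62]))

# TLD and character rules.
assert validate('123.com')
assert not validate('example.123')
assert not validate('-example.com')
assert not validate('example-.com')
assert not validate('exa_mple.com')
assert not validate('example..com')
assert not validate('example.com.')
print('all boundary checks passed')
```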

