19.6.5 Checking the syntax of strings

Data entered as strings, especially in forms, must be checked thoroughly so that only valid data is processed further and the user can be asked to correct invalid entries.

19.6.5.1 Checking the syntax of strings

Imagine that a form in a Gambas project asks for the first and last name, the postal code, the e-mail address, the date of birth and a monetary value, and that all of these entries have to be checked. From what you have learned so far, you can already guess that you will hardly be able to avoid regular expressions, for example when checking whether an e-mail address is syntactically correct. The spontaneous idea of doing the check with string functions alone is quickly abandoned in view of the large number of different but syntactically correct e-mail addresses.

You therefore start with the simplest case, the postal code. Nothing could be easier! A German postal code consists of exactly 5 digits, which maps directly onto the following regular expression:

 Pattern = "^([0-9]){5}$"

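A test program using the above Match(Subject As String, Pattern As String) function is quickly written and the test seems successful, until someone points out that not every five-digit number is a valid German postal code: as the refined pattern below encodes, codes beginning with 00, 05, 43 or 62 must be rejected.

Would you have known? The remedy is a reformulated regular expression: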

  "^((?:0[1-46-9]\\d{3})|(?:[1-357-9]\\d{4})|(?:[4][0-24-9]\\d{3})|(?:[6][013-9]\\d{3}))$"

You had certainly already thought of that …
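
The difference between the two patterns can be made visible with a small console test. The following is only a sketch under stated assumptions: it assumes a command-line project with the gb.pcre component enabled, and it calls RegExp.Match directly instead of the chapter's own Match() function. The sample postal codes were chosen purely for illustration.

```gambas
' Gambas module file
' Sketch only: assumes the component gb.pcre is enabled in the project properties.

Public Sub Main()

  ' The simple pattern: exactly five digits
  Dim sSimple As String = "^([0-9]){5}$"
  ' The refined pattern from the text: additionally rejects the prefixes 00, 05, 43 and 62
  Dim sRefined As String = "^((?:0[1-46-9]\\d{3})|(?:[1-357-9]\\d{4})|(?:[4][0-24-9]\\d{3})|(?:[6][013-9]\\d{3}))$"
  Dim sCode As String

  ' Illustrative inputs, not taken from the original project
  For Each sCode In ["39596", "00123", "05111", "43999", "62000", "1234"]
    Print sCode; ": simple = "; RegExp.Match(sCode, sSimple); ", refined = "; RegExp.Match(sCode, sRefined)
  Next

End
```

With these inputs, the simple pattern should accept every five-digit string, whereas the refined pattern should reject 00123, 05111, 43999 and 62000. The four-digit input fails both patterns.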

19.6.5.2 Project: Checking the syntax of strings

In this project you will find suggestions on how to check the syntax of selected strings with clearly defined semantics using regular expressions. Keep in mind that regular expressions are a very effective approach only if the pattern describes the class of strings to be checked precisely and with as few errors as possible. The patterns used in the project have been tested intensively, but this does not rule out inputs that they classify incorrectly or applications for which they are unsuitable.

Figure 19.6.5.2.1: Use of regular expressions (syntax check)

The complete project can be found in the download area, so only an excerpt of the source code is shown here: the syntax check for an ISBN-10, a ten-character International Standard Book Number. Since 2007, only ISBN-13 numbers have been assigned. While the syntax can be checked with a regular expression, the check digit has to be calculated conventionally, because a regular expression cannot do this:

  Public Sub btnPruefungISBNNummer10_Click()
 
    ' Ask for input again if the text box is empty
    If txtISBNNummer10.Text = "" Then
       Message.Warning("Enter an ISBN (10)!")
       txtISBNNummer10.SetFocus
       Return
    Endif
 
    sSubject = txtISBNNummer10.Text
    sPattern = "^ISBN\\s(?=[-0-9xX ]{13}$)(?:[0-9]+[- ]){3}[0-9]*[xX0-9]$" ' ISBN 10
 
    If Match(sSubject, sPattern) Then
       SetLEDColor(pbISBN10, "green")
       bISBN_10 = True
       btnISBN_PZ10.Enabled = True ' Syntax is valid: allow the check digit test
    Else
       SetLEDColor(pbISBN10, "red")
       bISBN_10 = False
       btnISBN_PZ10.Enabled = False ' No check digit test for invalid syntax
    Endif
 
  End
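
To see what the ISBN-10 pattern accepts and rejects, it helps to read it piece by piece and try a few inputs. The following console sketch assumes a command-line project with the gb.pcre component enabled and uses RegExp.Match in place of the project's Match() function. The test strings are illustrative and not part of the original project.

```gambas
' Gambas module file
' Sketch only: assumes the component gb.pcre is enabled in the project properties.

Public Sub Main()

  Dim sPattern As String = "^ISBN\\s(?=[-0-9xX ]{13}$)(?:[0-9]+[- ]){3}[0-9]*[xX0-9]$"
  Dim sInput As String

  ' ^ISBN\s              the literal prefix "ISBN" followed by one whitespace character
  ' (?=[-0-9xX ]{13}$)   lookahead: exactly 13 characters follow (10 digits plus 3 separators)
  ' (?:[0-9]+[- ]){3}    three groups of digits, each closed by a hyphen or a space
  ' [0-9]*[xX0-9]$       optional digits and the final check character (a digit or X)

  For Each sInput In ["ISBN 0-306-40615-2", "ISBN 0-306-40615-X", "ISBN 0306406152", "0-306-40615-2", "ISBN 0-306-4061-2"]
    Print sInput; " -> "; RegExp.Match(sInput, sPattern)
  Next

End
```

Only the first two strings should match. The third has no separators and therefore fewer than 13 characters, the fourth lacks the prefix, and the fifth is one digit short. Note that the second string is accepted although its check character is wrong, which is exactly why the check digit has to be verified separately.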

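The check digit, the last character of an ISBN-10, is calculated with a clearly defined algorithm: the first nine digits are multiplied by their positions 1 to 9, the products are added up, and the sum modulo 11 gives the check digit, where the value 10 is written as X: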

  Public Sub btnISBN_PZ10_Click()
    Dim iSumme, iPruefZiffer, iPruefZifferISBN, iCount As Integer
    Dim sISBN, iPruefZifferStringISBN As String ' Data type String because of check digit 10=X
 
    ' Strip the prefix and the separators so that only the 10 characters remain
    sISBN = Replace(txtISBNNummer10.Text, "ISBN ", "")
    sISBN = Replace(sISBN, " ", "")
    sISBN = Replace(sISBN, "-", "")
    iSumme = 0
    iPruefZifferStringISBN = Right(sISBN, 1)
 
    If Upper(iPruefZifferStringISBN) = "X" Then
       iPruefZifferISBN = 10
    Else
       iPruefZifferISBN = Val(iPruefZifferStringISBN)
    Endif
 
    ' Weighted sum of the first nine digits (weights 1 to 9)
    For iCount = 1 To Len(sISBN) - 1
        iSumme = iSumme + iCount * Val(Mid(sISBN, iCount, 1))
    Next
 
    iPruefZiffer = iSumme Mod 11
 
    If iPruefZiffer <> iPruefZifferISBN Then
       Message.Error("Error!" & Chr(10) & "The syntax of the ISBN (10) is correct." & Chr(10) & "However, the check digit (ISBN_10) is wrong.")
    Else
       Message.Info("The check digit of the ISBN-10 is correct.")
       btnISBN_PZ10.Enabled = False
    Endif
 
  End

Figure 19.6.5.2.2: Annotated calculation of the check digit of an ISBN (10)
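
The calculation can also be separated from the form so that it can be tried out on its own. The following is a minimal sketch, not the project's code: the function name HasValidCheckDigit10 is hypothetical, and the function assumes that its argument has already passed the syntax check and has been stripped of the "ISBN " prefix, spaces and hyphens.

```gambas
' Gambas module file
' Sketch only: HasValidCheckDigit10 is a hypothetical helper, not part of the project.

' Returns True if the last character of a 10-character ISBN is the correct check digit.
Public Function HasValidCheckDigit10(sISBN As String) As Boolean

  Dim iSum, iPos, iExpected As Integer
  Dim sLast As String

  If Len(sISBN) <> 10 Then Return False

  ' Weighted sum of the first nine digits (weights 1 to 9)
  For iPos = 1 To 9
    iSum += iPos * CInt(Mid(sISBN, iPos, 1))
  Next

  ' The check character X stands for the value 10
  sLast = Upper(Right(sISBN, 1))
  If sLast = "X" Then
    iExpected = 10
  Else
    iExpected = CInt(sLast)
  Endif

  Return (iSum Mod 11) = iExpected

End

Public Sub Main()

  ' 0*1 + 3*2 + 0*3 + 6*4 + 4*5 + 0*6 + 6*7 + 1*8 + 5*9 = 145 and 145 Mod 11 = 2
  Print HasValidCheckDigit10("0306406152") ' check digit 2 agrees with the calculation
  Print HasValidCheckDigit10("0306406153") ' check digit 3 does not agree

End
```

Keeping the arithmetic in a function of its own means the same check can be reused wherever an ISBN-10 appears, independently of the text box and the message dialogs.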
