19.6.6 Search and replace strings in a text

The project presented here combines regular expressions with string manipulation functions to search for and replace strings in a given text. You can search for predefined classes of strings that have a special meaning, or enter your own patterns. Patterns are predefined for the following classes:

  • MAC address with a colon as separator
  • Time, e.g. 11:45 p.m.
  • Decimal number with a comma as decimal separator
  • Text in the form “Text”
  • Date
  • IP address
  • URL in the form http://www.werist.da or www.werist.da, or a combination of both
  • Postcode for Germany
  • Monetary value with the currency symbol €
  • ISBN with hyphens according to ISO 2108
  • Telephone number in the form (area code) number
  • E-mail address
  • Colour value in the form &HC3DDFF or &C3DDFF



Figure 19.6.6.1: Searching and replacing strings in a text

It turns out that, for complex texts, the pattern for each class of strings must contain at least one feature that distinguishes it from all other classes. Otherwise you could not, for example, distinguish the monetary value 12,44 € from the plain decimal number 12,44. At present, ISBNs can only be found in the ISBN-10 form, formatted with hyphens according to ISO 2108.
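The need for a distinguishing feature can be sketched as follows. The two patterns and the helper Matches() are hypothetical and not taken from the project; they assume the comma as decimal separator, as in the list of string classes above. The monetary pattern requires the trailing € sign, while the decimal pattern uses a negative lookahead to reject it:

```gambas
' Gambas module file (requires the component gb.pcre)

Public Sub Main()

  ' Hypothetical patterns, for illustration only
  Dim sMoney As String = "\\d+,\\d{2} ?€"        ' decimal number followed by €
  Dim sDecimal As String = "\\d+,\\d+(?!\\d| ?€)" ' decimal number NOT followed by €
  Dim sText As String

  For Each sText In ["12,44 €", "12,44"]
    Print sText; " | money: "; Matches(sText, sMoney); " | decimal: "; Matches(sText, sDecimal)
  Next

End

' Returns True if sPattern matches somewhere in sText
Private Function Matches(sText As String, sPattern As String) As Boolean

  Dim hRegexp As New Regexp(sText, sPattern)

  Return hRegexp.Offset <> -1

End
```

Without the lookahead, the decimal pattern would also match the digits of the monetary value, so both classes would claim the same string. The alternative \d inside the lookahead prevents the engine from backtracking to a shortened number such as 12,4.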

The source code is given almost in its entirety, mainly to show you the interaction of regular expressions and methods for manipulating strings:

  ' Gambas class file
 
  Private $rExpression As Regexp
  Private $sPattern As String
 
  Public Sub Form_Open()
    $rExpression = New Regexp
    ...
  End
 
  Public Sub btnSearch_Click()
 
    If Not cbxRegexpression.Text Then
      Message.Error("There is NO search text!")
      cbxRegexpression.SetFocus
      Return
    Endif
 
    Try $rExpression.Compile(SetPattern(cbxRegexpression.Text)) ' Catch errors in the pattern
    If Error Then
       Message.Error("ERROR IN THE REGULAR EXPRESSION!")
       Return
    Endif
 
    lbxSearch.List = Search(txaText.Text)
 
  End
 
  Public Sub btnReplace_Click()
 
    If Not cbxRegexpression.Text Then
      Message.Error("There is NO search text!")
      cbxRegexpression.SetFocus()
      Return
    Endif
 
    ' An empty replacement text deletes every match, so ask for confirmation first
    If Not txtReplace.Text Then
       If Message.Error("NO replacement text!", "Replace with NOTHING!", "Cancel!") <> 1 Then
          cbxRegexpression.SetFocus()
          Return
       Endif
    Endif
 
    Try $rExpression.Compile(SetPattern(cbxRegexpression.Text)) ' Catch errors in the pattern
    If Error Then
       Message.Error("ERROR IN THE REGULAR EXPRESSION!")
       Return
    Endif
 
    ' Replace in both cases: with the given text or, after confirmation, with nothing
    txaText.Text = TextReplace(txaText.Text, txtReplace.Text)
 
  End
 
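  Private Function Search(sText As String) As String[]
    Dim aSearchList As New String[]
    Dim iStart As Integer = 1 ' 1-based position in sText where the next search starts
 
    Try $rExpression.Exec(sText)  ' Catch errors during execution
    If Error Then
       Message.Error("An error occurred ...!")
       Return aSearchList ' empty list
    Endif
 
    ' Offset is relative to the substring passed to Exec(); -1 means no further match
    While $rExpression.Offset <> -1
      If Not $rExpression.Text Then Break ' an empty match would never advance iStart
      aSearchList.Add($rExpression.Text)
      iStart += $rExpression.Offset + Len($rExpression.Text)
      If iStart > Len(sText) Then Break
      $rExpression.Exec(Mid$(sText, iStart))
    Wend
 
    Return aSearchList
 
  End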
 
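  Private Function TextReplace(sText As String, sReplace As String) As String
    Dim iStart As Integer = 1 ' 1-based position in sText where the next search starts
 
    Try $rExpression.Exec(sText) ' Catch errors during execution
    If Error Then
       Message.Error("An error occurred ...!")
       Return sText ' leave the text unchanged
    Endif
 
    ' Offset is relative to the substring passed to Exec(); -1 means no further match
    While $rExpression.Offset <> -1
      If Not $rExpression.Text Then Break ' an empty match would never advance iStart
      iStart += $rExpression.Offset
      sText = Mid$(sText, 1, iStart - 1) & sReplace & Mid$(sText, iStart + Len($rExpression.Text))
      iStart += Len(sReplace) ' continue behind the inserted replacement
      If iStart > Len(sText) Then Break
      $rExpression.Exec(Mid$(sText, iStart))
    Wend
 
    Return sText
 
  End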
 
'-- The SetPattern() function is only needed so that a hint text can be placed in front of the patterns.
 
  Private Function SetPattern(sInput As String) As String
     Dim iPosition As Integer
 
     iPosition = InStr(sInput, "--->")
     If iPosition = 0 Then
        $sPattern = sInput
     Else
        $sPattern = Replace(sInput, Left(sInput, iPosition + 3), "")
        $sPattern = Trim($sPattern)
     Endif
 
     Return $sPattern
 
  End
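What SetPattern() does can be illustrated with a single entry. The entry text below is invented for illustration and assumes that a hint, the separator "--->" and the pattern are stored in one string; the sketch repeats only the string handling, not the project code:

```gambas
' Gambas module file

Public Sub Main()

  ' Hypothetical combo-box entry: hint text, separator, pattern
  Dim sEntry As String = "Postcode for Germany ---> \\b\\d{5}\\b"
  Dim iPosition As Integer

  iPosition = InStr(sEntry, "--->")

  ' Everything before the separator is the hint,
  ' everything behind the 4-character separator is the pattern
  Print "Hint:    "; Trim(Left(sEntry, iPosition - 1))
  Print "Pattern: "; Trim(Mid$(sEntry, iPosition + 4))

End
```

An entry without the separator is passed on unchanged, so patterns typed in by hand work as well.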

A notable feature of the project is the use of the two methods Regexp.Compile() and Regexp.Exec().

19.6.6.1 Method Regexp.Compile (gb.pcre)

  Syntax:	Sub Compile ( Pattern As String [ , CompileOptions As Integer ] )

The Compile() method allows you to pre-compile a regular expression (pattern) for later execution by the Exec() method. This is useful if you want to apply the same pattern many times or to a large amount of text.
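As in the project source above, an invalid pattern can be caught with Try when it is compiled. The following minimal sketch uses two illustrative patterns that are not part of the project; the first one is deliberately broken:

```gambas
' Gambas module file (requires the component gb.pcre)

Public Sub Main()

  Dim hRegexp As New Regexp
  Dim sPattern As String

  ' The first pattern lacks its closing parenthesis
  For Each sPattern In ["(\\d+", "(\\d+)"]
    Try hRegexp.Compile(sPattern)
    If Error Then
      Print "Rejected: "; sPattern
    Else
      Print "Compiled: "; sPattern
    Endif
  Next

End
```

Checking the pattern at compile time keeps the error handling in one place, before any text is searched.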

19.6.6.2 Method Regexp.Exec (gb.pcre)

  Syntax:	Sub Exec ( Subject As String [ , ExecOptions As Integer ] )

The Exec() method allows you to use a previously compiled regular expression. This is particularly useful if you want to check many different texts. You only need to compile a regular expression once and can then call Exec() repeatedly, which is generally faster than recompiling the pattern for each text.
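A minimal sketch of this usage, with an illustrative pattern and invented sample texts: the pattern is compiled once and then executed on several subjects. As in the project code, an Offset of -1 is taken to mean that nothing was found:

```gambas
' Gambas module file (requires the component gb.pcre)

Public Sub Main()

  Dim hRegexp As New Regexp
  Dim sLine As String

  hRegexp.Compile("\\d{5}") ' illustrative pattern: five consecutive digits

  For Each sLine In ["39576 Stendal", "no digits here", "Berlin 10115"]
    hRegexp.Exec(sLine)
    If hRegexp.Offset = -1 Then
      Print sLine; ": no match"
    Else
      Print sLine; ": found "; hRegexp.Text; " at offset "; hRegexp.Offset
    Endif
  Next

End
```

Note that the offset counts from 0, whereas string functions such as Mid$() count from 1; the Search() function above compensates for this by starting iStart at 1.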

Compile() and Exec() are called automatically when you create a new Regexp object and pass a text (subject) and a pattern to the constructor.

The class provides the following constants. A selection of them can be used as options in the two methods; others, such as NoMatch or NoMemory, denote error codes:

Anchored  BadMagic  BadOption  BadUTF8  BadUTF8Offset  Callout  Caseless  DollarEndOnly  DotAll
Extended  Extra  MatchLimit  MultiLine  NoAutoCapture  NoMatch  NoMemory  NoSubstring  NoUTF8Check
NotBOL  NotEOL  NotEmpty  Null  UTF8  Ungreedy  UnknownNode

Example: Regexp.Anchored → Constant Anchored As Integer = 16

If this constant is specified as an option, the pattern is “anchored”: it can only match at the position where the search starts, normally the beginning of the subject. The same effect can be achieved by suitable constructs in the pattern itself, for example a leading ^.
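The effect of the option can be tried out with a short sketch. The pattern and the sample texts are illustrative assumptions; the anchoring is set once through the compile option and once through the pattern itself:

```gambas
' Gambas module file (requires the component gb.pcre)

Public Sub Main()

  Dim hRegexp As New Regexp
  Dim sText As String

  ' Anchoring through the compile option
  hRegexp.Compile("\\d+", Regexp.Anchored)

  For Each sText In ["2023 was the year", "The year 2023"]
    hRegexp.Exec(sText)
    ' Offset is -1 if the pattern did not match at the start of the subject
    Print sText; " -> Offset "; hRegexp.Offset
  Next

  ' Anchoring through the pattern itself
  hRegexp.Compile("^\\d+")
  hRegexp.Exec("The year 2023")
  Print "With ^ in the pattern -> Offset "; hRegexp.Offset

End
```

Under these assumptions only the first subject should report a match, because it is the only one that begins with digits.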

The next source code excerpt comes from a project that converts Gambas source code into HTML source code. Among other things, it searches the Gambas source code for a URL and, if one is found, replaces it with a suitable hyperlink:

  Private Sub parseLinks(sURL As String) As String
    Dim sRegex As Regexp
    Dim sPattern, sReplace As String
 
    sPattern = "(https?://([-\\w\\.]+)+(:\\d+)?(/([\\w/_\\.-]*(\\?\\S+)?)?)?)|(@[\\w]*|#[\\w]*)"
    sRegex = New Regexp(sURL, sPattern) ' compiles the pattern and executes it on sURL
    If sRegex.Offset = -1 Then Return sURL ' nothing found: return the text unchanged
    sReplace = "<a href=\"" & sRegex.Text & "\">" & sRegex.Text & "</a>"
 
    Return Replace(sURL, sRegex.Text, sReplace)
 
  End
 
  Public Sub Test_Click()
    Print parseLinks("http://www.gambas-buch.de")
  End

The call produces the following output:

  <a href="http://www.gambas-buch.de">http://www.gambas-buch.de</a>

k19/k19.6/k19.6.6/start.txt · Last modified: 21.10.2023 by emma
