Table of Contents

6.13 Class CsvFile

The class CsvFile (gb.util) can be used to read a CSV file and decode its contents. Note: It is assumed that the first line of the CSV file necessarily contains the field names. If not all field names are given, then the missing field name is replaced by a # character followed by the index of the field in the imaginary field list. In general: The CsvFile class cannot handle CSV files that do not have a (first) line with the field names (header), although the header with the field names is optional according to RFC 4180!

You can create an object of the class. It will open a CSV file for reading:

Dim hCsvFile As CsvFile
hCsvFile = New CsvFile ( Path As String [ , Separator As String, Escape As String ] )

6.13.1 Properties

The CsvFile class has three properties:

PropertyData typeDescription
EofBooleanTrue is returned if the end of the CSV file was reached while reading.
FieldsString[ ]Return a list of field names as string array.
LineIntegerReturns the number of the last read line.

Table 6.13.1.1 : Properties of the CsvFile class

All field names in the list of field names are normalized using the following algorithm:

6.13.2 Methods

The CsvFile class has only this one method:

Function Read() As Collection

The function reads a CSV file and returns it as a collection of field values indexed by their field names: CsvFile.Fields property. The field values are normalized using the following algorithm:

6.13.3 Notes

You should be aware of and follow the following hints:

The correct conversion of the rows in a CSV file to Variant[] instead of Collection, for example, would look like this, where you get all the field names via the CsvFile.Fields property:

Private Function GetData() As Variant[]
  Dim aFieldList As Variant[]
  Dim cLine As Collection
  Dim sField As String, aFields As String[]
 
  aFieldList = New Variant[]
 
  While Not hCSVFile.Eof
    cLine = hCSVFile.Read()
    aFields = New String[]
    For Each sField In hCSVFile.Fields
      aFields.Add(cLine[sField])
    Next
    aFieldList.Add(aFields)
  Wend
 
  Return aFieldList
 
End

For example, if you want to know if the field with the field name “location” is set in the current row, then you need to use the Exist() method for Collection:

Dim cLine As Collection
...
cLine = hCSVFile.Read()
If Not cLine.Exist("ort") Then ...

Here are more hints for creating and editing CSV files - found on the website https://www.thoughtspot.com/blog/6-rules-creating-valid-csv-files - by Jon Avrach, the observance of which will minimize problems, especially when using CSV files to import data into databases:

For more details on these notes, you can get good information on Wikipedia with appropriate search criteria and in the RFC 4180 (CSV specification).

In the projects about CSV, 2 cases have been distinguished. The first case is about writing CSV files with data from different sources. The sources were data in controls such as GridView or TextArea or XML files or database tables. The sources for the second case were CSV files whose content is read and processed, saved or displayed in different ways. Only for this case, the CsvFile class from the gb.util component can be used.

In the chapter → https://www.gambas-buch.de/doku.php?id=k17:k17.7:k17.7.6:start you can find a project for exporting selected data to a CSV file and in the chapter at https://www.gambas-buch.de/doku.php?id=k17:k17.7:k17.7.7:start you can find a project for importing data from a CSV file. Interesting in both cases is the possibility to specify both the separator character and the escape character. You can also specify whether the field names should be saved as headers as well, or preview to see if a (first) line with the field names as headers exists.

B1
Figure 6.13.3.1: Data export

B2
Figure 6.13.3.2: Data import.

The archives for these two revised projects can be found in the download area.

6.13.4 Project 1 - CSV2XML

The first project first reads a CSV file in the dialog. Then the content of the CSV file is prepared in such a way that

The thus prepared content is saved under the original file name. Afterwards, the content of the prepared CSV file is decoded with a new CsvFile object, specifying the field separator and the escape character, and written to an XML document (Chapter 27.2.1). Finally, the XML document is saved. There is no check if the CSV file contains a list of field names in the first line. A header is simply expected, because otherwise the use of the class CsvFile is not possible!


Figure 6.13.4.1: Content of the prepared CSV file


Figure 6.13.4.2: Extract from the contents of the XML file.

The source code is given in full and commented:

[1] ' Gambas class file
[2]
[3] Private $sSeparator As String
[4] Private $sEscape As String
[5] Public hXmlWriter As XmlWriter
[6] Public sFilePath As String
[7] Public hCSVFile As CsvFile
[8]
[9] Public Sub Form_Open()
[10]   FMain.Center
[11]   FMain.Resizable = True
[12]   FMain.Caption = "CSV2XML | Gambas-Version: " & System.FullVersion
[13]
[14]   $sSeparator = ","
[15]   $sEscape = "\"" ' Alternative: Chr(34) => "
[16]
[17]   btnConvertAndSave.Enabled = False
[18]
[19] End ' Form_Open()
[20]
[21] Public Sub btnPrepareAndSave_Click()
[22]   sFilePath = FileOpen()
[23]   If sFilePath Then
[24]      txaPrepare.Text = Prepare(sFilePath)
[25]      txaPrepare.Pos = 0
[26]      File.Save(sFilePath, Prepare(sFilePath))
[27]      Wait
[28]      btnPrepareAndSave.Enabled = False
[29]      btnConvertAndSave.Enabled = True
[30]   Else
[31]     Return
[32]   Endif
[33] End
[34]
[35] Public Sub btnConvertAndSave_Click()
[36]   WriteXML()
[37]   btnPrepareAndSave.Enabled = True
[38]   btnConvertAndSave.Enabled = False
[39] End
[40]
[41] Public Sub btnClose_Click()
[42]   FMain.Close()
[43] End ' Close
[44]
[45] Private Function Prepare(sFilePath As String) As String
[46]
[47]   Dim sCSVContent, sCSVChanged, sLine, sField, sNewLine As String
[48]
[49]   sCSVContent = File.Load(sFilePath)
[50]
[51]   For Each sLine In Split(sCSVContent, gb.NewLine, $sEscape, False, True)
[52]     sLine = Trim(sLine)
[53]     If Not sLine Then Continue ' Blank lines are ignored
[54]     sNewLine = ""
[55]     For Each sField In Split(sLine, $sSeparator, $sEscape, False, True)
[56]       If sField Not Begins $sEscape And If sField Not Ends $sEscape Then
[57]          sField = $sEscape & sField & $sEscape
[58]       Endif
[59]       If sField = $sEscape & $sEscape Or If sField = $sEscape & " " & $sEscape Then
[60]        ' sField = $sEscape & "" & $sEscape ' Case 1
[61]          sField = "ZERO"                   ' Case 2
[62]       Endif
[63]       sNewLine &= sField & $sSeparator
[64]     Next
[65]      If sNewLine Ends $sSeparator Then sNewLine = Left(sNewLine, -1)
[66]      sCSVChanged &= sNewLine & gb.NewLine
[67]   Next
[68]
[69]   sCSVChanged = Left(sCSVChanged, -1) ' Line end character of the *last* line is removed
[70]
[71]   Return sCSVChanged
[72]
[73] End ' Prepare(...)
[74]
[75] Private Sub WriteXML()
[76]
[77]   Dim k As Integer
[78]   Dim cLine As Collection
[79]   Dim sField As String
[80]
[81]   hCSVFile = New CsvFile(sFilePath, $sSeparator, $sEscape)
[82]
[83]   hXmlWriter = New XmlWriter
[84]   hXmlWriter.Open(Zero, True, "UTF-8")
[85]
[86]   hXmlWriter.PI("Document", "version=\"" & Format(Now, "dd.mm.yyyy-hh:nn") & "\"")
[87]
[88]   hXmlWriter.StartElement("root")
[89]   hXmlWriter.Comment("DataBase: " & File.Name(sFilePath))
[90]
[91]   While Not hCSVFile.Eof
[92]     hXmlWriter.StartElement("item")
[93]     cLine = hCSVFile.Read()
[94]     k = 0
[95]     For Each sField In hCSVFile.Fields
[96]       If sField = "zero" Then
[97]          hXmlWriter.Element("field_" & CStr(k + 1), cLine[sField])
[98]       Else
[99]          hXmlWriter.Element(sField, cLine[sField])
[100]     Endif
[101]     Inc k
[102]    Next
[103]    hXmlWriter.EndElement
[104]  Wend
[105]
[106]   hXmlWriter.EndElement ' root
[107]   hXmlWriter.EndDocument
[108]   txaPrepare.Text = hXmlWriter.Data
[109]   txaPrepare.Pos = 0
[110]   File.Save(File.Dir(sFilePath) &/ File.BaseName(sFilePath) & ".xml", hXmlWriter.Data)
[111]
[112] End
[113]
[114] Private Function FileOpen() As String
[115]   Dialog.Title = "Import a csv file!"
[116]   Dialog.Filter = ["*.csv", "csv-Files"]
[117]   Dialog.Path = Application.Path &/ "CSV"
[118]   If Dialog.OpenFile(False) = True Then ' Multiselect=False (Standard)
[119]      Message.Info("The opening of the csv file was cancelled!")
[120]      Return
[121]   Else
[122]      Return Dialog.Path
[123]   Endif
[124] End ' FileOpen()

Comment:

6.13.5 Digression - CSV2XML

In this digression, you will be presented with a (bash) script that likewise allows you to quickly transform the contents of a CSV file into the contents of an XML file. The executable script expects exactly 3 parameters:

→ the file name of the CSV file,→ the field separator, and the→ escape character:

hans@mint71 ~/GB3BUCH/.../Projekte/CSV_READ/SH-Skript $ ./test.sh "db.csv" "," "\""

If the number of parameters is not 3, then there is an error message with a hint about the correct syntax:

Syntax: ./test.sh 'CSV-File-Path' 'Field separator' 'Escape character'
# hans@mint71 ~/GB3BUCH/.../Projekte/CSV_READ/SH-Skript $ ./test.sh "db.csv" ","

If this is the content of the CSV file:

"First name","Age","Residence"
"Hans","68","Pößneck"
"Peter","56","Cork"
"Anna","18","Jena"

then here you can see the result in the XML file:

<?xml version="1.0" encoding="UTF-8"?>
<Data>
  <Record>
    <First name>Hans</First name>
    <Age>68</Age>
    <Residence>Pößneck</Residence>
  </record>
  <Record>
    <First name>Peter</First name>
    <age>56</age>
    <Residence>Cork</Residence>
  </record>
  <Record>
    <FirstName>Anna</FirstName>
    <Age>18</Age>
    <Residence>Jena</Residence>
  </Record>
</Data>

The tested source code from the (bash) script is well documented and is given in full:

#!/bin/bash # She-Bang Line → Bash interpreter
 
  if [ $# -ne 3 ] # If the number of parameters (= $#) is not equal to 3 ...
  then
    # FARB-WERTE:
    RO="\033[31m"    # red
    NO="\033[0m"     # regular
    echo -e "${RO}Syntax: $0 'CSV-File-Path' 'Field separator' 'Escape character'"
    exit 1
  fi
  # Saving the variable IFS in the variable backIFS | IFS = Internal Field Separator
  backIFS=$IFS
 
  file_in=$1 # Parameter 1
  echo $file_in
  # Filename for file_out (XML-File) generate
  filebasename=${file_in%.*}
  fileextension=".xml"
  file_out=$filebasename$fileextension
  echo $file_out
 
  separator=$2 # Parameter 2
  escape=$3    # Parameter 3
 
  # 1. Read line (= list of field names) from CSV file
  read header_line < "$file_in"
  IFS=$separator
  # store list of field names in array 'header' -> operator =()
  header=($header_line)
 
  IFS='*'
 
  # Read in the content of the CSV file line by line and save it in the array 'content'.
  i=0
  while read content[$i]
  do
    # echo "Line $i: ${content[i]}"
    ((i++))
  done < $1
 
  # 1. Delete row in array 'content
  unset content[0]
  # copy array - but without the now empty first element
  content=(${content[*]})
 
  # write content of the XML file
  #-------------------------------------------------------------------------------
  # XML-Prolog
  echo '<?xml version="1.0" encoding="UTF-8"?>' > $file_out
  # XML-Root-Element
  echo '<Data>' >> $file_out
  # XML-Elemente
  #-------------------------------------------------------------------------------
  for record in "${content[@]}"
  do
  echo '  <Record>' >> $file_out
    index=0
    #.............................................................................
 
    IFS=$separator list=($record)
 
    for (( c=0; c<=${#header[@]}-1; c++ ))
    do
      tag=${header[$c]#$escape}
      tag=${tag%$escape}
      value=${list[$c]#$escape}
      value=${value%$escape}
      echo '    <'${tag}'>'${value}'</'${tag}'>' >> $file_out
    done
 
    #.............................................................................
    echo '  </Record>' >> $file_out
    ((index++))
  done
  #-------------------------------------------------------------------------------
  echo '</Data>' >> $file_out
 
  IFS=$backIFS
 
  exit 0

6.13.6 Project 2 - CSV2GRID

In the second project, a CSV file is read. The data decoded with the CsvFile read method is displayed in a GridView.

Source code:

' Gambas class file
 
Public Sub btnReadCSV_Click()
 
  Dim i, iVal As Integer
  Dim sField As New String[]
  Dim sKey, sSeparator, sEscape As String
  Dim hCsvFile As CsvFile
  Dim cLine As Collection
  sSeparator = ","
  sEscape = "\"" ' Chr$(34)
' Open CSV file
  Try hCsvFile = New CsvFile(FileOpen(), sSeparator, sEscape)
  If Error Then Return
  GridView1.Columns.Count = 0
' Prepare header
  GridView1.Header = True
  sField = hCsvFile.Fields
  GridView1.Columns.Count = sField.Count
  For i = 0 To sField.Count - 1
    GridView1.Columns[i].Title = UCase(sField[i])
  Next
' Read CSV data
  While Not hCsvFile.Eof
    cLine = hCsvFile.Read()
    If cLine.Count > 0 Then ' Collection not empty?
       Inc GridView1.Rows.Count
       iVal = 0
       For Each sKey In hCsvFile.Fields
         Gridview1[GridView1.Rows.Count - 1, iVal].text = cLine[sKey]
         Inc iVal
       Next
    Endif
  Wend
 GridView1.MoveTo(0, 0)
 GridView1.Columns.Width = -1 ' Adjust column width automatically
End
 
Private Function FileOpen() As String
 
  Dialog.Title = ("Import csv-File!")
  Dialog.Filter = ["*.csv", "csv-Files"]
  Dialog.Path = Application.Path &/ "Files"
  If Dialog.OpenFile(False) = True Then ' Multiselect=False (Standard)
     Message.Info(("Opening the csv file has been canceled!"))
     Return "break"
  Else
     Return Dialog.Path
  Endif
 
End ' FileOpen()

B5
Figure 6.13.6.1: Data from a CSV file - displayed in a GridView.

6.13.7 Project 3 - QSLITE2CSV

Project 3 is dedicated to the case where data from a database table is stored in a CSV file. Note: In the case of database tables, MyQSL, PostgreSQL or SQlite provide dump or copy methods that can be used to save data directly to a CSV file. Therefore, only the method in which the selected data from a DB table is converted to a string representing the contents of a CSV file is presented:

Private Function GetCSVData() As String
 
  Dim sCSVLine, sRequest, sSeparator, sEscape As String
  Dim hRequest As SQLRequest
  Dim dbResult As Result
  Dim rField As ResultField
 
  hRequest = hConnection.SQL.Select("").From(cmbDBTabellen.Text)
  sRequest = hRequest.Get()
  dbResult = hConnection.Exec(sRequest)
 
  If dbResult.Count = 0 Then
     Message(("The number of selected records is zero!"))
     Return
  Endif
 
  FMain.Caption = "Data export from database: '"
  FMain.Caption &= File.Name(sDBFilePath) & "' "
  FMain.Caption &= " Table: '" & cmbDBTables.Text
  FMain.Caption &= "' in CSV file"
 
  sSeparator = ","  ' Field separator
  sEscape = "\""    ' Text separator
 
' Field list
  For Each rField In dbResult.Fields
    sCSVLine &= sEscape & Upper(rField.Name) & sEscape & sSeparator
  Next
  sCSVLine = Left(sCSVLine, -1) ' Last field name WITHOUT field separator
  sCSVLine &= gb.NewLine
 
' Data lines
  For Each dbResult
    For Each rField In dbResult.Fields
      If dbResult[rField.Name] = "" Then
         sCSVLine &= sEscape & "NULL" & sEscape & sSeparator
      Else
         sCSVLine &= sEscape & dbResult[rField.Name] & sEscape & sSeparator
      Endif
    Next
    sCSVLine = Left(sCSVLine, -1) ' Last field name WITHOUT field separator
    sCSVLine &= gb.NewLine
  Next
 
  sCSVLine = Left(sCSVLine, -1) ' Last line WITHOUT gb.NewLine
  Return sCSVLine
 
End

B6
Figure 6.14.7.1: Data Export DB Data to CSV File

In the TextArea you can see an excerpt from the content of the CSV file.

Hint:
In the chapter https://www.gambas-buch.de/doku.php?id=k27:k27.6:start#projekt_2xml_csv a project is presented, where the transformation XML → CSV is implemented.

Download