The class CsvFile (gb.util) can be used to read a CSV file and decode its contents. Note: It is assumed that the first line of the CSV file necessarily contains the field names. If not all field names are given, then the missing field name is replaced by a # character followed by the index of the field in the imaginary field list. In general: The CsvFile class cannot handle CSV files that do not have a (first) line with the field names (header), although the header with the field names is optional according to RFC 4180!
You can create an object of the class. It will open a CSV file for reading:
Dim hCsvFile As CsvFile hCsvFile = New CsvFile ( Path As String [ , Separator As String, Escape As String ] )
The CsvFile class has three properties:
Property | Data type | Description |
---|---|---|
Eof | Boolean | True is returned if the end of the CSV file was reached while reading. |
Fields | String[ ] | Return a list of field names as string array. |
Line | Integer | Returns the number of the last read line. |
Table 6.13.1.1 : Properties of the CsvFile class
All field names in the list of field names are normalized using the following algorithm:
The CsvFile class has only this one method:
Function Read() As Collection
The function reads a CSV file and returns it as a collection of field values indexed by their field names: CsvFile.Fields property. The field values are normalized using the following algorithm:
You should be aware of and follow the following hints:
The correct conversion of the rows in a CSV file to Variant[] instead of Collection, for example, would look like this, where you get all the field names via the CsvFile.Fields property:
Private Function GetData() As Variant[] Dim aFieldList As Variant[] Dim cLine As Collection Dim sField As String, aFields As String[] aFieldList = New Variant[] While Not hCSVFile.Eof cLine = hCSVFile.Read() aFields = New String[] For Each sField In hCSVFile.Fields aFields.Add(cLine[sField]) Next aFieldList.Add(aFields) Wend Return aFieldList End
For example, if you want to know if the field with the field name “location” is set in the current row, then you need to use the Exist() method for Collection:
Dim cLine As Collection ... cLine = hCSVFile.Read() If Not cLine.Exist("ort") Then ...
Here are more hints for creating and editing CSV files - found on the website https://www.thoughtspot.com/blog/6-rules-creating-valid-csv-files - by Jon Avrach, the observance of which will minimize problems, especially when using CSV files to import data into databases:
For more details on these notes, you can get good information on Wikipedia with appropriate search criteria and in the RFC 4180 (CSV specification).
In the projects about CSV, 2 cases have been distinguished. The first case is about writing CSV files with data from different sources. The sources were data in controls such as GridView or TextArea or XML files or database tables. The sources for the second case were CSV files whose content is read and processed, saved or displayed in different ways. Only for this case, the CsvFile class from the gb.util component can be used.
In the chapter → https://www.gambas-buch.de/doku.php?id=k17:k17.7:k17.7.6:start you can find a project for exporting selected data to a CSV file and in the chapter at https://www.gambas-buch.de/doku.php?id=k17:k17.7:k17.7.7:start you can find a project for importing data from a CSV file. Interesting in both cases is the possibility to specify both the separator character and the escape character. You can also specify whether the field names should be saved as headers as well, or preview to see if a (first) line with the field names as headers exists.
The archives for these two revised projects can be found in the download area.
The first project first reads a CSV file in the dialog. Then the content of the CSV file is prepared in such a way that
The thus prepared content is saved under the original file name. Afterwards, the content of the prepared CSV file is decoded with a new CsvFile object, specifying the field separator and the escape character, and written to an XML document (Chapter 27.2.1). Finally, the XML document is saved. There is no check if the CSV file contains a list of field names in the first line. A header is simply expected, because otherwise the use of the class CsvFile is not possible!
Figure 6.13.4.1: Content of the prepared CSV file
Figure 6.13.4.2: Extract from the contents of the XML file.
The source code is given in full and commented:
[1] ' Gambas class file [2] [3] Private $sSeparator As String [4] Private $sEscape As String [5] Public hXmlWriter As XmlWriter [6] Public sFilePath As String [7] Public hCSVFile As CsvFile [8] [9] Public Sub Form_Open() [10] FMain.Center [11] FMain.Resizable = True [12] FMain.Caption = "CSV2XML | Gambas-Version: " & System.FullVersion [13] [14] $sSeparator = "," [15] $sEscape = "\"" ' Alternative: Chr(34) => " [16] [17] btnConvertAndSave.Enabled = False [18] [19] End ' Form_Open() [20] [21] Public Sub btnPrepareAndSave_Click() [22] sFilePath = FileOpen() [23] If sFilePath Then [24] txaPrepare.Text = Prepare(sFilePath) [25] txaPrepare.Pos = 0 [26] File.Save(sFilePath, Prepare(sFilePath)) [27] Wait [28] btnPrepareAndSave.Enabled = False [29] btnConvertAndSave.Enabled = True [30] Else [31] Return [32] Endif [33] End [34] [35] Public Sub btnConvertAndSave_Click() [36] WriteXML() [37] btnPrepareAndSave.Enabled = True [38] btnConvertAndSave.Enabled = False [39] End [40] [41] Public Sub btnClose_Click() [42] FMain.Close() [43] End ' Close [44] [45] Private Function Prepare(sFilePath As String) As String [46] [47] Dim sCSVContent, sCSVChanged, sLine, sField, sNewLine As String [48] [49] sCSVContent = File.Load(sFilePath) [50] [51] For Each sLine In Split(sCSVContent, gb.NewLine, $sEscape, False, True) [52] sLine = Trim(sLine) [53] If Not sLine Then Continue ' Blank lines are ignored [54] sNewLine = "" [55] For Each sField In Split(sLine, $sSeparator, $sEscape, False, True) [56] If sField Not Begins $sEscape And If sField Not Ends $sEscape Then [57] sField = $sEscape & sField & $sEscape [58] Endif [59] If sField = $sEscape & $sEscape Or If sField = $sEscape & " " & $sEscape Then [60] ' sField = $sEscape & "" & $sEscape ' Case 1 [61] sField = "ZERO" ' Case 2 [62] Endif [63] sNewLine &= sField & $sSeparator [64] Next [65] If sNewLine Ends $sSeparator Then sNewLine = Left(sNewLine, -1) [66] sCSVChanged &= sNewLine & gb.NewLine [67] Next [68] [69] sCSVChanged = Left(sCSVChanged, -1) ' Line end character of the *last* line is removed [70] [71] Return sCSVChanged [72] [73] End ' Prepare(...) [74] [75] Private Sub WriteXML() [76] [77] Dim k As Integer [78] Dim cLine As Collection [79] Dim sField As String [80] [81] hCSVFile = New CsvFile(sFilePath, $sSeparator, $sEscape) [82] [83] hXmlWriter = New XmlWriter [84] hXmlWriter.Open(Zero, True, "UTF-8") [85] [86] hXmlWriter.PI("Document", "version=\"" & Format(Now, "dd.mm.yyyy-hh:nn") & "\"") [87] [88] hXmlWriter.StartElement("root") [89] hXmlWriter.Comment("DataBase: " & File.Name(sFilePath)) [90] [91] While Not hCSVFile.Eof [92] hXmlWriter.StartElement("item") [93] cLine = hCSVFile.Read() [94] k = 0 [95] For Each sField In hCSVFile.Fields [96] If sField = "zero" Then [97] hXmlWriter.Element("field_" & CStr(k + 1), cLine[sField]) [98] Else [99] hXmlWriter.Element(sField, cLine[sField]) [100] Endif [101] Inc k [102] Next [103] hXmlWriter.EndElement [104] Wend [105] [106] hXmlWriter.EndElement ' root [107] hXmlWriter.EndDocument [108] txaPrepare.Text = hXmlWriter.Data [109] txaPrepare.Pos = 0 [110] File.Save(File.Dir(sFilePath) &/ File.BaseName(sFilePath) & ".xml", hXmlWriter.Data) [111] [112] End [113] [114] Private Function FileOpen() As String [115] Dialog.Title = "Import a csv file!" [116] Dialog.Filter = ["*.csv", "csv-Files"] [117] Dialog.Path = Application.Path &/ "CSV" [118] If Dialog.OpenFile(False) = True Then ' Multiselect=False (Standard) [119] Message.Info("The opening of the csv file was cancelled!") [120] Return [121] Else [122] Return Dialog.Path [123] Endif [124] End ' FileOpen()
Comment:
In this digression, you will be presented with a (bash) script that likewise allows you to quickly transform the contents of a CSV file into the contents of an XML file. The executable script expects exactly 3 parameters:
→ the file name of the CSV file,→ the field separator, and the→ escape character:
hans@mint71 ~/GB3BUCH/.../Projekte/CSV_READ/SH-Skript $ ./test.sh "db.csv" "," "\""
If the number of parameters is not 3, then there is an error message with a hint about the correct syntax:
Syntax: ./test.sh 'CSV-File-Path' 'Field separator' 'Escape character' # hans@mint71 ~/GB3BUCH/.../Projekte/CSV_READ/SH-Skript $ ./test.sh "db.csv" ","
If this is the content of the CSV file:
"First name","Age","Residence" "Hans","68","Pößneck" "Peter","56","Cork" "Anna","18","Jena"
then here you can see the result in the XML file:
<?xml version="1.0" encoding="UTF-8"?> <Data> <Record> <First name>Hans</First name> <Age>68</Age> <Residence>Pößneck</Residence> </record> <Record> <First name>Peter</First name> <age>56</age> <Residence>Cork</Residence> </record> <Record> <FirstName>Anna</FirstName> <Age>18</Age> <Residence>Jena</Residence> </Record> </Data>
The tested source code from the (bash) script is well documented and is given in full:
#!/bin/bash # She-Bang Line → Bash interpreter if [ $# -ne 3 ] # If the number of parameters (= $#) is not equal to 3 ... then # FARB-WERTE: RO="\033[31m" # red NO="\033[0m" # regular echo -e "${RO}Syntax: $0 'CSV-File-Path' 'Field separator' 'Escape character'" exit 1 fi # Saving the variable IFS in the variable backIFS | IFS = Internal Field Separator backIFS=$IFS file_in=$1 # Parameter 1 echo $file_in # Filename for file_out (XML-File) generate filebasename=${file_in%.*} fileextension=".xml" file_out=$filebasename$fileextension echo $file_out separator=$2 # Parameter 2 escape=$3 # Parameter 3 # 1. Read line (= list of field names) from CSV file read header_line < "$file_in" IFS=$separator # store list of field names in array 'header' -> operator =() header=($header_line) IFS='*' # Read in the content of the CSV file line by line and save it in the array 'content'. i=0 while read content[$i] do # echo "Line $i: ${content[i]}" ((i++)) done < $1 # 1. Delete row in array 'content unset content[0] # copy array - but without the now empty first element content=(${content[*]}) # write content of the XML file #------------------------------------------------------------------------------- # XML-Prolog echo '<?xml version="1.0" encoding="UTF-8"?>' > $file_out # XML-Root-Element echo '<Data>' >> $file_out # XML-Elemente #------------------------------------------------------------------------------- for record in "${content[@]}" do echo ' <Record>' >> $file_out index=0 #............................................................................. IFS=$separator list=($record) for (( c=0; c<=${#header[@]}-1; c++ )) do tag=${header[$c]#$escape} tag=${tag%$escape} value=${list[$c]#$escape} value=${value%$escape} echo ' <'${tag}'>'${value}'</'${tag}'>' >> $file_out done #............................................................................. echo ' </Record>' >> $file_out ((index++)) done #------------------------------------------------------------------------------- echo '</Data>' >> $file_out IFS=$backIFS exit 0
In the second project, a CSV file is read. The data decoded with the CsvFile read method is displayed in a GridView.
Source code:
' Gambas class file Public Sub btnReadCSV_Click() Dim i, iVal As Integer Dim sField As New String[] Dim sKey, sSeparator, sEscape As String Dim hCsvFile As CsvFile Dim cLine As Collection sSeparator = "," sEscape = "\"" ' Chr$(34) ' Open CSV file Try hCsvFile = New CsvFile(FileOpen(), sSeparator, sEscape) If Error Then Return GridView1.Columns.Count = 0 ' Prepare header GridView1.Header = True sField = hCsvFile.Fields GridView1.Columns.Count = sField.Count For i = 0 To sField.Count - 1 GridView1.Columns[i].Title = UCase(sField[i]) Next ' Read CSV data While Not hCsvFile.Eof cLine = hCsvFile.Read() If cLine.Count > 0 Then ' Collection not empty? Inc GridView1.Rows.Count iVal = 0 For Each sKey In hCsvFile.Fields Gridview1[GridView1.Rows.Count - 1, iVal].text = cLine[sKey] Inc iVal Next Endif Wend GridView1.MoveTo(0, 0) GridView1.Columns.Width = -1 ' Adjust column width automatically End Private Function FileOpen() As String Dialog.Title = ("Import csv-File!") Dialog.Filter = ["*.csv", "csv-Files"] Dialog.Path = Application.Path &/ "Files" If Dialog.OpenFile(False) = True Then ' Multiselect=False (Standard) Message.Info(("Opening the csv file has been canceled!")) Return "break" Else Return Dialog.Path Endif End ' FileOpen()
Figure 6.13.6.1: Data from a CSV file - displayed in a GridView.
Project 3 is dedicated to the case where data from a database table is stored in a CSV file. Note: In the case of database tables, MyQSL, PostgreSQL or SQlite provide dump or copy methods that can be used to save data directly to a CSV file. Therefore, only the method in which the selected data from a DB table is converted to a string representing the contents of a CSV file is presented:
Private Function GetCSVData() As String Dim sCSVLine, sRequest, sSeparator, sEscape As String Dim hRequest As SQLRequest Dim dbResult As Result Dim rField As ResultField hRequest = hConnection.SQL.Select("").From(cmbDBTabellen.Text) sRequest = hRequest.Get() dbResult = hConnection.Exec(sRequest) If dbResult.Count = 0 Then Message(("The number of selected records is zero!")) Return Endif FMain.Caption = "Data export from database: '" FMain.Caption &= File.Name(sDBFilePath) & "' " FMain.Caption &= " Table: '" & cmbDBTables.Text FMain.Caption &= "' in CSV file" sSeparator = "," ' Field separator sEscape = "\"" ' Text separator ' Field list For Each rField In dbResult.Fields sCSVLine &= sEscape & Upper(rField.Name) & sEscape & sSeparator Next sCSVLine = Left(sCSVLine, -1) ' Last field name WITHOUT field separator sCSVLine &= gb.NewLine ' Data lines For Each dbResult For Each rField In dbResult.Fields If dbResult[rField.Name] = "" Then sCSVLine &= sEscape & "NULL" & sEscape & sSeparator Else sCSVLine &= sEscape & dbResult[rField.Name] & sEscape & sSeparator Endif Next sCSVLine = Left(sCSVLine, -1) ' Last field name WITHOUT field separator sCSVLine &= gb.NewLine Next sCSVLine = Left(sCSVLine, -1) ' Last line WITHOUT gb.NewLine Return sCSVLine End
Figure 6.14.7.1: Data Export DB Data to CSV File
In the TextArea you can see an excerpt from the content of the CSV file.
Hint:
In the chapter https://www.gambas-buch.de/doku.php?id=k27:k27.6:start#projekt_2xml_csv a project is presented, where the transformation XML → CSV is implemented.