
Handling unexpected types

We just saw that CSV data is read into Go as [][]string. However, Go is statically typed, which allows us to enforce strict checks on each of the CSV fields as we parse them for further processing. Consider some messy data in which a few fields do not match the type of the other values in their column, and one row is missing its species label:

4.6,3.1,1.5,0.2,Iris-setosa
5.0,string,1.4,0.2,Iris-setosa
5.4,3.9,1.7,0.4,Iris-setosa
5.3,3.7,1.5,0.2,Iris-setosa
5.0,3.3,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.4,3.2,4.5,1.5,
6.9,3.1,4.9,1.5,Iris-versicolor
5.5,2.3,4.0,1.3,Iris-versicolor
4.9,3.1,1.5,0.1,Iris-setosa
5.0,3.2,1.2,string,Iris-setosa
5.5,3.5,1.3,0.2,Iris-setosa
4.9,3.1,1.5,0.1,Iris-setosa
4.4,3.0,1.3,0.2,Iris-setosa
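
As a quick illustration of why these rows need guarding, the following minimal sketch checks how strconv.ParseFloat treats a well-formed value, the literal text "string", and an empty field. The isFloat helper is a hypothetical name introduced only for this example; it is not part of the code developed in this section.

```go
package main

import (
	"fmt"
	"strconv"
)

// isFloat reports whether s can be parsed as a float64.
// Hypothetical helper, used only for this illustration.
func isFloat(s string) bool {
	_, err := strconv.ParseFloat(s, 64)
	return err == nil
}

func main() {
	// Try one clean value and the two kinds of bad values from the data.
	for _, s := range []string{"3.1", "string", ""} {
		fmt.Printf("%q parses as float64: %t\n", s, isFloat(s))
	}
}
```

Note that strconv.ParseFloat also accepts special values such as "NaN" and "Inf", so a stricter check may be worth adding if those should be rejected as well.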

To check the types of the fields in our CSV records, let's create a struct type to hold successfully parsed values:

// CSVRecord contains a successfully parsed row of the CSV file.
type CSVRecord struct {
	SepalLength float64
	SepalWidth  float64
	PetalLength float64
	PetalWidth  float64
	Species     string
	ParseError  error
}

Then, before we loop over the records, let's initialize a slice of these values:

// Create a slice value that will hold all of the successfully parsed
// records from the CSV.
var csvData []CSVRecord

Now, as we loop over the records, we can parse each value into the type expected for its column, catch any errors, and log as needed:


// Read in the records looking for unexpected types.
for {

	// Read in a row. Check if we are at the end of the file.
	record, err := reader.Read()
	if err == io.EOF {
		break
	}

	// Stop on any other read error rather than parsing a bad row.
	if err != nil {
		log.Fatal(err)
	}

	// Create a CSVRecord value for the row.
	var csvRecord CSVRecord

	// Parse each of the values in the record based on an expected type.
	for idx, value := range record {

		// Parse the value in the record as a string for the string column.
		if idx == 4 {

			// Validate that the value is not an empty string. If the
			// value is an empty string break the parsing loop.
			if value == "" {
				log.Printf("Unexpected type in column %d\n", idx)
				csvRecord.ParseError = fmt.Errorf("empty string value")
				break
			}

			// Add the string value to the CSVRecord.
			csvRecord.Species = value
			continue
		}

		// Otherwise, parse the value in the record as a float64.
		var floatValue float64

		// If the value cannot be parsed as a float, log and break the
		// parsing loop.
		if floatValue, err = strconv.ParseFloat(value, 64); err != nil {
			log.Printf("Unexpected type in column %d\n", idx)
			csvRecord.ParseError = fmt.Errorf("could not parse float")
			break
		}

		// Add the float value to the respective field in the CSVRecord.
		switch idx {
		case 0:
			csvRecord.SepalLength = floatValue
		case 1:
			csvRecord.SepalWidth = floatValue
		case 2:
			csvRecord.PetalLength = floatValue
		case 3:
			csvRecord.PetalWidth = floatValue
		}
	}

	// Append successfully parsed records to the slice defined above.
	if csvRecord.ParseError == nil {
		csvData = append(csvData, csvRecord)
	}
}
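
To try the idea end to end without a file on disk, here is a self-contained sketch that applies the same checks to a few of the messy rows shown earlier. It is one possible arrangement rather than the definitive one: the parseRow and parseAll helpers, the sample constant, and the extra field-count check are assumptions introduced for this example, and the CSVRecord type is repeated so the program compiles on its own.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"log"
	"strconv"
	"strings"
)

// CSVRecord mirrors the struct defined in the text.
type CSVRecord struct {
	SepalLength float64
	SepalWidth  float64
	PetalLength float64
	PetalWidth  float64
	Species     string
	ParseError  error
}

// parseRow (hypothetical helper) converts one raw row into a CSVRecord,
// setting ParseError on the first field that fails validation.
func parseRow(row []string) CSVRecord {
	var rec CSVRecord
	if len(row) != 5 {
		rec.ParseError = fmt.Errorf("expected 5 fields, got %d", len(row))
		return rec
	}

	// The first four columns are expected to be floats.
	floats := []*float64{&rec.SepalLength, &rec.SepalWidth, &rec.PetalLength, &rec.PetalWidth}
	for idx, dst := range floats {
		v, err := strconv.ParseFloat(row[idx], 64)
		if err != nil {
			rec.ParseError = fmt.Errorf("column %d: %v", idx, err)
			return rec
		}
		*dst = v
	}

	// The last column is expected to be a non-empty string.
	if row[4] == "" {
		rec.ParseError = fmt.Errorf("column 4: empty string value")
		return rec
	}
	rec.Species = row[4]
	return rec
}

// parseAll (hypothetical helper) reads every row from r and returns the
// cleanly parsed records along with a count of rejected rows.
func parseAll(r io.Reader) ([]CSVRecord, int, error) {
	reader := csv.NewReader(r)
	var clean []CSVRecord
	rejected := 0
	for {
		row, err := reader.Read()
		if err == io.EOF {
			break
		}
		if err != nil {
			return nil, 0, err
		}
		rec := parseRow(row)
		if rec.ParseError != nil {
			log.Printf("skipping row %v: %v", row, rec.ParseError)
			rejected++
			continue
		}
		clean = append(clean, rec)
	}
	return clean, rejected, nil
}

// sample holds a few rows taken from the messy data shown earlier.
const sample = `4.6,3.1,1.5,0.2,Iris-setosa
5.0,string,1.4,0.2,Iris-setosa
6.4,3.2,4.5,1.5,
5.0,3.2,1.2,string,Iris-setosa
4.9,3.1,1.5,0.1,Iris-setosa
`

func main() {
	clean, rejected, err := parseAll(strings.NewReader(sample))
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("parsed %d rows, rejected %d\n", len(clean), rejected)
}
```

With this sample, two rows parse cleanly and three are rejected: one for each "string" value and one for the missing species. Pulling the per-row checks into a function keeps the read loop short and makes the validation easy to test on its own.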