Keeping and representing data from a CSV file

Comma-separated values (CSV) is a plain-text format for representing a table of values. It is often used to exchange data with spreadsheets. The specification for CSV is given in RFC 4180, available at http://tools.ietf.org/html/rfc4180.

In this recipe, we will read a local CSV file called input.csv consisting of various names and their corresponding ages. Then, to do something useful with the data, we will find the oldest person.

Getting ready

Prepare a simple CSV file with a list of names and their corresponding ages. This can be done using a text editor or by exporting from a spreadsheet, as shown in the following figure:

The raw input.csv file contains the following text:

$ cat input.csv 

name,age
Alex,22
Anish,22
Becca,23
Jasdev,22
John,21
Jonathon,21
Kelvin,22
Marisa,19
Shiv,22
Vinay,22

The code also depends on the csv library. We may install the library through Cabal using the following command:

$ cabal install csv

How to do it...

  1. Import the csv library using the following line of code:
    import Text.CSV
  2. Define and implement main, where we will read and parse the CSV file, as shown in the following code:
    main :: IO ()
    main = do
      let fileName = "input.csv"
      input <- readFile fileName
  3. Apply parseCSV to the file contents to obtain a list of rows representing the tabulated data. The first argument, the filename, is used only in error messages. The result of parseCSV has the type Either ParseError CSV, so we handle both the Left and Right cases:
      let csv = parseCSV fileName input
      either handleError doWork csv
    handleError _ = putStrLn "error parsing"
    doWork csv = (print . findOldest . tail) csv
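  4. Now we can work with the CSV data. In this example, we find and print the row containing the oldest person. Rows that do not have exactly two fields, such as the empty record the parser produces when the file ends with a newline, are skipped:
    findOldest :: [Record] -> Record
    findOldest rows =
      case filter ((== 2) . length) rows of
        [] -> []
        xs -> foldl1 (\a x -> if age x > age a then x else a) xs

    -- The age is the second field of a [name, age] record.
    age :: Record -> Int
    age [_, b] = toInt b
    age _      = error "expected a [name, age] record"

    toInt :: String -> Int
    toInt = read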
  5. After running main, the code should produce the following output:
    $ runhaskell Main.hs
    
    ["Becca","23"]
    

    Tip

    We can also use the parseCSVFromFile function to get the CSV representation directly from a filename, instead of using readFile followed by parseCSV.
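As a minimal sketch of the tip above, the following program uses parseCSVFromFile from Text.CSV to read and parse input.csv in one step. The validRows helper is a hypothetical name introduced here for illustration; it keeps only rows with exactly two fields, on the assumption that every data row is a name followed by an age.

```haskell
import Text.CSV (CSV, Record, parseCSVFromFile)

-- Hypothetical helper: keep only rows that have exactly two fields.
-- This also drops the empty record that a trailing newline can produce.
validRows :: CSV -> [Record]
validRows = filter ((== 2) . length)

main :: IO ()
main = do
  -- parseCSVFromFile reads the file and parses it in a single action.
  result <- parseCSVFromFile "input.csv"
  case result of
    Left err  -> putStrLn ("error parsing: " ++ show err)
    -- drop 1 skips the header row before printing the data rows.
    Right csv -> mapM_ print (validRows (drop 1 csv))
```

Printing the ParseError on the Left side is a design choice for this sketch; the recipe itself prints a fixed message.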

How it works...

The CSV data structure in Haskell is represented as a list of records. Record is merely a list of Fields, and Field is a type synonym for String. In other words, it is a collection of rows representing a table, as shown in the following figure:
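To make the shape of the data concrete, here is a small sketch that writes the first few rows of input.csv as a Haskell value. The three type synonyms mirror the ones Text.CSV defines and are repeated locally only so that the example compiles without the csv package; the name sample is an assumption made for illustration.

```haskell
-- Local copies of the type synonyms defined by Text.CSV,
-- repeated here so the sketch needs nothing beyond base.
type Field  = String
type Record = [Field]
type CSV    = [Record]

-- The first rows of input.csv as a CSV value, header row included.
sample :: CSV
sample =
  [ ["name", "age"]
  , ["Alex", "22"]
  , ["Anish", "22"]
  , ["Becca", "23"]
  ]

main :: IO ()
main = do
  print (length sample)            -- number of rows, header included
  print (map head (drop 1 sample)) -- the name column without the header
```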

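The parseCSV library function returns an Either type, with the Left side being a ParseError and the Right side being the list of lists. The Either l r data type is similar to the Maybe a type, which has the Just a and Nothing constructors. The difference is that the failure case, Left l, carries a value describing what went wrong, whereas Nothing carries no information.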
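The comparison between Maybe and Either can be sketched with two ways of parsing an age field. The function names parseAgeMaybe and parseAgeEither are hypothetical and are not part of the csv library; readMaybe comes from Text.Read in base.

```haskell
import Text.Read (readMaybe)

-- Maybe: a failed parse gives Nothing, with no explanation.
parseAgeMaybe :: String -> Maybe Int
parseAgeMaybe = readMaybe

-- Either: the Left side carries a message saying what went wrong.
parseAgeEither :: String -> Either String Int
parseAgeEither s = case readMaybe s of
  Nothing -> Left ("not a number: " ++ show s)
  Just n  -> Right n

main :: IO ()
main = do
  print (parseAgeMaybe "23")
  print (parseAgeMaybe "abc")
  print (parseAgeEither "23")
  print (parseAgeEither "abc")
```

parseCSV follows the same pattern as parseAgeEither, except that its Left side is a ParseError rather than a String.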

We use the either function to handle the Left and Right cases. The Left case handles the error, and the Right case handles the actual work to be done on the data. In this recipe, the Right side is a CSV value, that is, a list of Record values. The fields in each Record are accessible through ordinary list operations such as head, last, !!, and so on.
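The following sketch shows the either function and plain list access working together. It uses Either String CSV in place of the library's Either ParseError CSV so that it runs with base alone; describe and report are hypothetical names, and the code assumes every data row has at least two fields.

```haskell
-- Local copies of the Text.CSV type synonyms, so only base is needed.
type Field  = String
type Record = [Field]
type CSV    = [Record]

-- Describe one [name, age] record using ordinary list functions.
-- Assumes the record has at least two fields.
describe :: Record -> String
describe r = head r ++ " is " ++ (r !! 1) ++ " years old"

-- either applies the first function to a Left and the second to a Right.
report :: Either String CSV -> String
report = either ("error: " ++) (unlines . map describe . drop 1)

main :: IO ()
main = do
  putStr (report (Right [["name", "age"], ["Becca", "23"]]))
  putStrLn (report (Left "could not parse"))
```

Because head and !! fail on records that are too short, real input is usually filtered or pattern matched first, as findOldest does in the recipe.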
