Author Topic: CSV Line Formatting Function  (Read 2231 times)

support

  • Administrator
  • *****
  • Posts: 843
    • Script BASIC Open Source Project
CSV Line Formatting Function
« on: November 10, 2014, 01:27:42 PM »
This Script BASIC function makes it easy to quickly format raw CSV (comma-separated values) files that has been a standard way of normalizing data for export / import. While searching for sample CSV data to play with I came across a site that posted the Sacramento crime log for January 1st, 2006 (24 hour period)

FYI: You can use either of the numeric formatting masks Script BASIC offers (BASIC or C style) and if you leave out the format info for a column, it will be ignored. This function will be added to the T.bas (Tools) extension module include file for the 2.2 release.



Code: Script BASIC
  1. ' result = FormatLine(in_str, fmt_str, quo_char, num_spc) Note: num_spc = -1 uses TAB
  2.  
  3. FUNCTION FormatLine(ln,fmtstr,qc,nsp)
  4.   SPLITAQ ln BY "," QUOTE qc TO col
  5.   SPLITA fmtstr BY "|" TO fmtcmd
  6.   rs = ""
  7.   FOR x = 0 to UBOUND(col)
  8.     SPLITA fmtcmd[x] BY ":" TO fmt
  9.     IF fmt[0] = "L" THEN
  10.       tmp = LEFT(col[x] & STRING(fmt[1]," "),fmt[1])
  11.       GOSUB Margin
  12.     ELSE IF fmt[0] = "R" THEN
  13.       IF fmt[2] <> undef THEN
  14.         tmp = FORMAT(fmt[2],col[x])
  15.       ELSE
  16.         tmp = col[x]
  17.       END IF
  18.       tmp = RIGHT(STRING(fmt[1]," ") & tmp, fmt[1])
  19.       GOSUB Margin
  20.     ELSE IF fmt[0] = "C" THEN
  21.       pad = fmt[1] - LEN(col[x])
  22.       pboth = pad \ 2
  23.       prt = pad % 2
  24.       tmp = STRING(pboth," ") & col[x] & STRING(pboth," ") & STRING(prt," ")
  25.       GOSUB Margin
  26.     END IF
  27.   NEXT
  28.   GOTO Done
  29.  
  30.   Margin:
  31.   IF nsp = -1 THEN
  32.     tmp &= "\t"
  33.   ELSE
  34.     tmp &= STRING(nsp," ")
  35.   END IF
  36.   rs &= tmp  
  37.   RETURN
  38.  
  39.   Done:
  40.   FormatLine = rs
  41. END FUNCTION
  42.  
  43. OPEN "SacramentocrimeJanuary2006.csv" FOR INPUT AS #1
  44. OPEN "sac.fmt" FOR OUTPUT AS #2
  45. fmtstr = "L:15|L:30|R:4|L:4|R:6|L:35|L:6|R:10:%~-##0.0000~|R:10:%~-##0.0000~"
  46. LINE INPUT #1, hdr
  47. WHILE NOT EOF(1)
  48.   LINE INPUT #1, csvln
  49.   csvln = CHOMP(csvln)
  50.   PRINT #2, FormatLine(csvln,fmtstr,"",2),"\n"
  51. WEND  
  52.  
  53. CLOSE(1)
  54. CLOSE(2)
  55.  

Output (7584 rows)

jrs@laptop:~/sb/sb22/test$ time scriba fmtline.sb

real   0m0.454s
user   0m0.415s
sys    0m0.036s
jrs@laptop:~/sb/sb22/test$

Code: [Select]
1/1/06 0:00      3108 OCCIDENTAL DR                 3  3C      1115  10851(A)VC TAKE VEH W/O OWNER        2404       38.5504   -121.3914 
1/1/06 0:00      2082 EXPEDITION WAY                5  5A      1512  459 PC  BURGLARY RESIDENCE           2204       38.4735   -121.4902 
1/1/06 0:00      4 PALEN CT                         2  2A       212  10851(A)VC TAKE VEH W/O OWNER        2404       38.6578   -121.4621 
1/1/06 0:00      22 BECKFORD CT                     6  6C      1443  476 PC PASS FICTICIOUS CHECK         2501       38.5068   -121.4270 
1/1/06 0:00      3421 AUBURN BLVD                   2  2A       508  459 PC  BURGLARY-UNSPECIFIED         2299       38.6374   -121.3846 
« Last Edit: November 10, 2014, 09:03:55 PM by support »
Script BASIC Project Manager

support

  • Administrator
  • *****
  • Posts: 843
    • Script BASIC Open Source Project
CSV to SQLite3 function
« Reply #1 on: November 12, 2014, 11:56:27 AM »
I thought I would take this another step and create a CSV2SQL function for the T.bas (Tools) extension module include file. This is just my proof of concept attempt and I'll make a function call out of the following example for the 2.2 release.

I attached an example of the final CSV2SQL SUB running on Android Linux native.

Code: Script BASIC
  1. IMPORT sqlite.bas
  2.  
  3. OPEN "SacramentocrimeJanuary2006.csv" FOR INPUT AS #1
  4. db = sqlite::open("sac116.db")
  5. fmtstr = "SSISISIRR"
  6. LINE INPUT #1, hdr
  7. hdr = CHOMP(hdr)
  8. SPLITA hdr BY "," TO col
  9. SPLITA fmtstr BY "" TO typ
  10. lastcol = UBOUND(col)
  11. sql = "CREATE TABLE crime ("
  12. FOR x = 0 TO lastcol
  13.   tmp = ""
  14.   IF typ[x] = "S" THEN
  15.     tstr = " TEXT"
  16.   ELSE IF typ[x] = "I" THEN
  17.     tstr = " INTEGER"
  18.   ELSE IF typ[x] = "R" THEN
  19.     tstr = " REAL"
  20.   END IF
  21.   tmp &= col[x] & tstr
  22.   IF x <> lastcol THEN tmp &= ", "
  23.   sql &= tmp
  24. NEXT
  25. sql &= ");"
  26. sqlite::execute(db, sql)
  27. sqlite::execute(db, "BEGIN TRANSACTION")
  28. WHILE NOT EOF(1)
  29.   sql = "INSERT INTO crime VALUES ("
  30.   LINE INPUT #1, csvln
  31.   csvln = CHOMP(csvln)
  32.   SPLITAQ csvln BY "," QUOTE "" TO col
  33.   FOR x = 0 TO lastcol
  34.     IF typ[x] = "S" THEN
  35.       tmp = "'" & col[x] & "'"
  36.     ELSE
  37.       tmp = col[x]
  38.     END IF
  39.     IF x <> lastcol THEN tmp &= ", "
  40.     sql &= tmp
  41.   NEXT
  42.   sql &= ");"
  43.   sqlite::execute(db, sql)
  44. WEND
  45. sqlite::execute(db, "END TRANSACTION")
  46. sqlite::close(db)
  47. CLOSE(1)
  48.  

Output

jrs@laptop:~/sb/sb22/test$ time scriba csv2sql.sb

real   0m0.763s
user   0m0.457s
sys    0m0.016s
jrs@laptop:~/sb/sb22/test$ sqlite3
SQLite version 3.8.2 2013-12-06 14:53:30
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> .open sac116.db
sqlite> SELECT COUNT(*) FROM crime;
7584
sqlite> .q
jrs@laptop:~/sb/sb22/test$

« Last Edit: November 13, 2014, 06:50:01 PM by support »
Script BASIC Project Manager

support

  • Administrator
  • *****
  • Posts: 843
    • Script BASIC Open Source Project
SQLite/CSV Line Formatting Function
« Reply #2 on: November 13, 2014, 10:39:05 PM »
I modified the fmtline() function to support either CSV or SQLite3 rows.

Code: Script BASIC
  1. ' result = FormatLine(in_str/array, fmt_str, quo_char, num_spc) Note: num_spc = -1 uses TAB
  2.  
  3. IMPORT sqlite.bas
  4.  
  5. FUNCTION FormatLine(ln,fmtstr,qc,nsp)
  6.   IF ISARRAY(ln) THEN
  7.     col = ln
  8.   ELSE
  9.     SPLITAQ ln BY "," QUOTE qc TO col
  10.   END IF
  11.      
  12.   SPLITA fmtstr BY "|" TO fmtcmd
  13.   rs = ""
  14.   FOR x = 0 to UBOUND(col)
  15.     SPLITA fmtcmd[x] BY ":" TO fmt
  16.     IF fmt[0] = "L" THEN
  17.       tmp = LEFT(col[x] & STRING(fmt[1]," "),fmt[1])
  18.       GOSUB Margin
  19.     ELSE IF fmt[0] = "R" THEN
  20.       IF fmt[2] <> undef THEN
  21.         tmp = FORMAT(fmt[2],col[x])
  22.       ELSE
  23.         tmp = col[x]
  24.       END IF
  25.       tmp = RIGHT(STRING(fmt[1]," ") & tmp, fmt[1])
  26.       GOSUB Margin
  27.     ELSE IF fmt[0] = "C" THEN
  28.       pad = fmt[1] - LEN(col[x])
  29.       pboth = pad \ 2
  30.       prt = pad % 2
  31.       tmp = STRING(pboth," ") & col[x] & STRING(pboth," ") & STRING(prt," ")
  32.       GOSUB Margin
  33.     END IF
  34.   NEXT
  35.   GOTO Done
  36.  
  37.   Margin:
  38.   IF nsp = -1 THEN
  39.     tmp &= "\t"
  40.   ELSE
  41.     tmp &= STRING(nsp," ")
  42.   END IF
  43.   rs &= tmp  
  44.   RETURN
  45.  
  46.   Done:
  47.   FormatLine = rs
  48. END FUNCTION
  49.  
  50. db = sqlite::open("sac16.db")
  51. stmt = sqlite::query(db,"SELECT * FROM crime LIMIT 1")
  52. sqlite::row(stmt)
  53. sqlite::FetchArray(stmt,columns)
  54. fmtstr = "L:15|L:30|R:4|L:4|R:6|L:35|L:6|R:10:%~-##0.0000~|R:10:%~-##0.0000~"
  55. PRINT FormatLine(columns,fmtstr,"",2),"\n"
  56. sqlite::close(db)
  57.  

Output
Code: [Select]
jrs@laptop:~/sb/sb22/test$ scriba fmtsqlrow.sb
1/1/06 0:00      3108 OCCIDENTAL DR                 3  3C      1115  10851(A)VC TAKE VEH W/O OWNER        2404       38.5504   -121.3914 
jrs@laptop:~/sb/sb22/test$
Script BASIC Project Manager