• The Four Hundred
  • Subscribe
  • Media Kit
  • Contributors
  • About Us
  • Contact
Menu
  • The Four Hundred
  • Subscribe
  • Media Kit
  • Contributors
  • About Us
  • Contact
  • SQL Quickly and Dirtily Extracts a Field from a CSV File

    October 22, 2008 Ted Holt

    A colleague and I were recently tracking the way data progresses through a poorly documented information system. We were trying to determine at which point in the process a certain field’s decimal positions were being discarded. We began at the beginning, which happened to be a CSV file from a PC-based system. How could we quickly determine whether or not the CSV file had values in the decimal positions of the seventh field?

    We used SQL. I’ll provide a simple illustration. Here are the commands, in case you want to try this yourself.

    Let’s first create a CSV file we can play with. Run this command from the CL command line:

    crtpf qtemp/custcsv rcdlen(80)
    

    Next, load some data into the CSV file.

    INSERT INTO QTEMP/CUSTCSV
    SELECT CUSNUM||',"'||trim(LSTNAM)||'","'||INIT||'","'|| 
     trim(STREET)||'","'||trim(CITY)||'","'||STATE||'",'||ZIPCOD||
     ','||trim(char(CDTLMT))||','||CHGCOD||','|| 
          trim(char(BALDUE))||','||trim(char(CDTDUE)) 
     FROM qiws/qcustcdt 
    

    The example CSV file, QTEMP/CUSTCSV, looks like this:

    938472,"Henning","G K","4859 Elm Ave","Dallas","TX",75217,5000,3,37.00,.00 
    839283,"Jones","B D","21B NW 135 St","Clay","NY",13041,400,1,100.00,.00    
    392859,"Vine","S S","PO Box 79","Broton","VT",5046,700,1,439.00,.00        
    938485,"Johnson","J A","3 Alpine Way","Helen","GA",30545,9999,2,3987.50,33.50
    397267,"Tyron","W E","13 Myrtle Dr","Hector","NY",14841,1000,1,.00,.00     
    389572,"Stevens","K L","208 Snow Pass","Denver","CO",80226,400,1,58.75,1.50
    846283,"Alison","J S","787 Lake Dr","Isle","MN",56342,5000,3,10.00,.00     
    475938,"Doe","J W","59 Archer Rd","Sutter","CA",95685,700,2,250.00,100.00  
    693829,"Thomas","A N","3 Dove Circle","Casper","WY",82609,9999,2,.00,.00   
    593029,"Williams","E D","485 SE 2 Ave","Dallas","TX",75218,200,1,25.00,.00 
    192837,"Lee","F L","5963 Oak St","Hector","NY",14841,700,2,489.50,.50      
    583990,"Abraham","M T","392 Mill St","Isle","MN",56342,9999,3,500.00,.00   
    

    Now, find the records that have a lowercase “e” in the second position of the city field, which is the fifth field.

    with t1 as (select rrn(custcsv) as rrrn,                        
                substr(custcsv,locate(',',custcsv)+1) as x          
                from qtemp/custcsv),                                
    t2 as (select rrrn, substr(x,locate(',',x)+1) as x from t1),    
    t3 as (select rrrn, substr(x,locate(',',x)+1) as x from t2),    
    t4 as (select rrrn, substr(x,locate(',',x)+1) as x from t3),    
    t5 as (Select rrrn, substr(x,1,locate(',',x)-1) as city from t4)
    select * from t5                                                
    where city like '"_e%'                                          
    

    Here’s the result set.

    RRRN   CITY    
       4   "Helen" 
       5   "Hector"
       6   "Denver"
      11   "Hector"
    

    So, how does it work?

    Common table expression T1 extracts everything that follows the first comma, which is the second field and following. It also gets the relative record number of the original file.

    with t1 as (select rrn(custcsv) as rrrn, 
                substr(custcsv,locate(',',custcsv)+1) as x
                from qtemp/custcsv), 
    

    Common table expression T2 carries along the relative record number, and extracts everything that follows the first comma of field T1, which is the third field and the following:

    t2 as (select rrrn, substr(x,locate(',',x)+1) as x from t1), 
    

    Common table expressions T3 and T4 work like T2, carrying along the relative record number of the original file and peeling one field from the front of the CSV data.

    Common table expression T5 carries along the relative record number and extracts the first field of T4, which is the fifth field of CUSTCSV.

    t5 as (Select rrrn, substr(x,1,locate(',',x)-1) as city from t4)
    

    All that remains to be done is select the desired records.

    select * from t5 
    where city like '"_e%' 
    

    I’d hesitate to use this technique in production, but for quick and dirty data analysis, it worked great for us.

    My colleague and I determined that the decimal positions were in the CSV file, and were able to continue with our analysis.



                         Post this story to del.icio.us
                   Post this story to Digg
        Post this story to Slashdot

    Share this:

    • Reddit
    • Facebook
    • LinkedIn
    • Twitter
    • Email

    Tags:

    Sponsored by
    Manta Technologies

    The Leader in IBM i Education!
    Need training on anything i?
    Manta is all you need.

    130 courses and competency exams on:
    · IBM i operations
    · System Management and Security
    · IBM i Programming Tools
    · Programming in RPG, COBOL, CL, Java
    · Web Development

    SQL, DB2, QueryProduct features:
    · Runs in every popular browser
    · Available 24/7/365
    · Free Student Reference Guides
    · Free Student Administration
    · Concurrent User License
    · Built-In IBM i Simulator

    You can download our 200-page catalog and take sample sessions at MantaTech.com

    Share this:

    • Reddit
    • Facebook
    • LinkedIn
    • Twitter
    • Email

    Sponsored Links

    MKS:  FREE white paper: From WDSC to RDi. Making Software Change Easier with MKS Integrity for IBM i
    Vibrant Technologies:  The leading source for IBM Power Systems and Upgrades
    COMMON:  Join us at the annual 2009 conference, April 26 - 30, in Reno, Nevada

    IT Jungle Store Top Book Picks

    Easy Steps to Internet Programming for AS/400, iSeries, and System i: List Price, $49.95
    Getting Started with PHP for i5/OS: List Price, $59.95
    The System i RPG & RPG IV Tutorial and Lab Exercises: List Price, $59.95
    The System i Pocket RPG & RPG IV Guide: List Price, $69.95
    The iSeries Pocket Database Guide: List Price, $59.00
    The iSeries Pocket Developers' Guide: List Price, $59.00
    The iSeries Pocket SQL Guide: List Price, $59.00
    The iSeries Pocket Query Guide: List Price, $49.00
    The iSeries Pocket WebFacing Primer: List Price, $39.00
    Migrating to WebSphere Express for iSeries: List Price, $49.00
    iSeries Express Web Implementer's Guide: List Price, $59.00
    Getting Started with WebSphere Development Studio for iSeries: List Price, $79.95
    Getting Started With WebSphere Development Studio Client for iSeries: List Price, $89.00
    Getting Started with WebSphere Express for iSeries: List Price, $49.00
    WebFacing Application Design and Development Guide: List Price, $55.00
    Can the AS/400 Survive IBM?: List Price, $49.00
    The All-Everything Machine: List Price, $29.95
    Chip Wars: List Price, $29.95

    InfoPrint Launches a Product for Automated Document Factories IBM’s Q3 in Servers, Redux: The i and p Platforms Do OK

    Leave a Reply Cancel reply

Volume 8, Number 36 -- October 22, 2008
THIS ISSUE SPONSORED BY:

Help/Systems
WorksRight Software
Aldon

Table of Contents

  • Visual Explain for Run SQL Scripts
  • SQL Quickly and Dirtily Extracts a Field from a CSV File
  • Healing Failed Windows-i5/OS FTP Transfers

Content archive

  • The Four Hundred
  • Four Hundred Stuff
  • Four Hundred Guru

Recent Posts

  • IBM Unveils Manzan, A New Open Source Event Monitor For IBM i
  • Say Goodbye To Downtime: Update Your Database Without Taking Your Business Offline
  • i-Rays Brings Observability To IBM i Performance Problems
  • Another Non-TR “Technology Refresh” Happens With IBM i TR6
  • IBM i PTF Guide, Volume 27, Number 18
  • Will The Turbulent Economy Downdraft IBM Systems Or Lift It?
  • How IBM Improved The Database With IBM i 7.6
  • Rocket Celebrates 35th Anniversary As Private Equity Owner Ponders Sale
  • 50 Acres And A Humanoid Robot With An AI Avatar
  • IBM i PTF Guide, Volume 27, Number 17

Subscribe

To get news from IT Jungle sent to your inbox every week, subscribe to our newsletter.

Pages

  • About Us
  • Contact
  • Contributors
  • Four Hundred Monitor
  • IBM i PTF Guide
  • Media Kit
  • Subscribe

Search

Copyright © 2025 IT Jungle