• The Four Hundred
  • Subscribe
  • Media Kit
  • Contributors
  • About Us
  • Contact
Menu
  • The Four Hundred
  • Subscribe
  • Media Kit
  • Contributors
  • About Us
  • Contact
  • SQL Quickly and Dirtily Extracts a Field from a CSV File

    October 22, 2008 Ted Holt

    A colleague and I were recently tracking the way data progresses through a poorly documented information system. We were trying to determine at which point in the process a certain field’s decimal positions were being discarded. We began at the beginning, which happened to be a CSV file from a PC-based system. How could we quickly determine whether or not the CSV file had values in the decimal positions of the seventh field?

    We used SQL. I’ll provide a simple illustration. Here are the commands, in case you want to try this yourself.

    Let’s first create a CSV file we can play with. Run this command from the CL command line:

    crtpf qtemp/custcsv rcdlen(80)
    

    Next, load some data into the CSV file.

    INSERT INTO QTEMP/CUSTCSV
    SELECT CUSNUM||',"'||trim(LSTNAM)||'","'||INIT||'","'|| 
     trim(STREET)||'","'||trim(CITY)||'","'||STATE||'",'||ZIPCOD||
     ','||trim(char(CDTLMT))||','||CHGCOD||','|| 
          trim(char(BALDUE))||','||trim(char(CDTDUE)) 
     FROM qiws/qcustcdt 
    

    The example CSV file, QTEMP/CUSTCSV, looks like this:

    938472,"Henning","G K","4859 Elm Ave","Dallas","TX",75217,5000,3,37.00,.00 
    839283,"Jones","B D","21B NW 135 St","Clay","NY",13041,400,1,100.00,.00    
    392859,"Vine","S S","PO Box 79","Broton","VT",5046,700,1,439.00,.00        
    938485,"Johnson","J A","3 Alpine Way","Helen","GA",30545,9999,2,3987.50,33.50
    397267,"Tyron","W E","13 Myrtle Dr","Hector","NY",14841,1000,1,.00,.00     
    389572,"Stevens","K L","208 Snow Pass","Denver","CO",80226,400,1,58.75,1.50
    846283,"Alison","J S","787 Lake Dr","Isle","MN",56342,5000,3,10.00,.00     
    475938,"Doe","J W","59 Archer Rd","Sutter","CA",95685,700,2,250.00,100.00  
    693829,"Thomas","A N","3 Dove Circle","Casper","WY",82609,9999,2,.00,.00   
    593029,"Williams","E D","485 SE 2 Ave","Dallas","TX",75218,200,1,25.00,.00 
    192837,"Lee","F L","5963 Oak St","Hector","NY",14841,700,2,489.50,.50      
    583990,"Abraham","M T","392 Mill St","Isle","MN",56342,9999,3,500.00,.00   
    

    Now, find the records that have a lowercase “e” in the second position of the city field, which is the fifth field.

    with t1 as (select rrn(custcsv) as rrrn,                        
                substr(custcsv,locate(',',custcsv)+1) as x          
                from qtemp/custcsv),                                
    t2 as (select rrrn, substr(x,locate(',',x)+1) as x from t1),    
    t3 as (select rrrn, substr(x,locate(',',x)+1) as x from t2),    
    t4 as (select rrrn, substr(x,locate(',',x)+1) as x from t3),    
    t5 as (Select rrrn, substr(x,1,locate(',',x)-1) as city from t4)
    select * from t5                                                
    where city like '"_e%'                                          
    

    Here’s the result set.

    RRRN   CITY    
       4   "Helen" 
       5   "Hector"
       6   "Denver"
      11   "Hector"
    

    So, how does it work?

    Common table expression T1 extracts everything that follows the first comma, which is the second field and following. It also gets the relative record number of the original file.

    with t1 as (select rrn(custcsv) as rrrn, 
                substr(custcsv,locate(',',custcsv)+1) as x
                from qtemp/custcsv), 
    

    Common table expression T2 carries along the relative record number, and extracts everything that follows the first comma of field T1, which is the third field and the following:

    t2 as (select rrrn, substr(x,locate(',',x)+1) as x from t1), 
    

    Common table expressions T3 and T4 work like T2, carrying along the relative record number of the original file and peeling one field from the front of the CSV data.

    Common table expression T5 carries along the relative record number and extracts the first field of T4, which is the fifth field of CUSTCSV.

    t5 as (Select rrrn, substr(x,1,locate(',',x)-1) as city from t4)
    

    All that remains to be done is select the desired records.

    select * from t5 
    where city like '"_e%' 
    

    I’d hesitate to use this technique in production, but for quick and dirty data analysis, it worked great for us.

    My colleague and I determined that the decimal positions were in the CSV file, and were able to continue with our analysis.



                         Post this story to del.icio.us
                   Post this story to Digg
        Post this story to Slashdot

    Share this:

    • Reddit
    • Facebook
    • LinkedIn
    • Twitter
    • Email

    Tags:

    Sponsored by
    VISUAL LANSA 16 WEBINAR

    Trying to balance stability and agility in your IBM i environment?

    Join this webinar and explore Visual LANSA 16 – our enhanced professional low-code platform designed to help organizations running on IBM i evolve seamlessly for what’s next.

    🎙️VISUAL LANSA 16 WEBINAR

    Break Monolithic IBM i Applications and Unlock New Value

    Explore modernization without rewriting. Decouple monolithic applications and extend their value through integration with modern services, web frameworks, and cloud technologies.

    🗓️ July 10, 2025

    ⏰ 9 AM – 10 AM CDT (4 PM to 5 PM CEST)

    See the webinar schedule in your time zone

    Register to join the webinar now

    What to Expect

    • Get to know Visual LANSA 16, its core features, latest enhancements, and use cases
    • Understand how you can transition to a MACH-aligned architecture to enable faster innovation
    • Discover native REST APIs, WebView2 support, cloud-ready Azure licensing, and more to help transform and scale your IBM i applications

    Read more about V16 here.

    Share this:

    • Reddit
    • Facebook
    • LinkedIn
    • Twitter
    • Email

    Sponsored Links

    MKS:  FREE white paper: From WDSC to RDi. Making Software Change Easier with MKS Integrity for IBM i
    Vibrant Technologies:  The leading source for IBM Power Systems and Upgrades
    COMMON:  Join us at the annual 2009 conference, April 26 - 30, in Reno, Nevada

    IT Jungle Store Top Book Picks

    Easy Steps to Internet Programming for AS/400, iSeries, and System i: List Price, $49.95
    Getting Started with PHP for i5/OS: List Price, $59.95
    The System i RPG & RPG IV Tutorial and Lab Exercises: List Price, $59.95
    The System i Pocket RPG & RPG IV Guide: List Price, $69.95
    The iSeries Pocket Database Guide: List Price, $59.00
    The iSeries Pocket Developers' Guide: List Price, $59.00
    The iSeries Pocket SQL Guide: List Price, $59.00
    The iSeries Pocket Query Guide: List Price, $49.00
    The iSeries Pocket WebFacing Primer: List Price, $39.00
    Migrating to WebSphere Express for iSeries: List Price, $49.00
    iSeries Express Web Implementer's Guide: List Price, $59.00
    Getting Started with WebSphere Development Studio for iSeries: List Price, $79.95
    Getting Started With WebSphere Development Studio Client for iSeries: List Price, $89.00
    Getting Started with WebSphere Express for iSeries: List Price, $49.00
    WebFacing Application Design and Development Guide: List Price, $55.00
    Can the AS/400 Survive IBM?: List Price, $49.00
    The All-Everything Machine: List Price, $29.95
    Chip Wars: List Price, $29.95

    InfoPrint Launches a Product for Automated Document Factories IBM’s Q3 in Servers, Redux: The i and p Platforms Do OK

    Leave a Reply Cancel reply

Volume 8, Number 36 -- October 22, 2008
THIS ISSUE SPONSORED BY:

Help/Systems
WorksRight Software
Aldon

Table of Contents

  • Visual Explain for Run SQL Scripts
  • SQL Quickly and Dirtily Extracts a Field from a CSV File
  • Healing Failed Windows-i5/OS FTP Transfers

Content archive

  • The Four Hundred
  • Four Hundred Stuff
  • Four Hundred Guru

Recent Posts

  • Liam Allan Shares What’s Coming Next With Code For IBM i
  • From Stable To Scalable: Visual LANSA 16 Powers IBM i Growth – Launching July 8
  • VS Code Will Be The Heart Of The Modern IBM i Platform
  • The AS/400: A 37-Year-Old Dog That Loves To Learn New Tricks
  • IBM i PTF Guide, Volume 27, Number 25
  • Meet The Next Gen Of IBMers Helping To Build IBM i
  • Looks Like IBM Is Building A Linux-Like PASE For IBM i After All
  • Will Independent IBM i Clouds Survive PowerVS?
  • Now, IBM Is Jacking Up Hardware Maintenance Prices
  • IBM i PTF Guide, Volume 27, Number 24

Subscribe

To get news from IT Jungle sent to your inbox every week, subscribe to our newsletter.

Pages

  • About Us
  • Contact
  • Contributors
  • Four Hundred Monitor
  • IBM i PTF Guide
  • Media Kit
  • Subscribe

Search

Copyright © 2025 IT Jungle