fhg
Volume 10, Number 4 -- January 27, 2010

A Few Excel Export to CSV Tips

Published: January 27, 2010

by Michael Sansoterra

Comma separated variable (CSV) files are frequently used when exporting DB2 data for use with Microsoft Excel. However, because CSV files contain text without any additional formatting instructions, it can be somewhat time-consuming for users to format their worksheets whenever they get a new export. This tip will address a few of the shortcomings of using CSV files with Excel and present a few possible workarounds to make the lives of your Excel user community easier.

Many of these ideas are only applicable to CSV exports intended for Excel usage. If the export file is dual purpose (one for users and another for a computer system), the ideas may not be applicable. Also, many of these tips will only work if you have the ability to manipulate the data programmatically (with RPG, COBOL, SQL, etc.) before or during the creation of the CSV file.

(Note: If you're comfortable with Java and are looking for a flexible, programmatic solution for creating native Excel worksheets complete with formatting, check out the Apache POI project, which is called the Java API for Microsoft Documents. This site contains Java classes that enable developers to read and write native Excel and other Microsoft Office file formats.)

Display Issues with Numbers

It is quite common for CSV export files to contain text columns that contain numeric data. Unfortunately, Excel pays no attention to the fact that these columns should be treated as text fields and instead formats the data as it sees fit. Two often witnessed problems with these "text" columns are: the display of large numbers in scientific notation, and the removal of leading zeros that are intended for display.

For example, a large GL account number such as "1000084002402" will display something like 1E+12 or 1.00008E+12 (depending on the width of the column). This scientific notation format is a pain to read and users end up re-formatting the column as text.

Zip codes are an example of string data that may contain leading zeros. For instance, some Massachusetts Zip codes have a leading zero: 02351. But there's nothing more annoying than these Zip codes showing up in a spreadsheet without the leading zero!

The solution to resolve both these issues is to simply prefix a single quote (') in front of the data column in the CSV file. So if you want Excel to leave a number like this alone:

"001236412"

Simply put it in the CSV file as:

"'001236412"

This will tell Excel to display the number as is.

Display Issues with Embedded Double Quotes

In CSV files, the most common string delimiter is the double quote ("). An item description within a CSV file may look like this:

"Red Bicycle"

A description for 12-inch PVC pipe might look like this:

"12" PVC Pipe"

Having the delimiter embedded within the description causes Excel to misinterpret where the data for the column begins and ends. In the above PVC description, Excel will interpret the data as two columns: a column with "12" and another column containing "PVC Pipe."

One way to fix this problem is to replace the double quote with a rare substitute character such as a tilde when creating the CSV file. However, once the user opens the CSV file in Excel, he will have to do a replace operation (use Control+H) to replace the substitute character with the double quote in order to return things to normal.

A better solution is to "double up" each double quote within the data itself. Excel will interpret this "doubled up" quote as one quote character instead of a delimiter:

"12" PVC Pipe"

If you have the liberty to use SQL to manipulate your data table before exporting it to CSV, this operation becomes a piece of cake by simply using the REPLACE built-in function with an UPDATE statement:

Update MyTable
   Set MyData=Replace(MyData, '"', '""')

The one caveat here is that the MyData column definition may need to be expanded in order to hold a few additional characters for the storage of the extra quotes.

Trimming Spaces

Extraneous spaces in CSV files make it cumbersome for a user to work with the data. No one likes editing a cell only to have to backspace over a large number of trailing spaces. Trimming extraneous spaces will save file space and reduce some drudgery for users.

If you create CSV files using the CPYTOIMPF command, don't overlook the fact that this command has a "remove blank parameter" for trimming leading and/or trailing spaces: RMVBLANK(*BOTH).

Formatting Column Widths

One big pet peeve I have when working with CSV files in Excel is that they usually need some sort of formatting when they're first opened. One of the most time consuming tasks is sizing the columns to fit the data.

An easy way to automatically size the columns once the spreadsheet is open is to click on the handle that sits in the upper-left corner of the spreadsheet (to the left of the A column and above the first row) as shown in Figure 1.


Figure 1--Locate the handle in the upper-left corner of the spreadsheet.

Single clicking on this handle will highlight the entire spreadsheet. Then, let your mouse hover on the line that separates the A and B column headers. When hovering over this line, the mouse icon will change to the crosshair symbol. When it does, simply double-click the line between the two column headers and watch all the columns in the spreadsheet dynamically resize to fit the data. Keep in mind the entire sheet must be selected for this behavior to work for all columns. This trick also works for resizing all rows: simply repeat the process for the columns except double-click the line between the indicators for row one and two.

Implementing these few tips can add up to substantial time savings for users as they'll be spending much less time trying to fix up their spreadsheets.


Michael Sansoterra is a DBA for Broadway Systems in Grand Rapids, Michigan. Send your questions or comments for Mike via the IT Jungle Contact page.




                     Post this story to del.icio.us
               Post this story to Digg
    Post this story to Slashdot


Sponsored By
WORKSRIGHT SOFTWARE

Do you need area code information?
Do you need ZIP Code information?
Do you need ZIP+4 information?
Do you need city name information?
Do you need county information?
Do you need a nearest dealer locator system?

We can HELP! We have affordable AS/400 software and data to do all of the above. Whether you need a simple city name retrieval system or a sophisticated CASS postal coding system, we have it for you!

The ZIP/CITY system is based on 5-digit ZIP Codes. You can retrieve city names, state names, county names, area codes, time zones, latitude, longitude, and more just by knowing the ZIP Code. We supply information on all the latest area code changes. A nearest dealer locator function is also included. ZIP/CITY includes software, data, monthly updates, and unlimited support. The cost is $495 per year.

PER/ZIP4 is a sophisticated CASS certified postal coding system for assigning ZIP Codes, ZIP+4, carrier route, and delivery point codes. PER/ZIP4 also provides county names and FIPS codes. PER/ZIP4 can be used interactively, in batch, and with callable programs. PER/ZIP4 includes software, data, monthly updates, and unlimited support. The cost is $3,900 for the first year, and $1,950 for renewal.

Just call us and we'll arrange for 30 days FREE use of either
ZIP/CITY or PER/ZIP4.

WorksRight Software, Inc.
Phone: 601-856-8337
Fax: 601-856-9432
E-mail: software@worksright.com
Web site: www.worksright.com


Senior Technical Editor: Ted Holt
Technical Editor: Joe Hertvik
Contributing Technical Editors: Erwin Earley, Brian Kelly, Michael Sansoterra
Publisher and Advertising Director: Jenny Thomas
Advertising Sales Representative: Kim Reed
Contact the Editors: To contact anyone on the IT Jungle Team
Go to our contacts page and send us a message.

Sponsored Links

Help/Systems:  Event-driven job scheduling for UNIX, Linux, Windows & IBM i servers
LANSA:  Transport your apps to a new dimension with RAMP. FREE Webinar!
COMMON:  Join us at the annual 2010 conference, May 3 - 6, in Orlando, Florida


 

IT Jungle Store Top Book Picks

Easy Steps to Internet Programming for AS/400, iSeries, and System i: List Price, $49.95
The iSeries Express Web Implementer's Guide: List Price, $49.95
The System i RPG & RPG IV Tutorial and Lab Exercises: List Price, $59.95
The System i Pocket RPG & RPG IV Guide: List Price, $69.95
The iSeries Pocket Database Guide: List Price, $59.00
The iSeries Pocket SQL Guide: List Price, $59.00
The iSeries Pocket Query Guide: List Price, $49.00
The iSeries Pocket WebFacing Primer: List Price, $39.00
Migrating to WebSphere Express for iSeries: List Price, $49.00
Getting Started With WebSphere Development Studio Client for iSeries: List Price, $89.00
Getting Started with WebSphere Express for iSeries: List Price, $49.00
Can the AS/400 Survive IBM?: List Price, $49.00
Chip Wars: List Price, $29.95


 
The Four Hundred
IBM Preps Power7 Launch For February

Looks Like i 7.1 Is Coming In April

The IBM Profit Engine Keeps A-Rolling in Q4

As I See It: What Did You Do At Work Today, Daddy?

The System iWant, 2010 Edition: Midrange Boxes

Four Hundred Stuff
Tripwire Rides Log Management Gig into SIEM Business

LANSA Likes Its Chances as Wal-Mart Ramps Up GDSN Mandate

CCSS Adds MIMIX Monitoring to i/OS Systems Management Suite

JobQGenie Now Protects Job Queue Contents Through IPLs

LTO 5 Speed, Capacity Lower Than Expected

Four Hundred Monitor
Four Hundred Monitor's
Full iSeries Events Calendar

System i PTF Guide
January 23, 2010: Volume 12, Number 04

January 16, 2010: Volume 12, Number 03

January 9, 2010: Volume 12, Number 02

January 2, 2010: Volume 12, Number 01

December 26, 2009: Volume 11, Number 52

December 19, 2009: Volume 11, Number 51

TPM at The Register
Voltaire pairs InfiniBand and Ethernet

Red Hat sponsors open source religion

VMware profits pinched in Q4

IT spending to rise in 2010, says Gartner

Platform and Calypso serve risk analysis to traders

Fujitsu puts systems chief at helm

Schwartz puts comforting arm around stricken Sun

Oracle: Mine is bigger and, um, more integrated

Liquid Computing evaporates staff

IBM buys spook-riddled DC services expert

IBM's Power7 servers imminent

China picks MIPS for super-duper super

THIS ISSUE SPONSORED BY:

ProData Computer Services
WorksRight Software
Halcyon Software


Printer Friendly Version


TABLE OF CONTENTS
Creating Yes/No Fields in SQL Queries

A Few Excel Export to CSV Tips

Admin Alert: Speeding Up i5/OS Access Path Rebuilds

Four Hundred Guru

BACK ISSUES




 
Subscription Information:
You can unsubscribe, change your email address, or sign up for any of IT Jungle's free e-newsletters through our Web site at http://www.itjungle.com/sub/subscribe.html.

Copyright © 1996-2010 Guild Companies, Inc. All Rights Reserved.
Guild Companies, Inc., 50 Park Terrace East, Suite 8F, New York, NY 10034

Privacy Statement