Newsletters Subscriptions Media Kit About Us Contact Search Home

Stuff
OS/400 Edition
Volume 2, Number 18 -- September 11, 2003

Cast Your Data Conversion Troubles Away


by Kevin Vandever

Have you ever attempted to download data from your iSeries to an Excel spreadsheet or some other PC format? Are you involved with a complete data conversion from the iSeries to some other platform? Whatever the case, depending on how the character data is defined, you might have difficulty converting it. When conversion issues arise, what do you do? Panic? Transfer the data over and over, hoping it eventually works? Re-key? I have a better solution.

Character Set

First, I should share with you the reason why you may have these character-conversion issues in the first place. It's probably right in front of you, and you haven't even noticed it. Take a look at any physical file field definition, using the Display File Field Definition (DSPFFD) command.

Notice that under each character field there is something called the coded character set ID (CCSID). In my example, this is set to 37 for each character field. This setting is the key to whether your character data will convert correctly. The CCSID is a cross-system and multinational support mechanism for the management of character information provided using the Character Data Representation Architecture (CDRA). The CDRA is set by IBM to handle a minimum set of characters for cross-platform and cross-national language support. The CCSID is a 16-bit number that defines a specific collection of coding-related information that uniquely identifies a coded character set. For example, the CCSID for the English language is 37. You can see that the character fields defined in my file have all been defined with a CCSID of 37. The CDRA defines the CCSID values to identify classifications used to represent characters and to convert those characters, as needed, to preserve their meaning. DB2 tags character columns with CCSIDs, either explicitly using a data structure definition or implicitly using the job or system. The data is not converted when it is sent to another system; rather, the receiving job converts the data to its own CCSID if it is different. If that receiving job were on a French-based machine, the CCSID would be converted to 297; whereas a Greek-based PC would convert the CCSID to 4965. A CCSID that you will run into, even on a U.S.-based machine, is 65535. A CCSID of 65535 indicates that the data is hex and should not be converted. This is the CCSID that will cause your character data conversion problems, and is the focus of this article.

I know what you're thinking. You've created hundreds, maybe even thousands, of physical files and never had to worry about setting the CCSID. You are correct, and as long as you only convert data from files that you had some control in creating, or were created using default CCSID settings, you can probably quit reading this article and checkout your fantasy football stats, but the truth is that some third-party software vendors code most of their character data with the dreaded 65535. I don't want to mention any names (J.D. Edwards), but I have run into this myself. But, as I mentioned above, there is a solution, and it's pretty simple to use.

SQL Casting

Casting in SQL is used to change one data type to another. It can also be used to change a data type to the same type but with a different length, precision, or scale. Casting can be done implicitly, by the database manager, or explicitly, by using the cast function in an SQL statement. I am going to discuss the explicit casting of a character field in this article, but I invite you to investigate all the other wonders of casting, including supported rules when casting various data types, cast expressions, and implicit casting, as I mentioned above. This information can be found in the SQL reference manual at IBM's Information Center.

Say you have a physical file on the iSeries that for some reason contains character data defined with a CCSID of 65535. As I said before, some software vendors do this on purpose, but you can change it yourself in DDS by using the CCSID keyword at the field level or when creating or altering a table using SQL by setting the CCSID at that time. Anyway, you have your data file and now you want to send this data to your PC. There are a number of ways to accomplish this, and it doesn't matter which one you use, but for now let's just use the data transfer tool that comes with IBM's Client Express. My specific file contains three fields: a numeric ID that is defined as 7 bytes numeric with 0 decimal places. The other two fields are first and last names and are both defined as character 20. However, the first name is defined as the default, CCSID 37, and I defined the last name using CCSID 65535. I won't walk you through all the steps to download a file just yet, but using the basic download tool, when I download this file, my results are shown here:

1KEVIN       E5C1D5C4C5E5C5D9404040404040404040404040
2KATHY       C5D3D3C9D6E3E340404040404040404040404040
3JIM         C3D6D2C5D9404040404040404040404040404040
4BETH        C7C1D3D3C9C8C5D9404040404040404040404040
5CHARLES     C6D3C5E3C3C8C5D9404040404040404040404040
6JIM         C3D9D6C7C1D54040404040404040404040404040
7KATHY       E2C3D6E3E3404040404040404040404040404040
8TED         C3C8D6E640404040404040404040404040404040
9DONALD      C8C1D3D340404040404040404040404040404040
10ELDON      C7D9C1C2C5D4C5E8C5D940404040404040404040
11JOHN       C8C9D3E3D6D54040404040404040404040404040
12JIM        D5C1E2C9E4D44040404040404040404040404040
13JOHN       D3E8D5E4D4404040404040404040404040404040
14TOM        D7C6C5C9D3404040404040404040404040404040
15ANDY       E3C1D5C740404040404040404040404040404040
16SUSIE      E2C9D4D4D6D5E240404040404040404040404040
17SANDY      E6C5E2E3D7C8C1D3404040404040404040404040
18SUZANNE    C4C9C5D740404040404040404040404040404040
19CHRIS      C5E2E3D6D3D6D5C9D64040404040404040404040
20LUPITA     C6C1D1C1D9C4D640404040404040404040404040
21YOUNG      D2C9D44040404040404040404040404040404040
22LEWIS      D3C9C2D4C1D54040404040404040404040404040

Here is where casting comes into play. As I mentioned before, casting is used to change a data type from one type to another or to allow it to remain the same type but change the attributes of that data type. In the example, you don't need to change the data type, length, or precision, but you do need to change the CCSID, and you can do so using the SQL statement, as follows.

SELECT emidnm, emfnam, CAST(emlnam AS CHAR(20) CCSID 37)

The cast function is used to change the CCSID of the employee last name, emlnam, from 65535, which is unconvertible hex, to 37, which is U.S. English. The format is to include the source field--emlnam, in this case--followed by the keyword AS and the list of attributes you want to change. The data type is required; that's why you see that I defined the 20-byte character field as a 20-byte character field. After the required data type comes my CCSID entry. Any other attribute that you want to change would be listed after the data type, in any order. Pretty simple, huh?

There are many ways to transfer data from an iSeries to a PC, and the SQL statement above will work for all of those I've tried. If you're using a tool that builds SQL for you, you can still use this syntax. Let's see how it works using Client Express. From a Client Express 5250 emulation session, select Actions in the menu bar at the top, then select Receive from host. On that first screen, you select the file you want to transfer from the iSeries and the name and location of the "to file" on the PC. You can also select the format of the "to file" and whether you want it created or want to overwrite an existing file. From this screen, press the Data Options button and you will get an SQL build screen. Click here to see the new entry in the Select portion of the display. Press OK on this screen, then transfer away.

The last names are now readable:

1KEVIN                VANDEVER
2KATHY                SMITH
3JIM                  JONES
4BETH                 THOMPSON
5CHARLES              IN CHARGE
6JIM                  SMITH AGAIN
7KATHY                RALPHS
8TED                  DYBEAR
9DONALD               DUCK
10ELDON               SMITH
11JOHN                JONES
12JIM                 NASIUM
13JOHN                ARAHCMEY
14TOM                 JONES
15ANDY                ROONEY
16SUSIE               SUMMERS
17SANDY               BEACH
18SUZANNE             NORTH
19CHRIS               COLUMBUS
20LUPITA              YANG
21YOUNG               TOO
22LEWIS               LEFTY

Changing the CCSID without SQL Casting

If you are uncomfortable with SQL or don't want to modify data transfer scripts, there are a couple other ways to change the CCSID before performing the data transfer. For one, you can use the ALTER TABLE command, within SQL, to change the attributes of a field even if it was created using DDS. However, that doesn't solve the problem of your discomfort with SQL. Another way is to copy the DDS of the file you want to transfer and make sure that each field in question is defined, using the default CCSID. In my case, that meant removing the CCSID keyword entry from the copied DDS. Then compile your copied DDS, and copy the data from the file containing CCSID 65535 data to your new file. Your character data will be converted to CCSID 37. Now you can transfer the data from your iSeries as you normally would.

Cast Away!

You should now have a better understanding of some of the issues behind character data conversions and how to solve them. Casting is a wonderful tool for solving these problems. I've showed you alternatives to fixing the CCSID for data transfers, in case you need them, and they may not look that much more difficult to employ, but the advantage of casting is that it has more features than I discussed in this article and one of those features may also come in handy when converting or transferring data from the iSeries. Enjoy!


Sponsored By
ASNA

Why we chose ASNA Visual RPG for .NET:

"I choose AVR for .NET over WebSphere because learning new programming techniques and methods can be a daunting task, but ASNA makes it much easier with the quality of support and examples that are provided. Add ASNA training and the combination is unbeatable. A HUGE savings in development cycle is very probable, even with the time it will take to become proficient in .Net."
--Dean Bathke, Mid-Continent

Learn more about AVR for .NET today!
www.asna.com/infoavrdotnet.aspx


THIS ISSUE
SPONSORED BY:

Lakeview Technology
T.L. Ashford
ASNA
Profound Logic Software


BACK ISSUES

TABLE OF
CONTENTS
iSeries Access for Web

Data Normalization, Part 3

Cast Your Data Conversion Troubles Away

OS/400 Alert: Nasty Little Viruses


Editors
Shannon O'Donnell
Kevin Vandever

Managing Editor
Shannon Pastore

Contributing Editors:
Howard Arner
Raymond Everhart
Joe Hertvik
Ted Holt
Marc Logemann
David Morris

Publisher and
Advertising Director:

Jenny Thomas

Advertising Sales Representative
Kim Reed

Contact the Editors
Do you have a gripe, inside dope or an opinion?
Email the editors:
editors@itjungle.com


Copyright © 1996-2008 Guild Companies, Inc. All Rights Reserved.