fhg
Volume 9, Number 26 -- August 19, 2009

Validate DBCS-Open Data

Published: August 19, 2009

by Michael Sansoterra

The DBCS open data type is used in many legacy applications as a way to support storage of either single byte character set (SBCS) data, double byte character set (DBCS) data, or even data from both character sets within a single data column. Defined with a data type of O within DDS specs, this data type carries an inherent agnostic coded character set identifier (CCSID) of 65535.

The DBCS open data type operates by storing single byte characters, as traditional single byte character columns do. However, when a DBCS open column stores double byte characters, it uses special embedded escape characters to indicate when DBCS data starts and ends within the data. Within an open DBCS character string, a hex value of X'0E' (decimal 14) signifies a "shift out," or in other words, "the following data should be interpreted as double byte." All character data encountered after the shift out character will be read in two byte pairs.

When the double byte data has ended and the remaining data should be interpreted as single byte again, a hex value of X'0F' (decimal 15) signifies a "shift in," or in other words, "we're back to interpreting single byte character data again."

However, under various circumstances, these DBCS shift-in and shift-out delimiters can become mismatched (usually a final "shift-in" is omitted) and can cause the data to be displayed or interpreted incorrectly.

Detecting these data errors within a DBCS open column can be difficult. To assist with this problem, presented below is a user-defined function named IsDBCSOpenValid that can be used to detect a mismatch between shift-in and shift-out characters within DBCS open data columns. The function simply checks to make sure there are an equal number of shift-in and shift-out characters within the data and that the shift characters are in the proper sequence.

If the number of shift-in and shift-out characters are equal and ordered correctly, the data is valid and the function returns a 'Y' to indicate the data is OK. If the shift-in characters are not equal or out of order the function returns an 'N'.

Create Function IsDBCSOpenValid
(DBCSOpen VarChar(512) CCSID 65535)
Returns Char(1)
Language SQL
Deterministic
Set Option Commit=*None
Begin
    Declare i Int Not Null Default 1;
    Declare ShiftOut Int Not Null Default 0;
    Declare ShiftIn  Int Not Null Default 0; 
    Declare TestChar Char(1) Not Null Default ' ';

    While (i<=Length(DBCSOpen)) Do
        Set TestChar=Substr(DBCSOpen,i,1);
        If TestChar=X'0E' Then
            Set ShiftOut=ShiftOut+1;
            If ShiftOut-ShiftIn>1 Then
                Return 'N';
            End If;
        ElseIf TestChar=X'0F' Then
            Set ShiftIn=ShiftIn+1;
            If ShiftIn>ShiftOut Then
                Return 'N';
            End If;
        End If;
        Set i=i+1;
    End While;

    If ShiftIn=ShiftOut Then
        Return 'Y';
    End If;
    Return 'N';
End;

The UDF can be used in a query as follows to test for invalid data in a table that uses one or more open DBCS columns. The Where clause specifies that only rows containing invalid "DBCS open" data will be returned.

Select *
  From (
Select RRN(A) As Record_No, My_Key_Field,
       DBCSData1,
       IsDBCSOpenValid(DBCSData1) As Test1,
       DBCSData2,
       IsDBCSOpenValid(DBCSData2) As Test2
  From MyDBCSOpenTable A) InnerSelect
 Where Test1='N ' Or Test2='N'

For the record, if you're creating a new application and want to store either single byte or double byte data in the same column, use an eight-bit (if the data will be predominantly SBCS) or 16-bit Unicode columns instead of using the antiquated DBCS open data type (which is definable in DDS only).

In V5R4 Unicode columns are defined as follows:

MyData CHAR(30) CCSID 1208   /* UTF8 - Single Byte Unicode */
MyData GRAPHIC(30) CCSID 1200  /* UTF16 - Double Byte Unicode */

In V6R1, the NCHAR and NVARCHAR data types can be used in place of the GRAPHIC and VARGRAPHIC data types:

MyData NCHAR(30) CCSID 1200  /* UTF16 - Double Byte Unicode */


Michael Sansoterra is a programmer/analyst for i3 Business Solutions, an IT services firm based in Grand Rapids, Michigan. Send your questions or comments to Mike via the IT Jungle Contact page.




                     Post this story to del.icio.us
               Post this story to Digg
    Post this story to Slashdot


Sponsored By
HELP/SYSTEMS

SEQUEL ViewPoint®--Data Access & Analysis
for Power Systems™ Servers

                                               · Easy to use by IT and end users
                                               · Automated data access and display
                                               · Complete BI package: reports, tables,key
                                                  performance indicators, and dashboards
                                               · IBM i-centric for real-time data analysis
                                               · Expert support and training
                                               · Secure data access
                                               · Green screen, Web, browser

SEQUEL--Fast, efficient & cost-effective data analysis

Click here for a FREE Information Kit!


Senior Technical Editor: Ted Holt
Technical Editor: Joe Hertvik
Contributing Technical Editors: Erwin Earley, Brian Kelly, Michael Sansoterra
Publisher and Advertising Director: Jenny Thomas
Advertising Sales Representative: Kim Reed
Contact the Editors: To contact anyone on the IT Jungle Team
Go to our contacts page and send us a message.

Sponsored Links

VAULT400:  White paper: National bike retailer "rolls" with Vault400
Raz-Lee Security:  iSecurity Compliance Evaluator: Instant network-wide compliance checks
COMMON:  Celebrate our 50th anniversary at annual conference, May 2 - 6, 2010, in Orlando


 

IT Jungle Store Top Book Picks

Easy Steps to Internet Programming for AS/400, iSeries, and System i: List Price, $49.95
The iSeries Express Web Implementer's Guide: List Price, $49.95
The System i RPG & RPG IV Tutorial and Lab Exercises: List Price, $59.95
The System i Pocket RPG & RPG IV Guide: List Price, $69.95
The iSeries Pocket Database Guide: List Price, $59.00
The iSeries Pocket SQL Guide: List Price, $59.00
The iSeries Pocket Query Guide: List Price, $49.00
The iSeries Pocket WebFacing Primer: List Price, $39.00
Migrating to WebSphere Express for iSeries: List Price, $49.00
Getting Started With WebSphere Development Studio Client for iSeries: List Price, $89.00
Getting Started with WebSphere Express for iSeries: List Price, $49.00
Can the AS/400 Survive IBM?: List Price, $49.00
Chip Wars: List Price, $29.95


 
The Four Hundred
RPG: A Great Language with a Greater History

IBM Tweaks Power 570, 595 Deals Yet Again

Reconsidering SAN in Wake of SCSI Disk's End

As I See It: Daniel, Part Two

A Quick Analysis of Business Intelligence Planning

Four Hundred Stuff
ASTI Sees Promise in Plasmon's UDO Technology

Stonebranch Bolsters i OS Support in Workload Automation Tools

Data Control Issues Bring vLegaci QuickerApps to Market

Oracle Gives JDE More Supply Chain Planning Brains

Infor Snaps Up SoftBrands, Gets i OS-Based Hotel Suite

Four Hundred Monitor
Four Hundred Monitor's
Full iSeries Events Calendar

System i PTF Guide
August 15, 2009: Volume 11, Number 33

August 8, 2009: Volume 11, Number 32

August 1, 2009: Volume 11, Number 31

July 25, 2009: Volume 11, Number 30

July 18, 2009: Volume 11, Number 29

July 11, 2009: Volume 11, Number 28

July 4, 2009: Volume 11, Number 27

TPM at The Register
Big Blue bundles pound down mainframe prices

Uncle Sam shells out $62m for 100GbE

Novell lands full-time staff on openSUSE

Parallels juggles servers for John Q. ISP

Citrix touts XenServer hypervisor ramp

openSUSE will sacrifice 10.3 release on Halloween

Oracle spells out VM tools plans

Sun sets on Solaris Express Community Edition

NEC punts 0% financing, deferred payments

System sales might have hit bottom

HP sued by own sales reps

BMC snaps up message queue maven

IT shops struggle to control personnel costs

Sun's Rock is barefoot on Abbey Road

THIS ISSUE SPONSORED BY:

Help/Systems
System i Developer
Twin Data


Printer Friendly Version


TABLE OF CONTENTS
Validate DBCS-Open Data

Formatting Dates with SQL

Admin Alert: Correcting and Expanding the Program to Change User Passwords on the Fly

Four Hundred Guru

BACK ISSUES




 
Subscription Information:
You can unsubscribe, change your email address, or sign up for any of IT Jungle's free e-newsletters through our Web site at http://www.itjungle.com/sub/subscribe.html.

Copyright © 1996-2009 Guild Companies, Inc. All Rights Reserved.
Guild Companies, Inc., 50 Park Terrace East, Suite 8F, New York, NY 10034

Privacy Statement