Guru: Better Check Constraints
November 11, 2019 Ted Holt
This article has three purposes. If you use check constraints in your database, the purpose is to help you make better use of check constraints. If you don’t use check constraints, the purpose is to encourage you to use them and to point you in the right direction. If you already know all this stuff, the purpose is to goad you to email me and teach me something I don’t know.
The purpose of check constraints is to keep invalid data out of the database. That may seem unnecessary. Isn’t that what the applications are supposed to do? Yes, but the fact is that many databases have much bad data in them. I have chased down too many problems that were caused by bad data, and chances are good that you have too.
Here, then, are a few facts to know about check constraints.
1. A check constraint is like a WHERE clause without the WHERE.
If the condition proves false, the system rejects the INSERT or UPDATE. (Don’t forget that MERGE can also perform INSERT and UPDATE.) If the condition proves true or unknown, the operation is allowed.
Consider the following table definition, which has two check constraints.
create table EmployeeMaster for system name EmpMast ( ID dec(5) primary key check (ID > 0), Name varchar(16), Type char(1) check (Type in ('S', 'H', 'T', 'P')), Salary dec(9,2), Wage dec(5,2) )
The first check constraint ensures that the ID number assigned to an employee is positive. The second one allows the employee type to have one of four values. Employee type can also be null, but null is not a value, which brings me to another fact.
2. A null value creates an unknown condition, which is acceptable to the database manager. If you want to disallow nulls, you’ll need it to be specific about it by adding the NOT NULL constraint.
create table EmployeeMaster for system name EmpMast ( ID dec(5) primary key check (ID > 0), Name varchar(16), Type char(1) not null check (Type in ('S', 'H', 'T', 'P')), Salary dec(9,2), Wage dec(5,2) )
Type must be S, H, T, or P, and it may not be null.
3. You may specify check constraints in CREATE TABLE and ALTER TABLE statements.
For CREATE TABLE examples, see the previous examples. For an ALTER TABLE example, see the following example.
alter table EmployeeMaster add constraint EmployeeMaster_valid_types check (Type in ('S', 'H', 'T', 'P'))
Unlike CREATE TABLE, ALTER TABLE requires me to name the constraint. I gave it the clever name EmployeeMaster_valid_types. This leads me to the next point.
4. You may name constraints or you may let the system name them.
The previous example shows how to name a check constraint in ALTER TABLE. Here’s how you name a check constraint in CREATE TABLE.
create table EmployeeMaster for system name EmpMast ( ID dec(5) primary key, Name varchar(16), Type char(1), Salary dec(9,2), Wage dec(5,2), constraint EmployeeMaster_ID_is_positive check (ID > 0), constraint EmployeeMaster_valid_types check (Type in ('S', 'H', 'T', 'P')) )
I have two named check constraints: EmployeeMaster_ID_is_positive and Employee_valid_types. I like to name constraints. If you don’t name a constraint, the system creates a name that is not usually as descriptive as the one I would come up with. Descriptive names are good to have when you’re using interfaces such as the Work with PF Constraints (WRKPFCST) CL command or the SYSCST and SYSCHKCST views.
You may have noticed that I placed the constraints after the column definitions. That brings up another point.
5. If a check constraint refers to more than one column, you must define it after all the column definitions.
If a check constraint only refers to one column, you may define the constraint with the column definition or you may define it after all the column definitions. But if a check constraint refers to more than one column, you must place it after the column definitions, like this:
create table EmployeeMaster for system name EmpMast ( ID dec(5) primary key, Name varchar(16), Type char(1), Salary dec(9,2), Wage dec(5,2), constraint EmployeeMaster_ID_is_positive check (ID > 0), constraint EmployeeMaster_valid_types check ( case when Type = 'S' and Salary > 0 then 1 when Type in ('H', 'T', 'P') and Wage > 0 then 1 else 0 end = 1 ) );
Check constraint EmployeeMaster_valid_types refers to three columns—Type, Salary and Wage. If Type is S (salaried), the salary must be a positive number. If type is H (hourly), T (temporary) or P (part-time), the wage column must be positive. It’s interesting how that test works. Let’s look at that in more detail.
6. CASE expressions open up great possibilities in check constraints.
I learned this (along with many other things) from Joe Celko. The CASE expression will return the value 1 (one) if an employee is salaried and salary is positive. It will return the value 1 if the employee is of another type and the hourly wage is positive. Otherwise, the CASE expression returns a zero. The check constraint compares the result from the case to 1 to decide whether or not to approve the data. This also catches invalid type codes. You can do a lot with CASE expressions.
7. You can do a lot with check constraints. Should you?
Check constraints relieve your applications of the necessity to validate data, but does that mean that your applications should not validate data? The answer is a matter of opinion, and I can see both points of view.
I don’t consider it bad to make an application program check employee type for an acceptable value. The advantage is that the error handling is easier if the application program performs validation. In such a case, the check constraint is there as a last resort. The downside is that the application will require modification if someone defines another acceptable employee type to the database.
Check constraints are an effective way to prevent invalid data from being stored into a database. Use them as much as you can.
RELATED STORIES
Simulate a Boolean Data Type in a Database Table
Guru: Handling Constraint Violations In RPG
Another reason to use check constraints is related to the abundance of SQL tools which may be used to perform updates and inserts. These actions occur outside of any application control or validation. Check constraints cannot be bypassed.
In #7 regarding application validation versus check constraint, if a new employee type is allowed any check constraint defined with employee type will also have to changed, i.e. DROP and ADD.
When I use CASE expressions for complex check constraints, I prefer ‘Pass’ and ‘Fail’ rather than 1 and 0. I believe it improves readability and clarity.
Perhaps you could write another installment describing the steps to update a check constraint, CHECK PENDING state, etc.
Ted,
This is a great topic and glad you covered the use of CASE on a check constraint as the CASE is much overlooked in many SQL scenarios. While I am an advocate of CHECK and RI Constraints, one issue is that RPG appears to lack the ability to easily know which constraint has fired. In a perfect world, it would be simple to obtain the constraint name from a %BIF, %ERROR MONITOR opcode or perhaps the program / file data structure. Some articles have been written about ways to use APIs or program message stack(s) to get this information, but the techniques are complex in my opinion. I authored an RFE to IBM suggesting an improvement to RPG for constraint handling. Still waiting for votes and implementation. If you’d like to write a followup article about this, maybe it would spur votes and push IBM to action? Please see link at: http://www.ibm.com/developerworks/rfe/execute?use_case=viewRfe&CR_ID=137495
regards,
Rick
Ted,
Excellent article.
Your last paragraph is something that has troubled me since i started using constraints.
The benefit of using constraints is to move the business logic out of the program and into the database. Ideally all the validation should be done in the database.
However, it makes our programs less user friendly if we wait until we attempt to insert or update data in the database before we tell the user they entered bad data.
This is especially true when there are multiple screens of data that need to be entered.
If we put validation into our programs and also in the database then we have 2 sets of validation that need to be maintained. It is possible to make a change to one without changing the other.
I guess we have always had this problem if you consider unique keys in DDS physical files. We have been coding these since Sys/38 days and also coding the validation in our RPG programs.
I guess it is just a matter of getting used to adding validation in both places. And perhaps adding a catch all on the inserts and updates.
Doug