Applications Misfire When Database Integrity Ignored
September 24, 2012 Dan Burger
By the time the cow is out of the barn, it’s too late to close the door. For many companies, including those in the IBM midrange community, applications that have served well for many years with patches and fixes now need attention, just like that barn door. The choice between proactive and reactive problem solving is one way of looking at the situation. The choice between application-centric development and data-centric development also awaits.
Just about every IBM i shop is engaged in application-centric development, which is both good and bad. Application development is the engine that drives business and can move it forward in leaps and bounds. Relationships between data and business rules, however, have a way of getting jumbled over the course of time. And when new developers are unaware of how and why applications were created and altered along the way, things get messy. At some point, and a lot of organizations are at that point now, what you get is garbage in/garbage out. What we’re talking about is referential integrity: the guarantee that every record referring to another record (an order line pointing to its order, say) refers to one that actually exists.
Referential integrity is the critical issue in choices between approaches that are either proactive or reactive and between data-centric processing and application-centric development. It also plays a large role in making future development much easier or much more difficult. The catch? It requires a change from the traditional thinking.
“A lot of people don’t want to hear that,” says Dan Cruikshank, a member of IBM’s DB2 for i team who has seen what old applications can do to snarl new development.
The truth of the matter, Cruikshank says, is that as companies go forward with browser-based application development and cross-platform data access, it’s the database that plays the critical role. Often when external applications come into the mix, problems come to the surface. In short, it’s bad data, or data that lacks integrity. Errors are made under the covers because data relationships are incorrect or hidden within the applications.
The blame is often placed on the IBM i system, on RPG, or on the DB2 for i database. And conclusions that the system is antiquated and should be eliminated often lead to expensive, time-consuming, and misguided endeavors.
Here’s a typical example. The person who wrote the in-house application or the software company that wrote the application is no longer in the picture. Although the app still works, it needs to be updated and extended to fulfill a new business need. Let’s say that process involves the use of PHP to create a browser-based front end.
The PHP developer, because of his training and education, works with the assumption that the DB2 for i database is handling data integrity–the relationships between data and business rules. That turns out to be wrong. The application is responsible for the relationships, but it’s not obvious how the data relationships are structured. So the PHP developer starts noticing incorrect results and may be unknowingly inserting and deleting data thinking the additions and deletions are the master versions of the data. Orphaned details are discovered with no trace of where they belong.
Most Web developers have learned modern database design that includes referential integrity. Application-centric designs cause them to slow development and attempt to figure out where the errors are occurring.
“That leads to ‘end runs’ and the app teams start developing using another, more familiar, relational database,” Cruikshank says. “They come from a data-centric background and they may think that DB2 for i is not a data-centric database.”
“Decisions have to be made regarding when the programs will no longer be responsible for checking the integrity of the data and that responsibility is on the database,” says Mike Cain, senior technical staff member at IBM’s DB2 for i Center of Excellence.
Handing that responsibility to the database is widely accepted by the application development community outside of IBM i’s sphere of influence. The benefit is worry-free database integrity when writing new apps or modifying old apps.
“Referential integrity checking needs to be done in the database, so that all apps that are searching for data have one place to look and no data can be hidden or bypassed because of an app developer idiosyncrasy,” Cain says.
When database relationships are implied and defined inside a program without good data modeling, and someone tries to write, for instance, a PHP or Java app without duplicating the logic in the original RPG program, it creates a problem. It either shows up as bad data or the developer has to invest more time (which means more money) in writing a program and replicating all that logic, when that logic belongs at a low level within the database.
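To make the point concrete, here is a minimal sketch of what it looks like when the database, rather than the program, owns the relationship. It uses Python's built-in sqlite3 module purely as a convenient stand-in for an SQL database; it is not DB2 for i, and the table names (order_header, order_detail) and the helper functions are hypothetical, invented for illustration.

```python
import sqlite3


def open_db():
    """Create an in-memory database where the parent/child rule lives in the schema."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE order_header (
            order_id INTEGER PRIMARY KEY,
            customer TEXT NOT NULL
        );
        CREATE TABLE order_detail (
            detail_id INTEGER PRIMARY KEY,
            order_id  INTEGER NOT NULL
                      REFERENCES order_header(order_id) ON DELETE RESTRICT,
            item      TEXT NOT NULL
        );
    """)
    return conn


def insert_detail(conn, order_id, item):
    """Try to add a detail row; return False if the database rejects it."""
    try:
        conn.execute(
            "INSERT INTO order_detail (order_id, item) VALUES (?, ?)",
            (order_id, item),
        )
        return True
    except sqlite3.IntegrityError:
        return False


def delete_order(conn, order_id):
    """Try to delete a header row; return False if detail rows still depend on it."""
    try:
        conn.execute("DELETE FROM order_header WHERE order_id = ?", (order_id,))
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    db = open_db()
    db.execute("INSERT INTO order_header VALUES (1, 'Acme')")
    print(insert_detail(db, 1, "widget"))    # parent exists
    print(insert_detail(db, 999, "widget"))  # no such parent
    print(delete_order(db, 1))               # parent still has a detail row
```

Any client that connects, whether written in RPG, PHP, or Java, runs into the same rule, so none of them has to re-create the original program's checks.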
“Sometimes there is no one at the shop who can explain what’s going on,” Cain says. “And sometimes the shop might be motivated to use a different (non-IBM i) agenda. Other times it is discovered that the database and the applications were written in a style that is no longer current and widely acceptable.”
Another example of undermined referential integrity is when the goal is to take data and turn it into information. When querying the data and building reports, the lack of integrity may be recognized, but it requires someone to dig into a project and notice leftover artifacts that are disconnected from business logic.
Cain advises companies to be proactive about getting a handle on data integrity by profiling it and understanding it.
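One simple form of profiling is a query that looks for detail rows whose parent no longer exists. The sketch below is an illustration under stated assumptions, not a method described by Cain: it again uses Python's sqlite3 module as a stand-in database, and the tables, sample rows, and function names are all hypothetical. The schema deliberately has no constraints, to mimic a database where the applications were supposed to keep things consistent.

```python
import sqlite3

# Detail rows that join to no header row are orphans.
ORPHAN_QUERY = """
    SELECT d.detail_id, d.order_id
    FROM order_detail AS d
    LEFT JOIN order_header AS h ON h.order_id = d.order_id
    WHERE h.order_id IS NULL
    ORDER BY d.detail_id
"""


def build_legacy_db():
    """Build a constraint-free database and leave two orphaned detail rows in it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE order_header (order_id INTEGER PRIMARY KEY, customer TEXT);
        CREATE TABLE order_detail (detail_id INTEGER PRIMARY KEY,
                                   order_id INTEGER, item TEXT);
        INSERT INTO order_header VALUES (1, 'Acme'), (2, 'Globex');
        INSERT INTO order_detail VALUES (10, 1, 'widget'),
                                        (11, 2, 'gear'),
                                        (12, 7, 'bolt');
        -- Nothing stops a program from deleting a header that still has details.
        DELETE FROM order_header WHERE order_id = 2;
    """)
    return conn


def find_orphans(conn):
    """Return (detail_id, order_id) pairs for detail rows with no matching header."""
    return conn.execute(ORPHAN_QUERY).fetchall()


if __name__ == "__main__":
    for detail_id, order_id in find_orphans(build_legacy_db()):
        print(f"detail {detail_id} points at missing order {order_id}")
```

A count of such rows per relationship gives a rough measure of how much cleanup is needed before a constraint can be added to existing data.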
“Application programmers get around the disconnects logically, but the junk remains–and eventually relationships between data can change. Instead of relying on a programmer who may or may not be on watch in the future, it is better to put the logic in the database that is always there and on task,” Cain says.
Giving the database the rules, the responsibility to maintain the rules, and the responsibility to maintain integrity will also benefit query optimization, he says.
Cruikshank, in defense of the DB2 database, says there’s a notion that relational databases are going away, but he believes it won’t be anytime soon.
“The people who designed relational databases were visionary,” he says. “They saw the scalability issue. Other databases like SQL scale horizontally, which adds hardware and people to manage it. When databases grow vertically, they only require more memory and disk, if you follow a blueprint.”
The key to the blueprint is building business rules into the database, he says. Traditional application-based development, in his view, is a waste of money. He describes data-centric programming as “world-class” enterprise programming and by comparison refers to application-based programming as third world. Cruikshank estimates that 80 to 90 percent of traditional IBM i customers do not have the skills for data-centric development. Those skills need to be upgraded.
“If you love the IBM i platform, your growth strategy should be moving to data-centric programming,” he says. “It’s important to continue using RPG skills, while adding SQL skills to replace DDS for defining and accessing the database. All the functions and little things you put in RPG get moved to functions within DB2. It’s the perfect roadmap. The transition is not difficult when combined with modern development tools.”
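As a small illustration of a business rule moving out of program code and into an SQL table definition, the sketch below uses CHECK constraints. This is a simpler technique than the database functions Cruikshank mentions, chosen only because it fits in a few lines. It runs against Python's sqlite3 module as a stand-in rather than DB2 for i, and the invoice table, its columns, and the allowed status values are hypothetical.

```python
import sqlite3


def open_db():
    """Create a table whose validity rules are declared in the schema itself."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE invoice (
            invoice_id INTEGER PRIMARY KEY,
            quantity   INTEGER NOT NULL CHECK (quantity > 0),
            status     TEXT    NOT NULL
                       CHECK (status IN ('OPEN', 'PAID', 'VOID'))
        )
    """)
    return conn


def add_invoice(conn, quantity, status):
    """Insert an invoice; return False if the database rejects the values."""
    try:
        conn.execute(
            "INSERT INTO invoice (quantity, status) VALUES (?, ?)",
            (quantity, status),
        )
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    db = open_db()
    print(add_invoice(db, 5, "OPEN"))     # satisfies both rules
    print(add_invoice(db, 0, "OPEN"))     # quantity rule violated
    print(add_invoice(db, 1, "SHIPPED"))  # status not in the allowed list
```

The validation that would otherwise be repeated in every program that writes to the table is stated once, where every program has to pass through it.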
Companies that are developing in-house should be investing in training and skills and tools and proactively preparing for the future with a database plan.
“Show me someone who doesn’t have a business problem today or is not going to experience a business problem, and I’ll show you a company that is going out of business or is going to outsource its IT. If I was working for a company that was not willing to invest, I’d be working on my resume.”