Resolution: TurboIMAGE database being corrupted by QTP
Walter J. Murray
wmurray at surewest.net
Sun Jan 4 23:38:34 CST 2009
Greetings,
Thanks to all who responded to my Christmas Eve plea for help. The
problem is resolved, although not fully understood.
After running a number of tests and trying many things, I finally broke
down and did a DBUNLOAD, purged and rebuilt the database, and did a
DBLOAD. That process took about 18 hours on our test machine, but only
about 2 hours on the production machine. The problem has not occurred
since.
The problem had nothing to do with QTP. I was able to duplicate it with
a simple COBOL program, and even with QUERY. A series of DBPUTs to the
data set in question would eventually result in an error, and leave the
database in a corrupted state.
Several people asked whether the database used Large File Data Sets
(LFDS). I had originally discounted that, because I knew that none of
the data sets, even if dynamically expanded to their maximum capacity,
would come very close to 4GB in size. However, upon checking, I found
that, according to both DBUTIL and the LFDSDET utility, the database did
supposedly have at least one LFDS. I do not know how it got that way,
or how long it had been that way.
My guess is that, at one time, the database used LFDS because the data
set in question, which uses dynamic data set expansion, had a maximum
capacity that would have put it over 4GB.
Perhaps the maximum capacity was lowered, but the LFDS attribute
remained. Even so, I don't think there were ever enough entries in that
data set to cause it to be larger than 4GB, so this corruption doesn't
fit the description of the known LFDS bug in TurboIMAGE. There should
never have been a DBPUT that would cross a 4GB boundary.
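The arithmetic behind that reasoning can be sketched as follows. This is
a rough back-of-the-envelope check, not IMAGE internals; the 256-byte
entry width is a made-up number for illustration, since the actual media
record size of the data set isn't given here.

```python
# Whether a data set would need the Large File (LFDS) format:
# only if its maximum capacity times its entry width could push
# the data set file past the 4GB boundary.

FOUR_GB = 2 ** 32  # the 4GB file-size boundary, in bytes

def needs_lfds(max_capacity, entry_bytes):
    """True if the data set at full capacity would cross 4GB."""
    return max_capacity * entry_bytes > FOUR_GB

# 15,000,000 entries at a hypothetical 256 bytes each stays under 4GB:
print(needs_lfds(15_000_000, 256))   # -> False
# ...but a higher dynamic-expansion ceiling could cross the boundary,
# which would explain the database having been created with LFDS:
print(needs_lfds(20_000_000, 256))   # -> True
```

So a data set whose maximum (not current) capacity crosses the line
would get the LFDS attribute at creation time, even if it never grows
anywhere near that size in practice.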
Further diagnostics revealed two other anomalies. Each of the two
automatic masters linked to the detail in question contained one
"stand-alone" entry, that is, a master entry not linked to any detail
entries. This condition shouldn't exist, but, supposedly, it is not an
error and shouldn't cause a problem. Again, I don't know how this
condition arose, and I don't know how long it had existed.
The problem occurred as soon as the delete chain was exhausted. The
very next DBPUT triggered the error and the corruption. Prior to that
DBPUT, the database was reported to be structurally sound. However,
looking into the data set in question, I saw two entries in the first
block of data that had corrupted chain pointers.
My hunch is that the problem was somehow related to the LFDS feature.
At some point in the past, the database became corrupted, but in a way
not detected by our utility that checks such things. Everything
appeared to work fine until the delete chain was exhausted, and then the
latent corruption caused a serious error and more extensive corruption.
Lessons learned: (1) I'll run LFDSDET occasionally to make sure no LFDS
databases have been created. (2) In the future, I might waste less time
trying to diagnose and repair the problem, and just go ahead and do the
DBUNLOAD and DBLOAD. (3) I'll recommend, again, that we install 7.5
PowerPatch 5, which, I believe, includes the patch that disables the
creation of LFDS databases.
Thanks again to all who offered suggestions, especially those who
suggested the possibility of corruption caused by the LFDS bug.
Walter
Walter J. Murray
-----Original Message-----
From: Walter J. Murray [mailto:wmurray at surewest.net]
Sent: Wednesday, December 24, 2008 7:59 PM
To: HP3000-L at RAVEN.UTC.EDU
Subject: TurboIMAGE database being corrupted by QTP
[Also posted to the PowerHouse list]
TurboIMAGE database being corrupted by QTP
Please help me get to the root of this problem. It looks like either a
bug in TurboIMAGE (I hope not!) or a bug in PowerHouse 4GL (hard to
imagine).
Environment: MPE/iX 7.5 PP3, PowerHouse 8.39.C1.
I have this production database that occasionally gets corrupted during
a QTP run. I can duplicate the problem consistently on a test machine.
I don't just mean corrupted with bad data--I mean structural errors in
the database. The errors always seem to be associated with a particular
detail data set. DBGENERAL typically reports the errors as (1) FREE
SPACE COUNT INCONSISTENT, (2) DELETE CHAIN MISSING ENTRIES, and (3)
DELETE CHAIN BROKEN. Sometimes there are also structural errors related
to the paths between this detail data set and the two related automatic
masters. I can repair the damage with DBGENERAL, but the database gets
corrupted again in a few days.
QTP reports the problem with these error messages:
"Data access error."
"Action Taken: Run terminated."
"MPE FILE ERROR 0 RETURNED BY FREADDIR ON ROOT FILE"
After QTP terminates the run, I check the database and find it
corrupted.
I am thinking that TurboIMAGE usually does an excellent job of
protecting itself against structural errors. Without using PM, it
shouldn't be possible for an application to damage a database in this
way. I wonder whether QTP always uses the IMAGE intrinsics, or whether
it might sometimes sidestep IMAGE and modify internal database
structures directly. Does anybody know?
The detail data set in question has about 15,000,000 entries. It is
enabled for dynamic expansion, but the current capacity is still equal
to the initial capacity, so no expansion has occurred yet. None of the
master data sets is enabled for dynamic expansion. The data set is
related to two automatic masters, and both of those paths are sorted.
I've placed a support call with COGNOS, but they haven't come up with
anything yet. I also plan to place calls to HP and Bradmark. Can
anybody help in the meantime? Thanks.
Walter
Walter J. Murray