PHDBKEYSCOPE and Rdb

Brown, Diane Diane.Brown@Cognos.COM
Thu, 25 Feb 1999 10:59:47 -0500


Chris Walker wrote:

> I am interested in the implications of changing the Rdb dbkey 
> scope used by Powerhouse from ATTACH to TRANSACTION, and I 
> would be grateful for reports of experience in this area.

Disclaimer: it's been years (literally) since I last thought about this
topic (since approximately the 6.10E era, when the feature was first proposed
for PowerHouse), so I'm sure I've forgotten some (most) of the details.

[In fact, it's been years since I last thought ... but that's a different
story. :-) ]

There was, by the way, a presentation entitled "DBkey Scope with PowerHouse
and InterBase" given by Jim Daly at the Cognos 1995 North American user
conference (and perhaps elsewhere, I don't know) -- despite the title, that
presentation also discussed Rdb.  If you can't find a copy of that, let me
know and I'll see if I can dig one up for you.

> 
> The Powerhouse 7.10 release notes seem to imply that setting 
> the scope to TRANSACTION has no effect on the way Powerhouse 
> uses dbkeys - it just goes ahead in the hope that they are 
> still valid, and rolls back the transaction if they aren't.  
> Is this true?  Can anybody give me any idea how likely such 
> invalid dbkeys are?

You are correct that setting the scope to TRANSACTION doesn't affect how or
when PowerHouse uses dbkeys.  (I assume you know when they are used, so I
won't bother describing that.)

Invalid dbkeys can occur if the row referenced by the dbkey has been updated
or deleted by another user between the time you initially read the row and
the time you get around to updating it.
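
To make those failure modes concrete, here is a rough Python sketch (my own
names and a toy in-memory 'table', not PowerHouse code) of the general
read-then-update-by-dbkey pattern: the re-read at update time can fail either
because the dbkey no longer resolves to a row, or because it now resolves to
a record other than the one originally read.

    import zlib

    # Toy stand-in for a table: dbkey -> row values.
    table = {101: {"name": "Smith", "balance": 50}}

    def checksum(row):
        # Checksum of the row contents as originally read; compared again
        # at update time to detect a concurrent change.
        return zlib.crc32(repr(sorted(row.items())).encode())

    # Query phase: read the row, remember its dbkey and checksum.
    dbkey = 101
    original = dict(table[dbkey])
    original_sum = checksum(original)

    # Meanwhile, another user rewrites (or deletes) the row.
    table[101] = {"name": "Jones", "balance": 0}   # concurrent update
    # del table[101]                               # ... or a concurrent delete

    # Update phase: re-read by dbkey before applying the change.
    current = table.get(dbkey)
    if current is None:
        print("dbkey is no longer valid")          # row gone
    elif checksum(current) != original_sum:
        print("record has been changed")           # checksum mismatch
    else:
        current["balance"] = original["balance"] + 10   # safe to update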

The chances of this occurring are slim, however, especially if you are using
any of the basic transaction models without too many changes (in particular,
changes such as committing Query transactions before the Update phase
starts).  In the basic Consistency model the updates are done in the same
transaction as the reads, so dbkeys are valid regardless of the setting of
dbkey scope.  In the basic Concurrency/Optimistic models the start/commit of
the Query transaction effectively 'brackets' the Update transaction, so
holding dbkey values stable for the duration of the Query transaction also
ensures they remain valid within the Update transaction ...
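
If it helps, the 'bracketing' is really just the order of the begin/commit
calls.  The sketch below (Python, with a made-up Transaction class that only
prints; it is not the PowerHouse runtime) shows the two orderings and where
the risk window opens if the Query transaction is committed early:

    class Transaction:
        # Stand-in for a database transaction; it only announces itself.
        def __init__(self, name):
            self.name = name
            print("start  " + name)
        def commit(self):
            print("commit " + self.name)

    # Basic Concurrency/Optimistic model: the Query transaction stays open
    # around the Update transaction, so dbkeys saved while reading remain
    # stable (even with dbkey scope TRANSACTION) for the whole update.
    query = Transaction("Query")      # rows read here, dbkeys saved
    update = Transaction("Update")    # updates applied by saved dbkey
    update.commit()
    query.commit()                    # Query 'brackets' Update

    # Modified model: committing the Query transaction before the Update
    # phase leaves a window in which the saved dbkeys may become invalid.
    query = Transaction("Query")
    query.commit()                    # <-- window opens here
    update = Transaction("Update")    # saved dbkeys only assumed valid
    update.commit()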

[I think waving hands or drawing pictures might help explain some of
this ...]  Ahh, but perhaps this summary from some investigations done here
will help instead:

"The general problem with changing the scope is that there are times when
the DBKEY PowerHouse has saved may become invalid or the record pointed to
by the DBKEY may not be the record originally referenced by the key.  This
is only possible if the record has been updated or deleted by another user.
Garbage collection (InterBase only) only removes old records; it does not
change the location of records and therefore will not change DBKEYs. 
With the default models, this risk is only an issue in the Concurrency or
Optimistic models, because in the Consistency model updates are always done
within the same transaction as the reads.  Outside of the default models,
the risk is still limited, but exists anytime records are read in one
transaction and updated by DBKEY in another.
In Quick it is possible for the Query transaction to terminate before the
Update transaction is started.  In the interval, updates could occur or
garbage collection could take place on records read in the Query
transaction.  
In all cases this appears to be captured as an error. The message returned
may be different from what we normally see. If the record has been deleted
by another user or the DBKEY has changed because of garbage collection PH
will return an error on the re-read. In Rdb the error message is very
concise "dbkey is no longer valid" and a reference to the actual DBKEY used.
However, in InterBase the message is the more obscure "request
synchronization error". There will be no way to differentiate this from
other synchronization errors that may occur during the update cycle.
If the DBKEY is pointing at a different record, then the normal checksum
process will detect the discrepancy and issue the "record has been changed"
error. There is a very minute chance that the checksum of the new record is
the same as the first record but it is no greater than the case where an
updated record would produce the same checksum as the original.
In the case of QTP, the default model is the Consistency model, therefore
all work is done within a single transaction.  If, however, the Concurrency
model is being used, there are several areas where DBKEYs could change if
transaction level DBKEY scope is used. Whenever a sort is required QTP will
read all the records into a work file and then sort it. The sorted file will
then be read to process the updates. These will be done in separate
sequential transactions and therefore be susceptible to update collisions as
well as invalid DBKEYs. 
This same scenario is possible whenever the input and output files are the
same because QTP uses a temporary file to save all the records before
updating the database. "
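
The QTP sort case in that summary is the same window stretched out over a
batch: dbkeys are captured while the work file is being filled, and only used
after the sorted file is re-read in later transactions.  A hypothetical
Python sketch of that shape (mine, not QTP itself):

    # Phase 1: the read transaction fills a work file with (sort key, dbkey)
    # pairs, then commits.  With TRANSACTION dbkey scope, the saved dbkeys
    # are from this point on only assumed to still be valid.
    rows_read = {7: "delta", 3: "alpha", 9: "charlie"}   # dbkey -> sort key
    work_file = [(sort_key, dbkey) for dbkey, sort_key in rows_read.items()]

    work_file.sort()                                     # sort the work file

    # Phase 2: separate, sequential update transactions walk the sorted file
    # and update each record by its saved dbkey; a record deleted, moved, or
    # rewritten by another user in the meantime surfaces as an error here.
    for sort_key, dbkey in work_file:
        print("update record with dbkey %d (sorted by %s)" % (dbkey, sort_key))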

Does that answer your question?  [Hello?  Anybody still awake?]

> Is there much chance of improving locking performance by 
> changing to TRANSACTION?

What aspect of locking performance are you hoping to improve?  

As far as I recall, the dbkey scope setting didn't affect locking directly;
or rather, that wasn't the main reason we added the option to let customers
set dbkey scope themselves.
[On the other hand, I did hear that changing the dbkey scope setting
resolved locking problems caused (or perhaps encountered?) by someone
running Quiz reports.  Urban legend?  Perhaps.]

As far as I recall, there were two main problems encountered by customers:
 
- For InterBase users, dbkey scope of 'attach' prevented 'garbage cleanup'
from occurring, leading to overall performance degradation and other
problems as the list of 'interesting transactions' got longer and longer.

- For Rdb users, dbkey scope of 'attach' meant that entries on pages could
not be reused until the attach was released.  Often this had little impact,
but in cases such as jobs that deleted a lot of rows in one transaction and
then inserted a lot of rows in a subsequent transaction, the space freed by
the delete step could not be used by the insert step.  This resulted in
wasted space, and possibly forced area extents to occur.  (There is a toy
illustration of this effect below.)  If I remember correctly, for Rdb, as
soon as any user attaches with dbkey scope of 'attach', all users must
assume this mode.
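
Here is the toy illustration mentioned above: a deliberately oversimplified
Python model of one page with a few line slots, where a dbkey is effectively
(page, slot).  The only point it makes is the one in the bullet: under
'attach' scope the slots freed by the delete step are not available to the
insert step, so the area has to grow instead.

    def run(scope):
        slots = ["old row"] * 4          # four occupied line slots on a page
        freed = []

        # Transaction 1: delete every row on the page, then commit.
        for i in range(len(slots)):
            slots[i] = None
            freed.append(i)

        # Under 'transaction' scope the freed slots become reusable at
        # commit; under 'attach' scope they stay reserved until detach.
        reusable = list(freed) if scope == "transaction" else []
        extents = 0

        # Transaction 2: insert four new rows.
        for _ in range(4):
            if reusable:
                slots[reusable.pop()] = "new row"   # reuse a freed slot
            else:
                extents += 1                        # area must grow instead

        print("scope=%-11s extra space needed for %d rows" % (scope, extents))

    run("transaction")   # freed space reused within the same attach
    run("attach")        # freed space unusable -> wasted space / extents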

As I said at the beginning of this message, I've forgotten a lot of the
details about this topic -- if you've had replies from people who used this
feature, I'd be interested in knowing what their experiences were.

Not sure if any of that helps,

db (brownd@cognos.com)

Diane Brown
Cognos Inc., Ottawa, Canada
mailto:Diane.Brown@cognos.com
