Performance Questions and a little need for some education.

Tue Aug 18 13:51:30 CDT 2009

Guy's comments reminded me that you can use CHOOSE with an index name 
(CHOOSE VIAINDEX) to retrieve data via that index. You may be able to 
get rid of a sort (use SORTED instead), or at least make a sort more 
efficient, because the data will already be read in part of the required 
order.

What follows are some tips from the Advanced PowerHouse Topics Seminar. 
They should still be useful.

Bob

ACCESS

• The ACCESS statement is intended to provide a data structure that can be 
used to report or update.  The actual retrieval sequence should be 
optimized for that purpose.

• ACCESS statements should be coded keeping data structure and content in 
mind, in order to achieve significant I/O reductions.

  Example

  > ACCESS fileA LINK TO fileB LINK TO fileC
  > SELECT IF conditions

  If the conditions that apply to fileC are seldom satisfied, but the 
conditions that apply to fileB are frequently satisfied, then a 
significant I/O reduction is achieved by coding:

  > ACCESS fileA LINK TO fileC LINK TO fileB

• Similarly, specify required files early in the ACCESS list, and OPTIONAL 
files later.

• Take advantage of records with unique keys.  QUIZ/QTP has "smart" file 
retrieval where these records are concerned, and rereads such a record 
only when the key value changes.

• Sorting can be avoided by linking via a key which would have been the 
sort-key.  This is especially true in QTP.  SORTED can be used instead of 
SORT when it is important to have all transactions with a specific key 
value together, but the sequence of key values is unimportant.
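
The link-ordering tip in this section can be checked with a small counting model. This is only a sketch in Python, not PowerHouse code, and it assumes that a record complex is abandoned as soon as a linked file fails its condition, so that files later in the ACCESS list are not read for that complex. The function name count_reads and the two conditions are made up for illustration.

```python
def count_reads(keys, linked_conditions):
    """Count record reads for a chain of linked files (hypothetical model)."""
    reads = 0
    for key in keys:
        reads += 1                      # read the primary-file record
        for condition in linked_conditions:
            reads += 1                  # read one record from the linked file
            if not condition(key):
                break                   # complex rejected; skip later files
    return reads


def b_ok(key):
    # Assumed condition on fileB: frequently satisfied (900 of 1000 keys).
    return key % 10 != 0


def c_ok(key):
    # Assumed condition on fileC: seldom satisfied (50 of 1000 keys).
    return key % 20 == 0


keys = range(1000)
reads_b_first = count_reads(keys, [b_ok, c_ok])   # fileA -> fileB -> fileC
reads_c_first = count_reads(keys, [c_ok, b_ok])   # fileA -> fileC -> fileB
print(reads_b_first, reads_c_first)
```

Under these assumed hit rates the model counts 2900 reads with fileB first and 2050 with fileC first; the saving grows as the fileC condition becomes more selective.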

CHOOSE Statement

• The CHOOSE statement forces a keyed retrieval on the primary file in the 
ACCESS statement.  When no CHOOSE statement is specified, the primary file 
is read sequentially.  Employing CHOOSE can improve performance by 
reducing the number of records read.  Always use CHOOSE instead of SELECT, 
if specific key item values are known.

  Example

  > ACCESS fileA        ; contains 10,000 records
  > SELECT IF key-item-of-A = "ok"      ; true for only 100 records

  will read 10,000 records, whereas the statement

  > ACCESS fileA
  > CHOOSE key-item-of-A "ok"

  will read only 100 - a 100-fold improvement.

• CHOOSE can be used without a key value for KSAM files to avoid sorting, 
since the read is in key sequence.
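
The 10,000 versus 100 comparison above can be reproduced with a rough Python model (again, not PowerHouse code). The sequential scan stands in for SELECT IF and the index lookup stands in for CHOOSE. The record layout and the names select_scan, build_index and choose_keyed are assumptions for illustration, and reads of the index itself are ignored to keep the sketch short.

```python
from collections import defaultdict

# 10,000 records, of which only the first 100 carry the key value "ok".
records = [{"id": i, "key": "ok" if i < 100 else "no"} for i in range(10_000)]


def select_scan(records, wanted):
    """Read every record and test it, as a sequential read would."""
    reads, hits = 0, []
    for rec in records:
        reads += 1
        if rec["key"] == wanted:
            hits.append(rec)
    return hits, reads


def build_index(records):
    """Map each key value to the positions of the records that hold it."""
    index = defaultdict(list)
    for pos, rec in enumerate(records):
        index[rec["key"]].append(pos)
    return index


def choose_keyed(records, index, wanted):
    """Read only the records the index points at, as a keyed read would."""
    reads, hits = 0, []
    for pos in index.get(wanted, []):
        reads += 1
        hits.append(records[pos])
    return hits, reads


index = build_index(records)
scan_hits, scan_reads = select_scan(records, "ok")
keyed_hits, keyed_reads = choose_keyed(records, index, "ok")
print(scan_reads, keyed_reads)
```

Both routes return the same 100 records; only the number of data records read differs.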

SELECT Statement

• SELECT file IF is more efficient than SELECT IF, due to the timing of 
the condition evaluation.  Use SELECT file IF when the condition is based 
on one file.

• The one exception to this is when a file in the SELECT file IF is 
retrieved via a unique key.  If the condition fails, the buffer is 
initialized and the record must be reread for the next complex, even if 
the key value is the same (the "smart" retrieval cannot be done).  In this 
case, use SELECT IF.

DEFINE Statement

• DEFINE statements in QUIZ are evaluated once per record complex, as soon 
as the required data is read.  A DEFINE is re-evaluated only when the 
records it depends on are reread.  Evaluation can be delayed by 
conditioning the DEFINE on a file at the end of the ACCESS list, using 
RECORD file EXISTS.  This is useful when the DEFINE is based on items in a 
file that occurs early in the ACCESS list, but a large percentage of record 
complexes will be rejected.

• DEFINE statements in QTP are evaluated when the name is referenced. 
Combine expressions whenever possible to avoid extra evaluations.

  Example

  > DEFINE A = 1
  > DEFINE B = A + A
  > DEFINE C = B + B

  When C is referenced, A must be evaluated four times.  Note that in 
QUIZ, each DEFINE is evaluated at most once per record complex.

• If the condition and result are constants, the CASE option can be used 
and is more efficient than IF/ELSE.  With either option, sequence the 
conditions such that the most likely occurs first. 
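
The "A must be evaluated four times" count in the QTP example can be confirmed with a short Python model (not PowerHouse code). The first function re-evaluates each definition every time it is referenced, as described for QTP; the second evaluates each definition once and reuses the value, as described for QUIZ. The function names are hypothetical.

```python
def evaluate_on_reference():
    """Each reference re-evaluates the definition it names."""
    calls = {"A": 0}

    def a():
        calls["A"] += 1
        return 1

    def b():
        return a() + a()        # B = A + A

    def c():
        return b() + b()        # C = B + B

    return c(), calls["A"]


def evaluate_once_per_complex():
    """Each definition is evaluated once and its value is reused."""
    calls = {"A": 0}

    def a():
        calls["A"] += 1
        return 1

    a_value = a()
    b_value = a_value + a_value
    c_value = b_value + b_value
    return c_value, calls["A"]


print(evaluate_on_reference())      # (value of C, evaluations of A)
print(evaluate_once_per_complex())
```

Both versions produce the same value for C; the difference is four evaluations of A against one, which is why combining expressions pays off when evaluation happens on reference.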

SORT vs SORTED

• Overuse of the SORT statement is a common error.

  Example

  > ACCESS fileA LINK TO fileB
  > SORT ON sort-key-1
  > REPORT SUMMARY sort-key-1 sort-key-2 other-item
  > SET SUBFILE
  > GO
  > ACCESS *QUIZWORK LINK TO fileC
  > SORT ON sort-key-1

  In this case, the sort in the second pass should read SORTED because the 
subfile is already in the correct sequence.

• SORTED, instead of SORT, is especially useful in QTP when the sort-key 
is also the key used for linkage and retrieval.  Because all of the 
records for one key value are retrieved together (and provided no other 
file items must be sorted), there is no need to sort.  It is the grouping 
by key value that is important, not the sequence of groups.  Note that 
SORTED does not check to ensure that records are in true sequence.

• For indexed files only, a sorted read on the key can be forced using the 
CHOOSE statement with no key values.  In this case, the KSAM key file 
structure eliminates the need for a SORT.

  Example

  > ACCESS indexed-file
  > CHOOSE VIAINDEX indexed-key
  > SORTED ON indexed-key-segment

• Because the limiting factor in sorting is generally physical I/Os, 
efficiency can be improved by presorting a smaller record.  This helps in 
the case where fileB has a big record (many bytes per record).

  Example

  > REQUEST ONE
  > ACCESS fileA
  > SORT ON segment
  > SUBFILE SORTKEY INCLUDE segment
  > REQUEST TWO
  > ACCESS *SORTKEY LINK segment TO segment OF fileA &
  >   LINK segment OF fileA TO segment OF fileB
  > SORTED ON segment
  > ---> other statements <---

  The above example is more efficient than

  > ACCESS fileA LINK TO fileB
  > SORT ON segment
  > ---> other statements <---

  because the sort is performed on a smaller record so that a single 
physical I/O transfers many more logical records.

• This technique is most useful in QTP, which sorts the entire 
transaction, but is also useful in QUIZ if a large record complex is to be 
sorted.  This technique not only improves speed, but also requires less 
disk space for sorting.

• The previous technique works if the subfile can include the key item 
that links to the other files.  Sometimes, however, the sort-keys are not 
the key items, and the key items cannot be conveniently added to the 
subfile.  This may be the case in a complex linkage where the subfile 
construction requires the complete ACCESS statement.  In effect, all 
linkages and record retrievals would then be done twice.

• If disk space is a concern, and the transaction is large, a three-pass 
technique can be used.  The linkage and record retrieval are done in the 
first pass, which creates two subfiles: one with the sort-keys and a 
counter, the other with the transaction.  Any selection should be done in 
the first pass.  In the second pass the sort-key subfile is sorted, and in 
the third pass it is linked to the transaction subfile by record number.

  Example

  > REQUEST ONE
  > ACCESS file1 LINK TO file2… 
  > TEMPORARY RECORD-COUNT INTEGER SIZE 4
  >   ITEM RECORD-COUNT COUNT
  > SUBFILE SORTKEY INCLUDE RECORD-COUNT, sort-key1, sort-key2…
  > SUBFILE TRANS INCLUDE file1, file2…
  > REQUEST TWO
  > ACCESS *SORTKEY
  > SORT ON sort-key1, sort-key2…
  > SUBFILE SORTKEYS INCLUDE SORTKEY
  > REQUEST THREE
  > ACCESS *SORTKEYS &
  >   LINK TO RECORD (RECORD-COUNT - 1) OF *TRANS
  > SORTED ON sort-key1, sort-key2…
  > ---> other statements <---
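
The flow of the three-pass technique can be followed in a short Python model (not PowerHouse code). It is a sketch under the assumption that the counter starts at 1 and that record numbers in the transaction subfile start at 0, which is why the lookup uses the counter minus one, as in the LINK TO RECORD line above. The function name, field names and sample data are hypothetical.

```python
def three_pass_sort(transactions, sort_key):
    """Model of sorting small key records, then fetching by record number."""
    # Pass 1: keep the full transactions in retrieval order, and build a
    # small key list holding only (sort key, record counter starting at 1).
    trans = list(transactions)
    sortkey = [(sort_key(rec), count)
               for count, rec in enumerate(trans, start=1)]

    # Pass 2: sort only the small key records, never the big transactions.
    sortkeys = sorted(sortkey)

    # Pass 3: read each transaction directly by record number (counter - 1).
    return [trans[count - 1] for _, count in sortkeys]


sample = [
    {"dept": "B", "id": 1, "payload": "x" * 500},
    {"dept": "A", "id": 2, "payload": "x" * 500},
    {"dept": "B", "id": 3, "payload": "x" * 500},
    {"dept": "A", "id": 4, "payload": "x" * 500},
]
ordered = three_pass_sort(sample, lambda rec: rec["dept"])
print([rec["id"] for rec in ordered])
```

The large payload never passes through the sort; only the key and the counter do, which is where the disk-space saving comes from.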

QTP Techniques

• Use a TEMPORARY with ITEM statements instead of a DEFINE.  A DEFINE is 
evaluated every time it is referenced, whereas the request can be 
constructed to evaluate the TEMPORARY only once.  This is useful if the 
item value does not change during the request and therefore need not be 
re-evaluated.  Such constant values include items based only on 
execution-time parameters.  If a value does not change for an entire run, 
use a GLOBAL TEMPORARY item.

• Update at control-breaks whenever possible.

• For files to be updated, make changes to record items conditional if the 
record may not actually change.  QTP updates a record only if item values 
change.

• Evaluate the structure of large runs and requests (and reports in QUIZ).  
It may be possible to combine passes that access the same file, thereby 
reducing I/O.  If this is done, ensure that protective mechanisms are not 
being bypassed, as when using one pass to update master files and the next 
pass to delete the transactions already used.  Alternatively, it may be 
more efficient to split a large request into two, using subfiles to pass 
data.