Which is faster? (long)

Wed Apr 5 05:38:19 CDT 2006

(I wrote this before I read some of the other excellent replies, so this goes over some of the same ground, but I didn't want to waste it :-))

I like eggs and have eaten (but rarely sucked) many in my life (both personal and private), however to get philosophical about PowerHouse ...

This may get long (hopefully not winded) ...

Keep in mind that PowerHouse does not presume how the relational database will do its retrieval and selection. We pass as much as possible down to the database without making assumptions. So, while the CHOOSE is intended to be used with indexes, when we're talking relational, we don't assume an index will be used. And because it's a relational database, you can use the CHOOSE with any column, regardless of whether it is an index or not.

Think of it this way. The CHOOSE statement, in it's original concept for indexed files (KSAM on HP3000) and IMAGE (a two-level networked hierarchical database) was to allow keyed or indexed retrieval. That is, it used a property of the file or database. What does that mean physically? It means that you provide a value to the file or database, and the file or database returns a record based on that value. For an indexed file, the file system reads down the index and returns the next record. For IMAGE, it might be the unique record in an IMAGE master or the next record in an IMAGE detail chain. The point is that the file or database returns the record based on the value. Plus, you can't tell an indexed file or IMAGE database to return a value based on a non-key or non-index item because the file or database system doesn't know how to do that. If you wanted that, you'd have to do it yourself (i.e. at the PowerHouse level) by doing a SELECT.

But a relational database DOES let you ask about ANY column, not just keys or indexes. The relational database quite happily processes a request on any column and returns the appropriate rows. So the CHOOSE statement, in it's more modern incarnation, is to specify a value (in the primary or driving file, dataset, or table) that the file or database system will process and return values. For a relational database system, this means ANY column.

One of the ideas behind a relational database is that processing gets pushed down to the database level. This is just one example. Since the CHOOSE value gets pushed down to the database level (it does not get "converted" to a SELECT), it is theoretically more efficient that the SELECT, which stays at the PowerHouse level.

It is still true that SELECT acts on CHOOSEn records/rows. The CHOOSE acts at the file/database level to restrict the number of records/rows returned to PowerHouse. The SELECT (both SELECT IF and SELECT file IF) act on the records/rows returned.

We boldly assume that the relational database will take the most efficient method to retrieve the rows we request. Even though we provide a VIAINDEX, we do not retrieve by index. We use that to break out columns that are index segments. Again, we assume that if we provide column information that corresponds to index segments that the relational database will use the index. But it may not. It's up to the database.

Now to the second part - how PowerHouse treats relational queries based on it's own statements.

I've mentioned before that the ACCESS statement looks at the tables specified as if they were indexed files. What I mean is that QUIZ/QTP retrieves each table individually as opposed to pushing a join to the relational database level. The CHOOSE is used to modify the SQL SELECT for the first table. The QUIZ/QTP SELECT statements are applied the same was they are for indexed files, on retrieved (CHOOSEn) rows. This is why writing your own cursor is so much more efficient.

AS for QUICK, the FILE statement is a declaration and does no retrieval (unlike an ACCESS which is both declaration and action). The retrieval generated based on the FILE statements are GET verbs which are single retrievals. The SELECT in QUICK is also applied at the QUICK level to retrieved rows.

I think that's enough for now.

Bob

-----Original Message-----
From: powerh-l-bounces+bob.deskin=cognos.com at lists.sowder.com
[mailto:powerh-l-bounces+bob.deskin=cognos.com at lists.sowder.com]On
Behalf Of fernando.olmos at hpa.com.au
Sent: April 4, 2006 6:56 PM
To: powerh-l at lists.sowder.com
Subject: RE: Which is faster?

Sorry guys and I don' mean to tell you both how to "suck eggs", but the
CHOOSE statement is *meant to be used* with indexes, and not just any
column. I was thought to believe that a CHOOSE on a non-indexed column,
PH would convert it to a "SELECT tbl IF col = value", so what is the
difference?

I think what I am getting to, with all these questions and this thread
of discussion, is how EXACTLY does PH convert ACCESS statements (or FILE
statements in QDESIGN) into SQL queries? Now this is going into the
bowels of the PH language and how the parser modules of the language
actually work, I guess.

     This message may contain privileged and/or confidential information.  If you have received this e-mail in error or are not the intended recipient, you may not use, copy, disseminate or distribute it; do not open any attachments, delete it immediately from your system and notify the sender promptly by e-mail that you have done so.  Thank you.