Top Ranked Segmentation in Sql Server 2000

Google Buzz

While doing some SQL for a BI report application, I needed to do a sort of complex view of the data, where users could segment their “clients” both by setting level of any variable (number of daily sells, number of zero selling, etc ..) or by defining rank levels like, the 1000 top sellers(/any other fact variable), then the 20,000 top sellers(/any other fact variable), and so on.
Oh.. and the application is upon an almost relational database, which only make things harder (for reporting in general).
On the first example, it’s quite simple you just a select inside another one. On the inner Select, you gather the values, on the outer Select, you group them according to your levels. Of course, if the number of segments is variable, you probably have to do a dynamic query (creating the query on-the-fly) or you could just use another table to put the name of the segment and the limit. That will do the trick.
I my case and most cases, this isn’t just this easy, because the gathering of values is also dynamic, depends of many filters, grouping options, aggregating options and most of all, the variable to apply the segmentation. But that are don’t really matter for this post.

The problem with all of this Report is the second type, the Top raking Report.
I was stuck with Sql Server 2000.. The problem was the MicroSoft Sql Server part, but the 2000 one!
Although version 2005 (and now 2008) already has the ROW_NUMBER clause and you can just order the inner Select and use the row_number field to group by on the outer Select, on sql server 2000 you can’t do none of it!
There are some options that you can use to do simulate row_number. One of then is :

Select avg(sales) , (CASE WHEN id <= [Limit 1] THEN 'A' ELSE (CASE WHEN id > [Limit 3]  THEN 'C' ELSE 'B' END) END) as my_group,

from 	(Select (Select count(*) from (Select [ big query with all client info aggregation process]) ClientAggregateData as a2
	 where a2.LName <= a1.LName) as my_row_numb, a1.sum(sales)
	 from (Select [ big query with all client info aggregation process]) ClienteAggregateData as e1) rank_table

group by my_group

(this query gives you the avg of the total sum you your clients segmented by three levels)

But as you can see above you have to have your table with the gathered value twice, in this case, a1 and a2. In some cases, like mine, creating this table only once is just enough of time consuming and the application must work as real-time as possible. You could save our “ClienteAggregateData” table as a temporary table and us it twice. Yes, you could and that is a perfectly good solution. It would look like this:

Select * INTO #ClienteAggregateData [the rest of it]

Select avg(sales) , (CASE WHEN my_rank <= [Limit 1] THEN 'A' ELSE (CASE WHEN my_rank > [Limit 3]  THEN 'C' ELSE 'B' END) END) as my_group,

from (Select (Select count(*) from #ClienteAggregateData as a2
    where a2.LName <= a1.LName) as my_rank, a1.sum(sales)
    from #ClienteAggregateData as e1) rank_table

group by my_group

DROP TABLE #ClienteAggregateData

This is a nice solution, but there are several others. I went with that one that seamed more logic and sound to me. The simples solution that I recommend for simulating ROW_NUMBER clause was to create a temporary table with the aggregated data, like before, but already ordered by the field with want to top rank and insert an Id field. Then, just do another select and group it by the rank and finally drop the temporary table (you only have to do this if you want it to drop before you close the connection, because a #temp table is always drop on connection closing).
Here is the query for the same example:

Select IDENTITY(int, 1,1) my_rank, * INTO #ClienteAggregateData [the rest of it with the correct Order By]

Select avg(sales) , (CASE WHEN my_rank <= [Limit 1] THEN 'A' ELSE (CASE WHEN my_rank > [Limit 3]  THEN 'C' ELSE 'B' END) END) as my_group,

from  #ClienteAggregateData rank_table

group by my_group

DROP TABLE #ClienteAggregateData

Hope this helps you in some way :)
Have better way of doing this. Do tell!



 
Was it any good?

Add to Technorati Favorites

AddThis Social Bookmark Button

Add 
to Mixx!