SQL Bar Charts
April 13, 2011 Hey, Ted
A while back I had a problem that appears to be quite common, but just rare enough that it’s a little challenging when it hits you. Often in our roles as database administrators we need to determine various figures, such as how many key values in a table are used up in case we need to extend the field. Or sometimes we just to provide a statistical representation of data. What I needed was to produce a histogram using SQL.
Let’s say you have an array of sample statistics and you want to break them into groups of an arbitrary size. Then you want to count the number in each group. Using the modulus function (remainder after division) in SQL you can determine the start value of each group as:
value - mod(value, group_size)
Then you only need to count them and group them.
select value-mod(value,500), count(*) from MyFile where My_Selection_Criteria_is_true group by value-mod(value,500)
This produced a result in my case like:
Limit Count ===== ===== 0 480 500 290 1,000 134 1,500 247 2,000 268 2,500 287 3,000 259 3,500 230 4,000 232 4,500 221
This is a pretty darn cool use of the mod() function. It could be glorified even more being wrapped in a procedure that generically opens a passed file, field, and group size.
Daz’s interesting idea–which included the word histogram–got me to thinking.
Everybody likes pictures. Businesses take the numbers IT gives them and change them into all sorts of pretty graphs and charts. Why can’t SQL do the same, albeit it at a rudimentary level, in order to call attention to certain rows in the result set?
Here’s a report that shows customers and how much they spent with us.
CUSTOMER AMOUNT ============ ========== Bill 489.00 Bob 1,276.00 Sue 1,750.00 Jim 1,355.00 Henry 67.00 Kyle 3,000.00 Betsy 2,600.00 Joe 1,688.00 Hector 1,250.00 Ray 749.00 Flossie 2,610.00 Pete 1,399.00
Nothing stands out, so let’s look at it another way. Here I print an X for each 500 units of currency they spent with us. The amounts are rounded, so one X really means 250 to 749 currency units. Also, I used a different way of calculating from the one Daz used, just to show another way to group the data.
CUSTOMER AMOUNT BAR ============ ========== ======= Bill 489.00 X Bob 1,276.00 XXX Sue 1,750.00 XXXX Jim 1,355.00 XXX Henry 67.00 Kyle 3,000.00 XXXXXX Betsy 2,600.00 XXXXX Joe 1,688.00 XXX Hector 1,250.00 XXX Ray 749.00 X Flossie 2,610.00 XXXXX Pete 1,399.00 XXX
The highlights and lowlights stand out, don’t they?
Here’s another view of the same data. This time I count the number of customers in each range of sales values.
with Summary as (select int(round((amount/500),0)*500) as Category from salesdata) select Category, count(*) as Count, repeat ('X', count(*)) as Bar from Summary group by Category order by Category CATEGORY COUNT BAR ========== ===== ========== 0 1 X 500 2 XX 1,500 5 XXXXX 2,000 1 X 2,500 2 XX 3,000 1 X
It’s obvious that most customers spend around 1,500 units of currency with us.
There are probably a lot of other things we could do with SQL. We just need a little more imagination.