Fun with SQL and in line queries
July 29, 2007 – 10:40 amSo I’ve grown to not like the coding style of the guy before me at all. Some of his methods are quite, recursive and hellish when it comes to SQL. Several times, the easy way was taken rather than the correct way. To give you a bit of background, the main application I support controls at outbound call center. Agents log into the system and dial out with it. In order to track what a current agent is doing there is a table called actions. Everything from logging in, logging out, and more is logged here. Over the past 2 years, this has led to over 7 million records in the table. If you’re not a tech person, it’s about to get deep.
In order to determine if a user is logged in, the previous programmer would do the following. Query the database for all users with a type of 2 (call center agent). After that for each user in the database of type 2 (call center agents, currently ~120 agents exist), he would query the actions table for the users last action, whether it’s action type 1 (login) or 2 (logout). If the first returned row is of action type 1, the user is logged in. Lastly, the code queries the actions table to see what the last campaign a user logged into. (action type 20). After throwing in some basic code to track the queries and time of this (and thanking John Davis along the way for teaching the importance of big O notation) I learned that an average check of who’s logged in totaled around 150-200 queries along with taking 45 seconds to a minute. So this was one of my goals in Nashville last week, to write one massive query I could use to get the results I’m looking for. To assist me in the insanity, I called master of all that is shoes, Brian Schumacher. We worked together back at Irresistible Ink as data developers and since he’s gone on to be a DBA in Klamath Falls, Oregon. Well he quit last week but that’s his story not mine.
So after about 45 minutes of being on the phone and utilizing the greatness that is pastebin, we worked out a viable solution. I present to you, possibly the most complicated query I’ve written to date.
SELECT
loggedIn.lastname, loggedIn.firstname, campaigns.cpnname, DATEDIFF(ss,[actions_4].[actionwhen],getdate()) AS LoggedIn, DATEDIFF(ss,[actions_3].[actionwhen],getdate()) AS CampaignLogin
FROM
(
SELECT
employees.unqempid, employees.lastname, employees.firstname, ActionIDLoggedIn.actionID
FROM
(
SELECT
unqempid, MAX(actionid) AS actionID
FROM
actions
WHERE
(actiontypeid = ‘1′)
GROUP BY unqempid
) AS ActionIDLoggedIn
INNER JOIN
(
SELECT
unqempid, MAX(actionid) AS actionID
FROM
actions AS actions_2
WHERE
(actiontypeid = ‘2′)
GROUP BY
unqempid
) AS ActionIDLoggedOut
ON
ActionIDLoggedIn.unqempid = ActionIDLoggedOut.unqempid AND
ActionIDLoggedIn.actionID > ActionIDLoggedOut.actionID
INNER JOIN
employees ON employees.unqempid = ActionIDLoggedIn.unqempid
WHERE (employees.type = 2)
) AS loggedIn
INNER JOIN
(
SELECT
unqempid, MAX(actionid) AS actionID
FROM
actions AS actions_1
WHERE
(actiontypeid = ‘20′)
GROUP BY
unqempid
) AS maxCampaign
ON maxCampaign.unqempid = loggedIn.unqempid
INNER JOIN
actions AS actions_3 ON maxCampaign.actionID = actions_3.actionid
INNER JOIN
campaigns ON actions_3.actioninfo = campaigns.cpnid
INNER JOIN
actions AS actions_4 ON loggedIn.actionID = actions_4.actionid
ORDER BY
loggedIn.lastname, loggedIn.firstname
The end result, we’ve shortened this down to 1 query that takes 2-3 seconds. Hopefully this post can only serve as help to some programmer stumbling along a similar problem as mine.