In this post, we’re going to review simple design pattern changes in our code that can have huge impacts on performance with moderate-to-large data sets. Where we might default to using GlideRecord, we can leverage GlideAggregate instead.
The use cases here might be few & far between, but as the community-at-large starts building bigger and more complex applications on the platform (a shared space), every bit of performance counts!
Scenarios to Use GlideAggregate
Below are a few examples where leveraging GlideAggregate can be noticeably more performant than GlideRecord when working with large data sets (thousands to tens or hundreds of thousands of records).
When you need to query the Problem records for all Incidents that have a Problem
The example here is a scenario where, in Table A (Problem), we need to query all Problems that have at least one Incident (Table B). In other words, for every “Incident.problem_id” value, we need the unique set of Problems. But because we need data from those Problem records, we still have to query the Problem table.
The Trap
The trap many of us fall into is defaulting to GlideRecord – at best, building a unique array of problem IDs and then querying the problem table. At worst, making a query to the problem table for every incident we loop through!
NOTE: I am purposefully ignoring groupBy(), as GlideAggregate has other performance improvements I’ll get to later.
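For reference, the groupBy() approach being set aside here would look roughly like the sketch below – a minimal illustration only, using the same incident fields as the examples later in this post.
//Sketch only: the groupBy() variant for getting unique problem_id values
var ga = new GlideAggregate('incident');
ga.addQuery('active', true);
ga.addQuery('problem_id', '!=', '');
ga.addAggregate('COUNT');
ga.groupBy('problem_id');
ga.query();
while (ga.next()) {
    //one row per unique problem_id, with a count of matching incidents
    gs.info(ga.getValue('problem_id') + ': ' + ga.getAggregate('COUNT'));
}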
The GlideRecord Version
I’m terrible at math, so I’ll instead describe what’s going on and provide a code example.
In this scenario, we use GlideRecord to get the unique Problem set, build a unique array of Problem IDs, and then re-query the Problem table. Here, the performance can quickly get out of hand as your instance ages and grows (i.e. the number of incidents/problems increases).
Let’s say, for example, that you have 250,000 incidents and 5,000 problems, where 1,000 of those problems have incidents. If we use GlideRecord to build our array:
- We set up an empty array to store our “unique IDs”
- We make 1 DB query to get all 250,000 incidents (and ALL of their data, including thousands of characters of description text, resolution notes, etc.)
- We loop through 250,000 incidents, checking to see if incident.problem_id exists in our array of unique problem IDs
- We exit our loop, make our query for the 1,000 problem records, and loop through those 1,000 times (it’s where our data is, so no avoiding it here)
GlideRecord Example (Run time: 00:01:05.486 – 1 minute 5 seconds)
var uniqueProbs = [];
var incs = new GlideRecord('incident');
incs.addQuery('active', true);
incs.addQuery('problem_id', '!=', '');
incs.query();
while (incs.next()) {
    if (uniqueProbs.indexOf(incs.getValue('problem_id')) == -1) {
        //not in our list, add it!
        uniqueProbs.push(incs.getValue('problem_id'));
    }
}

var problems = new GlideRecord('problem');
problems.addQuery('sys_id', 'IN', uniqueProbs.join(','));
problems.query();
while (problems.next()) {
    gs.info(problems.number); //do other stuff if ya want / need
}
The GlideAggregate Version
In the GlideAggregate version, we’re going to leverage the COUNT aggregate to return only the unique problem_id values. This also has the side-effect of giving us the number of incidents with each given problem_id, though we don’t care about that in this scenario. What’s interesting is that GlideAggregate is most often seen ONLY for this purpose (Count, Avg, Max, Min, etc.), when it has some novel uses beyond that as well.
Another unique aspect of GlideAggregate is that it ONLY queries for the “Aggregate Data” and does not query for ALL data on each record. This is the difference between asking the DB to serve up potentially megabytes of data instead of kilobytes – in some cases, only single-digit kilobytes.
- In this version, we still make our unique ID array to store the problem sys_ids
- We then make 1 DB query to our 250,000 incidents. However, instead of getting back 250,000 rows, we get back one “entry” for each unique problem_id value (1,000)
- We then loop through our 1,000 results (instead of 250,000)
- We can SKIP the “do you exist” check, since our GlideAggregate took care of that
- Finally, we make our call to the Problem table to get our unique problems
GlideAggregate Example (Run time: 00:00:07.219 – 7 Seconds)
var uniqueProbs = [];
var incGA = new GlideAggregate('incident');
incGA.addQuery('active', true);
incGA.addQuery('problem_id', '!=', '');
incGA.addAggregate('COUNT', 'problem_id');
incGA.query();
while (incGA.next()) {
    uniqueProbs.push(incGA.problem_id + ''); //coerce to string or we get an object reference instead
}

var problems = new GlideRecord('problem');
problems.addQuery('sys_id', 'IN', uniqueProbs.join(','));
problems.query();
while (problems.next()) {
    gs.info(problems.number); //do other stuff if ya want / need
}
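As a side note – if you did want the per-problem incident count mentioned earlier, GlideAggregate has already computed it, and getAggregate() reads it back. A minimal sketch reusing the same query:
var incCounts = new GlideAggregate('incident');
incCounts.addQuery('active', true);
incCounts.addQuery('problem_id', '!=', '');
incCounts.addAggregate('COUNT', 'problem_id');
incCounts.query();
while (incCounts.next()) {
    //the count comes along for free with each aggregate row
    gs.info(incCounts.problem_id + ' has ' + incCounts.getAggregate('COUNT', 'problem_id') + ' active incident(s)');
}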
With one minor change to our code block, we reduce our run time by a considerable margin over GlideRecord. These performance gains will compound as more records are returned in the first query.
The Trap: Scenario B – The Worst Case
While not often, I have seen scenarios like this out in the wild. In short, we loop through 100,000 incidents, keep track of our unique problems, but query INSIDE of our incident loop for each problem individually as we find it. I’ll post a code snippet instead of explaining to save us all some time (pun intended).
var uniqueProbs = [];
var incs = new GlideRecord('incident');
incs.addQuery('active', true);
incs.addQuery('problem_id', '!=', '');
incs.query();
while (incs.next()) {
    var probId = incs.getValue('problem_id');
    if (uniqueProbs.indexOf(probId) == -1) {
        uniqueProbs.push(probId);
        //this causes a DB query for EVERY problem we find. This will absolutely wreck your performance at scale.
        var problem = new GlideRecord('problem');
        if (problem.get(probId)) {
            gs.info(problem.number); //do other stuff if ya want / need
        }
    }
}
Summary
To sum it up, using GlideAggregate to build unique arrays or simply reduce the data being returned can be a huge speed improvement.
- GlideRecord Run Time: 1 minute 5 seconds
- GlideAggregate Run Time: 0 minutes 7 seconds (yeah, almost 10 times faster! CRAZY!)