I am attempting to write a search that creates arbitrary "buckets" for qualifying events using a numeric code (1-5). For this particular search, I'm calculating the number of orders that individuals placed during a specific window of time, then I need to take the maximum number of orders by ANY of the individuals, divide that by 5, and then assign a classification to all of the individuals based on how many orders they placed.
Example:
User Orders Code
1 2 1
2 5 3
3 10 5
4 4 2
Because of the format that the data is in, finding out how many orders each user has is really complicated and I really don't want to have to run it twice.
To get the largest number of orders in the dataset, I run:
... | stats max(Orders) as MaxOrders
Unfortunately, once I use a stats command, I can't go back and run calculations against the dataset without doing a join or append search that would run ALL the data through the calculations to find out how many orders they have all over again.
I've tried using appendpipe:
... | appendpipe [ stats max(Orders) as MaxOrders ]
This creates a whole new event where everything is null except the "MaxOrders" column. But then I need create buckets (something like this):
... | eval Buckets=(MaxOrders/5) |
eval Bucket1=Buckets |
eval Bucket2=(Bucket1+Buckets) |
eval Bucket3=(Bucket2+Buckets) |
eval Bucket4=(Bucket3+Buckets) |
eval Bucket5=(Bucket4+Buckets)
At this point, I have the one extra event at the end of my dataset which contains the values of MaxOrders, Buckets, Bucket1, Bucket2, Bucket3, Bucket4, and Bucket5.
After the buckets are created, I need to compare Orders (for each user) against the buckets to assign the user their code, which requires an eval if statement:
... | eval Code=if(Orders<=Bucket1, 1, if(Orders<=Bucket2, 2, if(Orders<=Bucket3, 3, if(Orders<=Bucket4, 4, 5))))
But since the values for Bucket1, Bucket2, Bucket3, Bucket4, and Bucket5 only exist in that one event (not in all the events), I get 5 for all the users' Code value.
Is there a way to get the values for Bucket1, Bucket2, Bucket3, Bucket4, and Bucket5 to exist in all the records without having to do a join/append to rerun the calculation for Orders all over again?