Hi guys,
I've got a bit of a poser here. I'm trying to calculate the average capacity required for a group of virtual machines over a period of time. I can get a basic picture of it from an hourly audit of the machines in use, using the events and search below.
Events, hourly audit of machines in use:
2014-01-25T23:33:56 id=virtualmachine1,name=ExampleServer1,size=small,state=running
2014-01-25T23:33:56 id=virtualmachine2,name=ExampleServer2,size=small,state=running
From these I can get an idea of the average number of machines running per day, by size, with this search over the previous month:
index=* sourcetype=machineaudit state=running
| stats count as day_hours by date_mday id size
| dedup date_mday id
| stats count as machines avg(day_hours) as avg_day_hours by date_mday size
| stats avg(machines) as avg_machines min(machines) as min_machines max(machines) as max_machines stdev(machines) as std_deviation avg(avg_day_hours) as hours by size
| eval avg_machines=floor(avg_machines)
However, there is one usage pattern this does not take into account: a machine is terminated and a new one of the same size is started later. For example, a small machine is started at 09:00 and terminated at 11:00, then a new one is created at 12:00 and runs until 19:00. The two never run at the same time, so together they could use a single block of resource.
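To make that case concrete, here is a rough sketch of one direction, not a tested solution. It assumes the audit really does log exactly one event per running machine per hour, so the number of distinct ids seen in an hour is the number of blocks in use during that hour. The field names running, day and blocks are my own.

```
index=* sourcetype=machineaudit state=running
| bin _time span=1h
| stats dc(id) as running by _time size
| eval day=strftime(_time, "%Y-%m-%d")
| stats max(running) as blocks by day size
| stats avg(blocks) as avg_blocks min(blocks) as min_blocks max(blocks) as max_blocks stdev(blocks) as std_deviation by size
```

For the example above, no hour between 09:00 and 19:00 has more than one small machine running, so blocks would come out as 1 for that day rather than 2. Hours with no running machines produce no rows, which does not affect the daily maximum.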
So I can gather the first and last event times for each machine to see when it was started and terminated:
index=* sourcetype=machineaudit state=running | stats count as day_hours earliest(_time) as first latest(_time) as last by date_mday id size
Now I have a table with start and end times for each machine. What I still need is to compare those fields across machines of the same size, find the cases where one machine's terminate time is earlier than another machine's start time, and combine the day_hours of those two machines so they count as one block.
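As a sketch of that comparison, something like the following might flag a machine that starts after the previous machine of the same size on the same day has ended. The names start_time, end_time, prev_end and reusable are my own, and I use min and max of _time so the result does not depend on the order events come back in. It only looks at the immediately preceding machine, so it will miss reuse of a block that was freed earlier by some other machine.

```
index=* sourcetype=machineaudit state=running
| stats count as day_hours min(_time) as start_time max(_time) as end_time by date_mday id size
| sort 0 date_mday size start_time
| streamstats current=f window=1 last(end_time) as prev_end by date_mday size
| eval reusable=if(isnotnull(prev_end) AND prev_end < start_time, 1, 0)
```

Rows with reusable=1 are the ones that could share a block with the machine before them. The first machine of each day and size has no prev_end and is left at 0.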
More than likely that didn't make a lot of sense! I've tried looking into overlap and concurrency but I'm not making a lot of headway.