All,
As I understand it, the Splunk join command does not have a 'full outer join' option. I was able to look up an example that uses the append command instead, but the results are not what I expected.
I have 3 data sources. In a full join, I would expect the number of returned records (rows) to match the row count of the largest source. So if source 1 has 1,000 records and sources 2 and 3 only have 50 each, the total number of records (rows) returned would be 1,000. This is not occurring with the example provided here:
http://answers.splunk.com/answers/81741/full-outer-join
My 3 sources:
Splunk AD index (with deduped cn) - 29,860 events/stats
DB 1 (McAfee encryption) - 75,178 records
DB 2 (Marimba) - 161,791 records
Matching columns: AD=cn, DB1=ComputerName, DB2=name
If these were all SQL tables, I would expect a FULL OUTER JOIN to return 161,791 records, as that is the largest table (assuming every computer name in the two smaller sources also appears in it). As noted earlier, this does not happen with the example linked above.
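To make the expected count concrete, here is a minimal Python sketch of a full outer join on a single key. The host names are made up for illustration and are not my real data. It assumes each key appears only once per source; under that assumption the joined row count is the number of distinct keys across all sources, which equals the size of the largest source only when the smaller sources contain no keys that the largest one lacks.

```python
# Hypothetical key sets standing in for the three sources (not real data).
ad_keys = {"pc01", "pc02"}                        # AD: cn
mcafee_keys = {"pc02", "pc03"}                    # DB 1: ComputerName
marimba_keys = {"pc01", "pc02", "pc03", "pc04"}   # DB 2: name

# A full outer join on the key yields one row per distinct key found in any source.
full_outer_keys = ad_keys | mcafee_keys | marimba_keys

# Size of the largest single source, for comparison with the joined count.
largest_source = max(len(ad_keys), len(mcafee_keys), len(marimba_keys))

print(len(full_outer_keys), largest_source)
```

In this toy data the two numbers match because the smaller sets are subsets of the largest one; adding a key such as "pc05" to ad_keys only would push the joined count above the largest source.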
I'm using this query:
index=ad_test
| rename cn as ComputerName
| dedup ComputerName
| stats count by ComputerName distinguishedName
| append [ | dbquery mcafee "SELECT * FROM EPOComputerProperties" | stats count by ComputerName CPUType]
| append [ | dbquery marimba_2014 "SELECT * FROM inv_machine" | rename name AS ComputerName | stats count by ComputerName serial_number]
| stats values(serial_number) as serial_number values(distinguishedName) as distinguishedName values(CPUType) as CPUType by ComputerName
The returned result count: 129,645
I'm trying to understand why this is happening, and what I need to do to pull a complete record set from these 3 sources as a full outer join.
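For clarity, this is how I understand the append plus stats pattern in the query above is meant to work, written as a rough Python sketch rather than SPL. The rows and the merge_by_key helper are hypothetical and only illustrate the idea: append is modelled as simple concatenation of the three result sets, and the final stats values(...) by ComputerName as grouping on the key and collecting the distinct values of each remaining field. It does not model any Splunk-side limits or behavior beyond that.

```python
from collections import defaultdict

# Hypothetical rows standing in for the three result sets (not real data).
ad_rows = [{"ComputerName": "pc01", "distinguishedName": "CN=pc01,DC=example"}]
mcafee_rows = [{"ComputerName": "pc01", "CPUType": "x64"},
               {"ComputerName": "pc02", "CPUType": "x86"}]
marimba_rows = [{"ComputerName": "pc02", "serial_number": "SN2"},
                {"ComputerName": "pc03", "serial_number": "SN3"}]

def merge_by_key(rows, key="ComputerName"):
    """Group rows by key and collect the distinct values of every other field."""
    merged = defaultdict(lambda: defaultdict(set))
    for row in rows:
        for field, value in row.items():
            if field != key:
                merged[row[key]][field].add(value)
    # Convert the nested sets to sorted lists for stable, readable output.
    return {k: {f: sorted(v) for f, v in fields.items()}
            for k, fields in merged.items()}

# "append" is modelled as list concatenation; the final "stats ... by" as merge_by_key.
result = merge_by_key(ad_rows + mcafee_rows + marimba_rows)
print(len(result))
```

With these toy rows every key from every source survives the merge, including keys that appear in only one source, which is the full outer join behavior I am after.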
Thanks in advance.