Channel: Latest Questions on Splunk Answers

Multi-Source Full Outer Join using Append


All,

As I understand it, the Splunk JOIN command does not have a 'full outer join' option. I was able to look up an example that uses the APPEND command instead, but the results are not what I expected.

I have 3 data sources. In a full outer join, the number of returned records (rows) should be at least the row count of the largest source. So if source 1 has 1000 records and sources 2 and 3 have only 50 each, at least 1000 rows would be returned. This is not occurring using the example provided here:

http://answers.splunk.com/answers/81741/full-outer-join

My 3 sources:

Splunk AD index (with deduped cn) - 29,860 events/stats

DB 1 (McAfee encryption) - 75,178 records

DB 2 (marimba) - 161,791 records

Matching columns: AD=cn, DB1=ComputerName, DB2=name

If these were all SQL tables, a FULL OUTER JOIN would return at least 161,791 records, since every row of the largest table must appear in the result. As noted earlier, this does not occur with the example linked above.

I'm using this query:

index=ad_test 
| rename cn as ComputerName 
| dedup ComputerName 
| stats count by ComputerName distinguishedName 
| append [ | dbquery mcafee "SELECT * FROM EPOComputerProperties" | stats count by ComputerName CPUType] 
| append [ | dbquery marimba_2014 "SELECT * FROM inv_machine" | rename name AS ComputerName | stats count by ComputerName serial_number] 
| stats values(serial_number) as serial_number values(distinguishedName) as distinguishedName values(CPUType) as CPUType by ComputerName

The returned result count: 129,645
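
For reference, this is the behavior I expect from the append-then-stats pattern: one output row per distinct ComputerName across all sources. A toy sketch using `makeresults`, where every host name and field value is made up for illustration and none comes from the real data:

```
| makeresults 
| eval ComputerName="pc1", distinguishedName="CN=pc1" 
| append [ | makeresults | eval ComputerName="pc1", CPUType="x64"] 
| append [ | makeresults | eval ComputerName="pc2", serial_number="S2"] 
| stats values(serial_number) as serial_number values(distinguishedName) as distinguishedName values(CPUType) as CPUType by ComputerName
```

Here the three single-row sources collapse to two rows: pc1 carries both distinguishedName and CPUType, and pc2 carries only serial_number.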

I'm trying to understand why this is happening, and what I need to do to pull a complete record set from these 3 sources as a full outer join.
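
One way to narrow this down might be to tag each row with its origin before the final stats and then count how many ComputerName values come from each combination of sources. This is only a sketch that assumes the same index and dbquery connections as my query above; the `src` and `sources` fields and their values are labels introduced here for illustration:

```
index=ad_test 
| rename cn as ComputerName 
| stats count by ComputerName 
| eval src="ad" 
| append [ | dbquery mcafee "SELECT * FROM EPOComputerProperties" | stats count by ComputerName | eval src="mcafee"] 
| append [ | dbquery marimba_2014 "SELECT * FROM inv_machine" | rename name AS ComputerName | stats count by ComputerName | eval src="marimba"] 
| stats values(src) as sources by ComputerName 
| eval sources=mvjoin(sources, "+") 
| stats count by sources
```

The counts in the output should add up to the total number of distinct ComputerName values. If the rows containing "marimba" add up to far fewer than 161,791, that source is being cut short before it reaches the final stats; `append` subsearches are subject to Splunk's subsearch limits on result rows and run time, so that may be worth checking.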

Thanks in advance.

