We experienced communication issues between the forwarders and the Splunk server, say on the 2nd of November. Everything was back online after 3 days (say 5th Nov), and this resulted in the loss of data on 2 of our indexes.
The main reasons are:
1) On one stanza we have set ignoreOlderThan = 1d, and the whitelist is specific to only watch sample.log files. During the 3 days the files were rotated to sample.log.1, sample.log.2 and so on (due to the log4j.xml config).
2) The other stanza monitors a particular directory, but during those 3 days without forwarding the log files were rotated and MOVED to a different directory, and they are now compressed (.gz)! This stanza also has the ignoreOlderThan = 1d parameter set.
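For illustration, the two inputs.conf stanzas look roughly like the sketch below. The paths and index names are placeholders I made up for this post, not our real ones; only ignoreOlderThan = 1d and the sample.log whitelist reflect the actual setup.

```ini
# Stanza 1: watches only the live sample.log file (path and index are placeholders)
[monitor:///opt/app/logs]
whitelist = sample\.log$
ignoreOlderThan = 1d
index = app_index_a

# Stanza 2: watches a whole directory (path and index are placeholders)
[monitor:///opt/other/logs]
ignoreOlderThan = 1d
index = app_index_b
```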
Part of 2nd Nov (from midnight until 2am) was indexed, and nothing more after the issue started.
QUESTIONS for solution:
1) Will I solve the issue just by raising the ignoreOlderThan attribute so that it covers the number of days that have passed since the 2nd of Nov? Also, because the log files have been rotated and now have different file names (sample.log.3 and sample.log.gz, for example), will it still automagically pick up where it left off, so that it ingests the rest of the 2nd-Nov logs and the days that follow?
2) For the second stanza above, will I solve this by changing the monitor path to point to the directory where the rotated files were moved? And again, will it ingest the logs but pick up where it left off, so that it ingests the rest of the 2nd-Nov logs and the days that follow?
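To make the two questions concrete, the change I have in mind would look something like this sketch. Again the paths are placeholders, the archive directory name is hypothetical, and 10d is just an example value chosen to reach back past the 2nd of Nov. I have not tested this, and I do not know whether it is enough on its own.

```ini
# Question 1: same stanza, only ignoreOlderThan raised (10d is an example value)
[monitor:///opt/app/logs]
whitelist = sample\.log$
ignoreOlderThan = 10d

# Question 2: monitor pointed at the directory the rotated files were moved to
# (the "archive" path is a placeholder)
[monitor:///opt/other/logs/archive]
ignoreOlderThan = 10d
```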
Thank you and looking forward to your responses!