How to make fake data in Splunk using SPL
Sometimes, you need to fake something in Splunk. Maybe you are in the middle of development and don't feel like writing a real search, but you really need a number for a dashboard panel to look right. Maybe you are helping someone with a hairy regex, and you don't want to index data just to test it on your instance. Whatever the reason, here are some searches that have helped me out.
Note that when using these techniques, you are not going through the indexing and parsing pipelines, so you can't test everything.
The simplest case is a single event with a couple of hand-set fields. makeresults emits one event containing only _time, and eval adds the rest:
| makeresults | eval msg="hello", seq=1
This uses the random() function of the eval command. Unfortunately, the function does not take a range parameter; it returns a pseudo-random integer between 0 and 2^31-1 (2147483647). We can make it fit a desired range with the modulo operator. Since modulo math "wraps around", you know that the remainder will always be less than your divisor, in this case 10, so the values here fall between 0 and 9. Note that all events generated this way will have the same _time.
| makeresults count=10 | eval int=random() % 10
If you want to shift the range, add an offset to the result. This will create numbers from 1 to 10:
| makeresults count=10 | eval int=random() % 10 + 1
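More generally, a range from min to max inclusive needs a divisor of (max - min + 1) and an offset of min. As a sketch with made-up bounds of 50 and 75 (so the divisor is 26):
| makeresults count=10 | eval int=random() % 26 + 50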
This will create events containing one of the two "answers" as supplied to the if() function. Because we are using random() and modulo again, we know the remainder will be 0 through 4 (one less than the divisor, which is 5 here). Because the if() checks for equality to 1, this means that the first option will appear in approximately 20% of the events. I'm using a higher count of 100 so that the split settles closer to that ratio. This makes for better fake data.
| makeresults count=100 | eval poll=if((random()%5) == 1, "Option A", "Option B")
These are good to pass to a stats function (and then visualizations) like this:
| makeresults count=100 | eval poll=if((random()%5) == 1, "Option A", "Option B") | stats count by poll
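Since every generated event shares the same _time, a time-based chart of this data collapses into a single bucket. One possible workaround, sketched here on the assumption that one event per minute going backwards is close enough, is to number the events with streamstats and subtract from _time (the field name n is arbitrary):
| makeresults count=100 | streamstats count as n | eval _time = _time - n * 60, poll=if((random()%5) == 1, "Option A", "Option B") | timechart span=10m count by poll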
To generate more than 2 values, you'll need to use case():
| makeresults count=10 | eval num = random() % 100, error = case( num < 10, 404, num >= 10 AND num < 13, 500, num >= 13, 200 ), error_msg = case(error == 404, "Not found", error == 500, "Internal Server Error", error == 200, "OK")
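Because case() returns the value for the first condition that matches, the same split can be written with shorter conditions and true() as a catch-all. This is a sketch of an equivalent form rather than a required change; the higher count and the trailing stats are only there to make the roughly 10% / 3% / 87% distribution visible:
| makeresults count=1000 | eval num = random() % 100, error = case(num < 10, 404, num < 13, 500, true(), 200) | stats count by error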
In this example, the field resp_time can be used in ITSI as a threshold field. The modulo step yields 0 through 39, so adding 81 makes the random values range from 81 to 120. Note that the gettime macro is included with the ITSI app and not core Splunk.
| makeresults | eval resp_time = random() | eval resp_time = resp_time % 40 | eval resp_time = resp_time + 81 | fields resp_time | `gettime`
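If ITSI is not installed, dropping the macro leaves a search that runs on core Splunk. The variant below is a sketch that also collapses the three eval steps into one and, assuming the intended range is 80 through 120 inclusive, uses a divisor of 41 and an offset of 80:
| makeresults | eval resp_time = random() % 41 + 80 | fields resp_time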
Now I'm just slapping a whole event into _raw so that I can test a regular expression on it. Be careful with embedded double quotes; they will not parse well, and escaping them is awkward. Since we are faking it, it's usually OK to just remove them or change them to single quotes.
| makeresults | eval _raw = "2016-09-13 22:23:28,289 INFO [57d8b4a04210814f1d0] cached:77 - memoized decorator used on function <function getEntities at 0x106af22a8> with non hashable arguments" | rex field=_raw "\[(?<foo>.*)\]"
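The .* in that pattern is greedy, which is fine when the event has a single pair of brackets. As a sketch of what to do otherwise, the made-up event below has two bracketed tokens, and a negated character class stops the capture at the first closing bracket, so foo comes out as 57d8b4a04210814f1d0 rather than everything up to the last bracket:
| makeresults | eval _raw = "2016-09-13 22:23:28,289 INFO [57d8b4a04210814f1d0] [worker-3] cached:77 - memoized decorator used" | rex field=_raw "\[(?<foo>[^\]]+)\]"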