Splunk Searching

In a nutshell, Splunk stores your input data in a database, and you use the Splunk Search Processing Language (SPL) to query that database. If you have used SQL before, SPL works similarly.

Splunk stores your machine-generated data (mostly logs of all kinds) as events, each with a timestamp. Each event has its raw data and belongs to an index. An index in SPL is similar to a table in SQL: you specify the index to say where the data is stored, then apply matching clauses to get a subset of that index's data. Besides time, a Splunk event has other fields such as host, source, and sourcetype.

For example, the following query finds events stored in index nyStore. It keeps only events generated by host ny.bakery.com that were tagged at index time with source=tomcatFatLog and sourcetype=users_impression. Finally, it restricts the results to the period from the start of the day three days ago up to the beginning of today.

index=nyStore host=ny.bakery.com source=tomcatFatLog sourcetype=users_impression earliest=-3d@d latest=@d
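The time modifiers are the least SQL-like part of this query, so here is a rough Python sketch of what earliest=-3d@d and latest=@d work out to: step back three days, then snap down to midnight. The function names and the sample "now" are made up for illustration, and timezones are ignored.

```python
from datetime import datetime, timedelta

def snap_to_day(moment):
    """Round a datetime down to midnight, like the @d snap in a time modifier."""
    return moment.replace(hour=0, minute=0, second=0, microsecond=0)

def window_last_three_days(now):
    """Return (earliest, latest) for earliest=-3d@d latest=@d."""
    earliest = snap_to_day(now - timedelta(days=3))  # -3d, then @d
    latest = snap_to_day(now)                        # @d: start of today
    return earliest, latest

# Hypothetical "now", chosen only for illustration.
now = datetime(2024, 5, 10, 14, 30)
earliest, latest = window_last_three_days(now)
print(earliest, "to", latest)
```

With this sample time, the window runs from midnight on May 7 to midnight on May 10, no matter what time of day the search is run.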

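As you might already know, "AND" can be omitted in SPL: search terms placed side by side are combined with an implicit AND. The search also needs no SQL-style "WHERE" clause, because the terms themselves act as the filter.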

When you want to search two indexes together, you use an "OR" clause. Compared to the previous example, the following query gets events from two indexes, and the host field contains a wildcard.

(index=nyStore OR index=bostonStore) host=*.bakery.com source=tomcatFatLog sourcetype=users_impression earliest=-3d@d latest=@d
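To make the logic of the OR group and the wildcard concrete, here is a small Python sketch that applies the same two conditions to a handful of made-up events. The event list and the matches_store_query function are hypothetical; this only mimics the matching rules and is not how Splunk evaluates a search internally.

```python
from fnmatch import fnmatchcase

# Hypothetical events; in Splunk these would come from the indexes themselves.
events = [
    {"index": "nyStore", "host": "ny.bakery.com"},
    {"index": "bostonStore", "host": "boston.bakery.com"},
    {"index": "laStore", "host": "la.bakery.com"},
    {"index": "nyStore", "host": "ny.example.org"},
]

def matches_store_query(event):
    """Mimic (index=nyStore OR index=bostonStore) host=*.bakery.com."""
    in_either_index = event["index"] in ("nyStore", "bostonStore")  # the OR group
    host_ok = fnmatchcase(event["host"], "*.bakery.com")            # wildcard host
    return in_either_index and host_ok                              # implicit AND

selected = [e for e in events if matches_store_query(e)]
for event in selected:
    print(event["index"], event["host"])
```

Only the first two events pass: the third is in a different index, and the fourth has a host that does not match the wildcard.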


When you load data into Splunk, it can break an event down into name-value pairs if the sourcetype is one of the predefined ones. For example, if you specify an event's sourcetype as csv, Splunk can use this knowledge to parse the event into comma-separated values when storing it, so that you can later use more specific matches. For example, your input data could look like:

firstname,lastname,language,class,instructor
swim,fish,perl,2008,Peter Pan
swim,fish,sql,2001,John Smith

The following query looks up events in index nyStore whose host and sourcetype match the specified values. In addition, the csv fields firstname, lastname, language, and instructor are used to match events; the instructor!="John Smith" term excludes events whose instructor is John Smith. We only look back over the previous 3 hours.

index=nyStore host=ny.bakery.com sourcetype=csv firstname=swim lastname=fish language=perl instructor!="John Smith" earliest=-3h
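As a rough illustration of what the field matching does, the Python sketch below parses the sample csv data from above and applies the same field conditions. The matches_csv_query function is hypothetical, and the index, host, and time conditions are left out because the sample data has no such fields.

```python
import csv
import io

# The sample input from above, header line included.
raw = """firstname,lastname,language,class,instructor
swim,fish,perl,2008,Peter Pan
swim,fish,sql,2001,John Smith
"""

# Each data line becomes a dict of name-value pairs keyed by the header.
rows = list(csv.DictReader(io.StringIO(raw)))

def matches_csv_query(row):
    """Mimic firstname=swim lastname=fish language=perl instructor!="John Smith"."""
    return (
        row["firstname"] == "swim"
        and row["lastname"] == "fish"
        and row["language"] == "perl"
        and row["instructor"] != "John Smith"
    )

matched = [row for row in rows if matches_csv_query(row)]
for row in matched:
    print(row["class"], row["instructor"])
```

Only the perl class taught by Peter Pan passes; the sql row fails both the language condition and the instructor condition.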

There are other predefined sourcetypes; to name a few: mysqld, catalina, tcp, cisco_syslog, _json.