Introduction

Custom Formatting allows users to modify the log content by chaining different operations. Below are a few example scenarios where this custom formatting proves advantageous.

Parse JSON embedded within a parsed field

This section covers extracting and interpreting JSON data that is nested inside a larger text field which has already been parsed with regular expressions (regex).

Scenario 1: JSON is a part of the message

When the JSON data appears as part of a parsed message (or another capture group), as in the example below, you can use the following approach:
logfile sample:

2024-02-25 [1024] Info something important happened with details {"key1": "value1", "key2": "value2"} proceeding 

log-config.yml  
inputs: 
  test: 
    type: file 
    source: normal_tst 
    include: 
      - /var/log/normal.log 
    parser_type: "regex" 
    regex: '^(?P<timestamp>\d{4}-\d{2}-\d{2})\s\[(?P<pid>\d+)\]\s(?P<level>[^\s]*)\s(?P<message>.*)$' 
    custom_formatting: |- 
      [ 
        { 
          "type": "regex_parser", 
          "parse_from": "body.message", 
          "parse_to": "body", 
          "regex": "(?P<parsed_json>{[^}]*})" 
        }, 
        { 
          "type": "json_parser", 
          "parse_from": "body.parsed_json", 
          "parse_to": "body" 
        }, 
        { 
          "type": "remove", 
          "field": "body.parsed_json" 
        } 
      ] 
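The operator chain above can be sketched in Python. This is a hand-rolled simulation of what the three operators do, not the agent's actual implementation; the operator semantics are assumed from the config:

```python
import json
import re

# Regex from the input's parser_type: "regex"
LINE_RE = re.compile(
    r'^(?P<timestamp>\d{4}-\d{2}-\d{2})\s\[(?P<pid>\d+)\]\s(?P<level>[^\s]*)\s(?P<message>.*)$'
)
# Regex from the regex_parser operator, capturing the embedded JSON
JSON_RE = re.compile(r'(?P<parsed_json>{[^}]*})')

def format_line(line: str) -> dict:
    body = LINE_RE.match(line).groupdict()                    # initial regex parse
    body.update(JSON_RE.search(body["message"]).groupdict())  # regex_parser step
    body.update(json.loads(body["parsed_json"]))              # json_parser step
    del body["parsed_json"]                                   # remove step
    return body

line = ('2024-02-25 [1024] Info something important happened with details '
        '{"key1": "value1", "key2": "value2"} proceeding')
print(format_line(line))
```

Running this on the sample line yields a record containing timestamp, pid, level, message, key1, and key2, with the intermediate parsed_json field removed.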

In the example log line provided above, the regex is matched first, producing the following fields:

  timestamp: 2024-02-25
  pid: 1024
  level: Info
  message: something important happened with details {"key1": "value1", "key2": "value2"} proceeding
Having matched the log line, the message field now contains an embedded JSON object: {"key1": "value1", "key2": "value2"}. To extract and process this data, you need to apply a few custom formatting operators.
To pull the JSON part out of the message, you can use the (?P<parsed_json>{[^}]*}) regular expression, which captures it into a parsed_json field:

  parsed_json: {"key1": "value1", "key2": "value2"}
You can specify this regex match using the regex_parser operator as follows:

  {
    "type": "regex_parser",
    "parse_from": "body.message",
    "parse_to": "body",
    "regex": "(?P<parsed_json>{[^}]*})"
  }
After extracting the JSON with the regex_parser operator, you can parse its keys and values using the json_parser operator:

  {
    "type": "json_parser",
    "parse_from": "body.parsed_json",
    "parse_to": "body"
  }
Finally, to avoid a redundant entry in the logs, remove the parsed_json field, which was used only to hold the matched JSON. This can be achieved with the remove operator:

  {
    "type": "remove",
    "field": "body.parsed_json"
  }
Scenario 2: The entire message is JSON

If the matched group is a valid JSON structure, you can proceed as follows:

logfile sample
2024-02-25 [1026] Info {"user_name":"vivek","message":"There was an error processing the request","request_id":"1234567890","user_id":"abcdefghij"} 

log-config.yml  
inputs: 
  test: 
    type: file 
    source: normal_tst 
    include: 
      - /var/log/normal.log 
    parser_type: "regex" 
    regex: '^(?P<timestamp>\d{4}-\d{2}-\d{2})\s\[(?P<pid>\d+)\]\s(?P<level>[^\s]*)\s(?P<message>.*)$' 
    custom_formatting: |- 
      [ 
        { 
          "type": "json_parser", 
          "parse_from": "body.message", 
          "parse_to": "body" 
        } 
      ] 
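As with the first scenario, the single-operator chain above can be sketched in Python. This is a simulation of the assumed operator semantics, not the agent's actual implementation:

```python
import json
import re

# Regex from the input's parser_type: "regex"
LINE_RE = re.compile(
    r'^(?P<timestamp>\d{4}-\d{2}-\d{2})\s\[(?P<pid>\d+)\]\s(?P<level>[^\s]*)\s(?P<message>.*)$'
)

def format_line(line: str) -> dict:
    body = LINE_RE.match(line).groupdict()    # initial regex parse
    body.update(json.loads(body["message"]))  # json_parser on body.message
    return body

line = ('2024-02-25 [1026] Info {"user_name":"vivek",'
        '"message":"There was an error processing the request",'
        '"request_id":"1234567890","user_id":"abcdefghij"}')
record = format_line(line)
print(record["user_name"])  # vivek
```

Note that because both the regex parse and the json_parser write into body, the inner "message" key from the JSON overwrites the message capture group from the regex.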

In the example log line provided above, the regex is matched first, producing the following fields:

  timestamp: 2024-02-25
  pid: 1026
  level: Info
  message: {"user_name":"vivek","message":"There was an error processing the request","request_id":"1234567890","user_id":"abcdefghij"}
Since the message field is already well-formed JSON, you can pass it directly to the json_parser without any prior regex extraction:

  {
    "type": "json_parser",
    "parse_from": "body.message",
    "parse_to": "body"
  }
After parsing, the log record is displayed in the portal with the following fields (note that the inner "message" key overwrites the message capture group, since both parse into body):

  timestamp: 2024-02-25
  pid: 1026
  level: Info
  user_name: vivek
  message: There was an error processing the request
  request_id: 1234567890
  user_id: abcdefghij