Introduction

Calico is an open-source project that provides networking and security for Kubernetes. It’s a plug-in that implements the Kubernetes Container Network Interface (CNI).

Calico typically exposes Prometheus metrics through two main components:

  • Felix (for smaller clusters)
  • Typha (for larger clusters)
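To make "Prometheus metrics" concrete, here is a minimal sketch of reading metric names from the plain-text Prometheus exposition format. The `parse_metric_names` helper and the `SAMPLE` payload are hypothetical and used only for illustration; the sample values are made up and are not output from a real cluster.

```python
def parse_metric_names(exposition_text):
    """Return the sorted, distinct metric names in Prometheus text-format output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # A sample line looks like: name{label="value"} 123
        # Cut at the first "{" (if any), then take the first token as the name.
        head = line.split("{", 1)[0]
        names.add(head.split()[0])
    return sorted(names)


# Illustrative payload only; the values are invented for this example.
SAMPLE = """\
# HELP felix_active_local_endpoints example help text
# TYPE felix_active_local_endpoints gauge
felix_active_local_endpoints 12
# TYPE felix_int_dataplane_failures counter
felix_int_dataplane_failures 0
felix_exec_time_micros{quantile="0.5"} 1500
"""

if __name__ == "__main__":
    for name in parse_metric_names(SAMPLE):
        print(name)
```

A check like this can help confirm that a scrape target actually reports the `felix_` metrics you expect before you configure collection.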

Kubernetes 2.0 ConfigMap

Users should update the existing ConfigMap named opsramp-workload-metric-user-config, appending the application-specific fields for this workload. These fields may include authentication credentials, collection frequency, and other relevant configurations.

apiVersion: v1
kind: ConfigMap
metadata:
  name: opsramp-workload-metric-user-config
  namespace: opsramp-agent
data:
  workloads: |
    calico:
    - name: calico
      collectionFrequency: 59s
      port: 9091
      filters: # optional
        - regex: 'felix_cluster_num_hosts'
          action: exclude
        - regex: 'felix_ipset_lines_executed'
          action: include
      targetPodSelector:
        matchLabels:
          - key: k8s-app
            operator: ==
            value:
              - calico-node
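The optional filters in the ConfigMap above can be reasoned about with a small sketch. The exact include/exclude semantics are not stated in this document, so the code assumes the following: a metric is dropped if it matches any exclude rule; if at least one include rule exists, a metric is kept only when it matches one of them; and each regex must match the whole metric name. The `is_collected` function is hypothetical and is not part of the agent.

```python
import re

# Filter rules copied from the ConfigMap above.
FILTERS = [
    {"regex": "felix_cluster_num_hosts", "action": "exclude"},
    {"regex": "felix_ipset_lines_executed", "action": "include"},
]


def is_collected(metric_name, filters):
    """Decide whether a metric passes the filters (ASSUMED semantics, see lead-in)."""
    includes = [f["regex"] for f in filters if f["action"] == "include"]
    excludes = [f["regex"] for f in filters if f["action"] == "exclude"]
    # Assumption: any matching exclude rule drops the metric.
    if any(re.fullmatch(pattern, metric_name) for pattern in excludes):
        return False
    # Assumption: when include rules exist, only matching metrics are kept.
    if includes:
        return any(re.fullmatch(pattern, metric_name) for pattern in includes)
    # No include rules: keep everything that was not excluded.
    return True


if __name__ == "__main__":
    for name in ("felix_cluster_num_hosts",
                 "felix_ipset_lines_executed",
                 "felix_iptables_rules"):
        print(name, is_collected(name, FILTERS))
```

Under these assumptions, the sample filters would keep only felix_ipset_lines_executed. Verify the actual behavior against the agent before relying on it.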

Supported Metrics

The Kubernetes 2.0 Agent provides the following metrics for this workload.

| Metric | Description |
|---|---|
| felix_iptables_restore_errors | Total number of errors encountered during the iptables restore process in Felix. |
| felix_iptables_save_calls | Total number of calls made to save iptables rules in Felix. |
| felix_active_local_endpoints | Total number of active local endpoints managed by Felix. |
| felix_int_dataplane_addr_msg_batch_size | Batch size for address messages sent in the internal dataplane in Felix. |
| felix_int_dataplane_apply_time_seconds | Time (in seconds) spent applying changes to the internal dataplane in Felix. |
| felix_log_errors | Total number of error logs generated by Felix. |
| felix_bpf_happy_dataplane_endpoints | Number of endpoints in the BPF dataplane that are considered happy (in a good state) in Felix. |
| felix_host | The hostname of the machine running the Felix agent. |
| felix_int_dataplane_msg_batch_size | Batch size for messages sent within the internal dataplane in Felix. |
| felix_logs_dropped | Total number of logs dropped by Felix due to system limitations or issues. |
| felix_iptables_save_errors | Total number of errors encountered during the iptables save process in Felix. |
| felix_resyncs_started | Total number of resync operations started by Felix to reconcile state. |
| felix_route_table_list_seconds | Time (in seconds) taken to list the route table entries in Felix. |
| felix_active_local_policies | Total number of active local policies applied by Felix. |
| felix_active_local_selectors | Total number of active local selectors in use by Felix. |
| felix_calc_graph_update_time_seconds | Time (in seconds) spent calculating the graph updates for network policies in Felix. |
| felix_cluster_num_workload_endpoints | Total number of workload endpoints in the Felix-managed cluster. |
| felix_int_dataplane_failures | Total number of failures occurring in the internal dataplane in Felix. |
| felix_cluster_num_hosts | Total number of hosts in the Felix-managed cluster. |
| felix_ipsets_calico | Total number of Calico IP sets used by Felix. |
| felix_iptables_lines_executed | Total number of iptables lines executed by Felix. |
| felix_iptables_lock_retries | Total number of retries to acquire the iptables lock in Felix. |
| felix_route_table_per_iface_sync_seconds | Time (in seconds) spent synchronizing the route table per network interface in Felix. |
| felix_ipset_errors | Total number of errors encountered while managing IP sets in Felix. |
| felix_ipset_lines_executed | Total number of ipset lines executed by Felix. |
| felix_ipsets_total | Total number of IP sets managed by Felix. |
| felix_calc_graph_updates_processed | Total number of graph updates processed by Felix for network policies. |
| felix_iptables_rules | Total number of iptables rules applied by Felix. |
| felix_cluster_num_policies | Total number of policies applied in the Felix-managed cluster. |
| felix_int_dataplane_iface_msg_batch_size | Batch size for interface messages in the internal dataplane of Felix. |
| felix_iptables_restore_calls | Total number of iptables restore calls made by Felix. |
| felix_cluster_num_host_endpoints | Total number of host endpoints in the Felix-managed cluster. |
| felix_iptables_chains | Total number of iptables chains created and managed by Felix. |
| felix_exec_time_micros | Total execution time (in microseconds) for Felix operations. |
| felix_ipset_calls | Total number of calls made to manage IP sets in Felix. |
| felix_iptables_lock_acquire_secs | Total time (in seconds) spent acquiring the iptables lock in Felix. |
| felix_bpf_num_ip_sets | Number of IP sets used by the BPF dataplane in Felix. |
| felix_calc_graph_output_events | Total number of events generated by the calculation of the graph for network policies in Felix. |
| felix_cluster_num_profiles | Total number of profiles in the Felix-managed cluster. |
| felix_bpf_dataplane_endpoints | Total number of endpoints in the BPF dataplane managed by Felix. |
| felix_resync_state | Current state of the resynchronization process in Felix. |
| felix_bpf_dirty_dataplane_endpoints | Number of dirty (not synchronized) endpoints in the BPF dataplane in Felix. |
| felix_int_dataplane_messages | Total number of messages processed by the internal dataplane in Felix. |