Skip to main content
Version: 3.1.0

Monitoring

As a developer, you will encounter bugs. 😲

To help you in your development journey, Luos provides monitoring and debugging mechanisms that can give you the opportunity of having clearer visibility of your network.

Luos self-healing capabilities

caution

Make sure to read and understand how to create Luos services before reading this page.

Finding, understanding, and managing bugs on multiple boards running multiple services can be hard. To make your life easier, Luos engine allows you to get some basic information about any problems in your system, allowing you to adapt to them.

Service exclusion

Luos engine includes an acknowledgment management using the SERVICEIDACK target_mode. This mode guarantees the proper reception of critical messages.

If Luos fails to reach its target using SERVICEIDACK, it will retry 10 times. If the acknowledgment still fails, the targeted service is declared excluded. Excluded services are removed from the routing table to avoid messaging from any services, preserving bandwidth for the rest of the system.

note
  • Gates services can report service exclusion through JSON.
  • Pyluos can report service exclusion through gates.

Luos engine's statistics

Luos engine monitors some values representing the sanity of your nodes and services.

note
  • Gates services can report statistics through JSON.
  • Pyluos can display statistics through gates.

Node statistics

Inside any service, you can access the host node's statistics values using the luos_stats_t structure. This structure gives you access to several values:

  • memory: Memory statisctics information.
  • rx_msg_stack_ratio: Percentage of memory occupation of Rx message management tasks.
  • engine_msg_stack_ratio: Percentage of memory occupation of Luos engine's tasks.
  • tx_msg_stack_ratio: Percentage of memory occupation of Tx message management tasks.
  • buffer_occupation_ratio: Percentage of memory occupation of the message buffer.
  • msg_drop_number: Number of messages dropped due to a lack of memory (older messages are dropped to be replaced by new ones).
  • max_loop_time_ms: Maximum time in ms between Luos_loop executions.

You can access to node statistics by using service.node_statistics.

Service statistics

In any service, you have access to statistics values using the service_stats_t structure. This structure gives you access to a specific service's statistic value:

  • max_retry: Maximum number of sent retries due to a NAK or collision with another service.

You can access node statistics by using service.statistics.

Assert

Luos engine allows you to declare a critical failure on a service to an entire network. To handle it, Luos engine exposes a LUOS_ASSERT macro that will enable you to test some conditions on it to prevent wrong values. For example:

 LUOS_ASSERT(arg_ptr != NULL);

In this case, if arg_ptr is not initialized, Luos engine will crash the entire node and send a message to all other services with the file and line where the crash occurred. All other nodes will remove all the services from the crashed node from the routing table.

note
  • Gates services can [report asserting of other nodes through JSON].
  • Pyluos can display assert through gates.