What do we expect from logs
A support team will look for relevant events or user actions that can explain a troublesome situation. Here is an example:09:50:05 [INFO] [AppOperation] Mix configured with CMP=Nitrogen, CST=SynthAir, C=245,8 ppm 09:50:05 [INFO] [Solver] Computed flows for (C=2445,8 pp) : MFC1=1200 mlmin, MFC2=2450 mlmin 09:50:06 [INFO] [AppCalibration] Compensation of MFC1=1200 mlmin to MFC1=1202 mlmin 09:50:06 [INFO] [AppCalibration] Compensation of MFC2=2450 mlmin to MFC1=2449 mlmin 09:50:06 [INFO] [Peripherals] Setting MFC1=1200 mlmin 09:50:06 [INFO] [Peripherals] Setting MFC2=2449 mlmin 09:50:06 [INFO] [Peripherals] Setting Hardware state to Mix2Levels 09:50:06 [INFO] [AppOperation] Mix started 09:52:08 [ERROR] [AppAlarm] (Stability Alarm ON) MFC1 flow is not stable (Measured=1015 mlmin)
In this scenario, the user complains about poor results on his device. We suppose that the Alarm was not signaled on a visual indicator to inform the user. With this bunch of logs, developers can immediately:
- Point the origin of the issue
- Get the conditions of the operation that lead to the issue and try to run the same experiment at his office
It is essential for developers to have the following information in logs:
- Timestamp
- Log level : developers will filter the errors or warnings first to find an obvious reason for the malfunction
- Synthetic information : Long phrases are useless: a log must go the point, there is no litterature trophy for well-written logs. Most of the time, use action verbs and parameters.
- Workable data : the parameters listed in our logs specify numerical values with a unit. It seems obvious to say that without the unit, the developer cannot investigate. Plus, the location where the error occured is necessary. In our example, the name of classes are displayed. Either do this or use functional area names logged (ex: Config, Security, Alarms, ...).
Other things to have in mind:
- Keep files small : do not go over a few MB for text files (I do not include DB here). Opening and parsing a heavy file is long and inefficient. Use several small rolling files instead.
- Keep short history : there is no need to keep traces from last week if nothing happened there. This is closely related to file size for text logs.
What to log
- Application events: notifications, alarms, state changes, ...
- Operation-related information : settings update, user actions, ...
What NOT to log
- Periodic data : if live values need to be logged regularly, put them in a separate log file and preferably go for a DB that can handle large amount of entries better. Developers can then refer to timestamps for investigations.
- GUI input errors : if a user enters invalid value, it is not an error for the application, it is an error for the form. As long as the frontend does not interact with the application, whatever happens between the user and the form does not have an impact on the application.
Where to log
A software architecture is built with modules interacting with each others. The business logic of the application is dispatched between these modules. So basically, each functional area of the application shall manage its own logs. To do things neatly, centralize the logging features in a service shared by all components.Do not repeat information. Take the following example: A calls B, B fails. This is the type of log that we should expect from it:
10:00:01 [INFO] [A] Action A1 started 10:00:01 [INFO] [B] Setting xxx=1234 10:00:01 [ERROR] [A] Action A1 failed (Code=Invalid Setting Error)
and not something like :
10:00:01 [INFO] [A] Action A1 started 10:00:01 [INFO] [B] Setting xxx=1234 10:00:01 [ERROR] [B] Given value is invalid 10:00:01 [ERROR] [A] Action A1 failed (Code=Invalid Setting Error)
The error occured at a known place, so either log it from there or from the client (recommended).
Debugging information
Take the following example of code:void ExecuteUserRequest() { env = GetEnvironmentValue(); if(env < THRESHOLD) { doA(); } else { doB(); } }Here, myFunc behaves depending on an external factor. Here, we can chose to log the action (doA or doB) with an INFO level and add debugging information to the user to help him understand why this precise action occured :
void ExecuteUserRequest() { env = GetEnvironmentValue(); LogDebug("env = %d", env); if(env < THRESHOLD) { LogInfo("Executing A"); doA(); } else if(env >= THRESHOLD) { LogInfo("Executing B"); doB(); } }Generally speaking, Debug level gives execution details and exposes implementation data to the developer.
Memo
Here are some basic rules that I use to write my application logs.Code type | Log level |
---|---|
user input error (GUI) | No log. |
Frequently accessed functions (ex: sensor value read) | Debug or no log if we log in a text file. Use a DB if possible. |
Implementation detail (subroutines, intermediate values,...) | Debug |
User action or application event | Info |
Alarm or abnormal situation that does not result in an error immediately | Warn |
Operation failure on user action | Warn if the error is normal given the context (the product acts as specified). Error otherwise. |
Application error (internal non-critical errors or errors managed by the application) | Error |
Critical non recoverable error (unmanaged exception, errors that lead to a non-functioning application) | Fatal |
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.