

Logging in Java - with users in mind!

Logging in a software application is often considered a non-functional requirement. I suspect that categorizing logging as 'non-functionality' and assuming it will NOT have any user impact often leads to hasty and premature requirements handling.

The result might be that we focus too soon on implementation details, such as which logging framework to use and how to inject logging statements into the code, instead of first identifying the users and their needs.

In this article I will identify user needs regarding logging, map those needs to specific requirements and propose a solution implemented with one of the more popular Java logging frameworks, SLF4J with Logback.

Some basic knowledge of Java logging frameworks is required to get the most out of this article.

1. Problem analysis

"As a system owner, I want my system to audit log important events, e.g. user creation, to have key business information available. Logging should not affect user experience in any negative way."

"As a system operator, I want my system to log precise information about errors affecting the systems ability to function properly and what to do to correct them. I also want the log to contain as little as possible about normal activity in the system, everything else is just noise."

"As a system developer/tester, I want to configure the system to do trace/debug logging, so I can verify that my code is behaving as expected, at an early stage of development. In production it would be nice if the log could provide as much information as possible about what happens in the system before/during an error occurs, so I can locate the bug causing the error."

The user stories above describe some common needs for application logging. The first two user stories apply to the production environment only; the developer/tester user story applies to development, test and production environments.

The users' needs can be mapped to a set of specific (cross-)functional requirements. The requirements conflict in some ways and can be summarized as in the table below:

  #   REQUIREMENT                                                    WHO NEEDS IT
  1   Audit log important events                                     Owner
  2   Maximize performance (i.e. minimize logging)                   Owner
  3   Give precise error information for correcting problems         Operator/developer/tester
  4   Minimize logging (no noise) during normal operation            Operator
  5   Maximize logging during development                            Developer/tester
  6   Maximize logging (make noise) when in trouble, not otherwise   Developer/tester

2. Solution

How do you solve all requirements for the different users? A common approach is to add logging statements in the code, evenly distributed over a set of logging levels with increasing criticality. But that approach will not cover all the different requirements at the same time! An even distribution of logging statements will produce a lot of unnecessary noise or too little information in the logs, depending on the actual situation. You have to carefully consider WHEN to use WHAT logging level. This is my recommendation:

  ERROR   Only errors so critical that the application will malfunction
  WARN    All other errors or unexpected events
  INFO    Only lifecycle events
  DEBUG   All other events (which might be worth logging)
  TRACE   (Optionally) all method invocations with parameter and return values

The code below shows an example of this approach, using the SLF4J API with Logback. Here are the prerequisites you need to run the example:

  • Maven dependencies: org.slf4j:slf4j-api:1.7.5, ch.qos.logback:logback-core:1.0.13 and ch.qos.logback:logback-classic:1.0.13
  • Logback configuration (place a file named logback.xml somewhere on the classpath):
    <configuration>
      <turboFilter class="ch.qos.logback.classic.turbo.MarkerFilter">
        <Marker>AUDIT</Marker>
        <OnMatch>ACCEPT</OnMatch>
      </turboFilter>
      <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
        <encoder>
          <pattern>%-5marker %-5level %logger{0} - %msg%n</pattern>
        </encoder>
      </appender>
      <!-- The root level is adjusted per scenario below (INFO or TRACE) -->
      <root level="INFO">
        <appender-ref ref="STDOUT" />
      </root>
    </configuration>

If a class named UserManager is implemented like this:

private static final Logger LOGGER = LoggerFactory.getLogger(UserManager.class);
private static final Marker ENTER = MarkerFactory.getMarker("ENTER");
private static final Marker LEAVE = MarkerFactory.getMarker("LEAVE");
private static final Marker AUDIT = MarkerFactory.getMarker("AUDIT");

public static final UserManager INSTANCE = new UserManager();

private final Database database;

private UserManager() {
    LOGGER.trace(ENTER, "constructor");
    try {
        LOGGER.debug("Start creating database object");
        database = new Database("");
        LOGGER.debug("Finished creating database object");
    } catch (Exception e) {
        LOGGER.error("Failed to initialize database, this might be due to a configuration or network error", e);
        throw e;
    }
    // Lifecycle information - happens only once in this JVM - use INFO level for logging
    LOGGER.info("Initialized user manager '{}'", this.toString());
    LOGGER.trace(LEAVE, "constructor");
}

public void registerUser(String name) {
    LOGGER.trace(ENTER, "registerUser: {}", name);
    try {
        boolean userCreated = database.create(name);
        if (userCreated) {
            LOGGER.debug(AUDIT, "Successfully created user '{}'", name);
        } else {
            LOGGER.warn(AUDIT, "Failed to create user '{}'", name);
        }
    } catch (Exception e) {
        LOGGER.warn(AUDIT, "Failed to create user '{}'", name, e);
    }
    LOGGER.trace(LEAVE, "registerUser");
}
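The nested Database class used above is not shown in the article. If you want a runnable example, a minimal, illustrative sketch that is consistent with the log output and stack traces in the scenarios below could look like this (the actual persistence logic is omitted):

// Illustrative sketch only - the article does not show the Database class.
// The log statements mirror the output in the scenarios below; a real
// implementation would open a connection and execute SQL where indicated.
private static class Database {

    private static final Logger LOGGER = LoggerFactory.getLogger(Database.class);
    private static final Marker ENTER = MarkerFactory.getMarker("ENTER");
    private static final Marker LEAVE = MarkerFactory.getMarker("LEAVE");

    private Database(String connectionString) {
        LOGGER.trace(ENTER, "constructor");
        // Open the database connection here; on failure, throw e.g.
        // new RuntimeException("Failed to create connection to database")
        LOGGER.debug("Initialized database with connection string '{}'", connectionString);
        LOGGER.trace(LEAVE, "constructor");
    }

    private boolean create(String name) {
        LOGGER.trace(ENTER, "create: {}", name);
        LOGGER.debug("Start creating user '{}'", name);
        // Execute the INSERT here; return false if the user could not be
        // created, or throw if the SQL fails (as in scenario 3 below)
        LOGGER.debug("Created user '{}'", name);
        LOGGER.trace(LEAVE, "create");
        return true;
    }
}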

And then you invoke #registerUser like this:

UserManager.INSTANCE.registerUser("John Doe");

Scenario 1 - When you run this code with INFO as the configured level for the root logger, you will get the following output on the console:

INFO UserManager - Initialized user manager 'no.kantega.texts.examples.UserManager@68558426'
AUDIT DEBUG UserManager - Successfully created user 'John Doe'

Scenario 2 - If an error occurs during database initialization, you will get this:

ERROR UserManager - Failed to initialize database, this might be due to a configuration or network error
java.lang.RuntimeException: Failed to create connection to database
at no.kantega.texts.examples.UserManager$Database.<init>(UserManager.java:52) ~[classes/:na]
at no.kantega.texts.examples.UserManager$Database.<init>(UserManager.java:46) ~[classes/:na]
at no.kantega.texts.examples.UserManager.<init>(UserManager.java:36) [classes/:na]
at no.kantega.texts.examples.UserManager.<clinit>(UserManager.java:15) [classes/:na]
at no.kantega.texts.examples.LoggingExample.main(LoggingExample.java:5) [classes/:na]

Scenario 3 - And if an error occurs in the database during user creation (because the user already exists), this might be the output:

INFO UserManager - Initialized user manager 'no.kantega.texts.examples.UserManager@68558426'
AUDIT WARN UserManager - Failed to create user 'John Doe'
java.lang.RuntimeException: Failed to execute SQL for inserting user 'John Doe', user already exists
at no.kantega.texts.examples.UserManager$Database.create(UserManager.java:65) ~[classes/:na]
at no.kantega.texts.examples.UserManager.registerUser(UserManager.java:21) ~[classes/:na]
at no.kantega.texts.examples.LoggingExample.main(LoggingExample.java:5) [classes/:na]

Scenario 4 - Running the happy day scenario with TRACE as the configured level in the Logback configuration will give you this:

ENTER TRACE UserManager - constructor
DEBUG UserManager - Start creating database object
ENTER TRACE UserManager$Database - constructor
DEBUG UserManager$Database - Initialized database with connection string ''
DEBUG UserManager - Finished creating database object
LEAVE TRACE UserManager$Database - constructor
INFO UserManager - Initialized user manager 'no.kantega.texts.examples.UserManager@528c0d5d'
LEAVE TRACE UserManager - constructor
ENTER TRACE UserManager - registerUser: John Doe
ENTER TRACE UserManager$Database - create: John Doe
DEBUG UserManager$Database - Start creating user 'John Doe'
DEBUG UserManager$Database - Created user 'John Doe'
LEAVE TRACE UserManager$Database - create
AUDIT DEBUG UserManager - Successfully created user 'John Doe'
LEAVE TRACE UserManager - registerUser

Altogether, these logging outputs satisfy all requirements except #6; we will see how to deal with that later.

Requirement #1 'Audit log important events' is solved by using the Marker interface in SLF4J and a TurboFilter in Logback. Note that both the DEBUG and WARN levels are used in combination with the audit logging. This covers ALL attempts to execute important functionality, both failed and successful ones. The successful audit event is logged at DEBUG level only, because it is not a system lifecycle event and does not 'deserve' INFO level. It will stand out in the log output anyway, because of the AUDIT marker. Also note that audit events are always logged, even if the logging level of the root logger is set to OFF.
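To see this in action, a small, illustrative test (the class name is made up for this example) could turn the root logger off programmatically and still produce the audit line, because the TurboFilter chain is consulted before the level check:

import org.slf4j.LoggerFactory;

import ch.qos.logback.classic.Level;
import ch.qos.logback.classic.Logger;

// Illustrative only - demonstrates that AUDIT-marked events pass the
// MarkerFilter TurboFilter even when the root logger level is OFF.
public class AuditLoggingDemo {

    public static void main(String[] args) {
        // Silence the root logger completely
        Logger root = (Logger) LoggerFactory.getLogger(org.slf4j.Logger.ROOT_LOGGER_NAME);
        root.setLevel(Level.OFF);

        // The AUDIT-marked event should still be written, e.g.:
        // AUDIT DEBUG UserManager - Successfully created user 'John Doe'
        UserManager.INSTANCE.registerUser("John Doe");
    }
}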

Another consideration is whether audit logging should be handled by the logging framework at all, or whether audit logging is actually so important that it should be an application feature. It COULD be solved by putting the audit events in a database and providing a proper GUI for the user (the system owner). But that is another story...

Requirements #2-5 are solved by using the recommended logging levels in the right places in the code, in combination with adjusting the configured logging level. Try adjusting the logging level yourself and see the resulting output.

3. Maximize logging when in trouble, not otherwise

What about requirement #6 'Maximize logging (make noise) when in trouble, not otherwise'? One solution is to buffer the last X logging events and flush them to the log if a logging event with e.g. level ERROR occurs. An example of a log4j appender providing such buffering is PositronLogger (https://github.com/holmbech/PositronLogger). For Logback I would recommend implementing such a buffering component as an Encoder (http://logback.qos.ch/manual/encoders.html), to make it more versatile for use with all types of appenders.

If you implement such a buffering Encoder, you will probably want to make at least three factors configurable (a sketch follows after the list):

  • the buffer size
  • the minimum logging level (e.g. ERROR or WARN) at which the Encoder shall flush the buffer to the log
  • the minimum logging level (e.g. INFO) at which the Encoder shall write logging events to the log immediately
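A minimal sketch of such a buffering Encoder, assuming the Encoder API of the Logback 1.0.x versions used in this example (the Encoder API changed in later Logback versions), could look like the code below. It extends PatternLayoutEncoder to reuse the pattern support, lets AUDIT-marked events bypass the buffer so the audit log (requirement #1) is unaffected, and the property names bufferSize, flushLevel and immediateLevel are illustrative - they just have to match the element names used in the configuration further below:

package no.kantega;

import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

import ch.qos.logback.classic.Level;
import ch.qos.logback.classic.encoder.PatternLayoutEncoder;
import ch.qos.logback.classic.spi.ILoggingEvent;

// Illustrative sketch of a buffering encoder for Logback 1.0.x.
// Events below 'immediateLevel' are kept in a bounded buffer; when an event
// at 'flushLevel' or above arrives, the buffered events are written first,
// so the log shows what happened just before the error.
public class BufferingEncoder extends PatternLayoutEncoder {

    private int bufferSize = 50;
    private Level flushLevel = Level.ERROR;     // flush the buffer at this level or above
    private Level immediateLevel = Level.INFO;  // write immediately at this level or above

    private final Deque<ILoggingEvent> buffer = new ArrayDeque<ILoggingEvent>();

    @Override
    public void doEncode(ILoggingEvent event) throws IOException {
        if (event.getLevel().isGreaterOrEqual(flushLevel)) {
            flushBuffer();              // write the buffered context, then the error itself
            super.doEncode(event);
        } else if (event.getLevel().isGreaterOrEqual(immediateLevel)
                || (event.getMarker() != null && event.getMarker().contains("AUDIT"))) {
            // Normal operation: INFO and above, plus AUDIT-marked events, go straight out.
            // A real implementation might make the marker name configurable too.
            super.doEncode(event);
        } else {
            if (buffer.size() >= bufferSize) {
                buffer.removeFirst();   // drop the oldest event when the buffer is full
            }
            event.prepareForDeferredProcessing();  // capture message/MDC before buffering
            buffer.addLast(event);
        }
    }

    private void flushBuffer() throws IOException {
        for (ILoggingEvent buffered : buffer) {
            super.doEncode(buffered);
        }
        buffer.clear();
    }

    // Setters called by Joran from the configuration shown further below
    public void setBufferSize(int bufferSize) { this.bufferSize = bufferSize; }
    public void setFlushLevel(String level) { this.flushLevel = Level.toLevel(level, Level.ERROR); }
    public void setImmediateLevel(String level) { this.immediateLevel = Level.toLevel(level, Level.INFO); }
}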

A complete, production-ready implementation of a buffering Encoder is beyond the scope of this article, but the sketch above shows how it could work. Assuming such a buffering Encoder is implemented in the class 'no.kantega.BufferingEncoder', the STDOUT appender in the Logback configuration used in this example will then look like this:

<appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
  <encoder class="no.kantega.BufferingEncoder">
    <pattern>%-5marker %-5level %logger{0} - %msg%n</pattern>
    <bufferSize>50</bufferSize>
    <flushLevel>ERROR</flushLevel>
    <immediateLevel>INFO</immediateLevel>
  </encoder>
</appender>

With such a buffering Encoder in place (NB! In this case TRACE must be the configured logging level for the root logger, so that ALL events reach the encoder and can be buffered), the logging output for scenarios 1, 3 and 4 would be exactly as before. But in scenario 2, when a critical error occurs, the output will contain information from all logging levels for the steps leading up to the actual error:

ENTER TRACE UserManager - constructor
DEBUG UserManager - Start creating database object
ENTER TRACE UserManager$Database - constructor
LEAVE TRACE UserManager$Database - constructor
ERROR UserManager - Failed to initialize database, this might be due to a configuration or network error
java.lang.RuntimeException: Failed to create connection to database
at no.kantega.texts.examples.UserManager$Database.<init>(UserManager.java:52) ~[classes/:na]
at no.kantega.texts.examples.UserManager$Database.<init>(UserManager.java:46) ~[classes/:na]
at no.kantega.texts.examples.UserManager.<init>(UserManager.java:36) [classes/:na]
at no.kantega.texts.examples.UserManager.<clinit>(UserManager.java:15) [classes/:na]
at no.kantega.texts.examples.LoggingExample.main(LoggingExample.java:5) [classes/:na]

4. Wrap-up

One of the main goals of logging in a production environment is to have buggy code make a lot of 'noise' in your log, and quality code be 'quiet'. This can be achieved by:

  • Finding out who the users are and what logging functionality is important to them.
  • Carefully considering WHEN to use WHAT logging level. Especially consider reserving a specific level (e.g. INFO) for lifecycle events, and configuring the logging framework to run with that level in the production environment. All debugging events should be logged at a lower level, to keep the production log small and clean during normal operation.
  • Implementing buffering of logging events and having them flushed to the log only when a logging event at a configured log level is received. This way the logs will contain a lot of information from the code that is buggy.
  • Considering whether audit logging should be solved as an application feature and not by the logging framework at all.

Happy logging!