Datasinks

This section describes all available datasinks. The MyARM agent implementation and the myarmdaemon use datasinks to export measured ARM data to a configured destination.

Null datasink

The null datasink is a no-operation datasink. Suppose you have set up a complete environment with ARM instrumentation enabled, but in some circumstances you want no transactions to be measured. Configure the null datasink and no data will be recorded at all:

# agent uses a datasink named noop
agent.sink.name = noop
# noop defines a null datasink type
noop.type = null

Database datasinks

The following database datasinks are supported in this edition:

SQLite datasink

The sqlite3 datasink uses the SQLite database as described in section SQLite database.

MySQL datasink

The mysql datasink uses the MySQL database as described in section MySQL database.

Archive datasink

The archive datasink stores measured ARM data in a set of binary files. File names include a timestamp based on the ARM data.

In addition to the standard datasink properties and the basic archive properties, the archive datasink uses the following properties:

<name>.file
sets the sqlite3 archive database file used within the archive datasink.

Default is ${MYARM_VARLIB_DIR}/myarm_archive.db.

<name>.workdir
sets the working directory for the archive datasink. During execution, files are created in this directory and filled with appropriately timed ARM data. Each file is moved to the directory configured with the <name>.archivedir property if one of the following conditions is met:
  1. The file size exceeds the size configured with the <name>.max_size property
  2. The number of seconds since file creation exceeds the configured <name>.move_interval value

Default is ${MYARM_VARLIB_DIR}/archive/work/.
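The two move conditions above can be sketched as a small predicate. This is only an illustration, not MyARM code: the WorkFile record and the should_move function are hypothetical names, and the strict "greater than" comparisons are an assumption based on the word "exceeds".

```python
from dataclasses import dataclass


@dataclass
class WorkFile:
    """Hypothetical bookkeeping record for one file in the work directory."""
    size_bytes: int    # bytes written so far
    created_at: float  # creation time in seconds since the epoch


def should_move(f: WorkFile, now: float, max_size: int, move_interval: float) -> bool:
    """Return True when either of the two documented conditions holds."""
    too_big = f.size_bytes > max_size               # condition 1: <name>.max_size
    too_old = (now - f.created_at) > move_interval  # condition 2: <name>.move_interval
    return too_big or too_old
```

With the defaults (1MiB and 1 minute), a file that stays small is still moved once it is older than 60 seconds.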

<name>.archivedir
Base archive directory to move closed files to.

Default is ${MYARM_VARLIB_DIR}/archive/final/.

<name>.max_size
Maximum number of bytes for an archive file. If this limit is reached the archive file is closed and a new file is created.

Default is 1MiB (1 mebibyte), minimum is 128KiB (128 kibibytes), maximum is 2GiB (2 gibibytes).
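To make the size notation concrete, the following sketch converts a value such as 1MiB into bytes and checks it against the documented range. The function names are hypothetical. Treating a bare number as bytes and rejecting out-of-range values (rather than clamping them) are assumptions; this section does not say how MyARM itself handles either case.

```python
import re

# Binary multipliers for the suffixes used by <name>.max_size.
_UNITS = {"": 1, "KiB": 1024, "MiB": 1024 ** 2, "GiB": 1024 ** 3}

MIN_SIZE = 128 * 1024    # documented minimum: 128KiB
MAX_SIZE = 2 * 1024 ** 3  # documented maximum: 2GiB


def parse_size(text: str) -> int:
    """Convert '1MiB', '128KiB' or a bare byte count into a number of bytes."""
    m = re.fullmatch(r"\s*(\d+)\s*(KiB|MiB|GiB)?\s*", text)
    if m is None:
        raise ValueError(f"unrecognized size: {text!r}")
    return int(m.group(1)) * _UNITS[m.group(2) or ""]


def check_max_size(text: str) -> int:
    """Parse a max_size value and reject it if outside the documented range."""
    size = parse_size(text)
    if not MIN_SIZE <= size <= MAX_SIZE:
        raise ValueError(f"max_size out of range: {text!r}")
    return size
```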

<name>.max_open_files
specifies the maximum number of open files. If more files are needed the least recently used file is closed.

Default is 60, minimum is 30, maximum is 300.
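The "least recently used file is closed" behaviour can be pictured with a small cache of open handles. This is a sketch under assumptions: LruFileCache and its opener callback are hypothetical names, and the actual MyARM bookkeeping may differ.

```python
from collections import OrderedDict
from typing import Any, Callable, List


class LruFileCache:
    """Keeps at most max_open handles; closes the least recently used one."""

    def __init__(self, max_open: int, opener: Callable[[str], Any]):
        self._max_open = max_open
        self._opener = opener
        self._open: "OrderedDict[str, Any]" = OrderedDict()

    def get(self, path: str) -> Any:
        if path in self._open:
            # Mark the handle as most recently used.
            self._open.move_to_end(path)
            return self._open[path]
        if len(self._open) >= self._max_open:
            # Evict and close the least recently used handle.
            _, oldest = self._open.popitem(last=False)
            oldest.close()
        handle = self._opener(path)
        self._open[path] = handle
        return handle

    def open_paths(self) -> List[str]:
        """Paths of the open handles, least recently used first."""
        return list(self._open)
```

For real files the opener could be, for example, lambda p: open(p, "ab").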

<name>.move_interval
time interval in seconds used to move archive files from workdir to the <name>.archivedir.

Default is 1m (1 minute), minimum is 30s (30 seconds), maximum is 5m (5 minutes).

<name>.zstandard
compresses archive files using the zstandard library. The suffix .zst is appended to the file name.

Default is false.

Configuration example

daemon.sink.name = sink_archive
# set up sink type
sink_archive.type = "archive"
# set up sqlite3 archive database used within archive datasink
sink_archive.file = "$MYARM_VARLIB_DIR/myarm_archive.db"
# set up work directory for archive datasink
sink_archive.workdir = "$MYARM_VARLIB_DIR/archive/work/"
# base archive directory to move closed files to
sink_archive.archivedir = "$MYARM_VARLIB_DIR/archive/final/"
# maximum number of bytes for an archive file. If this limit
# is reached the archive file is closed and a new file is created
sink_archive.max_size = 1MiB
# maximum number of open files; if more are needed the least
# recently used file is closed
sink_archive.max_open_files = 60
# time interval used to move archive files to the archivedir
sink_archive.move_interval = 1m
# compress archive files using zstandard library
sink_archive.zstandard = true

File datasink

With the file datasink, all ARM data is stored in a flat file. This file can be read by the myarmdaemon to forward the ARM data to another destination, such as a TCP connection or a real database. The main purpose of this datasink, in conjunction with the myarmdaemon, is to decouple writing ARM data to the database from the instrumented application by using simple file I/O.

In addition to the standard datasink properties, the file datasink uses the following properties:

<name>.workfile
specifies the complete file name (including directory names) for the work file. A unique ID generated by the current process is appended to make the file name unique. The ARM data is written to this file, and when it is closed it is moved to the configured directory.

Default is /tmp/myarmfile.data.

<name>.rolling.seconds
specifies the number of seconds after which a new workfile will be used. The old workfile will be moved to the configured directory.

Default is 1m (1 minute), minimum is 5s (5 seconds) and maximum is 1h (1 hour).

<name>.rolling.size
specifies the maximum size in bytes of the workfile. If the workfile grows beyond this size, a new workfile is opened and the old workfile is moved to the configured directory.

Default is 128 KB. Minimum is 32 KB and the maximum is 128 MB.

<name>.diskusage.max_used
specifies the maximum total size in bytes of all ARM data files in the basic.filestorage.reader.directory directory. If this limit is reached, any new ARM data files are dropped and an error is reported.

Default is 100 MB. Minimum is 128 KB; there is no maximum limit.

<name>.diskusage.min_free
specifies the minimum free space in bytes on the file system that holds the basic.filestorage.reader.directory directory. If this limit is reached, any new ARM data files are dropped and an error is reported.

Default is 200 MB. Minimum is 100 MB; there is no maximum limit.
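The two disk usage limits can be read as one combined check that runs before a new ARM data file is accepted. The sketch below is an illustration only: the function names are hypothetical, only files directly inside the directory are counted, and treating "reached" as "equal to or beyond the limit" is an assumption.

```python
import os
import shutil


def directory_bytes(directory: str) -> int:
    """Total size of the regular files directly inside the directory."""
    with os.scandir(directory) as entries:
        return sum(e.stat().st_size for e in entries if e.is_file())


def may_store(used: int, free: int, max_used: int, min_free: int) -> bool:
    """True while neither the max_used nor the min_free limit is reached."""
    return used < max_used and free > min_free


def check_directory(directory: str, max_used: int, min_free: int) -> bool:
    """Apply both limits to a real directory and its file system."""
    free = shutil.disk_usage(directory).free
    return may_store(directory_bytes(directory), free, max_used, min_free)
```

When the check returns False, the behaviour described above applies: the new ARM data file is dropped and an error is reported.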

basic.filestorage.reader.directory
specifies the directory to move closed files to. See basic file storage configuration section.

A sample file datasink configuration can look like:

# specify datasink name for the agent
agent.sink.name = sink_file
# datasink file type
sink_file.type = file
# set up work file for file datasink
sink_file.workfile = /opt/myarm/var/myarmfile.data
# time interval in seconds to use a new work file and move
# the old to the myarmdaemon directory.
sink_file.rolling.seconds = 1m
# number of maximal bytes for the work file. If this limit is
# reached the current work file is closed, moved to the
# myarmdaemon directory and a new work file is created.
sink_file.rolling.size = 128KB
# only use up to 100 MB of ARM data files, if this limit 
# is reached new ARM data files are dropped.
sink_file.diskusage.max_used = 100MB
# at least 200 MB of disk space should be left free
sink_file.diskusage.min_free = 200MB

TCP datasink

The tcp datasink is used in conjunction with the myarmdaemon tool. It connects to the myarmdaemon program and sends all ARM data to the myarmdaemon process through a TCP socket connection. Note that the tcp datasink periodically retries connecting to the myarmdaemon as long as there is ARM data available to send. If there is no ARM data to send for at least the <name>.connection.idle time, the connection is closed.
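The connection behaviour described above can be summarized as a small decision function. This is a simplified sketch, not the MyARM implementation: the Action enum and next_action are hypothetical names, and the reconnect interval and keep alive messages are deliberately left out.

```python
from enum import Enum


class Action(Enum):
    SEND = "send"        # connected and data is pending
    CONNECT = "connect"  # data is pending but there is no connection yet
    CLOSE = "close"      # connected, but idle for too long
    WAIT = "wait"        # nothing to do right now


def next_action(connected: bool, has_data: bool,
                idle_for: float, idle_limit: float) -> Action:
    """Decide the next step; idle_for and idle_limit are in seconds."""
    if has_data:
        # Connection attempts only happen while there is data to send.
        return Action.SEND if connected else Action.CONNECT
    if connected and idle_for >= idle_limit:
        # No data for at least <name>.connection.idle: close the connection.
        return Action.CLOSE
    return Action.WAIT
```

For example, a connected datasink with no pending data and an idle time of 300 seconds against a 5m limit yields Action.CLOSE.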

In addition to the standard datasink properties, the tcp datasink uses the following properties:

<name>.host
name of the host on which the myarmdaemon process is running.

Default is localhost.

<name>.port
port number on which the myarmdaemon process listens for and accepts connections.

Default is 5557.

<name>.connection.idle
specified as an interval (with a suffix m for minutes, s for seconds or h for hours) to wait before closing the connection to the myarmdaemon if no ARM data needs to be sent.

Default is 5m (5 minutes), minimum is 30s (30 seconds) and the maximum is 1h (1 hour).
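The suffix notation used for intervals can be illustrated with a small parser. This is a sketch under assumptions: the function names are hypothetical, a suffix is required here (whether MyARM accepts bare numbers is not stated in this section), and rejecting out-of-range values instead of clamping them is also an assumption. The ms suffix is included because the timeout properties of this datasink use it.

```python
import re

# Milliseconds per unit for the suffixes used in this section.
_MILLIS = {"ms": 1, "s": 1000, "m": 60 * 1000, "h": 60 * 60 * 1000}


def parse_interval_ms(text: str) -> int:
    """Convert values such as '500ms', '30s', '5m' or '1h' to milliseconds."""
    m = re.fullmatch(r"\s*(\d+)\s*(ms|s|m|h)\s*", text)
    if m is None:
        raise ValueError(f"unrecognized interval: {text!r}")
    return int(m.group(1)) * _MILLIS[m.group(2)]


def check_idle(text: str) -> int:
    """Validate a connection.idle value against the range 30s to 1h."""
    value = parse_interval_ms(text)
    if not parse_interval_ms("30s") <= value <= parse_interval_ms("1h"):
        raise ValueError(f"connection.idle out of range: {text!r}")
    return value
```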

<name>.connection.keepalive
specified as an interval (with a suffix m for minutes or s for seconds) to send a keep alive message to the myarmdaemon if there is no ARM data to be sent.

Default is 1m (1 minute), minimum is 30s (30 seconds) and the maximum is half of the value of the <name>.connection.idle property.
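Because the upper bound of the keep alive interval depends on the idle interval, it can help to check the two values together. The following sketch uses a hypothetical function name and expects both values in seconds.

```python
def keepalive_is_valid(keepalive_seconds: float, idle_seconds: float) -> bool:
    """True if keepalive lies between 30 seconds and half of the idle interval."""
    return 30 <= keepalive_seconds <= idle_seconds / 2
```

With the defaults (keep alive 1m, idle 5m) the check passes, since 60 is at most 150. A keep alive of 3m with an idle time of 5m fails. If the idle time is set below 60 seconds, no keep alive value satisfies both bounds; how MyARM handles that case is not described here.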

<name>.connection.reconnect
specified as an interval (with a suffix m for minutes or s for seconds) to wait before MyARM tries to connect to the myarmdaemon process again.

Default is 30s (30 seconds), the minimum is 10s (10 seconds) and the maximum is 5m (5 minutes).

<name>.readwrite.timeout
timeout value (with a suffix ms for milliseconds or s for seconds) to wait for reads or writes of ARM data through the connection.

Default is 500ms (500 milliseconds). Minimum is 250ms (250 milliseconds) and the maximum is 2s (2 seconds).

<name>.reply.timeout
timeout value (with a suffix ms for milliseconds or s for seconds) to wait for a reply from the myarmdaemon.

Default is 2s (2 seconds), minimum is 500ms (500 milliseconds) and the maximum is 4s (4 seconds).

Configuration example

# agent uses a datasink named sink_tcp
agent.sink.name = sink_tcp
# set up sink type
sink_tcp.type = tcp
# set up host myarmdaemon is running on
sink_tcp.host = localhost
# set up port myarmdaemon is listening for incoming connections
sink_tcp.port = 5557
# read/write time out interval
sink_tcp.readwrite.timeout = 1s
# the interval for reconnecting to the daemon
sink_tcp.connection.reconnect = 1m
# send a keep alive message after 1 minute if no data
sink_tcp.connection.keepalive = 1m
# close the connection to myarmdaemon after 5 minutes if no data
sink_tcp.connection.idle = 5m