Issue running Netdata as a container in the cloud



  • When I start Netdata as a container in the cloud, it does not work.

    When I check with the top command, two netdata process IDs are present.
    Could anyone explain why two processes are created?

    The docker logs are:
    Netdata entrypoint script starting
    Netdata entrypoint script starting
    2020-09-07 16:45:53: netdata INFO : MAIN : CONFIG: cannot load cloud config ‘/var/lib/netdata/cloud.d/cloud.conf’. Running with internal defaults.
    2020-09-07 16:45:53: netdata ERROR : MAIN : File ‘/var/lib/netdata/dbengine_multihost_size’ contains invalid input, it will be rebuild
    2020-09-07 16:45:53: netdata INFO : MAIN : Found 0 legacy dbengines, setting multidb diskspace to 256MB
    2020-09-07 16:45:53: netdata INFO : MAIN : Created file ‘/var/lib/netdata/dbengine_multihost_size’ to store the computed value
    2020-09-07 16:45:53: netdata INFO : MAIN : Using host prefix directory ‘/host’
    2020-09-07 16:45:53: netdata INFO : MAIN : SIGNAL: Enabling reaper
    2020-09-07 16:45:53: netdata INFO : MAIN : process tracking enabled.
    2020-09-07 16:45:53: netdata INFO : MAIN : resources control: allowed file descriptors: soft = 1048576, max = 1048576
    2020-09-07 16:45:53: netdata INFO : MAIN : Adjusted my Out-Of-Memory (OOM) score from 0 to 1000.
    2020-09-07 16:45:53: netdata ERROR : MAIN : Cannot adjust netdata scheduling policy to idle (5), with priority 0. Falling back to nice. (errno 38, Function not implemented)
    2020-09-07 16:45:53: netdata ERROR : MAIN : Cannot get my current process scheduling policy. (errno 38, Function not implemented)
    2020-09-07 16:45:53: netdata INFO : MAIN : netdata started on pid 1.
    2020-09-07 16:45:53: netdata INFO : MAIN : Initializing spawn client.
    2020-09-07 16:45:53: netdata INFO : MAIN : cannot set libuv thread name to DAEMON_SPAWN. Err: 13
    2020-09-07 16:45:53: netdata INFO : MAIN : Executing /usr/libexec/netdata/plugins.d/system-info.sh
    Spawn server is up.
    2020-09-07 16:45:57: netdata INFO : MAIN : NETDATA_CONTAINER_OS_NAME=Alpine Linux
    2020-09-07 16:45:57: netdata INFO : MAIN : NETDATA_CONTAINER_OS_ID=alpine
    2020-09-07 16:45:57: netdata INFO : MAIN : NETDATA_CONTAINER_OS_ID_LIKE=unknown
    2020-09-07 16:45:57: netdata INFO : MAIN : NETDATA_CONTAINER_OS_VERSION=unknown
    2020-09-07 16:45:57: netdata INFO : MAIN : NETDATA_CONTAINER_OS_VERSION_ID=3.12.0
    2020-09-07 16:45:57: netdata INFO : MAIN : NETDATA_CONTAINER_OS_DETECTION=/etc/os-release
    2020-09-07 16:45:57: netdata INFO : MAIN : NETDATA_HOST_OS_NAME=unknown
    2020-09-07 16:45:57: netdata INFO : MAIN : NETDATA_HOST_OS_ID=unknown
    2020-09-07 16:45:57: netdata INFO : MAIN : NETDATA_HOST_OS_ID_LIKE=unknown
    2020-09-07 16:45:57: netdata INFO : MAIN : NETDATA_HOST_OS_VERSION=unknown
    2020-09-07 16:45:57: netdata INFO : MAIN : NETDATA_HOST_OS_VERSION_ID=unknown
    2020-09-07 16:45:57: netdata INFO : MAIN : NETDATA_HOST_OS_DETECTION=unknown
    2020-09-07 16:45:57: netdata INFO : MAIN : NETDATA_SYSTEM_KERNEL_NAME=Linux
    2020-09-07 16:45:57: netdata INFO : MAIN : NETDATA_SYSTEM_KERNEL_VERSION=3.10.0-1062.9.1.el7.x86_64
    2020-09-07 16:45:57: netdata INFO : MAIN : NETDATA_SYSTEM_ARCHITECTURE=x86_64
    2020-09-07 16:45:57: netdata INFO : MAIN : NETDATA_SYSTEM_VIRTUALIZATION=hypervisor
    2020-09-07 16:45:57: netdata INFO : MAIN : NETDATA_SYSTEM_VIRT_DETECTION=/proc/cpuinfo
    2020-09-07 16:45:57: netdata INFO : MAIN : NETDATA_SYSTEM_CONTAINER=docker
    2020-09-07 16:45:57: netdata INFO : MAIN : NETDATA_SYSTEM_CONTAINER_DETECTION=dockerenv
    2020-09-07 16:45:57: netdata INFO : MAIN : NETDATA_SYSTEM_CPU_LOGICAL_CPU_COUNT=16
    2020-09-07 16:45:57: netdata INFO : MAIN : NETDATA_SYSTEM_CPU_VENDOR=GenuineIntel
    2020-09-07 16:45:57: netdata INFO : MAIN : NETDATA_SYSTEM_CPU_MODEL=Intel® Xeon® Gold 6130 CPU @ 2.10GHz
    2020-09-07 16:45:57: netdata INFO : MAIN : NETDATA_SYSTEM_CPU_FREQ=2100000000
    2020-09-07 16:45:57: netdata INFO : MAIN : NETDATA_SYSTEM_CPU_DETECTION=lscpu
    2020-09-07 16:45:57: netdata INFO : MAIN : NETDATA_SYSTEM_TOTAL_RAM=63329996800
    2020-09-07 16:45:57: netdata INFO : MAIN : NETDATA_SYSTEM_RAM_DETECTION=procfs
    2020-09-07 16:45:57: netdata INFO : MAIN : NETDATA_SYSTEM_TOTAL_DISK_SIZE=0
    2020-09-07 16:45:57: netdata INFO : MAIN : NETDATA_SYSTEM_DISK_DETECTION=sysfs
    2020-09-07 16:45:57: netdata INFO : MAIN : Configuring locking mechanism for global GUID map
    2020-09-07 16:45:57: netdata INFO : MAIN : Cannot open the file /var/lib/netdata/health.silencers.json, so Netdata will work with the default health configuration.
    2020-09-07 16:45:57: netdata INFO : MAIN : CONFIG: cannot load user config ‘/etc/netdata/stream.conf’. Will try stock config.
    2020-09-07 16:45:57: netdata ERROR : MAIN : Failed to read machine GUID from ‘/var/lib/netdata/registry/netdata.public.unique.id’ (errno 22, Invalid argument)
    2020-09-07 16:45:57: netdata ERROR : MAIN : HEALTH [f724cdbdf329]: cannot open health file: /var/lib/netdata/health/health-log.db.old (errno 2, No such file or directory)
    2020-09-07 16:45:57: netdata INFO : MAIN : Added 9a277ef4-f129-11ea-a1b6-0242ac110002 to global map for host f724cdbdf329
    2020-09-07 16:45:57: netdata INFO : MAIN : Host ‘f724cdbdf329’ (at registry as ‘f724cdbdf329’) with guid ‘9a277ef4-f129-11ea-a1b6-0242ac110002’ initialized, os ‘linux’, timezone ‘UTC’, tags ‘’, program_name ‘netdata’, program_version ‘v1.24.0’, update every 1, memory mode dbengine, history entries 3996, streaming disabled (to ‘’ with api key ‘’), health enabled, cache_dir ‘/var/cache/netdata’, varlib_dir ‘/var/lib/netdata’, health_log ‘/var/lib/netdata/health/health-log.db’, alarms default handler ‘/usr/libexec/netdata/plugins.d/alarm-notify.sh’, alarms default recipient ‘root’
    2020-09-07 16:45:57: netdata INFO : MAIN : Found 3 files in path /var/cache/netdata/dbengine
    2020-09-07 16:45:57: netdata INFO : MAIN : Scanning file “/var/cache/netdata/dbengine/datafile-1-0000000001.ndf”
    2020-09-07 16:45:57: netdata INFO : MAIN : Matched file “/var/cache/netdata/dbengine/datafile-1-0000000001.ndf”
    2020-09-07 16:45:57: netdata INFO : MAIN : Scanning file “/var/cache/netdata/dbengine/journalfile-1-0000000001.njf”
    2020-09-07 16:45:57: netdata INFO : MAIN : Scanning file “/var/cache/netdata/dbengine/metadatalog-00000-00001.mlf”
    2020-09-07 16:45:57: netdata INFO : MAIN : Initializing data file “/var/cache/netdata/dbengine/datafile-1-0000000001.ndf”.
    2020-09-07 16:45:57: netdata INFO : MAIN : Data file “/var/cache/netdata/dbengine/datafile-1-0000000001.ndf” initialized (size:4096).
    2020-09-07 16:45:57: netdata INFO : MAIN : Loading journal file “/var/cache/netdata/dbengine/journalfile-1-0000000001.njf”.
    2020-09-07 16:45:57: netdata INFO : MAIN : Journal file “/var/cache/netdata/dbengine/journalfile-1-0000000001.njf” loaded (size:4096).
    2020-09-07 16:45:57: netdata INFO : MAIN : cannot set libuv thread name to DBENGINE. Err: 13
    2020-09-07 16:45:57: netdata INFO : MAIN : Found 3 files in path /var/cache/netdata/dbengine
    2020-09-07 16:45:57: netdata INFO : MAIN : Scanning file “/var/cache/netdata/dbengine/datafile-1-0000000001.ndf”
    2020-09-07 16:45:57: netdata INFO : MAIN : Scanning file “/var/cache/netdata/dbengine/journalfile-1-0000000001.njf”
    2020-09-07 16:45:57: netdata INFO : MAIN : Scanning file “/var/cache/netdata/dbengine/metadatalog-00000-00001.mlf”
    2020-09-07 16:45:57: netdata INFO : MAIN : Matched file “/var/cache/netdata/dbengine/metadatalog-00000-00001.mlf”
    2020-09-07 16:45:57: netdata INFO : MAIN : Loading metadata log “/var/cache/netdata/dbengine/metadatalog-00000-00001.mlf”.
    2020-09-07 16:45:57: netdata ERROR : MAIN : File length is too short.

    2020-09-07 16:45:57: netdata ERROR : MAIN : 1 metadata log files failed to load.
    2020-09-07 16:45:57: netdata ERROR : MAIN : Failed to scan path “/var/cache/netdata/dbengine”. (errno 22, Invalid argument)
    2020-09-07 16:45:57: netdata ERROR : MAIN : Failed to initialize metadata log file event loop.
    2020-09-07 16:45:57: netdata INFO : MAIN : Freed 0 bytes of memory from page cache.
    2020-09-07 16:45:57: netdata ERROR : MAIN : Host ‘f724cdbdf329’ with machine guid ‘9a277ef4-f129-11ea-a1b6-0242ac110002’ failed to initialize multi-host DB engine instance at ‘/var/cache/netdata’. (errno 22, Invalid argument)
    2020-09-07 16:45:57: netdata INFO : MAIN : Freeing all memory for host ‘f724cdbdf329’…
    2020-09-07 16:45:57: netdata INFO : MAIN : SYSTEM_INFO: free 0x55f797f35140
    2020-09-07 16:45:57: netdata FATAL : MAIN : Cannot initialize localhost instance with name ‘f724cdbdf329’. # : Invalid argument

    2020-09-07 16:45:57: netdata INFO : MAIN : /usr/libexec/netdata/plugins.d/anonymous-statistics.sh ‘FATAL’ ‘netdata:MAIN’ ‘1448@daemon/mai:main /22’
    2020-09-07 16:46:00: netdata INFO : MAIN : EXIT: netdata prepares to exit with code 1…
    2020-09-07 16:46:00: netdata INFO : MAIN : /usr/libexec/netdata/plugins.d/anonymous-statistics.sh ‘EXIT’ ‘ERROR’ ‘-’
    2020-09-07 16:46:03: netdata INFO : MAIN : EXIT: cleaning up the database…
    2020-09-07 16:46:03: netdata INFO : MAIN : Cleaning up database [0 hosts(s)]…
    2020-09-07 16:46:03: netdata INFO : MAIN : EXIT: all done - netdata is now exiting - bye bye…


  • Staff

    Hey,

    Welcome to the community!

    How are you running netdata in a container? Are you following the docs or have you created your own container image using a dockerfile?



  • Not my own image.
    Only the image netdata/netdata:v1.24.0 is used.


  • Staff

    Thanks @viji,

    How are you running the netdata container? Are you using a docker-compose file? (If yes, can you share it please?) If not, can you share the command you used to run the netdata container?

    Let’s fix this💪



  • Thank you @OdysLam.
    It is created using the Ansible module docker_container with version 1.24.0:
    image: netdata/netdata:v1.24.0

    I have now created the container manually using the command below:
    docker run -d --name=netdata_test -p 19998:19998 -v /proc:/host/proc:ro -v /sys:/host/sys:ro -v /var/run/docker.sock:/var/run/docker.sock:ro --restart unless-stopped --cap-add SYS_PTRACE --security-opt apparmor=unconfined netdata/netdata

    It downloaded the latest image (v1.24.0-203-g60fa3526) and Netdata is running fine.
    Has any issue been reported with the image netdata/netdata:v1.24.0?

    I will re-create the node with the latest image and check further.


  • Staff

    We know that certain netdata distributions have some features disabled, as it’s up to the maintainers to decide what’s in and out. This is why we always suggest that our users use the installation methods mentioned in our documentation.

    Regarding Ansible, can you please share more information about your use case and why you chose to deploy netdata using Ansible? We are very interested in creating more Ansible-related material.

    cc @joel


  • Staff

    Hi @viji! I think I know what’s going wrong here. It looks like the container is unable to initialize the database engine. Our Docker documentation has a rather large command that we recommend. While you have some of the options in your command, you’re missing a few. See lines 3 and 4 here:

    docker run -d --name=netdata \
      -p 19999:19999 \
      -v netdatalib:/var/lib/netdata \
      -v netdatacache:/var/cache/netdata \
      -v /etc/passwd:/host/etc/passwd:ro \
      -v /etc/group:/host/etc/group:ro \
      -v /proc:/host/proc:ro \
      -v /sys:/host/sys:ro \
      -v /etc/os-release:/host/etc/os-release:ro \
      --restart unless-stopped \
      --cap-add SYS_PTRACE \
      --security-opt apparmor=unconfined \
      netdata/netdata
    

    Those two lines create persistent volumes for the contents of /var/lib/netdata and /var/cache/netdata. I would suggest replacing the command you use to create the container with this one and seeing whether that changes anything.

    I second @OdysLam’s request for information on Ansible!
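
    Since the original container was created with the Ansible docker_container module, the recommended docker run command above could be expressed roughly as the play below. This is a minimal sketch, not an official Netdata playbook: it assumes the community.docker collection is installed (older Ansible releases shipped the module under the short name docker_container), and the host group monitored_nodes is a hypothetical placeholder.

    ```yaml
    # Sketch only: mirrors the recommended `docker run` command from this thread.
    # `monitored_nodes` is a hypothetical inventory group, not from the thread.
    - name: Run Netdata with persistent volumes
      hosts: monitored_nodes
      become: true
      tasks:
        - name: Start the Netdata container
          community.docker.docker_container:
            name: netdata
            image: netdata/netdata
            state: started
            restart_policy: unless-stopped
            published_ports:
              - "19999:19999"
            volumes:
              # Named volumes keep these two directories persistent
              - netdatalib:/var/lib/netdata
              - netdatacache:/var/cache/netdata
              # Read-only host mounts, as in the recommended command
              - /etc/passwd:/host/etc/passwd:ro
              - /etc/group:/host/etc/group:ro
              - /proc:/host/proc:ro
              - /sys:/host/sys:ro
              - /etc/os-release:/host/etc/os-release:ro
            capabilities:
              - SYS_PTRACE
            security_opts:
              - apparmor=unconfined
    ```

    The two named volumes, netdatalib and netdatacache, correspond to lines 3 and 4 of the docker run command; the remaining options map one-to-one onto its other flags.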



  • Thank you @joel @OdysLam.
    After adding these 2 volumes, it is working.
    The strange thing is that we create different node types,
    and Netdata works on some nodes without these 2 volumes.

    Regarding the use case:
    We create environments in an automated way (infrastructure as code) in the cloud, so we use Ansible for this purpose.
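
    One way to compare the nodes where Netdata works with those where it fails would be to list what is actually mounted inside each container. The play below is a rough sketch under assumptions: it relies on the community.docker collection, the container is assumed to be named netdata, and monitored_nodes is a hypothetical inventory group.

    ```yaml
    # Sketch only: prints the mount destinations of the Netdata container per node.
    # `monitored_nodes` and the container name `netdata` are assumptions.
    - name: Inspect Netdata mounts on each node
      hosts: monitored_nodes
      become: true
      tasks:
        - name: Read container details
          community.docker.docker_container_info:
            name: netdata
          register: netdata_info

        - name: Show mount destinations inside the container
          ansible.builtin.debug:
            msg: "{{ netdata_info.container.Mounts | map(attribute='Destination') | list }}"
          when: netdata_info.exists
    ```

    If /var/lib/netdata and /var/cache/netdata show up on the working nodes but not on the failing ones (or the other way round), that difference would be a good starting point for further investigation.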


  • Staff

    So, in some cases netdata would be able to run without those 2 persistent volumes?

    Thanks for the use case. If you feel like it, you could share your Ansible code with the rest of the community here. I am sure someone would find it very beneficial!

    Take care @viji and if you have another question, make sure to post it here!

