
What logs need to be truncated?

Hi All,

I keep finding new logs/working files that I need to clean. This is mostly an issue on my dev system, since it does not have as much disk allocated as the others, but once I find a large folder on dev, I usually see that it is also large in my other environments.

So far I have the following:

Analytics server:

/opt/incorta/IncortaAnalytics/IncortaNode/services/INC_ANA_SID/logs/kafka

/opt/incorta/IncortaAnalytics/IncortaNode/services/INC_ANA_SID/logs/incorta

/opt/incorta/IncortaAnalytics/IncortaNode/services/INC_ANA_SID/logs/

Loader server (also my Spark server):

/opt/incorta/IncortaAnalytics/IncortaNode/services/INC_LOAD_SID/logs/kafka

/opt/incorta/IncortaAnalytics/IncortaNode/services/INC_LOAD_SID/logs/incorta

/opt/incorta/IncortaAnalytics/IncortaNode/services/INC_LOAD_SID/logs/

/opt/incorta/IncortaAnalytics/IncortaNode/spark/eventlogs

/opt/incorta/IncortaAnalytics/IncortaNode/spark/work

 

I previously posted my script at https://community.incorta.com/t/35hhpn5/log-retention, but I keep finding new places to clean. The other issue is determining an appropriate retention period for the logs. The "spark/work" folder had just over 5 GB of files, and some of the other folders had even more.
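
For anyone who does not want to chase the link, here is a minimal sketch of the find-based approach (the paths are the ones listed above; the 14-day retention is an arbitrary placeholder, not a recommendation):

    #!/bin/bash
    # Sketch only: delete log files older than RETENTION_DAYS from the Incorta log folders above.
    RETENTION_DAYS=14
    BASE=/opt/incorta/IncortaAnalytics/IncortaNode
    for dir in \
        "$BASE/services/INC_ANA_SID/logs/kafka" \
        "$BASE/services/INC_ANA_SID/logs/incorta" \
        "$BASE/services/INC_ANA_SID/logs" \
        "$BASE/services/INC_LOAD_SID/logs/kafka" \
        "$BASE/services/INC_LOAD_SID/logs/incorta" \
        "$BASE/services/INC_LOAD_SID/logs" \
        "$BASE/spark/eventlogs"
    do
        # -maxdepth 1 keeps the plain logs/ pass from descending into kafka/ and incorta/ again
        [ -d "$dir" ] && find "$dir" -maxdepth 1 -type f -mtime +"$RETENTION_DAYS" -delete
    done
    # spark/work is left out on purpose -- see the replies below about stopping Spark first.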

Does anyone have other folders or files they are cleaning?

 

Thanks

  • Cleaning up Spark Log Files

    If you have MVs (Materialized Views), Spark log files can occupy a huge amount of disk space over time. By default, Spark does not clean up those files regularly. Change the following Spark properties in spark-defaults.conf to values that support your planned activity, and monitor these settings over time:

     

    spark.worker.cleanup.enabled

    Enables periodic cleanup of worker and application directories. This is disabled by default. Set to true to enable it.

    spark.worker.cleanup.interval

    The frequency, in seconds, at which the worker cleans up old application work directories. The default is 1800 (30 minutes). Modify the value as you deem appropriate.

    spark.worker.cleanup.appDataTtl

    Controls how long, in seconds, application work directories are retained. The default is 604800 (7 days), which generally keeps far too much if Spark jobs run frequently. Modify the value as you deem appropriate.
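
    For reference, here are the built-in defaults written out in spark-defaults.conf syntax (all values are in seconds, and comments in this file must sit on their own lines):

      # stock standalone Spark defaults -- cleanup is off until you enable it
      spark.worker.cleanup.enabled false
      # 1800 seconds = 30 minutes
      spark.worker.cleanup.interval 1800
      # 604800 seconds = 7 days
      spark.worker.cleanup.appDataTtl 604800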

  • Newbie question - I don't even see these properties in the spark-defaults.conf file. I am looking at the file under the /opt/incorta/IncortaAnalytics/IncortaNode/spark/conf directory - is that the correct one? If so, can I just add these properties, and could you share the syntax for each?

    • Hi Barbra Jones, that is the correct file, and the example syntax would look like this:

      spark.worker.cleanup.enabled true
      # 600 seconds = check every 10 minutes
      spark.worker.cleanup.interval 600
      # 172800 seconds = keep application work directories for 2 days
      spark.worker.cleanup.appDataTtl 172800

      Note you'll need to restart the Spark services (./IncortaNode/stopSpark.sh, then ./IncortaNode/startSpark.sh in v4/v5) for the settings to take effect.
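
      For example, assuming the default install path from earlier in the thread:

        cd /opt/incorta/IncortaAnalytics/IncortaNode
        ./stopSpark.sh
        ./startSpark.sh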

  • Thanks! Follow-up question - if I am doing some manual cleanup, I can delete files under the /opt/incorta/IncortaAnalytics/IncortaNode/spark/work folder, correct?

    • Barbra Jones Yes, but we would recommend shutting Spark down before deleting log/temp files.
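
      For example, something along these lines (the 7-day cutoff is only an illustration; the per-application directories sit at the top level of spark/work):

        cd /opt/incorta/IncortaAnalytics/IncortaNode
        ./stopSpark.sh
        # remove application work directories untouched for more than 7 days
        find spark/work -mindepth 1 -maxdepth 1 -mtime +7 -exec rm -rf {} +
        ./startSpark.sh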
