Cassandra Analyzer is a tool that collects log files and nodetool output from your Cassandra cluster into a tarball and ingests the logs so they can be visualized using our prebuilt Kibana dashboard. These Python scripts also run commands from TableAnalyzer and NodeAnalyzer and include the results in the tarball. After running this, you will be able to view your logs in the Kibana dashboard, perform a data model review using the formatted spreadsheet generated by TableAnalyzer, or take the collected tarball and run other types of analytics.
There are three steps to this process, some of which you might be able to skip depending on what you already have set up:
- Step #1: Installing Elasticsearch, Filebeat, and Kibana
- Step #2: Collect Logs and Nodetool Output From Your Cluster
- Step #3: Visualize Your Logs and Metrics
Follow the instructions below to get started.
If you don't yet have Elasticsearch, Filebeat, and Kibana installed, there are different installation options you can use.
- Option #1 - Ansible: We have included a playbook and instructions in Cassandra.vision for you to use. Get started here.
- Option #2 - Manual Installation: You can always install the tools yourself. You will need these three tools: Elasticsearch, Filebeat, and Kibana.
They can all be installed on the same node or on different nodes. The only difference is that you must specify the right IP addresses when running the tool, as described in the following instructions.
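As a rough illustration of the same-node versus different-node point, the sketch below builds the base URLs you would point the tooling at. The function and class names are hypothetical and not part of Cassandra Analyzer; the ports are the stock Elasticsearch (9200) and Kibana (5601) defaults, which your installation may override.

```python
from dataclasses import dataclass

# Stock default ports for the Elastic stack; change these if your install differs.
ELASTICSEARCH_PORT = 9200
KIBANA_PORT = 5601


@dataclass
class StackEndpoints:
    """Hypothetical container for the two URLs the tooling needs."""
    elasticsearch_url: str
    kibana_url: str


def build_endpoints(es_host: str = "127.0.0.1",
                    kibana_host: str = "127.0.0.1") -> StackEndpoints:
    """Build base URLs from host IPs (illustrative helper, not a project API)."""
    return StackEndpoints(
        elasticsearch_url=f"http://{es_host}:{ELASTICSEARCH_PORT}",
        kibana_url=f"http://{kibana_host}:{KIBANA_PORT}",
    )
```

With everything on one node the defaults are enough; with separate nodes you would pass each node's IP address, for example `build_endpoints("10.0.0.5", "10.0.0.6")`.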
Now that the Elastic stack is installed, you will next need to collect logs and Cassandra (CFStats/TableStats) output using offline-log-collector. Click here to get started.
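To give a feel for what "collecting logs into a tarball" involves, here is a minimal sketch using only the Python standard library. It is not the offline-log-collector implementation: the function name, the file patterns, and the single-directory layout are assumptions made for illustration.

```python
import tarfile
from pathlib import Path
from typing import List, Sequence


def collect_logs(log_dir: str, tarball_path: str,
                 patterns: Sequence[str] = ("*.log", "*.log.*")) -> List[str]:
    """Pack matching log files from one directory into a gzipped tarball.

    Hypothetical helper: the patterns are assumed to cover files such as
    system.log and rotated copies like system.log.1.
    """
    log_root = Path(log_dir)
    added: List[str] = []
    with tarfile.open(tarball_path, "w:gz") as tar:
        for pattern in patterns:
            for path in sorted(log_root.glob(pattern)):
                if not path.is_file():
                    continue
                # Store paths relative to the log directory, not absolute paths.
                arcname = path.relative_to(log_root).as_posix()
                if arcname not in added:
                    tar.add(str(path), arcname=arcname)
                    added.append(arcname)
    return added
```

A call such as `collect_logs("/var/log/cassandra", "node1-logs.tar.gz")` would return the names of the files it packed; the log directory shown is a common location but is an assumption about your install.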
Already have a DataStax OpsCenter Diagnostic Tarball?
Note that offline-log-collector can be skipped if you already have a DataStax OpsCenter Diagnostic Tarball. While the OpsCenter Diagnostic Tarball won't include the output of the TableAnalyzer and NodeAnalyzer commands that offline-log-collector runs, you can continue to the next step and ingest the diagnostic tarball into Elasticsearch and Kibana without any problem.
Now you are ready to visualize data about your cluster.
There are two types of visualization here that will help you perform offline monitoring: 1) log visualization using Kibana and 2) tablestats/cfstats visualization in a formatted spreadsheet.
It's best to do both if you can, but they can be done separately as well.
Now that you have either finished collecting logs using offline-log-collector or already have a DataStax OpsCenter Diagnostic Tarball, you are ready to begin ingesting logs into Elasticsearch and visualizing them in Kibana.
See offline-log-ingester for ingesting logs into Elasticsearch and Kibana.
A formatted spreadsheet containing tablestats info can make it easy to run a data model review on your cluster. Follow the steps below to begin.
If you ran offline-log-collector already, you can click here to start transforming your tablestats/cfstats into a formatted spreadsheet using TableAnalyzer. Note that at this point, we have already run the cfstats.receive.py script for you. All that remains is to transform the output into a CSV and then convert that into a spreadsheet, following the instructions in the link above.
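For readers curious about what turning tablestats output into a CSV looks like, the following is a simplified sketch. It is not TableAnalyzer's code: it assumes a plain "key: value" layout in which a "Keyspace" line is followed by "Table" (or, in older output, "Column Family") lines and their metrics, and the function names are hypothetical. Real nodetool output varies by Cassandra version, so use the linked instructions for actual reviews.

```python
import csv
import io
from typing import Dict, List


def parse_tablestats(text: str) -> List[Dict[str, str]]:
    """Turn simplified tablestats-style text into one dict per table."""
    rows: List[Dict[str, str]] = []
    keyspace = ""
    current = None  # the table row currently being filled in
    for raw in text.splitlines():
        line = raw.strip()
        if ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        if key == "Keyspace":
            keyspace, current = value, None
        elif key in ("Table", "Column Family"):
            current = {"Keyspace": keyspace, "Table": value}
            rows.append(current)
        elif current is not None:
            # Keyspace-level metrics (before the first table) are skipped.
            current[key] = value
    return rows


def rows_to_csv(rows: List[Dict[str, str]]) -> str:
    """Serialize the rows as CSV, with columns in order of first appearance."""
    fieldnames: List[str] = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames,
                            restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

The resulting CSV text can be written to a file and opened in any spreadsheet application for sorting and filtering by metric.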