How to find large partitions in Cassandra other than through system.log?


























How can we find large partitions on our Cassandra cluster before they are reported in system.log? We are facing performance issues because of them. Can anyone help? We run Cassandra versions 2.0.11 and 2.1.16.

















      cassandra cassandra-2.0 cassandra-2.1
















      asked Dec 28 '18 at 5:34









      Pandey
























          2 Answers
































          You can look at the output of nodetool tablestats (nodetool cfstats in older versions of Cassandra). For every table it includes the line Compacted partition maximum bytes together with other information, as in this example, where the maximum partition size is about 268 MB:



          Table: table_name
          SSTable count: 2
          Space used (live): 147638509
          Space used (total): 147638509
          .....
          Compacted partition minimum bytes: 43
          Compacted partition maximum bytes: 268650950
          Compacted partition mean bytes: 430941
          Average live cells per slice (last five minutes): 8256.0
          Maximum live cells per slice (last five minutes): 10239
          Average tombstones per slice (last five minutes): 1.0
          Maximum tombstones per slice (last five minutes): 1
          .....


          Note that nodetool tablestats reports information for the current node only, so you need to run it on every node of the cluster.
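As a rough illustration of how that output could be checked automatically, here is a minimal Python sketch that pulls the Compacted partition maximum bytes value for each table out of captured output. It assumes the line labels look exactly like the sample above (`Table:` and `Compacted partition maximum bytes:`); other Cassandra versions may label these lines differently, so the prefixes may need adjusting. The function names and the 100 MB threshold are made up for the example.

```python
SAMPLE = """\
Table: table_name
SSTable count: 2
Space used (live): 147638509
Compacted partition minimum bytes: 43
Compacted partition maximum bytes: 268650950
Compacted partition mean bytes: 430941
"""


def max_partition_bytes(stats_text):
    """Return {table_name: max partition size in bytes} parsed from the text."""
    sizes = {}
    table = None
    for raw in stats_text.splitlines():
        line = raw.strip()
        # Assumed labels, taken from the sample output above.
        if line.startswith("Table:"):
            table = line.split(":", 1)[1].strip()
        elif line.startswith("Compacted partition maximum bytes:") and table:
            sizes[table] = int(line.split(":", 1)[1])
    return sizes


def tables_over(stats_text, threshold_bytes):
    """List (table, bytes) pairs above the threshold, largest first."""
    found = [
        (name, size)
        for name, size in max_partition_bytes(stats_text).items()
        if size > threshold_bytes
    ]
    return sorted(found, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # The 100 MB threshold is an arbitrary illustrative value.
    for name, size in tables_over(SAMPLE, 100 * 1000 * 1000):
        print(f"{name}: {size / 1e6:.1f} MB")
```

To use it, capture the nodetool output on each node and pass the text to `tables_over`. This only tells you which tables hold large partitions on that node, not which partition keys are responsible.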



          Update: you can find the largest partitions with other tools:

          • https://github.com/tolbertam/sstable-tools has a describe command that shows the largest/widest partitions. This command will also be available in Cassandra 4.0.

          • For DataStax products, the DSBulk tool supports counting partitions.





























          • Thanks. But we need the keys of those partitions. Is there a command to find them?
            – Pandey
            Dec 28 '18 at 11:44










          • See updated answer
            – Alex Ott
            Dec 28 '18 at 14:21










          • Thanks Alex, I have seen it, but it seems to be available for 3.x, and as I mentioned we are on 2.x. Also, we need the keys of wide or large partitions so that we can handle them before they reach compaction.
            – Pandey
            Dec 28 '18 at 17:33










          • You can try to build that code against 2.x
            – Alex Ott
            Dec 28 '18 at 17:35












          • Okay, will do. Thanks
            – Pandey
            Dec 29 '18 at 3:16

































          Try the nodetool tablehistograms -- <keyspace> <table> command (nodetool cfhistograms in older versions of Cassandra). It provides statistics about a table, including read/write latency, partition size, column count, and number of SSTables.



          Below is the example output:



          Percentile  SSTables  Write Latency  Read Latency  Partition Size  Cell Count
                                     (micros)      (micros)         (bytes)
          50%             0.00          73.46          0.00       223875792       61214
          75%             0.00          88.15          0.00       668489532      182785
          95%             0.00         152.32          0.00      1996099046      654949
          98%             0.00         785.94          0.00      3449259151     1358102
          99%             0.00         943.13          0.00      3449259151     1358102
          Min             0.00          24.60          0.00            5723           4
          Max             0.00        5839.59          0.00      5960319812     1955666


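          This gives useful statistics for the table. In the output above, for the raw_data table, the 95th percentile partition size is about 2 GB (1996099046 bytes), the 98th and 99th percentiles are about 3.45 GB, and the maximum is about 5.96 GB.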
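If you want to read those byte counts without doing the arithmetic by hand, a small Python sketch like the following can extract the Partition Size column and print it in readable units. It assumes each data row has exactly six whitespace-separated fields in the order shown in the example output, with partition size in the fifth field; the helper names are invented for illustration.

```python
SAMPLE_ROWS = """\
Percentile SSTables Write Latency Read Latency Partition Size Cell Count
(micros) (micros) (bytes)
50% 0.00 73.46 0.00 223875792 61214
75% 0.00 88.15 0.00 668489532 182785
95% 0.00 152.32 0.00 1996099046 654949
98% 0.00 785.94 0.00 3449259151 1358102
99% 0.00 943.13 0.00 3449259151 1358102
Min 0.00 24.60 0.00 5723 4
Max 0.00 5839.59 0.00 5960319812 1955666
"""


def partition_sizes(histogram_text):
    """Return {row label: partition size in bytes}, e.g. {'95%': 1996099046}."""
    sizes = {}
    for line in histogram_text.splitlines():
        fields = line.split()
        # Data rows: label, SSTables, write latency, read latency,
        # partition size, cell count (assumed layout, as in the sample).
        is_data_row = len(fields) == 6 and (
            fields[0].endswith("%") or fields[0] in ("Min", "Max")
        )
        if is_data_row:
            sizes[fields[0]] = int(fields[4])
    return sizes


def human(n_bytes):
    """Format a byte count using decimal units (1 KB = 1000 bytes)."""
    for unit in ("B", "KB", "MB", "GB"):
        if n_bytes < 1000 or unit == "GB":
            return f"{n_bytes:.2f} {unit}"
        n_bytes /= 1000


if __name__ == "__main__":
    for label, size in partition_sizes(SAMPLE_ROWS).items():
        print(f"{label:>4}  {human(size)}")
```

Like the histogram itself, this shows how sizes are distributed on one node for one table; it does not identify the keys of the large partitions.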



          Hope this helps you track down the performance issue.



























          • Thanks, but we need the keys of the particular partitions so that we can delete large partitions by key.
            – Pandey
            Dec 28 '18 at 11:42



















          edited Dec 28 '18 at 14:21

























          answered Dec 28 '18 at 8:12









          Alex Ott






















          answered Dec 28 '18 at 9:23









          Mehul Gupta











