Deleted data in Cassandra comes back, like a ghost


I have a 3-node Cassandra cluster (version 3.7) with this keyspace:

CREATE KEYSPACE demo WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '2'}  AND durable_writes = true;


and this table:

CREATE TABLE tradingdate (key text,tradingdate date,PRIMARY KEY (key, tradingdate));


One day I deleted a row like this:

delete from tradingdate
where key='tradingDay' and tradingdate='2018-12-31';


After that, the deleted row became a ghost. Compare these two queries:

select * from tradingdate
where key='tradingDay' and tradingdate>'2018-12-27' limit 2;

 key        | tradingdate
------------+-------------
 tradingDay |  2018-12-28
 tradingDay |  2019-01-02


select * from tradingdate
where key='tradingDay' and tradingdate<'2019-01-03'
order by tradingdate desc limit 2;

 key        | tradingdate
------------+-------------
 tradingDay |  2019-01-02
 tradingDay |  2018-12-31


So when I use ORDER BY, the deleted row (tradingDay, 2018-12-31) comes back.

My guess is that the delete was applied on only one node and the row still exists on another node. So I ran:

nodetool repair demo tradingdate


on all 3 nodes, and then the deleted row disappeared completely.

So I want to know: why can I see the ghost row when I use ORDER BY?


  • What version of Cassandra?

    – Alex Ott
    Jan 1 at 10:51

  • What consistency level are you using when reading?

    – creker
    Jan 1 at 12:35

  • I run the queries in cqlsh, where the default consistency is ONE. I remember hitting this situation once before, and setting consistency TWO led to the same result.

    – jungle green
    Jan 1 at 12:41

  • That's probably the reason. Your delete was not propagated to the other nodes, and with consistency ONE you may read stale data. Try using QUORUM to get more consistency.

    – creker
    Jan 1 at 12:49

  • But strangely, when I don't use ORDER BY, I don't get the stale data.

    – jungle green
    Jan 1 at 12:59
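
As a side note on the consistency-level comments above, the arithmetic behind "try QUORUM" can be sketched in a few lines of Python. This is only an illustration: the helper names `quorum` and `reads_see_latest_write` are made up for this example, and it uses the usual rule of thumb that a read is guaranteed to overlap the latest acknowledged write when R + W > RF. It does not try to explain the ORDER BY behaviour.

```python
def quorum(replication_factor: int) -> int:
    """Replicas a QUORUM operation must reach: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def reads_see_latest_write(read_replicas: int, write_replicas: int,
                           replication_factor: int) -> bool:
    """True when every read set must overlap every write set (R + W > RF)."""
    return read_replicas + write_replicas > replication_factor


RF = 2  # replication_factor of the demo keyspace in the question

# Delete acknowledged at ONE, read at ONE: the replica that answers the
# read may not be the one that acknowledged the delete.
print(reads_see_latest_write(1, 1, RF))  # False

# Delete and read both at QUORUM: with RF = 2, QUORUM means both replicas.
print(reads_see_latest_write(quorum(RF), quorum(RF), RF))  # True
```

With RF = 2, QUORUM is the same as ALL, so the stronger guarantee comes at the cost of not tolerating a replica being down.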
















cassandra nosql






edited Jan 2 at 14:33 by Naveen Nelamali

asked Jan 1 at 6:59 by jungle green


1 Answer

This is some good reading about deletes in Cassandra (and other distributed systems as well):



http://thelastpickle.com/blog/2016/07/27/about-deletes-and-tombstones.html



As well as:



https://docs.datastax.com/en/cassandra/3.0/cassandra/dml/dmlAboutDeletes.html



You will need to run/schedule a routine repair at least once within gc_grace_seconds (which defaults to ten days) to prevent deleted data from reappearing in your cluster.
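
To make the reasoning behind that rule a bit more concrete, here is a toy Python model of two replicas. It is a deliberately simplified sketch, not how Cassandra is implemented: the `Cell` class and the `purge_expired_tombstone`, `reconcile` and `simulate` helpers are invented for this illustration, and it assumes the delete reached only one replica and that newest-timestamp-wins is the only merge rule.

```python
from dataclasses import dataclass
from typing import Optional

GC_GRACE_SECONDS = 864000  # the default mentioned above (ten days)


@dataclass
class Cell:
    written_at: int          # write timestamp, in seconds
    tombstone: bool = False  # True if this cell records a delete


def purge_expired_tombstone(cell: Optional[Cell], now: int) -> Optional[Cell]:
    """Toy compaction: drop a tombstone older than gc_grace_seconds."""
    if cell is not None and cell.tombstone and now - cell.written_at > GC_GRACE_SECONDS:
        return None
    return cell


def reconcile(a: Optional[Cell], b: Optional[Cell]) -> Optional[Cell]:
    """Toy repair/read merge: the cell with the newest timestamp wins."""
    candidates = [c for c in (a, b) if c is not None]
    return max(candidates, key=lambda c: c.written_at, default=None)


def simulate(repair_before_purge: bool) -> bool:
    """Return True if the deleted row is visible again at the end."""
    replica_a = Cell(written_at=100, tombstone=True)  # applied the delete
    replica_b = Cell(written_at=0)                    # missed it, still has the row
    if repair_before_purge:
        # Repair copies the newer tombstone to the replica that missed it.
        replica_a = replica_b = reconcile(replica_a, replica_b)
    now = 100 + GC_GRACE_SECONDS + 1
    replica_a = purge_expired_tombstone(replica_a, now)
    replica_b = purge_expired_tombstone(replica_b, now)
    merged = reconcile(replica_a, replica_b)
    return merged is not None and not merged.tombstone


print(simulate(repair_before_purge=True))   # False: the row stays deleted
print(simulate(repair_before_purge=False))  # True: the row comes back
```

In the second run the only record of the delete is purged before the other replica ever learned about it, so the old row is the newest thing left.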



Also you should look for dropped messages in case one of your nodes is missing deletes (and other messages):



# nodetool tpstats
Pool Name                   Active  Pending   Completed  Blocked  All time blocked
MutationStage                    0        0   787032744        0                 0
ReadStage                        0        0  1627843193        0                 0
RequestResponseStage             0        0  2257452312        0                 0
ReadRepairStage                  0        0    99910415        0                 0
CounterMutationStage             0        0           0        0                 0
HintedHandoff                    0        0        1582        0                 0
MiscStage                        0        0           0        0                 0
CompactionExecutor               0        0     6649458        0                 0
MemtableReclaimMemory            0        0       17987        0                 0
PendingRangeCalculator           0        0          46        0                 0
GossipStage                      0        0    22766295        0                 0
MigrationStage                   0        0           8        0                 0
MemtablePostFlush                0        0      127844        0                 0
ValidationExecutor               0        0           0        0                 0
Sampler                          0        0           0        0                 0
MemtableFlushWriter              0        0       17851        0                 0
InternalResponseStage            0        0        8669        0                 0
AntiEntropyStage                 0        0           0        0                 0
CacheCleanupExecutor             0        0           0        0                 0
Native-Transport-Requests        0        0   631966060        0                19

Message type        Dropped
READ                      0
RANGE_SLICE               0
_TRACE                    0
MUTATION                  0
COUNTER_MUTATION          0
REQUEST_RESPONSE          0
PAGED_RANGE               0
READ_REPAIR               0


Dropped messages indicate that there is something wrong.
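
If you want to check this on every node without reading the table by eye, a small script can pull out the non-zero counters. The following is a rough Python sketch that assumes the text layout shown above (a "Message type ... Dropped" header followed by one name and count per line); the function names and the sample text, including the dropped count of 3, are made up for the example.

```python
from typing import Dict


def parse_dropped(tpstats_output: str) -> Dict[str, int]:
    """Return {message_type: dropped_count} from `nodetool tpstats` text."""
    dropped: Dict[str, int] = {}
    in_dropped_section = False
    for line in tpstats_output.splitlines():
        fields = line.split()
        if fields[:2] == ["Message", "type"]:
            in_dropped_section = True  # rows after this header are counters
            continue
        if in_dropped_section and len(fields) >= 2 and fields[1].isdigit():
            dropped[fields[0]] = int(fields[1])
    return dropped


def nonzero_dropped(tpstats_output: str) -> Dict[str, int]:
    """Keep only the message types that were actually dropped."""
    return {name: n for name, n in parse_dropped(tpstats_output).items() if n > 0}


# Made-up sample in the same layout as the output above.
SAMPLE = """\
Pool Name Active Pending Completed Blocked All time blocked
MutationStage 0 0 787032744 0 0

Message type Dropped
READ 0
MUTATION 3
"""

print(nonzero_dropped(SAMPLE))  # {'MUTATION': 3}
```

In practice you would feed it the captured output of `nodetool tpstats` from each node; a non-empty result, dropped MUTATION messages in particular, would be a hint that a node may have missed writes or deletes.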






answered Jan 2 at 12:49 by Mandraenke
