Deleted data in Cassandra comes back, like a ghost

I have a 3-node Cassandra cluster (3.7) with a keyspace:

    CREATE KEYSPACE demo WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '2'} AND durable_writes = true;

and a table:

    CREATE TABLE tradingdate (key text, tradingdate date, PRIMARY KEY (key, tradingdate));

One day I deleted a row like this:

    DELETE FROM tradingdate
    WHERE key = 'tradingDay' AND tradingdate = '2018-12-31';

After that, the deleted row became a ghost. A plain range query does not return it:

    SELECT * FROM tradingdate
    WHERE key = 'tradingDay' AND tradingdate > '2018-12-27' LIMIT 2;

     key        | tradingdate
    ------------+-------------
     tradingDay |  2018-12-28
     tradingDay |  2019-01-02

A descending query:

    SELECT * FROM tradingdate
    WHERE key = 'tradingDay' AND tradingdate < '2019-01-03'
    ORDER BY tradingdate DESC LIMIT 2;

     key        | tradingdate
    ------------+-------------
     tradingDay |  2019-01-02
     tradingDay |  2018-12-31

So when I use ORDER BY, the deleted row (tradingDay, 2018-12-31) comes back.

I guessed that the delete had only been applied on one node and that the row still existed on another, so I ran:

    nodetool repair demo tradingdate

on all 3 nodes, after which the deleted row disappeared completely.

So I want to know: why can I see the ghost row only when I use ORDER BY?

edited Jan 2 at 14:33 by Naveen Nelamali

asked Jan 1 at 6:59 by jungle green

What version of Cassandra?

– Alex Ott
Jan 1 at 10:51

What consistency level are you using when reading?

– creker
Jan 1 at 12:35

I run the queries in cqlsh, where the default consistency level is ONE. I remember hitting this situation once before, and setting the consistency level to TWO led to the same result.

– jungle green
Jan 1 at 12:41

That's probably the reason. Your delete was not propagated to the other nodes, and with consistency ONE you may read stale data. Try using QUORUM to get stronger consistency.

– creker
Jan 1 at 12:49

But strangely, when I don't use ORDER BY, I don't get the stale data.

– jungle green
Jan 1 at 12:59

Tags: cassandra, nosql

1 Answer

This is some good reading about deletes in Cassandra (and other distributed systems as well):

http://thelastpickle.com/blog/2016/07/27/about-deletes-and-tombstones.html

As well as:

https://docs.datastax.com/en/cassandra/3.0/cassandra/dml/dmlAboutDeletes.html

You will need to run (or schedule) a routine repair at least once within gc_grace_seconds, which defaults to ten days, to prevent deleted data from reappearing in your cluster.
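To make the reasoning behind that rule concrete, here is a minimal toy model in Python. It is not how Cassandra is implemented; `Cell`, `read_one`, `repair`, `purge_tombstones` and the timestamps are hypothetical names and values chosen for illustration. It only assumes last-write-wins reconciliation and that tombstones become purgeable after gc_grace_seconds. It shows two things: a delete that reached only one replica lets a read at consistency ONE return different answers depending on which replica serves it, and a repair that runs only after the tombstone has been purged copies the old row back.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

GC_GRACE_SECONDS = 864000  # Cassandra's default, i.e. 10 days


@dataclass(frozen=True)
class Cell:
    """One version of a row: a write timestamp plus a live/deleted flag."""
    timestamp: int
    deleted: bool = False  # True marks a tombstone


Replica = Dict[str, Cell]  # hypothetical stand-in for one node's data


def read_one(replica: Replica, key: str) -> Optional[str]:
    """Read at consistency ONE: trust whatever a single replica holds."""
    cell = replica.get(key)
    return key if cell is not None and not cell.deleted else None


def repair(a: Replica, b: Replica) -> None:
    """Toy repair: both replicas end up with the newest cell per key."""
    for key in set(a) | set(b):
        cells = [c for c in (a.get(key), b.get(key)) if c is not None]
        a[key] = b[key] = max(cells, key=lambda c: c.timestamp)


def purge_tombstones(replica: Replica, now: int) -> None:
    """Toy compaction: drop tombstones older than gc_grace_seconds."""
    expired = [k for k, c in replica.items()
               if c.deleted and now - c.timestamp > GC_GRACE_SECONDS]
    for key in expired:
        del replica[key]


def divergent_reads() -> Tuple[Optional[str], Optional[str]]:
    """The delete reaches only r1, so the two replicas answer differently."""
    row = "2018-12-31"
    r1: Replica = {row: Cell(timestamp=100)}
    r2: Replica = {row: Cell(timestamp=100)}
    r1[row] = Cell(timestamp=200, deleted=True)  # r2 misses the delete
    return read_one(r1, row), read_one(r2, row)


def scenario(repair_in_time: bool) -> Optional[str]:
    """Return what r1 serves for the row after a late repair."""
    row = "2018-12-31"
    r1: Replica = {row: Cell(timestamp=100)}
    r2: Replica = {row: Cell(timestamp=100)}
    r1[row] = Cell(timestamp=200, deleted=True)  # r2 misses the delete
    if repair_in_time:
        repair(r1, r2)  # the tombstone is copied to r2 before it expires
    now = 200 + GC_GRACE_SECONDS + 1
    purge_tombstones(r1, now)
    purge_tombstones(r2, now)
    repair(r1, r2)  # a repair that runs after gc_grace_seconds
    return read_one(r1, row)


if __name__ == "__main__":
    print(divergent_reads())
    print(scenario(repair_in_time=True))
    print(scenario(repair_in_time=False))
```

In this model, repairing before the tombstone expires leaves the row deleted everywhere, while repairing too late brings it back on the replica that had correctly deleted it. The model does not explain why a particular query was served by the stale replica; it only shows that at ONE either replica's answer is possible.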

You should also check for dropped messages, in case one of your nodes is missing deletes (and other messages):

    # nodetool tpstats
    Pool Name                    Active   Pending      Completed   Blocked  All time blocked
    MutationStage                     0         0      787032744         0                 0
    ReadStage                         0         0     1627843193         0                 0
    RequestResponseStage              0         0     2257452312         0                 0
    ReadRepairStage                   0         0       99910415         0                 0
    CounterMutationStage              0         0              0         0                 0
    HintedHandoff                     0         0           1582         0                 0
    MiscStage                         0         0              0         0                 0
    CompactionExecutor                0         0        6649458         0                 0
    MemtableReclaimMemory             0         0          17987         0                 0
    PendingRangeCalculator            0         0             46         0                 0
    GossipStage                       0         0       22766295         0                 0
    MigrationStage                    0         0              8         0                 0
    MemtablePostFlush                 0         0         127844         0                 0
    ValidationExecutor                0         0              0         0                 0
    Sampler                           0         0              0         0                 0
    MemtableFlushWriter               0         0          17851         0                 0
    InternalResponseStage             0         0           8669         0                 0
    AntiEntropyStage                  0         0              0         0                 0
    CacheCleanupExecutor              0         0              0         0                 0
    Native-Transport-Requests         0         0      631966060         0                19

    Message type           Dropped
    READ                         0
    RANGE_SLICE                  0
    _TRACE                       0
    MUTATION                     0
    COUNTER_MUTATION             0
    REQUEST_RESPONSE             0
    PAGED_RANGE                  0
    READ_REPAIR                  0

Non-zero counts in the Dropped column indicate that something is wrong.
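If you want to check this on every node without reading the table by eye, a small script can pull out the non-zero counters. The sketch below is only an illustration: the function names are hypothetical, it assumes the two-column "Message type / Dropped" layout shown above (other Cassandra versions may print additional columns), and the sample text with a MUTATION count of 12 is made up for the demonstration.

```python
from typing import Dict


def dropped_messages(tpstats_output: str) -> Dict[str, int]:
    """Parse the 'Message type / Dropped' section of `nodetool tpstats` text.

    Assumes the two-column layout shown in the answer above.
    """
    counts: Dict[str, int] = {}
    in_dropped_section = False
    for line in tpstats_output.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if fields[:2] == ["Message", "type"]:
            in_dropped_section = True  # header of the dropped-messages table
            continue
        if in_dropped_section and len(fields) == 2 and fields[1].isdigit():
            counts[fields[0]] = int(fields[1])
    return counts


def nonzero_dropped(tpstats_output: str) -> Dict[str, int]:
    """Keep only the message types that have actually been dropped."""
    return {name: n for name, n in dropped_messages(tpstats_output).items() if n > 0}


# Made-up sample in the same layout as the output above (not real data).
SAMPLE = """\
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
MutationStage                     0         0      787032744         0                 0

Message type           Dropped
READ                         0
MUTATION                    12
READ_REPAIR                  0
"""

if __name__ == "__main__":
    print(nonzero_dropped(SAMPLE))
```

On a real node you would feed it the captured output of `nodetool tpstats`; an empty result means no message type has been dropped since the node started.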

answered Jan 2 at 12:49 by Mandraenke
