Deleted data in Cassandra comes back, like a ghost
I have a 3-node Cassandra cluster (3.7) with a keyspace
CREATE KEYSPACE demo WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '2'} AND durable_writes = true;
and a table
CREATE TABLE tradingdate (key text, tradingdate date, PRIMARY KEY (key, tradingdate));
One day I deleted one row like this:
delete from tradingdate
where key='tradingDay' and tradingdate='2018-12-31';
Since then the deleted row behaves like a ghost. This query does not return it:
select * from tradingdate
where key='tradingDay' and tradingdate>'2018-12-27' limit 2;
key | tradingdate
------------+-------------
tradingDay | 2018-12-28
tradingDay | 2019-01-02
But this query returns the deleted row:
select * from tradingdate
where key='tradingDay' and tradingdate<'2019-01-03'
order by tradingdate desc limit 2;
key | tradingdate
------------+-------------
tradingDay | 2019-01-02
tradingDay | 2018-12-31
So when I use ORDER BY, the deleted row (tradingDay, 2018-12-31) comes back.
My guess is that the delete was applied on only one node and the row still exists on another node. So I ran
nodetool repair demo tradingdate
on all 3 nodes, and the deleted row disappeared completely.
Why can I see the ghost row only when I use ORDER BY?
cassandra nosql
What version of Cassandra?
– Alex Ott
Jan 1 at 10:51
What consistency level are you using when reading?
– creker
Jan 1 at 12:35
I execute the query in cqlsh, where the default consistency is ONE. I remember hitting this situation once before, and setting consistency to TWO led to the same result.
– jungle green
Jan 1 at 12:41
That's probably the reason. Your delete was not propagated to the other nodes, and with consistency ONE you may read stale data. Try using QUORUM to get stronger consistency.
– creker
Jan 1 at 12:49
But strangely, when I don't use ORDER BY, I don't get the stale data.
– jungle green
Jan 1 at 12:59
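The stale-read explanation in the comments above can be sketched with a toy model. This is not Cassandra's actual read path: the `Cell` class and `read` function are hypothetical names introduced only for illustration, and the model assumes that a read contacts the first N replicas and keeps the cell with the newest timestamp.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cell:
    """One replica's view of a row: a write timestamp and a deleted flag."""
    timestamp: int
    deleted: bool  # True means this cell is a tombstone


def read(replicas: List[Optional[Cell]], consistency: int) -> Optional[Cell]:
    """Contact the first `consistency` replicas and keep the newest cell.

    Returns None when no contacted replica has the row or when the
    newest cell among them is a tombstone.
    """
    contacted = [c for c in replicas[:consistency] if c is not None]
    if not contacted:
        return None
    newest = max(contacted, key=lambda c: c.timestamp)
    return None if newest.deleted else newest


stale = Cell(timestamp=1, deleted=False)      # replica that missed the delete
tombstone = Cell(timestamp=2, deleted=True)   # replica that applied the delete
replicas = [stale, tombstone]

ghost = read(replicas, consistency=1)    # only the stale replica answers
merged = read(replicas, consistency=2)   # both answer, the tombstone is newer

print(ghost)
print(merged)
```

In this model the read that contacts one replica returns the live row, while the read that contacts both returns nothing. Which replica answers a real query at consistency ONE depends on the coordinator, so the model does not by itself explain why only the ORDER BY query showed the row.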
edited Jan 2 at 14:33 by Naveen Nelamali
asked Jan 1 at 6:59 by jungle green
1 Answer
This is some good reading about deletes in Cassandra (and other distributed systems as well):
http://thelastpickle.com/blog/2016/07/27/about-deletes-and-tombstones.html
As well as:
https://docs.datastax.com/en/cassandra/3.0/cassandra/dml/dmlAboutDeletes.html
You need to run (or schedule) a routine repair at least once within gc_grace_seconds, which defaults to ten days, to prevent deleted data from reappearing in your cluster.
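Why the repair has to happen inside gc_grace_seconds can be illustrated with a small toy model. It is a sketch under simplifying assumptions, not Cassandra's implementation: there are exactly two replicas, replica A applied the delete, replica B missed it, and A drops its tombstone as soon as gc_grace_seconds has passed. The function name `row_visible_after` is made up for this example.

```python
GC_GRACE_SECONDS = 864000  # ten days, the default mentioned above
DAY = 86400


def row_visible_after(repair_at, delete_at=0, gc_grace=GC_GRACE_SECONDS):
    """Toy model of two replicas after a delete that reached only one.

    Replica A applied the delete at `delete_at`; replica B still holds
    the live row. `repair_at` is the time of the first repair in seconds
    (None means no repair ever runs). Returns True if the row ends up
    live again, False if it stays deleted.
    """
    purge_at = delete_at + gc_grace  # after this, A may drop its tombstone
    if repair_at is not None and repair_at < purge_at:
        # Repair copies the tombstone to B while it still exists,
        # so the row stays deleted on both replicas.
        return False
    # The tombstone was purged on A before any repair. B's live row is
    # the only version left, so reads see it and a later repair copies
    # it back to A.
    return True


print(row_visible_after(repair_at=5 * DAY))   # repair inside the window
print(row_visible_after(repair_at=11 * DAY))  # repair too late
```

The first call models a repair on day 5 and the row stays deleted. The second models a repair on day 11, after the tombstone is gone, and the row comes back.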
Also you should look for dropped messages in case one of your nodes is missing deletes (and other messages):
# nodetool tpstats
Pool Name                   Active   Pending      Completed   Blocked  All time blocked
MutationStage                    0         0      787032744         0                 0
ReadStage                        0         0     1627843193         0                 0
RequestResponseStage             0         0     2257452312         0                 0
ReadRepairStage                  0         0       99910415         0                 0
CounterMutationStage             0         0              0         0                 0
HintedHandoff                    0         0           1582         0                 0
MiscStage                        0         0              0         0                 0
CompactionExecutor               0         0        6649458         0                 0
MemtableReclaimMemory            0         0          17987         0                 0
PendingRangeCalculator           0         0             46         0                 0
GossipStage                      0         0       22766295         0                 0
MigrationStage                   0         0              8         0                 0
MemtablePostFlush                0         0         127844         0                 0
ValidationExecutor               0         0              0         0                 0
Sampler                          0         0              0         0                 0
MemtableFlushWriter              0         0          17851         0                 0
InternalResponseStage            0         0           8669         0                 0
AntiEntropyStage                 0         0              0         0                 0
CacheCleanupExecutor             0         0              0         0                 0
Native-Transport-Requests        0         0      631966060         0                19

Message type           Dropped
READ                         0
RANGE_SLICE                  0
_TRACE                       0
MUTATION                     0
COUNTER_MUTATION             0
REQUEST_RESPONSE             0
PAGED_RANGE                  0
READ_REPAIR                  0
Dropped messages indicate that there is something wrong.
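Checking the dropped counters by eye gets tedious across several nodes, so here is a minimal sketch of a helper that scans saved `nodetool tpstats` text for non-zero values. It assumes the two-column "Message type / Dropped" layout shown above (other Cassandra versions may print extra columns), the function name `dropped_messages` is hypothetical, and the sample counts are invented for the example rather than taken from any real cluster.

```python
def dropped_messages(tpstats_output: str) -> dict:
    """Return {message_type: count} for every type with a non-zero Dropped count."""
    dropped = {}
    in_dropped_section = False
    for line in tpstats_output.splitlines():
        fields = line.split()
        if fields[:2] == ["Message", "type"]:
            # Everything after this header line is the dropped-message table.
            in_dropped_section = True
            continue
        if in_dropped_section and len(fields) == 2 and fields[1].isdigit():
            count = int(fields[1])
            if count > 0:
                dropped[fields[0]] = count
    return dropped


# Invented sample: the MUTATION count is made up to show a non-zero hit.
sample = """\
Pool Name Active Pending Completed Blocked All time blocked
MutationStage 0 0 787032744 0 0
Message type Dropped
READ 0
MUTATION 12
READ_REPAIR 0
"""

print(dropped_messages(sample))
```

A non-empty result for MUTATION on any node would be consistent with that node having missed writes or deletes, which is the case the answer warns about.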
answered Jan 2 at 12:49 by Mandraenke