When it's worth the tradeoff of using local secondary index in DynamoDB?
I've read guidelines for secondary indexes but I'm not sure when the ability to search fast outweighs the disadvantage of scan over attributes. Let me give you an example.
I am saving game progress data for users. The PK is user ID. I need to be able to:
Find out user progress about a particular game.
Get all finished/in progress games for a user.
Thus, I can design my SK as progress_{state} to be able to query all games by progress fast (state represents started/finished) or I can design my SK as progress_{gameId} to be able to query progress of a given game fast. However, I can't have both using just SK. When I chose one, the other operation will require a scan.
Therefore, I was thinking about using LSI which will add an overhead to the whole table as noted by Amazon here:
Every secondary index means more work for DynamoDB. When you add, delete, or replace items in a table that has local secondary indexes, DynamoDB will use additional write capacity units to update the relevant indexes.
I estimate maximum thousands of types games and I wonder whether it's worth using LSI or whether it's better to use scans for the other operation I choose.
Does anyone has any real experience with such problem? I was not able to find anything on this topic.
amazon-web-services amazon-dynamodb amazon-dynamodb-index
add a comment |
I've read guidelines for secondary indexes but I'm not sure when the ability to search fast outweighs the disadvantage of scan over attributes. Let me give you an example.
I am saving game progress data for users. The PK is user ID. I need to be able to:
Find out user progress about a particular game.
Get all finished/in progress games for a user.
Thus, I can design my SK as progress_{state} to be able to query all games by progress fast (state represents started/finished) or I can design my SK as progress_{gameId} to be able to query progress of a given game fast. However, I can't have both using just SK. When I chose one, the other operation will require a scan.
Therefore, I was thinking about using LSI which will add an overhead to the whole table as noted by Amazon here:
Every secondary index means more work for DynamoDB. When you add, delete, or replace items in a table that has local secondary indexes, DynamoDB will use additional write capacity units to update the relevant indexes.
I estimate maximum thousands of types games and I wonder whether it's worth using LSI or whether it's better to use scans for the other operation I choose.
Does anyone has any real experience with such problem? I was not able to find anything on this topic.
amazon-web-services amazon-dynamodb amazon-dynamodb-index
add a comment |
I've read guidelines for secondary indexes but I'm not sure when the ability to search fast outweighs the disadvantage of scan over attributes. Let me give you an example.
I am saving game progress data for users. The PK is user ID. I need to be able to:
Find out user progress about a particular game.
Get all finished/in progress games for a user.
Thus, I can design my SK as progress_{state} to be able to query all games by progress fast (state represents started/finished) or I can design my SK as progress_{gameId} to be able to query progress of a given game fast. However, I can't have both using just SK. When I chose one, the other operation will require a scan.
Therefore, I was thinking about using LSI which will add an overhead to the whole table as noted by Amazon here:
Every secondary index means more work for DynamoDB. When you add, delete, or replace items in a table that has local secondary indexes, DynamoDB will use additional write capacity units to update the relevant indexes.
I estimate maximum thousands of types games and I wonder whether it's worth using LSI or whether it's better to use scans for the other operation I choose.
Does anyone has any real experience with such problem? I was not able to find anything on this topic.
amazon-web-services amazon-dynamodb amazon-dynamodb-index
I've read guidelines for secondary indexes but I'm not sure when the ability to search fast outweighs the disadvantage of scan over attributes. Let me give you an example.
I am saving game progress data for users. The PK is user ID. I need to be able to:
Find out user progress about a particular game.
Get all finished/in progress games for a user.
Thus, I can design my SK as progress_{state} to be able to query all games by progress fast (state represents started/finished) or I can design my SK as progress_{gameId} to be able to query progress of a given game fast. However, I can't have both using just SK. When I chose one, the other operation will require a scan.
Therefore, I was thinking about using LSI which will add an overhead to the whole table as noted by Amazon here:
Every secondary index means more work for DynamoDB. When you add, delete, or replace items in a table that has local secondary indexes, DynamoDB will use additional write capacity units to update the relevant indexes.
I estimate maximum thousands of types games and I wonder whether it's worth using LSI or whether it's better to use scans for the other operation I choose.
Does anyone has any real experience with such problem? I was not able to find anything on this topic.
amazon-web-services amazon-dynamodb amazon-dynamodb-index
amazon-web-services amazon-dynamodb amazon-dynamodb-index
edited Dec 31 '18 at 8:11
Uwe Keim
27.5k31130210
27.5k31130210
asked Dec 30 '18 at 23:08
MartinMartin
3527
3527
add a comment |
add a comment |
1 Answer
1
active
oldest
votes
When you are designing DynamoDB tables, the main cost factor comes with IOPS for reads and writes.
This is why avoiding scans are usually better. Scans will consume a significant amount of read IOPS and it will increase with the number of items in the table since scan needs to read all the items in the table before returning the matching items.
Then coming back to your use-case of using SK for progress, it would be better to use attributes and define Secondary Indexes, since you will need to update the state later on (Which is not possible with PK and SK in the table).
So based on your use-case and the information given in the question you can define the schema as;
PK- UserID
SK- GameID
GSI- Progress (PK)
Query all games by progress fast
GSI Progress (PK)
Note: if this is for a particular user; you can change it to LSI Progress.
Query progress of a given game fast (Assuming that for a given user)
Query using UserID (PK) and GameID (SK) of the Table
Hi Ashan, thank you for the useful answer. May I ask one more thing? Currently, I save both game data (e.g. price, description of game) and progress data in one table with different PKs. Would you split this table into 2 (User_data/Game_data) so the game data queries are not 'hit' by the LSI overhead or is it ok to keep everything in one table.
– Martin
Dec 31 '18 at 10:02
1
Hi @Martin It seems splitting into two tables make sense. Rationale here is in keeping items small (to keep 1kb-write and 4kb-read limits per IOPS per table item) and also seperating data with different query patterns.
– Ashan
Dec 31 '18 at 10:18
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53982164%2fwhen-its-worth-the-tradeoff-of-using-local-secondary-index-in-dynamodb%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
When you are designing DynamoDB tables, the main cost factor comes with IOPS for reads and writes.
This is why avoiding scans are usually better. Scans will consume a significant amount of read IOPS and it will increase with the number of items in the table since scan needs to read all the items in the table before returning the matching items.
Then coming back to your use-case of using SK for progress, it would be better to use attributes and define Secondary Indexes, since you will need to update the state later on (Which is not possible with PK and SK in the table).
So based on your use-case and the information given in the question you can define the schema as;
PK- UserID
SK- GameID
GSI- Progress (PK)
Query all games by progress fast
GSI Progress (PK)
Note: if this is for a particular user; you can change it to LSI Progress.
Query progress of a given game fast (Assuming that for a given user)
Query using UserID (PK) and GameID (SK) of the Table
Hi Ashan, thank you for the useful answer. May I ask one more thing? Currently, I save both game data (e.g. price, description of game) and progress data in one table with different PKs. Would you split this table into 2 (User_data/Game_data) so the game data queries are not 'hit' by the LSI overhead or is it ok to keep everything in one table.
– Martin
Dec 31 '18 at 10:02
1
Hi @Martin It seems splitting into two tables make sense. Rationale here is in keeping items small (to keep 1kb-write and 4kb-read limits per IOPS per table item) and also seperating data with different query patterns.
– Ashan
Dec 31 '18 at 10:18
add a comment |
When you are designing DynamoDB tables, the main cost factor comes with IOPS for reads and writes.
This is why avoiding scans are usually better. Scans will consume a significant amount of read IOPS and it will increase with the number of items in the table since scan needs to read all the items in the table before returning the matching items.
Then coming back to your use-case of using SK for progress, it would be better to use attributes and define Secondary Indexes, since you will need to update the state later on (Which is not possible with PK and SK in the table).
So based on your use-case and the information given in the question you can define the schema as;
PK- UserID
SK- GameID
GSI- Progress (PK)
Query all games by progress fast
GSI Progress (PK)
Note: if this is for a particular user; you can change it to LSI Progress.
Query progress of a given game fast (Assuming that for a given user)
Query using UserID (PK) and GameID (SK) of the Table
Hi Ashan, thank you for the useful answer. May I ask one more thing? Currently, I save both game data (e.g. price, description of game) and progress data in one table with different PKs. Would you split this table into 2 (User_data/Game_data) so the game data queries are not 'hit' by the LSI overhead or is it ok to keep everything in one table.
– Martin
Dec 31 '18 at 10:02
1
Hi @Martin It seems splitting into two tables make sense. Rationale here is in keeping items small (to keep 1kb-write and 4kb-read limits per IOPS per table item) and also seperating data with different query patterns.
– Ashan
Dec 31 '18 at 10:18
add a comment |
When you are designing DynamoDB tables, the main cost factor comes with IOPS for reads and writes.
This is why avoiding scans are usually better. Scans will consume a significant amount of read IOPS and it will increase with the number of items in the table since scan needs to read all the items in the table before returning the matching items.
Then coming back to your use-case of using SK for progress, it would be better to use attributes and define Secondary Indexes, since you will need to update the state later on (Which is not possible with PK and SK in the table).
So based on your use-case and the information given in the question you can define the schema as;
PK- UserID
SK- GameID
GSI- Progress (PK)
Query all games by progress fast
GSI Progress (PK)
Note: if this is for a particular user; you can change it to LSI Progress.
Query progress of a given game fast (Assuming that for a given user)
Query using UserID (PK) and GameID (SK) of the Table
When you are designing DynamoDB tables, the main cost factor comes with IOPS for reads and writes.
This is why avoiding scans are usually better. Scans will consume a significant amount of read IOPS and it will increase with the number of items in the table since scan needs to read all the items in the table before returning the matching items.
Then coming back to your use-case of using SK for progress, it would be better to use attributes and define Secondary Indexes, since you will need to update the state later on (Which is not possible with PK and SK in the table).
So based on your use-case and the information given in the question you can define the schema as;
PK- UserID
SK- GameID
GSI- Progress (PK)
Query all games by progress fast
GSI Progress (PK)
Note: if this is for a particular user; you can change it to LSI Progress.
Query progress of a given game fast (Assuming that for a given user)
Query using UserID (PK) and GameID (SK) of the Table
edited Dec 31 '18 at 8:10
answered Dec 31 '18 at 7:52
AshanAshan
10.3k21935
10.3k21935
Hi Ashan, thank you for the useful answer. May I ask one more thing? Currently, I save both game data (e.g. price, description of game) and progress data in one table with different PKs. Would you split this table into 2 (User_data/Game_data) so the game data queries are not 'hit' by the LSI overhead or is it ok to keep everything in one table.
– Martin
Dec 31 '18 at 10:02
1
Hi @Martin It seems splitting into two tables make sense. Rationale here is in keeping items small (to keep 1kb-write and 4kb-read limits per IOPS per table item) and also seperating data with different query patterns.
– Ashan
Dec 31 '18 at 10:18
add a comment |
Hi Ashan, thank you for the useful answer. May I ask one more thing? Currently, I save both game data (e.g. price, description of game) and progress data in one table with different PKs. Would you split this table into 2 (User_data/Game_data) so the game data queries are not 'hit' by the LSI overhead or is it ok to keep everything in one table.
– Martin
Dec 31 '18 at 10:02
1
Hi @Martin It seems splitting into two tables make sense. Rationale here is in keeping items small (to keep 1kb-write and 4kb-read limits per IOPS per table item) and also seperating data with different query patterns.
– Ashan
Dec 31 '18 at 10:18
Hi Ashan, thank you for the useful answer. May I ask one more thing? Currently, I save both game data (e.g. price, description of game) and progress data in one table with different PKs. Would you split this table into 2 (User_data/Game_data) so the game data queries are not 'hit' by the LSI overhead or is it ok to keep everything in one table.
– Martin
Dec 31 '18 at 10:02
Hi Ashan, thank you for the useful answer. May I ask one more thing? Currently, I save both game data (e.g. price, description of game) and progress data in one table with different PKs. Would you split this table into 2 (User_data/Game_data) so the game data queries are not 'hit' by the LSI overhead or is it ok to keep everything in one table.
– Martin
Dec 31 '18 at 10:02
1
1
Hi @Martin It seems splitting into two tables make sense. Rationale here is in keeping items small (to keep 1kb-write and 4kb-read limits per IOPS per table item) and also seperating data with different query patterns.
– Ashan
Dec 31 '18 at 10:18
Hi @Martin It seems splitting into two tables make sense. Rationale here is in keeping items small (to keep 1kb-write and 4kb-read limits per IOPS per table item) and also seperating data with different query patterns.
– Ashan
Dec 31 '18 at 10:18
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53982164%2fwhen-its-worth-the-tradeoff-of-using-local-secondary-index-in-dynamodb%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown