Insert bulk data into BigQuery without keeping it in the streaming buffer
My goal here is as follows:
- Insert bulk records into BigQuery every half an hour
- Delete the record if it already exists
These records are transactions whose status moves through: pending, success, fail, and expire.
BigQuery does not allow me to delete rows that were inserted just half an hour ago, because they are still in the streaming buffer.
Can anyone suggest a workaround? I am getting duplicate rows in my table.
node.js async-await google-bigquery synchronization
asked Jan 2 at 12:20 by Gaurav Shewale
edited Jan 2 at 12:51 by Tamir Klein
1 Answer
A better course of action would be to:
- Perform periodic loads into a staging table (loading is a free operation)
- After the load completes, execute a MERGE statement.
You would want something like this:
MERGE dataset.TransactionTable dt
USING dataset.StagingTransactionTable st
ON dt.tx_id = st.tx_id
WHEN MATCHED THEN
UPDATE SET status = st.status
WHEN NOT MATCHED THEN
INSERT (tx_id, status) VALUES (st.tx_id, st.status)
answered Jan 2 at 13:01 by Elliott Brossard
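If you drive this from Node.js, a minimal sketch of the second step could look like the following. It assumes the @google-cloud/bigquery client and reuses the table and column names from the query above (adjust both to your schema); it simply submits the MERGE once the half-hourly load has finished:

const {BigQuery} = require('@google-cloud/bigquery');

async function mergeStagingIntoFinal() {
  const bigquery = new BigQuery();

  // Same statement as above; any additional columns go in the SET and INSERT lists.
  const query = `
    MERGE dataset.TransactionTable dt
    USING dataset.StagingTransactionTable st
    ON dt.tx_id = st.tx_id
    WHEN MATCHED THEN
      UPDATE SET status = st.status
    WHEN NOT MATCHED THEN
      INSERT (tx_id, status) VALUES (st.tx_id, st.status)`;

  // query() waits for the job to complete; a DML MERGE returns no rows.
  await bigquery.query({query});
}

mergeStagingIntoFinal().catch(console.error);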
I am already using DML statements to delete and insert rows, but it's causing a "query too complex" error, and there are several other fields to be updated for each record. Are you suggesting that I should maintain two tables (temp, final)?
– Gaurav Shewale
Jan 3 at 6:37
Yes, using a level of indirection could help.
– Elliott Brossard
Jan 3 at 12:12
Can you update your answer to update the whole row with all the fields?
– Gaurav Shewale
Jan 12 at 10:40
Also, I need to load data into the temporary table every half an hour. How do I manage duplicates there? Would I have to truncate the temp table after my operations?
– Gaurav Shewale
Jan 12 at 10:45
You need to list the columns to update or insert explicitly, and I don't know the names of the columns in your table, so you'll need to write the complete query yourself. Yes, you should overwrite the temp table whenever you load into it.
– Elliott Brossard
Jan 12 at 15:40
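To illustrate the last comment: one way to overwrite the staging table on every half-hourly load is to run the load job with WRITE_TRUNCATE, so leftover rows from the previous run can never produce duplicates. A hedged sketch with the Node.js client, assuming the data arrives as a newline-delimited JSON file (the file name and dataset here are placeholders):

const {BigQuery} = require('@google-cloud/bigquery');

async function reloadStagingTable() {
  const bigquery = new BigQuery();

  // WRITE_TRUNCATE replaces the staging table's contents on each load,
  // so no separate TRUNCATE/DELETE step is needed between runs.
  await bigquery
    .dataset('dataset')
    .table('StagingTransactionTable')
    .load('transactions.json', {
      sourceFormat: 'NEWLINE_DELIMITED_JSON',
      writeDisposition: 'WRITE_TRUNCATE',
    });
}

reloadStagingTable().catch(console.error);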