Insert bulk data into BigQuery without keeping it in the streaming buffer

My goal here is as follows:

  1. Insert bulk records into BigQuery every half an hour
  2. Delete a record if it already exists

These records are transactions whose statuses move between pending, success, fail, and expired.

BigQuery does not let me delete rows that were inserted only half an hour ago, because they are still in the streaming buffer.

Can anyone suggest a workaround? I am getting duplicate rows in my table.

      node.js async-await google-bigquery synchronization






edited Jan 2 at 12:51 by Tamir Klein
asked Jan 2 at 12:20 by Gaurav Shewale

1 Answer

A better course of action would be to:

  • Perform periodic loads into a staging table (loading is a free operation)
  • After the load completes, execute a MERGE statement.

You would want something like this:

MERGE dataset.TransactionTable dt
USING dataset.StagingTransactionTable st
ON dt.tx_id = st.tx_id
WHEN MATCHED THEN
  UPDATE SET status = st.status
WHEN NOT MATCHED THEN
  INSERT (tx_id, status) VALUES (st.tx_id, st.status)

answered Jan 2 at 13:01 by Elliott Brossard
























• I am already using DML statements to delete and insert rows, but it causes a "query too complex" error, and there are several other fields to update for each record. Are you suggesting that I maintain two tables (temp, final)?

  – Gaurav Shewale
  Jan 3 at 6:37













          • Yes, using a level of indirection could help

            – Elliott Brossard
            Jan 3 at 12:12











• Can you update your answer to update the whole row, with all the fields?

  – Gaurav Shewale
  Jan 12 at 10:40











• Also, I need to load data into the temporary table every half an hour. How do I manage duplicates there? Would I have to truncate the temp table after my operations?

  – Gaurav Shewale
  Jan 12 at 10:45
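
One way to keep duplicates in the staging table from reaching the final table is to deduplicate inside the MERGE itself. A minimal sketch, assuming a hypothetical updated_at column that orders versions of the same transaction:

MERGE dataset.TransactionTable dt
USING (
  -- Keep only the newest staged row per tx_id; swap updated_at for
  -- whatever column actually orders your versions.
  SELECT * EXCEPT (rn)
  FROM (
    SELECT *,
      ROW_NUMBER() OVER (PARTITION BY tx_id ORDER BY updated_at DESC) AS rn
    FROM dataset.StagingTransactionTable
  )
  WHERE rn = 1
) st
ON dt.tx_id = st.tx_id
WHEN MATCHED THEN
  UPDATE SET status = st.status
WHEN NOT MATCHED THEN
  INSERT (tx_id, status) VALUES (st.tx_id, st.status)

Overwriting the staging table on each load, as discussed below, also keeps stale rows from being merged twice.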











          • You need to list the columns to update or insert explicitly, and I don't know the names of the columns in your table, so you'll need to write the complete query yourself. Yes, you should overwrite the temp table whenever you load into it.

            – Elliott Brossard
            Jan 12 at 15:40
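
A sketch of what listing the columns explicitly looks like for a whole-row merge; the columns amount and created_at here are hypothetical stand-ins for whatever the real table contains:

MERGE dataset.TransactionTable dt
USING dataset.StagingTransactionTable st
ON dt.tx_id = st.tx_id
WHEN MATCHED THEN
  -- The UPDATE branch of BigQuery's MERGE has no "update every column"
  -- shorthand, so each column to refresh is named explicitly.
  UPDATE SET
    status = st.status,
    amount = st.amount,
    created_at = st.created_at
WHEN NOT MATCHED THEN
  INSERT (tx_id, status, amount, created_at)
  VALUES (st.tx_id, st.status, st.amount, st.created_at)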










