Sqoop - Import - Query subcommand - Parameters in where clause





.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty{ height:90px;width:728px;box-sizing:border-box;
}







0















I would like to ask if someone can explain me a Sqoop "query" subcommand functionality in details.



I'm going to use an example of this video:



https://youtu.be/7oZ_CctyS5Q?list=PLf0swTFhTI8rJvGpOp-LujOcpk-Rlz-yE&t=1515



query="select * from orders join order_items on orders.order_id = order_items.order_item_order_id where $CONDITIONS"



As we can see, there is a parameter "$CONDITIONS" added in the where condition.



Probably I'm wrong, but after watch the video, I understood the condition will be replaced as "1=1" if the parameter "$CONDITIONS" exists or by "1=0" if don't.



My questions are:



1- Why it is recommend to use the parameter in the Where clause of the "query" subcommand



2- What happen if we really have a WHERE clause in a parameter which we want to use in the "query" subcommand and not in the "where" subcommand?



Appreciate your help on this,



David.










share|improve this question































    0















    I would like to ask if someone can explain me a Sqoop "query" subcommand functionality in details.



    I'm going to use an example of this video:



    https://youtu.be/7oZ_CctyS5Q?list=PLf0swTFhTI8rJvGpOp-LujOcpk-Rlz-yE&t=1515



    query="select * from orders join order_items on orders.order_id = order_items.order_item_order_id where $CONDITIONS"



    As we can see, there is a parameter "$CONDITIONS" added in the where condition.



    Probably I'm wrong, but after watch the video, I understood the condition will be replaced as "1=1" if the parameter "$CONDITIONS" exists or by "1=0" if don't.



    My questions are:



    1- Why it is recommend to use the parameter in the Where clause of the "query" subcommand



    2- What happen if we really have a WHERE clause in a parameter which we want to use in the "query" subcommand and not in the "where" subcommand?



    Appreciate your help on this,



    David.










    share|improve this question



























      0












      0








      0








      I would like to ask if someone can explain me a Sqoop "query" subcommand functionality in details.



      I'm going to use an example of this video:



      https://youtu.be/7oZ_CctyS5Q?list=PLf0swTFhTI8rJvGpOp-LujOcpk-Rlz-yE&t=1515



      query="select * from orders join order_items on orders.order_id = order_items.order_item_order_id where $CONDITIONS"



      As we can see, there is a parameter "$CONDITIONS" added in the where condition.



      Probably I'm wrong, but after watch the video, I understood the condition will be replaced as "1=1" if the parameter "$CONDITIONS" exists or by "1=0" if don't.



      My questions are:



      1- Why it is recommend to use the parameter in the Where clause of the "query" subcommand



      2- What happen if we really have a WHERE clause in a parameter which we want to use in the "query" subcommand and not in the "where" subcommand?



      Appreciate your help on this,



      David.










      share|improve this question
















      I would like to ask if someone can explain me a Sqoop "query" subcommand functionality in details.



      I'm going to use an example of this video:



      https://youtu.be/7oZ_CctyS5Q?list=PLf0swTFhTI8rJvGpOp-LujOcpk-Rlz-yE&t=1515



      query="select * from orders join order_items on orders.order_id = order_items.order_item_order_id where $CONDITIONS"



      As we can see, there is a parameter "$CONDITIONS" added in the where condition.



      Probably I'm wrong, but after watch the video, I understood the condition will be replaced as "1=1" if the parameter "$CONDITIONS" exists or by "1=0" if don't.



      My questions are:



      1- Why it is recommend to use the parameter in the Where clause of the "query" subcommand



      2- What happen if we really have a WHERE clause in a parameter which we want to use in the "query" subcommand and not in the "where" subcommand?



      Appreciate your help on this,



      David.







      import parameters where sqoop






      share|improve this question















      share|improve this question













      share|improve this question




      share|improve this question








      edited Jan 4 at 13:10









      shriyog

      595818




      595818










      asked Jan 4 at 12:05









      DavidDavid

      31




      31
























          1 Answer
          1






          active

          oldest

          votes


















          0














          I have not exactly understood your queries, however let me try to answer. Let me know if you are looking for something else.



          1- Why it is recommend to use the parameter in the Where clause of the "query" subcommand



          SQOOP imports data in parallel from RDBMS. You can specify no of map tasks (parallel process) to use to perform the import by using the -m or --num-mappers argument. By default 4 map tasks will be used by SQOOP.



          For example, if you had a table with a primary key column of order_id whose minimum value was 0 and maximum value was 1000, and SQOOP was directed to use 4 tasks, SQOOP would run four processes which each execute SQL statements of the form
          SELECT * FROM orders WHERE $CONDITIONS



          Now $CONDITIONS place holder would be replaced as below





          1. for map task 1: order_id >= 0 and order_id < 250
            and final SQL will look like




            SELECT * FROM orders WHERE order_id >= 0 and order_id < 250





          2. for map task 2: order_id >= 250 and order_id < 500
            and final SQL will look like




            SELECT * FROM orders WHERE order_id >= 250 and order_id < 500





          3. for map task 3: order_id >= 500 and order_id < 750
            and final SQL will look like




            SELECT * FROM orders WHERE order_id >= 500 and order_id < 750





          4. for map task 4: order_id >= 750 and order_id < 1001
            and final SQL will look like




            SELECT * FROM orders WHERE order_id >= 750 and order_id < 1001





          As you can see $CONDITIONS is a place holder which SQOOP will search and replace with generated values. Please note that specifying $CONDITIONS is mandatory in case no of map tasks is greater than 1.



          2- What happen if we really have a WHERE clause in a parameter which we want to use in the "query" subcommand and not in the "where" subcommand?



          You can simply specify your filter criterion after $CONDITIONS prefixed by AND.



          For example, if you want to load only those orders where order amount is greater than 100.




          query="select * from orders join order_items on orders.order_id = order_items.order_item_order_id where $CONDITIONS AND order_amount > 100"




          Hope this clarifies !






          share|improve this answer
























            Your Answer






            StackExchange.ifUsing("editor", function () {
            StackExchange.using("externalEditor", function () {
            StackExchange.using("snippets", function () {
            StackExchange.snippets.init();
            });
            });
            }, "code-snippets");

            StackExchange.ready(function() {
            var channelOptions = {
            tags: "".split(" "),
            id: "1"
            };
            initTagRenderer("".split(" "), "".split(" "), channelOptions);

            StackExchange.using("externalEditor", function() {
            // Have to fire editor after snippets, if snippets enabled
            if (StackExchange.settings.snippets.snippetsEnabled) {
            StackExchange.using("snippets", function() {
            createEditor();
            });
            }
            else {
            createEditor();
            }
            });

            function createEditor() {
            StackExchange.prepareEditor({
            heartbeatType: 'answer',
            autoActivateHeartbeat: false,
            convertImagesToLinks: true,
            noModals: true,
            showLowRepImageUploadWarning: true,
            reputationToPostImages: 10,
            bindNavPrevention: true,
            postfix: "",
            imageUploader: {
            brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
            contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
            allowUrls: true
            },
            onDemand: true,
            discardSelector: ".discard-answer"
            ,immediatelyShowMarkdownHelp:true
            });


            }
            });














            draft saved

            draft discarded


















            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f54038665%2fsqoop-import-query-subcommand-parameters-in-where-clause%23new-answer', 'question_page');
            }
            );

            Post as a guest















            Required, but never shown

























            1 Answer
            1






            active

            oldest

            votes








            1 Answer
            1






            active

            oldest

            votes









            active

            oldest

            votes






            active

            oldest

            votes









            0














            I have not exactly understood your queries, however let me try to answer. Let me know if you are looking for something else.



            1- Why it is recommend to use the parameter in the Where clause of the "query" subcommand



            SQOOP imports data in parallel from RDBMS. You can specify no of map tasks (parallel process) to use to perform the import by using the -m or --num-mappers argument. By default 4 map tasks will be used by SQOOP.



            For example, if you had a table with a primary key column of order_id whose minimum value was 0 and maximum value was 1000, and SQOOP was directed to use 4 tasks, SQOOP would run four processes which each execute SQL statements of the form
            SELECT * FROM orders WHERE $CONDITIONS



            Now $CONDITIONS place holder would be replaced as below





            1. for map task 1: order_id >= 0 and order_id < 250
              and final SQL will look like




              SELECT * FROM orders WHERE order_id >= 0 and order_id < 250





            2. for map task 2: order_id >= 250 and order_id < 500
              and final SQL will look like




              SELECT * FROM orders WHERE order_id >= 250 and order_id < 500





            3. for map task 3: order_id >= 500 and order_id < 750
              and final SQL will look like




              SELECT * FROM orders WHERE order_id >= 500 and order_id < 750





            4. for map task 4: order_id >= 750 and order_id < 1001
              and final SQL will look like




              SELECT * FROM orders WHERE order_id >= 750 and order_id < 1001





            As you can see $CONDITIONS is a place holder which SQOOP will search and replace with generated values. Please note that specifying $CONDITIONS is mandatory in case no of map tasks is greater than 1.



            2- What happen if we really have a WHERE clause in a parameter which we want to use in the "query" subcommand and not in the "where" subcommand?



            You can simply specify your filter criterion after $CONDITIONS prefixed by AND.



            For example, if you want to load only those orders where order amount is greater than 100.




            query="select * from orders join order_items on orders.order_id = order_items.order_item_order_id where $CONDITIONS AND order_amount > 100"




            Hope this clarifies !






            share|improve this answer




























              0














              I have not exactly understood your queries, however let me try to answer. Let me know if you are looking for something else.



              1- Why it is recommend to use the parameter in the Where clause of the "query" subcommand



              SQOOP imports data in parallel from RDBMS. You can specify no of map tasks (parallel process) to use to perform the import by using the -m or --num-mappers argument. By default 4 map tasks will be used by SQOOP.



              For example, if you had a table with a primary key column of order_id whose minimum value was 0 and maximum value was 1000, and SQOOP was directed to use 4 tasks, SQOOP would run four processes which each execute SQL statements of the form
              SELECT * FROM orders WHERE $CONDITIONS



              Now $CONDITIONS place holder would be replaced as below





              1. for map task 1: order_id >= 0 and order_id < 250
                and final SQL will look like




                SELECT * FROM orders WHERE order_id >= 0 and order_id < 250





              2. for map task 2: order_id >= 250 and order_id < 500
                and final SQL will look like




                SELECT * FROM orders WHERE order_id >= 250 and order_id < 500





              3. for map task 3: order_id >= 500 and order_id < 750
                and final SQL will look like




                SELECT * FROM orders WHERE order_id >= 500 and order_id < 750





              4. for map task 4: order_id >= 750 and order_id < 1001
                and final SQL will look like




                SELECT * FROM orders WHERE order_id >= 750 and order_id < 1001





              As you can see $CONDITIONS is a place holder which SQOOP will search and replace with generated values. Please note that specifying $CONDITIONS is mandatory in case no of map tasks is greater than 1.



              2- What happen if we really have a WHERE clause in a parameter which we want to use in the "query" subcommand and not in the "where" subcommand?



              You can simply specify your filter criterion after $CONDITIONS prefixed by AND.



              For example, if you want to load only those orders where order amount is greater than 100.




              query="select * from orders join order_items on orders.order_id = order_items.order_item_order_id where $CONDITIONS AND order_amount > 100"




              Hope this clarifies !






              share|improve this answer


























                0












                0








                0







                I have not exactly understood your queries, however let me try to answer. Let me know if you are looking for something else.



                1- Why it is recommend to use the parameter in the Where clause of the "query" subcommand



                SQOOP imports data in parallel from RDBMS. You can specify no of map tasks (parallel process) to use to perform the import by using the -m or --num-mappers argument. By default 4 map tasks will be used by SQOOP.



                For example, if you had a table with a primary key column of order_id whose minimum value was 0 and maximum value was 1000, and SQOOP was directed to use 4 tasks, SQOOP would run four processes which each execute SQL statements of the form
                SELECT * FROM orders WHERE $CONDITIONS



                Now $CONDITIONS place holder would be replaced as below





                1. for map task 1: order_id >= 0 and order_id < 250
                  and final SQL will look like




                  SELECT * FROM orders WHERE order_id >= 0 and order_id < 250





                2. for map task 2: order_id >= 250 and order_id < 500
                  and final SQL will look like




                  SELECT * FROM orders WHERE order_id >= 250 and order_id < 500





                3. for map task 3: order_id >= 500 and order_id < 750
                  and final SQL will look like




                  SELECT * FROM orders WHERE order_id >= 500 and order_id < 750





                4. for map task 4: order_id >= 750 and order_id < 1001
                  and final SQL will look like




                  SELECT * FROM orders WHERE order_id >= 750 and order_id < 1001





                As you can see $CONDITIONS is a place holder which SQOOP will search and replace with generated values. Please note that specifying $CONDITIONS is mandatory in case no of map tasks is greater than 1.



                2- What happen if we really have a WHERE clause in a parameter which we want to use in the "query" subcommand and not in the "where" subcommand?



                You can simply specify your filter criterion after $CONDITIONS prefixed by AND.



                For example, if you want to load only those orders where order amount is greater than 100.




                query="select * from orders join order_items on orders.order_id = order_items.order_item_order_id where $CONDITIONS AND order_amount > 100"




                Hope this clarifies !






                share|improve this answer













                I have not exactly understood your queries, however let me try to answer. Let me know if you are looking for something else.



                1- Why it is recommend to use the parameter in the Where clause of the "query" subcommand



                SQOOP imports data in parallel from RDBMS. You can specify no of map tasks (parallel process) to use to perform the import by using the -m or --num-mappers argument. By default 4 map tasks will be used by SQOOP.



                For example, if you had a table with a primary key column of order_id whose minimum value was 0 and maximum value was 1000, and SQOOP was directed to use 4 tasks, SQOOP would run four processes which each execute SQL statements of the form
                SELECT * FROM orders WHERE $CONDITIONS



                Now $CONDITIONS place holder would be replaced as below





                1. for map task 1: order_id >= 0 and order_id < 250
                  and final SQL will look like




                  SELECT * FROM orders WHERE order_id >= 0 and order_id < 250





                2. for map task 2: order_id >= 250 and order_id < 500
                  and final SQL will look like




                  SELECT * FROM orders WHERE order_id >= 250 and order_id < 500





                3. for map task 3: order_id >= 500 and order_id < 750
                  and final SQL will look like




                  SELECT * FROM orders WHERE order_id >= 500 and order_id < 750





                4. for map task 4: order_id >= 750 and order_id < 1001
                  and final SQL will look like




                  SELECT * FROM orders WHERE order_id >= 750 and order_id < 1001





                As you can see $CONDITIONS is a place holder which SQOOP will search and replace with generated values. Please note that specifying $CONDITIONS is mandatory in case no of map tasks is greater than 1.



                2- What happen if we really have a WHERE clause in a parameter which we want to use in the "query" subcommand and not in the "where" subcommand?



                You can simply specify your filter criterion after $CONDITIONS prefixed by AND.



                For example, if you want to load only those orders where order amount is greater than 100.




                query="select * from orders join order_items on orders.order_id = order_items.order_item_order_id where $CONDITIONS AND order_amount > 100"




                Hope this clarifies !







                share|improve this answer












                share|improve this answer



                share|improve this answer










                answered Jan 8 at 11:08









                ShivenShiven

                1613




                1613
































                    draft saved

                    draft discarded




















































                    Thanks for contributing an answer to Stack Overflow!


                    • Please be sure to answer the question. Provide details and share your research!

                    But avoid



                    • Asking for help, clarification, or responding to other answers.

                    • Making statements based on opinion; back them up with references or personal experience.


                    To learn more, see our tips on writing great answers.




                    draft saved


                    draft discarded














                    StackExchange.ready(
                    function () {
                    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f54038665%2fsqoop-import-query-subcommand-parameters-in-where-clause%23new-answer', 'question_page');
                    }
                    );

                    Post as a guest















                    Required, but never shown





















































                    Required, but never shown














                    Required, but never shown












                    Required, but never shown







                    Required, but never shown

































                    Required, but never shown














                    Required, but never shown












                    Required, but never shown







                    Required, but never shown







                    Popular posts from this blog

                    Monofisismo

                    Angular Downloading a file using contenturl with Basic Authentication

                    Olmecas