Sqoop - Import - Query subcommand - Parameters in where clause
.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty{ height:90px;width:728px;box-sizing:border-box;
}
I would like to ask if someone can explain me a Sqoop "query" subcommand functionality in details.
I'm going to use an example of this video:
https://youtu.be/7oZ_CctyS5Q?list=PLf0swTFhTI8rJvGpOp-LujOcpk-Rlz-yE&t=1515
query="select * from orders join order_items on orders.order_id = order_items.order_item_order_id where $CONDITIONS"
As we can see, there is a parameter "$CONDITIONS" added in the where condition.
Probably I'm wrong, but after watch the video, I understood the condition will be replaced as "1=1" if the parameter "$CONDITIONS" exists or by "1=0" if don't.
My questions are:
1- Why it is recommend to use the parameter in the Where clause of the "query" subcommand
2- What happen if we really have a WHERE clause in a parameter which we want to use in the "query" subcommand and not in the "where" subcommand?
Appreciate your help on this,
David.
import parameters where sqoop
add a comment |
I would like to ask if someone can explain me a Sqoop "query" subcommand functionality in details.
I'm going to use an example of this video:
https://youtu.be/7oZ_CctyS5Q?list=PLf0swTFhTI8rJvGpOp-LujOcpk-Rlz-yE&t=1515
query="select * from orders join order_items on orders.order_id = order_items.order_item_order_id where $CONDITIONS"
As we can see, there is a parameter "$CONDITIONS" added in the where condition.
Probably I'm wrong, but after watch the video, I understood the condition will be replaced as "1=1" if the parameter "$CONDITIONS" exists or by "1=0" if don't.
My questions are:
1- Why it is recommend to use the parameter in the Where clause of the "query" subcommand
2- What happen if we really have a WHERE clause in a parameter which we want to use in the "query" subcommand and not in the "where" subcommand?
Appreciate your help on this,
David.
import parameters where sqoop
add a comment |
I would like to ask if someone can explain me a Sqoop "query" subcommand functionality in details.
I'm going to use an example of this video:
https://youtu.be/7oZ_CctyS5Q?list=PLf0swTFhTI8rJvGpOp-LujOcpk-Rlz-yE&t=1515
query="select * from orders join order_items on orders.order_id = order_items.order_item_order_id where $CONDITIONS"
As we can see, there is a parameter "$CONDITIONS" added in the where condition.
Probably I'm wrong, but after watch the video, I understood the condition will be replaced as "1=1" if the parameter "$CONDITIONS" exists or by "1=0" if don't.
My questions are:
1- Why it is recommend to use the parameter in the Where clause of the "query" subcommand
2- What happen if we really have a WHERE clause in a parameter which we want to use in the "query" subcommand and not in the "where" subcommand?
Appreciate your help on this,
David.
import parameters where sqoop
I would like to ask if someone can explain me a Sqoop "query" subcommand functionality in details.
I'm going to use an example of this video:
https://youtu.be/7oZ_CctyS5Q?list=PLf0swTFhTI8rJvGpOp-LujOcpk-Rlz-yE&t=1515
query="select * from orders join order_items on orders.order_id = order_items.order_item_order_id where $CONDITIONS"
As we can see, there is a parameter "$CONDITIONS" added in the where condition.
Probably I'm wrong, but after watch the video, I understood the condition will be replaced as "1=1" if the parameter "$CONDITIONS" exists or by "1=0" if don't.
My questions are:
1- Why it is recommend to use the parameter in the Where clause of the "query" subcommand
2- What happen if we really have a WHERE clause in a parameter which we want to use in the "query" subcommand and not in the "where" subcommand?
Appreciate your help on this,
David.
import parameters where sqoop
import parameters where sqoop
edited Jan 4 at 13:10
shriyog
595818
595818
asked Jan 4 at 12:05
DavidDavid
31
31
add a comment |
add a comment |
1 Answer
1
active
oldest
votes
I have not exactly understood your queries, however let me try to answer. Let me know if you are looking for something else.
1- Why it is recommend to use the parameter in the Where clause of the "query" subcommand
SQOOP imports data in parallel from RDBMS. You can specify no of map tasks (parallel process) to use to perform the import by using the -m
or --num-mappers
argument. By default 4 map tasks will be used by SQOOP.
For example, if you had a table with a primary key column of order_id
whose minimum value was 0 and maximum value was 1000, and SQOOP was directed to use 4 tasks, SQOOP would run four processes which each execute SQL statements of the form
SELECT * FROM orders WHERE $CONDITIONS
Now $CONDITIONS place holder would be replaced as below
for map task 1:
order_id >= 0 and order_id < 250
and final SQL will look like
SELECT * FROM orders WHERE order_id >= 0 and order_id < 250
for map task 2:
order_id >= 250 and order_id < 500
and final SQL will look like
SELECT * FROM orders WHERE order_id >= 250 and order_id < 500
for map task 3:
order_id >= 500 and order_id < 750
and final SQL will look like
SELECT * FROM orders WHERE order_id >= 500 and order_id < 750
for map task 4:
order_id >= 750 and order_id < 1001
and final SQL will look like
SELECT * FROM orders WHERE order_id >= 750 and order_id < 1001
As you can see $CONDITIONS is a place holder which SQOOP will search and replace with generated values. Please note that specifying $CONDITIONS is mandatory in case no of map tasks is greater than 1.
2- What happen if we really have a WHERE clause in a parameter which we want to use in the "query" subcommand and not in the "where" subcommand?
You can simply specify your filter criterion after $CONDITIONS prefixed by AND
.
For example, if you want to load only those orders where order amount is greater than 100.
query="select * from orders join order_items on orders.order_id = order_items.order_item_order_id where $CONDITIONS AND order_amount > 100"
Hope this clarifies !
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f54038665%2fsqoop-import-query-subcommand-parameters-in-where-clause%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
I have not exactly understood your queries, however let me try to answer. Let me know if you are looking for something else.
1- Why it is recommend to use the parameter in the Where clause of the "query" subcommand
SQOOP imports data in parallel from RDBMS. You can specify no of map tasks (parallel process) to use to perform the import by using the -m
or --num-mappers
argument. By default 4 map tasks will be used by SQOOP.
For example, if you had a table with a primary key column of order_id
whose minimum value was 0 and maximum value was 1000, and SQOOP was directed to use 4 tasks, SQOOP would run four processes which each execute SQL statements of the form
SELECT * FROM orders WHERE $CONDITIONS
Now $CONDITIONS place holder would be replaced as below
for map task 1:
order_id >= 0 and order_id < 250
and final SQL will look like
SELECT * FROM orders WHERE order_id >= 0 and order_id < 250
for map task 2:
order_id >= 250 and order_id < 500
and final SQL will look like
SELECT * FROM orders WHERE order_id >= 250 and order_id < 500
for map task 3:
order_id >= 500 and order_id < 750
and final SQL will look like
SELECT * FROM orders WHERE order_id >= 500 and order_id < 750
for map task 4:
order_id >= 750 and order_id < 1001
and final SQL will look like
SELECT * FROM orders WHERE order_id >= 750 and order_id < 1001
As you can see $CONDITIONS is a place holder which SQOOP will search and replace with generated values. Please note that specifying $CONDITIONS is mandatory in case no of map tasks is greater than 1.
2- What happen if we really have a WHERE clause in a parameter which we want to use in the "query" subcommand and not in the "where" subcommand?
You can simply specify your filter criterion after $CONDITIONS prefixed by AND
.
For example, if you want to load only those orders where order amount is greater than 100.
query="select * from orders join order_items on orders.order_id = order_items.order_item_order_id where $CONDITIONS AND order_amount > 100"
Hope this clarifies !
add a comment |
I have not exactly understood your queries, however let me try to answer. Let me know if you are looking for something else.
1- Why it is recommend to use the parameter in the Where clause of the "query" subcommand
SQOOP imports data in parallel from RDBMS. You can specify no of map tasks (parallel process) to use to perform the import by using the -m
or --num-mappers
argument. By default 4 map tasks will be used by SQOOP.
For example, if you had a table with a primary key column of order_id
whose minimum value was 0 and maximum value was 1000, and SQOOP was directed to use 4 tasks, SQOOP would run four processes which each execute SQL statements of the form
SELECT * FROM orders WHERE $CONDITIONS
Now $CONDITIONS place holder would be replaced as below
for map task 1:
order_id >= 0 and order_id < 250
and final SQL will look like
SELECT * FROM orders WHERE order_id >= 0 and order_id < 250
for map task 2:
order_id >= 250 and order_id < 500
and final SQL will look like
SELECT * FROM orders WHERE order_id >= 250 and order_id < 500
for map task 3:
order_id >= 500 and order_id < 750
and final SQL will look like
SELECT * FROM orders WHERE order_id >= 500 and order_id < 750
for map task 4:
order_id >= 750 and order_id < 1001
and final SQL will look like
SELECT * FROM orders WHERE order_id >= 750 and order_id < 1001
As you can see $CONDITIONS is a place holder which SQOOP will search and replace with generated values. Please note that specifying $CONDITIONS is mandatory in case no of map tasks is greater than 1.
2- What happen if we really have a WHERE clause in a parameter which we want to use in the "query" subcommand and not in the "where" subcommand?
You can simply specify your filter criterion after $CONDITIONS prefixed by AND
.
For example, if you want to load only those orders where order amount is greater than 100.
query="select * from orders join order_items on orders.order_id = order_items.order_item_order_id where $CONDITIONS AND order_amount > 100"
Hope this clarifies !
add a comment |
I have not exactly understood your queries, however let me try to answer. Let me know if you are looking for something else.
1- Why it is recommend to use the parameter in the Where clause of the "query" subcommand
SQOOP imports data in parallel from RDBMS. You can specify no of map tasks (parallel process) to use to perform the import by using the -m
or --num-mappers
argument. By default 4 map tasks will be used by SQOOP.
For example, if you had a table with a primary key column of order_id
whose minimum value was 0 and maximum value was 1000, and SQOOP was directed to use 4 tasks, SQOOP would run four processes which each execute SQL statements of the form
SELECT * FROM orders WHERE $CONDITIONS
Now $CONDITIONS place holder would be replaced as below
for map task 1:
order_id >= 0 and order_id < 250
and final SQL will look like
SELECT * FROM orders WHERE order_id >= 0 and order_id < 250
for map task 2:
order_id >= 250 and order_id < 500
and final SQL will look like
SELECT * FROM orders WHERE order_id >= 250 and order_id < 500
for map task 3:
order_id >= 500 and order_id < 750
and final SQL will look like
SELECT * FROM orders WHERE order_id >= 500 and order_id < 750
for map task 4:
order_id >= 750 and order_id < 1001
and final SQL will look like
SELECT * FROM orders WHERE order_id >= 750 and order_id < 1001
As you can see $CONDITIONS is a place holder which SQOOP will search and replace with generated values. Please note that specifying $CONDITIONS is mandatory in case no of map tasks is greater than 1.
2- What happen if we really have a WHERE clause in a parameter which we want to use in the "query" subcommand and not in the "where" subcommand?
You can simply specify your filter criterion after $CONDITIONS prefixed by AND
.
For example, if you want to load only those orders where order amount is greater than 100.
query="select * from orders join order_items on orders.order_id = order_items.order_item_order_id where $CONDITIONS AND order_amount > 100"
Hope this clarifies !
I have not exactly understood your queries, however let me try to answer. Let me know if you are looking for something else.
1- Why it is recommend to use the parameter in the Where clause of the "query" subcommand
SQOOP imports data in parallel from RDBMS. You can specify no of map tasks (parallel process) to use to perform the import by using the -m
or --num-mappers
argument. By default 4 map tasks will be used by SQOOP.
For example, if you had a table with a primary key column of order_id
whose minimum value was 0 and maximum value was 1000, and SQOOP was directed to use 4 tasks, SQOOP would run four processes which each execute SQL statements of the form
SELECT * FROM orders WHERE $CONDITIONS
Now $CONDITIONS place holder would be replaced as below
for map task 1:
order_id >= 0 and order_id < 250
and final SQL will look like
SELECT * FROM orders WHERE order_id >= 0 and order_id < 250
for map task 2:
order_id >= 250 and order_id < 500
and final SQL will look like
SELECT * FROM orders WHERE order_id >= 250 and order_id < 500
for map task 3:
order_id >= 500 and order_id < 750
and final SQL will look like
SELECT * FROM orders WHERE order_id >= 500 and order_id < 750
for map task 4:
order_id >= 750 and order_id < 1001
and final SQL will look like
SELECT * FROM orders WHERE order_id >= 750 and order_id < 1001
As you can see $CONDITIONS is a place holder which SQOOP will search and replace with generated values. Please note that specifying $CONDITIONS is mandatory in case no of map tasks is greater than 1.
2- What happen if we really have a WHERE clause in a parameter which we want to use in the "query" subcommand and not in the "where" subcommand?
You can simply specify your filter criterion after $CONDITIONS prefixed by AND
.
For example, if you want to load only those orders where order amount is greater than 100.
query="select * from orders join order_items on orders.order_id = order_items.order_item_order_id where $CONDITIONS AND order_amount > 100"
Hope this clarifies !
answered Jan 8 at 11:08
ShivenShiven
1613
1613
add a comment |
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f54038665%2fsqoop-import-query-subcommand-parameters-in-where-clause%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown