Spark Parquet file writing will not show any files in target folder

I am facing a weird situation. I am trying to read from Oracle and write Parquet files to an HDFS folder using spark-sql 2.3.1. Below is my code snippet:

df.write.format("parquet")
  .mode("overwrite")
  .partitionBy(partitionColumn)
  .save(parquet_file)

When I run this code locally it works fine, but when I run the same code on an Apache Spark cluster it does not produce any results in the target folder.

I am not sure what is missing, and I don't see any errors in the logs. Interestingly, when I reduce the number of records read from the Oracle table, it produces the folders as expected. How can I solve this problem?

asked Jan 4 at 6:34 by Shyam

edited Jan 4 at 6:43 by Shaido

parquet_file here is the path of the target folder for saving the Parquet files.

– Shyam
Jan 4 at 6:39

Check two things: 1. The count of the DataFrame you are writing: does it have any data? 2. Can you print the path you are writing to (i.e. parquet_file) and the path where you are checking for files? I just want to make sure you have not mixed up a relative path.

– Harjeet Kumar
Jan 4 at 6:47

@HarjeetKumar 1. I have a lot of data in the table; the corresponding DataFrame has around 1,793,723,594 records, i.e. about 1,790 million. 2. When I read fewer records, e.g. 1 million, I can see the files in the target path, but when I read all the records I don't see any files in the target path. So it's not a path issue.

– Shyam
Jan 4 at 7:05
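
The two checks suggested in the comments (row count and resolved target path) could be sketched roughly as below. This is only an illustration, not a fix for the problem: `WriteDiagnostics` and `describeWrite` are hypothetical names, and the code assumes a Scala job with Spark and the Hadoop client on the classpath and an existing `SparkSession`.

```scala
import org.apache.hadoop.fs.Path
import org.apache.spark.sql.{DataFrame, SparkSession}

object WriteDiagnostics {
  // Hypothetical helper: prints the row count and the fully qualified target path.
  def describeWrite(spark: SparkSession, df: DataFrame, target: String): Unit = {
    // count() triggers a full read of the source, so it can be slow on a large table.
    val rows = df.count()
    println(s"Rows to write: $rows")

    // Resolve the path against the Hadoop configuration the job actually uses,
    // so a relative or scheme-less path is shown with its real file system and root.
    val path = new Path(target)
    val fs = path.getFileSystem(spark.sparkContext.hadoopConfiguration)
    println(s"Resolved target path: ${fs.makeQualified(path)}")
  }
}
```

Calling `WriteDiagnostics.describeWrite(spark, df, parquet_file)` just before the write would show whether the cluster run sees the same data and the same target location as the local run.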

apache-spark apache-spark-sql parquet
