R: Read CSV numeric values with a comma as decimal separator, package sparklyr

I need to read a ".csv" file with the "sparklyr" library, where the numeric values use a comma as the decimal separator.

I am using:

library(sparklyr)
library(dplyr)

# Sample data: DNI is an identifier; DD and CC hold numbers written with a decimal comma
df <- data.frame(DNI = c("22-e", "EE-4", "55-W"),
                 DD = c("33,2", "33.2", "14,55"),
                 CC = c("2", "44,4", "44,9"))

write.csv(df, "aff.csv")

sc <- spark_connect(master = "local", spark_home = "/home/tomas/spark-2.1.0-bin-hadoop2.7/", version = "2.1.0")

df <- spark_read_csv(sc, name = "data", path = "/home/tomas/Documentos/Clusterapp/aff.csv", header = TRUE, delimiter = ",")

tbl <- sdf_copy_to(sc = sc, x = C_tbl, overwrite = TRUE)  # note: C_tbl is not defined in this snippet

The problem: the numeric columns are read as factors instead of numbers.







r apache-spark sparklyr






edited 17 hours ago









user6910411

asked Dec 27 at 13:39









Tomas Tapia

  • You might want to include some sample data from the file.
    – Tim Biegeleisen
    Dec 27 at 13:46

  • Please add this data to your question, formatted as readable code.
    – Tim Biegeleisen
    Dec 27 at 13:58

  • The first column is an identifier and the others are numeric values in which the comma is the decimal separator. df <- data.frame(DNI = c("22-e", "EE-4", "55-W"), DD = c("33,2", "33.2", "14,55"), CC = c("2", "44,4", "44,9")); write.csv(df, "aff.csv")
    – Tomas Tapia
    Dec 27 at 14:36

  • I'm looking for something like R's read.csv2() function, but for sparklyr.
    – Tomas Tapia
    Dec 27 at 15:07
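
For reference, here is a minimal base R sketch of the read.csv2() behaviour mentioned in the last comment. It assumes the file follows the "csv2" layout (";" as field separator, "," as decimal mark), which is what write.csv2() produces. The sample file in the question was written with write.csv(), so it uses "," as the field separator and does not follow this layout. The temporary path and the variable names are only for illustration.

```r
# Sample data with real numeric columns.
df <- data.frame(
  DNI = c("22-e", "EE-4", "55-W"),
  DD  = c(33.2, 33.2, 14.55),
  CC  = c(2, 44.4, 44.9)
)

# write.csv2() uses ";" as the separator and "," as the decimal mark.
path <- tempfile(fileext = ".csv")
write.csv2(df, path, row.names = FALSE)

# read.csv2() defaults to sep = ";" and dec = ",", so DD and CC are parsed as numbers.
back <- read.csv2(path, stringsAsFactors = FALSE)
str(back)
```

This runs entirely in local R and does not involve Spark; it only shows the behaviour that the question would like to reproduce on the Spark side.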














2 Answers
To manipulate strings inside a Spark DataFrame, you can use the regexp_replace function, as mentioned here:

https://spark.rstudio.com/guides/textmining/

For your problem it would work out like this:

tbl <- sdf_copy_to(sc = sc, x = df, overwrite = TRUE)

# Replace the decimal comma with a dot on the Spark side, then cast both columns to numeric
tbl0 <- tbl %>%
  mutate(DD = regexp_replace(DD, ",", "."), CC = regexp_replace(CC, ",", ".")) %>%
  mutate_at(vars(c("DD", "CC")), as.numeric)

To check your result:

> glimpse(tbl0)
Observations: ??
Variables: 3
$ DNI <chr> "22-e", "EE-4", "55-W"
$ DD <dbl> 33.20, 33.20, 14.55
$ CC <dbl> 2.0, 44.4, 44.9






        answered 20 hours ago









        Antonis

You could replace the "," in the numbers with "." and convert them to numeric. For instance:

df$DD <- as.numeric(gsub(pattern = ",", replacement = ".", x = df$DD))

Does that help?







            answered Dec 28 at 10:32









            Rage
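
The gsub() line in this answer converts a single column of a local data frame. A small sketch of the same idea applied to several columns at once might look like the following; `num_cols` and `to_number` are names made up for this example, and everything runs in local R rather than in Spark.

```r
# Sample data from the question, kept as character columns.
df <- data.frame(
  DNI = c("22-e", "EE-4", "55-W"),
  DD  = c("33,2", "33.2", "14,55"),
  CC  = c("2", "44,4", "44,9"),
  stringsAsFactors = FALSE
)

# Columns that hold numbers written with a decimal comma (illustrative name).
num_cols <- c("DD", "CC")

# Replace the literal "," with "." and convert to numeric (illustrative helper).
to_number <- function(x) as.numeric(gsub(",", ".", x, fixed = TRUE))

df[num_cols] <- lapply(df[num_cols], to_number)
str(df)
```

Using fixed = TRUE treats "," as a literal character rather than a regular expression. Values that already use a dot, such as "33.2", pass through unchanged.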

• That is a good strategy, but the goal is to read the data with sparklyr directly, without first loading it into R. R has the read.csv2() function, which reads numeric data that use a comma as the decimal separator.
  – Tomas Tapia
  yesterday
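
In the spirit of that comment, the conversion can probably be kept on the Spark side by reading the file with spark_read_csv() and cleaning the columns there, so the data never pass through a local data frame. The following is only a sketch under some assumptions: a local Spark installation that sparklyr can find, a temporary file standing in for the real path, and the column names DD and CC from the question. Setting infer_schema = FALSE should make Spark read every column as a string, so the decimal commas survive until they are replaced.

```r
library(sparklyr)
library(dplyr)

# Assumption: a local Spark installation is available to sparklyr.
sc <- spark_connect(master = "local")

# Recreate the sample file from the question in a temporary location.
path <- tempfile(fileext = ".csv")
write.csv(
  data.frame(DNI = c("22-e", "EE-4", "55-W"),
             DD  = c("33,2", "33.2", "14,55"),
             CC  = c("2", "44,4", "44,9")),
  path, row.names = FALSE
)

# Read directly into Spark; without schema inference the columns stay strings.
raw <- spark_read_csv(sc, name = "data", path = path,
                      header = TRUE, delimiter = ",", infer_schema = FALSE)

# regexp_replace() is passed through to Spark SQL; as.numeric() becomes a cast to double.
clean <- raw %>%
  mutate(DD = as.numeric(regexp_replace(DD, ",", ".")),
         CC = as.numeric(regexp_replace(CC, ",", ".")))

glimpse(clean)
```

The quoted values written by write.csv() keep "33,2" in one field even though "," is also the field delimiter. A file with unquoted decimal commas would need a different delimiter, such as ";".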

















