Import a new file type in R



























I would like to write a function that imports files of this type into R. The format of those files looks like this:



!!!
!!!
!!!
!!!
!!!
!!!
!!!
!!!
**kern **dynam **kern **dynam <-------- the number of columns (4) is determined here
*staff2 * *staff1 *staff1/2
*>[A,A,B,B] * *>[A,A,B,B] *>[A,A,B,B]
*>norep[A,B] * *>norep[A,B] *>norep[A,B]
*>A * *>A *>A
*clefF4 * *clefG2 *clefG2
*k[b-] * *k[b-] *k[b-]
*F: * *F: *F:
*M3/4 * *M3/4 *M3/4
*MM108 * *MM108 *MM108
16r . 16f f
=1 =1 =1 =1
!LO:TX:b:i:t=legato ! ! !
12FL . 4cc .
12A . . .
12cJ . . .
. . (32bnqq/ .
12GL . 4cc) .
12B- . . .
12cJ . . .
. . (32bqLLL> .
. . 32ccq .
. . 32ddqJJJ .
12FL . 4cc) .
12A . . .
12cJ . . .
=2 =2 =2 =2
*Xtuplet * *Xtuplet *


So the file could somehow be converted to CSV and then imported into R. The number of columns varies from file to file (normally between 1 and 50).



I tried data_imported <- import("sonata.krn", format = "csv"), but I got the following warnings:



Warning messages:
1: In fread(dec = ".", input = "son.krn", sep = "auto", header = "auto", :
Detected 1 column names but the data has 4 columns (i.e. invalid file). Added 3 extra default column names at the end.
2: In fread(dec = ".", input = "son.krn", sep = "auto", header = "auto", :
Stopped early on line 101. Expected 4 fields but found 5. Consider fill=TRUE and comment.char=. First discarded non-empty line: <<4FF 4F . (<12g 12b-L 2ryy f>>


However, the first part of the file is shown in the console. I want to automate the task, so the header should be removed by a function in R, but that is not possible if I cannot import the file in the first place.



Thank you in advance, any ideas will be rewarded!



































    If you don't need the info in the header (and the same # of header lines appears in each Humdrum file), consider using the skip arg of read.table to skip past the !!! lines.

    – smarchese
    Jan 3 at 0:15
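
The skip suggestion in the comment above can be sketched as follows. This is only a minimal illustration: the sample content is made up, and it assumes that the fields in the real file are TAB-separated and that at least one data line follows the !!! lines. The number of lines to skip is counted rather than hard-coded, so the same code should also cope with files whose header length differs.

```r
# Made-up sample: two "!!!" header lines followed by TAB-separated data.
krn_text <- c(
  "!!!COM: Composer",
  "!!!OTL: Title",
  "**kern\t**dynam",
  "16r\t.",
  "=1\t=1"
)
path <- tempfile(fileext = ".krn")
writeLines(krn_text, path)

# Count the leading "!!!" lines instead of hard-coding the number to skip
# (assumes at least one non-header line exists).
all_lines <- readLines(path)
n_skip <- which(!grepl("^!!!", all_lines))[1] - 1

# quote = "" and comment.char = "" keep characters such as " ' # literal;
# fill = TRUE pads rows that are shorter than the widest of the first rows.
d <- read.table(path, skip = n_skip, sep = "\t", header = FALSE,
                quote = "", comment.char = "", fill = TRUE,
                stringsAsFactors = FALSE)
print(d)
```

With header = FALSE the line starting with **kern is kept as the first data row, so nothing from the file is turned into (mangled) column names.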























r file file-conversion














edited Jan 25 at 17:26







fina

















asked Jan 3 at 0:03









fina





















1 Answer
































The code below imports the header as a named character vector and the data as a data frame.



fn <- "sonata.krn"

# Read all the lines
lines <- readLines(con = fn)

# Count the leading "!!!" header lines
n_header <- 0
while (n_header < length(lines) && grepl("^!!!", lines[n_header + 1])) {
  n_header <- n_header + 1
}

header <- lines[seq_len(n_header)]
# Convert the header into a named vector
names(header) <- gsub("^!!!([[:alpha:][:digit:]]+):.*", "\\1", header)
header <- gsub("^!!![[:alpha:][:digit:]]+: ", "", header)

# Lines containing the data
lines <- lines[seq_along(lines) > n_header]

# Substitute TABs for column delimiters
lines <- gsub(" +", "\t", lines)

# Import the data
d <- read.delim(textConnection(lines), sep = "\t",
                stringsAsFactors = FALSE)


Some additional data cleaning is probably necessary but that should be pretty straightforward.
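
Since the question asks for a reusable function, the steps above could be wrapped up roughly as follows. This is a sketch under assumptions, not part of the original answer: the name read_krn is hypothetical, the fields are assumed to be TAB-separated, and rows with fewer fields than the widest row are padded with NA, which is one possible way to cope with the "Expected 4 fields but found 5" situation from the question.

```r
# Hypothetical helper; the name and the return structure are illustrative only.
read_krn <- function(path) {
  lines <- readLines(path)

  # Leading "!!!" lines form the header; everything after them is data.
  is_header <- grepl("^!!!", lines)
  n_header <- if (all(is_header)) length(lines) else which(!is_header)[1] - 1
  header <- lines[seq_len(n_header)]
  names(header) <- sub("^!!!([[:alnum:]]+):.*", "\\1", header)
  header <- sub("^!!![[:alnum:]]+: *", "", header)

  body <- lines[seq_along(lines) > n_header]

  # Split each record on TABs; rows may have different numbers of fields.
  fields <- strsplit(body, "\t", fixed = TRUE)
  n_col <- max(lengths(fields), 1L)

  # Pad short rows with NA so that every row has n_col fields.
  padded <- lapply(fields, function(x) {
    c(x, rep(NA_character_, n_col - length(x)))
  })
  data <- as.data.frame(do.call(rbind, padded), stringsAsFactors = FALSE)

  list(header = header, data = data)
}

# Usage with a made-up file whose last row has one extra field.
path <- tempfile(fileext = ".krn")
writeLines(c("!!!COM: Composer",
             "**kern\t**dynam",
             "16r\t.",
             "4c\t4e\tf"), path)
res <- read_krn(path)

# Optional: write the data part out as CSV.
write.csv(res$data, sub("\\.krn$", ".csv", path), row.names = FALSE)
```

Splitting the lines manually avoids relying on read.delim to guess the number of columns from the first few rows, at the cost of keeping every column as character.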
















        answered Jan 3 at 10:49









        Pavel Obraztcov
































