Reshaping into long form when there are no column names for certain repetition












1















I have a following sample dataset, and I am trying to reshape to long form using data.table package (no dplyr solution please).



test_data<-structure(list(id = c("a", "b", "c", "d"), demo_d1_alt = c(2, 
3, 4, 5), demo_d1_pre = c(5, 4, 5, 5), demo_d2_alt = c(4, 5,
6, 7), demo_d3_alt = c(8, 9, 10, 11), demo_d3_pre = c(5, 5, 5,
5)), .Names = c("id", "demo_d1_alt", "demo_d1_pre", "demo_d2_alt",
"demo_d3_alt", "demo_d3_pre"), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -4L))

id demo_d1_alt demo_d1_pre demo_d2_alt demo_d3_alt demo_d3_pre
a 2 5 4 8 5
b 3 4 5 9 5
c 4 5 6 10 5
d 5 5 7 11 5


Following is my code and the output:



library(data.table)
data_long <- melt.data.table(
setDT(test_data), id.vars = "id",
measure.vars = patterns(alt = "alt",
prec= "pre"))
data_long
id variable alt prec
1: a 1 2 5
2: b 1 3 4
3: c 1 4 5
4: d 1 5 5
5: a 2 4 5
6: b 2 5 5
7: c 2 6 5
8: d 2 7 5
9: a 3 8 NA
10: b 3 9 NA
11: c 3 10 NA
12: d 3 11 NA


My expected output is (since only for demo_d2 there is no prec)



data_long
id variable alt prec
1: a 1 2 5
2: b 1 3 4
3: c 1 4 5
4: d 1 5 5
5: a 2 4 NA
6: b 2 5 NA
7: c 2 6 NA
8: d 2 7 NA
9: a 3 8 5
10: b 3 9 5
11: c 3 10 5
12: d 3 11 5


Please suggest.










share|improve this question


















  • 1





    See this issue and a related discussion for what led to my draft ReshapeLong_() function.

    – A5C1D2H2I1M1N2O1R2T1
    Mar 3 '18 at 16:45
















1















I have a following sample dataset, and I am trying to reshape to long form using data.table package (no dplyr solution please).



test_data<-structure(list(id = c("a", "b", "c", "d"), demo_d1_alt = c(2, 
3, 4, 5), demo_d1_pre = c(5, 4, 5, 5), demo_d2_alt = c(4, 5,
6, 7), demo_d3_alt = c(8, 9, 10, 11), demo_d3_pre = c(5, 5, 5,
5)), .Names = c("id", "demo_d1_alt", "demo_d1_pre", "demo_d2_alt",
"demo_d3_alt", "demo_d3_pre"), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -4L))

id demo_d1_alt demo_d1_pre demo_d2_alt demo_d3_alt demo_d3_pre
a 2 5 4 8 5
b 3 4 5 9 5
c 4 5 6 10 5
d 5 5 7 11 5


Following is my code and the output:



library(data.table)
data_long <- melt.data.table(
setDT(test_data), id.vars = "id",
measure.vars = patterns(alt = "alt",
prec= "pre"))
data_long
id variable alt prec
1: a 1 2 5
2: b 1 3 4
3: c 1 4 5
4: d 1 5 5
5: a 2 4 5
6: b 2 5 5
7: c 2 6 5
8: d 2 7 5
9: a 3 8 NA
10: b 3 9 NA
11: c 3 10 NA
12: d 3 11 NA


My expected output is (since only for demo_d2 there is no prec)



data_long
id variable alt prec
1: a 1 2 5
2: b 1 3 4
3: c 1 4 5
4: d 1 5 5
5: a 2 4 NA
6: b 2 5 NA
7: c 2 6 NA
8: d 2 7 NA
9: a 3 8 5
10: b 3 9 5
11: c 3 10 5
12: d 3 11 5


Please suggest.










share|improve this question


















  • 1





    See this issue and a related discussion for what led to my draft ReshapeLong_() function.

    – A5C1D2H2I1M1N2O1R2T1
    Mar 3 '18 at 16:45














1












1








1








I have a following sample dataset, and I am trying to reshape to long form using data.table package (no dplyr solution please).



test_data<-structure(list(id = c("a", "b", "c", "d"), demo_d1_alt = c(2, 
3, 4, 5), demo_d1_pre = c(5, 4, 5, 5), demo_d2_alt = c(4, 5,
6, 7), demo_d3_alt = c(8, 9, 10, 11), demo_d3_pre = c(5, 5, 5,
5)), .Names = c("id", "demo_d1_alt", "demo_d1_pre", "demo_d2_alt",
"demo_d3_alt", "demo_d3_pre"), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -4L))

id demo_d1_alt demo_d1_pre demo_d2_alt demo_d3_alt demo_d3_pre
a 2 5 4 8 5
b 3 4 5 9 5
c 4 5 6 10 5
d 5 5 7 11 5


Following is my code and the output:



library(data.table)
data_long <- melt.data.table(
setDT(test_data), id.vars = "id",
measure.vars = patterns(alt = "alt",
prec= "pre"))
data_long
id variable alt prec
1: a 1 2 5
2: b 1 3 4
3: c 1 4 5
4: d 1 5 5
5: a 2 4 5
6: b 2 5 5
7: c 2 6 5
8: d 2 7 5
9: a 3 8 NA
10: b 3 9 NA
11: c 3 10 NA
12: d 3 11 NA


My expected output is (since only for demo_d2 there is no prec)



data_long
id variable alt prec
1: a 1 2 5
2: b 1 3 4
3: c 1 4 5
4: d 1 5 5
5: a 2 4 NA
6: b 2 5 NA
7: c 2 6 NA
8: d 2 7 NA
9: a 3 8 5
10: b 3 9 5
11: c 3 10 5
12: d 3 11 5


Please suggest.










share|improve this question














I have a following sample dataset, and I am trying to reshape to long form using data.table package (no dplyr solution please).



test_data<-structure(list(id = c("a", "b", "c", "d"), demo_d1_alt = c(2, 
3, 4, 5), demo_d1_pre = c(5, 4, 5, 5), demo_d2_alt = c(4, 5,
6, 7), demo_d3_alt = c(8, 9, 10, 11), demo_d3_pre = c(5, 5, 5,
5)), .Names = c("id", "demo_d1_alt", "demo_d1_pre", "demo_d2_alt",
"demo_d3_alt", "demo_d3_pre"), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -4L))

id demo_d1_alt demo_d1_pre demo_d2_alt demo_d3_alt demo_d3_pre
a 2 5 4 8 5
b 3 4 5 9 5
c 4 5 6 10 5
d 5 5 7 11 5


Following is my code and the output:



library(data.table)
data_long <- melt.data.table(
setDT(test_data), id.vars = "id",
measure.vars = patterns(alt = "alt",
prec= "pre"))
data_long
id variable alt prec
1: a 1 2 5
2: b 1 3 4
3: c 1 4 5
4: d 1 5 5
5: a 2 4 5
6: b 2 5 5
7: c 2 6 5
8: d 2 7 5
9: a 3 8 NA
10: b 3 9 NA
11: c 3 10 NA
12: d 3 11 NA


My expected output is (since only for demo_d2 there is no prec)



data_long
id variable alt prec
1: a 1 2 5
2: b 1 3 4
3: c 1 4 5
4: d 1 5 5
5: a 2 4 NA
6: b 2 5 NA
7: c 2 6 NA
8: d 2 7 NA
9: a 3 8 5
10: b 3 9 5
11: c 3 10 5
12: d 3 11 5


Please suggest.







r data.table reshape






share|improve this question













share|improve this question











share|improve this question




share|improve this question










asked Feb 27 '18 at 6:56









MetricsMetrics

11.3k43871




11.3k43871








  • 1





    See this issue and a related discussion for what led to my draft ReshapeLong_() function.

    – A5C1D2H2I1M1N2O1R2T1
    Mar 3 '18 at 16:45














  • 1





    See this issue and a related discussion for what led to my draft ReshapeLong_() function.

    – A5C1D2H2I1M1N2O1R2T1
    Mar 3 '18 at 16:45








1




1





See this issue and a related discussion for what led to my draft ReshapeLong_() function.

– A5C1D2H2I1M1N2O1R2T1
Mar 3 '18 at 16:45





See this issue and a related discussion for what led to my draft ReshapeLong_() function.

– A5C1D2H2I1M1N2O1R2T1
Mar 3 '18 at 16:45












3 Answers
3






active

oldest

votes


















1














One option would be to do a melt/dcast



dcast(melt(setDT(test_data), measure = patterns("d\d+"))[,
c("var1", "var2") := tstrsplit(variable, "_")[-1]],
id + var1 ~ var2, value.var = 'value')[order(var1)]
# id var1 alt pre
# 1: a d1 2 5
# 2: b d1 3 4
# 3: c d1 4 5
# 4: d d1 5 5
# 5: a d2 4 NA
# 6: b d2 5 NA
# 7: c d2 6 NA
# 8: d d2 7 NA
# 9: a d3 8 5
#10: b d3 9 5
#11: c d3 10 5
#12: d d3 11 5





share|improve this answer
























  • Thank you. So, melt alone can't handle it?.

    – Metrics
    Feb 27 '18 at 7:31






  • 1





    @Metrics For your code to work, there should be corresponding pairwise column. Here, we have the d3_pre at the end. So, it thinks that is is the pairwise column for the 'd2_alt'. To avoid that create a column of NAs, i.e. cbind(test_data[1:4], demo_d2_pre = NA_real_, test_data[5:6]) before converting to data.table

    – akrun
    Feb 27 '18 at 7:38






  • 1





    Thanks a lot. That worked since I have only one variable to be created.

    – Metrics
    Feb 27 '18 at 7:50



















1














Currently, most R approaches don't handle these types of unbalanced datasets very neatly, but instead require you to first convert the data into a "long" format, do some transformation, and then go back into a "wide" format. For this, I had written the merged.stack function as part of my "splitstackshape" package, so named because it would first stack related columns of data into a list and then merge them together.



The approach would be something like:



library(splitstackshape)
merged.stack(test_data, var.stubs = c("_alt", "_pre"), sep = "var.stubs", atStart = FALSE)
# id .time_1 _alt _pre
# 1: a demo_d1 2 5
# 2: a demo_d2 4 NA
# 3: a demo_d3 8 5
# 4: b demo_d1 3 4
# 5: b demo_d2 5 NA
# 6: b demo_d3 9 5
# 7: c demo_d1 4 5
# 8: c demo_d2 6 NA
# 9: c demo_d3 10 5
# 10: d demo_d1 5 5
# 11: d demo_d2 7 NA
# 12: d demo_d3 11 5


However, with the improvements that "data.table" has brought to melt, it made sense to revisit the approach.



The working draft can be found here, and you can load the relevant functions with:



source("https://gist.githubusercontent.com/mrdwab/ceded54b616bdc61cb0f/raw/c5a583267f8682a5afea1b551347eafdab618471/MELT.R")


Usage is straightforward: specify the dataset, the stubs, and whether the stubs are at the start or at the end of the variable names.



library(data.table)
ReshapeLong_(test_data, stubs = c(alt = "_alt", pre = "_pre"), end_stub = TRUE)
# id variable alt pre
# 1: a demo_d1 2 5
# 2: b demo_d1 3 4
# 3: c demo_d1 4 5
# 4: d demo_d1 5 5
# 5: a demo_d2 4 NA
# 6: b demo_d2 5 NA
# 7: c demo_d2 6 NA
# 8: d demo_d2 7 NA
# 9: a demo_d3 8 5
# 10: b demo_d3 9 5
# 11: c demo_d3 10 5
# 12: d demo_d3 11 5





share|improve this answer
























  • Thank you for the answer.

    – Metrics
    Mar 5 '18 at 5:55



















0














Simple hack would be



test_data <- as.data.table(test_data)
test_data[,.NA:=NA]
melt(test_data, id='id',
measure = list(c('demo_d1_alt', 'demo_d2_alt', 'demo_d3_alt'),
c('demo_d1_pre', '.NA', 'demo_d3_pre')))


align the variables by hand and insert '.NA' where no observation.



If there are mulitple missing variables,



test_data <- as.data.table(test_data)
test_data[,.NA:=NA]
melt(test_data, id='id',
measure = list(c('demo_d1_alt', 'demo_d2_alt', '.NA'),
c('demo_d1_pre', '.NA', 'demo_d3_pre')))





share|improve this answer























    Your Answer






    StackExchange.ifUsing("editor", function () {
    StackExchange.using("externalEditor", function () {
    StackExchange.using("snippets", function () {
    StackExchange.snippets.init();
    });
    });
    }, "code-snippets");

    StackExchange.ready(function() {
    var channelOptions = {
    tags: "".split(" "),
    id: "1"
    };
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function() {
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled) {
    StackExchange.using("snippets", function() {
    createEditor();
    });
    }
    else {
    createEditor();
    }
    });

    function createEditor() {
    StackExchange.prepareEditor({
    heartbeatType: 'answer',
    autoActivateHeartbeat: false,
    convertImagesToLinks: true,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: 10,
    bindNavPrevention: true,
    postfix: "",
    imageUploader: {
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    },
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    });


    }
    });














    draft saved

    draft discarded


















    StackExchange.ready(
    function () {
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f49002855%2freshaping-into-long-form-when-there-are-no-column-names-for-certain-repetition%23new-answer', 'question_page');
    }
    );

    Post as a guest















    Required, but never shown

























    3 Answers
    3






    active

    oldest

    votes








    3 Answers
    3






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes









    1














    One option would be to do a melt/dcast



    dcast(melt(setDT(test_data), measure = patterns("d\d+"))[,
    c("var1", "var2") := tstrsplit(variable, "_")[-1]],
    id + var1 ~ var2, value.var = 'value')[order(var1)]
    # id var1 alt pre
    # 1: a d1 2 5
    # 2: b d1 3 4
    # 3: c d1 4 5
    # 4: d d1 5 5
    # 5: a d2 4 NA
    # 6: b d2 5 NA
    # 7: c d2 6 NA
    # 8: d d2 7 NA
    # 9: a d3 8 5
    #10: b d3 9 5
    #11: c d3 10 5
    #12: d d3 11 5





    share|improve this answer
























    • Thank you. So, melt alone can't handle it?.

      – Metrics
      Feb 27 '18 at 7:31






    • 1





      @Metrics For your code to work, there should be corresponding pairwise column. Here, we have the d3_pre at the end. So, it thinks that is is the pairwise column for the 'd2_alt'. To avoid that create a column of NAs, i.e. cbind(test_data[1:4], demo_d2_pre = NA_real_, test_data[5:6]) before converting to data.table

      – akrun
      Feb 27 '18 at 7:38






    • 1





      Thanks a lot. That worked since I have only one variable to be created.

      – Metrics
      Feb 27 '18 at 7:50
















    1














    One option would be to do a melt/dcast



    dcast(melt(setDT(test_data), measure = patterns("d\d+"))[,
    c("var1", "var2") := tstrsplit(variable, "_")[-1]],
    id + var1 ~ var2, value.var = 'value')[order(var1)]
    # id var1 alt pre
    # 1: a d1 2 5
    # 2: b d1 3 4
    # 3: c d1 4 5
    # 4: d d1 5 5
    # 5: a d2 4 NA
    # 6: b d2 5 NA
    # 7: c d2 6 NA
    # 8: d d2 7 NA
    # 9: a d3 8 5
    #10: b d3 9 5
    #11: c d3 10 5
    #12: d d3 11 5





    share|improve this answer
























    • Thank you. So, melt alone can't handle it?.

      – Metrics
      Feb 27 '18 at 7:31






    • 1





      @Metrics For your code to work, there should be corresponding pairwise column. Here, we have the d3_pre at the end. So, it thinks that is is the pairwise column for the 'd2_alt'. To avoid that create a column of NAs, i.e. cbind(test_data[1:4], demo_d2_pre = NA_real_, test_data[5:6]) before converting to data.table

      – akrun
      Feb 27 '18 at 7:38






    • 1





      Thanks a lot. That worked since I have only one variable to be created.

      – Metrics
      Feb 27 '18 at 7:50














    1












    1








    1







    One option would be to do a melt/dcast



    dcast(melt(setDT(test_data), measure = patterns("d\d+"))[,
    c("var1", "var2") := tstrsplit(variable, "_")[-1]],
    id + var1 ~ var2, value.var = 'value')[order(var1)]
    # id var1 alt pre
    # 1: a d1 2 5
    # 2: b d1 3 4
    # 3: c d1 4 5
    # 4: d d1 5 5
    # 5: a d2 4 NA
    # 6: b d2 5 NA
    # 7: c d2 6 NA
    # 8: d d2 7 NA
    # 9: a d3 8 5
    #10: b d3 9 5
    #11: c d3 10 5
    #12: d d3 11 5





    share|improve this answer













    One option would be to do a melt/dcast



    dcast(melt(setDT(test_data), measure = patterns("d\d+"))[,
    c("var1", "var2") := tstrsplit(variable, "_")[-1]],
    id + var1 ~ var2, value.var = 'value')[order(var1)]
    # id var1 alt pre
    # 1: a d1 2 5
    # 2: b d1 3 4
    # 3: c d1 4 5
    # 4: d d1 5 5
    # 5: a d2 4 NA
    # 6: b d2 5 NA
    # 7: c d2 6 NA
    # 8: d d2 7 NA
    # 9: a d3 8 5
    #10: b d3 9 5
    #11: c d3 10 5
    #12: d d3 11 5






    share|improve this answer












    share|improve this answer



    share|improve this answer










    answered Feb 27 '18 at 7:11









    akrunakrun

    403k13196269




    403k13196269













    • Thank you. So, melt alone can't handle it?.

      – Metrics
      Feb 27 '18 at 7:31






    • 1





      @Metrics For your code to work, there should be corresponding pairwise column. Here, we have the d3_pre at the end. So, it thinks that is is the pairwise column for the 'd2_alt'. To avoid that create a column of NAs, i.e. cbind(test_data[1:4], demo_d2_pre = NA_real_, test_data[5:6]) before converting to data.table

      – akrun
      Feb 27 '18 at 7:38






    • 1





      Thanks a lot. That worked since I have only one variable to be created.

      – Metrics
      Feb 27 '18 at 7:50



















    • Thank you. So, melt alone can't handle it?.

      – Metrics
      Feb 27 '18 at 7:31






    • 1





      @Metrics For your code to work, there should be corresponding pairwise column. Here, we have the d3_pre at the end. So, it thinks that is is the pairwise column for the 'd2_alt'. To avoid that create a column of NAs, i.e. cbind(test_data[1:4], demo_d2_pre = NA_real_, test_data[5:6]) before converting to data.table

      – akrun
      Feb 27 '18 at 7:38






    • 1





      Thanks a lot. That worked since I have only one variable to be created.

      – Metrics
      Feb 27 '18 at 7:50

















    Thank you. So, melt alone can't handle it?.

    – Metrics
    Feb 27 '18 at 7:31





    Thank you. So, melt alone can't handle it?.

    – Metrics
    Feb 27 '18 at 7:31




    1




    1





    @Metrics For your code to work, there should be corresponding pairwise column. Here, we have the d3_pre at the end. So, it thinks that is is the pairwise column for the 'd2_alt'. To avoid that create a column of NAs, i.e. cbind(test_data[1:4], demo_d2_pre = NA_real_, test_data[5:6]) before converting to data.table

    – akrun
    Feb 27 '18 at 7:38





    @Metrics For your code to work, there should be corresponding pairwise column. Here, we have the d3_pre at the end. So, it thinks that is is the pairwise column for the 'd2_alt'. To avoid that create a column of NAs, i.e. cbind(test_data[1:4], demo_d2_pre = NA_real_, test_data[5:6]) before converting to data.table

    – akrun
    Feb 27 '18 at 7:38




    1




    1





    Thanks a lot. That worked since I have only one variable to be created.

    – Metrics
    Feb 27 '18 at 7:50





    Thanks a lot. That worked since I have only one variable to be created.

    – Metrics
    Feb 27 '18 at 7:50













    1














    Currently, most R approaches don't handle these types of unbalanced datasets very neatly, but instead require you to first convert the data into a "long" format, do some transformation, and then go back into a "wide" format. For this, I had written the merged.stack function as part of my "splitstackshape" package, so named because it would first stack related columns of data into a list and then merge them together.



    The approach would be something like:



    library(splitstackshape)
    merged.stack(test_data, var.stubs = c("_alt", "_pre"), sep = "var.stubs", atStart = FALSE)
    # id .time_1 _alt _pre
    # 1: a demo_d1 2 5
    # 2: a demo_d2 4 NA
    # 3: a demo_d3 8 5
    # 4: b demo_d1 3 4
    # 5: b demo_d2 5 NA
    # 6: b demo_d3 9 5
    # 7: c demo_d1 4 5
    # 8: c demo_d2 6 NA
    # 9: c demo_d3 10 5
    # 10: d demo_d1 5 5
    # 11: d demo_d2 7 NA
    # 12: d demo_d3 11 5


    However, with the improvements that "data.table" has brought to melt, it made sense to revisit the approach.



    The working draft can be found here, and you can load the relevant functions with:



    source("https://gist.githubusercontent.com/mrdwab/ceded54b616bdc61cb0f/raw/c5a583267f8682a5afea1b551347eafdab618471/MELT.R")


    Usage is straightforward: specify the dataset, the stubs, and whether the stubs are at the start or at the end of the variable names.



    library(data.table)
    ReshapeLong_(test_data, stubs = c(alt = "_alt", pre = "_pre"), end_stub = TRUE)
    # id variable alt pre
    # 1: a demo_d1 2 5
    # 2: b demo_d1 3 4
    # 3: c demo_d1 4 5
    # 4: d demo_d1 5 5
    # 5: a demo_d2 4 NA
    # 6: b demo_d2 5 NA
    # 7: c demo_d2 6 NA
    # 8: d demo_d2 7 NA
    # 9: a demo_d3 8 5
    # 10: b demo_d3 9 5
    # 11: c demo_d3 10 5
    # 12: d demo_d3 11 5





    share|improve this answer
























    • Thank you for the answer.

      – Metrics
      Mar 5 '18 at 5:55
















    1














    Currently, most R approaches don't handle these types of unbalanced datasets very neatly, but instead require you to first convert the data into a "long" format, do some transformation, and then go back into a "wide" format. For this, I had written the merged.stack function as part of my "splitstackshape" package, so named because it would first stack related columns of data into a list and then merge them together.



    The approach would be something like:



    library(splitstackshape)
    merged.stack(test_data, var.stubs = c("_alt", "_pre"), sep = "var.stubs", atStart = FALSE)
    # id .time_1 _alt _pre
    # 1: a demo_d1 2 5
    # 2: a demo_d2 4 NA
    # 3: a demo_d3 8 5
    # 4: b demo_d1 3 4
    # 5: b demo_d2 5 NA
    # 6: b demo_d3 9 5
    # 7: c demo_d1 4 5
    # 8: c demo_d2 6 NA
    # 9: c demo_d3 10 5
    # 10: d demo_d1 5 5
    # 11: d demo_d2 7 NA
    # 12: d demo_d3 11 5


    However, with the improvements that "data.table" has brought to melt, it made sense to revisit the approach.



    The working draft can be found here, and you can load the relevant functions with:



    source("https://gist.githubusercontent.com/mrdwab/ceded54b616bdc61cb0f/raw/c5a583267f8682a5afea1b551347eafdab618471/MELT.R")


    Usage is straightforward: specify the dataset, the stubs, and whether the stubs are at the start or at the end of the variable names.



    library(data.table)
    ReshapeLong_(test_data, stubs = c(alt = "_alt", pre = "_pre"), end_stub = TRUE)
    # id variable alt pre
    # 1: a demo_d1 2 5
    # 2: b demo_d1 3 4
    # 3: c demo_d1 4 5
    # 4: d demo_d1 5 5
    # 5: a demo_d2 4 NA
    # 6: b demo_d2 5 NA
    # 7: c demo_d2 6 NA
    # 8: d demo_d2 7 NA
    # 9: a demo_d3 8 5
    # 10: b demo_d3 9 5
    # 11: c demo_d3 10 5
    # 12: d demo_d3 11 5





    share|improve this answer
























    • Thank you for the answer.

      – Metrics
      Mar 5 '18 at 5:55














    1












    1








    1







    Currently, most R approaches don't handle these types of unbalanced datasets very neatly, but instead require you to first convert the data into a "long" format, do some transformation, and then go back into a "wide" format. For this, I had written the merged.stack function as part of my "splitstackshape" package, so named because it would first stack related columns of data into a list and then merge them together.



    The approach would be something like:



    library(splitstackshape)
    merged.stack(test_data, var.stubs = c("_alt", "_pre"), sep = "var.stubs", atStart = FALSE)
    # id .time_1 _alt _pre
    # 1: a demo_d1 2 5
    # 2: a demo_d2 4 NA
    # 3: a demo_d3 8 5
    # 4: b demo_d1 3 4
    # 5: b demo_d2 5 NA
    # 6: b demo_d3 9 5
    # 7: c demo_d1 4 5
    # 8: c demo_d2 6 NA
    # 9: c demo_d3 10 5
    # 10: d demo_d1 5 5
    # 11: d demo_d2 7 NA
    # 12: d demo_d3 11 5


    However, with the improvements that "data.table" has brought to melt, it made sense to revisit the approach.



    The working draft can be found here, and you can load the relevant functions with:



    source("https://gist.githubusercontent.com/mrdwab/ceded54b616bdc61cb0f/raw/c5a583267f8682a5afea1b551347eafdab618471/MELT.R")


    Usage is straightforward: specify the dataset, the stubs, and whether the stubs are at the start or at the end of the variable names.



    library(data.table)
    ReshapeLong_(test_data, stubs = c(alt = "_alt", pre = "_pre"), end_stub = TRUE)
    # id variable alt pre
    # 1: a demo_d1 2 5
    # 2: b demo_d1 3 4
    # 3: c demo_d1 4 5
    # 4: d demo_d1 5 5
    # 5: a demo_d2 4 NA
    # 6: b demo_d2 5 NA
    # 7: c demo_d2 6 NA
    # 8: d demo_d2 7 NA
    # 9: a demo_d3 8 5
    # 10: b demo_d3 9 5
    # 11: c demo_d3 10 5
    # 12: d demo_d3 11 5





    share|improve this answer













    Currently, most R approaches don't handle these types of unbalanced datasets very neatly, but instead require you to first convert the data into a "long" format, do some transformation, and then go back into a "wide" format. For this, I had written the merged.stack function as part of my "splitstackshape" package, so named because it would first stack related columns of data into a list and then merge them together.



    The approach would be something like:



    library(splitstackshape)
    merged.stack(test_data, var.stubs = c("_alt", "_pre"), sep = "var.stubs", atStart = FALSE)
    # id .time_1 _alt _pre
    # 1: a demo_d1 2 5
    # 2: a demo_d2 4 NA
    # 3: a demo_d3 8 5
    # 4: b demo_d1 3 4
    # 5: b demo_d2 5 NA
    # 6: b demo_d3 9 5
    # 7: c demo_d1 4 5
    # 8: c demo_d2 6 NA
    # 9: c demo_d3 10 5
    # 10: d demo_d1 5 5
    # 11: d demo_d2 7 NA
    # 12: d demo_d3 11 5


    However, with the improvements that "data.table" has brought to melt, it made sense to revisit the approach.



    The working draft can be found here, and you can load the relevant functions with:



    source("https://gist.githubusercontent.com/mrdwab/ceded54b616bdc61cb0f/raw/c5a583267f8682a5afea1b551347eafdab618471/MELT.R")


    Usage is straightforward: specify the dataset, the stubs, and whether the stubs are at the start or at the end of the variable names.



    library(data.table)
    ReshapeLong_(test_data, stubs = c(alt = "_alt", pre = "_pre"), end_stub = TRUE)
    # id variable alt pre
    # 1: a demo_d1 2 5
    # 2: b demo_d1 3 4
    # 3: c demo_d1 4 5
    # 4: d demo_d1 5 5
    # 5: a demo_d2 4 NA
    # 6: b demo_d2 5 NA
    # 7: c demo_d2 6 NA
    # 8: d demo_d2 7 NA
    # 9: a demo_d3 8 5
    # 10: b demo_d3 9 5
    # 11: c demo_d3 10 5
    # 12: d demo_d3 11 5






    share|improve this answer












    share|improve this answer



    share|improve this answer










    answered Mar 3 '18 at 16:43









    A5C1D2H2I1M1N2O1R2T1A5C1D2H2I1M1N2O1R2T1

    153k18284378




    153k18284378













    • Thank you for the answer.

      – Metrics
      Mar 5 '18 at 5:55



















    • Thank you for the answer.

      – Metrics
      Mar 5 '18 at 5:55

















    Thank you for the answer.

    – Metrics
    Mar 5 '18 at 5:55





    Thank you for the answer.

    – Metrics
    Mar 5 '18 at 5:55











    0














    Simple hack would be



    test_data <- as.data.table(test_data)
    test_data[,.NA:=NA]
    melt(test_data, id='id',
    measure = list(c('demo_d1_alt', 'demo_d2_alt', 'demo_d3_alt'),
    c('demo_d1_pre', '.NA', 'demo_d3_pre')))


    align the variables by hand and insert '.NA' where no observation.



    If there are mulitple missing variables,



    test_data <- as.data.table(test_data)
    test_data[,.NA:=NA]
    melt(test_data, id='id',
    measure = list(c('demo_d1_alt', 'demo_d2_alt', '.NA'),
    c('demo_d1_pre', '.NA', 'demo_d3_pre')))





    share|improve this answer




























      0














      Simple hack would be



      test_data <- as.data.table(test_data)
      test_data[,.NA:=NA]
      melt(test_data, id='id',
      measure = list(c('demo_d1_alt', 'demo_d2_alt', 'demo_d3_alt'),
      c('demo_d1_pre', '.NA', 'demo_d3_pre')))


      align the variables by hand and insert '.NA' where no observation.



      If there are mulitple missing variables,



      test_data <- as.data.table(test_data)
      test_data[,.NA:=NA]
      melt(test_data, id='id',
      measure = list(c('demo_d1_alt', 'demo_d2_alt', '.NA'),
      c('demo_d1_pre', '.NA', 'demo_d3_pre')))





      share|improve this answer


























        0












        0








        0







        Simple hack would be



        test_data <- as.data.table(test_data)
        test_data[,.NA:=NA]
        melt(test_data, id='id',
        measure = list(c('demo_d1_alt', 'demo_d2_alt', 'demo_d3_alt'),
        c('demo_d1_pre', '.NA', 'demo_d3_pre')))


        align the variables by hand and insert '.NA' where no observation.



        If there are mulitple missing variables,



        test_data <- as.data.table(test_data)
        test_data[,.NA:=NA]
        melt(test_data, id='id',
        measure = list(c('demo_d1_alt', 'demo_d2_alt', '.NA'),
        c('demo_d1_pre', '.NA', 'demo_d3_pre')))





        share|improve this answer













        Simple hack would be



        test_data <- as.data.table(test_data)
        test_data[,.NA:=NA]
        melt(test_data, id='id',
        measure = list(c('demo_d1_alt', 'demo_d2_alt', 'demo_d3_alt'),
        c('demo_d1_pre', '.NA', 'demo_d3_pre')))


        align the variables by hand and insert '.NA' where no observation.



        If there are mulitple missing variables,



        test_data <- as.data.table(test_data)
        test_data[,.NA:=NA]
        melt(test_data, id='id',
        measure = list(c('demo_d1_alt', 'demo_d2_alt', '.NA'),
        c('demo_d1_pre', '.NA', 'demo_d3_pre')))






        share|improve this answer












        share|improve this answer



        share|improve this answer










        answered Dec 30 '18 at 4:15









        KH KimKH Kim

        3681311




        3681311






























            draft saved

            draft discarded




















































            Thanks for contributing an answer to Stack Overflow!


            • Please be sure to answer the question. Provide details and share your research!

            But avoid



            • Asking for help, clarification, or responding to other answers.

            • Making statements based on opinion; back them up with references or personal experience.


            To learn more, see our tips on writing great answers.




            draft saved


            draft discarded














            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f49002855%2freshaping-into-long-form-when-there-are-no-column-names-for-certain-repetition%23new-answer', 'question_page');
            }
            );

            Post as a guest















            Required, but never shown





















































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown

































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown







            Popular posts from this blog

            Mossoró

            Error while reading .h5 file using the rhdf5 package in R

            Pushsharp Apns notification error: 'InvalidToken'