How to efficiently return all the column names across 1m records when certain conditions are met

-2

Updated with dummy data and dummy code. Apologies, I assumed my question was simple and that you could advise on the best way without a reproducible example.

dummy<-data.frame(prodA=c(0,0,0,1,1,0,0,1),
                  prodB=c(0,0,1,1,0,1,1,0),
                  prodC=c(1,1,1,0,0,0,0,1))

dummy[,4:6]<-dummy[,1:3]

for (j in (1:nrow(dummy))){
    for (i in 4:6){
        dummy[j,i]<-ifelse(dummy[j,i]==1,colnames(dummy[i]),"")
    }
}
dummy2<-dummy[,4:6]
dummy$NewProds<-apply(dummy2,1,paste,collapse="")
dummy$NewProds<-gsub(".1","//",dummy$NewProds)

My second attempt is as follows:

prods<-dummy[,1:3]
prods[,4:6]<-dummy[,1:3]
for (i in 4:6){
    prods[,i]<-colnames(prods[i-3])
}

prods[,7:9]<-prods[,4:6]
#works, but I will need multiple ifs for this to work, suggesting this
#won't be very efficient
prods[,10]<-ifelse(prods[,1]==1,prods[,4],"")

Original Post Follows:
I am playing with the Santander Product Recommendation dataset from Kaggle. I have identified which products have been purchased from one month to another. This means I have 23 columns of 1's (when a new product is added) and 0's (when not).
I created the following code to return the column name when a product has been purchased. It works great on a sample of 6 lines, but it runs forever when I try this on the 48k customers who changed, let alone the million in the dataset.

Is there another way to do this?

df2[,99:122]<-df2[,72:95]

for (j in (1:nrow(df2))){
    for (i in 99:122){
        df2[j,i]<-ifelse(df2[j,i]==1,colnames(df2[i]),"")
    }
}
df22<-df2[,99:122]
df2$NewProds<-apply(df22,1,paste,collapse="")
df2$NewProds<-gsub("change.1","//",df2$NewProds)

I figured the challenge was that I am looking at every variable, so I started on another approach: take a couple of copies of the data, overwrite one copy with the column names, and then keep the name wherever the original value is 1. However, I couldn't get this to work, and I think I would run into the same issue.

#copy a bunch of 1's and 0's
prods<-df2[,72:95]
#repeat and overwrite with colnames
prods[,25:48]<-df2[,72:95]
for (i in 25:48){
    prods[,i]<-colnames(prods[i-24])
}
prods[,49:72]<-prods[,25:48]
#attempt to only populate colnames if it was originally a 1 - doesn't work
prod[,49]<-ifelse(prod[,1]==1,prod[,25],"")

I haven't provided any data, but I hope you can see what I am trying to do and can advise on efficient ways of doing this.
Thanks in advance,
J

edited Dec 29 '18 at 21:27

asked Dec 29 '18 at 18:00

James Oliver

5816

5

So you actually note that you haven't provided any data, but why would you not just include some and make it a reproducible example? If you're not going to take the time to write a good question, why would we take the time to write a good answer?

– Conor Neilson
Dec 29 '18 at 18:28

Can you post sample data? Please edit the question with the output of dput(df2). Or, if it is too big, with the output of dput(df2[1:20, 72:95]).

– Rui Barradas
Dec 29 '18 at 18:36

1

I don't understand the output you want. The column names of the columns with at least one 1?

– Rui Barradas
Dec 29 '18 at 18:54

I apologise. I thought my question was simple and that this would not need dummy data. I have provided it now and the working example. The point here is that this works, but for the mass of data it takes far too long. I am looking for someone who can give me a more effective way of doing this. Thank you in advance.

– James Oliver
Dec 29 '18 at 21:20

r loops


2 Answers

Using apply as @AndersEllernBilgrau illustrated is one obvious way to do it, but it will be slow for data sets with many rows.

dummy[["NewProds"]] <- do.call(
    paste,
    c(mapply(ifelse,
             dummy,
             names(dummy),
             MoreArgs = list(no = ""),
             SIMPLIFY = FALSE),
      sep = "//"))

is a bit harder to follow, but it will be much faster:

library(microbenchmark)

n <- 10000
dummy <- data.frame(prodA = rep(c(0,0,0,1,1,0,0,1), n),
                    prodB = rep(c(0,0,1,1,0,1,1,0), n),
                    prodC = rep(c(1,1,1,0,0,0,0,1), n))

microbenchmark(
    do.call = do.call(
        paste,
        c(mapply(ifelse,
                 dummy,
                 names(dummy),
                 MoreArgs = list(no = ""),
                 SIMPLIFY = FALSE),
          sep = "//")),
    apply = apply(
        dummy == 1,
        1,
        function(x) paste0(names(which(x)), collapse = "//")
    ))
## Unit: milliseconds
##     expr       min        lq      mean   median       uq      max neval cld
##  do.call  63.92695  65.44777  72.07261  67.8667  73.3850 184.5151   100  a 
##    apply 296.81323 364.31947 404.71894 397.0927 443.7223 683.3892   100   b
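
For anyone who finds the nested do.call/mapply call hard to follow, here is a rough step-by-step sketch on the small data frame from the question. The intermediate names labelled, joined and tidy are made up for illustration, and the final gsub clean-up is an optional extra that is not part of the benchmarked code.

```r
# Small data frame from the question
dummy <- data.frame(prodA = c(0, 0, 0, 1, 1, 0, 0, 1),
                    prodB = c(0, 0, 1, 1, 0, 1, 1, 0),
                    prodC = c(1, 1, 1, 0, 0, 0, 0, 1))

# Step 1: one vectorised ifelse() call per column; each call handles all rows
labelled <- mapply(ifelse,
                   dummy,                    # test: the 0/1 column
                   names(dummy),             # yes: that column's name
                   MoreArgs = list(no = ""), # no: empty string
                   SIMPLIFY = FALSE)
# labelled is a list of three character vectors, one per column

# Step 2: paste those vectors together element by element, i.e. row by row
joined <- do.call(paste, c(labelled, sep = "//"))
joined[1]
# "////prodC"  (the empty fields still leave their separators behind)

# Optional step 3 (an addition, not in the answer above): collapse repeated
# separators, then strip any leading or trailing separator
tidy <- gsub("^(//)+|(//)+$", "", gsub("(//)+", "//", joined))
tidy[c(1, 3, 8)]
# "prodC"  "prodB//prodC"  "prodA//prodC"
```

The speed comes from step 1: ifelse runs once per column rather than once per cell, so the number of R-level function calls depends on the number of columns, not the number of rows.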

answered Dec 30 '18 at 1:21

Ista

7,69712426

Wow! I cannot believe how quick that was. Thank you. I need to get closer to these functions.

– James Oliver
Jan 5 at 13:17


Without data, I have a hard time understanding precisely what you want to do.
A couple of things are (almost) certain however:

You probably do not need for loops.

You should use R's vectorized functions; the dataset is not that big.
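
As a small illustration of the second point, here is a sketch only, assuming the dummy data frame from the question (is_new and labels are names invented here). The nested loop that swaps each 1 for its column name can be written with matrix operations and no explicit loop:

```r
# Small data frame from the question
dummy <- data.frame(prodA = c(0, 0, 0, 1, 1, 0, 0, 1),
                    prodB = c(0, 0, 1, 1, 0, 1, 1, 0),
                    prodC = c(1, 1, 1, 0, 0, 0, 0, 1))

# Logical matrix: TRUE wherever a product was newly added
is_new <- as.matrix(dummy) == 1

# Character matrix of the same shape in which every row holds the column names
labels <- matrix(colnames(is_new),
                 nrow = nrow(is_new), ncol = ncol(is_new), byrow = TRUE)

# Blank out every cell that was not a 1, all in one assignment
labels[!is_new] <- ""

labels[3, ]
# ""  "prodB"  "prodC"
```

This has not been benchmarked against the other approaches on this page; it is only meant to show that the cell-by-cell loop has a whole-matrix equivalent.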

Using some toy data, does the following do what you want?

d <- 23
n <- 46e3

# Simulate some toy data
df <- data.frame(matrix(rbinom(d*n, 1, 0.1), n, d),
                 row.names = paste0("row", 1:n))
head(df)
#      X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23
# row1  0  0  0  0  0  0  0  1  0   0   0   0   0   0   0   1   0   0   0   0   0   0   0
# row2  1  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
# row3  0  0  0  0  0  0  0  1  0   0   0   0   0   0   0   0   0   0   0   1   0   0   0
# row4  0  0  0  1  0  0  0  0  0   0   1   0   0   0   0   0   0   0   0   1   0   0   0
# row5  0  0  0  0  0  0  1  0  0   0   0   0   0   0   1   0   0   0   0   0   0   0   0
# row6  0  0  0  1  0  0  0  0  0   0   0   0   0   0   0   0   0   1   0   0   1   0   0

# For each row, paste together the names of the columns that contain a 1
res <- apply(df == 1, 1, function(x) paste0(names(which(x)), collapse = "-"))
head(res)
#    row1         row2         row3         row4         row5         row6 
#"X8-X16"         "X1"     "X8-X20" "X4-X11-X20"     "X7-X15" "X4-X18-X21"

That is, res is a character vector of length n in which each element holds the names of the columns that contain a 1 in the corresponding row, pasted together with the separator -. This is at least what your code appears to be doing conceptually.
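
To make this concrete, here is the same idea run on the small dummy data frame from the question, with // as the separator. This is a sketch; new_prods is a name chosen for illustration.

```r
# Small data frame from the question
dummy <- data.frame(prodA = c(0, 0, 0, 1, 1, 0, 0, 1),
                    prodB = c(0, 0, 1, 1, 0, 1, 1, 0),
                    prodC = c(1, 1, 1, 0, 0, 0, 0, 1))

# dummy == 1 is a logical matrix; for each row, keep the names of the TRUE
# columns and paste them together
new_prods <- apply(dummy == 1, 1,
                   function(x) paste0(names(which(x)), collapse = "//"))

# apply() names the result by row, so drop the names to see the plain values
unname(new_prods)
# rows 1-4: "prodC"  "prodC"  "prodB//prodC"  "prodA//prodB"
# rows 5-8: "prodA"  "prodB"  "prodB"         "prodA//prodC"

# Store the result alongside the original columns
dummy$NewProds <- unname(new_prods)
```

A row with no 1 entries would get an empty string, because pasting a zero-length character vector with collapse returns "".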

edited Dec 29 '18 at 18:56

answered Dec 29 '18 at 18:40

Anders Ellern Bilgrau

6,4231730

1

The OP wants colnames.

– Rui Barradas
Dec 29 '18 at 18:41

@RuiBarradas Arh, doh

– Anders Ellern Bilgrau
Dec 29 '18 at 18:42

Thank you for trying. I have updated the question with dummy data and my amended code so that it works with the dummy code.

– James Oliver
Dec 29 '18 at 21:24


