How to split data frame into groups, combine rows












$begingroup$


I have a large data set with 405 columns and many rows, covering 15 sites. Each site has 27 columns, one per quadrat, and each row is a species. I want to split the data into the 15 sites and then apply functions such as summing or averaging across all 27 columns to get an idea of species presence at each site. I have tried creating a vector of sites and using it to split the data.



Example:
n <- rep(27, 15) # 27 repeated 15 times (15 sites, 27 quadrats per site in 2018)
names <- c("BB", "BEP", "BKP", "BP", "BY", "DB", "DP", "H1P", "LTP", "NB", "NP", "NRP", "OP", "ZB", "ZP")
sites18 <- as.factor(names[rep(1:15, n)]) # each site name repeated 27 times



When I use the split function, it loses the species data and makes one long column for each site.
split18<-as.data.frame(split(t(p18),sites18))



I need a different approach, perhaps something with an apply function, but I have been unable to find one that works.










$endgroup$
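
One likely reason for the flattening described above: `split()` treats a matrix such as `t(p18)` as a plain vector, so the row and column layout is dropped. A minimal sketch of splitting by columns instead is to call `split.default()` on the data frame itself. The toy `p18` below (10 species, random counts) and the name `site_codes` are assumptions made for illustration, not the original data:

```r
# Toy stand-in for the real data: 10 species (rows) x 405 quadrats (columns).
set.seed(1)
site_codes <- c("BB", "BEP", "BKP", "BP", "BY", "DB", "DP", "H1P",
                "LTP", "NB", "NP", "NRP", "OP", "ZB", "ZP")
sites18 <- factor(rep(site_codes, each = 27), levels = site_codes)
p18 <- as.data.frame(matrix(rpois(10 * 405, lambda = 2), nrow = 10))

# A data frame is a list of columns, so split.default() groups whole columns
# by site and leaves the species rows intact.
by_site <- split.default(p18, sites18)

length(by_site)       # number of sites
dim(by_site[["BB"]])  # species rows x quadrat columns for one site
```

Each element of `by_site` is an ordinary data frame, so functions such as `rowSums()` or `rowMeans()` can be applied to it directly.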
























      r dataframe






      asked Jan 11 at 21:17









      Rebecca Swab

          1 Answer
          $begingroup$

          If I understood you correctly, you want to vertically split your data frame (by columns, one block per site) and horizontally union the resulting data frames (stack them by rows). Here is an example of what I would do in this case:



          # assuming exemplary dataset: 10 species (rows) x 405 quadrats (columns);
          # `names` and `sites18` are the objects defined in the question
          dataset <- data.frame(matrix(rnorm(n = 27 * 15 * 10), nrow = 10))
          colnames(dataset) <- paste(
            as.character(sites18),
            rep(1:27, length.out = 27 * 15),
            sep = "_")
          str(dataset)

          # create list of data frames using a vertical split:
          # one data frame per site, holding that site's 27 columns
          list_df_by_group <- lapply(
            names,
            function(name) dataset[, paste(name, 1:27, sep = "_")])
          names(list_df_by_group) <- names
          str(list_df_by_group)

          # horizontally union data frames:
          # stack the per-site data frames with rbind and label rows by site
          final_dataset <- data.frame(matrix(ncol = 28, nrow = 0))
          colnames(final_dataset) <- c("group", as.character(1:27))
          for (name in names) {
            # give every site the same column names so rbind can align them
            colnames(list_df_by_group[[name]]) <- as.character(1:27)
            final_dataset <- rbind(
              final_dataset,
              cbind(
                data.frame(group = rep(name, nrow(list_df_by_group[[name]]))),
                list_df_by_group[[name]]))
          }
          str(final_dataset)





          $endgroup$













          • $begingroup$
            Hi @Franziska, thanks for taking the time to answer my question! This isn't quite what I'm looking for. As a final product, I am looking for 15 columns, one for each site. For instance, I want to add together all values in each row for each site. I realize I could do this manually with something like "cbind(rowSums(p18[,1:27]), rowSums(p18[,28:54]), ...etc...)". However, I am trying to learn how to automate it. Perhaps I could modify your "horizontally union data frames" section?
            $endgroup$
            – Rebecca Swab
            Jan 14 at 20:46
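
The per-site totals asked for in this comment can be sketched as follows: split the columns by site, then apply `rowSums` (or `rowMeans`) to each piece, which gives one column per site. This is only one possible approach. The toy `p18` and the name `site_codes` are assumptions made for illustration; with real data only the `sapply()` lines would be needed.

```r
# Toy stand-in for the real data: 10 species (rows) x 405 quadrats (columns).
set.seed(1)
site_codes <- c("BB", "BEP", "BKP", "BP", "BY", "DB", "DP", "H1P",
                "LTP", "NB", "NP", "NRP", "OP", "ZB", "ZP")
sites18 <- factor(rep(site_codes, each = 27), levels = site_codes)
p18 <- as.data.frame(matrix(rpois(10 * 405, lambda = 2), nrow = 10))

# Split the columns by site, then sum (or average) the 27 quadrats of each
# site row by row. sapply() binds the results into one column per site.
site_sums  <- sapply(split.default(p18, sites18), rowSums)
site_means <- sapply(split.default(p18, sites18), rowMeans)

dim(site_sums)  # species rows x site columns

# Cross-check against the manual version for the first two sites.
manual <- cbind(BB = rowSums(p18[, 1:27]), BEP = rowSums(p18[, 28:54]))
all.equal(unname(site_sums[, 1:2]), unname(manual))
```

The same `sapply(..., rowSums)` call should also work on the answer's `list_df_by_group`, since that object is likewise a named list of per-site data frames.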













          answered Jan 12 at 19:30









          Franziska W.
