Data pre-processing libraries












I'm new to deep learning. When I have to pre-process data for a neural network, scikit-learn is the only library I know. Are there any other good libraries for that?







scikit-learn






asked yesterday









Noob Coder












  • Hello Noob Coder, it would help to elaborate on the problem you are trying to solve and the kind of pre-processing you are looking for, so that you get better answers.
    – Nischal Hp
    yesterday










  • I only need to pre-process data (one-hot encoding, removing the dummy variable trap, filling missing values, standard scaling). Currently I use the sklearn library. I wonder whether there are any other libraries or software that people use.
    – Noob Coder
    23 hours ago




























3 Answers
You can get a number of pretrained models through PyTorch (torchvision):

https://pytorch.org/docs/stable/torchvision/models.html

There's also a great Udacity course on deep learning with PyTorch: https://www.udacity.com/course/deep-learning-pytorch--ud188







answered yesterday

Mario Ruiz




Based on your comment, several libraries support pre-processing tasks such as handling missing values. Off the top of my head, pandas is a really good library for this. It supports handling missing values, encoding, and many other useful features.
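As a rough illustration of the tasks listed in the comment, here is a minimal pandas sketch. The toy DataFrame and its column names (`age`, `city`) are made up for the example, and filling with the median is just one possible choice of strategy.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: one numeric column with a gap, one categorical column.
df = pd.DataFrame({
    "age": [25.0, np.nan, 40.0, 31.0],
    "city": ["Paris", "Rome", "Oslo", "Paris"],
})

# Fill missing numeric values with the column median.
df["age"] = df["age"].fillna(df["age"].median())

# One-hot encode the categorical column. drop_first=True drops one level
# per feature, which avoids the dummy variable trap.
encoded = pd.get_dummies(df, columns=["city"], drop_first=True)

# Standard scaling by hand: zero mean and unit (population) standard deviation.
age = encoded["age"]
encoded["age"] = (age - age.mean()) / age.std(ddof=0)

print(encoded)
```

Note that the scaling statistics here are computed on the whole frame. For a train/test split you would compute them on the training data only and reuse them on the test data.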







answered 9 hours ago

Nischal Hp























Pre-processing accounts for more than 40% of the entire pipeline. With better data you build better machine learning and deep learning models. Unfortunately, cleaning data takes time and experience. You need to visualise the data and get your hands dirty. In short, the aim is to remove noise, garbage, and outliers.

There are some really good data visualisation libraries in Python, such as:

• Matplotlib
• Seaborn
• ggplot
• Bokeh
• Pygal
• Plotly
• and more.


Since you are comfortable with scikit-learn, as you mentioned in the question, I would suggest looking at its preprocessing modules, which provide APIs for feature extraction, normalization, feature scaling, mean removal, variance scaling, standardization, and more.
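For reference, the scikit-learn preprocessing pieces mentioned above can be combined roughly as follows. This is only a sketch under assumptions: the toy columns `age` and `city` are invented, median imputation is an arbitrary choice, and `sparse_threshold=0` is set so the result comes back as a dense NumPy array.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data: one numeric column with a gap, one categorical column.
df = pd.DataFrame({
    "age": [25.0, np.nan, 40.0, 31.0],
    "city": ["Paris", "Rome", "Oslo", "Paris"],
})

# Numeric columns: fill missing values, then standardize.
numeric = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Categorical columns: one-hot encode, dropping the first level of each
# feature to avoid the dummy variable trap.
categorical = OneHotEncoder(drop="first")

preprocess = ColumnTransformer(
    [("num", numeric, ["age"]), ("cat", categorical, ["city"])],
    sparse_threshold=0,  # always return a dense array
)

X = preprocess.fit_transform(df)
print(X.shape)
```

Fitting the transformer on the training data and calling `preprocess.transform` on new data keeps the imputation and scaling statistics consistent between the two.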




But it is always better to understand the data, visualise it like a story, and clean it manually and slowly, instead of passing it through predefined frameworks or pipelines.




These videos can be helpful:

• Why You Need Data-Preprocessing
• Data-preprocessing tasks

Many more are easily available. Good luck!



Also, do upvote my answer if it has helped you. It encourages the community to help each other.







answered 8 hours ago

Amitrajit Bose



