Data Pre-processing Libraries
I'm new to deep learning. The only library I know for pre-processing data for a neural network is scikit-learn. Are there any other good libraries for that?
Tags: scikit-learn
asked yesterday
Noob Coder
Hello Noob Coder, it would help to elaborate on the problem you are trying to solve and what sort of pre-processing you are looking for, so that you get better answers.
– Nischal Hp
yesterday
I only need to pre-process data (one-hot encoding, avoiding the dummy variable trap, filling missing values, standard scaling). Currently I use the sklearn library. I wonder whether there are any other libraries or software that people use.
– Noob Coder
23 hours ago
3 Answers
You can get a number of pretrained models through PyTorch's torchvision package:
https://pytorch.org/docs/stable/torchvision/models.html
There is also a great Udacity course on deep learning with PyTorch: https://www.udacity.com/course/deep-learning-pytorch--ud188
answered yesterday
Mario Ruiz
Based on your comment, there are various libraries that support this kind of pre-processing, such as handling missing values. Off the top of my head, pandas is a really good library for the job. It supports handling missing values, encoding, and a lot of other useful features.
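To make that concrete, here is a minimal sketch of the steps listed in the comment (filling missing values, one-hot encoding while avoiding the dummy variable trap, standard scaling) using only pandas. The column names and values are invented for illustration, and filling with the median or the most frequent value is just one assumed strategy among several reasonable ones.

```python
import numpy as np
import pandas as pd

# Toy data, invented purely for illustration.
df = pd.DataFrame({
    "age": [25.0, np.nan, 40.0, 31.0],
    "city": ["Paris", "Delhi", np.nan, "Delhi"],
})

# Fill missing values: median for the numeric column,
# most frequent value (mode) for the categorical column.
df["age"] = df["age"].fillna(df["age"].median())
df["city"] = df["city"].fillna(df["city"].mode()[0])

# One-hot encode the categorical column. drop_first=True drops one level
# per feature, which avoids the dummy variable trap.
encoded = pd.get_dummies(df, columns=["city"], drop_first=True)

# Standard scaling by hand: zero mean and unit variance.
# ddof=0 uses the population standard deviation.
age = encoded["age"]
encoded["age"] = (age - age.mean()) / age.std(ddof=0)

print(encoded)
```

If the same transformation has to be applied later to unseen data, the fill values, mean, and standard deviation computed here would need to be stored and reused rather than recomputed.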
answered 9 hours ago
Nischal Hp
Pre-processing makes up more than 40% of the work in the entire pipeline. With better data you build better machine learning and deep learning models. Unfortunately, cleaning data takes time and experience: you need to visualise it and get your hands dirty. In short, the aim is to remove noise, garbage, and outliers.
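As a rough illustration of flagging outliers in code, the sketch below applies Tukey's 1.5 × IQR rule with pandas. That rule is an assumption chosen for this example (a common rule of thumb, not the only option), and the numbers are invented.

```python
import pandas as pd

# Invented sample; 95 is a deliberately planted outlier.
values = pd.Series([10, 12, 11, 13, 12, 95, 11, 10])

# Interquartile range and the usual 1.5 * IQR fences.
q1, q3 = values.quantile(0.25), values.quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Boolean mask of points that fall outside the fences.
is_outlier = (values < lower) | (values > upper)

print("Flagged as outliers:", values[is_outlier].tolist())
cleaned = values[~is_outlier]
```

Whether a flagged point should actually be dropped depends on the data; a plot of the column is still the best sanity check.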
There are some really good data visualisation libraries in Python, such as:
- Matplotlib
- Seaborn
- ggplot
- Bokeh
- Pygal
- Plotly
- and more.
Since you are comfortable with scikit-learn, as you mentioned in the question, I would suggest looking at its preprocessing-related modules. They provide APIs for feature extraction, normalization, feature scaling, mean removal, variance scaling, standardization, and more.
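For reference, a minimal sketch of how those scikit-learn pieces can be combined for the tasks in the question. The data frame and its column names are made up, and the imputation strategies are assumptions; `sparse_threshold=0` is set only so that the result is a plain dense array.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy data, invented purely for illustration.
X = pd.DataFrame({
    "age": [25.0, np.nan, 40.0, 31.0],
    "income": [30000.0, 52000.0, np.nan, 41000.0],
    "city": ["Paris", "Delhi", np.nan, "Delhi"],
})

# Numeric columns: fill missing values with the median, then standardize.
numeric = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Categorical columns: fill with the most frequent value, then one-hot encode.
# drop="first" removes one level per feature (avoids the dummy variable trap).
categorical = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("onehot", OneHotEncoder(drop="first")),
])

preprocess = ColumnTransformer(
    [
        ("num", numeric, ["age", "income"]),
        ("cat", categorical, ["city"]),
    ],
    sparse_threshold=0,  # always return a dense NumPy array
)

X_ready = preprocess.fit_transform(X)
print(X_ready)
```

The fitted `preprocess` object can then be reused with `preprocess.transform(...)` on validation or test data, so the same medians, means, and categories are applied everywhere.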
But it is always better to understand the data, visualise it like a story, and clean it manually and slowly, instead of passing it through predefined frameworks or pipelines.
These videos can be helpful:
- Why You Need Data-Preprocessing
- Data-preprocessing tasks
Many more are easily available. Good luck!
Also, do upvote my answer if it has helped you. It encourages the community to help each other.
answered 8 hours ago
Amitrajit Bose