NLP using seq2seq on Indian language corpus
























I have a large corpus in an Indian language. The problem with the text is that every word is split into two, three, or more words. Is seq2seq the right way to correct this, or is there a better algorithm for this problem?
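To make the problem statement concrete, here is a minimal sketch of one way the task could be framed as labeled data, assuming some clean text is available to corrupt artificially. It does not settle whether seq2seq is the better choice. The function names (`split_word`, `make_example`, `rejoin`), the label scheme, and the placeholder sentence are all hypothetical, and slicing by Python code points is an assumption that may cut through combining marks in Indic scripts.

```python
import random


def split_word(word, rng, max_parts=3):
    """Break one word into 1..max_parts fragments at random positions.

    Assumption: fragments are cut on Python code points. For Indic scripts
    a real pipeline would probably cut on grapheme clusters instead.
    """
    if len(word) < 2:
        return [word]
    n_parts = rng.randint(1, min(max_parts, len(word)))
    cuts = sorted(rng.sample(range(1, len(word)), n_parts - 1))
    bounds = [0] + cuts + [len(word)]
    return [word[a:b] for a, b in zip(bounds, bounds[1:])]


def make_example(clean_sentence, rng):
    """Return (fragments, labels) simulating the corrupted corpus.

    Label 0 means "merge with the next fragment"; label 1 means
    "a true word boundary follows this fragment".
    """
    fragments, labels = [], []
    for word in clean_sentence.split():
        parts = split_word(word, rng)
        fragments.extend(parts)
        labels.extend([0] * (len(parts) - 1) + [1])
    return fragments, labels


def rejoin(fragments, labels):
    """Rebuild the sentence from fragments and boundary labels."""
    words, current = [], ""
    for fragment, is_word_end in zip(fragments, labels):
        current += fragment
        if is_word_end:
            words.append(current)
            current = ""
    if current:  # trailing fragments without a closing boundary label
        words.append(current)
    return " ".join(words)


if __name__ == "__main__":
    rng = random.Random(0)
    sentence = "placeholder sentence standing in for corpus text"
    fragments, labels = make_example(sentence, rng)
    print(fragments)
    print(labels)
    print(rejoin(fragments, labels))
```

With pairs like these, the correction step can be posed either as predicting one label per fragment or as mapping the fragment sequence to the clean sentence, which is where a seq2seq model would come in.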












  • Hello, welcome to Stack Exchange. Please consider improving the quality of your question. In its current form it is at risk of being closed.
    – Ethan, 7 hours ago























machine-learning tensorflow sequence-to-sequence machine-translation














edited 29 mins ago









Community










asked 9 hours ago









vijay r






















