Filter a sequence of coordinates
























Let's say that I receive lists of coordinates (latitude, longitude), each representing a 2D route. Later I want to use this data for route prediction, but first I have a preliminary problem to solve: I expected each list to be sorted and, although most of the coordinates in each route are indeed in order, some are not. For example:

a) expected:

[image]

b) actual:

[image]

where t is a timestamp.

I am looking for ways to address this problem, for example removing all coordinates like b.t3 or sorting the lists appropriately. Does anyone have a suggestion?



















  • You could split the actual routes you get when they represent several routes?
    – Matthieu Brucher
    Dec 20 '18 at 14:57

  • I don't think I understood your question; each list of coordinates represents only one route. Did I answer your question?
    – João Matos
    Dec 20 '18 at 15:00

  • Actual shows two possible paths to go from t1 to t4: t1->t2->t4 or t1->t2->t3->t4
    – Matthieu Brucher
    Dec 20 '18 at 15:09

  • Sorry for the confusion. There is only one path, t1->t2->t3->t4: these are 4 pairs of coordinates of a participant who traveled in a straight line. Unfortunately, the order and/or timestamps are not always correct.
    – João Matos
    Dec 20 '18 at 15:13
















machine-learning
















asked Dec 20 '18 at 14:54 by João Matos















1 Answer
This raises many questions that you may or may not be able to answer, so this may not be a complete answer, but hopefully it helps.

Regarding the data, what are its origins? Is it all electronically collected, or is it an amalgamation of, say, delivery driver logs that could easily be falsified or recorded incorrectly? Asked another way, how do you know that the order, based on timestamps, is not correct?

On the other hand, if you know the relationship between the points on the route, why bother with the timestamp values? From your comments, it seems that the participant traveled in a straight line. If that is the case, you know the order based on the coordinates. You could order them by their position along that line, for example by their signed distance from the mean of the coordinates measured along the direction of travel.

Since there is some disagreement between the "known" route and the "recorded" route, I think that to resolve the data you need to understand what causes this disagreement. Is the timestamp produced by one device or multiple? If one device, maybe someone is not driving the advertised route. If multiple devices, maybe one is set to the wrong timezone or just has the wrong time.

Again, not really an answer, but hopefully helpful.
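The idea of ordering points by their position along a straight route could be sketched roughly as below. This is only an illustration under stated assumptions, not code from the question or the answer: the function name `order_along_line` is hypothetical, latitude and longitude are treated as planar coordinates (reasonable only for short routes), the route is assumed to be approximately straight, and the direction of travel is guessed from the received order, relying on the question's statement that most points arrive sorted.

```python
import math


def order_along_line(points):
    """Reorder (lat, lon) points by their position along the best-fit line.

    Assumptions (not from the original post): the route is roughly straight,
    and short enough that lat/lon can be treated as planar coordinates.
    """
    n = len(points)
    if n < 2:
        return list(points)

    # Mean of the coordinates, used as the reference point on the line.
    mean_lat = sum(p[0] for p in points) / n
    mean_lon = sum(p[1] for p in points) / n

    # Entries of the (unnormalised) 2x2 covariance matrix.
    s_aa = sum((p[0] - mean_lat) ** 2 for p in points)
    s_oo = sum((p[1] - mean_lon) ** 2 for p in points)
    s_ao = sum((p[0] - mean_lat) * (p[1] - mean_lon) for p in points)

    # Angle of the principal axis of a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2.0 * s_ao, s_aa - s_oo)
    d_lat, d_lon = math.cos(theta), math.sin(theta)

    # Signed distance of each point from the mean, measured along the axis.
    proj = [(p[0] - mean_lat) * d_lat + (p[1] - mean_lon) * d_lon
            for p in points]

    # The axis has no inherent direction: choose the one that agrees
    # with the received order for most points.
    mean_idx = (n - 1) / 2
    agreement = sum((i - mean_idx) * s for i, s in enumerate(proj))
    if agreement < 0:
        proj = [-s for s in proj]

    order = sorted(range(n), key=lambda i: proj[i])
    return [points[i] for i in order]


# The third and fourth points were received in the wrong order.
route = [(0, 0), (1, 1), (3, 3), (2, 2)]
print(order_along_line(route))  # [(0, 0), (1, 1), (2, 2), (3, 3)]
```

Using the signed projection rather than the plain distance from the mean matters here: an unsigned distance would give the same value to points on opposite sides of the mean. If the route is not straight, this sketch does not apply and the cause of the timestamp disagreement still needs to be understood first.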













answered Dec 20 '18 at 22:24 by Skiddles





























