How to run a saved TensorFlow Model? (Video Prediction Model)
I was reading this paper. The code is available on GitHub. In the README.md file, they've mentioned how to train the model:
python prediction_train.py
with optional parameters.
Can anyone please explain how I can use this model to predict a video sequence? I'm new to deep learning and TensorFlow, so I'm not able to understand the code properly. My current task is just to run the code and see the output (i.e. the videos predicted by this model).
All I could understand was that it uses the TensorFlow saver to save checkpoints. I'm guessing these checkpoints are the intermediate trained model after a few epochs (2000 in this case). How do I use these models to predict the next frames of a video?
Any help is greatly appreciated :)
Tags: deep-learning, tensorflow
You can start by modifying prediction_train.py into your own script. You need to load a pretrained model by providing the model path in line 48; then run the gen_images op instead of the train_op or summ_op to get the predicted images.
– user12075, Jan 10 at 18:37
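Concretely, the change described in this comment could look roughly like the fragment below, dropped into the script's session loop in place of the training step. This is only a hedged sketch: names such as FLAGS.pretrained_model, model.gen_images and model.iter_num are assumptions about prediction_train.py and should be checked against the actual file.

# Hypothetical fragment inside prediction_train.py (names are assumptions):
saver.restore(sess, FLAGS.pretrained_model)          # load the saved checkpoint instead of training
gen_images = sess.run(model.gen_images,               # run the prediction op ...
                      feed_dict={model.iter_num: 0})  # ... instead of train_op / summ_op
# gen_images now holds the predicted future frames for the input batch.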
2 Answers
The TensorFlow saver is used to save the weights of a specific model at some given point. When you want to use a trained model, you must first define the model's architecture (it must match the one used when the weights were saved), and then you can use the same saver class to restore the weights:
saver = tf.train.Saver()  # assumes the model's graph has already been built
with tf.Session() as sess:
    # Restore variables from disk.
    saver.restore(sess, "../my_saved_model.ckpt")
Regarding your initial question: if you are just starting with deep learning and TensorFlow, I think this is the wrong place to start. You should first understand how TensorFlow works in general by applying it to easier tasks such as image classification (start with MNIST).
From what I understand, you need to use the construct_model function and pass it your initial image sequence (video) and an action tensor, and it should output the predicted frames:
def construct_model(images,
                    actions=None,
                    states=None,
                    iter_num=-1.0,
                    k=-1,
                    use_state=True,
                    num_masks=10,
                    stp=False,
                    cdna=True,
                    dna=False,
                    context_frames=2):
  """Build convolutional lstm video predictor using STP, CDNA, or DNA.

  Args:
    images: tensor of ground truth image sequences
    actions: tensor of action sequences
    states: tensor of ground truth state sequences
    iter_num: tensor of the current training iteration (for sched. sampling)
    k: constant used for scheduled sampling. -1 to feed in own prediction.
    use_state: True to include state and action in prediction
    num_masks: the number of different pixel motion predictions (and
      the number of masks for each of those predictions)
    stp: True to use Spatial Transformer Predictor (STP)
    cdna: True to use Convolutional Dynamic Neural Advection (CDNA)
    dna: True to use Dynamic Neural Advection (DNA)
    context_frames: number of ground truth frames to pass in before
      feeding in own predictions

  Returns:
    gen_images: predicted future image frames
    gen_states: predicted future states
  """
– Mark.F, answered Jan 10 at 15:58
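Putting the two pieces above together, a minimal inference script might look like the sketch below. It is only a sketch under assumptions: it assumes construct_model can be imported from the repository's prediction_model.py, that the model expects 64x64 frames passed as a list of per-time-step tensors, and that the placeholder shapes and checkpoint path shown here are illustrative placeholders; check prediction_train.py for the exact input format.

import numpy as np
import tensorflow as tf
from prediction_model import construct_model  # module name assumed from the repository

SEQ_LEN, BATCH, H, W, C = 10, 1, 64, 64, 3     # illustrative sizes only
ACTION_DIM, STATE_DIM = 4, 3                   # illustrative sizes only

# One placeholder per time step (the repository may expect a different format).
images = [tf.placeholder(tf.float32, [BATCH, H, W, C]) for _ in range(SEQ_LEN)]
actions = [tf.placeholder(tf.float32, [BATCH, ACTION_DIM]) for _ in range(SEQ_LEN)]
states = [tf.placeholder(tf.float32, [BATCH, STATE_DIM]) for _ in range(SEQ_LEN)]

gen_images, gen_states = construct_model(
    images, actions, states, k=-1, use_state=True,
    num_masks=10, cdna=True, context_frames=2)

saver = tf.train.Saver()
with tf.Session() as sess:
    saver.restore(sess, 'path/to/model.ckpt')  # your trained checkpoint
    feed = {}
    for t in range(SEQ_LEN):
        # Replace the zero arrays with your real video frames, actions and states.
        feed[images[t]] = np.zeros([BATCH, H, W, C], np.float32)
        feed[actions[t]] = np.zeros([BATCH, ACTION_DIM], np.float32)
        feed[states[t]] = np.zeros([BATCH, STATE_DIM], np.float32)
    predicted_frames = sess.run(gen_images, feed_dict=feed)  # list of predicted future frames

Per the docstring, the first context_frames inputs are treated as ground truth and, with k=-1, the model then feeds back its own predictions, so predicted_frames contains the generated continuation of the sequence.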
Thanks! And yes, in parallel I'm also reading about neural networks used to classify MNIST digits at neuralnetworksanddeeplearning.com by Michael Nielsen. But again, that takes some time. My guide has given me this task, so I have to do this now :)
– Nagabhushan S N, Jan 11 at 5:52
Prediction on video is straightforward: use OpenCV to read the video into images (frames) stored in a NumPy array, then load your TensorFlow model and run the prediction on them.
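For the first step mentioned above (reading a video into frames with OpenCV), a minimal self-contained sketch could look like this; the input file name is a placeholder:

import cv2
import numpy as np

cap = cv2.VideoCapture('input_video.mp4')  # placeholder path to your video
frames = []
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break                              # end of the video
    frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))  # OpenCV reads BGR
cap.release()

video = np.stack(frames)                   # shape: (num_frames, height, width, 3)

The full script below instead applies a frozen TensorFlow detection model frame by frame rather than collecting all frames first.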
"""
Sections of this code were taken from:
https://github.com/tensorflow/models/blob/master/research/object_detection/object_detection_tutorial.ipynb
"""
import numpy as np
import os
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import zipfile
from collections import defaultdict
from io import StringIO
from matplotlib import pyplot as plt
from PIL import Image
from utils import label_map_util
from utils import visualization_utils as vis_util
import cv2
# Path to frozen detection graph. This is the actual model that is used
# for the object detection.
PATH_TO_CKPT = '../freezed_pb5_optimized/frozen_inference_graph.pb'
# List of the strings that is used to add correct label for each box.
PATH_TO_LABELS = os.path.join('../../../training', 'object-detection.pbtxt')
NUM_CLASSES = 1
sys.path.append("..")
def detect_in_video():
    # VideoWriter creates a copy of the input video with the detection
    # overlays drawn on it. Keep in mind the frame size has to match the
    # original video.
    out = cv2.VideoWriter('pikachu_detection_1v3.avi', cv2.VideoWriter_fourcc(
        'M', 'J', 'P', 'G'), 10, (1280, 720))

    detection_graph = tf.Graph()
    with detection_graph.as_default():
        od_graph_def = tf.GraphDef()
        with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
            serialized_graph = fid.read()
            od_graph_def.ParseFromString(serialized_graph)
            tf.import_graph_def(od_graph_def, name='')

    label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
    categories = label_map_util.convert_label_map_to_categories(
        label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
    category_index = label_map_util.create_category_index(categories)

    with detection_graph.as_default():
        with tf.Session(graph=detection_graph) as sess:
            # Define the input and output tensors for detection_graph.
            image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
            # Each box represents a part of the image where a particular
            # object was detected.
            detection_boxes = detection_graph.get_tensor_by_name(
                'detection_boxes:0')
            # Each score represents the confidence for the corresponding
            # detection; it is shown on the result image together with the
            # class label.
            detection_scores = detection_graph.get_tensor_by_name(
                'detection_scores:0')
            detection_classes = detection_graph.get_tensor_by_name(
                'detection_classes:0')
            num_detections = detection_graph.get_tensor_by_name(
                'num_detections:0')

            cap = cv2.VideoCapture('PikachuKetchup.mp4')

            while cap.isOpened():
                # Read the next frame; stop when the video ends.
                ret, frame = cap.read()
                if not ret:
                    break

                # Recolor the frame. By default, OpenCV uses BGR color space.
                # This short blog post explains this better:
                # https://www.learnopencv.com/why-does-opencv-use-bgr-color-format/
                color_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
                image_np_expanded = np.expand_dims(color_frame, axis=0)

                # Actual detection.
                (boxes, scores, classes, num) = sess.run(
                    [detection_boxes, detection_scores,
                     detection_classes, num_detections],
                    feed_dict={image_tensor: image_np_expanded})

                # Visualization of the results of a detection.
                # Note: perform the detections using a higher threshold.
                vis_util.visualize_boxes_and_labels_on_image_array(
                    color_frame,
                    np.squeeze(boxes),
                    np.squeeze(classes).astype(np.int32),
                    np.squeeze(scores),
                    category_index,
                    use_normalized_coordinates=True,
                    line_thickness=8,
                    min_score_thresh=.20)

                cv2.imshow('frame', color_frame)
                output_rgb = cv2.cvtColor(color_frame, cv2.COLOR_RGB2BGR)
                out.write(output_rgb)

                if cv2.waitKey(1) & 0xFF == ord('q'):
                    break

            out.release()
            cap.release()
            cv2.destroyAllWindows()


def main():
    detect_in_video()


if __name__ == '__main__':
    main()
– Tamil Selvan S