How to run a saved TensorFlow Model? (Video Prediction Model)

I was reading this paper. The code is available on GitHub. In the README.md file, they've mentioned how to train the model:

python prediction_train.py

with optional parameters.

Can anyone please explain how I can use this model to predict a video sequence? I'm new to deep learning and TensorFlow, so I'm not able to understand the code properly. My current task is just to run the code and see the output (i.e., the videos predicted by this model).

All I could understand is that it uses the TensorFlow Saver to save checkpoints. I'm guessing these checkpoints are the intermediate trained models after a few epochs (2000 in this case). How do I use these models to predict the next frames of a video?

Any help is greatly appreciated :)

deep-learning tensorflow

asked Jan 10 at 5:57 by Nagabhushan S N
edited Jan 10 at 18:09 by user12075

You can start by modifying prediction_train.py into your own script. You need to load a pretrained model by providing the model path on line 48, then run the gen_images op instead of the train_op or summ_op to get the predicted images.
– user12075, Jan 10 at 18:37
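
A rough, untested sketch of that change (it assumes the training script builds a `model` object exposing those ops and a tf.train.Saver named `saver`; the checkpoint path is a placeholder):

# Hypothetical modification inside prediction_train.py, after the graph
# and saver have been built: restore a trained checkpoint instead of
# training, then fetch gen_images rather than train_op or summ_op.
with tf.Session() as sess:
    saver.restore(sess, '/path/to/train_dir/model2000')  # placeholder path
    predicted_frames = sess.run(model.gen_images)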

2 Answers

The TensorFlow saver is used to save the weights of a specific model at a given point in training. When you want to use a trained model, you must first define the model's architecture (it should match the one used when the weights were saved); then you can use the same Saver class to restore the weights:



with tf.Session() as sess:
    # Restore variables from disk.
    saver.restore(sess, "../my_saved_model.ckpt")


Regarding your initial question: if you are just starting with deep learning and TensorFlow, this is the wrong place to start. You should first understand how TensorFlow works in general by applying it to easier tasks like image classification (start with MNIST).



From what I understand, you need to use the construct_model function: pass it your initial image sequence (video) and an action tensor, and it should output the predicted frames:



def construct_model(images,
                    actions=None,
                    states=None,
                    iter_num=-1.0,
                    k=-1,
                    use_state=True,
                    num_masks=10,
                    stp=False,
                    cdna=True,
                    dna=False,
                    context_frames=2):
    """Build convolutional lstm video predictor using STP, CDNA, or DNA.

    Args:
      images: tensor of ground truth image sequences
      actions: tensor of action sequences
      states: tensor of ground truth state sequences
      iter_num: tensor of the current training iteration (for sched. sampling)
      k: constant used for scheduled sampling. -1 to feed in own prediction.
      use_state: True to include state and action in prediction
      num_masks: the number of different pixel motion predictions (and
        the number of masks for each of those predictions)
      stp: True to use Spatial Transformer Predictor (STP)
      cdna: True to use Convolutional Dynamic Neural Advection (CDNA)
      dna: True to use Dynamic Neural Advection (DNA)
      context_frames: number of ground truth frames to pass in before
        feeding in own predictions
    Returns:
      gen_images: predicted future image frames
      gen_states: predicted future states
    """
    # (function body omitted in the original answer)
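
To make this concrete, here is a rough, untested sketch of restoring a checkpoint and running gen_images with that function. Everything repo-specific is an assumption: the module name (prediction_model), the placeholder shapes (which follow the 64x64 robot-push setup from the paper), and the checkpoint path all need to be adapted to the actual code and data.

import numpy as np
import tensorflow as tf

from prediction_model import construct_model  # assumed module name from the repo

SEQ_LEN = 10  # assumed sequence length; match your training config

# Assumed shapes: one (batch=1, 64x64 RGB) frame per time step, plus
# per-step action/state vectors as in the robot-push data.
images = [tf.placeholder(tf.float32, [1, 64, 64, 3]) for _ in range(SEQ_LEN)]
actions = [tf.placeholder(tf.float32, [1, 4]) for _ in range(SEQ_LEN)]
states = [tf.placeholder(tf.float32, [1, 3]) for _ in range(SEQ_LEN)]

# Rebuild the same graph that was used for training.
gen_images, gen_states = construct_model(
    images, actions, states, k=-1, use_state=True, context_frames=2)

saver = tf.train.Saver()
with tf.Session() as sess:
    # Restore the weights written by prediction_train.py (placeholder path).
    saver.restore(sess, '/path/to/train_dir/model2000')
    # Feed your real context frames here; zeros are only for illustration.
    feed = {ph: np.zeros([1, 64, 64, 3], np.float32) for ph in images}
    feed.update({ph: np.zeros([1, 4], np.float32) for ph in actions})
    feed.update({ph: np.zeros([1, 3], np.float32) for ph in states})
    predicted_frames = sess.run(gen_images, feed_dict=feed)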

answered Jan 10 at 15:58 by Mark.F

Thanks! And yes, in parallel I'm also reading about neural networks used to classify MNIST digits, from Michael Nielsen's neuralnetworksanddeeplearning.com. But that takes some time, and my guide has given me this task, so I have to do this now :)
– Nagabhushan S N, Jan 11 at 5:52

Prediction on video is straightforward: use OpenCV to read the video into frames and store them in a numpy array, then load your TF model and run the prediction on those frames.


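As a minimal sketch of the first step (the video path is a placeholder), reading a video into a numpy array looks like this:

import cv2
import numpy as np

# Read every frame of a video into a single numpy array.
cap = cv2.VideoCapture('input.mp4')  # placeholder path
frames = []
while True:
    ret, frame = cap.read()
    if not ret:  # end of video (or a read error)
        break
    # OpenCV decodes frames as BGR; convert to RGB for most TF models.
    frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
cap.release()

video = np.stack(frames)  # shape: (num_frames, height, width, 3)

The full script below (adapted from the TensorFlow object detection tutorial, as its header notes) applies the same idea frame by frame: it runs a frozen detection graph on each frame and writes an annotated copy of the video.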

"""
Sections of this code were taken from:
https://github.com/tensorflow/models/blob/master/research/object_detection/object_detection_tutorial.ipynb
"""
import numpy as np

import os
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import zipfile

from collections import defaultdict
from io import StringIO
from matplotlib import pyplot as plt
from PIL import Image

from utils import label_map_util

from utils import visualization_utils as vis_util

import cv2

# Path to frozen detection graph. This is the actual model that is used
# for the object detection.
PATH_TO_CKPT = '../freezed_pb5_optimized/frozen_inference_graph.pb'

# List of the strings that is used to add correct label for each box.
PATH_TO_LABELS = os.path.join('../../../training', 'object-detection.pbtxt')

NUM_CLASSES = 1

sys.path.append("..")


def detect_in_video():

# VideoWriter is the responsible of creating a copy of the video
# used for the detections but with the detections overlays. Keep in
# mind the frame size has to be the same as original video.
out = cv2.VideoWriter('pikachu_detection_1v3.avi', cv2.VideoWriter_fourcc(
'M', 'J', 'P', 'G'), 10, (1280, 720))

detection_graph = tf.Graph()
with detection_graph.as_default():
od_graph_def = tf.GraphDef()
with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
serialized_graph = fid.read()
od_graph_def.ParseFromString(serialized_graph)
tf.import_graph_def(od_graph_def, name='')

label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(
label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
category_index = label_map_util.create_category_index(categories)

with detection_graph.as_default():
with tf.Session(graph=detection_graph) as sess:
# Definite input and output Tensors for detection_graph
image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
# Each box represents a part of the image where a particular object
# was detected.
detection_boxes = detection_graph.get_tensor_by_name(
'detection_boxes:0')
# Each score represent how level of confidence for each of the objects.
# Score is shown on the result image, together with the class
# label.
detection_scores = detection_graph.get_tensor_by_name(
'detection_scores:0')
detection_classes = detection_graph.get_tensor_by_name(
'detection_classes:0')
num_detections = detection_graph.get_tensor_by_name(
'num_detections:0')
cap = cv2.VideoCapture('PikachuKetchup.mp4')

while(cap.isOpened()):
# Read the frame
ret, frame = cap.read()

# Recolor the frame. By default, OpenCV uses BGR color space.
# This short blog post explains this better:
# https://www.learnopencv.com/why-does-opencv-use-bgr-color-format/
color_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

image_np_expanded = np.expand_dims(color_frame, axis=0)

# Actual detection.
(boxes, scores, classes, num) = sess.run(
[detection_boxes, detection_scores,
detection_classes, num_detections],
feed_dict={image_tensor: image_np_expanded})

# Visualization of the results of a detection.
# note: perform the detections using a higher threshold
vis_util.visualize_boxes_and_labels_on_image_array(
color_frame,
np.squeeze(boxes),
np.squeeze(classes).astype(np.int32),
np.squeeze(scores),
category_index,
use_normalized_coordinates=True,
line_thickness=8,
min_score_thresh=.20)

cv2.imshow('frame', color_frame)
output_rgb = cv2.cvtColor(color_frame, cv2.COLOR_RGB2BGR)
out.write(output_rgb)

if cv2.waitKey(1) & 0xFF == ord('q'):
break

out.release()
cap.release()
cv2.destroyAllWindows()


def main():
detect_in_video()


if __name__ == '__main__':
main()

answered 1 hour ago by Tamil Selvan S