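In the previous post I fitted my little Pi with a camera, so it should now be able to see the world clearly. But can it recognize what it sees? It probably has to learn first. Today we will use some fancy artificial intelligence to teach the Pi to recognize the world!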
AI Object Recognition
Installing TensorFlow
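Here we use TensorFlow, Google's open-source deep learning framework, which lets the Raspberry Pi recognize objects through its camera. Let's install TensorFlow now. First download tensorflow-2.4.0-cp37-none-linux_armv7l.whl from https://github.com/lhelontra/tensorflow-on-arm/releases (the cp37 tag means the wheel is built for Python 3.7), then run pip install tensorflow-2.4.0-cp37-none-linux_armv7l.whl on the Raspberry Pi and wait a moment for it to finish.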
[Figure: TensorFlow]
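Before running pip install, it can help to confirm that the wheel's filename tags fit the interpreter on the Pi. The following is a minimal sketch, not pip's actual matching logic: the helper names parse_wheel_tags and wheel_matches are made up for this illustration, and the check only compares the Python tag and the CPU architecture.

```python
import platform
import sys


def parse_wheel_tags(filename):
    """Split a wheel filename into its name, version and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: " + filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel filename: " + filename)
    # Layout: name-version[-build]-python_tag-abi_tag-platform_tag
    return {
        "name": parts[0],
        "version": parts[1],
        "python": parts[-3],
        "abi": parts[-2],
        "platform": parts[-1],
    }


def wheel_matches(filename, py_version, machine):
    """Return True if the wheel targets the given Python (major, minor) and CPU."""
    tags = parse_wheel_tags(filename)
    wanted_python = "cp{}{}".format(py_version[0], py_version[1])
    return tags["python"] == wanted_python and tags["platform"].endswith(machine)


if __name__ == "__main__":
    wheel = "tensorflow-2.4.0-cp37-none-linux_armv7l.whl"
    ok = wheel_matches(wheel, sys.version_info[:2], platform.machine())
    print("wheel fits this interpreter:", ok)
```

If this prints False, pip would most likely refuse the wheel as well, and a release built for the Pi's actual Python version is needed.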
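Installing the Recognition Model
Once TensorFlow is installed, we need to install an object model. The object model here is a set of parameters that Google has already trained on common objects. With such a model the Pi can tell, for example, that this is a banana and that is an apple. The steps are as follows: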
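1. Download the models API provided by TensorFlow and unpack it. In my case the unpacked directory is called models. Download path:
https://github.com/tensorflow/models/tree/master/research/object_detection/models
2. Download a pre-trained model and put it in the object_detection/models directory under the models directory from the previous step. Download path:
https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md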
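After unpacking, a quick check that the files used by the test code are where it expects them can save some debugging. This is only a sketch: the helper missing_files is hypothetical, and the file list is inferred from the imports and the label map path that appear in the test code of this article.

```python
import os

# Files the test script relies on, relative to the unpacked "models" directory.
EXPECTED_FILES = [
    "research/object_detection/utils/label_map_util.py",
    "research/object_detection/utils/visualization_utils.py",
    "research/object_detection/data/mscoco_label_map.pbtxt",
]


def missing_files(models_root):
    """Return the expected files that do not exist under models_root."""
    return [
        rel for rel in EXPECTED_FILES
        if not os.path.isfile(os.path.join(models_root, rel))
    ]


if __name__ == "__main__":
    # /home/pi/models is the location used by the test code in this article.
    for rel in missing_files("/home/pi/models"):
        print("missing:", rel)
```

No output means all three files were found.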
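You also need to install a helper tool, Protobuf. Download it from https://github.com/google/protobuf/releases. Installation is fairly simple: on the Raspberry Pi, just follow the steps in the figure below.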
[Figure: Protobuf installation]
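Protobuf is needed because the object_detection package ships its configuration formats as .proto files, which have to be compiled to Python modules before label_map_util can be imported. The sketch below assumes the usual step from the Object Detection API installation notes (running protoc on object_detection/protos/*.proto from the research directory); it may not match the original screenshot exactly, and the helper names build_protoc_command and compile_protos are made up for this example.

```python
import glob
import os
import subprocess


def build_protoc_command(research_dir):
    """Build the protoc call that compiles object_detection/protos/*.proto."""
    pattern = os.path.join(research_dir, "object_detection", "protos", "*.proto")
    proto_files = sorted(
        os.path.relpath(path, research_dir) for path in glob.glob(pattern)
    )
    if not proto_files:
        raise FileNotFoundError("no .proto files under " + research_dir)
    # Generated *_pb2.py files are written next to their .proto sources.
    return ["protoc"] + proto_files + ["--python_out=."]


def compile_protos(research_dir):
    """Run protoc from inside research_dir; requires protoc on the PATH."""
    subprocess.run(build_protoc_command(research_dir), cwd=research_dir, check=True)
```

With the layout used in this article, the call would be compile_protos("/home/pi/models/research").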
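Test Code
Finally, together with OpenCV, which we installed in the previous post, let's test how well object recognition works. The test code is as follows: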
import os
import sys
import tarfile
import time

import cv2
import numpy as np
import tensorflow as tf

# This is needed since the script is stored in the object_detection folder.
sys.path.append("../..")
from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as vis_util

# Which model to use.
MODEL_NAME = 'ssd_mobilenet_v1_coco_2018_01_28'
#MODEL_NAME = 'faster_rcnn_resnet101_coco_11_06_2017'
#MODEL_NAME = 'ssd_inception_v2_coco_11_06_2017'
MODEL_FILE = MODEL_NAME + '.tar.gz.tar'

# Path to frozen detection graph. This is the actual model that is used for the object detection.
PATH_TO_CKPT = MODEL_NAME + '/frozen_inference_graph.pb'

# List of the strings that is used to add correct label for each box.
PATH_TO_LABELS = os.path.join('/home/pi/models/research/object_detection/data', 'mscoco_label_map.pbtxt')

NUM_CLASSES = 90

# Extract the frozen graph from the downloaded ssd_mobilenet archive.
# time.perf_counter() replaces time.clock(), which was removed in Python 3.8.
start = time.perf_counter()
tar_file = tarfile.open(MODEL_FILE)
for file in tar_file.getmembers():
    file_name = os.path.basename(file.name)
    if 'frozen_inference_graph.pb' in file_name:
        tar_file.extract(file, os.getcwd())
tar_file.close()
end = time.perf_counter()
print('load the model', (end - start))

# Load the frozen graph into memory.
detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.compat.v1.GraphDef()
    with tf.compat.v1.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(od_graph_def, name='')

# Map numeric class ids to readable names.
label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
category_index = label_map_util.create_category_index(categories)

cap = cv2.VideoCapture(0)
with detection_graph.as_default():
    with tf.compat.v1.Session(graph=detection_graph) as sess:
        writer = tf.compat.v1.summary.FileWriter("logs/", sess.graph)
        sess.run(tf.compat.v1.global_variables_initializer())

        # Look up the input and output tensors once, outside the frame loop.
        image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
        # Each box represents a part of the image where a particular object was detected.
        detection_boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
        # Each score represents the level of confidence for one detected object.
        # The score is shown on the result image, together with the class label.
        detection_scores = detection_graph.get_tensor_by_name('detection_scores:0')
        detection_classes = detection_graph.get_tensor_by_name('detection_classes:0')
        num_detections_tensor = detection_graph.get_tensor_by_name('num_detections:0')

        while True:
            start = time.perf_counter()
            ret, frame = cap.read()
            if not ret:
                # The camera returned no frame, so stop instead of crashing below.
                break
            # The array based representation of the image will be used later
            # to prepare the result image with boxes and labels on it.
            image_np = frame
            # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
            image_np_expanded = np.expand_dims(image_np, axis=0)
            # Actual detection.
            (boxes, scores, classes, num_detections) = sess.run(
                [detection_boxes, detection_scores, detection_classes, num_detections_tensor],
                feed_dict={image_tensor: image_np_expanded})
            # Visualization of the results of a detection.
            vis_util.visualize_boxes_and_labels_on_image_array(
                image_np,
                np.squeeze(boxes),
                np.squeeze(classes).astype(np.int32),
                np.squeeze(scores),
                category_index,
                use_normalized_coordinates=True,
                line_thickness=6)
            final_score = np.squeeze(scores)
            final_class = np.squeeze(classes)
            # Print every detection whose confidence is above 0.5.
            for i in range(num_detections[0].astype(np.int32)):
                if final_score[i] > 0.5:
                    index = final_class[i].astype(np.int32)
                    print("object is %s %s" % (index, category_index[index]['name']))
            end = time.perf_counter()
            #print('frame:', 1.0 / (end - start))
            cv2.imshow("capture", image_np)
            print('after cv2 show')
            # Press q in the preview window to quit.
            if cv2.waitKey(1) & 0xFF == ord('q'):
                break

cap.release()
cv2.destroyAllWindows()
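The last part of the loop keeps only detections whose score is above 0.5 and looks up their names in category_index. The same idea can be tried without a camera or TensorFlow. This is a minimal sketch: filter_detections is a hypothetical helper, and the sample scores, class ids and names are made-up values shaped like one frame's squeezed output, not real results.

```python
def filter_detections(scores, classes, category_index, threshold=0.5):
    """Return (class_id, name, score) for detections scoring above threshold."""
    results = []
    for score, class_id in zip(scores, classes):
        if score > threshold:
            class_id = int(class_id)
            name = category_index.get(class_id, {}).get("name", "unknown")
            results.append((class_id, name, float(score)))
    return results


# Hypothetical values; the ids and names here are for illustration only.
sample_index = {62: {"id": 62, "name": "chair"}, 52: {"id": 52, "name": "banana"}}
sample_scores = [0.95, 0.65, 0.30]
sample_classes = [62.0, 62.0, 52.0]

for class_id, name, score in filter_detections(sample_scores, sample_classes, sample_index):
    print("object is %s %s (%.0f%%)" % (class_id, name, score * 100))
```

Raising the threshold trades missed objects for fewer false labels, so 0.5 is a starting point rather than a fixed rule.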
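Note that the Python code above must be run in a graphical desktop session, because cv2.imshow needs a display to open its window. The result is shown in the figure below: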
[Figure: Recognition result]
In the figure above, the two chairs are recognized with confidence scores of 95% and 65%, which is fairly accurate.
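The Pi can finally recognize the world, and it learns fast. With this skill there is a lot it can do later on: build an early-learning toy for the kids that reads out the matching word when you hold up a small object, or a smart little car that roams the rooms looking for things. Haha, there is plenty of room for imagination! If you have any good ideas, let me know and I will build them one by one :)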