A Detailed Look at Video Action Recognition Models, with Code Practice

This article is shared from the Huawei Cloud community post "視頻動(dòng)作識(shí)別" (Video Action Recognition), by HWCloudAI.

Experiment Objectives

Through this case study, you will learn:

  1. How to train the C3D model and run inference with it, and how to run inference with the I3D model.
Notes
  1. This case is recommended to run with TensorFlow-1.13.1 and requires a GPU; see the ModelArts JupyterLab hardware specification guide (《ModelArts JupyterLab 硬件規(guī)格使用指南》) for how to switch hardware specs;
  2. If this is your first time using JupyterLab, see the ModelArts JupyterLab user guide (《ModelArts JupyterLab使用指導(dǎo)》);
  3. If you run into errors while using JupyterLab, see the ModelArts JupyterLab FAQ (《ModelArts JupyterLab常見問題解決辦法》) to troubleshoot them.
Experiment Steps

Case Introduction

Video action recognition analyzes a short video clip and determines what action the person in it is performing. It is related to, but different from, image recognition: image recognition classifies a single static image, whereas video action recognition must consider not only the static content of each frame but also the spatio-temporal relationships between frames. For example, from a single image of a person holding a half-open door, you cannot tell whether the door is being opened or closed.

Research on video analysis has a shorter history than research on image analysis and is more difficult. The first difficulty is that analyzing video requires substantial computing resources: a video has to be decomposed into images for analysis, so the amount of data a model must handle is enormous. Another important consideration is the temporal order of the action: the images extracted from the video must be linked through their temporal relationships before a judgment can be made, so the model has to account for time, and adding the time dimension greatly increases the number of parameters.

Thanks to the public release of datasets such as PASCAL VOC, ImageNet, and MS COCO, the image domain has produced many classic models. Are there classic models in video analysis as well? There are, and this case study introduces the classic models for video action recognition and puts them into practice with code.

1. Prepare the source code and data

This step prepares the source code and data needed for the case. The resources are stored in OBS; we download them with the ModelArts SDK and extract them into the current directory. After extraction, the current directory contains data, dataset_subset, and other files and directories: the pre-trained parameter files, the dataset, and the code.

import os
import moxing as mox

if not os.path.exists('videos'):
    mox.file.copy("obs://ai-course-common-26-bj4-v2/video/video.tar.gz", "./video.tar.gz")
    # Extract the archive with the tar command
    os.system("tar xf ./video.tar.gz")
    # Delete the archive with the rm command
    os.system("rm ./video.tar.gz")

INFO:root:Using MoXing-v1.17.3
INFO:root:Using OBS-Python-SDK-3.20.7

In the previous lesson we introduced the three commonly used datasets for video action recognition: HMDB51, UCF-101, and Kinetics. This case uses a subset of UCF-101 as the demonstration dataset. Next, let's play a video from UCF-101:

video_name = "./data/v_TaiChi_g01_c01.avi"

from IPython.display import clear_output, Image, display, HTML
import time
import cv2
import base64
import numpy as np

def arrayShow(img):
    _, ret = cv2.imencode('.jpg', img)
    return Image(data=ret)

cap = cv2.VideoCapture(video_name)
while True:
    try:
        clear_output(wait=True)
        ret, frame = cap.read()
        if ret:
            tmp = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            img = arrayShow(frame)
            display(img)
            time.sleep(0.05)
        else:
            break
    except KeyboardInterrupt:
        cap.release()
cap.release()

2. Video action recognition models

In the image domain, ImageNet is a large image recognition dataset. Since 2010, algorithms trained on it have appeared one after another; deep learning models evolved from AlexNet to VGG-16 and then to even more complex architectures, and their performance kept improving. Their error rates when recognizing a thousand image categories are shown below:

A model that performs well at image recognition can be reused for other tasks in the image domain: reusing the parameters of some of its layers improves training. With image models built on ImageNet, many models and tasks gained a much better training foundation, for example object detection, instance segmentation, face detection, and face recognition.

So can image models with such strong training results be used to train video models? The answer is yes: research has shown that in the video domain, reusing an image model's structure, and even its parameters, helps video model training greatly. But how can the structure of an image model be reused? First we need to understand how video classification differs from image classification. If a video is viewed as a collection of images, each frame is one image; besides what appears in each frame, a video classification task must also consider the spatio-temporal relationships between frames before it can classify the action.

To capture the spatio-temporal relationships between frames, the I3D paper reviews three older video classification models and proposes a more effective one, the Two-Stream Inflated 3D ConvNets (I3D for short). The four models are briefly introduced below; see the original paper for more details.

Old model 1: ConvNet + LSTM

This model uses a mature, well-trained image model: a convolutional network extracts features from each frame, followed by pooling and prediction, and an LSTM layer (long short-term memory network) is added at the end of the model, as shown in the figure below. This lets the model take temporal structure into account and link contextual features together when judging the action. The drawbacks are that it can only capture large movements and does poorly on small ones, and because every frame of the video has to pass through the network, training takes a long time.
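
To make the structure concrete, here is a minimal Keras sketch of the ConvNet + LSTM idea (the layer sizes and the 101-class output are illustrative assumptions, not the architecture from the paper): a small 2D CNN is applied to every frame via TimeDistributed, and an LSTM fuses the per-frame features along the time axis.

from keras.layers import Input, TimeDistributed, Conv2D, MaxPool2D, Flatten, LSTM, Dense
from keras.models import Model

frames, height, width, channels = 16, 112, 112, 3
inputs = Input((frames, height, width, channels))
# The same 2D CNN is applied to each of the 16 frames
x = TimeDistributed(Conv2D(32, (3, 3), activation='relu', padding='same'))(inputs)
x = TimeDistributed(MaxPool2D((2, 2)))(x)
x = TimeDistributed(Flatten())(x)               # one feature vector per frame
x = LSTM(256)(x)                                # link the frame features over time
outputs = Dense(101, activation='softmax')(x)   # e.g. the 101 UCF-101 classes
model = Model(inputs, outputs)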

Old model 2: 3D convolutional network

3D convolution is similar to 2D convolution but adds temporal information to the convolution operation. Although this looks like a more natural way to process video, the extra kernel dimension increases the number of parameters and makes the model harder to train. This model does not reuse an image model; the video data is fed directly into a 3D convolutional network for training.

Old model 3: Two-Stream network

The two streams of the Two-Stream network are a single RGB snapshot and a stack of 10 pre-computed optical flow frames. Both streams pass through an image convolutional network pre-trained on ImageNet. The optical flow part is split into vertical and horizontal channels, so its input is twice the size of an ordinary image input. The model performs very well in both training and testing.
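
As a rough illustration, here is a hypothetical two-stream sketch in Keras (tiny networks with made-up sizes, only to show the two inputs and the late fusion; the original work uses full ImageNet-pretrained ConvNets): one stream takes a single RGB frame, the other takes a stack of 10 optical flow frames split into horizontal and vertical channels (20 channels), and the class scores of the two streams are averaged.

from keras.layers import Input, Conv2D, MaxPool2D, Flatten, Dense, Average
from keras.models import Model

def small_cnn(inp):
    # Stand-in for an ImageNet-pretrained image ConvNet
    x = Conv2D(32, (3, 3), activation='relu', padding='same')(inp)
    x = MaxPool2D((2, 2))(x)
    x = Flatten()(x)
    return Dense(101, activation='softmax')(x)

rgb_in = Input((112, 112, 3))    # spatial stream: one RGB snapshot
flow_in = Input((112, 112, 20))  # temporal stream: 10 flow frames x 2 directions
fused = Average()([small_cnn(rgb_in), small_cnn(flow_in)])  # late fusion of class scores
model = Model([rgb_in, flow_in], fused)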

Optical flow video

Having mentioned optical flow, let's introduce it. What is optical flow? The name sounds technical and unfamiliar, but we experience this visual phenomenon every day. When we ride a high-speed train, the scenery outside the window rushes backwards; the faster the train goes, the more it blurs past in a streak. This perceived direction and speed of visual motion is optical flow. Conceptually, optical flow is an observation of object motion: by finding the correlation between adjacent frames it establishes the correspondence between them, computes the motion of objects across adjacent frames, and obtains the instantaneous velocity of pixel motion. A raw video contains moving parts and a static background; usually we only need to judge the state of the moving parts, and optical flow is exactly the computed motion information of those parts.
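
If you want to compute optical flow yourself, here is a small sketch using OpenCV's Farneback dense optical flow (one of several possible algorithms; the video path reuses the Tai Chi clip from earlier, and the parameter values are common defaults rather than tuned settings):

import cv2

cap = cv2.VideoCapture("./data/v_TaiChi_g01_c01.avi")
ret, prev = cap.read()    # first frame
ret, curr = cap.read()    # second frame
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
curr_gray = cv2.cvtColor(curr, cv2.COLOR_BGR2GRAY)

# flow has shape (H, W, 2): per-pixel horizontal and vertical displacement between the two frames
flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                    0.5, 3, 15, 3, 5, 1.2, 0)
magnitude, angle = cv2.cartToPolar(flow[..., 0], flow[..., 1])
print(flow.shape, magnitude.mean())
cap.release()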

Below are an original video and the optical flow video computed from it.

Original video

Optical flow video

New model: Two-Stream Inflated 3D ConvNets

The new model makes the following structural improvements:

  • Inflate 2D convolutions to 3D. The mature image classification model is reused directly, except that the 2D N × N filters and pooling kernels in the network become N × N × N;
  • Initialize the 3D filter parameters from the pre-trained 2D filter parameters. The previous step reuses the structure of the image classification network; this step reuses its pre-trained parameters as well, by copying each 2D filter's parameters N times along the third (time) dimension and then dividing all the values by N (a NumPy sketch of this follows the list);
  • Adjust the shape and size of the receptive field. The new model modifies the structure of the Inception-v1 image classification model: the first two max-pooling layers use 1 × 3 × 3 kernels with stride 1 in time, all other max-pooling layers keep symmetric kernels and strides, and the last average-pooling layer uses a 2 × 7 × 7 kernel.
  • Keep the basic Two-Stream approach. Using a two-stream structure to capture the spatio-temporal relationships between frames is still effective.
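
The second point above (inflating 2D weights into 3D) can be sketched in a few lines of NumPy: the pre-trained 2D filter is copied N times along the new time dimension and divided by N, so that on a video made of identical frames the inflated filter produces the same response as the original 2D filter (the filter shape below is just an example).

import numpy as np

N = 3                                  # temporal extent of the inflated filter
w2d = np.random.randn(7, 7, 3, 64)     # e.g. a 7x7 conv with 3 input / 64 output channels
w3d = np.repeat(w2d[np.newaxis, ...], N, axis=0) / N   # shape (N, 7, 7, 3, 64)

# Sanity check: summing the inflated filter over time recovers the 2D filter
assert np.allclose(w3d.sum(axis=0), w2d)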

Finally, the overall structure of the new model is shown in the figure below:

Good. So far we have covered the classic datasets and classic models for video action recognition. Next we will actually run two of the models in code: the C3D model (3D convolutional network) and the I3D model (Two-Stream Inflated 3D ConvNets).

C3D model structure

As noted under "Old model 2: 3D convolutional network", a 3D ConvNet is a fairly natural way to process video. Although its accuracy is not the best and its computational cost is high, its structure is simple, and a very simple network is enough to perform video action recognition. The figure below illustrates 3D convolution:

In (a), 2D convolution is applied to a single image; in (b), 2D convolution is applied to a video, treating multiple frames as multiple channels; in (c), 3D convolution is applied to a video, adding temporal information to the input signal.

In (a) and (b) the output is a single 2D feature map: whether or not the input contains temporal information, the output is two-dimensional, so 2D convolution loses the temporal information. Only 3D convolution preserves temporal information in its output. 2D and 3D pooling behave the same way.
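
A quick shape check in Keras illustrates this point (the filter counts are arbitrary): folding the frames into the channel axis and applying 2D convolution returns a single 2D feature map, while 3D convolution keeps a separate frame axis in its output.

from keras.layers import Input, Conv2D, Conv3D

# 16 frames folded into the channel dimension (16 x 3 = 48 channels)
stack_2d = Input((112, 112, 48))
# 16 frames kept as an explicit dimension, matching the (width, height, frames, channels) layout used below
video_3d = Input((112, 112, 16, 3))

out_2d = Conv2D(64, (3, 3), padding='same')(stack_2d)     # shape (?, 112, 112, 64): time is gone
out_3d = Conv3D(64, (3, 3, 3), padding='same')(video_3d)  # shape (?, 112, 112, 16, 64): time preserved
print(out_2d.shape, out_3d.shape)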

The figure below shows a variant of the C3D network (for the original description, see Section 2.2 of the I3D paper):

The C3D structure consists of 8 convolutional layers, 5 max-pooling layers, and 2 fully connected layers, followed by a softmax output layer.

All 3D convolution kernels are 3 × 3 × 3 with stride 1. Training uses SGD with an initial learning rate of 0.003, divided by 2 every 150k iterations. Optimization stops after 1.9M iterations, about 13 epochs.

For data processing, a video clip is defined as c × l × h × w, where c is the number of channels, l the number of frames, h the frame height, and w the frame width. The 3D convolution and pooling kernels are d × k × k, where d is the kernel's temporal depth and k its spatial size. The network's input is frames extracted from a video, and its output is a class label. All video frames are resized to 128 × 171, roughly half the original frame size in the UCF-101 dataset. Each video is split into non-overlapping 16-frame clips, which serve as the network input. Finally the frames are cropped, giving an input of 16 × 112 × 112.
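
As a tiny sanity check of these dimensions (assuming the c × l × h × w layout from the notation above), the crop from 128 × 171 frames down to the 16 × 112 × 112 network input looks like this:

import numpy as np

clip = np.zeros((3, 16, 128, 171))   # c x l x h x w: 3 channels, 16 frames, resized to 128 x 171
cropped = clip[:, :, 8:120, 30:142]  # a 112 x 112 spatial crop, as in the test code later
print(cropped.shape)                 # (3, 16, 112, 112)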

3. C3D model training

Next we train the C3D model. Training consists of data preprocessing followed by model training. The dataset used here is UCF-101. Because the C3D model takes individual video frames as input, we need to extract frames from the dataset's videos, that is, convert the videos into images, and then feed the image data into the model for training.

In this case we randomly selected a portion of UCF-101 for the training demonstration; if you are interested, you can download the full UCF-101 dataset and train on it.

UCF-101 download

The dataset is stored in the dataset_subset directory.

The following code uses the cv2 library to convert the video files into image files.

import cv2
import os

# Location of the video dataset
video_path = './dataset_subset/'
# Location where the generated image dataset is stored
save_path = './dataset/'
# Create the path if it does not exist
if not os.path.exists(save_path):
    os.mkdir(save_path)

# Get the list of actions
action_list = os.listdir(video_path)
# Iterate over all actions
for action in action_list:
    if action.startswith(".") == False:
        if not os.path.exists(save_path + action):
            os.mkdir(save_path + action)
        video_list = os.listdir(video_path + action)
        # Iterate over all videos
        for video in video_list:
            prefix = video.split('.')[0]
            if not os.path.exists(os.path.join(save_path, action, prefix)):
                os.mkdir(os.path.join(save_path, action, prefix))
            save_name = os.path.join(save_path, action, prefix) + '/'
            video_name = video_path + action + '/' + video
            # Open the video file; cap yields the video frames
            cap = cv2.VideoCapture(video_name)
            # fps here is the total number of frames in the video
            fps = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
            fps_count = 0
            for i in range(fps):
                ret, frame = cap.read()
                if ret:
                    # Write the frame to an image file
                    cv2.imwrite(save_name + str(10000 + fps_count) + '.jpg', frame)
                    fps_count += 1

At this point the videos have been converted frame by frame into image data and stored, ready for model training.

4. Model training

First, we build the model structure.

We introduced the C3D model structure earlier; here we build it with the Conv3D, MaxPool3D, ZeroPadding3D, and other layers provided by Keras.

from keras.layers import Dense, Dropout, Conv3D, Input, MaxPool3D, Flatten, Activation, ZeroPadding3D
from keras.regularizers import l2
from keras.models import Model, Sequential

# The input is 112 x 112 images, 16 frames, 3 channels
input_shape = (112, 112, 16, 3)
# Weight decay rate
weight_decay = 0.005
# Number of classes; we use UCF-101, so 101
nb_classes = 101

# Build the model structure
inputs = Input(input_shape)
x = Conv3D(64, (3, 3, 3), strides=(1, 1, 1), padding='same',
           activation='relu', kernel_regularizer=l2(weight_decay))(inputs)
x = MaxPool3D((2, 2, 1), strides=(2, 2, 1), padding='same')(x)

x = Conv3D(128, (3, 3, 3), strides=(1, 1, 1), padding='same',
           activation='relu', kernel_regularizer=l2(weight_decay))(x)
x = MaxPool3D((2, 2, 2), strides=(2, 2, 2), padding='same')(x)

x = Conv3D(128, (3, 3, 3), strides=(1, 1, 1), padding='same',
           activation='relu', kernel_regularizer=l2(weight_decay))(x)
x = MaxPool3D((2, 2, 2), strides=(2, 2, 2), padding='same')(x)

x = Conv3D(256, (3, 3, 3), strides=(1, 1, 1), padding='same',
           activation='relu', kernel_regularizer=l2(weight_decay))(x)
x = MaxPool3D((2, 2, 2), strides=(2, 2, 2), padding='same')(x)

x = Conv3D(256, (3, 3, 3), strides=(1, 1, 1), padding='same',
           activation='relu', kernel_regularizer=l2(weight_decay))(x)
x = MaxPool3D((2, 2, 2), strides=(2, 2, 2), padding='same')(x)

x = Flatten()(x)
x = Dense(2048, activation='relu', kernel_regularizer=l2(weight_decay))(x)
x = Dropout(0.5)(x)
x = Dense(2048, activation='relu', kernel_regularizer=l2(weight_decay))(x)
x = Dropout(0.5)(x)
x = Dense(nb_classes, kernel_regularizer=l2(weight_decay))(x)
x = Activation('softmax')(x)

model = Model(inputs, x)

Using TensorFlow backend.
/home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:526: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:527: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:528: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:529: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:530: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:535: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
WARNING:tensorflow:From /home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
WARNING:tensorflow:From /home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py:3445: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.

Keras's summary() method prints the model structure, showing how the layers are assembled and the input and output of each layer.

model.summary()

(The output is long and omitted here.)

Keras's model.input attribute shows the model's input shape; the dimensions are (batch size, width, height, frames, channels).

model.input

<tf.Tensor 'input_1:0' shape=(?, 112, 112, 16, 3) dtype=float32>

The data dimensions differ slightly from those of an image-processing model: there is an extra frames dimension, reflecting the influence of temporal relationships in video analysis.

Next, we convert the image files into the data format needed for training.

# Import the required libraries
from keras.optimizers import SGD, Adam
from keras.utils import np_utils
import numpy as np
import random
import cv2
import matplotlib.pyplot as plt
# Custom callbacks
from schedules import onetenth_4_8_12

INFO:matplotlib.font_manager:font search path ['/home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages/matplotlib/mpl-data/fonts/ttf', '/home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages/matplotlib/mpl-data/fonts/afm', '/home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages/matplotlib/mpl-data/fonts/pdfcorefonts']
INFO:matplotlib.font_manager:generated new fontManager

Parameter definitions

img_path = save_path        # Location of the image files
results_path = './results'  # Location where training results are saved
if not os.path.exists(results_path):
    os.mkdir(results_path)

Split the dataset: randomly take 4/5 of it as the training set and use the rest as the validation set. The file names are stored in train_list and test_list in preparation for training.

cates = os.listdir(img_path)
train_list = []
test_list = []
# Iterate over all action categories
for cate in cates:
    videos = os.listdir(os.path.join(img_path, cate))
    length = len(videos) // 5
    # Training set size: randomly pick video files for the training set
    train = random.sample(videos, length * 4)
    train_list.extend(train)
    # Add the remaining videos to the test set
    for video in videos:
        if video not in train:
            test_list.append(video)

print("Training set:")
print(train_list)
print("%d videos in total\n" % (len(train_list)))
print("Validation set:")
print(test_list)
print("%d videos in total" % (len(test_list)))

(The output is long and omitted here.)

Next, we start training the model.

First, define the data-reading method. process_data reads one batch of data, containing 16-frame image data and the corresponding labels. When reading the images, random cropping and flipping are applied for data augmentation.

def process_data(img_path, file_list, batch_size=16, train=True):
    batch = np.zeros((batch_size, 16, 112, 112, 3), dtype='float32')
    labels = np.zeros(batch_size, dtype='int')
    cate_list = os.listdir(img_path)

    def read_classes():
        path = "./classInd.txt"
        with open(path, "r+") as f:
            lines = f.readlines()
        classes = {}
        for line in lines:
            c_id = line.split()[0]
            c_name = line.split()[1]
            classes[c_name] = c_id
        return classes

    classes_dict = read_classes()
    for file in file_list:
        cate = file.split("_")[1]
        img_list = os.listdir(os.path.join(img_path, cate, file))
        img_list.sort()
        batch_img = []
        for i in range(batch_size):
            path = os.path.join(img_path, cate, file)
            label = int(classes_dict[cate]) - 1
            symbol = len(img_list) // 16
            if train:
                # Random crop
                crop_x = random.randint(0, 15)
                crop_y = random.randint(0, 58)
                # Random flip
                is_flip = random.randint(0, 1)
                # Process 16 frames as one unit
                for j in range(16):
                    img = img_list[symbol + j]
                    image = cv2.imread(path + '/' + img)
                    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
                    image = cv2.resize(image, (171, 128))
                    if is_flip == 1:
                        image = cv2.flip(image, 1)
                    batch[i][j][:][:][:] = image[crop_x:crop_x + 112, crop_y:crop_y + 112, :]
                symbol -= 1
                if symbol < 0:
                    break
                labels[i] = label
            else:
                for j in range(16):
                    img = img_list[symbol + j]
                    image = cv2.imread(path + '/' + img)
                    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
                    image = cv2.resize(image, (171, 128))
                    batch[i][j][:][:][:] = image[8:120, 30:142, :]
                symbol -= 1
                if symbol < 0:
                    break
                labels[i] = label
    return batch, labels

batch, labels = process_data(img_path, train_list)
print("Shape of each batch: %s" % (str(batch.shape)))
print("Shape of the labels: %s" % (str(labels.shape)))

Shape of each batch: (16, 16, 112, 112, 3)
Shape of the labels: (16,)

Define the data generators that feed batches of data into the training function.

def generator_train_batch(train_list, batch_size, num_classes, img_path):
    while True:
        # Read one batch of data
        x_train, x_labels = process_data(img_path, train_list, batch_size=16, train=True)
        x = preprocess(x_train)
        # Convert to the format required by the model input
        y = np_utils.to_categorical(np.array(x_labels), num_classes)
        x = np.transpose(x, (0, 2, 3, 1, 4))
        yield x, y

def generator_val_batch(test_list, batch_size, num_classes, img_path):
    while True:
        # Read one batch of data from the validation list
        y_test, y_labels = process_data(img_path, test_list, batch_size=16, train=False)
        x = preprocess(y_test)
        # Convert to the format required by the model input
        x = np.transpose(x, (0, 2, 3, 1, 4))
        y = np_utils.to_categorical(np.array(y_labels), num_classes)
        yield x, y

Define the preprocess method, which normalizes the input image data.

def preprocess(inputs):
    inputs[..., 0] -= 99.9
    inputs[..., 1] -= 92.1
    inputs[..., 2] -= 82.6
    inputs[..., 0] /= 65.8
    inputs[..., 1] /= 62.3
    inputs[..., 2] /= 60.3
    return inputs

# Training one epoch takes about 4 minutes
# Number of classes
num_classes = 101
# Batch size
batch_size = 4
# Number of epochs
epochs = 1
# Learning rate
lr = 0.005
# Define the optimizer
sgd = SGD(lr=lr, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
# Start training
history = model.fit_generator(generator_train_batch(train_list, batch_size, num_classes, img_path),
                              steps_per_epoch=len(train_list) // batch_size,
                              epochs=epochs,
                              callbacks=[onetenth_4_8_12(lr)],
                              validation_data=generator_val_batch(test_list, batch_size, num_classes, img_path),
                              validation_steps=len(test_list) // batch_size,
                              verbose=1)
# Save the trained weights
model.save_weights(os.path.join(results_path, 'weights_c3d.h5'))

WARNING:tensorflow:From /home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
Epoch 1/1
20/20 [==============================] - 442s 22s/step - loss: 28.7099 - acc: 0.9344 - val_loss: 27.7600 - val_acc: 1.0000

5. Model testing

Next we test the trained model. We randomly choose a video file from UCF-101 as test data, extract its frames, feed every 16 frames into the model for one action prediction, draw the predicted action and its probability onto the frames, and play the video.

First, import the required libraries.

from IPython.display import clear_output, Image, display, HTML
import time
import cv2
import base64
import numpy as np

Build the model structure and load the weights.

from models import c3d_model
model = c3d_model()
# Load the weights we just trained
model.load_weights(os.path.join(results_path, 'weights_c3d.h5'), by_name=True)

Define the function arrayShow, which converts an image array into an encoded format suitable for display.

def arrayShow(img):
    _, ret = cv2.imencode('.jpg', img)
    return Image(data=ret)

Preprocess the video and run prediction, draw the prediction results onto the frames, and finally play the video.

# Load all class names and indices
with open('./ucfTrainTestlist/classInd.txt', 'r') as f:
    class_names = f.readlines()
    f.close()

# Open the video file
video = './videos/v_Punch_g03_c01.avi'
cap = cv2.VideoCapture(video)
clip = []

# Feed the video frames into the model
while True:
    try:
        clear_output(wait=True)
        ret, frame = cap.read()
        if ret:
            tmp = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            clip.append(cv2.resize(tmp, (171, 128)))
            # Run one prediction every 16 frames
            if len(clip) == 16:
                inputs = np.array(clip).astype(np.float32)
                inputs = np.expand_dims(inputs, axis=0)
                inputs[..., 0] -= 99.9
                inputs[..., 1] -= 92.1
                inputs[..., 2] -= 82.6
                inputs[..., 0] /= 65.8
                inputs[..., 1] /= 62.3
                inputs[..., 2] /= 60.3
                inputs = inputs[:, :, 8:120, 30:142, :]
                inputs = np.transpose(inputs, (0, 2, 3, 1, 4))
                # Get the prediction
                pred = model.predict(inputs)
                label = np.argmax(pred[0])
                # Draw the prediction onto the frame
                cv2.putText(frame, class_names[label].split(' ')[-1].strip(),
                            (20, 20), cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 0, 255), 1)
                cv2.putText(frame, "prob: %.4f" % pred[0][label],
                            (20, 40), cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 0, 255), 1)
                clip.pop(0)
            # Play the annotated video
            lines, columns, _ = frame.shape
            frame = cv2.resize(frame, (int(columns), int(lines)))
            img = arrayShow(frame)
            display(img)
            time.sleep(0.02)
        else:
            break
    except:
        print(0)
cap.release()

6. The I3D model

We briefly introduced the I3D model earlier. The official I3D GitHub repository provides models pre-trained on Kinetics along with the prediction code; next we try out how I3D predicts on a video.

First, import the required packages.

import numpy as np
import tensorflow as tf
import i3d

WARNING: The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
If you depend on functionality not listed there, please file an issue.

Define the parameters.

# Input image size
_IMAGE_SIZE = 224
# Number of video frames
_SAMPLE_VIDEO_FRAMES = 79
# The input consists of two parts: RGB and optical flow
# The RGB and flow data have been computed in advance
_SAMPLE_PATHS = {
    'rgb': 'data/v_CricketShot_g04_c01_rgb.npy',
    'flow': 'data/v_CricketShot_g04_c01_flow.npy',
}
# Several pre-trained weights are available
# The imagenet variants are inflated from ImageNet 2D weights; the others are pre-trained on video data
_CHECKPOINT_PATHS = {
    'rgb': 'data/checkpoints/rgb_scratch/model.ckpt',
    'flow': 'data/checkpoints/flow_scratch/model.ckpt',
    'rgb_imagenet': 'data/checkpoints/rgb_imagenet/model.ckpt',
    'flow_imagenet': 'data/checkpoints/flow_imagenet/model.ckpt',
}
# File listing the class labels
_LABEL_MAP_PATH = 'data/label_map.txt'
# Number of classes: 400
NUM_CLASSES = 400

Define the parameter:

  • imagenet_pretrained: if True, load the checkpoints inflated from ImageNet 2D weights (rgb_imagenet / flow_imagenet); if False, load the checkpoints trained from scratch on video data (rgb / flow).

imagenet_pretrained = True

# Load the action classes
kinetics_classes = [x.strip() for x in open(_LABEL_MAP_PATH)]
tf.logging.set_verbosity(tf.logging.INFO)

Build the RGB branch of the model.

rgb_input = tf.placeholder(tf.float32,
                           shape=(1, _SAMPLE_VIDEO_FRAMES, _IMAGE_SIZE, _IMAGE_SIZE, 3))
with tf.variable_scope('RGB', reuse=tf.AUTO_REUSE):
    rgb_model = i3d.InceptionI3d(NUM_CLASSES, spatial_squeeze=True, final_endpoint='Logits')
    rgb_logits, _ = rgb_model(rgb_input, is_training=False, dropout_keep_prob=1.0)

rgb_variable_map = {}
for variable in tf.global_variables():
    if variable.name.split('/')[0] == 'RGB':
        rgb_variable_map[variable.name.replace(':0', '')] = variable

rgb_saver = tf.train.Saver(var_list=rgb_variable_map, reshape=True)

Build the optical flow branch of the model.

flow_input = tf.placeholder(tf.float32,
                            shape=(1, _SAMPLE_VIDEO_FRAMES, _IMAGE_SIZE, _IMAGE_SIZE, 2))
with tf.variable_scope('Flow', reuse=tf.AUTO_REUSE):
    flow_model = i3d.InceptionI3d(NUM_CLASSES, spatial_squeeze=True, final_endpoint='Logits')
    flow_logits, _ = flow_model(flow_input, is_training=False, dropout_keep_prob=1.0)

flow_variable_map = {}
for variable in tf.global_variables():
    if variable.name.split('/')[0] == 'Flow':
        flow_variable_map[variable.name.replace(':0', '')] = variable

flow_saver = tf.train.Saver(var_list=flow_variable_map, reshape=True)

Combine the two branches into the complete I3D model.

model_logits = rgb_logits + flow_logits
model_predictions = tf.nn.softmax(model_logits)

Run the prediction to obtain the action recognition result for the video.
The prediction inputs are the RGB and optical flow data provided at the beginning:

with tf.Session() as sess:
    feed_dict = {}
    if imagenet_pretrained:
        rgb_saver.restore(sess, _CHECKPOINT_PATHS['rgb_imagenet'])  # load the RGB-stream model
    else:
        rgb_saver.restore(sess, _CHECKPOINT_PATHS['rgb'])
    tf.logging.info('RGB checkpoint restored')

    if imagenet_pretrained:
        flow_saver.restore(sess, _CHECKPOINT_PATHS['flow_imagenet'])  # load the flow-stream model
    else:
        flow_saver.restore(sess, _CHECKPOINT_PATHS['flow'])
    tf.logging.info('Flow checkpoint restored')

    start_time = time.time()
    rgb_sample = np.load(_SAMPLE_PATHS['rgb'])  # load the RGB-stream input data
    tf.logging.info('RGB data loaded, shape=%s', str(rgb_sample.shape))
    feed_dict[rgb_input] = rgb_sample

    flow_sample = np.load(_SAMPLE_PATHS['flow'])  # load the flow-stream input data
    tf.logging.info('Flow data loaded, shape=%s', str(flow_sample.shape))
    feed_dict[flow_input] = flow_sample

    out_logits, out_predictions = sess.run(
        [model_logits, model_predictions], feed_dict=feed_dict)
    out_logits = out_logits[0]
    out_predictions = out_predictions[0]
    sorted_indices = np.argsort(out_predictions)[::-1]

    print('Inference time in sec: %.3f' % float(time.time() - start_time))
    print('Norm of logits: %f' % np.linalg.norm(out_logits))
    print('\nTop classes and probabilities')
    for index in sorted_indices[:20]:
        print(out_predictions[index], out_logits[index], kinetics_classes[index])

WARNING:tensorflow:From /home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages/tensorflow/python/training/saver.py:1266: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
INFO:tensorflow:Restoring parameters from data/checkpoints/rgb_imagenet/model.ckpt
INFO:tensorflow:RGB checkpoint restored
INFO:tensorflow:Restoring parameters from data/checkpoints/flow_imagenet/model.ckpt
INFO:tensorflow:Flow checkpoint restored
INFO:tensorflow:RGB data loaded, shape=(1, 79, 224, 224, 3)
INFO:tensorflow:Flow data loaded, shape=(1, 79, 224, 224, 2)
Inference time in sec: 1.511
Norm of logits: 138.468643

Top classes and probabilities
1.0 41.813675 playing cricket
1.497162e-09 21.49398 hurling (sport)
3.8431236e-10 20.13411 catching or throwing baseball
1.549242e-10 19.22559 catching or throwing softball
1.1360187e-10 18.915354 hitting baseball
8.801105e-11 18.660116 playing tennis
2.4415466e-11 17.37787 playing kickball
1.153184e-11 16.627766 playing squash or racquetball
6.1318893e-12 15.996157 shooting goal (soccer)
4.391727e-12 15.662376 hammer throw
2.2134352e-12 14.9772005 golf putting
1.6307096e-12 14.67167 throwing discus
1.5456218e-12 14.618079 javelin throw
7.6690325e-13 13.917259 pumping fist
5.1929587e-13 13.527372 shot put
4.2681337e-13 13.331245 celebrating
2.7205462e-13 12.880901 applauding
1.8357015e-13 12.487494 throwing ball
1.6134511e-13 12.358444 dodgeball
1.1388395e-13 12.010078 tap dancing

Follow the link below to be the first to learn about Huawei Cloud's latest technologies:

華為云博客_大數(shù)據(jù)博客_AI博客_云計(jì)算博客_開發(fā)者中心-華為云
