Parallel Computing for video compression

Hi guys,
I need some help with parallel programming in MATLAB. To be clear, I have never implemented parallelization techniques in any of my code before.
I have a video compression engine, developed as part of my university project. It is a basic version of an H.264 video compression engine. I have to apply the parallel processing techniques available in MATLAB to this engine. Basically, I have a function which divides an image frame into a number of blocks (predetermined by the size of the block). I'm trying to partially or fully parallelize this part of the code. I have used "parfor" where there was no dependency between the blocks, and this worked out well. I have uploaded this implementation. Now I'm trying to parallelize a case where there are dependencies between blocks.
function [reconstructed_frames, residual_blocks, encoded_data_cell, ...
    bit_count_coeff_per_frame, bit_count_mv_per_frame_cell, ...
    real_avg_bit_count_per_row_per_frame, total_bit_count_per_frame, ...
    QP_used_in_row, scene_change_frames, SAD_value_per_frame] = ...
    block_prediction_parallalized(Y, block_size, srch_rng, QP, I_period, ...
    pathToResiduals, no_ref_frames, VBS_enable, Fast_ME_enable, Frac_ME_enable, ...
    lambda, RC_flag, avg_bit_count_row_vary_QP, target_bits_per_frame)
% Function to predict frames based on inter prediction and intra prediction,
% with the given I-period
Y = int64(Y);
[no_rows, no_cols, no_frames] = size(Y);
no_blocks_in_row = no_cols/block_size;
no_blocks_in_col = no_rows/block_size;
total_blocks_per_frame = (no_rows*no_cols)/(block_size*block_size);
encoded_data_cell = cell(1,total_blocks_per_frame,no_frames);
encoded_data_per_frame = cell(1, total_blocks_per_frame);
ref_frame_inter = zeros(no_rows, no_cols, 1, 'int64') + 128;
bit_count_coeff_per_frame = 0;
bit_count_mv_per_frame_cell = 0;
real_avg_bit_count_per_row_per_frame = 0;
QP_used_in_row = zeros(1,no_blocks_in_col,no_frames);
QP_used_in_row(:,:,:) = QP;
scene_change_frames = [];
SAD_value_per_frame = 0;
ref_frame_index_count = 1;
for k = 1:no_frames
    if k > 1
        ref_frame_inter(:,:,1) = Y(:,:,k-1);
    end
    block_segment = 0;
    bitCountMV = 0;
    for row = 1 : block_size : no_rows - block_size + 1
        for col = 1 : block_size : no_cols - block_size + 1
            block_segment = block_segment + 1;
            row_start = row;
            row_end = row_start + block_size - 1;
            col_start = col;
            col_end = col_start + block_size - 1;
            row_end = min(row_end, no_rows);
            col_end = min(col_end, no_cols);
            % Making an array of blocks of size block_size
            block_list_currframe(:,:,block_segment) = Y(row_start:row_end, col_start:col_end, k);
            location_pointers(block_segment,:) = [row_start row_end col_start col_end];
        end
    end
    % Parallelizing the block encoding process
    max_index = size(block_list_currframe,3);
    % Loop for processing blocks concurrently
    parfor block_index = 1:max_index
        % Function for inter-prediction
        [encoded_data, reconstructed_block, residual_block, bit_count_per_block] = ...
            paral_debug_funct(block_index, location_pointers, block_list_currframe, ...
            ref_frame_inter, block_size, srch_rng, QP, no_rows, no_cols, ...
            ref_frame_index_count, VBS_enable, Fast_ME_enable, Frac_ME_enable, lambda);
        % Buffering the output of each worker
        reconstructed_blocks(:,:,block_index) = reconstructed_block;
        residual_blocks_in_frame(:,:,block_index) = residual_block;
        encoded_data_per_frame(:,:, block_index) = encoded_data;
        total_bit_count_per_block(block_index) = bit_count_per_block;
    end
    % Processing the buffered outputs obtained after processing all the
    % blocks.
    for block_index = 1:size(block_list_currframe,3)
        % [row_start, row_end, col_start, col_end] = location_pointers(block_index,:);
        row_start = location_pointers(block_index, 1);
        row_end = location_pointers(block_index, 2);
        col_start = location_pointers(block_index, 3);
        col_end = location_pointers(block_index, 4);
        reconstructed_frames(row_start:row_end, col_start:col_end, k) = reconstructed_blocks(:,:,block_index);
        residual_blocks(:,:,block_index,k) = residual_blocks_in_frame(:,:,block_index);
        encoded_data_cell(:,:,block_index,k) = encoded_data_per_frame(:,:,block_index);
    end
    total_bit_count_per_frame(k) = sum(total_bit_count_per_block, 'all');
end
In the above code, the blocks don't have to communicate with each other. Now I need them to communicate at some point, because the processing of some blocks has to wait for a previous block to finish.
I think the image below will help make it clearer.
I have learned that there are two types of parallel processing available: multi-threading and multi-processing. I think multi-threading suits my use case. I have read about spmd and parfeval, but the examples I've come across are usually not very detailed. As I am new to parallel processing, these options feel confusing and it is difficult to choose which one to focus on. I think what I want is for the workers to be able to communicate with each other during execution, but I'm not sure.
If you need a general idea of the data size:
video frame size = 288x352 (CIF format)
block size = 16
no. of frames = 21

Accepted Answer

Vishnu Pradeep 2021-12-2
Someone on another forum helped me with this answer. In case it helps anyone, I'm posting it here. Feel free to ask any questions! :)
You can use a parfor inside a non-parallel for, something like this:
previous_blocks = {};
% One sequential pass per color; blocks of the same color do not depend on each other
for color = ["green", "red", "blue"]
    % Pseudocode: input_blocks = cell array of the blocks with this color, extracted from the image
    processed_blocks = cell(1, numel(input_blocks));
    parfor i = 1:numel(input_blocks)
        processed_blocks{i} = process_based_on_previous_blocks(i, input_blocks{i}, previous_blocks);
    end
    previous_blocks = processed_blocks;
    % Pseudocode: place processed_blocks back in their original positions in the image
end
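
To make the pattern above concrete, here is a minimal sketch of one way to build the groups. It rests on an assumption that the thread does not confirm: that each block depends only on its left and top neighbours, so all blocks on the same anti-diagonal of the block grid are independent of one another. The names `wave` and `padded`, and the one-line "work" inside the parfor, are placeholders for illustration and are not part of the original engine.

```matlab
% Sketch only. ASSUMPTION: a block depends on its left and top neighbours.
no_rows = 288; no_cols = 352; block_size = 16;   % sizes from the question
n_br = no_rows / block_size;                     % number of block rows
n_bc = no_cols / block_size;                     % number of block columns

[bc, br] = meshgrid(1:n_bc, 1:n_br);             % block column/row of every block
wave = br + bc - 1;                              % anti-diagonal index of each block
n_waves = n_br + n_bc - 1;                       % number of groups to run in order

% Results are stored with a zero border so edge blocks have "neighbours".
% Block (r, c) lives at padded(r + 1, c + 1).
padded = zeros(n_br + 1, n_bc + 1);

for w = 1:n_waves                                % sequential over groups
    idx = find(wave == w);                       % blocks in this group
    r = br(idx);
    c = bc(idx);
    left = padded(sub2ind(size(padded), r + 1, c));   % result of block (r, c-1)
    top  = padded(sub2ind(size(padded), r, c + 1));   % result of block (r-1, c)
    out = zeros(numel(idx), 1);
    parfor i = 1:numel(idx)                      % parallel within the group
        % Placeholder for the real per-block encoder call
        out(i) = 1 + max(left(i), top(i));
    end
    padded(sub2ind(size(padded), r + 1, c + 1)) = out;
end
result = padded(2:end, 2:end);                   % drop the zero border
```

With the sizes from the question this gives an 18x22 block grid, 39 groups, and at most 18 blocks per group, so the first and last few groups offer very little parallelism. Because the placeholder work just counts dependency depth, `result` ends up equal to `wave`, which is a quick way to check that every block saw its neighbours' finished results. If the real dependencies differ (for example, a top-right neighbour as well), the grouping formula would need to change accordingly.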
