I am embedding a PDF into an image, but after extraction I get a PDF with blank pages. How do I extract exactly the PDF file I inserted?

Question

Ramya 2023-7-1

Direct link to this question:

https://ww2.mathworks.cn/matlabcentral/answers/1990548-i-am-embedding-pdf-into-an-image-but-after-extraction-i-am-getting-blank-page-pdf-how-to-extract-t

Commented: Ramya 2023-7-6

% Embedding PDF file into an image using LSB substitution
% Set the file names for the PDF file and the cover image
pdfFileName = 'UMNwriteup.pdf';
imageFileName = 'glioma.jpg';
% Read the PDF file as binary
pdfData = fileread(pdfFileName);
pdfData = uint8(pdfData);
% Read the cover image
coverImage = imread(imageFileName);
% Get the dimensions of the cover image
[rows, columns, ~] = size(coverImage);
% Calculate the maximum number of bytes that can be embedded
maxBytes = (rows * columns * 3) / 8;
% Check if the PDF file size exceeds the maximum embedding capacity
if numel(pdfData) > maxBytes
    error('PDF file size exceeds the maximum embedding capacity of the cover image.');
end
% Convert the PDF data into binary format
pdfBinary = de2bi(pdfData, 8, 'left-msb');
pdfBinary = pdfBinary(:);
% Get the number of bits to be embedded
numBits = numel(pdfBinary);
% Reshape the cover image to match the number of bits
coverImage = reshape(coverImage, [], 1);
% Embed the PDF data into the cover image using LSB substitution
coverImage(1:numBits) = bitset(coverImage(1:numBits), 1, pdfBinary);
% Reshape the modified cover image back to the original dimensions
coverImage = reshape(coverImage, rows, columns, 3);
% Save the stego image with the embedded PDF data
stegoImageFileName = 'stego_image.png';
imwrite(coverImage, stegoImageFileName);
% Extraction of PDF file from the stego image
% Read the stego image
stegoImage = imread(stegoImageFileName);
% Reshape the stego image into a single column
stegoImage = reshape(stegoImage, [], 1);
% Extract the embedded PDF data from the stego image
extractedPDFBinary = bitget(stegoImage(1:numBits), 1);
% Reshape the extracted binary data into bytes
extractedPDFData = reshape(extractedPDFBinary, [], 8);
extractedPDFData = bi2de(extractedPDFData, 'left-msb');
% Convert the extracted PDF data from uint8 to char
extractedPDFData = char(extractedPDFData);
% Write the extracted PDF data to a file
outputFileName = 'extracted.pdf';
fid = fopen(outputFileName, 'w');
fwrite(fid, extractedPDFData, 'uint8');
fclose(fid);
disp('Extraction complete.');
disp(['The extracted PDF file has been saved as: ' outputFileName]);


Answer 1

Image Analyst 2023-7-3

Direct link to this answer:

https://ww2.mathworks.cn/matlabcentral/answers/1990548-i-am-embedding-pdf-into-an-image-but-after-extraction-i-am-getting-blank-page-pdf-how-to-extract-t#answer_1266458

See my attached stego/hiding/watermarking demos. Maybe there is something there that you can use or adapt. Good luck.

1 comment

Ramya 2023-7-6

% Demo by Image Analyst to hide an audio signal in a uint8 gray scale image by encoding it in the least significant bit.
%============================================================================================================================================
% Initialization Steps.
clc;    % Clear the command window.
close all;  % Close all figures (except those of imtool.)
clear;  % Erase all existing variables. Or clearvars if you want.
workspace;  % Make sure the workspace panel is showing.
format long g;
format compact;
fontSize = 15;
markerSize = 4;
%============================================================================================================================================
% Read in image
% grayImage = imread('moon.tif'); % This image is too small to contain the sound file.
baseFileName = 'glioma.jpg';
grayImage = imread(baseFileName); % A uint8 image.
numPixels = numel(grayImage);
subplot(2, 2, 1);
imshow(grayImage, []);
impixelinfo;
caption = sprintf('Original Image : "%s"', baseFileName);
title(caption, 'FontSize', fontSize)
%============================================================================================================================================
% Read in demo audio file that ships with MATLAB.
[y, fs] = audioread("button-1.wav");
y = y(:); % If stereo, stack left channel on top of right channel.
% Get the time axis
t = linspace(0, length(y) / fs, length(y))';
subplot(2, 2, [3,4]);
plot(t, y, 'b-');
grid on;
yline(0, 'Color', 'k'); % Draw line along x axis.
title('Original Audio Signal vs. Time that is Encoded into Above Right Image', 'FontSize', fontSize)
xlabel('Time (seconds)', 'FontSize', fontSize)
ylabel('Audio Signal', 'FontSize', fontSize)
ylim([-1, 1]);
xlim([0, max(t)]);
drawnow;	% Force immediate screen painting so we can see the inputs while the encoding and decoding process go on.
% Play the sound.
% playerObject = audioplayer(y, fs);
% play(playerObject)
% Convert y to 16 bits to have enough resolution so that the sound signal value won't be changed much due to round off error.
y16 = uint16(rescale(y, 0, 65535));
% Sound was converted to uint16 so we will need 16 pixels to store one sound value.
% Make an output image initialized to the same as the original image.
stegoImage = grayImage;
% See if the image is big enough to hide all bits of the audio signal.
numPixelsRequired = length(y) * 16;
if numPixels < numPixelsRequired
	errorMessage = sprintf('Cannot fit the sound into the image.\nThe image is %d elements long.\nThe sound is %d elements long.\nThe sound file needs the image to have at least %d pixels (= 16 * %d) to contain the entire sound.\n', ...
		numPixels, length(y), numPixelsRequired, length(y));
	uiwait(errordlg(errorMessage));
else
	fprintf('The image is %d elements long.\nThe sound is %d elements long.\nThe sound file needs the image to have at least %d pixels (= 16 * %d) to contain the entire sound.\n', ...
		numPixels, length(y), numPixelsRequired, length(y));	
end
%============================================================================================================================================
% Now encode the audio signal in the least significant bit of the uint8 gray scale image.
for k = 1 : numel(y16)
	binaryNumberString = dec2bin(y16(k), 16);
	if mod(k, 10000) == 0
		fprintf('Changing pixel #%d of %d.\n', k, numel(y16));
	end
% 	fprintf('%d in uint16 is %s in binary', y16(k), binaryNumberString);
	imageIndex = (k - 1) * 16 + 1;
	these16GrayLevels = stegoImage(imageIndex : imageIndex + 15);
	for k2 = 1 : length(binaryNumberString)
		gl = these16GrayLevels(k2);
		if binaryNumberString(k2) == '1'
			% Image gray level needs to be odd.
			% If the gray level is not already odd, make it odd.
			if rem(gl, 2) == 0 % if it's even...
				% It's even.  Add 1 to it to make it odd.
				gl = gl + 1;
				stegoImage(imageIndex + k2 - 1) = gl;
			end
		else % binaryNumberString(k2) == '0'
			% Image gray level needs to be even.
			% If the gray level is not already even, make it even.
			if rem(gl, 2) == 1 % If it's odd...
				% It's odd.  Add 1 to it to make it even, unless it's already 255 because we can't have a value of 256 for a uint8 variable.
				if gl <= 254
					gl = gl + 1;
				else
					% Value is 255 initially.  Make it even by making it 254, since we can't do 256.
					gl = 254;
				end
				stegoImage(imageIndex + k2 - 1) = gl;
			end
		end
	end
% 	these16GrayLevelsNow = stegoImage(imageIndex : imageIndex + 15)
end
subplot(2, 2, 2);
imshow(stegoImage, []);
impixelinfo;
title('With embedded sound file', 'FontSize', fontSize)
drawnow;
% Maximize the figure.
g = gcf;
g.WindowState = 'maximized';
g.Name = 'Demo by Image Analyst';
g.NumberTitle = 'off';
%============================================================================================================================================
% Now undo the encoding process and recover the hidden sound 
% by looking at the last (least significant) bit of the image gray levels and assigning that to a new sound.
yExtracted = zeros(length(y16), 1);
counter = 1;
for k = 1 : 16 : numel(y16) * 16
	% Get a vector of 16 pixel values.
	these16GrayLevels = stegoImage(k : k+15);
	% Get the least significant bits of those 16 pixel values.
	b = bitget(these16GrayLevels, 1);
	% Convert to a string, then to a decimal number.
	soundValue = bin2dec(sprintf('%c', b+48));
	if mod(counter, 10000) == 0
		fprintf('Assigning sound sample #%d of %d.\n', counter, numel(y16));
	end
	% Assign that sound value to the output sound signal.
	yExtracted(counter) = soundValue;
	counter = counter + 1;
end
% Convert back y from uint16 back to floating point like the original y.
yRecovered = yExtracted / 65535;
%============================================================================================================================================
% Play the recovered sound.
playerObject = audioplayer(yRecovered, fs);
play(playerObject)
% Double check that they're the same.
% If they're the same, the value below should be 1, true.
theyAreTheSame = isequal(y16, yExtracted)

I am getting the following output:

The image is 786432 elements long.
The sound is 17640 elements long.
The sound file needs the image to have at least 282240 pixels (= 16 * 17640) to contain the entire sound.
Changing pixel #10000 of 17640.
Assigning sound sample #10000 of 17640.
Warning: No audio outputs were found.
> In audiovideo.internal/audioplayerOnline/hasNoAudioHardware (line 491)
In audiovideo.internal/audioplayerOnline/initialize (line 327)
In audiovideo.internal.audioplayerOnline (line 175)
In audioplayer (line 134)
In LSB_hide_audio_in_image (line 135)
Warning: No audio outputs were found.
> In audiovideo.internal/audioplayerOnline/hasNoAudioHardware (line 491)
In audiovideo.internal/audioplayerOnline/play (line 200)
In audioplayer/play (line 349)
In LSB_hide_audio_in_image (line 136)
theyAreTheSame =
  logical
   1

Why?

Answer 2

DGM 2023-7-2

Direct link to this answer:

https://ww2.mathworks.cn/matlabcentral/answers/1990548-i-am-embedding-pdf-into-an-image-but-after-extraction-i-am-getting-blank-page-pdf-how-to-extract-t#answer_1266033

Edited: DGM 2023-7-2

fileread() is really just a convenience wrapper for fread() meant for reading text files. Up until R2020-something, it didn't even have a means to specify the encoding. It just blindly read the file using a default presumed encoding. In this case, it's likely reading the data using a two-byte encoding, so casting the char vector as uint8 destroys the data.

If you want to read a binary file strictly bytewise, just use fread(...,'*uint8') instead of trying to work around the automatic encoding detection used by fileread() or by fread(...,'*char').

% Read the PDF file as binary
fid = fopen(pdfFileName,'r');
pdfData = fread(fid,'*uint8');
fclose(fid);

See the bottom of the table here for the comments on how char inputs are handled:

https://www.mathworks.com/help/matlab/ref/fread.html#btp1twt-1-precision
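
For what it's worth, here is a minimal sketch of how the whole embed/extract round trip might look once the file is read bytewise. It is not the original poster's script: the payload file and the random cover image are synthetic stand-ins (assumptions) so the example runs anywhere, the variable names are made up for illustration, and only base MATLAB functions are used in place of de2bi()/bi2de(). As in the original script, the extraction step is assumed to know the number of embedded bits.

```matlab
% Sketch: byte-exact LSB round trip using only base MATLAB functions.
% The payload file and cover image below are synthetic stand-ins (assumptions).
payloadFile = [tempname '.bin'];
fid = fopen(payloadFile, 'w');
fwrite(fid, uint8([37 80 68 70 0:255]), 'uint8');  % '%PDF' followed by every byte value
fclose(fid);

% Read the payload strictly bytewise, with no character decoding.
fid = fopen(payloadFile, 'r');
payload = fread(fid, '*uint8');                    % uint8 column vector
fclose(fid);

cover = uint8(randi([0 255], 64, 64, 3));          % stand-in for the real cover image

% Expand each byte into 8 bits, most significant bit first (one row per byte).
bits = zeros(numel(payload), 8, 'uint8');
for b = 1:8
    bits(:, b) = bitget(payload, 9 - b);
end
bits = reshape(bits.', [], 1);                     % bits of byte 1, then byte 2, ...
n = numel(bits);
assert(n <= numel(cover), 'Payload does not fit in the cover image.');

% Embed into the least significant bit of the first n image samples.
stegoVec = cover(:);
stegoVec(1:n) = bitset(stegoVec(1:n), 1, bits);
stego = reshape(stegoVec, size(cover));

% Save losslessly (PNG) so the least significant bits survive, then reload.
stegoFile = [tempname '.png'];
imwrite(stego, stegoFile);
loaded = imread(stegoFile);

% Extract the first n least significant bits and pack them back into bytes.
loadedVec = loaded(:);
outBits = reshape(bitget(loadedVec(1:n), 1), 8, []);   % one column per byte, MSB on top
weights = [128 64 32 16 8 4 2 1];
recovered = uint8(weights * double(outBits)).';        % uint8 column vector

% Write the recovered bytes out as raw uint8, never via char.
outFile = [tempname '.bin'];
fid = fopen(outFile, 'w');
fwrite(fid, recovered, 'uint8');
fclose(fid);

roundTripOk = isequal(recovered, payload);
fprintf('Round trip byte-exact: %d\n', roundTripOk);
delete(payloadFile, stegoFile, outFile);
```

With a real PDF, the same isequal() check against the bytes returned by fread() is a quick way to tell whether the damage happens while reading the file or during embedding and extraction.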

3 comments (1 earlier comment not shown)

Ramya 2023-7-3

% Embedding larger PDF file into an image using chunk-wise LSB substitution
% Set the file names for the PDF file and the cover image
pdfFileName = 'UMNwriteup.pdf';
imageFileName = 'glioma.jpg';
% Embedding larger PDF file into an image using chunk-wise LSB substitution
% Read the PDF file as binary
pdfData = fileread(pdfFileName);
pdfData = uint8(pdfData);
% Read the cover image
coverImage = imread(imageFileName);
% Get the dimensions of the cover image
[rows, columns, ~] = size(coverImage);
% Calculate the maximum number of bytes that can be embedded per chunk
maxBytesPerChunk = (rows * columns * 3) / 8;
% Divide the PDF data into chunks
numChunks = ceil(numel(pdfData) / maxBytesPerChunk);
chunks = cell(numChunks, 1);
for i = 1:numChunks
    startIndex = (i-1) * maxBytesPerChunk + 1;
    endIndex = min(i * maxBytesPerChunk, numel(pdfData));
    chunks{i} = pdfData(startIndex:endIndex);
end
% Embed each chunk into the cover image using LSB substitution
for i = 1:numChunks
    chunk = chunks{i};
    
    % Convert the chunk data into binary format
    chunkBinary = de2bi(chunk, 8, 'left-msb');
    chunkBinary = chunkBinary(:);
    
    % Get the number of bits to be embedded
    numBits = numel(chunkBinary);
    
    % Reshape the cover image to match the number of bits
    coverImage = reshape(coverImage, [], 1);
    
    % Embed the chunk data into the cover image using LSB substitution
    coverImage(1:numBits) = bitset(coverImage(1:numBits), 1, chunkBinary);
    
    % Reshape the modified cover image back to the original dimensions
    coverImage = reshape(coverImage, rows, columns, 3);
    
    % Save the stego image with the embedded chunk data
    stegoImageFileName = sprintf('stego_image_chunk%d.png', i);
    imwrite(coverImage, stegoImageFileName);
end
disp('Embedding complete.');
% Extraction of PDF file from the stego image
% Initialize the extracted PDF data
extractedPDFData = [];
% Extract each chunk from the stego images and append to the extracted PDF data
for i = 1:numChunks
    % Read the stego image
    stegoImageFileName = sprintf('stego_image_chunk%d.png', i);
    stegoImage = imread(stegoImageFileName);
    
    % Reshape the stego image into a single column
    stegoImage = reshape(stegoImage, [], 1);
    
    % Extract the embedded chunk data from the stego image
    extractedChunkBinary = bitget(stegoImage, 1);
    extractedChunkBinary = reshape(extractedChunkBinary, [], 8);
    
    % Convert the extracted binary data to uint8
    extractedChunkData = uint8(bi2de(extractedChunkBinary, 'left-msb'));
    
    % Append the extracted chunk data to the complete PDF data
    extractedPDFData = [extractedPDFData; extractedChunkData];
end
% Convert the extracted PDF data to a character array
charPDFData = char(extractedPDFData.');
% Write the character array to a temporary file
tempFileName = 'temp_pdf_file.bin';
fid = fopen(tempFileName, 'w');
fwrite(fid, charPDFData, 'char');
fclose(fid);
% Convert the temporary file to PDF using MATLAB's built-in function
outputFileName = 'extracted.pdf';
systemCommand = ['java -jar pdfbox-app-2.0.25.jar ExtractText -console "' tempFileName '" > "' outputFileName '"'];
[status, result] = system(systemCommand);
if status == 0
    disp('Extraction complete.');
    disp(['The extracted PDF file has been saved as: ' outputFileName]);
else
    disp('Extraction failed.');
end

DGM 2023-7-3

I don't know. I don't know where that jar file comes from, but I can't run that on my installation.

Using fileread() like that with the PDFs I've tested on my system in R2019b will reliably result in the read data being corrupted (values are different, wrong number of bytes returned). A char is not necessarily 1 byte.
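
One way to check this on a given installation is sketched below. It writes a small test file containing every possible byte value (a made-up stand-in for a PDF), then reads it back both ways. What fileread() returns depends on the release and platform, so no particular output is claimed for it; only the bytewise fread() result is expected to match the file.

```matlab
% Sketch: compare fileread() with bytewise fread() on a file holding every byte value.
testFile = [tempname '.bin'];                 % temporary test file (stand-in for a PDF)
original = uint8(0:255).';
fid = fopen(testFile, 'w');
fwrite(fid, original, 'uint8');
fclose(fid);

% Text-oriented read: the bytes are decoded as characters using some encoding.
asText = uint8(fileread(testFile)).';

% Bytewise read: no decoding takes place.
fid = fopen(testFile, 'r');
asBytes = fread(fid, '*uint8');
fclose(fid);

fprintf('fileread returned %d values, fread returned %d bytes.\n', numel(asText), numel(asBytes));
fprintf('fileread matches the file: %d\n', isequal(asText, original));
fprintf('fread matches the file:    %d\n', isequal(asBytes, original));
delete(testFile);
```

If the first comparison prints 0, or the counts differ, the data is being altered at read time, before any embedding happens.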
