Wednesday, November 26, 2008

Split and Assemble a Large File (around 2 GB) in C# .NET


Hi friends, after a long time I’m back again, this time with a different coding flavor in a different area: file operations. Here I give some code to split and assemble a large file. The code can split a file of up to about 2 GB into any number of smaller files (minimum 1 MB each), and can also assemble those small files back into the original file. When assembling, make sure all the small files are kept in one folder; otherwise the original file cannot be rebuilt and an error is thrown.

To do this in C# .NET, I’ve used the System.IO namespace and its BinaryReader and BinaryWriter classes. To learn more about these classes, please see the MSDN site. I’ve followed a very simple algorithm.

To slice a file, the steps are:
(i) Open the large file in read mode with a binary reader stream.
(ii) Repeat steps iii to v until the end of the file is reached.
(iii) Read from the stream into a byte array.
(iv) Make a new file name from the original file name plus the slice number; for the last slice, add ‘E’ after the slice number.
(v) Save the bytes from the array to the new file with the ‘File’ class’s ‘WriteAllBytes’ method.
(vi) Close the binary reader stream.

Download source code from here:
http://alap.me/blog/All_Source_Codes.rar

The code for each step is as follows:

(i) BinaryReader br = new BinaryReader(File.Open(filename, FileMode.Open));
(ii) while (br.BaseStream.Length > sliceLen * counter)
(iii) br.BaseStream.Read(buffer, 0, sliceLen);
(iv) curFileName = filename + "." + counter.ToString();
     curFileName = filename + "." + counter.ToString() + ".E"; //for the last slice only
(v) File.WriteAllBytes(curFileName, buffer);
(vi) br.Close();
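
The slicing steps can also be sketched as one self-contained method. This is only a minimal sketch under stated assumptions, not the code from the download: the class name SliceSketch, the method name Split and the 64 KB copy buffer are hypothetical choices. Instead of File.WriteAllBytes it copies each slice through a small buffer, so memory use does not grow with the slice size, and it counts in long so the position cannot overflow an int. The naming scheme is the one described above (filename.0, filename.1, ... with ".E" added to the last slice).

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the original FileCutter project.
static class SliceSketch
{
    // Splits 'filename' into slices of at most 'sliceLen' bytes and
    // returns the number of slices written.
    public static int Split(string filename, long sliceLen)
    {
        if (sliceLen < 1)
            throw new ArgumentOutOfRangeException("sliceLen");

        byte[] buffer = new byte[64 * 1024]; // copy buffer size is an assumption
        int counter = 0;

        using (FileStream input = File.OpenRead(filename))
        {
            long remaining = input.Length;
            while (remaining > 0)
            {
                long thisSlice = Math.Min(sliceLen, remaining);
                bool isLast = (thisSlice == remaining);

                // Same naming scheme as the article: ".E" marks the last slice.
                string curFileName = filename + "." + counter + (isLast ? ".E" : "");

                // File.Create overwrites an existing slice of the same name.
                using (FileStream output = File.Create(curFileName))
                {
                    long left = thisSlice;
                    while (left > 0)
                    {
                        // Read may return fewer bytes than requested.
                        int read = input.Read(buffer, 0, (int)Math.Min(buffer.Length, left));
                        if (read == 0)
                            throw new EndOfStreamException("File ended unexpectedly.");
                        output.Write(buffer, 0, read);
                        left -= read;
                    }
                }

                remaining -= thisSlice;
                counter++;
            }
        }
        return counter;
    }
}
```

For example, SliceSketch.Split(path, 1024 * 1024) would cut the file at path into 1 MB slices. An empty input file produces no slices at all.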

To assemble these files, follow these steps:

(i) Create a binary writer stream and open the output file in append mode.
(ii) Repeat steps iii to v in a loop.
(iii) Generate the next file name at run time from the original file name and the slice counter.
(iv) Check for the last slice, whose name ends with the character ‘E’. If the current file is the last slice, write it as in step v and then exit the loop.
(v) Read all bytes from the file into a byte array, then write those bytes with the binary writer.
(vi) Close the binary writer.

The code for each step is as follows:

(i) BinaryWriter bw = new BinaryWriter(File.Open(orgFile, FileMode.Append));
(ii) while (true)
(iii) nextFileName = orgFile + "." + counter.ToString();
(iv) if (File.Exists(nextFileName + ".E")) //true only for the last slice
(v) buffer = File.ReadAllBytes(nextFileName + ".E");
    bw.Write(buffer);
(vi) bw.Close();
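
The assembling steps can be sketched the same way. Again this is a minimal sketch under assumptions, not the downloadable code: MergeSketch and Assemble are hypothetical names, the method always starts from slice 0, and it uses Stream.CopyTo (available from .NET Framework 4, so later than this article) in place of ReadAllBytes so that a slice is never held in memory as a whole.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the original FileCutter project.
static class MergeSketch
{
    // Rebuilds 'orgFile' from orgFile.0, orgFile.1, ... and returns the
    // number of slices read. The last slice is the one ending in ".E".
    public static int Assemble(string orgFile)
    {
        int counter = 0;

        // File.Create replaces any existing file with the same name.
        using (FileStream output = File.Create(orgFile))
        {
            while (true)
            {
                string nextFileName = orgFile + "." + counter;
                bool isLast = File.Exists(nextFileName + ".E");
                if (isLast)
                    nextFileName += ".E";

                // Throws FileNotFoundException if a slice is missing.
                using (FileStream input = File.OpenRead(nextFileName))
                {
                    input.CopyTo(output); // needs .NET Framework 4 or later
                }

                counter++;
                if (isLast)
                    break;
            }
        }
        return counter;
    }
}
```

If a slice is missing, the exception leaves a partly written output file behind, so a caller would probably want to delete it in that case.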

I’ve not covered every detail in the algorithm, but I believe you will understand the rest easily.

The complete code is given below:

// File cutter and assembler in Microsoft C# .NET
// This code was written by Suman Biswas in 2008.
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Text;
using System.Windows.Forms;
using System.IO;

namespace FileCutter
{
    //This class calls the actual file operation class.
    public partial class Form1 : Form
    {
        FileHandling obj = new FileHandling();

        public Form1()
        {
            InitializeComponent();
            this.Text = "File cutter & assembler (upto 1.96 GB) by Suman Biswas";
        }

        //Splits the selected file into slices of the size (in MB) typed in textBox1.
        private void btnSelectFile_Click(object sender, EventArgs e)
        {
            int sizeInMB;
            if (!int.TryParse(textBox1.Text, out sizeInMB))
            {
                MessageBox.Show("Please enter the slice size in MB as a whole number");
                return;
            }
            obj.SplitUp(SelectFile(), sizeInMB);
        }

        //Assembles the slices, starting from the selected slice file.
        private void button1_Click(object sender, EventArgs e)
        {
            obj.MargeUp(SelectFile());
        }

        //Returns the chosen file path, or "" if the dialog was cancelled.
        private string SelectFile()
        {
            OpenFileDialog fbd = new OpenFileDialog();
            if (fbd.ShowDialog() != DialogResult.OK)
            {
                MessageBox.Show("No file selected");
                return "";
            }
            else
                return fbd.FileName;
        }
    }

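    //Main file operation is done here.
    class FileHandling
    {
        int sliceLen = 1024 * 1024;
        int counter = 0;

        public void SplitUp(string filename, int fileSizeInMB)
        {
            if (filename.Length < 1)
                return;

            if (fileSizeInMB < 1)
                fileSizeInMB = 1; //Minimum slice size is 1 MB

            sliceLen = 1024 * 1024 * fileSizeInMB;
            counter = 0;
            string curFileName = "";
            BinaryReader br = new BinaryReader(File.Open(filename, FileMode.Open));

            //A file smaller than one slice is written out as a single slice
            if (br.BaseStream.Length < sliceLen)
                sliceLen = (int)br.BaseStream.Length;
            byte[] buffer = new byte[sliceLen];

            //long arithmetic keeps sliceLen * counter from overflowing int
            while (br.BaseStream.Length > (long)sliceLen * counter)
            {
                if (br.BaseStream.Length > (long)sliceLen * (counter + 1))
                {
                    br.BaseStream.Read(buffer, 0, sliceLen);
                    curFileName = filename + "." + counter.ToString();
                }
                else
                {
                    //Last slice: holds only the remaining bytes and ends with ".E"
                    int remainLen = (int)(br.BaseStream.Length - (long)sliceLen * counter);
                    buffer = new byte[remainLen];
                    br.BaseStream.Read(buffer, 0, remainLen);
                    curFileName = filename + "." + counter.ToString() + ".E";
                }

                if (File.Exists(curFileName))
                    File.Delete(curFileName);

                File.WriteAllBytes(curFileName, buffer);
                counter++;
            }
            br.Close();
            MessageBox.Show("File split successfully");
        }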

        public void MargeUp(string firstFileName)
        {
            if (firstFileName.Length < 1)
                return;

            if (firstFileName.LastIndexOf(".") < 0)
            {
                MessageBox.Show("Please select a slice file");
                return;
            }

            string endPart = firstFileName;
            string orgFile = "";

            //Separate the original file name from the slice number
            orgFile = endPart.Substring(0, endPart.LastIndexOf("."));
            endPart = endPart.Substring(endPart.LastIndexOf(".") + 1);

            if (endPart == "E")//Last slice selected (e.g. only one slice exists): start from slice 0
            {
                orgFile = orgFile.Substring(0, orgFile.LastIndexOf("."));
                endPart = "0";
            }

            //Check the slice number before touching any existing file
            int counter;
            if (!int.TryParse(endPart, out counter))
            {
                MessageBox.Show("Please select a slice file");
                return;
            }

            if (File.Exists(orgFile))
            {
                if (MessageBox.Show(orgFile + " already exists, do you want to delete it", "", MessageBoxButtons.YesNo) == DialogResult.Yes)
                    File.Delete(orgFile);
                else
                {
                    MessageBox.Show("File not assembled. Operation cancelled by user.");
                    return;
                }
            }

            //Assembling starts from here
            string nextFileName = "";
            byte[] buffer;

            //using closes the writer even if a slice is missing and ReadAllBytes throws
            using (BinaryWriter bw = new BinaryWriter(File.Open(orgFile, FileMode.Append)))
            {
                while (true)
                {
                    nextFileName = orgFile + "." + counter.ToString();
                    if (File.Exists(nextFileName + ".E"))
                    {
                        //Last slice
                        buffer = File.ReadAllBytes(nextFileName + ".E");
                        bw.Write(buffer);
                        break;
                    }
                    else
                    {
                        buffer = File.ReadAllBytes(nextFileName);
                        bw.Write(buffer);
                    }
                    counter++;
                }
            }
            MessageBox.Show("File assembled successfully");
        }

    }

}

6 comments:

Unknown said...

Hi There,

Great Article - Thanks.

I have a question though. How can I do string operation ***efficiently*** while I have file in byte array. My file size is 270 MB and I am trying to read every line to decide whether that line is good for inserting into DB or not.

When I compute the array size, it comes to around 220000000. When I take a chunk of 10000, my app throws memory exception.

Thanks,
Johnny.

Suman Biswas said...

Hi,
You are trying to read all these bytes into a byte array at once, which is why you get this problem. Instead, open the file as a binary stream and read it one chunk at a time. For this you can use the BinaryReader class and the Read() method of its object.

Suman
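
A minimal sketch of the chunk-by-chunk reading described in this reply, under stated assumptions: ChunkSketch and ForEachChunk are hypothetical names, and the handler passed in stands for whatever per-chunk processing is needed. Only one chunk is held in memory at a time.

```csharp
using System;
using System.IO;

// Hypothetical helper illustrating chunked reading with BinaryReader.
static class ChunkSketch
{
    // Calls 'handle' once per chunk with the buffer and the number of
    // valid bytes in it, and returns the total number of bytes read.
    public static long ForEachChunk(string path, int chunkSize, Action<byte[], int> handle)
    {
        byte[] buffer = new byte[chunkSize]; // reused for every chunk
        long total = 0;

        using (BinaryReader br = new BinaryReader(File.OpenRead(path)))
        {
            int read;
            // Read returns 0 at the end of the file.
            while ((read = br.Read(buffer, 0, buffer.Length)) > 0)
            {
                handle(buffer, read);
                total += read;
            }
        }
        return total;
    }
}
```

A chunk boundary can fall in the middle of a line, so for line-by-line text processing StreamReader.ReadLine is usually the simpler fit.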

Unknown said...

Good day. I know this is an old post, but I would like to ask about this program. I need to send a file from a client to a server, and the regular limit of 850 KB is not enough; I need to be able to send at least 250 MB, so I thought of using your method. Unfortunately I cannot access your source code download. Do you have a new link for it? Best regards,
Norberto.

Suman Biswas said...

Sorry for the broken download link. I shall upload the files again and update the link ASAP.

Suman

Unknown said...

Hi,

The source code is not available anymore. Can you please put up a new link to it?

Thanks.