Big Data Transfer Operation

How to move email from the old archive system to the private email cloud

When a company decides to replace its old archive system with the private email cloud, the critical challenge is moving the data to the cloud without losing anything important along the way. However, it is not as difficult as you may think, if you have the know-how.

As we know, an .eml file stores a single message in the standard RFC 822 format. A message and all of its attachments can be carried in this format between different email servers, including the private email cloud. Thus, .eml is the main format used for importing and moving emails. The following is a suggested method:

Format transfer → EML import → Index & storage

1. Format transfer

If the original emails are not in .eml format, for instance because they are stored in zipped files, the first step is to unzip the files and make sure that the contents are in .eml format. Likewise, use a conversion tool to turn any .pst files into .eml files.
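The unzip-and-check step could be sketched as follows. This is only an illustration in Python using the standard library; the function names are made up for this example, and the check is a simple heuristic (the file must carry From and Date headers), not a full RFC 822 validation. Converting .pst files is not covered here.

```python
import zipfile
from email import policy
from email.parser import BytesParser
from pathlib import Path


def extract_eml_files(archive_path, dest_dir):
    """Unzip archive_path into dest_dir and return the .eml paths found there."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(dest)
    return sorted(dest.rglob("*.eml"))


def looks_like_eml(path):
    """Heuristic: the file parses as a message and has From and Date headers."""
    with open(path, "rb") as handle:
        message = BytesParser(policy=policy.default).parse(handle, headersonly=True)
    return "From" in message and "Date" in message
```

Files for which looks_like_eml returns False are worth inspecting by hand before the import step.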

2. Import .eml files to the private email cloud

Using an SMTP agent to import .eml files to the private email cloud is the simplest and most efficient way. The agent reads the .eml files directly and transfers them to the private email cloud's server via SMTP. It is a one-step process that keeps the .eml files intact.
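To show what such an agent does, here is a minimal sketch in Python built on the standard smtplib module. It is not the product's own agent. The function name, the envelope sender, and the archive address are assumptions for illustration. The raw bytes of each file are handed to the server unchanged, apart from the line-ending and dot-escaping adjustments that SMTP itself requires.

```python
import smtplib
from pathlib import Path


def import_eml_directory(eml_dir, smtp, envelope_from, archive_address):
    """Send every .eml file under eml_dir through an already open SMTP session.

    Returns (sent, failed) lists of paths so that failures can be retried.
    """
    sent, failed = [], []
    for path in sorted(Path(eml_dir).rglob("*.eml")):
        raw = path.read_bytes()
        try:
            # Only the SMTP envelope is set here; the message headers are untouched.
            smtp.sendmail(envelope_from, [archive_address], raw)
            sent.append(path)
        except smtplib.SMTPException:
            failed.append(path)
    return sent, failed


# Example call (host name and addresses are placeholders):
# with smtplib.SMTP("mail-cloud.example.com", 25) as smtp:
#     sent, failed = import_eml_directory(
#         "export/", smtp, "importer@example.com", "archive@example.com")
```

Returning the failed paths rather than stopping at the first error lets a large import continue and be resumed later.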

3. Build indices and raw data storage

When .eml files are imported, a full-text index is built for each email immediately, and distribution algorithms then allocate these indices to different GSAs for searching. The index-building process is the basis for speedy searches. Note that the .eml files themselves are saved to different storage devices depending on the user's ELM settings. All of the steps above run automatically, without any manual operation. Administrators do not need to keep track of what each storage device holds, so monitoring and management become simple.
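The idea of indexing each message and allocating the index to a GSA can be illustrated with a toy example. The sketch below is not the product's algorithm. It assumes a plain inverted index (term to message IDs), one index shard per GSA, and allocation by the calendar month of the Date header, which every message is assumed to carry. The names GridIndex, index_message and shard_for are invented for this example.

```python
import re
from collections import defaultdict
from email import policy
from email.parser import BytesParser
from email.utils import parsedate_to_datetime


def index_message(raw_bytes):
    """Return (set of terms, sent date) for one raw RFC 822 message."""
    message = BytesParser(policy=policy.default).parsebytes(raw_bytes)
    body = message.get_body(preferencelist=("plain",))
    text = body.get_content() if body is not None else ""
    subject = str(message["Subject"] or "")
    terms = set(re.findall(r"[a-z0-9]+", (subject + " " + text).lower()))
    # Assumes a Date header is present and well formed.
    sent_at = parsedate_to_datetime(str(message["Date"]))
    return terms, sent_at


class GridIndex:
    def __init__(self, gsa_count):
        # One inverted index (term -> message IDs) per GSA.
        self.shards = [defaultdict(set) for _ in range(gsa_count)]

    def shard_for(self, sent_at):
        # Assumed rule: one calendar month per bucket, spread round-robin over GSAs.
        return (sent_at.year * 12 + sent_at.month - 1) % len(self.shards)

    def add(self, message_id, raw_bytes):
        """Index one message and return the number of the GSA that received it."""
        terms, sent_at = index_message(raw_bytes)
        shard = self.shard_for(sent_at)
        for term in terms:
            self.shards[shard][term].add(message_id)
        return shard
```

Allocating by time period keeps all messages from the same month on the same GSA, which fits a search that is later split by period.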


1. High Scalability

The private email cloud can be easily expanded by adding GSAs. The more GSAs that are available, the quicker the search. The private email cloud also supports network storage devices or resources.

2. Grid Search System

The private email cloud has an innovative grid search system. The system's search technique speeds up the retrieval of search results and presents them in a user-friendly interface. When the GC receives a query, it splits the search into N requests (by the time period in the search query) and sends one request to each GSA. Compared to the traditional mail archive system, the grid search takes roughly 1/N of the time in the ideal case, where N is the number of GSAs. In other words, search speed is directly proportional to the number of GSAs, so adding GSAs makes searches faster.
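The fan-out performed by the GC can be sketched as below. This is an illustrative Python model, not the GC's real code. It assumes each GSA is reachable through a callable that takes (term, start, end) and returns a list of hits, and that any GSA can serve any time slice. The names split_period and grid_search are invented for this example.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import timedelta


def split_period(start, end, parts):
    """Split the inclusive date range [start, end] into up to `parts` contiguous slices."""
    total_days = (end - start).days + 1
    parts = min(parts, total_days)
    base, extra = divmod(total_days, parts)
    slices, cursor = [], start
    for i in range(parts):
        length = base + (1 if i < extra else 0)
        slice_end = cursor + timedelta(days=length - 1)
        slices.append((cursor, slice_end))
        cursor = slice_end + timedelta(days=1)
    return slices


def grid_search(term, start, end, gsa_search_fns):
    """Send one time slice of the query to each GSA in parallel and merge the hits."""
    slices = split_period(start, end, len(gsa_search_fns))
    with ThreadPoolExecutor(max_workers=len(slices)) as pool:
        futures = [
            pool.submit(search, term, lo, hi)
            for search, (lo, hi) in zip(gsa_search_fns, slices)
        ]
        hits = []
        for future in futures:  # results are merged in slice order
            hits.extend(future.result())
    return hits
```

Because the slices run concurrently, the total time is bounded by the slowest slice rather than by the sum of all slices, which is where the roughly 1/N figure comes from when the slices hold similar amounts of mail.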

3. Unified Management and Interface

Although the grid search concept seems complicated, it is actually convenient for administrators. They just need to log into the GC's mail reporter and send user queries via the search page; the GC then delegates the search jobs to the GSAs. Users can then navigate the results in a timely manner. Whether the raw data is saved on the local disk or on near-line storage, the searches proceed at the same time. This unified management and interface make users feel as if the search were performed on the local disk. As for storage management, you can log into the GC's mail gateway to change the settings.