Database Tour Documentation Contents Index

Exporting Large Datasets

Top Previous Next

Exporting a very large table or query result from a database to another database may require non-standard approaches.

By default, all databases records from the source table are allocated in the operating memory. For some database types or used database engines, it is done immediately after opening the table; for others, the records are fetched from the server and kept in the local computer memory during the exporting. And if you don't have enough RAM to allocate all source records, you run into out of memory error, the the application crashes and you can loose some part of already exported data.

Another problem can be with the target part of your export process. If you turn off the Memory saving mode option, the exported records are allocated in the operating memory too! But this problem is easy solvable: just turn the mentioned option on.

Solution

The solution can be very efficient, but requires some preparation. The idea is to divide the source records to several parts and export them separately, each part at a time (i.e. iteratively). You have to write a SQL query for each iteration. This way, you can export either one table or even results of a complex SQL query.

The key point is choosing how to divide. The recommended way is to use the primary key, i.e. specify a filter condition based on values in the table primary key column(s), for example:

/*1st iteration*/ SELECT * FROM MyTable WHERE Id BETWEEN 0 AND 10000;
/*2nd iteration*/ SELECT * FROM MyTable WHERE Id BETWEEN 10001 AND 20000;
/*3rd iteration*/ SELECT * FROM MyTable WHERE Id BETWEEN 20001 AND 30000;

and so on.

If the source table does not have a primary key, or when you are exporting the result of a complex query, try to use some other filter condition, which effectively selects different parts of the source records:

/*1st iteration*/ SELECT * FROM MyTable WHERE ClientName LIKE 'A%';
/*2nd iteration*/ SELECT * FROM MyTable WHERE ClientName LIKE 'B%';
/*3rd iteration*/ SELECT * FROM MyTable WHERE ClientName LIKE 'C%';

and so on. Consider using indexes etc.

Some databases (e.g. Oracle, SQL Server, PostgreSQL etc) allow to effectively divide the source table by internal row numbers. Here is how to do this for SQL Server:

/*1st iteration*/ SELECT * FROM MyTable ORDER BY SomeColumn OFFSET 0 ROWS FETCH NEXT 2000000 ROWS ONLY;
/*2nd iteration*/ SELECT * FROM MyTable ORDER BY SomeColumn OFFSET 2000000 ROWS FETCH NEXT 2000000 ROWS ONLY;
/*3rd iteration*/ SELECT * FROM MyTable ORDER BY SomeColumn OFFSET 4000000 ROWS FETCH NEXT 2000000 ROWS ONLY;

and so on.

Important: For the target part, in 2nd and next iterations, do not forget to specify APPEND export mode. It allows to add new records to the existing target table.

Executing via GUI

  1. Open the source database.
  2. Click New SQL Window button. Execute the next steps as many times as number of iterations you need.
    1. In SQL editor, type the SQL query of the current iteration and click Execute Query button.
    2. Click Export button.
    3. Go to the Database tab and select your target database.
    4. Specify a target table. Turn on Memory saving mode option.
    5. Choose the correct Export mode: for the first iteration, if the target table already exists and must be cleared before the exporting, choose EMPTY+INSERT; for all other cases, choose APPEND.
    6. Click Export.

Executing through the command line

Exporting data from the command line is available only in Database Tour Professional Edition (Database Tour Pro).

You need to create a separate SQL file and a separate command line for each export iteration. So, create the SQL files with correct queries for each iteration and name them, for example, MyQuery1.sql, MyQuery2.sql, and so on, according to the numbers of iterations.

You can build your command line either manually using specification from the documentation, or in a few clicks from the GUI. In the latter case, choose Tools | Build Command Line | Export / Import Data.... Save the built command line, for example, to the .bat file.

Then, you can repeat this for each iteration. Or just copy the built command line as many times as number of iterations; in each copied command line, replace the SQL file name with correct path for its iteration and specify correct export mode as described above. Place the created command lines to the .bat file and start your export process.

Please remember, that you can also create your command lines as action files, one action file per iteration. Below, there are examples of action files for exporting a large dataset from PostgreSQL database to Firebird database:

;1st iteration
/export
/ExportType=DATABASE
;for the 1st iteration, if the existing target must be cleared, use EMPTY+INSERT export mode
;otherwise, use APPEND export mode
/ExportMode=EMPTY+INSERT
/CommitInterval=1000
/MemorySaving
/UseSQLParameters
/UseBatchMode
;source:
/SrcDBInterface=fd
/SrcDBKind=DSN
/SrcDBDriver=POSTGRESQL
/SrcDB=clients
/SrcServer=MyPgServer
/SrcPort=5432
/SrcVendorLibrary=C:\Program Files (x86)\PostgreSQL\9.6\bin\libpq.dll
/SrcSQLFile=c:\MyExportFiles\Postgres-to-Firebird\Iteration1.sql
;target:
/TrgDBInterface=fd
/TrgDBKind=FILE
/TrgDBDriver=Firebird
/TrgDBUserName=sysdba
/TrgPort=3050
/TrgProtocol=TCPIP
/TrgOSAuthentication=No
/TrgVendorLibrary=C:\Program Files (x86)\Firebird\Firebird_3_0\fbclient.dll
/TrgDB=C:\My Firebird database\CLIENTS.FDB
/TrgTableName=MyTable
;2nd iteration
/export
/ExportType=DATABASE
;for 2nd and next iterations, use APPEND export mode
/ExportMode=APPEND
/CommitInterval=1000
/MemorySaving
/UseSQLParameters
/UseBatchMode
;source:
/SrcDBInterface=fd
/SrcDBKind=DSN
/SrcDBDriver=POSTGRESQL
/SrcDB=clients
/SrcServer=MyPgServer
/SrcPort=5432
/SrcVendorLibrary=C:\Program Files (x86)\PostgreSQL\9.6\bin\libpq.dll
/SrcSQLFile=c:\MyExportFiles\Postgres-to-Firebird\Iteration2.sql
;target:
/TrgDBInterface=fd
/TrgDBKind=FILE
/TrgDBDriver=Firebird
/TrgDBUserName=sysdba
/TrgPort=3050
/TrgProtocol=TCPIP
/TrgOSAuthentication=No
/TrgVendorLibrary=C:\Program Files (x86)\Firebird\Firebird_3_0\fbclient.dll
/TrgDB=C:\My Firebird database\CLIENTS.FDB
/TrgTableName=MyTable
...
;Nth iteration
/export
/ExportType=DATABASE
;for 2nd and next iterations, use APPEND export mode
/ExportMode=APPEND
/CommitInterval=1000
/MemorySaving
/UseSQLParameters
/UseBatchMode
;source:
/SrcDBInterface=fd
/SrcDBKind=DSN
/SrcDBDriver=POSTGRESQL
/SrcDB=clients
/SrcServer=MyPgServer
/SrcPort=5432
/SrcVendorLibrary=C:\Program Files (x86)\PostgreSQL\9.6\bin\libpq.dll
/SrcSQLFile=c:\MyExportFiles\Postgres-to-Firebird\IterationN.sql
;target:
/TrgDBInterface=fd
/TrgDBKind=FILE
/TrgDBDriver=Firebird
/TrgDBUserName=sysdba
/TrgPort=3050
/TrgProtocol=TCPIP
/TrgOSAuthentication=No
/TrgVendorLibrary=C:\Program Files (x86)\Firebird\Firebird_3_0\fbclient.dll
/TrgDB=C:\My Firebird database\CLIENTS.FDB
/TrgTableName=MyTable

And your .bat file should look like this:

REM 1st iteration
"C:\Program Files (x86)\Vitaliy Levchenko\Database Tour Pro 9\dbtour.exe" /ActionFile=C:\MyExportFiles\Postgres-to-Firebird\Action1.txt
REM 2nd iteration
"C:\Program Files (x86)\Vitaliy Levchenko\Database Tour Pro 9\dbtour.exe" /ActionFile=C:\MyExportFiles\Postgres-to-Firebird\Action2.txt
...
REM Nth iteration
"C:\Program Files (x86)\Vitaliy Levchenko\Database Tour Pro 9\dbtour.exe" /ActionFile=C:\MyExportFiles\Postgres-to-Firebird\ActionN.txt