Relational databases are typically used for two distinct purposes: transaction processing and analysis. Because these are very different use cases, the optimal underlying architecture differs significantly between the two.
Transactions (everyday events stored in rows as they occur) are stored much like you would naturally store data in Excel. Analysis, however, is typically performed over long time periods, and it would be very inefficient to traverse millions or even billions of rows for each analytical query.
Data warehouses use columnar storage (storing data column by column), which enables exceptionally strong compression and fast data aggregation.
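As a toy illustration of the difference (a sketch for intuition only, not how a warehouse is actually implemented), compare aggregating one field when the same data is laid out row by row versus column by column:

```python
# Toy illustration: the same data in row-based and column-based layouts.
rows = [
    {"order_id": 1, "amount": 19.99},
    {"order_id": 2, "amount": 5.00},
    {"order_id": 3, "amount": 42.50},
]

columns = {
    "order_id": [1, 2, 3],
    "amount": [19.99, 5.00, 42.50],
}

# Row layout: every row must be touched just to aggregate a single field.
total_from_rows = sum(row["amount"] for row in rows)

# Column layout: the values of interest are already stored together,
# which is what makes compression and aggregation so effective at scale.
total_from_columns = sum(columns["amount"])

assert total_from_rows == total_from_columns
```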
However, this means that loading data into a data warehouse requires a different process than the standard row-by-row insertion used by relational databases optimized for transaction processing. Instead, the data is prepared as files in blob storage and the warehouse is instructed to ingest those files, which it does in a massively parallel manner.
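As a rough sketch of the stage-and-ingest step, assuming S3-compatible blob storage and a DB-API connection to the target warehouse (the bucket, key, and table names are made up, and the exact COPY-style statement varies by warehouse):

```python
import boto3  # AWS SDK; any S3-compatible blob store client would work similarly

def stage_and_ingest(warehouse_conn, local_file, bucket, key, table):
    """Upload one prepared columnar file to blob storage, then ask the warehouse to ingest it.

    `warehouse_conn` is any DB-API connection to the target warehouse; the
    COPY-style statement below is illustrative and differs between products.
    """
    # 1) Stage the file in blob storage
    s3 = boto3.client("s3")
    s3.upload_file(local_file, bucket, key)

    # 2) Instruct the warehouse to ingest the staged file (massively parallel on its side)
    copy_sql = f"""
        COPY INTO {table}
        FROM 's3://{bucket}/{key}'
        FILE_FORMAT = (TYPE = PARQUET)
    """
    cur = warehouse_conn.cursor()
    cur.execute(copy_sql)
    warehouse_conn.commit()
```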
Such files must be of an optimal size, usually in the range of 100 to 200 megabytes each, which means that data from a single large table will be spread over many files.
One can prepare data in several different formats, depending on what the data warehouse supports; the typical options are CSV, Parquet, ORC, and Avro. Omni Loader supports the CSV and Parquet file formats in both uncompressed and compressed variants. For CSV we support GZIP compression, while for Parquet we support both GZIP and Snappy.
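For illustration, producing such files with common Python tooling (generic example code, not Omni Loader internals; the column names and file paths are made up) could look like this:

```python
import pandas as pd
import pyarrow as pa
import pyarrow.parquet as pq

# A tiny made-up batch of rows; in practice each output file holds 100-200 MB of data
df = pd.DataFrame({
    "order_id": [1, 2, 3],
    "amount": [19.99, 5.00, 42.50],
    "created_at": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03"]),
})

# CSV with GZIP compression
df.to_csv("orders_part_0001.csv.gz", index=False, compression="gzip")

# Parquet with Snappy (or "gzip") compression; data is laid out column by column
table = pa.Table.from_pandas(df)
pq.write_table(table, "orders_part_0001.snappy.parquet", compression="snappy")
```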
Migrating row-based data from the source is a multi-step process: the data has to be read sequentially, data types mapped to the types supported by the target database (and by the intermediate file format), the row-based structure transformed into a columnar one, the result compressed and uploaded to the data lake, and only then can the target warehouse be notified that there is data to ingest.
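To make those steps concrete, here is a minimal sketch of the read-and-transform portion, assuming a DB-API connection to the source and made-up names; it is not Omni Loader's actual implementation:

```python
import pyarrow as pa
import pyarrow.parquet as pq

def export_table_to_parquet(source_conn, query, out_prefix, rows_per_file=1_000_000):
    """Read rows sequentially from the source and write columnar, compressed chunk files.

    `source_conn` is any DB-API connection; `rows_per_file` would in practice be
    tuned so that each output file lands in the 100-200 MB range.
    """
    cur = source_conn.cursor()
    cur.execute(query)
    column_names = [desc[0] for desc in cur.description]

    part = 0
    while True:
        rows = cur.fetchmany(rows_per_file)          # sequential, row-based read
        if not rows:
            break
        # Pivot the row-based batch into a columnar structure
        columns = {name: [row[i] for row in rows] for i, name in enumerate(column_names)}
        table = pa.table(columns)
        # Compress and persist one chunk, ready to be uploaded to the data lake
        pq.write_table(table, f"{out_prefix}_{part:04d}.parquet", compression="snappy")
        part += 1
```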
Omni Loader hides the complexity of that process, so loading data into a data warehouse looks as if one were simply copying data from a standard OLTP source to a standard OLTP target. That is exactly what one wants: a boring and predictable copying process.
Omni Loader reads the data, transforms it into a columnar format, compresses it, uploads it to blob storage, and then has the data warehouse ingest the data files as they become ready.
All that without you having to set up any ETL pipelines.
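A rough sketch of how the final stages can be overlapped so that files are ingested as soon as they are ready (the `upload` and `ingest` callables are hypothetical placeholders for the blob-upload and COPY-style ingest steps shown earlier):

```python
from concurrent.futures import ThreadPoolExecutor

def load_files(file_paths, upload, ingest, workers=4):
    """Overlap uploads and ingestion so the warehouse consumes files as they become ready.

    `upload` and `ingest` are caller-supplied callables; the names are illustrative.
    """
    def stage_then_ingest(path):
        remote_key = upload(path)   # push the prepared file to blob storage
        ingest(remote_key)          # notify the warehouse it has data to ingest
        return path

    with ThreadPoolExecutor(max_workers=workers) as pool:
        for done in pool.map(stage_then_ingest, file_paths):
            print(f"loaded {done}")
```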
Ready to get started?
We support over 20 databases at this point and are very likely to support yours. Take a look at the whole list on our databases page.
Omni Loader easily migrates absolutely huge databases. Our competitors either require a lot of time-consuming setup or deliver performance that is not suitable for very large databases.
Please contact us and we will discuss your needs.
Absolutely. Please head over to our demo request page.
For an on-premises installation, please contact us to obtain a license. Once you do, you will get native executables for Windows and Linux. Run as many agents as your license allows to scale to the performance levels you need.
Absolutely - that's the core functionality of Omni Loader.
CDC is available for Oracle, SQL Server, and MySQL/MariaDB source databases. All target databases automatically support data ingestion from CDC-enabled sources.