Most DWA integrations refresh data one or more times per day. Depending on how the vendor's systems work, this is either by making incremental updates to the data in DWA (adding new records and updating existing records) or by totally replacing all records in DWA with a "full pull". Although DWA integrations generally provide historical data, this does not include the prior states of a record which has changed over time. In some cases, you may want to preserve a snapshot of certain data as it is now for future comparisons.
Upon request, KORE can automatically create daily snapshots of selected tables and/or views. To avoid increasing your Redshift storage costs, we store the snapshots in Amazon S3 instead of Redshift. You can access the snapshots as external tables in the
snapshotx external schema for analysis.
Note: KORE provides a large storage allowance for data snapshots, but excessive storage use may result in additional charges—contact your Success Manager for specific details. You can delete unneeded data from S3 at any time.
Example use cases
Marketing—Compare past snapshots with current data to measure the results of a multichannel campaign.
Sponsorship—Use snapshots to observe how a deal moves through different stages throughout its life and the effects on revenue, or search for year-over-year and month-over-month trends.
Secondary ticket market—Use snapshots to aggregate seat listings by section, then measure pricing fluctuations in the days or weeks leading up to an event to reveal trends in market demand.
To begin, connect to your data warehouse and create a view in the
custom schema. We recommend naming it
custom.v_snapshot_data_set_name. Use the view's SQL query to select a list of columns representing the data you wish to snapshot. This data may optionally come from more than one table or schema.
We discourage using the star (
*) operator in this query. If new columns are later added to the referenced table, this view will not include them. This could cause confusion in the future.
For example, suppose you have a data source named
texzt and you want to snapshot specific details from one of its tables:
CREATE OR REPLACE VIEW custom.v_snapshot_texzt_user_details AS ( SELECT user_id, given_name, family_name, cell_phone, response_rate FROM texzt.user_details; );
Verify that the view displays the desired results, then ask your KORE Success Manager to enable data snapshots for the view. You may create as many views and snapshots as you wish.
Warning: Do not delete this view after configuring snapshots. The system cannot take snapshots without it.
Once the first snapshot is taken, you can access your snapshots in the
snapshotx schema. This schema contains external tables which will be loaded from Amazon S3. You can write SQL queries to analyze the data from your snapshots.
All of the daily snapshots for a particular view are stored in the same external table. The external table is partitioned by date, and the
partition_date field contains a timestamp indicating the date when the snapshot was taken. (Currently, the time of day portion is always set to midnight. However, this could change in the future.)
Important: It is very important to limit queries using the
partition_date field. As the number of partitions to search increases, performance will greatly slow. For example, suppose you want to see how a record has changed over the past year—it would be dramatically faster to search for the record in one snapshot per month (12 partitions) instead of all the daily snapshots (365 partitions).
Suppose we want to observe how the record of a particular user has changed over the past three years. To limit the number of partitions to search, we can use the modulo operator (
%) to choose partition dates 30 days apart. (If the number of days between
partition_date and today is divisible by 30 with no remainder, we use that
SELECT * FROM snapshotx.texzt_user_details WHERE user_id = '12345abc' AND (DATEDIFF(DAY, GETDATE(), partition_date) < (365*3)) --search within past 3 years AND ((DATEDIFF(DAY, GETDATE(), partition_date) % 30) = 0); --one partition per 30 days
Change which data to capture
Every time KORE's system takes a snapshot, it refers to the view you created in the
custom schema. You can make changes to that view and thus change the data to be included in your snapshots without KORE's intervention. However, this has implications for your previous snapshots because a table can only have one definition.
If you change your view to include additional columns, those columns will be added to the table definition and your future snapshots will capture this data. Your existing snapshots will also gain this column, but the field will be
NULL in every record since we can't know what the value would have been when the snapshot was created.
If you change your view to remove columns, those columns will also be removed from the table definition during the next snapshot. This means that the data in that column will be inaccessible from Redshift although it will still be present in the S3 files. We recommend against doing this. Instead, please open a support ticket if you need a column to be deleted from existing snapshots.
To keep the existing data from a column but stop collecting it in future snapshots, we suggest creating a new view in the
custom schema for future snapshots. Once enabled, ask your KORE Success Manager to disable snapshots for the older view.
Contact your KORE Success Manager if you wish to stop taking snapshots of a view. We can deactivate the feature either temporarily or permanently. Please also specify if we should preserve or delete the existing snapshots.
If you instead delete the associated view in the
custom schema, the daily snapshot function will fail with errors. Your existing snapshots will remain accessible from Redshift but no additional snapshots will be taken. KORE will not be able to restore the deleted view if you change your mind, so we strongly advise against deleting the view.