Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

We are building a custom Windows-based application that extracts data from key tables in each of our clients' on-premise databases and uploads it to our cloud database for use in our Web Application.

This essentially 'syncs' the on-premise dataset with our cloud dataset for each user hourly. It is a one-way sync - no changes are made in the cloud.

The source and destination are different types of SQL database. We also have limited access to the source: we can only extract the specific information we need via read-only access to certain stored procedures. This is also a 'many-to-one' scenario, as we are syncing data from many source databases, so true 'replication' using existing products doesn't seem like an option for us.

Unfortunately, the dataset does not contain unique identifiers for all records, so we cannot process updates by comparing records. Our current process is to delete and re-import the most recent 7 days' worth of records to ensure our data is current. (We do not care about changes to data older than 7 days.) We delete the existing records first, then re-upload the data, a process which takes up to 5 minutes.

Depending on when the user of our Web App loads a particular page, they may see none/some/all of the data, based on the status of this delete/re-import process.

We are wondering how we can improve this - and have come up with a few ideas:

  • Have an 'active' and a 'passive' dataset. The 'active' dataset would be used by our Web App, whilst the sync process writes a second, 'passive' copy of the most recent data. Once the sync is complete, we perform a swap and make the 'passive' copy the 'active' copy. This should reduce our 5-minute window of concern to a matter of seconds.

  • Display a notification to the user that sync is in progress and they need to check later (this is not optimal, as the sync process can take up to 5 minutes - so this leads to poor user experience).

Assuming we cannot have unique identifiers added to all records in the source on-premise database - are there any other suggestions on how we can handle this that will provide our users with a good experience?

Question from: https://stackoverflow.com/questions/65840895/data-sync-between-two-databases-ensure-accuracy-in-application

He who fights with dragons too long becomes a dragon himself; gaze too long into the abyss, and the abyss will gaze back…
