PASI Synchronization Overview

This entire page is under construction and the content is in flux.

One of the goals of the PASI project is to have a shared view of student data across the province. To accomplish this goal, PASI Clients (schools and school authorities) need a mechanism that makes them aware of, and lets them synchronize with, the most current data available in the PASI Core. The Is Data Available service is that mechanism. The synchronization processes of a PASI Client use it heavily to determine whether the PASI Core contains data that is newer than what the client has.

A key non-functional requirement for the data synchronization capability is that PASI Clients must be made aware of changes to data in the PASI Core on a near real-time basis.

A key challenge to meeting this requirement is that the PASI Core has no way of notifying PASI Clients directly (the PASI Core is a set of services that PASI Clients call - not the other way around). Another challenge is determining which PASI Clients are actively interested in being made aware of updates in near real time (if a PASI Client is not currently online, the Core should not expend effort attempting to notify it).

With this in mind, the Is Data Available service uses a “long-polling” approach 1).

This service uses a number of architectural techniques including long polling, version tracking, and MD5 hashes. For more information about these techniques please review the Technical API documentation for Data Synchronization.
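
As a rough illustration of long polling in general (not of the actual PASI implementation), the sketch below uses an in-memory stand-in for the Core. The class `FakeCore`, its methods, and the timeout values are all hypothetical names chosen for this example. The stand-in holds each request open until a change arrives or a timeout elapses, and the client simply asks again after every response.

```python
import queue
import threading


class FakeCore:
    """In-memory stand-in for the PASI Core (hypothetical, for illustration only)."""

    def __init__(self):
        self._changes = queue.Queue()

    def publish(self, record_id, version):
        """Simulate a record being updated in the Core."""
        self._changes.put((record_id, version))

    def is_data_available(self, timeout_seconds):
        """Hold the request open until a change arrives or the timeout elapses."""
        try:
            first = self._changes.get(timeout=timeout_seconds)
        except queue.Empty:
            return []  # nothing new; the client is expected to ask again
        changes = [first]
        while True:  # drain anything else that is already waiting
            try:
                changes.append(self._changes.get_nowait())
            except queue.Empty:
                return changes


def poll_until(core, expected_count, timeout_seconds=0.2, max_polls=20):
    """Client side: re-issue the request after every response."""
    seen = []
    for _ in range(max_polls):
        seen.extend(core.is_data_available(timeout_seconds))
        if len(seen) >= expected_count:
            break
    return seen


core = FakeCore()
# A change lands in the Core shortly after the client starts waiting.
threading.Timer(0.05, core.publish, args=("student-1", 101)).start()
updates = poll_until(core, expected_count=1)
```

Because the client initiates every request, the Core never has to track or contact clients that are offline, which matches the constraint described above.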

In this synchronization model, the Is Data Available service is used to request updates to each entity within PASI. Each of these entities has an associated Get service. The Is Data Available service is used to determine if there are any updates available to be retrieved, and the Get service is used to retrieve the updated information.

Is Data Available returns to the PASI Client a list of record IDs and version numbers identifying the records that have been updated. The PASI Client then passes those record IDs and version numbers into the Get service to retrieve the updated details.

This process is centered around the PASI Is Data Available service, which is used to identify updates to the data in PASI, while specific services are used to retrieve the updated information depending on the information being requested.
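
One way a client might route the results of Is Data Available to the matching Get service is a simple lookup table keyed by the type of data requested. The sketch below is an assumption-laden illustration: the functions `get_student` and `get_course_enrolment`, the table `GET_SERVICES`, and the record shapes are hypothetical placeholders, not real PASI service names or payloads.

```python
def get_student(refs):
    """Hypothetical stand-in for a Get service that returns student details."""
    return [{"type": "Student", "id": rid, "version": ver} for rid, ver in refs]


def get_course_enrolment(refs):
    """Hypothetical stand-in for a Get service that returns course enrolments."""
    return [{"type": "CourseEnrolment", "id": rid, "version": ver} for rid, ver in refs]


# Maps the kind of data that was requested to the service that retrieves it.
GET_SERVICES = {
    "Student": get_student,
    "CourseEnrolment": get_course_enrolment,
}


def retrieve_updates(notification_type, refs):
    """Pass the (record id, version) pairs to the Get service for this type."""
    try:
        get_service = GET_SERVICES[notification_type]
    except KeyError:
        raise ValueError(f"no Get service registered for {notification_type!r}") from None
    return get_service(refs)


# Pretend Is Data Available reported two updated students.
updated = retrieve_updates("Student", [("s1", 101), ("s2", 102)])
```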

Note: As the PASI solution is running in a multi-node environment to support load balancing processes, there are rare race conditions where one node may know about information that another node does not know about (yet). In the event that this condition is realized, PASI will stop returning information in response to the Get service and the PASI Client will need to try again, as per the following diagram.
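
The retry described in this note could be handled on the client with a small bounded retry loop. The sketch below assumes that an empty Get response signals the race condition; that signal, the function `get_with_retry`, the helper class `FlakyGet`, and the attempt and delay values are all assumptions made for illustration.

```python
import time


def get_with_retry(get_service, refs, attempts=3, delay_seconds=0.0):
    """Call a Get service again when it returns nothing (assumed race signal)."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        records = get_service(refs)
        if records:
            return records
        if attempt < attempts - 1:
            time.sleep(delay_seconds)  # give the other node time to catch up
    return []  # still nothing; the caller can retry on its next sync pass


class FlakyGet:
    """Hypothetical Get service that returns nothing for the first few calls."""

    def __init__(self, empty_responses):
        self.remaining = empty_responses
        self.calls = 0

    def __call__(self, refs):
        self.calls += 1
        if self.remaining > 0:
            self.remaining -= 1
            return []
        return [{"id": rid, "version": ver} for rid, ver in refs]


service = FlakyGet(empty_responses=1)
records = get_with_retry(service, [("s1", 101)])
```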

Synchronization Model

Using "Max PASI Core Version"

At a high level, clients will do the following to retrieve any changes in PASI that they have not yet seen:

  1. The PASI Client calls Is Data Available to check if new data is available.
  2. If new data is available, the Core returns the Ids of the new data.
  3. For each Id returned, the PASI Client will:
    1. Call the appropriate “get” PASI Core service to retrieve the data.
    2. Store the retrieved data in the local copy of the PASI data.
    3. Capture the returned PASI Core Version if it is the highest seen so far.
  4. The PASI Client calls Is Data Available again, starting the process over using the highest PASI Core Version that was returned.

The Max PASI Core Version is used by PASI to determine which records are “newer” than what the client has seen previously.
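
The steps above can be sketched as a single synchronization pass. The example below runs against an in-memory dictionary rather than the real services: `CORE_RECORDS`, `is_data_available`, `get_record`, `sync_once`, and the field name `pasi_core_version` are hypothetical names used only to make the sketch runnable.

```python
# In-memory stand-in for data held by the PASI Core (hypothetical records).
CORE_RECORDS = {
    "s1": {"id": "s1", "name": "Ada", "pasi_core_version": 101},
    "s2": {"id": "s2", "name": "Ben", "pasi_core_version": 105},
    "s3": {"id": "s3", "name": "Cy", "pasi_core_version": 103},
}


def is_data_available(max_version_seen):
    """Return (id, version) pairs for records newer than the given version."""
    return [
        (rec["id"], rec["pasi_core_version"])
        for rec in CORE_RECORDS.values()
        if rec["pasi_core_version"] > max_version_seen
    ]


def get_record(record_id):
    """Stand-in for the appropriate 'get' service."""
    return CORE_RECORDS[record_id]


def sync_once(local_store, max_version_seen):
    """One pass of steps 1 to 3; returns the version to use for the next call."""
    highest = max_version_seen
    for record_id, _version in is_data_available(max_version_seen):
        record = get_record(record_id)
        local_store[record_id] = record                        # step 3.2
        highest = max(highest, record["pasi_core_version"])    # step 3.3
    return highest


local_store = {}
max_version = sync_once(local_store, 102)               # client had seen up to 102
next_max_version = sync_once(local_store, max_version)  # step 4: nothing newer
```

In this sketch the client starts from version 102, so only the two records with higher versions are fetched, and the second pass finds nothing new.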

Using "PASI Core Version Hash"

To handle the scenario where a client is potentially missing data that it should have (or has data the Core isn't aware of), a second strategy is used to identify differences between the client and the Core. With this optional approach, the client sends with its request an MD5 hash of all the object versions it holds for a specific data notification type (for example, loop through all the students and create a hash of the object versions). The Core compares this value with the hash it calculates itself. If the hash values differ (and the MaxPASICoreVersion values match), the client is out of sync with the Core.

In this scenario the Core returns a full list of Ids and PASICoreVersions for the requested type. The client can then compare each Id and PASICoreVersion to its local list and determine which record(s) are out of sync.
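
The comparison of the full list against the client's local list could look something like the sketch below. The function `find_out_of_sync` and the dictionary shapes (Id mapped to PASICoreVersion) are assumptions made for this example, not part of the PASI API.

```python
def find_out_of_sync(core_versions, local_versions):
    """Compare the Core's full {Id: PASICoreVersion} list with the local one.

    Returns two sorted lists:
      - Ids the client must fetch (missing locally or at a different version)
      - Ids the client holds that the Core did not list
    """
    to_fetch = sorted(
        record_id
        for record_id, version in core_versions.items()
        if local_versions.get(record_id) != version
    )
    unknown_to_core = sorted(set(local_versions) - set(core_versions))
    return to_fetch, unknown_to_core


core_list = {"s1": 101, "s2": 105, "s3": 103}
local_list = {"s1": 101, "s2": 104, "s9": 90}
to_fetch, unknown_to_core = find_out_of_sync(core_list, local_list)
```

Here `s2` is held at an older version and `s3` is missing locally, so both need to be fetched, while `s9` exists only on the client.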

Note: Records that are marked as deleted should not be included in the version hash calculation.
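
A minimal sketch of a client-side version hash follows. The source does not specify how the versions are ordered, separated, or encoded before hashing, so the choices here (sort by Id, join versions with commas, UTF-8 encode) are assumptions that would need to match whatever the Core actually does. The function `version_hash` and the field names are also hypothetical.

```python
import hashlib


def version_hash(records):
    """MD5 over the versions of all non-deleted records of one type.

    Assumed canonical form: versions ordered by record Id, comma-separated.
    """
    live = sorted(
        (rec["id"], rec["pasi_core_version"])
        for rec in records
        if not rec.get("deleted", False)  # deleted records are excluded
    )
    payload = ",".join(str(version) for _record_id, version in live)
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


students = [
    {"id": "s3", "pasi_core_version": 103},
    {"id": "s1", "pasi_core_version": 101},
    {"id": "s2", "pasi_core_version": 105, "deleted": True},
]
client_hash = version_hash(students)
```

Sorting before hashing keeps the result independent of the order in which the client happens to iterate its local records.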

Special Note on the "Status" Notification Types

The Notification Types that target status records (e.g. CourseEnrolmentStatus) can be used to synchronize both the status and the record the status relates to. With this knowledge, the PASI Client can use just the Status variant of the Notification Type with full confidence that it is being kept up to date on the main record as well as the status record.

The status record is always calculated after the main record has been updated in PASI. Without care, a PASI Client could synchronize an updated record and an updated status for that record but, because of the delay in processing status, end up with a status that does not reflect the most recent updates to the underlying record.

For example:

  1. A user updates a Course Enrolment record in PASI. After the save, the record has a PASI Core Version of 101.
  2. PASI triggers the Course Enrolment Status Processor to calculate the status for the Course Enrolment.
  3. The Course Enrolment Status Processor calculates status for the Course Enrolment and does the following:
    1. If needed, updates the status value for the Course Enrolment.
    2. Updates the PASI Core Version for the Course Enrolment Status. In this example it will be 102.
  4. A PASI Client uses the Is Data Available service with a Notification Type of “CourseEnrolmentStatus.YYYY” and a PASI Core Version of 100.
  5. PASI returns a result that contains the Course Enrolment Status record's Id and a PASI Core Version of 102.
  6. The PASI Client uses the Get Course Enrolment Status service to retrieve the updated status. The results from the service call contain the updated Course Enrolment Status (PASI Core Version 102) as well as the updated Course Enrolment record (PASI Core Version 101).
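
The last step of this example can be sketched as follows: the client stores both records from a single response and advances the version it tracks for the status Notification Type. The response shape, the field names, and the function `apply_status_result` are hypothetical and chosen only for illustration.

```python
def apply_status_result(result, enrolments, statuses, max_version_seen):
    """Store the status and the main record delivered in the same response."""
    enrolment = result["course_enrolment"]
    status = result["status"]
    enrolments[enrolment["id"]] = enrolment
    statuses[status["id"]] = status
    # The status version is what the next Is Data Available call builds on.
    return max(max_version_seen, status["pasi_core_version"])


# Hypothetical response mirroring the example: enrolment at 101, status at 102.
result = {
    "course_enrolment": {"id": "ce-1", "pasi_core_version": 101},
    "status": {"id": "ces-1", "enrolment_id": "ce-1", "pasi_core_version": 102},
}
enrolments, statuses = {}, {}
max_version = apply_status_result(result, enrolments, statuses, 100)
```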

Pros:

  • The PASI Client receives the updated Course Enrolment at the exact same time as they receive the updated Course Enrolment Status.
  • Only one Is Data Available call is necessary to receive these updates.

Cons:

  • The PASI Client will not receive the updated Course Enrolment until PASI has recalculated the Course Enrolment Status. Normally this is done almost immediately but it could be longer if the status processor gets backlogged.
1) See Data Synchronization for more information.