Core Concepts
This page explains multi‑device synchronization at a conceptual level. It focuses on models, roles, and flows you can apply to any stack, not on a specific implementation.
The Multi‑Device Synchronization Problem
You want the same data to appear consistently across a user’s devices (phone, laptop, tablet), even when devices go offline and reconnect at different times.
Challenges to solve:
- Conflicts: two devices may edit the same record while offline
- Ordering: changes must be applied in a consistent order
- Reliability: uploads can be partially applied; retries must be safe (idempotent)
- Performance: large datasets must sync incrementally and predictably
Key Players in the Sync System
- user_id
  - The person/tenant who owns the data; all sync operations are scoped to a single user
  - Guarantees isolation: one user’s changes never mix with another’s
- source_id
  - A stable, opaque identifier for one installation of your app on one device
  - Lets the system attribute each change to a device; enables excluding a device’s own changes from downloads (to avoid echo)
  - On the client side this is often called device_id; both terms refer to the same concept
- source_change_id
  - A per‑device monotonically increasing sequence number (1, 2, 3, …) assigned by the client to each local change
  - Combined with source_id, it forms an idempotency key that is unique within a user’s scope: (source_id, source_change_id)
- server_version
  - A monotonically increasing version maintained by the server for each row/object
  - Enables optimistic concurrency: clients submit the version they expect; the server increments it on success and reports a conflict when the expectation does not match
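As a rough illustration of how these four identifiers might travel together, here is a minimal sketch of a queued client change. The class name `LocalChange` and the fields `row_id`, `expected_version`, and `payload` are assumptions made for this example; they are not names defined by the system described on this page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LocalChange:
    """One client-side change as it might be queued for upload (hypothetical shape)."""
    user_id: str            # owner of the data; scopes every sync operation
    source_id: str          # opaque id of this app installation
    source_change_id: int   # per-device sequence number: 1, 2, 3, ...
    row_id: str             # assumption: identifies the row/object being changed
    expected_version: int   # server_version the client last saw for that row
    payload: Optional[dict] = None  # assumption: new row contents

    def idempotency_key(self) -> tuple:
        # A retry carries the same key, so the server can recognize it
        # as a change it has already seen.
        return (self.user_id, self.source_id, self.source_change_id)
```

Because the key is derived only from identity fields, resending the same change after a timeout yields the same key, whatever happened to the first attempt.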
Terminology note
source_id and device_id refer to the same concept. We use source_id on the service/server side; on the client side it is commonly called device_id. It must be a stable, opaque identifier for one app installation on one device.
Deep dive: identity, source_id (aka device_id), and idempotency
- What is source_id?
  - A random, persistent identifier for one device/app instance (typically UUIDv4). Generate on first launch and store in durable local storage.
  - Unique per user: two active devices for the same user must not share a source_id.
- Why it matters
  - The idempotency key (user_id, source_id, source_change_id) lets the server accept safe retries without duplicating changes.
  - If two devices share a source_id, their changes can collide and be treated as duplicates.
- Lifecycle best practices
  - New install: generate a fresh source_id; initialize source_change_id = 1
  - Reinstall/restore: reuse the old source_id only if no other active device uses it; otherwise, create a new one
  - Device replacement: prefer a new source_id; idempotency still holds because the key is per device
  - Rotation: if rotated, reset source_change_id to 1; never reuse the same (source_id, source_change_id) pair
- Privacy considerations
  - Keep source_id opaque and free of PII; it exists for sync semantics, not tracking
- Self‑exclusion
  - By default, downloads exclude rows produced by the same source_id (include_self=false)
  - Temporarily set include_self=true for bootstrap/hydration or device recovery when a device intentionally wants all server changes, including ones it created in the past
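The lifecycle rules above can be sketched on the client roughly as follows. This is an illustrative sketch only: the class `DeviceIdentity`, its JSON file layout, and the method names are assumptions, and a real app would use whatever durable storage its platform provides.

```python
import json
import uuid
from pathlib import Path


class DeviceIdentity:
    """Persists source_id and the next source_change_id in a small JSON file."""

    def __init__(self, path: Path):
        self.path = path
        if path.exists():
            state = json.loads(path.read_text())
        else:
            # New install: fresh random id, sequence starts at 1.
            state = {"source_id": str(uuid.uuid4()), "next_change_id": 1}
            path.write_text(json.dumps(state))
        self.source_id = state["source_id"]
        self._next = state["next_change_id"]

    def next_change_id(self) -> int:
        """Reserve the next sequence number, persisting the counter first."""
        value = self._next
        self._next += 1
        self._save()
        return value

    def rotate(self) -> None:
        """Switch to a new source_id and restart the sequence at 1."""
        self.source_id = str(uuid.uuid4())
        self._next = 1
        self._save()

    def _save(self) -> None:
        self.path.write_text(json.dumps(
            {"source_id": self.source_id, "next_change_id": self._next}))
```

Persisting the counter before handing out a number means a crash can leave a gap in the sequence but cannot cause the same (source_id, source_change_id) pair to be issued twice.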
The Big Picture: How the Flow Works
The system supports three core operations:
- Hydration (Bootstrap): initialize or recover a device by downloading a consistent snapshot
- Upload: send the device’s local changes to the server
- Download: fetch new server changes since the device’s last checkpoint
High‑level flow (end‑to‑end)
sequenceDiagram
autonumber
participant D as Device (Client)
participant A as Adapter (API boundary, e.g. HTTP server)
participant S as Sync Service
participant P as Persistence (DB)
Note over D: First install or recovery
D->>A: Hydration request (include_self=true)
A->>S: Request snapshot window and page
S->>P: Freeze window upper bound and page changes
P-->>S: Page of changes, next cursor, has_more
S-->>A: Page + window_until
A-->>D: Apply page and repeat until has_more=false
Note over D: Normal operation
D->>A: Upload local changes (idempotent keys)
A->>S: Process changes per-change (savepoints)
S->>P: Apply and attribute to user_id/source_id
S-->>A: Per-change statuses (applied/conflict/...)
A-->>D: Report statuses + highest sequence
D->>A: Download since last cursor (include_self=false)
A->>S: Request page within frozen window
S->>P: Page query using user isolation and cursor
P-->>S: Page of changes
S-->>A: Page + next cursor + has_more
A-->>D: Apply page and repeat until done
Hydration (client-side concept)
- Goal: bring a fresh or recovered device to an exact, consistent state
- Device requests a frozen window and downloads in pages
- include_self=true so the device receives all history relevant to the user, including changes originally uploaded under its own source_id (for example, before a reinstall or restore)
- The device applies each page atomically and advances its local cursor after each successful page
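The hydration loop described above might look roughly like this on the client. The `fetch_page` and `apply_page` callables and the shape of the page dictionary are assumptions made for this sketch; the real transport and response format depend on your adapter.

```python
from typing import Callable, Optional


def hydrate(fetch_page: Callable[[int, Optional[int]], dict],
            apply_page: Callable[[list], None]) -> int:
    """Download a frozen snapshot page by page and return the final cursor.

    fetch_page(after, window_until) is a hypothetical transport call that
    returns {"changes": [...], "next_cursor": int, "has_more": bool,
    "window_until": int} and is assumed to send include_self=true.
    """
    cursor = 0            # a fresh device has seen nothing yet
    window_until = None   # the server picks the upper bound on the first request
    while True:
        page = fetch_page(cursor, window_until)
        window_until = page["window_until"]  # reuse the same bound for later pages
        apply_page(page["changes"])          # should be atomic on a real device
        cursor = page["next_cursor"]         # advance only after a successful apply
        if not page["has_more"]:
            return cursor
```

If `apply_page` raises, the cursor has not moved, so the device can restart from the last page it applied successfully.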
Upload (Idempotent, Per‑change)
- Each local change is identified by (source_id, source_change_id) for safe retry
- The server processes each change in isolation (savepoint per change) to allow partial success
- Server returns a per‑change status:
  - applied: accepted and version advanced
  - conflict: version expectation failed; client must pull latest and re‑apply
  - invalid: bad payload or validation failure
  - materialize_error: accepted into sync state but failed secondary materialization (recorded for admin retry)
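To make the per‑change handling concrete, here is a small in‑memory sketch of server‑side upload processing. It is not the actual service: `SyncStore`, `process_upload`, and the change dictionary fields are hypothetical, plain dictionaries stand in for the database, and the per‑change isolation that a real server would get from savepoints is approximated by deciding each change's outcome before touching any state. The sketch also assumes that only applied outcomes are remembered for retries, which is one possible policy among several.

```python
from dataclasses import dataclass, field


@dataclass
class SyncStore:
    """In-memory stand-in for the server's sync storage (illustrative only)."""
    versions: dict = field(default_factory=dict)  # (user_id, row_id) -> server_version
    rows: dict = field(default_factory=dict)      # (user_id, row_id) -> payload
    seen: dict = field(default_factory=dict)      # idempotency key -> earlier status


def process_upload(store: SyncStore, user_id: str, source_id: str,
                   changes: list) -> list:
    """Handle each change independently and return one status per change."""
    results = []
    for ch in changes:
        key = (user_id, source_id, ch["source_change_id"])
        if key in store.seen:
            # Safe retry: report the earlier outcome without applying again.
            status = store.seen[key]
        elif not isinstance(ch.get("payload"), dict):
            status = "invalid"
        else:
            row = (user_id, ch["row_id"])
            current = store.versions.get(row, 0)
            if ch["expected_version"] != current:
                status = "conflict"  # client should pull the latest and re-apply
            else:
                store.rows[row] = ch["payload"]
                store.versions[row] = current + 1
                status = "applied"
                store.seen[key] = status  # assumption: remember applied changes only
        results.append({"source_change_id": ch["source_change_id"],
                        "status": status})
    return results
```

Uploading the same batch twice leaves each row's version unchanged the second time, which is the property that makes client retries safe.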
Download (Windowed and Paged)
- Device requests changes after its last known cursor
- Server freezes an upper bound (window_until) to avoid a moving target while paging
- By default, exclude own device’s changes (include_self=false) to prevent echo
- Device applies each page atomically, then advances its cursor; repeat until has_more=false
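On the server side, the windowing and self‑exclusion rules above could be expressed roughly as below. This is a sketch under assumptions: the function `page_changes`, the in‑memory `log`, and the global ordering field `server_seq` are invented for illustration, and a real implementation would run this as a database query.

```python
def page_changes(log, user_id, source_id, after, window_until=None,
                 include_self=False, page_size=100):
    """Return one page of a user's changes with after < server_seq <= window_until.

    `log` is a list of dicts ordered by a hypothetical global `server_seq`.
    """
    if window_until is None:
        # First page of a download: freeze the upper bound at the newest change.
        newest = log[-1]["server_seq"] if log else after
        window_until = max(after, newest)
    eligible = [
        c for c in log
        if c["user_id"] == user_id                          # user isolation
        and after < c["server_seq"] <= window_until         # inside the frozen window
        and (include_self or c["source_id"] != source_id)   # echo avoidance
    ]
    page = eligible[:page_size]
    has_more = len(eligible) > page_size
    # When the window is exhausted, move the cursor to its upper bound so
    # filtered-out changes are not rescanned on the next download.
    next_cursor = page[-1]["server_seq"] if has_more else window_until
    return {"changes": page, "next_cursor": next_cursor,
            "has_more": has_more, "window_until": window_until}
```

The client passes the returned `window_until` back on every later page, so changes that arrive while it is paging are left for the next download instead of shifting the current one.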
Materialization
Materialization is the process of converting sync data into your application’s business tables. When changes are uploaded to the server, they go through two phases:
- Sync Storage: Changes are first stored in the sync metadata tables (sidecar schema) where they’re versioned, validated, and prepared for distribution to other devices.
- Business Materialization: Optionally, changes can be materialized into your application’s business tables on the server side. This allows:
  - Server-side queries: Your backend can query business data directly without parsing sync metadata
  - Reporting and analytics: Business intelligence tools can access clean, structured data
  - API endpoints: REST APIs can serve data from materialized tables
  - Integrations: Other systems can consume data in familiar table formats
Key Points:
- Materialization is optional; sync works without it
- If materialization fails, the change is still accepted into sync storage and distributed to devices
- Failed materializations are logged for admin review and retry
- The sync system remains the source of truth; materialized tables are derived views
This separation keeps sync reliability independent of business logic complexity.
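The two phases and the failure rule can be illustrated with a short sketch. The function `accept_change`, the `failed` list, and the `materialize` callback are hypothetical names chosen for this example; they only mirror the behavior described above.

```python
def accept_change(sync_log: list, failed: list, change: dict, materialize) -> str:
    """Store a change in sync storage, then attempt optional materialization."""
    sync_log.append(change)  # phase 1: the change is now part of sync state
    if materialize is None:
        return "applied"     # materialization is optional
    try:
        materialize(change)  # phase 2: project into business tables
    except Exception as exc:
        # The change stays in sync_log; record the failure for a later retry.
        failed.append({"change": change, "error": str(exc)})
        return "materialize_error"
    return "applied"
```

Note that the append to `sync_log` happens before materialization is attempted, so a failing business rule cannot prevent the change from reaching other devices.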
Summary
- Identity and attribution (user_id, source_id) power isolation and echo‑avoidance
- Idempotency (source_id, source_change_id) makes retries safe and predictable
- Optimistic concurrency via server_version detects conflicts early
- Hydration uses a frozen window to establish a clean baseline; regular sync uses smaller uploads and paged downloads to stay up‑to‑date