
Channel Data Model

Adam Fraser edited this page Dec 9, 2015 · 5 revisions

This is a high-level overview of how channel and access information is stored in the _sync metadata.

What are channels?

Functionally, a channel defines read-side security for documents. Documents are assigned to zero or more channels, and users are granted access to zero or more channels. In order for a user to read a document through the Sync Gateway REST API (and by extension, via client replication), the document needs to be assigned to a channel that the user has access to.
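The read rule above amounts to a set intersection. A minimal Python sketch of that rule follows; `can_read` and its arguments are illustrative names, not part of Sync Gateway, and anything beyond the rule as stated here is not modeled.

```python
def can_read(doc_channels, user_channels):
    """Return True if the user has access to at least one of the document's channels.

    Both arguments are plain iterables of channel names (hypothetical inputs).
    """
    return not set(doc_channels).isdisjoint(user_channels)
```

Under this sketch, a document assigned to zero channels is not readable by any user, and a user with access to zero channels cannot read any document.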

Channels are associated with a particular revision of a document. All channel information for a document is stored in Sync Gateway's private _sync metadata block in the document itself. The _sync metadata isn't accessible to clients for read or write - it's managed by Sync Gateway, and is stripped from documents before they are replicated.
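As a rough illustration of the stripping step, the sketch below removes the metadata block from a document body before it is handed to a client. It assumes the block sits under a top-level `_sync` key of the stored document; `strip_sync_metadata` is a hypothetical helper, not a Sync Gateway function.

```python
def strip_sync_metadata(doc):
    """Return a copy of the document body without the private _sync block.

    Assumes _sync is a top-level key of the stored document; the stored
    document itself is left unmodified.
    """
    return {key: value for key, value in doc.items() if key != "_sync"}
```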

Channel information stored in Sync Gateway metadata

1. Channel information in the revision tree

Sync Gateway stores channel assignments in the document's revision tree. For each revision in the rev tree, we store the list of channels assigned to that revision of the document.
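To make the idea concrete, here is a minimal sketch of a rev tree that carries a channel list per revision. The `Revision` class, the dictionary layout, and the revision IDs are all assumptions made for illustration; this is not the on-disk format Sync Gateway uses.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Revision:
    rev_id: str
    parent: Optional[str]  # None for the root revision
    channels: List[str] = field(default_factory=list)


def channels_at(rev_tree: Dict[str, Revision], rev_id: str) -> List[str]:
    """Return the channels assigned to one specific revision."""
    return list(rev_tree[rev_id].channels)


# Hypothetical two-revision history: the document moves from
# channel_A to channel_B at its second revision.
rev_tree = {
    "1-aaa": Revision("1-aaa", None, ["channel_A"]),
    "2-bbb": Revision("2-bbb", "1-aaa", ["channel_B"]),
}
```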

2. Channel presence history

In addition to the information in the rev tree, we also store channel history for the active revision independently. The channels property in the _sync metadata lists all channels the document has ever been assigned to. For channels that aren't assigned to the active revision, the channels property records the revision at which the document was removed from that channel.

This property simplifies tracking of document channel assignment without requiring a full rebuild of the rev tree - particularly for use by views and in performance-sensitive Sync Gateway processing.

Sample data:

"channels": {
      "channel_B": null,
      "channel_A": {
        "seq": 22,
        "rev": "2-f38e39675f20d50803192f14232f2226"
      }
    },

In the above example, the current (active) revision of the document belongs to channel_B, which is why its entry is null. The document was previously assigned to channel_A, but that assignment was removed at revision 2-f38e39675f20d50803192f14232f2226 (sequence 22).

If a document is added to and removed from a channel multiple times, only the most recent removal is tracked in the channels property.
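The following Python sketch shows one way to read and maintain a channels property of this shape. The helper names are hypothetical, and the update logic in `apply_revision` is inferred from the description above rather than taken from Sync Gateway's code.

```python
def active_channels(channels):
    """Channels assigned to the active revision (entries whose value is null/None)."""
    return {name for name, removal in channels.items() if removal is None}


def removed_channels(channels):
    """Channels the document has left, mapped to their removal seq/rev."""
    return {name: removal for name, removal in channels.items() if removal is not None}


def apply_revision(channels, new_channels, seq, rev):
    """Return the channels property after a new active revision.

    Assumed behaviour: channels in new_channels become active (None);
    channels that were active and are now missing record this seq/rev.
    Older removals are overwritten, so only the latest one survives.
    """
    updated = dict(channels)
    for name in new_channels:
        updated[name] = None
    for name, removal in channels.items():
        if removal is None and name not in new_channels:
            updated[name] = {"seq": seq, "rev": rev}
    return updated


# The sample data from above, as a Python dict.
sample = {
    "channel_B": None,
    "channel_A": {"seq": 22, "rev": "2-f38e39675f20d50803192f14232f2226"},
}
```

With the sample data, `active_channels(sample)` contains only channel_B, and `removed_channels(sample)` holds the single channel_A entry.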

Channel metadata usage

The channels property has several use cases.

  1. Channel Index processing

Sync Gateway maintains its own channel index (i.e. identifying the set of documents that belong to a particular channel). This is done based on the Couchbase Server DCP feed - Sync Gateway nodes monitor the feed, and use the channel metadata included inline in the document to build the channel index. Prior to SG 1.2, this is an in-memory index that's supplemented by MR views (see #2). Post-1.2, SG also supports a persistent index that's shared across an SG cluster. In both cases, the processing of the DCP feed and indexing of documents into channels is a very performance-sensitive operation. The channel index is the main driver for replication to Couchbase Lite.

  2. Sync Gateway internal views

Sync Gateway maintains a set of internal MR views. The channels view is used to supplement the indexing work described in #1 - it returns the set of documents belonging to a given channel, ordered by sequence.

  3. Custom views

When end users create a view through the Sync Gateway REST API, we need to apply channel security to the results of that view. Currently this is done by modifying any user-created views to also emit the channel metadata, and then filtering on that metadata when returning view results.
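The use cases above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not Sync Gateway's implementation: the feed is represented as hypothetical `(doc_id, seq, channels)` tuples with each document appearing once, only active assignments are indexed (removals are ignored), and view rows are assumed to carry a `channels` list alongside their value.

```python
from collections import defaultdict


def build_channel_index(feed):
    """Group documents by channel, ordered by sequence within each channel.

    feed: iterable of (doc_id, seq, channels) where channels has the same
    shape as the channels property in the _sync metadata.
    """
    index = defaultdict(list)
    for doc_id, seq, channels in feed:
        for name, removal in channels.items():
            if removal is None:  # only channels of the active revision
                index[name].append((seq, doc_id))
    for entries in index.values():
        entries.sort()  # order each channel's entries by sequence
    return dict(index)


def filter_view_rows(rows, user_channels):
    """Keep only rows whose emitted channels overlap the user's channels."""
    allowed = set(user_channels)
    return [row for row in rows if allowed.intersection(row["channels"])]
```

`build_channel_index` corresponds loosely to items 1 and 2 (a per-channel list of documents ordered by sequence), and `filter_view_rows` to item 3 (channel security applied to custom view results).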
