Blockhead: Spaces and Blocks
Overview
Blockhead is responsible for everything related to blocks, including the
Block, BlockData, and BlockView entities as well as the Space entity and
sharing entities like SpaceMember and BlockSharingRule. Blockhead receives
BinaryBlockUpdate records and turns those into updated blocks.
This means that Blockhead has all the information it needs to authorize access to blocks, because it has the space, the block, the block sharing rules, and space memberships. The only exception to this is blocks in rooms, as the Messenger service has to provide Blockhead with room membership information if the room is not a space room.
Blockhead is also responsible for BlockResponse, which means it handles
comment-style experiences that are actually similar to the Message entity in
the Messenger service. Block and BlockResponse records can relate to files
in general and audio/video messages specifically, so Blockhead uses the
Blobby and
Stagehand gRPC APIs.
Blockhead is also responsible for storing user presence on the UserBlockData
and SpaceMember entities. This presence data is published to NATS by Dealer.
Blockhead/Messenger Integration
See Messenger for a description of how these services interact.
API
Blockhead exposes a synchronous gRPC API for CRUD operations on its entities.
Atomic Updates
Most block properties are editable via the UpdateBlockProperty gRPC method.
This mutation is designed to maximize conflict-free concurrency by accepting
only a single new property value, so the risk of conflict is low. Postgres
transactions are used to ensure that the block state at the time of read is
still the state at the time of write.
BinaryBlockUpdates and Pilot Tokens
The gRPC API includes a method for creating BinaryBlockUpdates. The fact that
Blockhead batches those into fewer database updates is an
implementation detail. The point is that it receives a single BinaryBlockUpdate
via gRPC and saves it to disk before returning OK.
Blockhead is not designed to handle character-by-character
BinaryBlockUpdates. The expectation is that end-user clients debounce their updates to some degree. Clients can send more frequent updates through Pilot/Dealer, but those aren't persisted.
When Blockhead receives and processes a BinaryBlockUpdate, it does no
authorization checks, unlike with its UpdateBlockProperty method. The idea
is that BinaryBlockUpdates are authorized with tokens, whereas
all other methods that mutate blocks are understood to be on behalf of an
authenticated but not necessarily authorized user, so Blockhead runs an authz
check for them.
Therefore, users need these tokens, so Blockhead provides a
BinaryBlockUpdateToken as a field on the Block Protobuf message, if the user
has edit access.
Additionally, Blockhead provides a Pilot Token whenever it
returns a Space or Block Protobuf message, regardless of whether the user
has edit access (permissions are embedded in the token), if the block is
Pilot-eligible.
A block type is generally considered Pilot-eligible if it:
- Is viewable in full-screen AND
- Has the ability to render an avatar stack and/or multiplayer cursors.
The following block types are Pilot-eligible:
- Page
- Goal
- Database
- Database Item
- Canvas
- Event
- Calendar
- Chart
Note that a block type may change after a Pilot Token is created. For example, a Database Item could be converted to a Page by being re-parented to a block type that is not a Database. This does not affect anything from Pilot's point of view, as neither it nor the Dealer service that it fronts has any block-type-specific logic. Blocks cannot be converted between Pilot-eligible and Pilot-ineligible types. That is to say, a Page cannot be converted to a Heading, and a Find a Time block cannot be converted to a Canvas. Even if they could, this would have no effect on Pilot or Dealer.
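The token rules in this section can be summarized with a small sketch. The function name `tokens_to_attach` and the string labels for the tokens are hypothetical; only the list of eligible block types and the two conditions (edit access, Pilot eligibility) come from the text.

```python
# Block types this document lists as Pilot-eligible.
PILOT_ELIGIBLE_TYPES = frozenset({
    "Page", "Goal", "Database", "Database Item",
    "Canvas", "Event", "Calendar", "Chart",
})


def tokens_to_attach(block_type, can_edit):
    """Return the set of tokens to include when returning a block to a user."""
    tokens = set()
    if can_edit:
        # BinaryBlockUpdateToken is only provided to users with edit access.
        tokens.add("BinaryBlockUpdateToken")
    if block_type in PILOT_ELIGIBLE_TYPES:
        # The Pilot Token is provided regardless of edit access;
        # permissions are embedded in the token itself.
        tokens.add("PilotToken")
    return tokens
```

For example, a read-only viewer of a Page would still receive a Pilot Token, while an editor of a Heading would receive only a BinaryBlockUpdateToken.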
Implementation of BinaryBlockUpdate Processing
One Blockhead instance is responsible for processing BinaryBlockUpdates at any one time. The 'leader election' for this is described at the bottom of this article.
The steps in BinaryBlockUpdate processing can be summarized as follows:
- Get all BinaryBlockUpdate records from Amazon Keyspaces grouped by blockId.
- Get the Block row from Postgres for the first blockId (repeat this for all other blocks that had at least one BBU).
- Use Yjs via a gRPC call to Fusion to combine the existing value from the Block row with each value from the BinaryBlockUpdate records.
- Save both the binary and JSON values to the Block row in Postgres.
- Delete the processed BinaryBlockUpdates from Keyspaces, explicitly providing each primary key.
A transaction around the read and write of each Block row in Postgres is necessary to ensure that two Blockhead instances don't both attempt to update the same Block row with different merged binary values, resulting in an overwrite. The whole point of using Yjs is to avoid conflicts.
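The processing steps above can be sketched in memory as follows. Everything here is a stand-in chosen for illustration: the `pending` list plays the role of the Keyspaces table, the `blocks` dictionary plays the role of the Postgres Block table, and `merge_updates` is a placeholder (simple concatenation) for the Yjs merge that Blockhead delegates to Fusion.

```python
from collections import defaultdict


def merge_updates(current, updates):
    """Placeholder for the Yjs merge done via Fusion (here: byte concatenation)."""
    return current + b"".join(updates)


def process_pending_updates(pending, blocks):
    """Merge pending updates into their blocks and return the processed update ids.

    pending: list of dicts with keys "id", "block_id", "payload" (bytes)
    blocks:  dict mapping block_id -> {"binary": bytes, "json": dict}
    """
    # Step 1: group the pending updates by block id.
    by_block = defaultdict(list)
    for update in pending:
        by_block[update["block_id"]].append(update)

    processed_ids = []
    for block_id, updates in by_block.items():
        # Steps 2-4: read the row, merge, and save both representations.
        # In Postgres this read and write would share one transaction.
        row = blocks[block_id]
        merged = merge_updates(row["binary"], [u["payload"] for u in updates])
        row["binary"] = merged
        row["json"] = {"text": merged.decode("utf-8")}  # placeholder JSON form
        processed_ids.extend(u["id"] for u in updates)

    # Step 5: delete exactly the processed updates, by primary key.
    done = set(processed_ids)
    pending[:] = [u for u in pending if u["id"] not in done]
    return processed_ids
```

Deleting by explicit id rather than clearing the whole table means an update that arrives while a batch is being merged is left in place for the next run.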
AV Features
Blockhead provides multiple gRPC methods that wrap Stagehand and Blobby, and provide Blockhead-specific authorization, for the purpose of supporting various live audio and video features. Blobby must also be called separately to facilitate the creation of direct upload URLs.
- GenerateBlockAvToken - This allows users to join an audio room inside the block UI.
- CreateFile (Blobby RPC with type set to video_stream) - This allows users to use Mux RTMP in order to create video message attachments for BlockResponses. The client still has to attach the File later with CreateBlockResponseAttachment, which forces the client to know whether they even tried to do so and allows Blockhead to verify that the file exists and was uploaded by this user.
- CreateFile (Blobby RPC with type set to video_stream) - This allows users to use Mux RTMP directly in order to create a video message attachment for a Block. The client still has to attach the File later with CreateBlockAttachment, which forces the client to know whether they even tried to do so and allows Blockhead to verify that the file exists and was uploaded by this user.
- CreateFile (Blobby RPC with type set to audio or other) - This allows users to upload files via Blobby that are then attached to a Block. This is used for audio messages as well as general files. This RPC returns either an S3 pre-signed URL or a Mux direct upload URL, depending on the file type stated in the request. The client still has to attach the File later with CreateBlockAttachment, which forces the client to know whether they even tried to do so and allows Blockhead to verify that the file exists and was uploaded by this user.
- CreateFile (Blobby RPC with type set to audio or other) - This allows users to upload files via Blobby that are then attached to a BlockResponse. This is used for audio messages as well as general files. This RPC returns either an S3 pre-signed URL or a Mux direct upload URL, depending on the file type stated in the request. The client still has to attach the File later with CreateBlockResponseAttachment, which forces the client to know whether they even tried to do so and allows Blockhead to verify that the file exists and was uploaded by this user.
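The verification performed at attach time (the file exists and was uploaded by the requesting user) might look like the following sketch. The `files` mapping and its `uploaded_by` field are hypothetical stand-ins for whatever file metadata Blobby returns; the real check would go through Blobby's gRPC API.

```python
def can_attach_file(files, file_id, user_id):
    """Return True only if the file exists and was uploaded by this user."""
    record = files.get(file_id)
    if record is None:
        return False  # the client never created (or never finished) the upload
    return record["uploaded_by"] == user_id
```

With `files = {"f1": {"uploaded_by": "u1"}}`, the call `can_attach_file(files, "f1", "u1")` succeeds, while a different user or an unknown file id is rejected.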
AvRoom Presence
To support rendering presence for an AvRoom related to a block, Blockhead's
Block entity includes an AvRoomPresence field, which it pulls from
Stagehand. Clients don't have to poll for this, as new presence values are
pushed directly via Dealer events. Note that Blockhead will often return this as
a null value if the block doesn't actually support having an associated
AvRoom.
NATS
Publication
Blockhead publishes to blockhead.change_feed.{entity}.created or
blockhead.change_feed.{entity}.updated whenever an entity it owns is created or changed.
Consumption
- dealer.presence.blockV1 and .spaceV1, to update each SpaceMember and UserBlockData last_seen_here_at value, as described below.
- facebox.change_feed.group_member.* to update SpaceMember records based on changes in group membership.
Application presence (last_seen_here_at)
After updating a UserBlockData and/or SpaceMember, Blockhead does not
publish to change_feed.block.updated or space_member.updated.
last_seen_here_at updates can be consumed directly from Dealer like Blockhead
does, but are not republished by Blockhead.
When processing .blockV1 presence, Blockhead must also potentially update the
related SpaceMember, if there is one. If a user is present in a block, then
they are also present in a space, if they are a member of it.
Note that Blockhead might get messages that list timestamps prior to the current
value in Blockhead. It should do nothing in those cases:
last_seen_here_at goes forward, not back.
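The forward-only rule and the block-to-space propagation can be sketched as below. The record dictionaries and the function names `apply_presence` and `handle_block_presence` are assumptions for illustration; only the `last_seen_here_at` field name comes from this document.

```python
from datetime import datetime, timezone


def apply_presence(record, seen_at):
    """Advance last_seen_here_at; ignore timestamps that are not newer."""
    current = record.get("last_seen_here_at")
    if current is not None and seen_at <= current:
        return False  # stale or duplicate message: do nothing
    record["last_seen_here_at"] = seen_at
    return True


def handle_block_presence(user_block_data, space_member, seen_at):
    """Handle a .blockV1 presence message for one user."""
    apply_presence(user_block_data, seen_at)
    # Presence in a block implies presence in its space, if the user is a member.
    if space_member is not None:
        apply_presence(space_member, seen_at)
```

Applying the rule independently to each record means a stale message for the block can still advance a SpaceMember whose own value is older.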
Databases
Blockhead uses Postgres to store relational data representing blocks, spaces,
and many types of block metadata, with the exception of the BinaryBlockUpdate
data, which is stored in Amazon Keyspaces (managed Cassandra) for optimized
write performance.
- Space - A space represents a context for users to collaborate. It has settings, members, and blocks. Each space is owned by a Facebox organization, so when a user attempts to read or write to the Space entity but isn't a space member, Blockhead has to check Facebox for that user's OrganizationRoles.
- SpaceMember - A SpaceMember connects a specific user to a specific space, and allows one or more space roles to apply to that user in that space. A user can only have one active space membership in a given space at a time.
- GroupSpaceMember - A Facebox group can be added to a space, such that Blockhead will create/update SpaceMember rows for each user who is in the Facebox group.
- SpaceRole - A space role is defined for a specific space and assigned to one or more space members. It provides permissions to the members it is assigned to and therefore determines whether a user is a guest or full member.
- SpaceRoleMemberAssignment - This is the join table between SpaceMembers and SpaceRoles.
- SpaceRoleGroupAssignment - This connects a SpaceRole to a GroupSpaceMember.
- SpaceMembershipTier - When self-service space joining is enabled, one or more tiers can be defined to represent the various prices/capabilities that a user could select when joining a space. Each SpaceMembershipTier can have an array of features corresponding to the features defined on the Space. Features are entirely for UI purposes in the context of paid space membership comparisons between different SpaceMembershipTiers.
- SpaceRoleMembershipTierAssignment - This is the join table between SpaceRoles and SpaceMembershipTiers. The lifecycle of a SpaceMember being created for a specific SpaceMembershipTier (and future cancellation of the subscription tied to the SpaceMember corresponding to that tier) includes adding/removing SpaceRoles from the SpaceMember based on what SpaceRoleMembershipTierAssignments exist for the SpaceMembershipTier.
- Block - A block is a visible object in a Space (or attached to a Messenger message), which may or may not have a parent block. Pivot is 'block-based' in a fundamental sense; this is the most important database table in the system.
- BlockRelationship - Blocks can have parents using the parent_block_id field on the block; for other types of relationships, we use a dedicated join table.
- BlockAttachment - A block attachment represents the association between a block and a fileId which is managed by Blobby. This file ID is resolved into a URL by Blobby upon request from Blockhead.
- BlockData - Many block types have distinct data types, where a block could have an arbitrary number of related records of various types. This could be represented by JSON columns on the block table, but that would put a vague upper limit on the amount of related data for a single block and limit Postgres' ability to help with data integrity. The BlockData table enables us to store distinct block values such as the names of properties associated with a Goal block as well as the values for those properties associated with child Goal blocks.
- BlockView - A block view is distinct from BlockData because it represents a very specific type of metadata about a block: views of it. These views represent the ways in which the BlockData and/or child blocks of the related block are visualized in lists, timelines, etc. This is central to Database and Goal blocks, for example.
- BlockVersion - A block version is a copy of the block and all its associated metadata (views, etc.) at a specific point in time, with the changes made by some number of users since the last version was created. Block versions enable 'version history', but leave the heavy lifting in terms of diffing and rendering to the client; on the server it is simply a copy of the data as JSON.
- BlockSharingRule - By default, members of a space have access to the blocks in that space. However, sharing rules can be created to adjust sharing on a block and its descendants. When determining if a user has access to a block, it is therefore necessary to look at the block and its ancestors' SharingRule records, and compare them to the user's space memberships, group memberships, space role(s), and membership tier.
- BlockResponse - Comments on text blocks, RSVPs to Find a Time blocks, and many other block-oriented messaging use cases are handled with this table. It is similar to the Message entity in Messenger, but optimized for block use cases.
- BlockResponseAttachment - This table represents the association between a BlockResponse and a fileId which is managed by Blobby. This file ID is resolved into a URL by Blobby upon request from Blockhead.
- UserBlockData - A user might have a historical record of various actions related to a block, such as when they last viewed it, whether they have 'checked it off' in the personal list of blocks, etc. A single user has at most a single UserBlockData for a specific blockId.
- BinaryBlockUpdate - When clients make binary block updates, as described above, we add them to this table in Amazon Keyspaces. This way, instead of making an update to the main Block table in Postgres each time a client makes a small change to a block's rich text content, we debounce them by storing them temporarily in this Keyspaces table before processing.
- BlockAutomationRun - Some blocks need a history of syncs with an external system or other recurring processes (such as a Workflow run). This is represented by a row per attempt (successful or not) at such an automation running.
- BlockAuthorizedConnection - In the future, this table will be used to relate the AuthorizedConnection in Facebox to a block that uses that connection. For example, a Synced Database block could use a GitHub OAuth connection represented by a specific AuthorizedConnection in Facebox.
- BlockheadLocks - This table is described later.
Block Tree
- The path column in the Blocks table must be calculated on block creation and recalculated when a block's descriptionSpaceId, homeSpaceId, parentBlockId, or attachedToMessageId value is changed (every block has only one of these defined). The path must be the list of ancestor block id values from top to self, inclusive. This also applies to all of the block's descendants.
- We also need to consider deleted blocks. We keep them in the same tree; they are not treated differently, because the descendants of a deleted block are still its descendants.
- If a block is more than 100 UUIDs from the topmost ancestor, we skip the lower ancestors in the path column and add |OVERFLOW| right before the final value, this block's ID. That ensures that all blocks can fit in this field and be queried fairly performantly, but allows Blockhead to have logic for looping through queries to build a tree in such edge cases.
- The write amplification in our tree model is unfortunate, but we are optimizing for reading the tree, not updating it. This also avoids recursive CTEs, which are a slow option in SQL databases; purpose-built graph databases, which would mean optimizing for moving blocks at the expense of all other block operations; and running a separate database for the tree.
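The path rule, including the overflow case, can be sketched as follows. The function name `build_path`, the use of a Python list for the path, and the exact boundary (keeping the 100 topmost ancestors) are assumptions; the document does not specify the storage format of the column or which side of the 100 limit is inclusive.

```python
MAX_ANCESTORS = 100            # assumed cut-off, taken from the "100 UUIDs" rule
OVERFLOW_MARKER = "|OVERFLOW|"


def build_path(ancestor_ids, block_id):
    """Build a block's path from its ancestors (topmost first) and its own id."""
    if len(ancestor_ids) <= MAX_ANCESTORS:
        # Normal case: every ancestor from top to self, inclusive.
        return [*ancestor_ids, block_id]
    # Overflow case: keep the topmost ancestors, skip the lower ones,
    # and put the marker right before this block's own id.
    return [*ancestor_ids[:MAX_ANCESTORS], OVERFLOW_MARKER, block_id]
```

When a block is re-parented, the same function would have to be re-run for the block and for each of its descendants, which is the write amplification the last bullet refers to.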
Temporal Workflows
N/A
Deployment
Observability
Security
Distributed Locking Mechanism in Postgres
The mechanism ensures that a particular operation (processing BinaryBlockUpdates, or BBUs) is performed by only one instance of the service at any given time in a horizontally scaled environment.
Database Schema
- Table Name: blockhead_locks
- Columns:
  - lock_key: VARCHAR, a constant value identifying the lock (e.g., 'bbu_lock').
  - instance_id: UUID, the unique identifier of the Blockhead instance holding the lock.
  - timestamp: TIMESTAMP, the time when the lock was last confirmed (updated).
SQL Schema Creation
CREATE TABLE blockhead_locks (
lock_key VARCHAR PRIMARY KEY,
instance_id UUID NOT NULL,
timestamp TIMESTAMP NOT NULL
);
Lock Acquisition
- Initialization: Each Blockhead instance generates a unique InstanceId (UUID) at startup.
- Acquiring the Lock:
  - The instance attempts to insert its InstanceId into the blockhead_locks table.
  - The lock is considered acquired if the insert operation is successful.

DO $$
BEGIN
  -- Try to insert the lock record
  INSERT INTO blockhead_locks (lock_key, instance_id, timestamp)
  VALUES ('bbu_lock', '<InstanceId>', CURRENT_TIMESTAMP)
  ON CONFLICT (lock_key) DO NOTHING;
  -- Check if the insert was successful
  IF FOUND THEN
    RAISE NOTICE 'Lock acquired';
  ELSE
    RAISE NOTICE 'Lock already held by another instance';
  END IF;
END $$;

- Maintaining the Lock:
  - The instance holding the lock updates the timestamp column every 5 seconds to indicate it's still active.
  - This update acts as a keep-alive signal.

UPDATE blockhead_locks
SET timestamp = CURRENT_TIMESTAMP
WHERE lock_key = 'bbu_lock' AND instance_id = '<InstanceId>';
Lock Expiry and Failover
- Expiry Check:
  - Other instances periodically check the timestamp of the lock_key.
  - If the timestamp is older than a predefined threshold (e.g., 10 seconds), the lock is considered expired.
- Acquiring an Expired Lock:
  - An instance can attempt to acquire the lock if it's deemed expired.

DO $$
BEGIN
  -- Check if the lock is expired and try to acquire it
  UPDATE blockhead_locks
  SET instance_id = '<InstanceId>', timestamp = CURRENT_TIMESTAMP
  WHERE lock_key = 'bbu_lock'
    AND timestamp < CURRENT_TIMESTAMP - INTERVAL '10 seconds';
  IF FOUND THEN
    RAISE NOTICE 'Expired lock acquired';
  ELSE
    RAISE NOTICE 'Lock not acquired';
  END IF;
END $$;
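To make the interplay between acquisition, keep-alive, and failover easier to follow, here is an in-memory model of the three statements. The `LockTable` class and its method names are hypothetical, and the dictionary merely stands in for the blockhead_locks table; time is passed in explicitly so the behavior can be checked without waiting.

```python
from datetime import datetime, timedelta

LOCK_KEY = "bbu_lock"
EXPIRY = timedelta(seconds=10)  # the example threshold used in this document


class LockTable:
    """In-memory stand-in for the blockhead_locks table."""

    def __init__(self):
        self.rows = {}  # lock_key -> (instance_id, timestamp)

    def try_acquire(self, instance_id, now):
        """Mirror of INSERT ... ON CONFLICT DO NOTHING: succeeds only if no row exists."""
        if LOCK_KEY in self.rows:
            return False
        self.rows[LOCK_KEY] = (instance_id, now)
        return True

    def heartbeat(self, instance_id, now):
        """Mirror of the keep-alive UPDATE: only the current holder can refresh."""
        holder = self.rows.get(LOCK_KEY)
        if holder is None or holder[0] != instance_id:
            return False
        self.rows[LOCK_KEY] = (instance_id, now)
        return True

    def try_take_over(self, instance_id, now):
        """Mirror of the expiry UPDATE: succeeds only if the timestamp is too old."""
        holder = self.rows.get(LOCK_KEY)
        if holder is None or holder[1] >= now - EXPIRY:
            return False
        self.rows[LOCK_KEY] = (instance_id, now)
        return True
```

In this model, an instance that loses the lock finds out on its next heartbeat, which returns False once another instance has taken over. A real deployment would want the previous holder to stop processing at that point.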