Asimov: AI Content Generation
Overview
Asimov is responsible for generative AI and other ML capabilities, including text generation (such as summarization and chapters), image generation, and speech transcription. Asimov generally uses third-party services rather than self-hosted models, but that is an implementation detail.
Asimov does not store generated data for synchronous requests (gRPC), just the logs associated with the generation job. The expectation is that if Asimov has a calling service, the generated data is simply returned to this service.
For asynchronous / implicit requests (that is, when Asimov generates content
based on data in another service that it learned about by consuming NATS
messages), Asimov stores the (meta)data it generated, keyed by the id of that
other entity. The key use case here is Blobby's NATS change_feed alerting
Asimov to a new audio or video file. Blobby and Asimov have no direct
interaction; Asimov simply chooses, in some cases, to create metadata for such
files.
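The difference between the two paths can be sketched as follows. This is a minimal illustration, not Asimov's actual code: `AssetStore`, the handler names, and the event shape (`id`, `type`) are all hypothetical, and the generation functions are passed in as plain callables.

```python
from dataclasses import dataclass, field


@dataclass
class AssetStore:
    """In-memory stand-in for Asimov's metadata storage (hypothetical)."""
    by_external_id: dict = field(default_factory=dict)

    def put(self, external_id: str, kind: str, payload: str) -> None:
        # Generated metadata is keyed by the id of the entity owned elsewhere.
        self.by_external_id.setdefault(external_id, {})[kind] = payload


def handle_sync_request(prompt: str, generate) -> str:
    """Synchronous path: return the result to the caller, persist nothing."""
    return generate(prompt)


def handle_change_feed_event(event: dict, store: AssetStore, transcribe) -> bool:
    """Implicit path: Asimov decides on its own whether to create metadata."""
    # Assumed event shape: {"id": <file id>, "type": "audio" | "video" | ...}
    if event.get("type") not in ("audio", "video"):
        return False
    store.put(event["id"], "transcript", transcribe(event["id"]))
    return True
```

In this sketch only the change-feed handler touches the store, which mirrors the rule that synchronous results belong to the calling service.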
AI-generated data is centralized in Asimov for encapsulation and adaptability, given how quickly the AI field is evolving. Keeping AI data handling and processing inside Asimov means AI capabilities can be changed and improved quickly and in isolation, independently of other services such as Blobby. It also keeps AI-related dependencies and complexity out of Blobby and similar services, so they retain their operational focus and stability however AI develops.
Blobby depends on Asimov (implicitly) for creating metadata for audio and video files, and Friend (and potentially Blockhead) depends on it for block-related AI capabilities. Whether Friend or Blockhead is the dependent service depends on whether the result of an AI operation is returned to the end user, who may then persist it, or passed directly to Blockhead to be persisted to a block server-side.
Asimov is responsible for authorization from a rate-limiting and
SubscribableFeature standpoint at the userId and organizationId levels.
This covers both whether a user/organization has access to an 'AI feature' and
rate limiting that access. This creates a gRPC dependency on
Wallstreet to determine access and quotas, but
the usage data itself lives in Asimov.
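A rough sketch of that split of responsibilities is shown below. Everything here is an assumption made for illustration: `Entitlement`, `UsageTracker`, `authorize`, and the `fetch_entitlement` callable (standing in for the gRPC call to Wallstreet) are hypothetical names, and the rule that both the user and the organization must pass both checks is one plausible reading, not a documented policy.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entitlement:
    """What the entitlement lookup reports for one feature (hypothetical shape)."""
    has_feature: bool
    quota: int  # maximum operations allowed in the current period


@dataclass
class UsageTracker:
    """Usage counters held by Asimov, keyed by (scope id, feature)."""
    counts: dict = field(default_factory=dict)

    def used(self, scope_id: str, feature: str) -> int:
        return self.counts.get((scope_id, feature), 0)

    def record(self, scope_id: str, feature: str) -> None:
        self.counts[(scope_id, feature)] = self.used(scope_id, feature) + 1


def authorize(user_id, organization_id, feature, fetch_entitlement, usage) -> bool:
    """Allow the operation only if user and organization both pass both checks."""
    scopes = (user_id, organization_id)
    for scope_id in scopes:
        # Access and quota come from the external service (assumed call shape).
        entitlement = fetch_entitlement(scope_id, feature)
        if not entitlement.has_feature:
            return False
        # Usage is compared against counters that live in Asimov.
        if usage.used(scope_id, feature) >= entitlement.quota:
            return False
    for scope_id in scopes:
        usage.record(scope_id, feature)
    return True
```

Usage is only recorded once every check has passed, so a denied request does not consume quota in this sketch.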
Third-Party Services
OpenAI
- Room message thread summarization
- Text generation for blocks
- DALL-E image generation for blocks
- Search result summarization
- Embeddings for Turbopuffer (Quest handles embeddings, potentially using OpenAI or another service; Asimov is not directly involved in Quest's embedding process)
AssemblyAI
- Audio (and video audio) transcription
- Transcript summarization
- Auto-chapters
API
Asimov allows other backend services to get assets for a given ExternalId but
this should not be used by API services, as they don't know about authorization.
For example, Friend should not have a query to retrieve an audio transcript from
Asimov for a given file, because Friend does not know if the user has access to
that file. Instead, Messenger should retrieve the audio transcript from Asimov
at the time it returns the room recording to the client.
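The pattern described above might look like the following sketch, in which the service that owns the authorization decision checks access first and only then asks Asimov for assets. The function name, the `can_access` and `fetch_assets` callables, and the dictionary shapes are hypothetical and stand in for whatever Messenger and Asimov actually expose.

```python
class AccessDenied(Exception):
    """Raised when the user may not see the requested recording."""


def get_room_recording(user_id, recording, can_access, fetch_assets):
    """Return a recording with its transcript, checking authorization first.

    `can_access(user_id, file_id)` is the owning service's own check;
    `fetch_assets(file_id)` stands in for the backend call to Asimov.
    """
    file_id = recording["file_id"]
    if not can_access(user_id, file_id):
        # Asimov is never consulted for a file the user cannot see.
        raise AccessDenied(file_id)
    assets = fetch_assets(file_id)
    return {**recording, "transcript": assets.get("transcript")}
```

The point of the ordering is that Asimov's lookup stays authorization-free while the caller, which knows about file access, remains the gatekeeper.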
NATS
Publication
Asimov publishes a message to asimov.change_feed each time an entity it owns
is created or modified.
Consumption
Asimov consumes blobby.change_feed.file.audio.created and .video.created
using Asimov's own queue-style JetStream stream.
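To make the subject filtering concrete, here is a small pure-Python matcher that follows NATS subject wildcard rules (`*` matches exactly one token, `>` matches one or more trailing tokens). It is an illustration only and does not use a NATS client. It assumes that .video.created above is shorthand for blobby.change_feed.file.video.created; `CONSUMED_SUBJECTS` and `should_consume` are hypothetical names.

```python
def subject_matches(pattern: str, subject: str) -> bool:
    """Return True if `subject` matches `pattern` under NATS wildcard rules."""
    p_tokens = pattern.split(".")
    s_tokens = subject.split(".")
    for i, p in enumerate(p_tokens):
        if p == ">":
            # '>' is only valid as the last token and needs at least one token.
            return i == len(p_tokens) - 1 and len(s_tokens) > i
        if i >= len(s_tokens):
            return False
        if p != "*" and p != s_tokens[i]:
            return False
    return len(p_tokens) == len(s_tokens)


# Assumed full subjects for the two consumed events.
CONSUMED_SUBJECTS = (
    "blobby.change_feed.file.audio.created",
    "blobby.change_feed.file.video.created",
)


def should_consume(subject: str) -> bool:
    """True if a message on `subject` is one Asimov is interested in."""
    return any(subject_matches(p, subject) for p in CONSUMED_SUBJECTS)
```

Listing the two subjects explicitly, rather than filtering on `blobby.change_feed.file.*.created`, keeps other file types out of the stream in this sketch.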
Databases
Asimov uses Amazon Keyspaces (managed Cassandra) for logging operations / tracking durable rate limits as well as for the actual storage of generated metadata.
- Asset - Asset is Asimov's generic name for a thing that it has generated immutable metadata for without being synchronously asked to do so, such as a transcript for a Blobby audioFile.
- Operation - A log of a user (or Asimov itself, based on NATS consumption) starting (and possibly succeeding at) an operation, such as generating text or an image.
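As an illustration of how these two entities differ, the sketch below models them as Python dataclasses. The field names are assumptions chosen for the example and are not the actual Keyspaces schema; the only properties taken from the descriptions above are that an Asset is immutable and keyed to an external entity, and that an Operation records a start and, possibly, a success.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Asset:
    """Immutable generated metadata for an entity owned by another service."""
    external_id: str  # e.g. the id of a Blobby audioFile
    kind: str         # e.g. "transcript" (hypothetical value)
    content: str


@dataclass
class Operation:
    """Log entry for one generation attempt."""
    kind: str                               # e.g. "text" or "image"
    started_at: datetime
    user_id: Optional[str] = None           # None when Asimov started it itself
    succeeded_at: Optional[datetime] = None  # stays None unless it succeeds

    def mark_succeeded(self, when: Optional[datetime] = None) -> None:
        self.succeeded_at = when or datetime.now(timezone.utc)
```

Freezing `Asset` makes accidental mutation raise an error, while `Operation` stays mutable so that success can be recorded after the fact.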