Snapshot and Publishing Workflow Analysis

Overview

The system uses two separate but complementary publishing mechanisms:

- is_published (per-entity): Individual content item visibility
- snapshot.status (per-snapshot): Batch version control

How They Work Together

1. Entity-Level Publishing (`is_published`)

Purpose: Content moderation and individual item review workflow

- Scope: Per entity (domain, trail, concept, spark)
- Default Values:
  - domains: DEFAULT true (published by default)
  - trails, concepts, sparks: DEFAULT false (unpublished by default)
- Set During Import: Values come from the bundle JSON file
- Use Case: Allows fine-grained control over which items are visible

Example:

{
  "domains": [
    { "name": "ML", "is_published": true },
    { "name": "AI", "is_published": false }
  ],
  "trails": [
    { "title": "Intro ML", "is_published": true },
    { "title": "Advanced ML", "is_published": false }
  ]
}

In this bundle, "ML" and "Intro ML" are visible; "AI" and "Advanced ML" are hidden.
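The defaults listed above can be sketched in TypeScript. This is a minimal illustration rather than the importer's actual code: `EntityKind`, `COLUMN_DEFAULTS`, and `resolveIsPublished` are hypothetical names, and the sketch assumes that an entity omitting `is_published` in the bundle falls back to the column default.

```typescript
// Hypothetical names throughout; only the default values come from the text above.
type EntityKind = "domain" | "trail" | "concept" | "spark";

// Column defaults as described: domains are published by default, the rest are not.
const COLUMN_DEFAULTS: Record<EntityKind, boolean> = {
  domain: true,
  trail: false,
  concept: false,
  spark: false,
};

// Assumption: an explicit value in the bundle wins; a missing value uses the default.
function resolveIsPublished(kind: EntityKind, fromBundle?: boolean): boolean {
  return fromBundle ?? COLUMN_DEFAULTS[kind];
}

console.log(resolveIsPublished("domain"));        // true
console.log(resolveIsPublished("trail"));         // false
console.log(resolveIsPublished("trail", true));   // true
console.log(resolveIsPublished("domain", false)); // false
```

The `??` operator only falls back on `undefined` or `null`, so an explicit `false` in the bundle is preserved instead of being replaced by the default.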

2. Snapshot-Level Publishing (`snapshot.status`)

Purpose: Batch version control and atomic deployments

- Scope: Entire snapshot (all entities imported together)
- States:
  - draft: Import in progress, not visible to default queries
  - published: Active snapshot, visible to default queries
  - archived: Old snapshot, only visible via explicit snapshot_id query
- Set During Import: Created as draft, promoted to published via --promote flag
- Use Case: Atomic batch deployments, version rollback, preview before publish

Example Workflow:

# 1. Import creates draft snapshot
deno task import bundle.json
# → snapshot.status = 'draft'
# → Not visible in default queries

# 2. Preview draft content
GET /discovery/domains?snapshot_id=<draft-id>
# → Shows content even if snapshot is draft

# 3. Promote to published
deno task import bundle.json --promote
# → snapshot.status = 'published'
# → Becomes current active snapshot
# → Visible in default queries
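The status lifecycle above can be modelled as a small state transition. The sketch below is illustrative only: the `Snapshot` shape and the `promote` function are hypothetical, and it assumes that at most one snapshot is published at a time, so the previously published snapshot becomes archived when a draft is promoted. The source does not confirm how `--promote` handles the old snapshot.

```typescript
type SnapshotStatus = "draft" | "published" | "archived";

interface Snapshot {
  id: string;
  status: SnapshotStatus;
}

// Promote one draft snapshot and return a new list; the input is not mutated.
// Assumption: the previously published snapshot is archived on promotion.
function promote(snapshots: Snapshot[], id: string): Snapshot[] {
  const target = snapshots.find((s) => s.id === id);
  if (!target) throw new Error(`Unknown snapshot: ${id}`);
  if (target.status !== "draft") {
    throw new Error(`Only draft snapshots can be promoted (got ${target.status})`);
  }
  return snapshots.map((s): Snapshot => {
    if (s.id === id) return { ...s, status: "published" };
    if (s.status === "published") return { ...s, status: "archived" };
    return s;
  });
}

const before: Snapshot[] = [
  { id: "v1", status: "published" },
  { id: "v2", status: "draft" },
];
const after = promote(before, "v2");
console.log(after.map((s) => `${s.id}=${s.status}`).join(", "));
// v1=archived, v2=published
```

Swapping both statuses in one pass is what makes the deployment atomic from a reader's point of view: default queries see either the old snapshot or the new one, never a mix.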

Current Query Logic

Default Queries (No `snapshot_id` parameter)

// 1. Get current snapshot ID (or null if none)
const snapshotId = await getEffectiveSnapshotId(supabase, null);

// 2. Filter by snapshot (if exists)
if (snapshotId) {
  query = query.eq("snapshot_id", snapshotId);
} else {
  query = query.is("snapshot_id", null); // Legacy data only
}

// 3. ALWAYS filter by is_published
query = query.eq("is_published", true);

Result: Returns only published entities from the current published snapshot (or legacy data if no snapshot).

Explicit Snapshot Queries (`snapshot_id` parameter)

// 1. Use requested snapshot ID
const snapshotId = await getEffectiveSnapshotId(supabase, requestedId);

// 2. Filter by snapshot
query = query.eq("snapshot_id", snapshotId);

// 3. ALWAYS filter by is_published
query = query.eq("is_published", true);

Result: Returns only published entities from the specified snapshot (even if snapshot is draft).
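Both query paths can be exercised against in-memory data to check the results stated above. This is a sketch under assumptions, not the real data layer: `resolveSnapshotId` is a hypothetical stand-in for `getEffectiveSnapshotId` (whose implementation is not shown in this document), and array filtering replaces the Supabase query builder.

```typescript
// In-memory model of the two query paths; all names here are illustrative.
type SnapshotStatus = "draft" | "published" | "archived";

interface Snapshot {
  id: string;
  status: SnapshotStatus;
}

interface Entity {
  name: string;
  snapshot_id: string | null; // null marks legacy rows
  is_published: boolean;
}

// Assumed behavior: an explicit request wins; otherwise use the published
// snapshot; otherwise null.
function resolveSnapshotId(
  snapshots: Snapshot[],
  requestedId: string | null,
): string | null {
  if (requestedId !== null) return requestedId;
  const current = snapshots.find((s) => s.status === "published");
  return current ? current.id : null;
}

// Snapshot filter first, then the is_published filter.
// When the resolved ID is null, only legacy rows (snapshot_id === null) match.
function queryEntities(
  entities: Entity[],
  snapshots: Snapshot[],
  requestedId: string | null = null,
): string[] {
  const snapshotId = resolveSnapshotId(snapshots, requestedId);
  return entities
    .filter((e) => e.snapshot_id === snapshotId && e.is_published)
    .map((e) => e.name);
}

const snapshots: Snapshot[] = [
  { id: "s1", status: "published" },
  { id: "s2", status: "draft" },
];
const entities: Entity[] = [
  { name: "Legacy", snapshot_id: null, is_published: true },
  { name: "ML", snapshot_id: "s1", is_published: true },
  { name: "AI", snapshot_id: "s1", is_published: false },
  { name: "ML v2", snapshot_id: "s2", is_published: true },
  { name: "AI v2", snapshot_id: "s2", is_published: false },
];

console.log(JSON.stringify(queryEntities(entities, snapshots)));       // ["ML"]
console.log(JSON.stringify(queryEntities(entities, snapshots, "s2"))); // ["ML v2"]
console.log(JSON.stringify(queryEntities(entities, [])));              // ["Legacy"]
```

The second call previews the draft snapshot and still hides "AI v2", which shows that `is_published` applies even when a snapshot is requested explicitly.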

Is There Redundancy?

Yes, but with Different Purposes

Redundancy exists because both mechanisms control visibility, but they serve different use cases:

Aspect	`is_published`	`snapshot.status`
Granularity	Per entity	Per batch/snapshot
Use Case	Content moderation	Version control
Workflow	Individual review	Atomic deployment
Flexibility	Fine-grained	All-or-nothing

Potential Issues

1. Published Snapshot with Unpublished Entities
   - A snapshot can be status='published' (active) but contain entities with is_published=false
   - Result: Snapshot is active but some content is hidden
   - Is this intentional? Yes - allows partial content rollout
2. Draft Snapshot with Published Entities
   - A snapshot can be status='draft' but contain entities with is_published=true
   - Result: Content is ready but snapshot isn't active
   - Is this intentional? Yes - allows preview before promotion
3. Double Filtering
   - Both filters are always applied: snapshot_id AND is_published=true
   - Is this redundant? Partially - but necessary for:
     - Previewing draft snapshots (need is_published filter)
     - Legacy data support (need snapshot_id filter)
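The combinations discussed above reduce to a single predicate, which can be enumerated as a truth table. The `isVisible` function is a hypothetical summary of the query logic, not code from the system, and it assumes that the published snapshot is the one default queries resolve to.

```typescript
type SnapshotStatus = "draft" | "published" | "archived";

// Is an entity returned, given its snapshot's status, its own flag, and whether
// the caller passed an explicit snapshot_id for that snapshot?
function isVisible(
  status: SnapshotStatus,
  isPublished: boolean,
  explicit: boolean,
): boolean {
  // Default queries select only the published snapshot; explicit queries select any.
  const snapshotSelected = explicit || status === "published";
  // The entity-level flag is applied in both cases.
  return snapshotSelected && isPublished;
}

const statuses: SnapshotStatus[] = ["draft", "published", "archived"];
for (const status of statuses) {
  for (const isPublished of [true, false]) {
    console.log(
      `${status} is_published=${isPublished}`,
      `default=${isVisible(status, isPublished, false)}`,
      `explicit=${isVisible(status, isPublished, true)}`,
    );
  }
}
```

Neither flag can override the other: an entity with `is_published=false` is hidden in every row, and a draft or archived snapshot is hidden from default queries regardless of its entities.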

Recommended Workflow

Scenario 1: Batch Import with All Content Ready

# 1. Import bundle with all entities published
deno task import bundle.json --promote
# → All entities: is_published=true
# → Snapshot: status='published'
# → Immediately visible in production

Scenario 2: Staged Rollout

# 1. Import bundle with some entities unpublished
deno task import bundle.json --promote
# → Some entities: is_published=true (ready)
# → Some entities: is_published=false (not ready)
# → Snapshot: status='published'
# → Only published entities visible

-- 2. Later: publish the remaining entities (SQL, run against the database;
--    repeat per entity table as needed)
UPDATE domains SET is_published = true WHERE id IN (...);
-- → The updated entities are now visible
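The second step of the staged rollout can be rehearsed in memory before touching the database. The sketch below is an assumption-laden illustration: the `Entity` shape and the `pending` and `publish` helpers are hypothetical, and `publish` only mimics the effect of the `UPDATE` statement on an array.

```typescript
interface Entity {
  id: number;
  title: string;
  is_published: boolean;
}

// Entities that queries would still hide.
function pending(entities: Entity[]): Entity[] {
  return entities.filter((e) => !e.is_published);
}

// In-memory counterpart of: UPDATE ... SET is_published = true WHERE id IN (...)
// Returns a new array; the input is left untouched.
function publish(entities: Entity[], ids: number[]): Entity[] {
  const wanted = new Set(ids);
  return entities.map((e) => (wanted.has(e.id) ? { ...e, is_published: true } : e));
}

const trails: Entity[] = [
  { id: 1, title: "Intro ML", is_published: true },
  { id: 2, title: "Advanced ML", is_published: false },
];

const toPublish = pending(trails).map((e) => e.id);
console.log(JSON.stringify(toPublish)); // [2]

const rolledOut = publish(trails, toPublish);
console.log(pending(rolledOut).length); // 0
```

Listing the pending IDs first gives a reviewable set for the `WHERE id IN (...)` clause, which keeps the rollout explicit about what becomes visible.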

Scenario 3: Preview Before Publish

# 1. Import as draft
deno task import bundle.json
# → All entities: is_published=true
# → Snapshot: status='draft'
# → Preview via: GET /discovery/domains?snapshot_id=<draft-id>

# 2. Verify and promote
deno task import bundle.json --promote
# → Snapshot: status='published'
# → Now visible in default queries

Potential Simplification

Option 1: Remove `is_published` (Use Only `snapshot.status`)

Idea: If all content in a snapshot should be published together, use only snapshot.status.

Pros:
- Simpler model
- Atomic publishing (all or nothing)
- Less redundancy

Cons:
- No fine-grained control
- Can't do staged rollouts
- Can't preview individual items

Option 2: Remove `snapshot.status` (Use Only `is_published`)

Idea: Use only entity-level publishing, no snapshot versioning.

Pros:
- Simpler model
- More flexible per-entity control

Cons:
- No atomic batch deployments
- No version rollback
- No preview before publish
- Harder to manage large imports

Option 3: Keep Both (Current Approach) ✅

Recommendation: Keep both mechanisms because they serve different purposes:

- is_published: Content moderation workflow (individual items)
- snapshot.status: Version control workflow (batch deployments)

Benefits:
- Fine-grained content control
- Atomic batch deployments
- Preview before publish
- Staged rollouts
- Version history

Conclusion

While there is some redundancy between is_published and snapshot.status, they serve complementary purposes:

- is_published = "Is this individual piece of content ready?"
- snapshot.status = "Is this entire version/batch active?"

The current design allows for:
1. ✅ Fine-grained content moderation
2. ✅ Atomic batch deployments
3. ✅ Preview before publish
4. ✅ Staged rollouts
5. ✅ Version history

Recommendation: Keep both mechanisms as they provide flexibility for different use cases.