The Sacra API supports time-windowed queries across /documents, /news/company, and /events, so you can poll for changes on a schedule instead of re-fetching everything. This guide walks through setting up a daily sync that fetches only what’s changed in the last 24 hours.
How sync works
All three endpoints accept updated_at_gte and updated_at_lte query parameters to filter results to a time window. Combine these with cursor pagination to walk through all changes since your last sync.
| Endpoint | Response key | Pagination | Time filter support |
|---|---|---|---|
| /api/v1/documents | documents | Cursor (default) | created_at_gte/lte, updated_at_gte/lte |
| /api/v1/news/company | company_news | Cursor (default) | created_at_gte/lte, updated_at_gte/lte |
| /api/v1/events | events | Cursor (opt-in: pagination=cursor) | created_at_gte/lte, updated_at_gte/lte |
Domain-less news queries (no company_domain parameter) require a date window of 14 days or less. A daily 24-hour window is well within this limit.
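If you build windows dynamically, a small guard can catch violations of this limit before the API rejects the request. A minimal sketch; the helper name and constant are illustrative, not part of the API:

```javascript
// Illustrative guard: verify a window is within the 14-day limit that
// applies to domain-less news queries.
const MAX_NEWS_WINDOW_DAYS = 14;

function windowWithinNewsLimit(gteIso, lteIso) {
  const windowMs = new Date(lteIso).getTime() - new Date(gteIso).getTime();
  return windowMs <= MAX_NEWS_WINDOW_DAYS * 24 * 60 * 60 * 1000;
}

console.log(windowWithinNewsLimit("2026-04-16T00:00:00Z", "2026-04-17T00:00:00Z")); // true (1 day)
console.log(windowWithinNewsLimit("2026-04-01T00:00:00Z", "2026-04-17T00:00:00Z")); // false (16 days)
```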
Query structure
Each sync request uses the same pattern: a 24-hour window on updated_at with cursor pagination to handle multiple pages.
Documents
curl -H "Authorization: Token YOUR_API_KEY" \
"https://sacra.com/api/v1/documents?updated_at_gte=2026-04-16T00:00:00Z&updated_at_lte=2026-04-17T00:00:00Z"
News
curl -H "Authorization: Token YOUR_API_KEY" \
"https://sacra.com/api/v1/news/company?updated_at_gte=2026-04-16T00:00:00Z&updated_at_lte=2026-04-17T00:00:00Z"
Events
Events require pagination=cursor to use cursor pagination:
curl -H "Authorization: Token YOUR_API_KEY" \
"https://sacra.com/api/v1/events?pagination=cursor&updated_at_gte=2026-04-16T00:00:00Z&updated_at_lte=2026-04-17T00:00:00Z"
All three endpoints return a pagination object with next_link when there are more results. Follow next_link until it’s null to fetch all pages.
{
  "pagination": {
    "page_size": 30,
    "current_page_items": 30,
    "total_items": 85,
    "next_link": "https://sacra.com/api/v1/documents?updated_at_gte=...&page_after=abc123",
    "prev_link": null,
    "after_cursor": "abc123",
    "before_cursor": null
  }
}
Use the next_link URL directly — it includes all your original query parameters plus the cursor.
Complete sync script
This Node.js script polls all three endpoints with a 24-hour window and collects every changed record. You can wire the results into your database, search index, or file system.
const API_BASE = "https://sacra.com/api/v1";
const API_KEY = process.env.SACRA_API_TOKEN;

async function fetchAllPages(url) {
  const results = [];
  let nextUrl = url;
  while (nextUrl) {
    const res = await fetch(nextUrl, {
      headers: { Authorization: `Token ${API_KEY}` },
    });
    if (!res.ok) {
      throw new Error(`${res.status} ${res.statusText}: ${nextUrl}`);
    }
    const json = await res.json();
    // Each endpoint wraps results in a different key
    const items =
      json.documents || json.company_news || json.events || [];
    results.push(...items);
    nextUrl = json.pagination?.next_link || null;
  }
  return results;
}

function buildTimeWindow() {
  const now = new Date();
  const yesterday = new Date(now.getTime() - 24 * 60 * 60 * 1000);
  // Format as ISO 8601 for the API
  const gte = yesterday.toISOString();
  const lte = now.toISOString();
  return { gte, lte };
}

async function sync() {
  const { gte, lte } = buildTimeWindow();
  const timeParams = new URLSearchParams({
    updated_at_gte: gte,
    updated_at_lte: lte,
  });
  const eventsParams = new URLSearchParams({
    pagination: "cursor",
    updated_at_gte: gte,
    updated_at_lte: lte,
  });
  console.log(`Syncing changes from ${gte} to ${lte}`);
  const [documents, news, events] = await Promise.all([
    fetchAllPages(`${API_BASE}/documents?${timeParams.toString()}`),
    fetchAllPages(`${API_BASE}/news/company?${timeParams.toString()}`),
    fetchAllPages(`${API_BASE}/events?${eventsParams.toString()}`),
  ]);
  console.log(
    `Fetched ${documents.length} documents, ${news.length} news items, ${events.length} events`
  );
  // Replace this with your storage logic
  await saveDocuments(documents);
  await saveNews(news);
  await saveEvents(events);
}

// Placeholder functions — replace with your database/storage logic
async function saveDocuments(docs) {
  for (const doc of docs) {
    console.log(`  doc: ${doc.slug} — ${doc.title} (updated ${doc.updated_at})`);
    // e.g. db.documents.upsert({ id: doc.id, ...doc })
  }
}

async function saveNews(items) {
  for (const item of items) {
    console.log(`  news: ${item.short_headline} — ${item.company?.domain} (${item.release_date})`);
    // e.g. db.news.upsert({ id: item.id, ...item })
  }
}

async function saveEvents(events) {
  for (const event of events) {
    console.log(`  event: ${event.event_name} — ${event.company?.domain} (${event.event_date})`);
    // e.g. db.events.upsert({ id: event.event_id, ...event })
  }
}

sync().catch((err) => {
  console.error("Sync failed:", err);
  process.exit(1);
});
Run it manually to verify:
SACRA_API_TOKEN=your_key_here node sync.js
Expected output:
Syncing changes from 2026-04-16T00:00:00.000Z to 2026-04-17T00:00:00.000Z
Fetched 3 documents, 12 news items, 5 events
doc: kraken — Kraken One-Pager (updated 2026-04-16T19:03:59.691Z)
doc: owner — Owner One-Pager (updated 2026-04-16T21:01:20.492Z)
...
Schedule with cron
Once the script works, schedule it to run daily. A common choice is once per day at midnight UTC.
Linux/macOS crontab
Add:
0 0 * * * SACRA_API_TOKEN=your_key_here /usr/local/bin/node /path/to/sync.js >> /var/log/sacra-sync.log 2>&1
GitHub Actions
name: Sacra Sync
on:
  schedule:
    - cron: "0 0 * * *" # midnight UTC daily
  workflow_dispatch: # allow manual trigger
jobs:
  sync:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: node sync.js
        env:
          SACRA_API_TOKEN: ${{ secrets.SACRA_API_TOKEN }}
Store your API key in environment variables or a secret manager. Never hardcode it in the script or commit it to source control.
Tracking sync state
The script above uses a rolling 24-hour window, which is simple but may miss items if a sync run is skipped or delayed. For more robust syncing, persist the timestamp of your last successful sync and use that as the gte boundary.
const { readFileSync, writeFileSync } = require("fs");

const STATE_FILE = ".sacra-sync-state.json";

function getDefaultLastSyncTime() {
  return new Date(Date.now() - 24 * 60 * 60 * 1000).toISOString();
}

function getLastSyncTime() {
  try {
    const state = JSON.parse(readFileSync(STATE_FILE, "utf-8"));
    const lastSync = state?.last_sync;
    if (typeof lastSync === "string") {
      const parsedLastSync = new Date(lastSync);
      if (!Number.isNaN(parsedLastSync.getTime())) {
        return parsedLastSync.toISOString();
      }
    }
  } catch {
    // Ignore read/parse errors and fall back to a safe default below
  }
  // First run or invalid state — default to 24 hours ago
  return getDefaultLastSyncTime();
}

function saveLastSyncTime(timestamp) {
  writeFileSync(STATE_FILE, JSON.stringify({ last_sync: timestamp }));
}

async function sync() {
  const gte = getLastSyncTime();
  const lte = new Date().toISOString();
  const timeParams = `updated_at_gte=${gte}&updated_at_lte=${lte}`;
  console.log(`Syncing changes from ${gte} to ${lte}`);
  const [documents, news, events] = await Promise.all([
    fetchAllPages(`${API_BASE}/documents?${timeParams}`),
    fetchAllPages(`${API_BASE}/news/company?${timeParams}`),
    fetchAllPages(`${API_BASE}/events?pagination=cursor&${timeParams}`),
  ]);
  console.log(
    `Fetched ${documents.length} documents, ${news.length} news items, ${events.length} events`
  );
  await saveDocuments(documents);
  await saveNews(news);
  await saveEvents(events);
  // Only update the checkpoint after everything succeeds
  saveLastSyncTime(lte);
  console.log(`Checkpoint saved: ${lte}`);
}
This way, if a run fails midway, the next run retries from the same starting point.
If you’re running on GitHub Actions or a stateless environment, store the checkpoint in your database or an S3 object instead of a local file.
Using created_at vs updated_at
The endpoints support both created_at and updated_at filters. Which you use depends on what you’re trying to capture:
| Filter | Catches | Misses |
|---|---|---|
| updated_at_gte/lte | New items and edits to existing items | Nothing — this is the recommended default |
| created_at_gte/lte | Only newly created items | Updates to previously synced items |
Use updated_at for sync workflows. It catches both new and modified records in a single pass.
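If you still need to distinguish brand-new records from edits within an `updated_at` sync (for example, to trigger different downstream handling), you can compare each item's `created_at` against the window start. A minimal sketch; the function name is illustrative:

```javascript
// Illustrative classifier: an item created inside the sync window is new;
// anything else caught by the updated_at filter is an edit.
function classifyItem(item, windowStartIso) {
  const created = new Date(item.created_at).getTime();
  const windowStart = new Date(windowStartIso).getTime();
  return created >= windowStart ? "created" : "updated";
}

console.log(classifyItem({ created_at: "2026-04-16T12:00:00Z" }, "2026-04-16T00:00:00Z")); // "created"
console.log(classifyItem({ created_at: "2026-04-10T12:00:00Z" }, "2026-04-16T00:00:00Z")); // "updated"
```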
Scoping to specific companies
If you only care about a subset of companies, add company_domain to scope results:
# Only Kraken documents
curl -H "Authorization: Token YOUR_API_KEY" \
"https://sacra.com/api/v1/documents?company_domain=kraken.com&updated_at_gte=2026-04-16T00:00:00Z&updated_at_lte=2026-04-17T00:00:00Z"
This works for all three endpoints. Company-scoped queries don’t have the 14-day window limit that domain-less news queries have.
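For a watchlist of companies, you can generate one scoped URL per domain and feed each through the same pagination helper. A sketch under the same parameter names as the curl example above; the watchlist and helper name are illustrative:

```javascript
// Illustrative helper: build one company-scoped sync URL per watched domain.
function buildScopedUrls(base, endpoint, gte, lte, domains) {
  return domains.map((domain) => {
    const params = new URLSearchParams({
      company_domain: domain,
      updated_at_gte: gte,
      updated_at_lte: lte,
    });
    return `${base}/${endpoint}?${params.toString()}`;
  });
}

const urls = buildScopedUrls(
  "https://sacra.com/api/v1",
  "documents",
  "2026-04-16T00:00:00Z",
  "2026-04-17T00:00:00Z",
  ["kraken.com", "owner.com"]
);
console.log(urls.length); // 2
```

Each resulting URL can then be passed to `fetchAllPages` from the sync script, optionally with a small delay between companies to stay friendly to rate limits.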
Tips
- Upsert, don’t insert. Use id (documents, news) or event_id (events) as the primary key and upsert on each sync. An item updated twice in 24 hours will appear once in the response with its latest state.
- Overlap your windows slightly. If you’re using a fixed 24-hour window rather than checkpoint-based sync, consider a small overlap (e.g., 25 hours) to avoid missing items updated right at the boundary.
- Handle empty pages. Some days may have no changes for a given endpoint. The response will have an empty array and total_items: 0 — this is normal.
- Log what you sync. The script above logs each synced item. In production, log the counts and any errors so you can audit sync health.
- Rate limiting. The Sacra API has rate limits. If you’re syncing large volumes, add a small delay between paginated requests. For daily syncs with a 24-hour window, you’re unlikely to hit limits.
- See the changelog. For full details on the sync-related API additions, see Content Synchronization Updates.
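The overlap-and-dedupe combination from the tips above can be sketched in a few lines. The helper names are illustrative; the id field varies by endpoint (id for documents and news, event_id for events):

```javascript
// Widen the fixed window by an hour so boundary items aren't missed...
function overlappedWindow(now = new Date(), overlapHours = 25) {
  const gte = new Date(now.getTime() - overlapHours * 60 * 60 * 1000);
  return { gte: gte.toISOString(), lte: now.toISOString() };
}

// ...then dedupe by primary key so items caught twice across runs are
// stored only once, keeping the last (latest) copy seen.
function dedupeById(items, idField = "id") {
  const seen = new Map();
  for (const item of items) seen.set(item[idField], item); // last write wins
  return [...seen.values()];
}

console.log(overlappedWindow(new Date("2026-04-17T00:00:00Z")).gte); // 2026-04-15T23:00:00.000Z
console.log(dedupeById([{ id: 1, v: "a" }, { id: 1, v: "b" }, { id: 2, v: "c" }]).length); // 2
```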