
Extract

TL;DR

The extract service uses AI to pull structured data from URLs or raw text. Define the output shape with a Zod schema to get fully typed results, or pass a plain JSON Schema to get untyped output. Point it at a URL or pass in text, and get structured data back. Every extraction includes usage metrics (input tokens, output tokens, latency). Nested schemas with arrays and optional fields are supported.

Extract structured data from content using AI and Zod schemas. Get typed output with usage metrics.

How do I set up the extract service?

    import { CMDOPClient } from '@cmdop/node';
    import { z } from 'zod';

    // Connect to the cloud relay (no machine needed -- extraction runs server-side)
    const client = await CMDOPClient.remote({ apiKey: 'cmdop_xxx' });

How do I extract data with a Zod schema?

    // Define a Zod schema describing the shape of data to extract
    const ProductInfo = z.object({
      name: z.string(),
      price: z.number(),
      currency: z.string(),
      features: z.array(z.string()),
      inStock: z.boolean(),
    });

    // Run extraction against a URL -- AI reads the page and fills the schema
    const result = await client.extract.runSchema({
      prompt: 'Extract product information',
      schema: ProductInfo,
      url: 'https://example.com/product/123',
    });

    // result.data is fully typed as { name: string; price: number; ... }
    console.log(result.data.name);
    console.log(result.data.price);
    console.log(result.data.features);

How do I extract data with a JSON Schema?

    // Use a plain JSON Schema object instead of Zod (output is untyped)
    const result = await client.extract.runSchema({
      prompt: 'Extract contact details',
      jsonSchema: {
        type: 'object',
        properties: {
          name: { type: 'string' },
          email: { type: 'string', format: 'email' },
          phone: { type: 'string' },
        },
        required: ['name', 'email'],
      },
      content: 'John Doe, [email protected], +1-555-0123', // Raw text input
    });
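Because output from a JSON Schema extraction is untyped, callers may want to narrow it before use. The following is a minimal sketch of a runtime type guard for the contact schema above. The `ContactDetails` interface and `isContactDetails` function are hypothetical helpers written for this example, not part of the SDK, and `extracted` stands in for `result.data`.

```typescript
// Hypothetical shape mirroring the JSON Schema above; not an SDK type.
interface ContactDetails {
  name: string;
  email: string;
  phone?: string; // not listed in `required`, so it may be absent
}

// Runtime type guard: narrows an unknown value to ContactDetails.
function isContactDetails(value: unknown): value is ContactDetails {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v['name'] === 'string' &&
    typeof v['email'] === 'string' &&
    (v['phone'] === undefined || typeof v['phone'] === 'string')
  );
}

// `extracted` stands in for result.data from a jsonSchema extraction.
const extracted: unknown = { name: 'Jane Roe', email: 'jane@example.com' };
if (isContactDetails(extracted)) {
  console.log(extracted.email); // typed as string inside this branch
}
```

The guard checks only field presence and primitive types. It does not validate the `format: 'email'` constraint, which would need a separate check.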

How do I extract from raw text content?

    // Extract from a raw text string instead of a URL
    const result = await client.extract.runSchema({
      prompt: 'Extract all dates mentioned',
      schema: z.object({
        dates: z.array(z.object({
          date: z.string(),    // The extracted date string
          context: z.string(), // Surrounding context explaining the date
        })),
      }),
      content: longTextContent, // Pass raw text via the content parameter
    });

How do I access extraction metrics?

Every extraction result includes usage metrics:

    const result = await client.extract.runSchema({
      prompt: 'Extract product info',
      schema: ProductInfo,
      url: 'https://example.com/product',
    });

    // Access token usage and latency from the metrics object
    console.log(result.metrics.inputTokens);  // Tokens sent to the AI model
    console.log(result.metrics.outputTokens); // Tokens generated by the AI model
    console.log(result.metrics.latencyMs);    // Total extraction time in milliseconds
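When running many extractions, it can help to aggregate these metrics across a batch. The sketch below assumes only the three documented metric fields. The `ExtractMetrics` interface and `summarizeMetrics` function are hypothetical helpers for illustration, and the sample numbers are made up.

```typescript
// Mirrors the three metrics fields documented for extraction results.
interface ExtractMetrics {
  inputTokens: number;
  outputTokens: number;
  latencyMs: number;
}

// Sums token counts and averages latency over a batch of extractions.
function summarizeMetrics(all: ExtractMetrics[]) {
  const totalInputTokens = all.reduce((sum, m) => sum + m.inputTokens, 0);
  const totalOutputTokens = all.reduce((sum, m) => sum + m.outputTokens, 0);
  const totalLatencyMs = all.reduce((sum, m) => sum + m.latencyMs, 0);
  return {
    count: all.length,
    totalInputTokens,
    totalOutputTokens,
    // Avoid dividing by zero for an empty batch.
    avgLatencyMs: all.length === 0 ? 0 : totalLatencyMs / all.length,
  };
}

// Usage with illustrative numbers (not real measurements).
const summary = summarizeMetrics([
  { inputTokens: 1200, outputTokens: 150, latencyMs: 900 },
  { inputTokens: 800, outputTokens: 90, latencyMs: 700 },
]);
console.log(summary);
```

In practice the array would be built from `result.metrics` of each call, for example by pushing it after every `runSchema()` invocation.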

What parameters does runSchema() accept?

runSchema(options)

| Parameter | Type | Description |
| --- | --- | --- |
| options.prompt | string | Extraction instructions |
| options.schema | ZodType | Zod schema for typed output |
| options.jsonSchema | object | JSON Schema (alternative to Zod) |
| options.url | string | URL to extract from |
| options.content | string | Raw content to extract from |

Provide either url or content, not both.
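The url/content rule can be enforced on the caller's side before a request is made. The sketch below is one possible approach, assuming that exactly one of the two must be present. The `ExtractSource` type and `resolveSource` function are hypothetical and are not the SDK's own option types.

```typescript
// Hypothetical type for illustration; the SDK's real option types may differ.
// Each variant forbids the other field, so passing both fails at compile time.
type ExtractSource =
  | { url: string; content?: never }
  | { content: string; url?: never };

// Runtime check for options assembled dynamically (e.g. from user input).
function resolveSource(opts: { url?: string; content?: string }): ExtractSource {
  const hasUrl = typeof opts.url === 'string' && opts.url.length > 0;
  const hasContent = typeof opts.content === 'string' && opts.content.length > 0;
  // Reject both "neither" and "both": exactly one source is assumed required.
  if (hasUrl === hasContent) {
    throw new Error('Provide exactly one of `url` or `content`.');
  }
  return hasUrl
    ? { url: opts.url as string }
    : { content: opts.content as string };
}

// Usage: the returned object can be spread into the runSchema() options.
const source = resolveSource({ url: 'https://example.com/product/123' });
console.log(source);
```

The union type catches mistakes in literal option objects at compile time, while the runtime check covers values that arrive from configuration or user input.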

Result

| Field | Type | Description |
| --- | --- | --- |
| data | T | Typed extracted data |
| metrics.inputTokens | number | Input tokens used |
| metrics.outputTokens | number | Output tokens used |
| metrics.latencyMs | number | Extraction latency in ms |