Recipes API

Overview

Recipes are the foundation of Scrape Loop, defining the structure and rules for data extraction. Each recipe contains:

  • Target URL and selectors
  • Data extraction rules
  • Pagination configuration
  • Execution settings

Endpoints

List Recipes

Retrieve a list of all your scraping recipes.

GET /recipes

Query Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| statuses | string | Comma-separated list of statuses (active, paused, deleted) |
| limit | integer | Number of items to return |
| skip | integer | Number of items to skip |
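
As an illustration of how these query parameters combine, here is a minimal sketch that builds the request URL. The base URL `https://api.example.com` and the helper name `list_recipes_url` are placeholders, not part of the documented API; authentication is not covered in this section and is omitted.

```python
from urllib.parse import urlencode

# Hypothetical base URL: this section only documents the path /recipes.
BASE_URL = "https://api.example.com"


def list_recipes_url(statuses=None, limit=None, skip=None):
    """Build the GET /recipes URL from the documented query parameters."""
    params = {}
    if statuses:
        # `statuses` is sent as a single comma-separated string.
        params["statuses"] = ",".join(statuses)
    if limit is not None:
        params["limit"] = limit
    if skip is not None:
        params["skip"] = skip
    # safe="," keeps the comma readable instead of encoding it as %2C.
    query = urlencode(params, safe=",")
    return f"{BASE_URL}/recipes" + (f"?{query}" if query else "")


print(list_recipes_url(statuses=["active", "paused"], limit=20, skip=40))
# https://api.example.com/recipes?statuses=active,paused&limit=20&skip=40
```

Omitting every argument yields the bare `/recipes` URL, which leaves filtering and paging at whatever defaults the server applies.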

Get Recipe

Retrieve a single recipe by its ID.

GET /recipes/{id}

Path Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| id | string | Recipe ID |

Get Recipe Jobs

Retrieve all jobs associated with a recipe.

GET /recipes/{id}/jobs

Path Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| id | string | Recipe ID |

Query Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| status | string | Filter by job status (pending, running, completed, failed) |
| limit | integer | Number of items to return |
| skip | integer | Number of items to skip |
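
The `limit` and `skip` parameters suggest offset-style paging. The sketch below walks through all jobs for a recipe by increasing `skip` until a short page comes back. It assumes the endpoint returns a plain list of job objects, which this section does not confirm; `fetch_page` is a hypothetical callable standing in for the actual HTTP request, and `fake_fetch` exists only so the example runs offline.

```python
from typing import Callable, Iterator, List, Optional


def iter_recipe_jobs(
    fetch_page: Callable[[str, dict], List[dict]],
    recipe_id: str,
    status: Optional[str] = None,
    limit: int = 50,
) -> Iterator[dict]:
    """Yield every job for a recipe by advancing `skip` one page at a time."""
    skip = 0
    while True:
        params = {"limit": limit, "skip": skip}
        if status is not None:
            params["status"] = status
        page = fetch_page(f"/recipes/{recipe_id}/jobs", params)
        yield from page
        if len(page) < limit:
            return  # a short (or empty) page is treated as the last one
        skip += limit


# Stand-in for a real HTTP call (assumed response shape: a list of jobs).
_FAKE_JOBS = [{"id": f"job-{i}", "status": "completed"} for i in range(5)]


def fake_fetch(path: str, params: dict) -> List[dict]:
    start = params["skip"]
    return _FAKE_JOBS[start:start + params["limit"]]


jobs = list(iter_recipe_jobs(fake_fetch, "recipe-123", limit=2))
print(len(jobs))  # 5
```

Injecting the fetch function keeps the paging logic independent of any particular HTTP client and makes the last-page and empty-result cases easy to test.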

Recipe Components

Property Types

| Type | Description | Example |
|------|-------------|---------|
| text | Extract text content | Product title, description |
| link | Extract href attribute | Product URL, next page link |
| html | Extract raw HTML | Rich text content |
| src | Extract source URL | Image URL, video source |
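
To make the four property types concrete, the following sketch maps each one to the value it would pull from a matched element. The `Element` class is a hypothetical, simplified stand-in for a DOM node and is not how Scrape Loop represents elements internally.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Element:
    """Hypothetical, simplified view of a matched DOM element."""
    text: str = ""
    html: str = ""
    attrs: Dict[str, str] = field(default_factory=dict)


def extract_property(element: Element, prop_type: str) -> Optional[str]:
    """Return the value a given property type would read from an element."""
    if prop_type == "text":
        return element.text.strip()
    if prop_type == "html":
        return element.html
    if prop_type == "link":
        return element.attrs.get("href")  # None when the attribute is missing
    if prop_type == "src":
        return element.attrs.get("src")
    raise ValueError(f"unknown property type: {prop_type!r}")


anchor = Element(
    text="  Blue Widget ",
    html='<a href="/p/1">Blue Widget</a>',
    attrs={"href": "/p/1"},
)
print(extract_property(anchor, "text"))  # Blue Widget
print(extract_property(anchor, "link"))  # /p/1
```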

Pagination Types

| Type | Description | Configuration |
|------|-------------|---------------|
| load-more | Click button to load more items | Requires nextPageSelector |
| scroll-down | Infinite scroll pagination | Optional maxPages |
| next-button | Traditional pagination with next button | Requires nextPageSelector |
| none | Single page scraping | No additional config |
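
The configuration requirements in the table can be checked before a recipe is submitted. This is a minimal sketch: the `type` key and the rule that `maxPages` must be a positive integer are assumptions, since only the `nextPageSelector` and `maxPages` field names appear in this section.

```python
from typing import List

REQUIRES_SELECTOR = {"load-more", "next-button"}
KNOWN_TYPES = REQUIRES_SELECTOR | {"scroll-down", "none"}


def validate_pagination(config: dict) -> List[str]:
    """Return a list of problems found in a pagination config (empty if OK)."""
    ptype = config.get("type")  # assumed key name for the pagination type
    if ptype not in KNOWN_TYPES:
        return [f"unknown pagination type: {ptype!r}"]

    errors = []
    if ptype in REQUIRES_SELECTOR and not config.get("nextPageSelector"):
        errors.append(f"{ptype} requires nextPageSelector")
    if ptype == "scroll-down" and "maxPages" in config:
        max_pages = config["maxPages"]
        # Assumed constraint: a page limit only makes sense if it is >= 1.
        if not isinstance(max_pages, int) or max_pages < 1:
            errors.append("maxPages must be a positive integer")
    return errors


print(validate_pagination({"type": "next-button"}))
# ['next-button requires nextPageSelector']
print(validate_pagination({"type": "scroll-down", "maxPages": 10}))
# []
```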

Extractor Types

| Type | Description | Best For |
|------|-------------|----------|
| selector | Uses CSS selectors | Structured content |
| llm | AI-powered extraction | Dynamic/complex content |

Recipe Status

graph LR
A[Created] --> B[Active]
B --> C[Paused]
C --> B
B --> D[Deleted]

| Status | Description |
|--------|-------------|
| active | Recipe is enabled and can be run |
| paused | Recipe is temporarily disabled |
| deleted | Recipe is permanently disabled |
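
The status diagram can be read as a small state machine. The sketch below encodes its arrows as a lookup table; the lowercase `created` label is an assumption taken from the diagram's starting node, since it does not appear in the status table.

```python
# Allowed transitions, copied from the arrows in the status diagram.
TRANSITIONS = {
    "created": {"active"},            # assumed label for the diagram's start node
    "active": {"paused", "deleted"},
    "paused": {"active"},
    "deleted": set(),                 # no arrows leave Deleted
}


def can_transition(current: str, target: str) -> bool:
    """Return True if the diagram has an arrow from `current` to `target`."""
    return target in TRANSITIONS.get(current, set())


print(can_transition("active", "paused"))   # True
print(can_transition("paused", "deleted"))  # False
```

According to the diagram, a paused recipe has to return to active before it can be deleted; whether the API enforces this is not stated here.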

Best Practices

  1. Selectors
    • Use specific CSS selectors
    • Test selectors across different pages
    • Handle missing data gracefully
  2. Pagination
    • Set reasonable page limits
    • Handle rate limiting
    • Test edge cases (last page, empty results)
  3. Maintenance
    • Monitor recipe success rate
    • Update selectors when sites change
    • Archive unused recipes
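
One way to monitor a recipe's success rate is to compute it from its job statuses. The sketch below assumes each job object carries a `status` field with the values used by the job status filter (pending, running, completed, failed); the response shape is not documented in this section.

```python
from collections import Counter
from typing import Iterable, Optional


def success_rate(jobs: Iterable[dict]) -> Optional[float]:
    """Share of finished jobs that completed, or None if none have finished."""
    counts = Counter(job["status"] for job in jobs)
    finished = counts["completed"] + counts["failed"]
    if finished == 0:
        return None  # pending/running jobs alone say nothing about success
    return counts["completed"] / finished


sample_jobs = [
    {"status": "completed"},
    {"status": "completed"},
    {"status": "completed"},
    {"status": "failed"},
    {"status": "running"},
]
print(success_rate(sample_jobs))  # 0.75
```

Excluding pending and running jobs from the denominator keeps in-progress work from dragging the rate down; a sudden drop is often a sign that a site changed and the selectors need updating.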