Sitemap and URL Discovery: Find Every URL on Any Site, $1 per 1,000 Sitemaps

Parse robots.txt, sitemap.xml, and nested sitemap indexes to discover every published URL on any website. Returns each URL with its last-modified date, change frequency, and priority. Supports gzipped sitemaps and sitemap-index chains. Built for bulk: pass a list of domains and it returns the full URL inventory for each. Ideal for SEO audits, content monitoring, competitive intelligence, and pre-crawl URL discovery.
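
The discovery steps described above can be sketched with the Python standard library. This is an illustrative sketch, not the Actor's actual implementation: the function names (`sitemaps_from_robots`, `parse_sitemap`, `local_name`) are hypothetical, and it assumes sitemaps follow the sitemaps.org XML format and that gzipped payloads can be recognised by the gzip magic bytes.

```python
import gzip
import xml.etree.ElementTree as ET

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def sitemaps_from_robots(robots_txt: str) -> list[str]:
    """Return the URLs listed on 'Sitemap:' lines of a robots.txt body."""
    found = []
    for line in robots_txt.splitlines():
        key, sep, value = line.partition(":")
        # The directive name is matched case-insensitively.
        if sep and key.strip().lower() == "sitemap":
            found.append(value.strip())
    return found


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def parse_sitemap(payload: bytes) -> tuple[str, list[dict]]:
    """Parse one sitemap document (optionally gzipped).

    Returns (kind, entries). kind is the root element name, normally
    'urlset' or 'sitemapindex'. Each entry is a dict of the child fields
    present, such as loc, lastmod, changefreq, and priority.
    """
    if payload[:2] == GZIP_MAGIC:
        payload = gzip.decompress(payload)
    root = ET.fromstring(payload)
    entries = []
    for child in root:  # <url> or <sitemap> elements
        fields = {local_name(el.tag): (el.text or "").strip() for el in child}
        if fields.get("loc"):
            entries.append(fields)
    return local_name(root.tag), entries
```

When `parse_sitemap` returns `sitemapindex`, each entry's `loc` points at another sitemap, so a caller would fetch and parse those in turn to follow a sitemap-index chain.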

Pricing: $0.001/sitemap
RAM: 128 MB
Coverage: Any domain
Output fields: 8+
Proxy: Apify datacenter
Tech: Native XML + gzip

What you get

Primary use cases

API example

# Start a run via the Apify API
curl -X POST "https://api.apify.com/v2/acts/santamaria-automations~sitemap-url-discovery/runs?token=YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "domains": [
      "https://example.com",
      "https://competitor.com",
      "https://news-site.com"
    ],
    "maxUrlsPerSite": 10000,
    "followSitemapIndexes": true,
    "parseRobotsTxt": true
  }'

# Or use with AI agents via MCP:
# https://mcp.apify.com?tools=santamaria-automations/sitemap-url-discovery
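
The same request can be assembled from Python. The sketch below only mirrors the curl call above: the endpoint, Actor ID, and input fields are copied from that example, while the helper name `build_run_request` and its parameters are hypothetical. It builds the request without sending it.

```python
import json
import urllib.parse
import urllib.request

# Actor ID as it appears in the curl example above.
ACTOR_ID = "santamaria-automations~sitemap-url-discovery"


def build_run_request(
    token: str,
    domains: list[str],
    max_urls_per_site: int = 10000,
) -> urllib.request.Request:
    """Build (but do not send) the POST request that starts a run."""
    query = urllib.parse.urlencode({"token": token})
    url = f"https://api.apify.com/v2/acts/{ACTOR_ID}/runs?{query}"
    # Input fields taken from the curl example; no others are assumed.
    payload = {
        "domains": domains,
        "maxUrlsPerSite": max_urls_per_site,
        "followSitemapIndexes": True,
        "parseRobotsTxt": True,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends the request. Keeping construction separate from sending makes the payload easy to inspect or log first.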

Integrations

Output fields

Field            Type     Example
source_domain    string   example.com
sitemap_url      string   https://example.com/sitemap.xml
total_urls       integer  12,486
url              string   https://example.com/blog/post-1
lastmod          string   2026-06-10T14:00:00Z
changefreq       string   weekly
priority         number   0.8
robots_sitemaps  array    ["https://example.com/sitemap.xml"]
is_gzipped       boolean  false
scraped_at       string   2026-06-13T10:15:42Z
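
For content monitoring, the `lastmod` field can be used to pick out recently changed pages. The sketch below assumes each dataset item is a dict carrying the `url` and `lastmod` fields listed above, one item per URL. The actual dataset layout is not specified here, and the helper names are hypothetical.

```python
from datetime import datetime, timezone


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 date or datetime into a timezone-aware datetime."""
    # fromisoformat rejects a trailing "Z" before Python 3.11, so normalise it.
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        # Date-only or naive values are treated as UTC (an assumption).
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def modified_since(records: list[dict], cutoff: datetime) -> list[str]:
    """Return the URLs whose lastmod is at or after the cutoff."""
    recent = []
    for record in records:
        lastmod = record.get("lastmod")
        if not lastmod:
            continue  # sitemaps may omit lastmod; skip those entries
        if parse_timestamp(lastmod) >= cutoff:
            recent.append(record["url"])
    return recent
```

The cutoff should be timezone-aware, for example `datetime(2026, 6, 1, tzinfo=timezone.utc)`, so that it can be compared with the parsed values.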

Related Actors

Open on Apify → Try it now (free tier available)