Robots.txt Generator

Generate a robots.txt file with custom Allow/Disallow rules, user-agent targeting, sitemap URL, and crawl-delay. Download or copy the result.

User-agent: *
Allow: /

Tip

robots.txt prevents crawling but does not prevent indexing. To truly block a page from search results, use a noindex meta tag.

What is the Robots.txt Generator?

A robots.txt file sits at the root of your site (yoursite.com/robots.txt) and tells search-engine and AI crawlers which paths they can fetch. It's plain text following the Robots Exclusion Protocol, the de facto standard since 1994. The file controls crawling, not indexing; for true de-indexing, use a noindex meta tag instead.

How to use the Robots.txt Generator

  1. Pick a user-agent

    * covers all bots. Pick a specific one (Googlebot, Bingbot, Yandex) when you want different rules per crawler. Most sites stay with *.

  2. Add Allow and Disallow rules

    Click + Add Rule for each path. Disallow: /admin blocks crawlers from /admin. Allow: /admin/public carves out an exception. Empty Disallow: means "allow everything".

  3. Add your sitemap and (optional) crawl delay

    The sitemap URL helps crawlers discover all your pages. Crawl-delay throttles request rate (Googlebot ignores this; Bing and Yandex respect it).

  4. Download or copy

    Click Download to save as robots.txt. Upload to the root of your domain. Copy if you'll add it manually to a server config or static-site repo.
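Putting the four steps together, a complete generated file might look like this (the paths, sitemap URL, and delay value below are placeholders, not recommendations):

```
User-agent: *
Disallow: /admin
Allow: /admin/public
Crawl-delay: 10

Sitemap: https://example.com/sitemap.xml
```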

Frequently Asked Questions

What is robots.txt?

A plain-text file at https://yoursite.com/robots.txt that tells crawlers (search engines and AI bots) which URLs they're allowed to fetch. Format is simple: a User-agent line, then Allow and Disallow rules. The file follows the Robots Exclusion Protocol, originally proposed in 1994 and now standardized as RFC 9309. Most well-behaved bots respect it; malicious bots ignore it.
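The format is simple enough that you can check rules programmatically. A quick sketch using Python's standard-library parser (the rules and URLs are made up for illustration; note that Python applies the first matching rule, while Google applies the longest match per RFC 9309, so Allow is listed first here):

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, as a list of lines.
rules = [
    "User-agent: *",
    "Allow: /admin/public",
    "Disallow: /admin",
]

parser = RobotFileParser()
parser.parse(rules)
parser.modified()  # mark the rules as loaded so can_fetch evaluates them

print(parser.can_fetch("*", "https://example.com/admin/secret"))  # False
print(parser.can_fetch("*", "https://example.com/admin/public"))  # True
print(parser.can_fetch("*", "https://example.com/blog"))          # True
```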

Can robots.txt block pages from Google?

Not reliably. robots.txt blocks crawling, but Google can still index a page from external links pointing to it (the title shows up in results with no description). To actually keep a page out of Google's index, use a noindex meta tag or X-Robots-Tag: noindex HTTP header. Counter-intuitively, those work only if the page is not blocked in robots.txt (Google has to crawl the page to see the noindex).
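For reference, the two de-indexing mechanisms mentioned above look like this (the header variant assumes you control the server response):

```
# Option 1 — meta tag in the page's <head>:
<meta name="robots" content="noindex">

# Option 2 — HTTP response header (works for non-HTML files such as PDFs):
X-Robots-Tag: noindex
```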

Where do I put robots.txt?

At the root of your domain, exactly: https://yoursite.com/robots.txt. Subdirectories don't work, and each subdomain needs its own file. The filename is case-sensitive: exactly robots.txt, all lowercase. For a static site, drop the file in your public/ or static/ directory; for a server-rendered site, serve it at the root path.

Related Tools