短.be

Robots.txt

A text file placed at a website's root that tells search engine crawlers which pages they may and may not crawl.

Jul 18, 2025 · About 1 min read

SEO

Robots.txt is a plain text file placed at the root of a website that provides instructions to web crawlers about which parts of the site they should or should not access. The file follows the Robots Exclusion Protocol, a convention that well-behaved crawlers respect, though it is advisory rather than enforceable.

The syntax is straightforward: a User-agent line names the crawler that a group of rules applies to (* matches any crawler), and each Disallow line gives a path prefix that should not be crawled. An Allow line can carve out an exception to a broader Disallow. The file can also include a Sitemap line pointing to the site's XML sitemap.
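As a minimal sketch of these directives in action, the snippet below parses a small robots.txt with Python's standard urllib.robotparser module. The domain example.com, the crawler name ExampleBot, and the paths are placeholders invented for illustration. Note one assumption baked into the layout: Python's parser applies the first rule that matches, in file order, so the Allow exception is listed before the broader Disallow. Crawlers that follow the longest-match rule would reach the same result with either ordering.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the paths and domain are placeholders.
ROBOTS_TXT = """\
User-agent: *
Allow: /admin/help/
Disallow: /admin/

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# "ExampleBot" is a made-up crawler name; it falls under the "*" group.
settings_allowed = parser.can_fetch("ExampleBot", "https://example.com/admin/settings")
help_allowed = parser.can_fetch("ExampleBot", "https://example.com/admin/help/faq")
about_allowed = parser.can_fetch("ExampleBot", "https://example.com/about")

# site_maps() is available in Python 3.8 and later.
sitemaps = parser.site_maps()

print(settings_allowed)  # False: matched by Disallow: /admin/
print(help_allowed)      # True: matched first by Allow: /admin/help/
print(about_allowed)     # True: no rule matches, so crawling is allowed
print(sitemaps)          # ['https://example.com/sitemap.xml']
```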

For URL shortening services, robots.txt plays a dual role. The service's own website uses robots.txt to guide crawlers to important pages while blocking administrative areas. The redirect endpoints, however, should generally be accessible to crawlers so that search engines can follow short URLs and discover the destination content.
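The dual role described above can be sketched as follows. This is an illustration under stated assumptions, not any particular service's configuration: the domain short.example, the /admin/ and /dashboard/ prefixes, the short link /abc123, and the helper build_robots_txt are all hypothetical. The sketch generates a robots.txt that blocks only the administrative areas, then checks that a short link at the site root stays crawlable.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(blocked_prefixes, sitemap_url):
    """Build a robots.txt that blocks the given path prefixes for all crawlers."""
    lines = ["User-agent: *"]
    lines += [f"Disallow: {prefix}" for prefix in blocked_prefixes]
    lines += ["", f"Sitemap: {sitemap_url}"]
    return "\n".join(lines) + "\n"


# Hypothetical layout: admin pages live under /admin/ and /dashboard/,
# while short links are served directly from the root (e.g. /abc123).
robots_txt = build_robots_txt(
    ["/admin/", "/dashboard/"],
    "https://short.example/sitemap.xml",
)

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

admin_ok = parser.can_fetch("ExampleBot", "https://short.example/admin/users")
short_link_ok = parser.can_fetch("ExampleBot", "https://short.example/abc123")

print(robots_txt)
print(admin_ok)       # False: administrative area is blocked
print(short_link_ok)  # True: the redirect endpoint remains crawlable
```

Blocking by explicit prefix, rather than disallowing everything and allowing selected paths, keeps newly created short links crawlable by default.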

Important considerations include not using robots.txt to hide sensitive content (the file itself is publicly readable, and it does not prevent a URL from being indexed if other pages link to it), testing the file before deploying it, for example with the robots.txt report in Google Search Console, and being aware that different crawlers may interpret edge cases differently.
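Testing can also be done locally before a file is published. The sketch below is one possible approach, again using Python's standard urllib.robotparser: the helper blocked_urls, the crawler name ExampleBot, and the URLs are hypothetical. It reports which URLs from a must-be-crawlable list a draft robots.txt would block. Because crawlers differ on edge cases, a pass here reflects Python's interpretation only and does not guarantee identical behavior from every search engine.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, user_agent, urls):
    """Return the URLs that the given robots.txt blocks for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical draft that blocks a stylesheet directory by mistake.
ROBOTS_TXT = """\
User-agent: *
Disallow: /assets/
Disallow: /admin/
"""

# Hypothetical list of URLs that search engines need to reach.
MUST_BE_CRAWLABLE = [
    "https://example.com/",
    "https://example.com/pricing",
    "https://example.com/assets/site.css",
]

problems = blocked_urls(ROBOTS_TXT, "ExampleBot", MUST_BE_CRAWLABLE)
print(problems)  # ['https://example.com/assets/site.css']
```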

FAQ

Can pages blocked by robots.txt still be indexed?
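Crawling is blocked, but if other pages link to the URL, the URL alone may remain in the index. To fully exclude a page from the index, use a noindex meta tag and leave the page crawlable: a crawler that is blocked by robots.txt never fetches the page, so it never sees the noindex.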
What problems can robots.txt misconfigurations cause?
Blocking important pages from crawling can cause them to disappear from search results. Blocking CSS or JavaScript files can also prevent search engines from rendering pages correctly.
