What is Robots.txt? In Plain English
In Plain English
Robots.txt is a small text file that tells search engines which parts of your website they can or cannot crawl.
Think of it like a sign on a shop door that says “Staff Only” or “Open to Public.” It doesn’t stop people from entering, but it gives clear instructions.
How Robots.txt Works
- Search engines (like Google) look for robots.txt at the root of your domain. Example: https://www.example.com/robots.txt
- The file contains simple rules called directives, such as:
  - Allow – tells search engines they can crawl a page or folder.
  - Disallow – tells them not to crawl a page or folder.
Example:
User-agent: *
Disallow: /private/
Allow: /public/
This means: “All search engines can crawl the /public/ folder, but not the /private/ folder.”
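You can check how these rules behave without touching a live site. Python's standard-library robots.txt parser applies the same Allow/Disallow logic, so here is a minimal sketch using the example rules above (the URLs are just placeholders):

```python
from urllib.robotparser import RobotFileParser

# The same example rules from above, as a string.
rules = """\
User-agent: *
Disallow: /private/
Allow: /public/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Anything under /public/ is crawlable; anything under /private/ is not.
print(parser.can_fetch("*", "https://www.example.com/public/page.html"))   # True
print(parser.can_fetch("*", "https://www.example.com/private/page.html"))  # False
```

This is handy for testing a draft robots.txt before publishing it, since a typo in a Disallow line can accidentally block your whole site.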
Why Robots.txt Matters for SEO
- Controls crawling: Stops search engines from wasting crawl budget on unimportant pages.
- Protects sensitive areas: Prevents admin or test pages from appearing in search.
- Improves efficiency: Helps search engines focus on your important content.
⚠️ Note: Robots.txt only controls crawling, not indexing. If a page is linked elsewhere, Google may still index it.
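If you actually need to keep a page out of search results, the standard approach is a noindex robots meta tag in the page's HTML head. Note that the page must remain crawlable for search engines to see this tag, so don't also block it in robots.txt:

```html
<!-- In the page's <head>: allows crawling but asks engines not to index -->
<meta name="robots" content="noindex">
```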
FAQs
Q: What is robots.txt?
It’s a text file that tells search engines which pages or folders they can and cannot crawl.
Q: Where do I find my robots.txt file?
At the root of your domain, e.g. https://www.example.com/robots.txt.
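Because robots.txt always lives at the root of the scheme and host, you can derive its location from any page URL. A small sketch using Python's standard library (example.com is just a placeholder domain):

```python
from urllib.parse import urlsplit, urlunsplit

def robots_url(page_url: str) -> str:
    """Return the robots.txt URL for the site that serves page_url."""
    # Keep only the scheme and host; drop the path, query, and fragment.
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

print(robots_url("https://www.example.com/blog/post?id=1"))
# https://www.example.com/robots.txt
```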
Q: Can robots.txt stop a page from appearing in Google?
Not always. It can block crawling, but a page may still appear in results if other sites link to it.