Suppose that on a WordPress site, example.com, the two URLs https://example.com/robots.txt and https://example.com/?robots=1 both exist, and both return robots.txt content.
The content of the robots.txt file:
# START YOAST BLOCK
# ---------------------------
User-agent: *
Disallow: /wp-admin/
Sitemap: https://example.com/sitemap_index.xml
# ---------------------------
# END YOAST BLOCK
However, the content served at ?robots=1:
# START YOAST BLOCK
# ---------------------------
User-agent: *
Disallow: /?s=
Disallow: /page/*/?s=
Disallow: /search/
Disallow: /wp-json/
Disallow: /?rest_route=
Sitemap: https://example.com/sitemap_index.xml
# ---------------------------
# END YOAST BLOCK
The rules served at ?robots=1 are generated by the Yoast plugin to disallow the site's internal search URLs.
I want to know what purpose robots.txt and ?robots=1 each serve on a WordPress site.
The robots.txt file is placed in the root directory of your WordPress site and gives web crawlers instructions about which parts of the site may be crawled. For example, you can block certain directories, web pages, or other resources on your website from being crawled by search engines like Google or Bing.
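As a rough illustration of how a crawler interprets such rules, the sketch below feeds two rule sets to Python's standard urllib.robotparser and asks whether a given URL may be fetched. The rule sets are trimmed-down versions of the two outputs in the question; the names PHYSICAL, VIRTUAL, and is_allowed are made up for this example. The wildcard and query-string rules are left out because the standard-library parser only does plain prefix matching, which is an assumption that does not hold for every real crawler.

```python
from urllib.robotparser import RobotFileParser

# Trimmed-down copy of the rules shown at /robots.txt in the question.
PHYSICAL = """\
User-agent: *
Disallow: /wp-admin/
"""

# Trimmed-down copy of the rules shown at /?robots=1 in the question
# (wildcard and query-string rules omitted, see the note above).
VIRTUAL = """\
User-agent: *
Disallow: /search/
Disallow: /wp-json/
"""


def is_allowed(robots_text, url, agent="*"):
    """Return True if `agent` may fetch `url` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for name, rules in (("physical", PHYSICAL), ("virtual", VIRTUAL)):
        for path in ("/wp-admin/", "/search/shoes", "/wp-json/wp/v2/posts"):
            url = "https://example.com" + path
            print(name, path, is_allowed(rules, url))
```

Running it shows that the same URL can be allowed under one rule set and blocked under the other, which is why it matters which of the two outputs crawlers actually receive.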
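You can see the contents of the robots.txt file by going to https://example.com/robots.txt.
On the other hand, ?robots=1 is not a meta robots tag. It is the query variable WordPress uses to serve its dynamically generated (virtual) robots.txt: when no physical file exists, a request for /robots.txt is rewritten internally to index.php?robots=1, and plugins such as Yoast can add their own rules to that output. If a physical robots.txt file exists in the root directory, the web server typically serves that file directly, so the two URLs can show different content, as in the question.
Page-level instructions such as index, noindex, follow, and nofollow are a separate mechanism: they are set in the meta robots tag in each page's HTML and control whether that individual page should be indexed or its links followed.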
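To see which of these directives a particular page carries, you can read the meta robots tag out of the page's HTML. Here is a minimal sketch using only Python's standard html.parser; the class name MetaRobotsParser, the helper meta_robots, and the sample HTML are hypothetical and only for illustration. It assumes you already have the page source as a string.

```python
from html.parser import HTMLParser


class MetaRobotsParser(HTMLParser):
    """Collects the directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        # Directives are comma-separated, e.g. "noindex, follow".
        for part in content.split(","):
            part = part.strip().lower()
            if part:
                self.directives.append(part)


def meta_robots(html):
    """Return the list of meta robots directives found in `html`."""
    parser = MetaRobotsParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page source used only for this example.
SAMPLE = (
    '<html><head><meta name="robots" content="noindex, follow"></head>'
    "<body></body></html>"
)

if __name__ == "__main__":
    print(meta_robots(SAMPLE))
```

An empty list means the page has no meta robots tag, in which case search engines generally fall back to their default behaviour of indexing the page and following its links.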