# How to Set Up llms.txt and robots.txt for AI Crawlers on WordPress (2026 Guide)
🇺🇸 United States · April 5, 2026


Originally published by Dev.to

If AI crawlers cannot reach your WordPress site, your content will never appear in ChatGPT or Perplexity answers. Here is the exact setup, file by file.

I manage WordPress sites for EdTech brands in India. A few months back I noticed something that made no sense: pages ranking well on Google were getting zero citations on Perplexity. Same content, same domain, completely invisible to AI search.

The problem was not the content. It was three technical things I had never thought to check.

Here is exactly what I fixed, with the actual code.

## Step 1: Fix your robots.txt for AI crawlers

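Most WordPress robots.txt files either block AI crawlers explicitly or do not mention them at all. Neither is ideal: a crawler that is not named falls back to the User-agent: * rules, so a broad Disallow there blocks it as well.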

Open yourdomain.com/robots.txt in your browser and check what is there. If you see any of these, AI crawlers are blocked:

User-agent: GPTBot
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

To fix this, you need to edit your robots.txt. On WordPress, the easiest way is using WPCode:

  1. Go to WPCode in your WordPress dashboard
  2. Click Code Snippets, then Add Snippet
  3. Choose Text snippet type
  4. Set the insertion location to "robots.txt"
  5. Paste this:
User-agent: GPTBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: Applebot-Extended
Allow: /

Save and verify at yourdomain.com/robots.txt.

Alternatively, if you have Yoast SEO or Rank Math installed, use the plugin's built-in robots.txt editor (Tools, then File editor in Yoast SEO; General Settings, then Edit robots.txt in Rank Math).
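
If you would rather not read the file by eye, the check can be scripted. This is a minimal sketch using Python's standard urllib.robotparser; the crawler list and the yourdomain.com URL are placeholders I am assuming here, so adapt both to your site.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens to check (assumed list; extend it for other crawlers).
AI_CRAWLERS = ["GPTBot", "PerplexityBot", "ClaudeBot"]


def crawler_access(robots_txt: str, url: str = "https://yourdomain.com/") -> dict[str, bool]:
    """Return {crawler: allowed?} for the given robots.txt text and URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, url) for bot in AI_CRAWLERS}


if __name__ == "__main__":
    # Sample rules for illustration; paste your own robots.txt content instead.
    sample = """\
User-agent: GPTBot
Disallow: /

User-agent: PerplexityBot
Allow: /
"""
    for bot, allowed in crawler_access(sample).items():
        print(f"{bot}: {'allowed' if allowed else 'BLOCKED'}")
```

A crawler with no matching group and no User-agent: * group is reported as allowed, which matches how robots.txt is normally interpreted.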

## Step 2: Turn off Cloudflare Bot Fight Mode

This is the most impactful fix and the least obvious. Cloudflare's Bot Fight Mode is a single toggle that is easy to switch on and forget, so check its state on your zone rather than assuming it is off.

It challenges or blocks automated traffic, and that can include legitimate AI crawlers like PerplexityBot and ClaudeBot.

They hit your site, get stopped at the Cloudflare layer, and your server never sees the request.

I ran with this enabled for months without realising it was silently blocking AI crawlers, and with them any citations, for our VEGA AI brand.

To fix it:

  1. Log into your Cloudflare dashboard
  2. Select your domain
  3. Go to Security, then Bots
  4. Turn Bot Fight Mode OFF

If you are on a paid Cloudflare plan using Super Bot Fight Mode:

  1. Go to Security, then Bots
  2. Find "Definitely automated"
  3. Change it from Block to Allow

Verify it worked by checking your server access logs a few days later. You should start seeing GPTBot and PerplexityBot in the logs.
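
Scanning the logs can be scripted too. The sketch below assumes the common "combined" access log format, where the User-Agent is the last double-quoted field on each line; the crawler list is my own selection and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Substrings to look for in the User-Agent field (assumed list; extend as needed).
AI_CRAWLERS = ("GPTBot", "PerplexityBot", "ClaudeBot")

# In the combined log format the User-Agent is the last double-quoted field.
USER_AGENT_RE = re.compile(r'"([^"]*)"\s*$')


def count_ai_crawler_hits(log_lines):
    """Count requests per AI crawler across an iterable of access-log lines."""
    hits = Counter()
    for line in log_lines:
        match = USER_AGENT_RE.search(line)
        if not match:
            continue  # line is not in the expected format
        user_agent = match.group(1).lower()
        for bot in AI_CRAWLERS:
            if bot.lower() in user_agent:
                hits[bot] += 1
    return hits


if __name__ == "__main__":
    # Made-up sample lines; in practice pass an open log file instead.
    sample = [
        '203.0.113.7 - - [05/Apr/2026:10:00:00 +0000] "GET /blog/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; GPTBot/1.2)"',
        '198.51.100.4 - - [05/Apr/2026:10:01:00 +0000] "GET / HTTP/1.1" 200 812 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
    ]
    for bot, count in count_ai_crawler_hits(sample).items():
        print(f"{bot}: {count}")
```

To run it against a real log, open the file and pass the file object to count_ai_crawler_hits. The log path depends on your host, so ask your provider if you are unsure where it lives. A User-Agent string can be spoofed, so treat the counts as a signal rather than proof.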

## Step 3: Create your llms.txt file

llms.txt is a proposed convention: a plain text file, written in Markdown, that tells large language models what your site is about and which pages matter most. Think of it as a curated sitemap written for AI models.

Create a file called llms.txt and place it in your WordPress root directory. The easiest method:

  1. Go to your hosting control panel, open File Manager
  2. Navigate to public_html (or your WordPress root)
  3. Create a new file called llms.txt
  4. Paste your content and save

Here is a template you can adapt:

# Your Site Name

> One sentence describing what your site covers and who it is for.

## Key Pages

- [Page Title](https://yourdomain.com/page-slug/): What this page covers in one sentence.
- [Page Title](https://yourdomain.com/page-slug/): What this page covers in one sentence.
- [Page Title](https://yourdomain.com/page-slug/): What this page covers in one sentence.

## About

Your name, your role, and your expertise in two sentences.

## Contact

you@yourdomain.com

Keep it simple. You do not need to list every page, just the most important ones per topic cluster. Verify it is live at yourdomain.com/llms.txt.
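
If you maintain several sites, generating the file from a list of pages is less error-prone than editing it by hand. This is a small sketch that renders the template above; the function name build_llms_txt and all sample values are my own placeholders.

```python
def build_llms_txt(site_name, summary, pages, about, contact):
    """Render an llms.txt body.

    pages is a list of (title, url, description) tuples.
    """
    lines = [f"# {site_name}", "", f"> {summary}", "", "## Key Pages", ""]
    lines += [f"- [{title}]({url}): {description}" for title, url, description in pages]
    lines += ["", "## About", "", about, "", "## Contact", "", contact]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(
        build_llms_txt(
            site_name="Your Site Name",
            summary="One sentence describing what your site covers and who it is for.",
            pages=[
                ("Page Title", "https://yourdomain.com/page-slug/", "What this page covers in one sentence."),
            ],
            about="Your name, your role, and your expertise in two sentences.",
            contact="you@yourdomain.com",
        )
    )
```

Save the output as llms.txt and upload it to the WordPress root exactly as described in the steps above.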

## Step 4: Add FAQPage schema markup

AI models prioritize structured question and answer content. FAQPage schema tells them exactly where to find it.

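In WordPress, add this via WPCode on any page that has FAQ content:

  1. Go to WPCode, Add Snippet
  2. Choose HTML snippet type (the JavaScript type wraps your code in its own script tags, which would nest the JSON-LD inside a second script element)
  3. Set insertion to "Site Wide Footer", then limit the snippet to the relevant page with WPCode's conditional logic so the markup does not load on every page
  4. Paste and adapt this for each page (use a separate snippet per page):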
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "Your question here?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Your answer here. Keep it direct and complete."
      }
    },
    {
      "@type": "Question",
      "name": "Your second question here?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Your second answer here."
      }
    }
  ]
}
</script>

If you use Rank Math, you can add FAQ schema directly in the post editor under the Schema tab without touching code.

Select FAQPage as the schema type and fill in your questions there.
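
Hand-editing JSON is where most schema errors creep in (a stray quote or a trailing comma breaks the whole block). One way around that is to generate the markup from plain question and answer pairs. This is a sketch using Python's json module; build_faq_jsonld is a name I made up for it.

```python
import json


def build_faq_jsonld(faqs):
    """Build a FAQPage JSON-LD script tag from (question, answer) pairs."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in faqs
        ],
    }
    payload = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so answer text cannot close the script tag early.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


if __name__ == "__main__":
    print(build_faq_jsonld([("Your question here?", "Your answer here. Keep it direct and complete.")]))
```

json.dumps takes care of quoting and escaping, so answers that contain quotation marks or line breaks still produce valid JSON.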

## Step 5: Add Article schema with author details

AI models weight authorship when deciding what to cite. Anonymous content gets cited less. Add this to your key pages:

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Your Page Title",
  "author": {
    "@type": "Person",
    "name": "Your Name",
    "jobTitle": "Your Job Title",
    "url": "https://yourdomain.com/about/"
  },
  "publisher": {
    "@type": "Organization",
    "name": "Your Site Name",
    "url": "https://yourdomain.com"
  }
}
</script>

Again, Rank Math handles this automatically if you set the rich snippet type to Article and fill in the author details in your profile settings.

## Verify everything is working

Once all five steps are done:

  1. Test robots.txt: visit yourdomain.com/robots.txt and confirm AI crawler directives are present
  2. Test llms.txt: visit yourdomain.com/llms.txt and confirm it loads
  3. Test schema: paste your URL into Google's Rich Results Test (search.google.com/test/rich-results) and confirm the FAQPage and Article schema validate without errors
  4. Check Cloudflare: confirm Bot Fight Mode is off in your dashboard
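
For a quick local check of the schema before you reach for an online validator, you can pull the JSON-LD blocks out of a page's HTML and list the types they declare. This is a sketch using only the standard library; the class and function names are my own, and it expects you to supply the page HTML yourself (for example from your browser's view-source).

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of every application/ld+json script tag."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # malformed block: skip it rather than crash


def schema_types(html: str) -> set[str]:
    """Return the set of @type values declared in the page's JSON-LD."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    found = set()
    for block in parser.blocks:
        for item in block if isinstance(block, list) else [block]:
            if not isinstance(item, dict):
                continue
            # Some SEO plugins nest their nodes under "@graph".
            for node in [item, *item.get("@graph", [])]:
                if isinstance(node, dict) and "@type" in node:
                    types = node["@type"]
                    found.update(types if isinstance(types, list) else [types])
    return found
```

If schema_types(page_html) does not include both FAQPage and Article for a page where you added them, the snippet is either not loading on that page or contains invalid JSON.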

Results are not immediate. Give it four to six weeks and check whether your pages start appearing in Perplexity answers for your target topics.

Search your main keywords directly in Perplexity and see if you get cited.

I documented the full setup for each of these in more detail at proaisearch.com/llm-seo/ if you want deeper implementation notes.

For a broader guide on AI search optimization beyond just the technical setup, Pro AI Search covers GEO, AEO, and LLM SEO in one place.
