From e4ba68e4e38bb14c959db33820bfa7fe7475c496 Mon Sep 17 00:00:00 2001
From: kim
Date: Thu, 24 Apr 2025 13:27:42 +0100
Subject: [PATCH] oop, new-line!!

---
 docs/admin/robots.md | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/docs/admin/robots.md b/docs/admin/robots.md
index 291f65d06..e4b3d27ce 100644
--- a/docs/admin/robots.md
+++ b/docs/admin/robots.md
@@ -10,7 +10,6 @@ You can allow or disallow crawlers from collecting stats about your instance fro
 
 The AI scrapers come from a [community maintained repository][airobots]. It's manually kept in sync for the time being. If you know of any missing robots, please send them a PR!
 
-A number of AI scrapers are known to ignore entries in `robots.txt` even if it explicitly matches their User-Agent. This means the `robots.txt` file is not a foolproof way of ensuring AI scrapers don't grab your content. In addition to
-this you might want to look into blocking User-Agents via [requester header filtering](request_filtering_modes.md), and enabling a proof-of-work [scraper deterrence](scraper_deterrence.md).
+A number of AI scrapers are known to ignore entries in `robots.txt` even if it explicitly matches their User-Agent. This means the `robots.txt` file is not a foolproof way of ensuring AI scrapers don't grab your content. In addition to this you might want to look into blocking User-Agents via [requester header filtering](request_filtering_modes.md), and enabling a proof-of-work [scraper deterrence](scraper_deterrence.md).
 
 [airobots]: https://github.com/ai-robots-txt/ai.robots.txt/
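
For context, the `robots.txt` mechanism the patched paragraph discusses amounts to serving disallow groups like the one below. This is an illustrative sketch only: `GPTBot` and `CCBot` are stand-ins for the full community-maintained list, and, as the docs themselves note, compliance is voluntary, which is why the patch points admins at request filtering and scraper deterrence as backstops.

```
# Illustrative robots.txt group: every User-agent named here is asked
# (but not forced) to stay away from the entire site.
User-agent: GPTBot
User-agent: CCBot
Disallow: /
```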