mirror of https://github.com/superseriousbusiness/gotosocial.git, synced 2025-10-31 15:22:26 -05:00
	[chore] Update our robots.txt (#3033)
This syncs our copy with the current state of the ai.robots.txt repository. Upstream has tightened their scope to be AI-only, whereas before it included a bunch of SEO and "web intelligence" marketing stuff. I've kept those but moved them into their own section.
parent c2738474d5
commit 4604224c4d
1 changed file with 15 additions and 9 deletions
@@ -34,34 +34,40 @@ const (
 User-agent: AdsBot-Google
 User-agent: Amazonbot
 User-agent: anthropic-ai
-User-agent: Applebot
-User-agent: AwarioRssBot
-User-agent: AwarioSmartBot
+User-agent: Applebot-Extended
 User-agent: Bytespider
 User-agent: CCBot
 User-agent: ChatGPT-User
 User-agent: ClaudeBot
 User-agent: Claude-Web
 User-agent: cohere-ai
-User-agent: DataForSeoBot
+User-agent: Diffbot
 User-agent: FacebookBot
 User-agent: FriendlyCrawler
 User-agent: Google-Extended
 User-agent: GoogleOther
 User-agent: GPTBot
-User-agent: ImagesiftBot
-User-agent: magpie-crawler
-User-agent: Meltwater
+User-agent: img2dataset
 User-agent: omgili
 User-agent: omgilibot
 User-agent: peer39_crawler
 User-agent: peer39_crawler/1.0
 User-agent: PerplexityBot
-User-agent: PiplBot
-User-agent: Seekr
 User-agent: YouBot
 Disallow: /
 
+# Marketing/SEO "intelligence" data scrapers
+User-agent: AwarioRssBot
+User-agent: AwarioSmartBot
+User-agent: DataForSeoBot
+User-agent: ImagesiftBot
+User-agent: magpie-crawler
+User-agent: Meltwater
+User-agent: PiplBot
+User-agent: scoop.it
+User-agent: Seekr
+Disallow: /
+
 # Well-known.dev crawler. Indexes stuff under /.well-known.
 # https://well-known.dev/about/
 User-agent: WellKnownBot
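The change relies on robots.txt grouping: consecutive User-agent lines share the rules that follow them, so each section needs its own trailing Disallow line. The sketch below illustrates that grouping only; it is not GoToSocial code. The names `group`, `parseGroups`, `isBlocked` and `sample` are hypothetical, the sample is a shortened excerpt of the list, and agent matching is reduced to a case-insensitive exact comparison, which is simpler than what real crawlers do.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// sample is a shortened, illustrative excerpt, not the full list from the diff.
const sample = `User-agent: GPTBot
User-agent: YouBot
Disallow: /

# Marketing/SEO "intelligence" data scrapers
User-agent: AwarioRssBot
User-agent: Seekr
Disallow: /
`

// group is one robots.txt record: a run of User-agent lines plus its rules.
type group struct {
	agents   []string
	disallow []string
}

// parseGroups splits robots.txt text into groups. Simplified on purpose:
// it only understands User-agent and Disallow lines and '#' comments.
func parseGroups(txt string) []group {
	var groups []group
	inRules := false // true once the current group has seen a rule line
	sc := bufio.NewScanner(strings.NewReader(txt))
	for sc.Scan() {
		line := sc.Text()
		// Drop comments before looking for a "key: value" pair.
		if i := strings.IndexByte(line, '#'); i >= 0 {
			line = line[:i]
		}
		key, val, ok := strings.Cut(line, ":")
		if !ok {
			continue // blank or comment-only line
		}
		key = strings.ToLower(strings.TrimSpace(key))
		val = strings.TrimSpace(val)
		switch key {
		case "user-agent":
			// A User-agent line after a rule line starts a new group.
			if len(groups) == 0 || inRules {
				groups = append(groups, group{})
				inRules = false
			}
			last := &groups[len(groups)-1]
			last.agents = append(last.agents, val)
		case "disallow":
			if len(groups) > 0 {
				last := &groups[len(groups)-1]
				last.disallow = append(last.disallow, val)
				inRules = true
			}
		}
	}
	return groups
}

// isBlocked reports whether some group names agent and disallows "/".
func isBlocked(groups []group, agent string) bool {
	for _, g := range groups {
		for _, a := range g.agents {
			if !strings.EqualFold(a, agent) {
				continue
			}
			for _, d := range g.disallow {
				if d == "/" {
					return true
				}
			}
		}
	}
	return false
}

func main() {
	groups := parseGroups(sample)
	for _, agent := range []string{"GPTBot", "Seekr", "Applebot"} {
		fmt.Printf("%s blocked: %v\n", agent, isBlocked(groups, agent))
	}
}
```

Under this reading, moving the marketing and SEO agents into a separate section does not change whether they are blocked, provided the new section ends with its own `Disallow: /`, as it does in the diff.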