Robots.txt Generator


Default: all robots are:

Crawl-Delay:

Sitemap: (leave blank if you don't have one)

Search robots: Google
  Google Image
  Google Mobile
  MSN Search
  Yahoo
  Yahoo MM
  Yahoo Blogs
  Ask/Teoma
  GigaBlast
  DMOZ Checker
  Nutch
  Alexa/Wayback
  Baidu
  Naver
  MSN PicSearch
   
Restricted directories: The path is relative to the root and must end with a trailing slash "/"
 
 
 
 
 
 
   



Now create the "robots.txt" file in your root directory. Copy the text above and paste it into the text file.


About Robots.txt Generator

Robots.txt File

The robots.txt file is used to communicate with web robots, also known as web crawlers or spiders, which crawl the web and index websites.

Search engines use robots to crawl the web looking for websites to include in search results.

You can use this site to learn what a robots.txt file is, how it works, how to create a robots.txt file, and how you can use it to control how a robot interacts with your website.

What can you do with a robots.txt file?

You can tell robots that it's okay to crawl your website.

User-agent: *
Disallow: 

You can tell robots not to crawl your website.

User-agent: *
Disallow: /

You can tell robots not to crawl certain parts of your website.

User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /~steve/
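
The effect of these three rule sets can be checked locally before anything is uploaded. The following is a minimal sketch using Python's standard `urllib.robotparser` module; the helper name `allowed`, the agent name `ExampleBot`, and the sample page URLs are assumptions made for illustration, not part of any tool described here.

```python
from urllib.robotparser import RobotFileParser


def allowed(rules: str, url: str, agent: str = "ExampleBot") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


# The three rule sets shown above, written as plain strings.
ALLOW_ALL = "User-agent: *\nDisallow:\n"
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
BLOCK_SOME = (
    "User-agent: *\n"
    "Disallow: /cgi-bin/\n"
    "Disallow: /tmp/\n"
    "Disallow: /~steve/\n"
)

if __name__ == "__main__":
    page = "http://www.your-site.com/tmp/report.html"  # hypothetical page
    for name, rules in [("allow all", ALLOW_ALL),
                        ("block all", BLOCK_ALL),
                        ("block some", BLOCK_SOME)]:
        print(name, allowed(rules, page))
```

This only shows how one well-behaved parser reads the rules; other robots may interpret the file differently or ignore it.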

How to create a robots.txt file

There are three steps to creating a robots.txt file. You create the file itself, decide what to include in the file, and upload it to the correct location on your server.

Choosing the right program to create a robots.txt file

A robots.txt file is a plain text file, so be sure to create it with a plain text editor, such as Notepad. Do not use a program such as Microsoft Word, because programs like these will add unwanted formatting code to the file.

Pay special attention to the .txt file extension when creating the file.

Deciding what to include in the file

What is your goal in creating the robots.txt file? Do you want to grant access to all robots, or only to certain robots? Do you want to restrict access to certain files or folders?

See the Robots.txt Examples section for examples of what you can do with a robots.txt file.

There are two main elements that must be included in the file. These two elements are User-agent: and Disallow:

User-agent: 
Disallow: 

User-agent: defines which robot the next line refers to.

Disallow: tells the robot identified above what files or folders not to access.

While most good robots will listen to what the robots.txt file says, bad robots will not. A robots.txt file is not a guarantee that all robots will behave as requested.

To allow all robots access to your entire site, copy and paste the following code into your plain text editor.

User-agent: *
Disallow:
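
For anyone who would rather script this step than paste by hand, here is a minimal sketch that assembles the two required elements (plus the optional Crawl-Delay and Sitemap values) and saves the result as plain text. The function names `build_robots_txt` and `save_robots_txt` and their parameters are hypothetical, chosen only for this example, and the sketch handles a single User-agent group.

```python
from pathlib import Path
from typing import Iterable, Optional


def build_robots_txt(
    disallow: Iterable[str] = (),
    user_agent: str = "*",
    crawl_delay: Optional[int] = None,
    sitemap: Optional[str] = None,
) -> str:
    """Build robots.txt contents for a single User-agent group."""
    lines = [f"User-agent: {user_agent}"]
    paths = list(disallow)
    if paths:
        lines += [f"Disallow: {path}" for path in paths]
    else:
        # An empty Disallow value means nothing is disallowed.
        lines.append("Disallow:")
    if crawl_delay is not None:
        # Crawl-delay is a non-standard extension; not every robot honors it.
        lines.append(f"Crawl-delay: {crawl_delay}")
    if sitemap:
        lines += ["", f"Sitemap: {sitemap}"]
    return "\n".join(lines) + "\n"


def save_robots_txt(text: str, directory: str = ".") -> Path:
    """Write the text as a plain UTF-8 file named robots.txt."""
    target = Path(directory) / "robots.txt"
    with open(target, "w", encoding="utf-8", newline="\n") as handle:
        handle.write(text)
    return target
```

Calling `build_robots_txt()` with no arguments produces the allow-all file shown above, and `build_robots_txt(["/cgi-bin/", "/tmp/"])` produces a file that restricts those two folders.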

Where to put the robots.txt file

The robots.txt file needs to be placed at the root of your server. That means it should be visible in your browser at this address:

http://www.your-site.com/robots.txt

Note that you would replace 'your-site.com' with your domain name.

The file will not work if you place it into a sub-folder, like this:

http://www.your-site.com/somefolder/robots.txt
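
The root-only rule can be expressed in a few lines. This is a minimal sketch using Python's standard `urllib.parse` module; the function names `robots_url` and `is_root_robots` are assumptions made for this example.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the root-level robots.txt URL for the site serving `page_url`."""
    parts = urlsplit(page_url)
    # Keep the scheme and host; replace the path and drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def is_root_robots(url: str) -> bool:
    """True only when the URL path is exactly /robots.txt."""
    return urlsplit(url).path == "/robots.txt"
```

Whatever page a robot starts from, `robots_url` points it back to the same root-level file, which is why a copy in a sub-folder is never consulted.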

Robots.txt Examples

There are several reasons you may want to restrict access to either part or all of your website from search engine spiders and other crawlers. These are just a few examples of how to do that with a robots.txt file.

Granting access with a robots.txt file

To tell a robot that it's okay to access your entire website, or only certain sections of it, you just need to tell it which sections not to access. Anything not explicitly restricted is assumed to be fair game.

How to tell all robots that it's okay to crawl your entire website.

User-agent: *
Disallow: 

Note that * here translates to all robots, and 'Disallow:' with no value after it translates to nothing being disallowed, thus everything is allowed.

How to tell a specific robot that it's okay to crawl your entire website, while blocking all other robots.

User-agent: googlebot
Disallow:

User-agent: *
Disallow: /

Note that / by itself here translates to the entire site.
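
To see how per-robot groups are resolved, the same two groups can be fed to Python's standard `urllib.robotparser` module. This is a minimal sketch; the helper name `access_by_agent` and the agent name `SomeOtherBot` are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# One group for googlebot, one catch-all group for everyone else.
RULES = """\
User-agent: googlebot
Disallow:

User-agent: *
Disallow: /
"""


def access_by_agent(rules, url, agents):
    """Map each agent name to whether it may fetch `url` under `rules`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


result = access_by_agent(
    RULES,
    "http://www.your-site.com/index.html",
    ["googlebot", "SomeOtherBot"],
)
```

Here `result` maps `googlebot` to `True` and `SomeOtherBot` to `False`: a robot that finds a group naming it uses that group, and only falls back to the `*` group otherwise.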

Restricting access with a robots.txt file

If you don't want robots to crawl certain parts of your website, then you can tell them which parts you don't want them to access with the robots.txt file.

It's important to remember, however, that a robot might not listen. And, because your robots.txt file is publicly visible, a bad robot might use your robots.txt file to identify potentially private sections of your site.

How to tell robots not to crawl your entire website.

User-agent: *
Disallow: /

How to tell all robots not to crawl certain folders within your website.

User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /~steve/

How to tell all robots not to crawl certain files within your website.

User-agent: *
Disallow: /dont-crawl-me-bro.html

How to tell a specific robot not to crawl your entire website.

User-agent: googlebot
Disallow: /

How to tell a specific robot not to crawl specific sections and files within your website.

User-agent: googlebot
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /dont-crawl-me-google.html
Disallow: /dont-crawl-me-either.html

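Note that regular expressions are not valid in a robots.txt file. Each Disallow value is matched literally against the beginning of the URL path, so every file or folder you want to restrict must be listed in the file. Some crawlers also accept the limited wildcards * and $, but not every robot supports them.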
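
Because patterns are not interpreted, a Disallow value is simply compared with the start of the requested path. The following is a simplified sketch of that literal prefix comparison; the function name `is_disallowed` is hypothetical, and real crawlers layer more on top of this (user-agent groups and, for some robots, extra directives).

```python
from typing import Iterable


def is_disallowed(path: str, disallow_values: Iterable[str]) -> bool:
    """Simplified check: a path is blocked if it starts with any Disallow value."""
    # An empty Disallow value blocks nothing, so it is skipped.
    return any(bool(value) and path.startswith(value) for value in disallow_values)


# The Disallow values from the googlebot example above.
RULES = [
    "/cgi-bin/",
    "/tmp/",
    "/dont-crawl-me-google.html",
    "/dont-crawl-me-either.html",
]
```

With these rules, `/tmp/cache/a.html` is blocked because it starts with `/tmp/`, while `/dont-crawl-me-too.html` is not, because no listed value is a prefix of it. A value written as a regular expression, such as `/dont-crawl-me-.*\.html`, would be compared character by character and match nothing useful.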

 

The meta tag alternative to the robots.txt file

It is possible to place a robots meta tag on a specific page to tell robots not to index the page, and not to follow the links on the page.

The meta tag is placed within the head section of the source code on the page.

<html>
<head>
<title>This is the page I don't want robots to crawl</title>
<meta name="robots" content="noindex, nofollow">
</head>

As with the robots.txt file, some robots may ignore the meta tag and index the page and follow the links even if the robots meta tag is present.

There are several types of meta tags, so you need to define which type of meta tag this is by using name="robots".

The content attribute tells robots what to do. Possible values include "index", "noindex", "follow", and "nofollow".

How to allow the page to be indexed, but restrict the links from being followed.

<html>
<head>
<title>I want this page indexed, but don't want the links followed</title>
<meta name="robots" content="index, nofollow">
</head>

How to restrict the page from being indexed, but allow the links to be followed.

<html>
<head>
<title>I don't want this page indexed, but do want the links followed</title>
<meta name="robots" content="noindex, follow">
</head>

How to restrict the page from being indexed, and the links from being followed.

<html>
<head>
<title>I don't want this page indexed, or the links followed</title>
<meta name="robots" content="noindex, nofollow">
</head>

While other combinations are possible, they wouldn't make much sense.
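
A robot that respects the meta tag has to read these values out of the page before deciding what to do. Below is a minimal sketch of that step using Python's standard `html.parser` module; the class name `RobotsMetaParser` and the helper functions are assumptions made for this example, and the sketch treats a missing tag as permission to index and follow.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        # Only the meta tag named "robots" is relevant here.
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        self.directives.update(
            part.strip().lower() for part in content.split(",") if part.strip()
        )


def robots_directives(html: str) -> set:
    """Return the set of robots meta directives found in the HTML text."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


def may_index(html: str) -> bool:
    return "noindex" not in robots_directives(html)


def may_follow(html: str) -> bool:
    return "nofollow" not in robots_directives(html)
```

For the "noindex, follow" page above, `may_index` returns `False` and `may_follow` returns `True`.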