# file: robots.txt,v 1.0 2006/06/15 created by Henry Xu
# www.huisun.com
# Following the standard robots.txt syntax, this file lists the pages and
# directories that crawlers are not allowed to crawl.
# For robots.txt syntax, see
# Format is:
#   User-agent:
#   Disallow: |
# -----------------------------------------------------------------------------

User-agent: *
# Paths must begin with "/". The "*" wildcard is an extension honored by the
# major crawlers; rules are prefix matches, so /*.asp also matches .aspx URLs.
Disallow: /*.asp
Disallow: /*.gif
Disallow: /*.jpg
Disallow: /CheckCode.aspx
Disallow: /Error_Info.aspx
Disallow: /sendmail.aspx
Disallow: /recvsms.aspx
Disallow: /images/
Disallow: /image/
Disallow: /livestat/
Disallow: /occ/
Disallow: /member/
Disallow: /member/info.aspx
Disallow: /member/OceanShipping.aspx
Disallow: /member/login/login.aspx
Disallow: /member/conso.aspx

# Block Google Image Search
User-agent: Googlebot-Image
Disallow: /

User-agent: EmailCollector
Disallow: /

User-agent: Teleport
Disallow: /

User-agent: TeleportPro
Disallow: /

User-agent: WebZip/4.0
Disallow: /

User-agent: NetAnts
Disallow: /

User-agent: Slurp
Crawl-delay: 30