We have a website instance on a domain that is protected by an .htaccess password. Some IPs, such as the company’s network, are allowed through.
- There are no inbound links (although obviously cannot guarantee this 100%)
- The site has no robots.txt
- The robots meta tag is set to follow and index
With all of these conditions, is there any way that search engines could still index the site? I think not, but I want to make sure there is no loophole I didn’t know about.
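For reference, the setup described above boils down to something like the following sketch, assuming Apache 2.4 and Basic auth; the realm, file path, and IP range are placeholders:

```
# .htaccess: require a password, but let requests from the allowed network straight through.
AuthType Basic
AuthName "Staging"
AuthUserFile /path/to/.htpasswd
<RequireAny>
    Require valid-user
    Require ip 203.0.113.0/24
</RequireAny>
```

The robots meta tag mentioned above is just `<meta name="robots" content="index, follow">` in the page head.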
2 Answers
Also see this post from a Google employee:
I’m pretty sure any crawler would be stopped before reaching any content at the point where .htaccess demands a password, seeing as that’s the whole point of having an .htaccess password.
If you wanted to be doubly sure, for educational purposes you could test from various browsers in private tabs, and perhaps send a raw request over a socket to see what output you get back. Here’s a page that describes how to send a raw HTTP request: https://www3.ntu.edu.sg/home/ehchua/programming/webprogramming/HTTP_Basics.html
Here’s the gist of the example on that page, where they fetch a page at http://nowhere123.com/docs/index.html: you open a TCP connection to the host on port 80 and write a plain-text GET request (the request line and a Host header), ending with a blank line.
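If you’d rather script it than type it, here is a minimal sketch of that same raw request sent over a socket from Python; the host and path are just the example values above, so swap in the URL you actually want to test:

```python
import socket

HOST = "nowhere123.com"    # example host from the tutorial page; substitute your own domain
PATH = "/docs/index.html"  # example path; substitute the page you want to test

# Build the raw HTTP/1.1 request by hand: request line, Host header, blank line.
request = (
    f"GET {PATH} HTTP/1.1\r\n"
    f"Host: {HOST}\r\n"
    "Connection: close\r\n"
    "\r\n"
).encode("ascii")

# Open a plain TCP connection on port 80, send the request, and read the full reply.
with socket.create_connection((HOST, 80)) as sock:
    sock.sendall(request)
    response = b""
    while chunk := sock.recv(4096):
        response += chunk

print(response.decode("iso-8859-1"))
```

Point it at a page behind your .htaccess password from an IP that isn’t allowlisted, and the output is exactly what a crawler would be shown.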
You can also send raw requests using telnet, which is available in most Linux distros and can usually be installed or enabled on Windows, too; just connect to port 80 and type the same request lines by hand, ending with a blank line.
I went ahead and issued this request (with modified path and host) to one of my own servers with a known .htaccess password gateway, and got this response:
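As you’d expect from a path behind Basic auth, it was a 401 challenge along these lines (a generic illustration rather than the exact output; the realm, remaining headers, and error body will differ):

```
HTTP/1.1 401 Unauthorized
WWW-Authenticate: Basic realm="..."
(remaining headers and the server’s standard error page follow)
```

The protected page’s content never comes back, which is why a crawler has nothing to index.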
So … maybe this will help you.