I have a site with the URL https://example.com/file.php. I don't use friendly URLs, frameworks, etc. However, I see that Google is indexing duplicate content from my website under URLs that do not exist, like:
https://example.com/file.php/file2.php
https://example.com/file.php/file3.php
https://example.com/file.php/file3.php/hihi/other/other2.php (status 200)
But those URLs do not exist. In all cases the server shows me the content of file.php. I deleted my .htaccess because I thought I had a bad rule, but that is not the cause.
Please help me... 🙁
2 Answers
That's the default behaviour for PHP. It's useful when implementing the Front Controller pattern, as you can inspect the full path through the $_SERVER superglobal.
Make use of the canonical link to avoid duplicate content in search engines.

As @Quentin has already pointed out, this is the default for PHP. Or, more specifically, the Apache handler that processes PHP allows path-info (additional pathname information on the URL) by default. Plain text/html files do not allow path-info unless explicitly enabled.
For example, given the following URL:
/file.php/<anything>
Where file.php is a physical file on the filesystem, /<anything> is the additional pathname information, and it is available to PHP through the $_SERVER['PATH_INFO'] variable.
However, you can disable this in .htaccess with the AcceptPathInfo directive:
AcceptPathInfo Off
Now any URL that contains path-info will trigger a 404 Not Found.
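If editing the Apache configuration is not an option, the two suggestions above (inspecting the path-info and emitting a canonical link) could be combined in PHP itself. What follows is only a minimal sketch: the helper names extraPathInfo and canonicalLinkTag are hypothetical, the request array is simulated rather than read from the real $_SERVER, and it assumes the site is served over https and that SCRIPT_NAME holds the path of the physical script, which is the usual case under Apache but may differ with other server setups.

```php
<?php
// Hypothetical helpers for illustration; the names are not from the answers above.

// Return the additional path after the script name, or '' when there is none.
function extraPathInfo(array $server): string
{
    return $server['PATH_INFO'] ?? '';
}

// Build a canonical <link> tag that points at the real script,
// ignoring any path-info appended to the URL.
// Assumption: the site is served over https and HTTP_HOST is trustworthy.
function canonicalLinkTag(array $server): string
{
    $url = 'https://' . $server['HTTP_HOST'] . $server['SCRIPT_NAME'];
    return '<link rel="canonical" href="' . htmlspecialchars($url, ENT_QUOTES) . '">';
}

// Simulated request for https://example.com/file.php/file2.php
$request = [
    'HTTP_HOST'   => 'example.com',
    'SCRIPT_NAME' => '/file.php',
    'PATH_INFO'   => '/file2.php',
];

if (extraPathInfo($request) !== '') {
    // A real page could instead call http_response_code(404) and exit here.
    echo canonicalLinkTag($request), "\n";
}
```

In a real page you would pass $_SERVER instead of the simulated array and print the tag inside the <head> element. Since HTTP_HOST is supplied by the client, a hard-coded host name is probably the safer choice in production.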