
If I understand correctly, the expression .ht* in the following code matches everything that starts with .ht, so my .ht_lalala is safe.

<Files ".ht*">
    Require all denied
</Files>

But what about the next one?

(^\.ht|~$|back|BACK|backup|BACKUP$)

Is it correct for matching the files .htaccess, back, backup, BACKUP? Or would the next one be better instead?

(^\.ht*|back*|BACK*$)

What I’d like to understand is what ~$ actually means in my code (in RegEx pattern). I don’t know why and when I put it there, but I have it in my code, and now I doubt that it’s correct.


I know the basics of RegEx: what ^ and $ are, and that * means zero or more of the previous token. But ~ doesn’t make sense inside the pattern, unless it’s just a plain character that literally matches ~. I’ve read the Apache docs. I guess FilesMatch and DirectoryMatch are better for multiple matches; however, regular expressions can also be used with the Files and Directory directives by adding the ~ character, as shown in the docs examples.

<Files ~ "\.(gif|jpe?g|png)$">
    #...
</Files>

And well, what I want exactly is to know how to match different files or directories.

One more thing: should I escape the .? The default httpd.conf doesn’t do so. Or is it just different for httpd.conf and .htaccess (which doesn’t make sense to me)?


UPDATE

Answering my own question of how to match any of .ht, .htaccess, .htpasswd, back, BACK, backup, BACKUP with RegEx: first of all, I decided to put a . (dot) at the start of the name of anything I want to hide. Secondly, I found out that the laconic pattern ^(\..*)$ will do the job and give me what I need. Or ^\. even better! So, if in the future I would like to hide something, I just add a . at the start of its name.

Here we go: the next code denies access from the web to any files and directories whose names start with . (tested, works)

RegEx pattern match:

<FilesMatch "^\.">
    Require all denied
</FilesMatch>

<DirectoryMatch "^\.">
    Require all denied
</DirectoryMatch>

And in a brilliant explanation, @MrWhite clarified and simplified my method, so I settled on this (tested, works)

Wild-card string match:

<Files ".*">
    Require all denied
</Files>

<Directory ".*">
    Require all denied
</Directory>

2 Answers


  1. The Apache manual covers this.

    ~ enables regex. Without it, you just get access to wildcards ? and *.

    As far as I know Apache uses the PCRE flavor of regex.

    So once you’ve enabled regex via ~ then use https://regex101.com/r/lPkMHK/1 to test the behavior of the regex you’ve written.

  2. <Files ".ht*">
    

    In this context, .ht* is not a regular expression (regex). It is a "wild-card string", where ? matches any single character, and * matches any sequence of characters. (Whilst this is also a valid regex – a regex would match differently).

    But what about the next one?

    (^\.ht|~$|back|BACK|backup|BACKUP$)
    

    This is a regex (it cannot be used in the <Files> directive as you have written above, without enabling regex pattern matching with the ~ argument – as you have used later.)

    In this regex, ~$ matches any string that ends with a literal ~ (tilde character). This is sometimes used to mark backup files.

    It also matches…

    • Any string that starts with .ht (which naturally includes .htaccess).
    • Any string that contains back or BACK or backup (matching backup is obviously redundant).
    • Any string that ends with BACKUP.

    Consequently, this does not look like it’s doing quite what you think it’s doing.
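
    If the intent was to block .ht* files, editor backups ending in ~, and files named exactly back, BACK, backup or BACKUP, one way to express that is sketched below. The exact-name requirement is an assumption about what you want, not something stated in the question. The inner group is anchored at both ends so that names that merely contain "back" (for example feedback.html) are not caught.

    ```apache
    # Sketch only: assumes exact-name matches are wanted for the backup names.
    #   ^\.ht                          names starting with a literal ".ht"
    #   ~$                             names ending with a tilde
    #   ^(back|BACK|backup|BACKUP)$    exactly one of these four names
    <FilesMatch "(^\.ht|~$|^(back|BACK|backup|BACKUP)$)">
        Require all denied
    </FilesMatch>
    ```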

    Or would the next one be better instead?

    (^\.ht*|back*|BACK*$)
    

    Whilst this is a valid regex, you’ve obviously slipped back into "wild-card" thinking and mixed the two syntaxes. Bear in mind that in regex speak, the * quantifier matches the previous token 0 or more times. It does not match "any characters", as in wild-card pattern matching.

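    This still matches ".htaccess", but only because the pattern is not anchored at the end: ^\.ht* matches the leading ".ht" and the rest of the name is ignored. For example, ^\.ht*$ (with an end-of-string anchor) would not match ".htaccess", since t* can only consume further t characters.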
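
    To make the difference concrete, here is a small sketch of the two anchored forms side by side. The second block is only an illustration of how the wild-card * translates into regex (.*); it is not taken from your configuration.

    ```apache
    # "t*" means "zero or more t", so this only matches names such as
    # ".h", ".ht" or ".htt". It does NOT match ".htaccess".
    <FilesMatch "^\.ht*$">
        Require all denied
    </FilesMatch>

    # ".*" means "any character, zero or more times", which is the regex
    # counterpart of the wild-card "*". This matches ".htaccess",
    # ".htpasswd" and ".ht_lalala".
    <FilesMatch "^\.ht.*$">
        Require all denied
    </FilesMatch>
    ```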

    <Files ~ "\.(gif|jpe?g|png)$">
    

    With the Files directive, the ~ argument enables regex pattern matching. (As you’ve stated.) This is quite different from when ~ is used inside the regex pattern itself.

    One more thing: should I escape the .? The default httpd.conf doesn’t do so. Or is it just different for httpd.conf and .htaccess (which doesn’t make sense to me)?

    I think you’re mixing things up. In your first example, it’s not a regex, it’s a "wild-card" pattern (as stated above). In this context, the . must not be backslash-escaped. It matches a literal . (dot). The . carries no special meaning here. The . should only be escaped if you need to match a literal dot in a regular expression.

    For example, the following are equivalent:

    # Wild-card string match
    <Files ".ht*">
    

    and

    # Regex pattern match
    <Files ~ "^\.ht">
    

    (However, it is preferable to use FilesMatch instead of Files ~ to avoid any confusion. FilesMatch is "newer" syntax.)
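
    As a sketch, the regex example would look like this when written with FilesMatch. The Require all denied line is borrowed from your first example; the dot is escaped because this is a regex.

    ```apache
    # Same rule as <Files ~ "^\.ht">, written with FilesMatch.
    # Matches any filename that starts with a literal ".ht".
    <FilesMatch "^\.ht">
        Require all denied
    </FilesMatch>
    ```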

    There is no difference between httpd.conf and .htaccess in this regard.


    UPDATE:

    I found out that laconic pattern ^(\..*)$ will do the job …

    Here we go: the next code denies access from the web to any files and
    directories whose names start with . (tested, works)

    <FilesMatch "^(\..*)$">
        Require all denied
    </FilesMatch>
    

    This can be simplified. You do not need to literally match the entire filename. You simply need to assert that the filename starts with a dot (and this is much more efficient). Consequently, you do not need to capture (parenthesised subpattern) the filename – you are not doing anything with it.

    To assert that the filename starts with a dot using regex, just use ^\. and nothing more. For example:

    <FilesMatch "^\.">
    

    Bear in mind that regex quantifiers (eg. *) are greedy by default, so you do not need to follow a pattern like .* with an end-of-string anchor when matching a filename. So, the regex ^.*$ and .* are effectively the same in this context. Both match the entire filename. (There are no newline characters in this context.)

    This can be further "simplified" by not using regex at all and using a wild-card string pattern with a vanilla <Files> directive. For example, this is the same as:

    <Files ".*">
    

    NB: This is not a regex. It is a literal dot followed by any number of characters (wild-card syntax).
