Take the following complex URI (or path, what have you).
/directory/subdirectory/flashy-seo-directory/?query=123&complexvar=abc/123etc
Take this simpler one.
/directory/?query=123
What methodology would you use to accurately process the URI to separate the directory from the filename/query/etc?
I know how to do this in simple, expected, and typical cases where everything is formatted "normally" or "favorably", but what I’d like to know is whether the following approach will accurately cover all possible valid directory names/structures/queries/etc. For example, I once saw a URI like this that I don’t quite understand: /directory/index.php/something/?query=123. I’m not even sure what’s going on there.
Methodology (not dependent on any specific programming language, though I am using PHP for this):
1. explode the entire URI by /, placing each bit in a neat array:
$bits = explode( '/', $uri );
2. Loop through each array item and determine(?) at what point we’ve "reached" the portion of the URI that is no longer directory structure.
3. Note which array key is no longer directory structure and implode the prior keys to assemble the directory.
—
My idea for Step 2 was basically to check that there are no query-specific characters (?, &, =). I haven’t seen any directories with .s in them, but as you can see, you can have a query variable such as ?q=abc/123, so simply checking for / wouldn’t work. I’ve seen directories with the ~ symbol, so a simple [A-Za-z0-9-] regex might not work in every scenario. I’m wondering how Step 2 can be done accurately.
This is needed because the URI can include a "virtual directory" that the script may be running under but that doesn’t actually exist anywhere, perhaps via .htaccess for SEO or what have you. It therefore needs to be properly and accurately accounted for in order to have robust and flexible functionality throughout.
2 Answers
If you are only interested in the path part, and there is no host involved, then you only need to split (explode) the string at the first valid URI path delimiter. Valid delimiters:
;
#
?
Result:
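A minimal sketch of that split, assuming a hypothetical helper named split_path() (the name is not from the answer) and using strcspn() to find the first of the delimiters listed above:

```php
<?php
// Hypothetical helper for illustration: split a host-less URI at the first
// occurrence of ';', '#' or '?'.
function split_path(string $uri): array
{
    // strcspn() returns the length of the leading run of characters
    // that contains none of the characters in the mask.
    $len  = strcspn($uri, ';#?');
    $path = substr($uri, 0, $len);
    // The remainder keeps its leading delimiter; the cast guards against
    // older PHP versions returning false for an empty remainder.
    $rest = (string) substr($uri, $len);

    return [$path, $rest];
}

[$path, $rest] = split_path(
    '/directory/subdirectory/flashy-seo-directory/?query=123&complexvar=abc/123etc'
);
// $path is '/directory/subdirectory/flashy-seo-directory/'
// $rest is '?query=123&complexvar=abc/123etc'
```

Because the split happens before anything is exploded on /, the slash inside the query value (abc/123etc) never gets mistaken for a directory separator.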
I suppose you’re looking for parse_url()
https://www.php.net/manual/en/function.parse-url
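As a rough illustration of how that might look for one of the URIs in the question (the variable names here are only for the example), parse_url() can be asked for a single component via the PHP_URL_PATH and PHP_URL_QUERY constants:

```php
<?php
// One of the URIs from the question, with no scheme or host.
$uri = '/directory/index.php/something/?query=123';

// With a component constant, parse_url() returns just that part as a string
// (or null if the component is absent).
$path  = parse_url($uri, PHP_URL_PATH);   // '/directory/index.php/something/'
$query = parse_url($uri, PHP_URL_QUERY);  // 'query=123'
```

Note that the returned query string does not include the leading ?.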