We have our site built using AEM 6.5. The content structure is something like this: /content/site/en.
The issue we are facing is that the content structure is visible in Google search results. For example, if we search for denim, the result we currently get is:
www.site.com/content/site/en/denim.html
The expected result is:
www.site.com/denim.html
What could be the different options to hide the structure from the results? Thanks!
2 Answers
Resource mapping can help you achieve your goal of hiding the content structure from the outside world.
Here is a link that might help you more.
You are looking for URL shortening. This article explains exactly how to do it.
URL shortening: When a content author curates an internal link using the path picker, we want the href of the resulting anchor to be shortened and to have .html appended. The most common way to do this is a LinkRewriter. This, this, and this are different examples of the same implementation. It takes care of changing href=/content/site/en/denim.html to href=/en/denim.html. This covers outgoing links.
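To make the outgoing-link step concrete, here is a minimal sketch of the string transformation a link rewriter would apply to each href. It is plain Java with no AEM or Sling dependencies; the class name LinkShortener, the method shorten, and the /content/site root are assumptions for illustration, not part of any real API. In an actual project this logic would sit inside whatever rewriter component you use.

```java
// Hypothetical helper for illustration only; not an AEM or Sling class.
final class LinkShortener {
    // Assumed content root, taken from the paths in the question.
    private static final String CONTENT_ROOT = "/content/site";

    /** Strips the content root from internal page paths and appends .html if missing. */
    static String shorten(String href) {
        if (href == null || !href.startsWith(CONTENT_ROOT + "/")) {
            return href; // external or non-site link: leave untouched
        }
        String shortened = href.substring(CONTENT_ROOT.length());
        // Append the extension only when the last path segment has none.
        int lastSlash = shortened.lastIndexOf('/');
        if (shortened.indexOf('.', lastSlash) < 0) {
            shortened += ".html";
        }
        return shortened;
    }

    public static void main(String[] args) {
        System.out.println(shorten("/content/site/en/denim"));      // path as picked by an author
        System.out.println(shorten("/content/site/en/denim.html")); // already has an extension
        System.out.println(shorten("https://example.com/page"));    // external link
    }
}
```

The sketch ignores query strings and anchors, which a real rewriter would need to handle.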
Resource resolution for incoming links: Next, we also need the reverse to happen. When someone requests www.site.com/en/denim.html, we want AEM to resolve it to the /content/site/en/denim page. In general there are two ways to do this: 1) Apache rewrite rules, 2) Sling resource mapping. Another possible technique might be CDN edge rules, but I haven't seen that used anywhere.
A typical HTTP request to AEM takes this route: browser -> CDN -> Apache -> AEM publisher. Along this path, we can convert /en/denim.html to /content/site/en/denim at either Apache or AEM.
Apache rewrite rules: When a request reaches Apache, we use the mod_rewrite module to rewrite the incoming URL to a path AEM can resolve. For example, a simple rule
RewriteRule ^/en/(.*) /content/site/en/$1 [PT]
will change /en/denim.html to /content/site/en/denim.html. Refer here. So the URL is already rewritten to a path AEM understands before it reaches the publisher, which can then easily resolve it to a resource and render it.
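If it helps to see what that rule does to a request path, here is a rough Java simulation of the same pattern and substitution. This is only an illustration of the regex, not how mod_rewrite is implemented, and the class name RewriteRuleSketch is made up for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical simulation of: RewriteRule ^/en/(.*) /content/site/en/$1 [PT]
final class RewriteRuleSketch {
    // Same pattern as the Apache rule above.
    private static final Pattern RULE = Pattern.compile("^/en/(.*)");

    /** Returns the rewritten path, or the input unchanged when the rule does not match. */
    static String rewrite(String requestPath) {
        Matcher m = RULE.matcher(requestPath);
        if (!m.find()) {
            return requestPath; // rule does not apply; the path passes through as-is
        }
        // $1 in the Apache substitution corresponds to group(1) here.
        return "/content/site/en/" + m.group(1);
    }

    public static void main(String[] args) {
        System.out.println(rewrite("/en/denim.html"));
        System.out.println(rewrite("/fr/denim.html"));
    }
}
```

Anything outside /en/ is left alone, so each language root would need its own rule or a more general pattern.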
Sling mapping: The second technique is to keep Apache as a dumb cache, pass the request through to the publisher unchanged, and let the publisher resolve it.
Under /etc/map, we define sling:internalRedirect rules. Before AEM begins processing the request, it looks up the Sling mappings, resolves the incoming URL to a valid resource path, and then starts rendering.
Both techniques have pros and cons, but Apache rewrite is preferable because AEM is already busy with rendering work.
Summary: