I have the following function to convert a string into a slug for an SEO-friendly URL:
stringToSlug: function (title) {
    return title.toLowerCase().trim()
        .replace(/\s+/g, '-')      // Replace spaces with -
        .replace(/&/g, '-and-')    // Replace & with 'and'
        .replace(/[^\w-]+/g, '')   // Remove all non-word chars
        .replace(/--+/g, '-');     // Replace multiple - with single -
}
var title1 = 'Maoist Centre adamant on PM or party chair’s post';
function stringToSlug1 (title) {
    return title.toLowerCase().trim()
        .replace(/\s+/g, '-')      // Replace spaces with -
        .replace(/&/g, '-and-')    // Replace & with 'and'
        .replace(/[^\w-]+/g, '')   // Remove all non-word chars
        .replace(/--+/g, '-');     // Replace multiple - with single -
}
console.log(stringToSlug1(title1));
var title2 = 'घर-घरमा ग्यास पाइपः कार्यान्वयनको जिम्मा ओलीकै काँधमा !';
function stringToSlug2 (title) {
    return title.toLowerCase().trim()
        .replace(/\s+/g, '-')      // Replace spaces with -
        .replace(/&/g, '-and-')    // Replace & with 'and'
        .replace(/[^\w-]+/g, '')   // Remove all non-word chars
        .replace(/--+/g, '-');     // Replace multiple - with single -
}
console.log(stringToSlug2(title2));
I have run the function above on titles in two different languages: stringToSlug1 on an English title and stringToSlug2 on a Nepali one. With English text the function works fine, but when the text is in another language it returns only -. The result I want from stringToSlug2 is घर-घरमा-ग्यास-पाइप-कार्यान्वयनको-जिम्मा-ओलीकै-काँधमा
2 Answers
Based on this answer: https://stackoverflow.com/a/18936783/5740382.
I have come up with a solution, though I suspect it is not a good one. Instead of removing all non-word characters with
.replace(/[^\w-]+/g, '')
I replace a list of special characters with a hyphen, using the regex
.replace(/([~!@#$%^&*()_+=`{}\[\]|:;'<>,.\/? ])+/g, '-')
So here is my jQuery function.

Unfortunately, the designers of regular expressions (the ones in JavaScript, anyway) did not think much about internationalization when designing them.
\w only matches a-z, A-Z, 0-9, and _, and so [^\w-]+ means [^a-zA-Z0-9_-]+. Other dialects of regular expressions have a Unicode-enabled word pattern, but your best bet for JavaScript is to have a blacklist of symbols (you mentioned :!#@$$#@^%#^). You can do that with something like [:!#@$$#@^%#^]+ (instead of [^\w-]+).
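
The blacklist idea can be sketched as a complete function. This is only a rough illustration under a few assumptions: the name stringToSlugBlacklist, the exact list of characters, the step that drops quotes, and the final trimming of leading and trailing hyphens are all made up for this example and do not come from the question or the answers.

```javascript
// Sketch only: stringToSlugBlacklist and the exact character list are
// illustrative assumptions, not code taken from the question or answers.
function stringToSlugBlacklist(title) {
  return title
    .toLowerCase()
    .trim()
    .replace(/['’"]/g, '')       // Drop quotes so "chair’s" becomes "chairs"
    .replace(/&/g, '-and-')      // Replace & with 'and'
    // Replace runs of whitespace and listed ASCII symbols with a single -
    .replace(/[\s~!@#$%^*()_+=`{}\[\]|\\:;<>,.\/?]+/g, '-')
    .replace(/-{2,}/g, '-')      // Collapse repeated hyphens
    .replace(/^-+|-+$/g, '');    // Trim hyphens left at either end
}

console.log(stringToSlugBlacklist('Maoist Centre adamant on PM or party chair’s post'));
console.log(stringToSlugBlacklist('घर-घरमा ग्यास पाइपः कार्यान्वयनको जिम्मा ओलीकै काँधमा !'));
```

Because nothing outside the list is removed, the Devanagari letters survive. Note that the ः at the end of पाइपः is itself a Devanagari character rather than an ASCII colon, so this list leaves it in place; it would have to be added to the character class to get exactly the slug asked for in the question.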