QUESTION:
How can I check whether a URL is valid and actually loads a page?
With my current code, only the status code is checked, which means that a URL like http://fsd.com/ will be considered valid although it does not load anything.
How can I check that the URL actually points to a website that can be loaded?
CODE:
$.ajax({
    url: link,
    dataType: 'jsonp',
    statusCode: {
        200: function() {
            console.log("status code 200 returned");
            validURL = true;
        },
        404: function() {
            console.log("status code 404 returned");
            validURL = false;
        }
    },
    error: function() {
        console.log("Error");
    }
});
EDIT: By valid, I mean that the page is at least partially loaded (as in at least the HTML & CSS are loaded) instead of loading forever or somehow failing without the status code being 404.
EDIT2: http://fsd.com actually returns a 404 now as it should…
EDIT3: Another example: https://dsd.com loads an empty page (status code 200) and http://dsd.com actually loads a page with content (status code 200). On my Node.js backend, the npm package “url-exists” indicates that https://dsd.com is invalid, while my frontend with the code shown in my question indicates it is a valid url. This is what the package code looks like: https://github.com/boblauer/url-exists/blob/master/index.js but I wanted to know what would be the best way according to SO users.
EDIT4:
Sadly, the request provided by Addis is apparently blocked by CORS which blocks the execution of the rest of my code while my original request did not.
$.ajax({
    type: "HEAD",
    url: link,
    dataType: 'jsonp',
}).done(function(message, text, response) {
    const size = response.getResponseHeader('Content-Length');
    const status = response.status;
    console.log("SIZE: " + size);
    console.log("STATUS: " + status);
    if (size > 0 && status == "200") {
        $("#submitErrorMessage").css("display", "none");
        $('#directoryForm').submit();
    }
    else {
        $("#submitErrorMessage").css("display", "block");
        $("#submitLoading").css("display", "none");
    }
});
EDIT 5:
To be more precise, both requests trigger a warning message in the browser console indicating that the response has been blocked because of CORS, but my original code is actually executed in its entirety while the other request doesn’t get to the console.log().
EDIT 6:
$.ajax({
    async: true,
    url: link,
    dataType: 'jsonp',
    success: function(data, status, jqxhr) {
        console.log("Response data received: ", data);
        console.log("Response data length: ", data.length);
        console.log("Response status code: ", status);
        if (status == "200" && data.length > 0) {
            $("#submitErrorMessage").css("display", "none");
            $('#directoryForm').submit();
        }
        else {
            $("#submitErrorMessage").css("display", "block");
            $("#submitLoading").css("display", "none");
        }
    },
    error: function(jqXHR, textStatus, errorThrown) {
        console.log("Error: ", errorThrown);
    }
});
Error:
Error: Error: jQuery34108117853955031047_1582059896271 was not called
at Function.error (jquery.js:2)
at e.converters.script json (jquery.js:2)
at jquery.js:2
at l (jquery.js:2)
at HTMLScriptElement.i (jquery.js:2)
at HTMLScriptElement.dispatch (jquery.js:2)
at HTMLScriptElement.v.handle (jquery.js:2)
4 Answers
A successful response without content “should” return a 204: No Content but it doesn’t mean that every developer implements the spec correctly. I guess it really depends on what you consider “valid” to mean for your business case.
Valid = 200 && body has some content?
If so, you can then test this in the success callback.
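As a minimal sketch of that test, assuming "valid" means a 200 status plus a non-blank body: the helper name isValidResponse is hypothetical, and the commented usage assumes a same-origin (or CORS-enabled) request that returns text. Note that in jQuery the second argument of the success callback is a status string such as "success"; the numeric code is on jqXHR.status.

```javascript
// Hypothetical helper: "valid" is assumed to mean HTTP 200 plus a non-blank body.
function isValidResponse(statusCode, body) {
  const hasBody = typeof body === "string" && body.trim().length > 0;
  return statusCode === 200 && hasBody;
}

// Possible use inside a jQuery success callback (not executed here):
// success: function (data, textStatus, jqXHR) {
//   validURL = isValidResponse(jqXHR.status, jqXHR.responseText);
// }
```

A 204 response, or a 200 response whose body is only whitespace, would be rejected by this check.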
I think the word “valid” is used a bit loosely here. Looking at the code snippet, I can see that you are using HTTP status codes to decide whether the URL is valid or not. However, based on the description, it is clear that you consider the resource (pointed to by the URL) to be valid only if it is a web page. I would like to stress that HTTP can be used to access resources that need not have a web page representation.
I think you need to go a bit deeper and retrieve that information (whether it is a web page representation) from the HTTP response that you receive; relying on the status code alone would be misleading. One clear indicator is the response header content-type: text/html.
Sample response from accessing http://www.google.com:
What you are trying to accomplish is not very specific, so I’m not going to give you a code example on how to do this, but here are some pointers.
There are different kinds of response you could get, and the status code is not tied to the body. You could have a 200 response with no data, or a 500 error with some data; that data could be an HTML page showing the error, a JSON object, or even a plain string specifying what went wrong.
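To make that point concrete, here is a rough sketch that looks at the status and the body separately. The function name describeResponse and its "html" / "json-like" / "text" labels are made up for illustration, and the body sniffing is a crude heuristic, not a reliable content detector.

```javascript
// Hypothetical classifier: the status code and the body are inspected independently.
function describeResponse(statusCode, body) {
  const ok = statusCode >= 200 && statusCode < 300;
  const text = typeof body === "string" ? body.trim() : "";
  let kind = "empty";
  if (text.length > 0) {
    if (/^<(!doctype|html)/i.test(text)) {
      kind = "html";       // starts like an HTML document
    } else if (/^[\[{]/.test(text)) {
      kind = "json-like";  // starts like a JSON array or object
    } else {
      kind = "text";       // anything else, e.g. a plain error string
    }
  }
  return { ok: ok, kind: kind };
}
```

With this sketch, a 200 with an empty body gives ok: true with kind "empty", while a 500 carrying an HTML error page gives ok: false with kind "html".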
When you say “actually loads a page”, I guess you are referring to an HTML response. You can check the Content-Type header of the response for text/html, and also check the Content-Length header to see whether the response has any content. Even if you check those things, it is hard to tell whether the HTML actually displays any content.
It really depends on what you are looking for specifically. My suggestion is to check the Content-Type and Content-Length headers; the result also depends on the website, as each one might implement the HTTP protocol differently.
The HEAD request is used to get the meta-information contained in the HTTP headers. The good thing is that the response doesn’t contain the body. It’s pretty speedy, and there shouldn’t be any heavy processing going on in the server to handle it. This makes it handy for quick status checking.
Content-Length is one of the pieces of metadata available in the HEAD response. It gives the size of the body in bytes, so by checking the size alone, without loading the whole page, you can check whether some content is available in the response body.
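A minimal sketch of such a header check follows. It is an illustration only, not the code originally posted with this answer. The helper name headersSuggestPage and its getHeader parameter are hypothetical; getHeader stands for any function that returns a header value or null, such as a wrapper around jqXHR.getResponseHeader. In a browser, a cross-origin response can only be read this way if the target server allows it through CORS.

```javascript
// Hypothetical helper: judges a response from its status and headers alone.
function headersSuggestPage(statusCode, getHeader) {
  const type = (getHeader("Content-Type") || "").toLowerCase();
  const length = parseInt(getHeader("Content-Length"), 10);
  // Ignore parameters such as "; charset=UTF-8" when comparing the media type.
  const isHtml = type.split(";")[0].trim() === "text/html";
  // Content-Length can be absent (e.g. chunked responses): report null for "unknown".
  const hasContent = Number.isNaN(length) ? null : length > 0;
  return { ok: statusCode === 200, isHtml: isHtml, hasContent: hasContent };
}

// Possible use with a jQuery HEAD request (not executed here):
// $.ajax({ type: "HEAD", url: link }).done(function (data, textStatus, jqXHR) {
//   const result = headersSuggestPage(jqXHR.status, function (name) {
//     return jqXHR.getResponseHeader(name);
//   });
// });
```

Returning null for a missing Content-Length keeps "unknown size" distinct from "empty body", since some servers omit the header even when a body would be sent.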
EDIT:
The above code is for dataType of json. For dataType of jsonp, the callback functions for the success and error properties will take care of the response, like the following: