We're using a curl HEAD request in a PHP application to verify the validity of generic links. We check the status code just to make sure that the link the user has entered is valid. Links to all websites have succeeded, except LinkedIn.
While it seems to work locally (Mac), when we attempt the request from any of our Ubuntu servers, LinkedIn returns a 999 status code. Not an API request, just a simple curl like we do for every other link. We've tried on a few different machines and tried altering the user agent, but no dice. How do I modify our curl so that working links return a 200?
A sample HEAD request:
curl -I --url https://www.linkedin.com/company/linkedin
Sample Response on Ubuntu machine:
HTTP/1.1 999 Request denied
Date: Tue, 18 Nov 2014 23:20:48 GMT
Server: ATS
X-Li-Pop: prod-lva1
Content-Length: 956
Content-Type: text/html
To respond to @alexandru-guzinschi a little better. We've tried masking the User Agents. To sum up our trials:
- Mac machine + Mac UA => works
- Mac machine + Windows UA => works
- Ubuntu remote machine + (no UA change) => fails
- Ubuntu remote machine + Mac UA => fails
- Ubuntu remote machine + Windows UA => fails
- Ubuntu local virtual machine (on Mac) + (no UA change) => fails
- Ubuntu local virtual machine (on Mac) + Windows UA => works
- Ubuntu local virtual machine (on Mac) + Mac UA => works
So now I'm thinking they block any curl requests that dont provide an alternate UA and also block hosting providers?
Is there any other way I can check if a link to linkedin is valid or if it will lead to their 404 page, from an Ubuntu machine using PHP?
See Question&Answers more detail:os