Curl from the Command Line

We most frequently use Curl in the form of `libcurl`, a C library that provides functions for transferring data between servers using many popular protocols like HTTP, FTP, SCP, and so on. This library is the foundation of things like the curl_*() functions in PHP, which are useful for writing code that interacts with various web services.

But there is also the Curl command line program, built atop the same library. I find the program useful for debugging and testing certain aspects of web applications, so I wanted to share a list of the things I like to do with Curl, which I hope you will find useful as well.

Headers

To see the headers from a site:

$ curl --head http://example.com

We can use this to make sure any custom headers are being sent properly, and to see things like what cache information the server is sending to browsers. It will also show information like the PHP session ID. Sometimes what the command does not show is just as important, for example when an error in our code prevents necessary headers from being sent.

Cookies

The command above will show cookie info, but if that’s all we’re interested in then we can use this:

$ curl --cookie-jar cookies.txt http://example.com/

We can then inspect the cookies to see if the values are set to what we expect. Or to try out different things we can change the values and then run:

$ curl --cookie cookies.txt http://example.com/

to simulate a request using our new cookie values. By using the option `--junk-session-cookies` in conjunction with the above, we can send all of our modified cookies but without any session information. This has the effect of behaving as if we had closed our browser.
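For instance, to replay our saved cookies while discarding any session cookies (using the same placeholder URL as above):

```shell
# send cookies from cookies.txt, but drop session cookies first,
# as if the browser had been closed and reopened
curl --cookie cookies.txt --junk-session-cookies http://example.com/
```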

Forms

When we want to write a script that deals with submitting a <form>, we can use the --data option to pass in values to the form fields. For example, to test a script where users can post comments to a site:

$ curl --data username='Lobby C Jones' --data email='Lobby@cybersprocket.com' --data message='Nom nom nom' http://localhost/eric/test.php

If the message we wanted to send was really long, we could put it in a text file and then change that particular option to:

--data-urlencode message@input.txt

That is, we can write:

--data-urlencode name@file

to mean the same thing as:

--data name=<contents of file>

This is *not* a file upload; it is simply a way to read contents from a file and use them as a form parameter value. To perform an actual file upload we can use the `--form` option. Let’s say we want to simulate uploading a CSV file to a web application:

$ curl --form doc=@our-data.csv http://probably.dtuser.com/

This would upload our-data.csv as the doc form field. If needed, we can specify the content type:

$ curl --form "photo=@lobby.png;type=image/png" http://lonelysingles.com/photos/shellfish/upload.php

We can use --get to send our data as part of a GET request instead of a POST, although this does not work with --form, since that always uses the content type multipart/form-data. It will, however, cause any --data we send to be appended to the URL as a query string.
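As a quick sketch of what --get does (placeholder URL and field names):

```shell
# the --data pairs are appended to the URL as ?search=curl&page=2
curl --get --data search=curl --data page=2 http://example.com/results.php
```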

Timeouts and Retries

When using Curl in scripts we want to avoid situations where the whole operation might hang, either because the server hangs, or because we are using the script to download something when the network connection is very slow, or because of a solar flare. We can use three options to avoid these problems.

  1. --connect-timeout <N> will wait N seconds for the connection to succeed before bailing. This only affects the connection. Once we successfully initiate communication with the server, there is no time limit. To control that we use…
  2. --max-time <N> which only allows N seconds for the entire operation.
  3. --no-solar-flare avoids all solar flares.
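Putting the first two together in one (hypothetical) download command:

```shell
# fail if the connection takes more than 5 seconds to establish,
# or if the whole transfer takes more than 60 seconds
curl --connect-timeout 5 --max-time 60 http://example.com/report.csv -o report.csv
```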

If we are scripting an operation that could fail then we can tell Curl to retry a number of times by using --retry <N>. If the request fails, Curl will wait one second and then try again. That delay then doubles after every successive failure, maxing out at ten minutes.
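For example, a scripted fetch that tolerates transient failures might look like this (the retry count and URL are arbitrary):

```shell
# retry up to 5 times; curl waits 1 second after the first failure,
# doubling the delay after each subsequent one
curl --retry 5 http://example.com/nightly-backup.tar.gz -o nightly-backup.tar.gz
```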

PUT Requests

We usually don’t deal with web applications that respond to PUT requests (although I think it’s a useful practice). In the cases where we do, we can use Curl to easily test out PUT requests by sending the contents of a file like so:

$ curl -T file.png http://example.com/put/script.php

Or if we wanted to PUT multiple files at once:

$ curl -T "image[1-100].png" http://example.com/put/script.php

This has the effect of PUT-ing the files image1.png, image2.png, and so on up to image100.png.

Other Requests

Besides PUT, there is also DELETE, which again is not commonly encountered. If needed, we can make such requests with Curl like so:

$ curl --request DELETE http://localhost/resource/to/delete/

If we are using Curl to interact with FTP then the request command can be any valid FTP command.

And that’s it for my brain-dump about Curl usage. Everything I’ve shown above can be accomplished in browsers, either out-of-the-box or via various add-ons. But where I like to use Curl is in scripts; in contrast to browsers, Curl makes it easy to create a repeatable series of requests to send to a site, and then I can run simple tests on the results to determine whether or not something worked as expected. If you have any questions about Curl, or anything you like to use it for that hasn’t been covered here, then please share.

More PHP Woes

I’ve been screwing around with the error response codes for CafePress and finally reached a dead end. My only choice if I want to support PHP 4.3 (the required version for WordPress 2.9.2) is to use this wonderful gem that Eric & Chris shared with me last night:

$result = @file_get_contents($url);

That suppresses all warnings and leaves false in the $result variable.

Why does file_get_contents() return an error? Because CafePress, rightfully so, returns a header with a 400 code saying “bad request”. The fact of the matter is my request is bad: I’ve input an invalid section ID, just as a client might do by accident. Rather than barf all over their web pages I’d prefer to catch the error gracefully, tell them about it, and move on.

The problem is that file_get_contents will NOT fetch the content of the URL if the header has an error code. It fetches the header and that’s all. It isn’t until PHP 5.0 that we get context processing on file_get_contents, but it’s not until PHP 5.2 that we actually get the ignore_errors setting that says “hey, fetch the content anyway”. That is what we need. CafePress sends a 400 error, but also tells you more about the reason the request is invalid by sending back XML data describing the specifics of the error in the content. Sorry PHP 4.3 users (meaning any WordPress clients hosted on a virtual host at a low cost provider that hasn’t upgraded their servers in 5 years… there are more of those people than you think).
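For the record, on PHP 5.2+ the ignore_errors approach looks roughly like this (a sketch; the $url variable and the comments about CafePress’s response are illustrative):

```php
// Requires PHP 5.2+; ignore_errors tells the http wrapper to
// return the response body even when the status code is 4xx/5xx.
$context = stream_context_create(array(
    'http' => array('ignore_errors' => true),
));
$result = file_get_contents($url, false, $context);
// $http_response_header now holds the status line and headers,
// so we can detect the 400 and still parse the XML error details in $result.
```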

So what is the other option? file_get_contents won’t return content if you get an error header. Well, what about cURL? Yup, we COULD use that. The problem is you need BOTH the latest version of the cURL libraries AND preferably PHP version 5.0 or higher. It *might* work on PHP 4.3, but that’s not guaranteed. There are a lot of timing and other issues with older libs, as we’ve already discovered, PLUS it adds another restriction: the user must have the cURL libs installed and active in their PHP environment.

So to fix the problem you need PHP 5.2, or PHP 5.0 with cURL.

Or you can just ignore that anything went wrong with the graceful @ “ignore everything that just broke” operator in PHP and spit back a generic “oh crap, something broke, do it differently” error message for the user. What great solutions we have at our disposal. Guess I shouldn’t complain about PHP as much as all those cheap-ass hosting companies that never upgrade their systems for fear of actually having to do some work to earn their revenue.