comparison mupdf-source/thirdparty/curl/docs/INTERNALS.md @ 2:b50eed0cc0ef upstream
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
The directory name has changed: no version number in the expanded directory now.
| author | Franz Glasner <fzglas.hg@dom66.de> |
|---|---|
| date | Mon, 15 Sep 2025 11:43:07 +0200 |
| parents | |
| children | |
comparison
| 1:1d09e1dec1d9 | 2:b50eed0cc0ef |
|---|---|
| 1 curl internals | |
| 2 ============== | |
| 3 | |
| 4 - [Intro](#intro) | |
| 5 - [git](#git) | |
| 6 - [Portability](#Portability) | |
| 7 - [Windows vs Unix](#winvsunix) | |
| 8 - [Library](#Library) | |
| 9 - [`Curl_connect`](#Curl_connect) | |
| 10 - [`multi_do`](#multi_do) | |
| 11 - [`Curl_readwrite`](#Curl_readwrite) | |
| 12 - [`multi_done`](#multi_done) | |
| 13 - [`Curl_disconnect`](#Curl_disconnect) | |
| 14 - [HTTP(S)](#http) | |
| 15 - [FTP](#ftp) | |
| 16 - [Kerberos](#kerberos) | |
| 17 - [TELNET](#telnet) | |
| 18 - [FILE](#file) | |
| 19 - [SMB](#smb) | |
| 20 - [LDAP](#ldap) | |
| 21 - [E-mail](#email) | |
| 22 - [General](#general) | |
| 23 - [Persistent Connections](#persistent) | |
| 24 - [multi interface/non-blocking](#multi) | |
| 25 - [SSL libraries](#ssl) | |
| 26 - [Library Symbols](#symbols) | |
| 27 - [Return Codes and Informationals](#returncodes) | |
| 28 - [API/ABI](#abi) | |
| 29 - [Client](#client) | |
| 30 - [Memory Debugging](#memorydebug) | |
| 31 - [Test Suite](#test) | |
| 32 - [Asynchronous name resolves](#asyncdns) | |
| 33 - [c-ares](#cares) | |
| 34 - [`curl_off_t`](#curl_off_t) | |
| 35 - [curlx](#curlx) | |
| 36 - [Content Encoding](#contentencoding) | |
| 37 - [`hostip.c` explained](#hostip) | |
| 38 - [Track Down Memory Leaks](#memoryleak) | |
| 39 - [`multi_socket`](#multi_socket) | |
| 40 - [Structs in libcurl](#structs) | |
| 41 - [Curl_easy](#Curl_easy) | |
| 42 - [connectdata](#connectdata) | |
| 43 - [Curl_multi](#Curl_multi) | |
| 44 - [Curl_handler](#Curl_handler) | |
| 45 - [conncache](#conncache) | |
| 46 - [Curl_share](#Curl_share) | |
| 47 - [CookieInfo](#CookieInfo) | |
| 48 | |
| 49 <a name="intro"></a> | |
| 50 Intro | |
| 51 ===== | |
| 52 | |
| 53 This project is split in two. The library and the client. The client part | |
| 54 uses the library, but the library is designed to allow other applications to | |
| 55 use it. | |
| 56 | |
| 57 The largest amount of code and complexity is in the library part. | |
| 58 | |
| 59 | |
| 60 <a name="git"></a> | |
| 61 git | |
| 62 === | |
| 63 | |
| 64 All changes to the sources are committed to the git repository as soon as | |
| 65 they're somewhat verified to work. Changes shall be committed as independently | |
| 66 as possible so that individual changes can be easily spotted and tracked | |
| 67 afterwards. | |
| 68 | |
| 69 Tagging shall be used extensively, and by the time we release new archives we | |
| 70 should tag the sources with a name similar to the released version number. | |
| 71 | |
| 72 <a name="Portability"></a> | |
| 73 Portability | |
| 74 =========== | |
| 75 | |
| 76 We write curl and libcurl to compile with C89 compilers on 32-bit and | |
| 77 larger machines. Most of libcurl assumes more or less POSIX compliance but | |
| 78 that's not a requirement. | |
| 79 | |
| 80 We write libcurl to build and work with lots of third party tools, and we | |
| 81 want it to remain functional and buildable with these and later versions | |
| 82 (older versions may still work, but that is not what we work hard to maintain): | |
| 83 | |
| 84 Dependencies | |
| 85 ------------ | |
| 86 | |
| 87 - OpenSSL 0.9.7 | |
| 88 - GnuTLS 2.11.3 | |
| 89 - zlib 1.1.4 | |
| 90 - libssh2 0.16 | |
| 91 - c-ares 1.6.0 | |
| 92 - libidn2 2.0.0 | |
| 93 - wolfSSL 2.0.0 | |
| 94 - openldap 2.0 | |
| 95 - MIT Kerberos 1.2.4 | |
| 96 - GSKit V5R3M0 | |
| 97 - NSS 3.14.x | |
| 98 - PolarSSL 1.3.0 | |
| 99 - Heimdal ? | |
| 100 - nghttp2 1.0.0 | |
| 101 | |
| 102 Operating Systems | |
| 103 ----------------- | |
| 104 | |
| 105 On systems where configure runs, we aim at working on them all - if they have | |
| 106 a suitable C compiler. On systems that don't run configure, we strive to keep | |
| 107 curl running correctly on: | |
| 108 | |
| 109 - Windows 98 | |
| 110 - AS/400 V5R3M0 | |
| 111 - Symbian 9.1 | |
| 112 - Windows CE ? | |
| 113 - TPF ? | |
| 114 | |
| 115 Build tools | |
| 116 ----------- | |
| 117 | |
| 118 When writing code (mostly for generating stuff included in release tarballs) | |
| 119 we use a few "build tools" and we make sure that we remain functional with | |
| 120 these versions: | |
| 121 | |
| 122 - GNU Libtool 1.4.2 | |
| 123 - GNU Autoconf 2.57 | |
| 124 - GNU Automake 1.7 | |
| 125 - GNU M4 1.4 | |
| 126 - perl 5.004 | |
| 127 - roffit 0.5 | |
| 128 - groff ? (any version that supports `groff -Tps -man [in] [out]`) | |
| 129 - ps2pdf (gs) ? | |
| 130 | |
| 131 <a name="winvsunix"></a> | |
| 132 Windows vs Unix | |
| 133 =============== | |
| 134 | |
| 135 There are a few differences in how to program curl the Unix way compared to | |
| 136 the Windows way. Perhaps the four most notable details are: | |
| 137 | |
| 138 1. Different function names for socket operations. | |
| 139 | |
| 140 In curl, this is solved with defines and macros, so that the source looks | |
| 141 the same in all places except for the header file that defines them. The | |
| 142 macros in use are `sclose()`, `sread()` and `swrite()`. | |
| 143 | |
| 144 2. Windows requires a couple of init calls for the socket stuff. | |
| 145 | |
| 146 That's taken care of by the `curl_global_init()` call, but if other libs | |
| 147 also do it etc there might be reasons for applications to alter that | |
| 148 behaviour. | |
| 149 | |
| 150 3. The file descriptors for network communication and file operations are | |
| 151 not as easily interchangeable as in Unix. | |
| 152 | |
| 153 We avoid this by not trying any funny tricks on file descriptors. | |
| 154 | |
| 155 4. When writing data to stdout, Windows makes end-of-lines the DOS way, thus | |
| 156 destroying binary data, although you do want that conversion if it is | |
| 157 text coming through... (sigh) | |
| 158 | |
| 159 We set stdout to binary under Windows. | |
| 160 | |
| 161 Inside the source code, we make an effort to avoid `#ifdef [Your OS]`. All | |
| 162 conditionals that deal with features *should* instead be in the format | |
| 163 `#ifdef HAVE_THAT_WEIRD_FUNCTION`. Since Windows can't run configure scripts, | |
| 164 we maintain a `config-win32.h` file in the lib directory that is supposed to | |
| 165 look exactly like a `curl_config.h` file would have looked on a Windows | |
| 166 machine! | |
| 167 | |
| 168 Generally speaking: always remember that this will be compiled on dozens of | |
| 169 operating systems. Don't walk on the edge! | |
| 170 | |
| 171 <a name="Library"></a> | |
| 172 Library | |
| 173 ======= | |
| 174 | |
| 175 (See [Structs in libcurl](#structs) for the separate section describing all | |
| 176 major internal structs and their purposes.) | |
| 177 | |
| 178 There are plenty of entry points to the library, namely each publicly defined | |
| 179 function that libcurl offers to applications. All of those functions are | |
| 180 rather small and easy-to-follow. All the ones prefixed with `curl_easy` are | |
| 181 put in the `lib/easy.c` file. | |
| 182 | |
| 183 `curl_global_init()` and `curl_global_cleanup()` should be called by the | |
| 184 application to initialize and clean up global stuff in the library. As of | |
| 185 today, it can handle the global SSL initing if SSL is enabled and it can init | |
| 186 the socket layer on Windows machines. libcurl itself has no "global" scope. | |
| 187 | |
| 188 All printf()-style functions use the supplied clones in `lib/mprintf.c`. This | |
| 189 makes sure we stay absolutely platform independent. | |
| 190 | |
| 191 [`curl_easy_init()`][2] allocates an internal struct and makes some | |
| 192 initializations. The returned handle does not reveal internals. This is the | |
| 193 `Curl_easy` struct which works as an "anchor" struct for all `curl_easy` | |
| 194 functions. All connections performed will get connect-specific data allocated | |
| 195 that should be used for things related to particular connections/requests. | |
| 196 | |
| 197 [`curl_easy_setopt()`][1] takes three arguments, where the option stuff must | |
| 198 be passed in pairs: the parameter-ID and the parameter-value. The list of | |
| 199 options is documented in the man page. This function mainly sets things in | |
| 200 the `Curl_easy` struct. | |
| 201 | |
| 202 `curl_easy_perform()` is just a wrapper function that makes use of the multi | |
| 203 API. It basically calls `curl_multi_init()`, `curl_multi_add_handle()`, | |
| 204 `curl_multi_wait()`, and `curl_multi_perform()` until the transfer is done | |
| 205 and then returns. | |
| 206 | |
| 207 Some of the most important key functions in `url.c` are called from | |
| 208 `multi.c` when certain key steps are to be made in the transfer operation. | |
| 209 | |
| 210 <a name="Curl_connect"></a> | |
| 211 Curl_connect() | |
| 212 -------------- | |
| 213 | |
| 214 Analyzes the URL, separates the different components and connects to the | |
| 215 remote host. This may involve using a proxy and/or using SSL. The | |
| 216 `Curl_resolv()` function in `lib/hostip.c` is used for looking up host | |
| 217 names (it does then use the proper underlying method, which may vary | |
| 218 between platforms and builds). | |
| 219 | |
| 220 When `Curl_connect` is done, we are connected to the remote site. Then it | |
| 221 is time to tell the server to get a document/file. `multi_do()` arranges | |
| 222 this. | |
| 223 | |
| 224 This function makes sure there's an allocated and initiated `connectdata` | |
| 225 struct that is used for this particular connection only (although there may | |
| 226 be several requests performed on the same connect). A bunch of things are | |
| 227 inited/inherited from the `Curl_easy` struct. | |
| 228 | |
| 229 <a name="multi_do"></a> | |
| 230 multi_do() | |
| 231 --------- | |
| 232 | |
| 233 `multi_do()` makes sure the proper protocol-specific function is called. | |
| 234 The functions are named after the protocols they handle. | |
| 235 | |
| 236 The protocol-specific functions of course deal with protocol-specific | |
| 237 negotiations and setup. They have access to the `Curl_sendf()` (from | |
| 238 `lib/sendf.c`) function to send printf-style formatted data to the remote | |
| 239 host and when they're ready to make the actual file transfer they call the | |
| 240 `Curl_setup_transfer()` function (in `lib/transfer.c`) to set up the | |
| 241 transfer and return. | |
| 242 | |
| 243 If this DO function fails and the connection is being re-used, libcurl will | |
| 244 then close this connection, set up a new connection and re-issue the DO | |
| 245 request on that. This is because there is no way to be perfectly sure that | |
| 246 we have discovered a dead connection before the DO function and thus we | |
| 247 might wrongly be re-using a connection that was closed by the remote peer. | |
| 248 | |
| 249 <a name="Curl_readwrite"></a> | |
| 250 Curl_readwrite() | |
| 251 ---------------- | |
| 252 | |
| 253 Called during the transfer of the actual protocol payload. | |
| 254 | |
| 255 During transfer, the progress functions in `lib/progress.c` are called at | |
| 256 frequent intervals (or at the user's choice, a specified callback might get | |
| 257 called). The speedcheck functions in `lib/speedcheck.c` are also used to | |
| 258 verify that the transfer is as fast as required. | |
| 259 | |
| 260 <a name="multi_done"></a> | |
| 261 multi_done() | |
| 262 ----------- | |
| 263 | |
| 264 Called after a transfer is done. This function takes care of everything | |
| 265 that has to be done after a transfer. This function attempts to leave | |
| 266 matters in a state so that `multi_do()` should be possible to call again on | |
| 267 the same connection (in a persistent connection case). It might also soon | |
| 268 be closed with `Curl_disconnect()`. | |
| 269 | |
| 270 <a name="Curl_disconnect"></a> | |
| 271 Curl_disconnect() | |
| 272 ----------------- | |
| 273 | |
| 274 When doing normal connections and transfers, no one ever tries to close any | |
| 275 connections so this is not normally called when `curl_easy_perform()` is | |
| 276 used. This function is only used when we are certain that no more transfers | |
| 277 are going to be made on the connection. It can also be called to close a | |
| 278 connection by force, or to make sure that libcurl doesn't keep too many | |
| 279 connections alive at the same time. | |
| 280 | |
| 281 This function cleans up all resources that are associated with a single | |
| 282 connection. | |
| 283 | |
| 284 <a name="http"></a> | |
| 285 HTTP(S) | |
| 286 ======= | |
| 287 | |
| 288 HTTP offers a lot and is the protocol in curl that uses the most lines of | |
| 289 code. There is a special file `lib/formdata.c` that offers all the | |
| 290 multipart post functions. | |
| 291 | |
| 292 base64-functions for user+password stuff (and more) are in `lib/base64.c` | |
| 293 and all functions for parsing and sending cookies are found in | |
| 294 `lib/cookie.c`. | |
| 295 | |
| 296 HTTPS uses in almost every case the same procedure as HTTP, with only two | |
| 297 exceptions: the connect procedure is different and the function used to read | |
| 298 or write from the socket is different, although the latter fact is hidden in | |
| 299 the source by the use of `Curl_read()` for reading and `Curl_write()` for | |
| 300 writing data to the remote server. | |
| 301 | |
| 302 `http_chunks.c` contains functions that understand HTTP 1.1 chunked transfer | |
| 303 encoding. | |
| 304 | |
| 305 An interesting detail with the HTTP(S) request is the `Curl_add_buffer()` | |
| 306 series of functions we use. They append data to one single buffer, and when | |
| 307 the building is finished the entire request is sent off in one single write. | |
| 308 This is done this way to overcome problems with flawed firewalls and lame | |
| 309 servers. | |
| 310 | |
| 311 <a name="ftp"></a> | |
| 312 FTP | |
| 313 === | |
| 314 | |
| 315 The `Curl_if2ip()` function can be used for getting the IP number of a | |
| 316 specified network interface, and it resides in `lib/if2ip.c`. | |
| 317 | |
| 318 `Curl_ftpsendf()` is used for sending FTP commands to the remote server. It | |
| 319 was made a separate function to prevent us programmers from forgetting that | |
| 320 they must be CRLF terminated. They must also be sent in one single `write()` | |
| 321 to make firewalls and similar happy. | |
| 322 | |
| 323 <a name="kerberos"></a> | |
| 324 Kerberos | |
| 325 ======== | |
| 326 | |
| 327 Kerberos support is mainly in `lib/krb5.c` and `lib/security.c` but also | |
| 328 `curl_sasl_sspi.c` and `curl_sasl_gssapi.c` for the email protocols and | |
| 329 `socks_gssapi.c` and `socks_sspi.c` for SOCKS5 proxy specifics. | |
| 330 | |
| 331 <a name="telnet"></a> | |
| 332 TELNET | |
| 333 ====== | |
| 334 | |
| 335 Telnet is implemented in `lib/telnet.c`. | |
| 336 | |
| 337 <a name="file"></a> | |
| 338 FILE | |
| 339 ==== | |
| 340 | |
| 341 The `file://` protocol is dealt with in `lib/file.c`. | |
| 342 | |
| 343 <a name="smb"></a> | |
| 344 SMB | |
| 345 === | |
| 346 | |
| 347 The `smb://` protocol is dealt with in `lib/smb.c`. | |
| 348 | |
| 349 <a name="ldap"></a> | |
| 350 LDAP | |
| 351 ==== | |
| 352 | |
| 353 Everything LDAP is in `lib/ldap.c` and `lib/openldap.c`. | |
| 354 | |
| 355 <a name="email"></a> | |
| 356 E-mail | |
| 357 ====== | |
| 358 | |
| 359 The e-mail related source code is in `lib/imap.c`, `lib/pop3.c` and | |
| 360 `lib/smtp.c`. | |
| 361 | |
| 362 <a name="general"></a> | |
| 363 General | |
| 364 ======= | |
| 365 | |
| 366 URL encoding and decoding, called escaping and unescaping in the source code, | |
| 367 is found in `lib/escape.c`. | |
| 368 | |
| 369 While transferring data in `Transfer()` a few functions might get used. | |
| 370 `curl_getdate()` in `lib/parsedate.c` is for HTTP date comparisons (and | |
| 371 more). | |
| 372 | |
| 373 `lib/getenv.c` offers `curl_getenv()` which is for reading environment | |
| 374 variables in a neat platform independent way. That's used in the client, but | |
| 375 also in `lib/url.c` when checking the proxy environment variables. Note that | |
| 376 contrary to the normal Unix `getenv()`, this returns an allocated buffer that | |
| 377 must be `free()`ed after use. | |
| 378 | |
| 379 `lib/netrc.c` holds the `.netrc` parser. | |
| 380 | |
| 381 `lib/timeval.c` features replacement functions for systems that don't have | |
| 382 `gettimeofday()` and a few support functions for timeval conversions. | |
| 383 | |
| 384 A function named `curl_version()` that returns the full curl version string | |
| 385 is found in `lib/version.c`. | |
| 386 | |
| 387 <a name="persistent"></a> | |
| 388 Persistent Connections | |
| 389 ====================== | |
| 390 | |
| 391 The persistent connection support in libcurl requires some considerations on | |
| 392 how to do things inside of the library. | |
| 393 | |
| 394 - The `Curl_easy` struct returned in the [`curl_easy_init()`][2] call | |
| 395 must never hold connection-oriented data. It is meant to hold the root data | |
| 396 as well as all the options etc that the library-user may choose. | |
| 397 | |
| 398 - The `Curl_easy` struct holds the "connection cache" (an array of | |
| 399 pointers to `connectdata` structs). | |
| 400 | |
| 401 - This enables the 'curl handle' to be reused on subsequent transfers. | |
| 402 | |
| 403 - When libcurl is told to perform a transfer, it first checks for an already | |
| 404 existing connection in the cache that we can use. Otherwise it creates a | |
| 405 new one and adds that to the cache. If the cache is full already when a new | |
| 406 connection is added, it will first close the oldest unused one. | |
| 407 | |
| 408 - When the transfer operation is complete, the connection is left | |
| 409 open. Particular options may tell libcurl not to, and protocols may signal | |
| 410 closure on connections and then they won't be kept open, of course. | |
| 411 | |
| 412 - When `curl_easy_cleanup()` is called, we close all still opened connections, | |
| 413 unless of course the multi interface "owns" the connections. | |
| 414 | |
| 415 The curl handle must be re-used in order for the persistent connections to | |
| 416 work. | |
| 417 | |
| 418 <a name="multi"></a> | |
| 419 multi interface/non-blocking | |
| 420 ============================ | |
| 421 | |
| 422 The multi interface is a non-blocking interface to the library. To make that | |
| 423 interface work as well as possible, low-level functions within libcurl must | |
| 424 not be written to work in a blocking manner. (There are still a few spots | |
| 425 violating this rule.) | |
| 426 | |
| 427 One of the primary reasons we introduced c-ares support was to allow the name | |
| 428 resolve phase to be perfectly non-blocking as well. | |
| 429 | |
| 430 The FTP and the SFTP/SCP protocols are examples of how we adapt and adjust | |
| 431 the code to allow non-blocking operations even on multi-stage command- | |
| 432 response protocols. They are built around state machines that return when | |
| 433 they would otherwise block waiting for data. The DICT, LDAP and TELNET | |
| 434 protocols are crappy examples and they are subject to rewrite in the future | |
| 435 to better fit the libcurl protocol family. | |
| 436 | |
| 437 <a name="ssl"></a> | |
| 438 SSL libraries | |
| 439 ============= | |
| 440 | |
| 441 Originally libcurl supported SSLeay for SSL/TLS transports. That support was | |
| 442 then extended to its successor OpenSSL and has since been extended to several | |
| 443 other SSL/TLS libraries, and we expect and hope to further extend the support | |
| 444 in future libcurl versions. | |
| 445 | |
| 446 To deal with this internally in the best way possible, we have a generic SSL | |
| 447 function API as provided by the `vtls/vtls.[ch]` system, and they are the only | |
| 448 SSL functions we must use from within libcurl. vtls is then crafted to use | |
| 449 the appropriate lower-level function calls to whatever SSL library that is in | |
| 450 use. For example `vtls/openssl.[ch]` for the OpenSSL library. | |
| 451 | |
| 452 <a name="symbols"></a> | |
| 453 Library Symbols | |
| 454 =============== | |
| 455 | |
| 456 All symbols used internally in libcurl must use a `Curl_` prefix if they're | |
| 457 used in more than a single file. Single-file symbols must be made static. | |
| 458 Public ("exported") symbols must use a `curl_` prefix. (There are exceptions, | |
| 459 but they are to be changed to follow this pattern in future versions.) Public | |
| 460 API functions are marked with `CURL_EXTERN` in the public header files so | |
| 461 that all others can be hidden on platforms where this is possible. | |
| 462 | |
| 463 <a name="returncodes"></a> | |
| 464 Return Codes and Informationals | |
| 465 =============================== | |
| 466 | |
| 467 I've made things simple. Almost every function in libcurl returns a CURLcode, | |
| 468 that must be `CURLE_OK` if everything is OK or otherwise a suitable error | |
| 469 code as the `curl/curl.h` include file defines. The very spot that detects an | |
| 470 error must use the `Curl_failf()` function to set the human-readable error | |
| 471 description. | |
| 472 | |
| 473 In aiding the user to understand what's happening and to debug curl usage, we | |
| 474 must supply a fair number of informational messages by using the | |
| 475 `Curl_infof()` function. Those messages are only displayed when the user | |
| 476 explicitly asks for them. They are best used when revealing information that | |
| 477 isn't otherwise obvious. | |
| 478 | |
| 479 <a name="abi"></a> | |
| 480 API/ABI | |
| 481 ======= | |
| 482 | |
| 483 We make an effort to not export or show internals or how internals work, as | |
| 484 that makes it easier to keep a solid API/ABI over time. See docs/libcurl/ABI | |
| 485 for our promise to users. | |
| 486 | |
| 487 <a name="client"></a> | |
| 488 Client | |
| 489 ====== | |
| 490 | |
| 491 `main()` resides in `src/tool_main.c`. | |
| 492 | |
| 493 `src/tool_hugehelp.c` is automatically generated by the `mkhelp.pl` perl | |
| 494 script to display the complete "manual" and the `src/tool_urlglob.c` file | |
| 495 holds the functions used for the URL-"globbing" support. Globbing in the | |
| 496 sense that the `{}` and `[]` expansion stuff is there. | |
| 497 | |
| 498 The client mostly sets up its `config` struct properly, then | |
| 499 it calls the `curl_easy_*()` functions of the library and when it gets back | |
| 500 control after the `curl_easy_perform()` it cleans up the library, checks | |
| 501 status and exits. | |
| 502 | |
| 503 When the operation is done, the `ourWriteOut()` function in `src/writeout.c` | |
| 504 may be called to report about the operation. That function is using the | |
| 505 `curl_easy_getinfo()` function to extract useful information from the curl | |
| 506 session. | |
| 507 | |
| 508 It may loop and do all this several times if many URLs were specified on the | |
| 509 command line or config file. | |
| 510 | |
| 511 <a name="memorydebug"></a> | |
| 512 Memory Debugging | |
| 513 ================ | |
| 514 | |
| 515 The file `lib/memdebug.c` contains debug-versions of a few functions. | |
| 516 Functions such as `malloc()`, `free()`, `fopen()`, `fclose()`, etc that | |
| 517 somehow deal with resources that might give us problems if we "leak" them. | |
| 518 The functions in the memdebug system do nothing fancy, they do their normal | |
| 519 function and then log information about what they just did. The logged data | |
| 520 can then be analyzed after a complete session. | |
| 521 | |
| 522 `memanalyze.pl` is the perl script present in `tests/` that analyzes a log | |
| 523 file generated by the memory tracking system. It detects if resources are | |
| 524 allocated but never freed and other kinds of errors related to resource | |
| 525 management. | |
| 526 | |
| 527 Internally, definition of preprocessor symbol `DEBUGBUILD` restricts code | |
| 528 which is only compiled for debug enabled builds. And symbol `CURLDEBUG` is | |
| 529 used to differentiate code which is _only_ used for memory | |
| 530 tracking/debugging. | |
| 531 | |
| 532 Use `-DCURLDEBUG` when compiling to enable memory debugging; this is also | |
| 533 switched on by running configure with `--enable-curldebug`. Use | |
| 534 `-DDEBUGBUILD` when compiling to enable a debug build or run configure with | |
| 535 `--enable-debug`. | |
| 536 | |
| 537 `curl --version` will list 'Debug' feature for debug enabled builds, and | |
| 538 will list 'TrackMemory' feature for curl debug memory tracking capable | |
| 539 builds. These features are independent and can be controlled when running | |
| 540 the configure script. When `--enable-debug` is given both features will be | |
| 541 enabled, unless some restriction prevents memory tracking from being used. | |
| 542 | |
| 543 <a name="test"></a> | |
| 544 Test Suite | |
| 545 ========== | |
| 546 | |
| 547 The test suite is placed in its own subdirectory directly off the root in the | |
| 548 curl archive tree, and it contains a bunch of scripts and a lot of test case | |
| 549 data. | |
| 550 | |
| 551 The main test script is `runtests.pl` that will invoke test servers like | |
| 552 `httpserver.pl` and `ftpserver.pl` before all the test cases are performed. | |
| 553 The test suite currently only runs on Unix-like platforms. | |
| 554 | |
| 555 You'll find a description of the test suite in the `tests/README` file, and | |
| 556 of the test case data file format in the `tests/FILEFORMAT` file. | |
| 557 | |
| 558 The test suite automatically detects if curl was built with the memory | |
| 559 debugging enabled, and if it was, it will detect memory leaks, too. | |
| 560 | |
| 561 <a name="asyncdns"></a> | |
| 562 Asynchronous name resolves | |
| 563 ========================== | |
| 564 | |
| 565 libcurl can be built to do name resolves asynchronously, using either the | |
| 566 normal resolver in a threaded manner or by using c-ares. | |
| 567 | |
| 568 <a name="cares"></a> | |
| 569 [c-ares][3] | |
| 570 ------ | |
| 571 | |
| 572 ### Build libcurl to use c-ares | |
| 573 | |
| 574 1. `./configure --enable-ares=/path/to/ares/install` | |
| 575 2. `make` | |
| 576 | |
| 577 ### c-ares on win32 | |
| 578 | |
| 579 First I compiled c-ares. I changed the default C runtime library to be the | |
| 580 single-threaded rather than the multi-threaded (this seems to be required to | |
| 581 prevent linking errors later on). Then I simply built the areslib project | |
| 582 (the other projects adig/ahost seem to fail under MSVC). | |
| 583 | |
| 584 Next was libcurl. I opened `lib/config-win32.h` and I added a: | |
| 585 `#define USE_ARES 1` | |
| 586 | |
| 587 Next thing I did was I added the path for the ares includes to the include | |
| 588 path, and the libares.lib to the libraries. | |
| 589 | |
| 590 Lastly, I also changed libcurl to be single-threaded rather than | |
| 591 multi-threaded, again this was to prevent some duplicate symbol errors. I'm | |
| 592 not sure why I needed to change everything to single-threaded, but when I | |
| 593 didn't I got redefinition errors for several CRT functions (`malloc()`, | |
| 594 `stricmp()`, etc.) | |
| 595 | |
| 596 <a name="curl_off_t"></a> | |
| 597 `curl_off_t` | |
| 598 ========== | |
| 599 | |
| 600 `curl_off_t` is a data type provided by the external libcurl include | |
| 601 headers. It is the type meant to be used for the [`curl_easy_setopt()`][1] | |
| 602 options that end with LARGE. The type is 64-bit large on most modern | |
| 603 platforms. | |
| 604 | |
| 605 <a name="curlx"></a> | |
| 606 curlx | |
| 607 ===== | |
| 608 | |
| 609 The libcurl source code offers a few functions by source only. They are not | |
| 610 part of the official libcurl API, but the source files might be useful for | |
| 611 others so apps can optionally compile/build with these sources to gain | |
| 612 additional functions. | |
| 613 | |
| 614 We provide them through a single header file for easy access for apps: | |
| 615 `curlx.h` | |
| 616 | |
| 617 `curlx_strtoofft()` | |
| 618 ------------------- | |
| 619 A macro that converts a string containing a number to a `curl_off_t` number. | |
| 620 This might use the `curlx_strtoll()` function which is provided as source | |
| 621 code in strtoofft.c. Note that the function is only provided if no | |
| 622 `strtoll()` (or equivalent) function exists on your platform. If `curl_off_t` | |
| 623 is only a 32-bit number on your platform, this macro uses `strtol()`. | |
| 624 | |
| 625 Future | |
| 626 ------ | |
| 627 | |
| 628 Several functions will be removed from the public `curl_` name space in a | |
| 629 future libcurl release. They will then only become available as `curlx_` | |
| 630 functions instead. To make the transition easier, we already today provide | |
| 631 these functions with the `curlx_` prefix to allow sources to be built | |
| 632 properly with the new function names. The concerned functions are: | |
| 633 | |
| 634 - `curlx_getenv` | |
| 635 - `curlx_strequal` | |
| 636 - `curlx_strnequal` | |
| 637 - `curlx_mvsnprintf` | |
| 638 - `curlx_msnprintf` | |
| 639 - `curlx_maprintf` | |
| 640 - `curlx_mvaprintf` | |
| 641 - `curlx_msprintf` | |
| 642 - `curlx_mprintf` | |
| 643 - `curlx_mfprintf` | |
| 644 - `curlx_mvsprintf` | |
| 645 - `curlx_mvprintf` | |
| 646 - `curlx_mvfprintf` | |
| 647 | |
| 648 <a name="contentencoding"></a> | |
| 649 Content Encoding | |
| 650 ================ | |
| 651 | |
| 652 ## About content encodings | |
| 653 | |
| 654 [HTTP/1.1][4] specifies that a client may request that a server encode its | |
| 655 response. This is usually used to compress a response using one (or more) | |
| 656 encodings from a set of commonly available compression techniques. These | |
| 657 schemes include `deflate` (the zlib algorithm), `gzip`, `br` (brotli) and | |
| 658 `compress`. A client requests that the server perform an encoding by including | |
| 659 an `Accept-Encoding` header in the request document. The value of the header | |
| 660 should be one of the recognized tokens `deflate`, ... (there's a way to | |
| 661 register new schemes/tokens, see sec 3.5 of the spec). A server MAY honor | |
| 662 the client's encoding request. When a response is encoded, the server | |
| 663 includes a `Content-Encoding` header in the response. The value of the | |
| 664 `Content-Encoding` header indicates which encodings were used to encode the | |
| 665 data, in the order in which they were applied. | |
| 666 | |
| 667 It's also possible for a client to attach priorities to different schemes so | |
| 668 that the server knows which it prefers. See sec 14.3 of RFC 2616 for more | |
| 669 information on the `Accept-Encoding` header. See sec | |
| 670 [3.1.2.2 of RFC 7231][15] for more information on the `Content-Encoding` | |
| 671 header. | |
| 672 | |
| 673 ## Supported content encodings | |
| 674 | |
| 675 The `deflate`, `gzip` and `br` content encodings are supported by libcurl. | |
| 676 Both regular and chunked transfers work fine. The zlib library is required | |
| 677 for the `deflate` and `gzip` encodings, while the brotli decoding library is | |
| 678 for the `br` encoding. | |
| 679 | |
| 680 ## The libcurl interface | |
| 681 | |
| 682 To cause libcurl to request a content encoding use: | |
| 683 | |
| 684 [`curl_easy_setopt`][1](curl, [`CURLOPT_ACCEPT_ENCODING`][5], string) | |
| 685 | |
| 686 where string is the intended value of the `Accept-Encoding` header. | |
| 687 | |
| 688 Currently, libcurl supports multiple encodings but only | |
| 689 understands how to process responses that use the `deflate`, `gzip` and/or | |
| 690 `br` content encodings, so the only values for [`CURLOPT_ACCEPT_ENCODING`][5] | |
| 691 that will work (besides `identity`, which does nothing) are `deflate`, | |
| 692 `gzip` and `br`. If a response is encoded using `compress` or another | |
| 693 unsupported method, libcurl will return an error indicating that the response could | |
| 694 not be decoded. If `<string>` is NULL no `Accept-Encoding` header is | |
| 695 generated. If `<string>` is a zero-length string, then an `Accept-Encoding` | |
| 696 header containing all supported encodings will be generated. | |
| 697 | |
| 698 The [`CURLOPT_ACCEPT_ENCODING`][5] option must be set to any non-NULL value for | |
| 699 content to be automatically decoded. If it is not set and the server still | |
| 700 sends encoded content (despite not having been asked), the data is returned | |
| 701 in its raw form and the `Content-Encoding` type is not checked. | |
| 702 | |
| 703 ## The curl interface | |
| 704 | |
| 705 Use the [`--compressed`][6] option with curl to cause it to ask servers to | |
| 706 compress responses using any format supported by curl. | |
| 707 | |
| 708 <a name="hostip"></a> | |
| 709 `hostip.c` explained | |
| 710 ==================== | |
| 711 | |
| 712 The main compile-time defines to keep in mind when reading the `host*.c` | |
| 713 source files are these: | |
| 714 | |
| 715 ## `CURLRES_IPV6` | |
| 716 | |
| 717 this host has `getaddrinfo()` and family, and thus we use that. The host may | |
| 718 not be able to resolve IPv6, but we don't really have to take that into | |
| 719 account. Hosts that aren't IPv6-enabled have `CURLRES_IPV4` defined. | |
| 720 | |
| 721 ## `CURLRES_ARES` | |
| 722 | |
| 723 is defined if libcurl is built to use c-ares for asynchronous name | |
| 724 resolves. This can be Windows or \*nix. | |
| 725 | |
| 726 ## `CURLRES_THREADED` | |
| 727 | |
| 728 is defined if libcurl is built to use threading for asynchronous name | |
| 729 resolves. The name resolve will be done in a new thread, and the supported | |
| 730 asynch API will be the same as for ares-builds. This is the default under | |
| 731 (native) Windows. | |
| 732 | |
| 733 If either of the two previous is defined, `CURLRES_ASYNCH` is defined too. If | |
| 734 libcurl is not built to use an asynchronous resolver, `CURLRES_SYNCH` is | |
| 735 defined. | |
| 736 | |
| 737 ## `host*.c` sources | |
| 738 | |
| 739 The `host*.c` source files are split up like this: | |
| 740 | |
| 741 - `hostip.c` - method-independent resolver functions and utility functions | |
| 742 - `hostasyn.c` - functions for asynchronous name resolves | |
| 743 - `hostsyn.c` - functions for synchronous name resolves | |
| 744 - `asyn-ares.c` - functions for asynchronous name resolves using c-ares | |
| 745 - `asyn-thread.c` - functions for asynchronous name resolves using threads | |
| 746 - `hostip4.c` - IPv4 specific functions | |
| 747 - `hostip6.c` - IPv6 specific functions | |
| 748 | |
| 749 `hostip.h` is the single unified header file for all this. It defines the | |
| 750 `CURLRES_*` defines based on the `config*.h` and `curl_setup.h` defines. | |
| 751 | |
| 752 <a name="memoryleak"></a> | |
| 753 Track Down Memory Leaks | |
| 754 ======================= | |
| 755 | |
| 756 ## Single-threaded | |
| 757 | |
| 758 Please note that this memory leak system is not adjusted to work in more | |
| 759 than one thread. If you want/need to use it in a multi-threaded app, please | |
| 760 adjust accordingly. | |
| 761 | |
| 762 ## Build | |
| 763 | |
| 764 Rebuild libcurl with `-DCURLDEBUG` (usually, rerunning configure with | |
| 765 `--enable-debug` fixes this). `make clean` first, then `make` so that all | |
| 766 files are actually rebuilt properly. It will also make sense to build | |
| 767 libcurl with the debug option (usually `-g` to the compiler) so that | |
| 768 debugging it will be easier if you actually do find a leak in the library. | |
| 769 | |
| 770 This will create a library that has memory debugging enabled. | |
| 771 | |
| 772 ## Modify Your Application | |
| 773 | |
| 774 Add a line in your application code: | |
| 775 | |
| 776 `curl_dbg_memdebug("dump");` | |
| 777 | |
| 778 This will make the malloc debug system output a full trace of all | |
| 779 resource-using functions to the given file name. Make sure you rebuild your program | |
| 780 and that you link with the same libcurl you built for this purpose as | |
| 781 described above. | |
| 782 | |
| 783 ## Run Your Application | |
| 784 | |
| 785 Run your program as usual. Watch the specified memory trace file grow. | |
| 786 | |
| 787 Make your program exit and use the proper libcurl cleanup functions etc., so | |
| 788 that all non-leaks are returned/freed properly. | |
| 789 | |
| 790 ## Analyze the Flow | |
| 791 | |
| 792 Use the `tests/memanalyze.pl` perl script to analyze the dump file: | |
| 793 | |
| 794 tests/memanalyze.pl dump | |
| 795 | |
| 796 This now outputs a report on which resources were allocated but never | |
| 797 freed etc. This report is very fine for posting to the list! | |
| 798 | |
| 799 If this doesn't produce any output, no leak was detected in libcurl. Then | |
| 800 the leak is most likely to be in your code. | |
| 801 | |
| 802 <a name="multi_socket"></a> | |
| 803 `multi_socket` | |
| 804 ============== | |
| 805 | |
| 806 Implementation of the `curl_multi_socket` API | |
| 807 | |
| 808 The main ideas of this API are simply: | |
| 809 | |
| 810 1. The application can use whatever event system it likes as it gets info | |
| 811 from libcurl about what file descriptors libcurl waits for what action | |
| 812 on. (The previous API returns `fd_sets` which is very | |
| 813 `select()`-centric). | |
| 814 | |
| 815 2. When the application discovers action on a single socket, it calls | |
| 816 libcurl and informs that there was action on this particular socket and | |
| 817 libcurl can then act on that socket/transfer only and not care about | |
| 818 any other transfers. (The previous API always had to scan through all | |
| 819 the existing transfers.) | |
| 820 | |
| 821 The idea is that [`curl_multi_socket_action()`][7] calls a given callback | |
| 822 with information about what socket to wait for what action on, and the | |
| 823 callback only gets called if the status of that socket has changed. | |
| 824 | |
| 825 We also added a timer callback that makes libcurl call the application when | |
| 826 the timeout value changes, and you set that with [`curl_multi_setopt()`][9] | |
| 827 and the [`CURLMOPT_TIMERFUNCTION`][10] option. To get this to work, | |
| 828 internally, there's an added struct to each easy handle in which we store | |
| 829 an "expire time" (if any). The structs are then "splay sorted" so that we | |
| 830 can add and remove times from the linked list and yet somewhat swiftly | |
| 831 figure out both how long there is until the next nearest timer expires | |
| 832 and which timer (handle) we should take care of now. Of course, the upside | |
| 833 of all this is that we get a [`curl_multi_timeout()`][8] that should also | |
| 834 work with old-style applications that use [`curl_multi_perform()`][11]. | |
| 835 | |
| 836 We created an internal "socket to easy handles" hash table that given | |
| 837 a socket (file descriptor) returns the easy handle that waits for action on | |
| 838 that socket. This hash is made using the already existing hash code | |
| 839 (previously only used for the DNS cache). | |
| 840 | |
| 841 To make libcurl able to report plain sockets in the socket callback, we had | |
| 842 to re-organize the internals of the [`curl_multi_fdset()`][12] etc so that | |
| 843 the conversion from sockets to `fd_sets` for that function is only done in | |
| 844 the last step before the data is returned. I also had to extend c-ares to | |
| 845 get a function that can return plain sockets, as that library too returned | |
| 846 only `fd_sets` and that is no longer good enough. The changes done to c-ares | |
| 847 are available in c-ares 1.3.1 and later. | |
| 848 | |
| 849 <a name="structs"></a> | |
| 850 Structs in libcurl | |
| 851 ================== | |
| 852 | |
| 853 This section should cover 7.32.0 pretty accurately, but will make sense even | |
| 854 for older and later versions as things don't change drastically that often. | |
| 855 | |
| 856 <a name="Curl_easy"></a> | |
| 857 ## Curl_easy | |
| 858 | |
| 859 The `Curl_easy` struct is the one returned to the outside in the external API | |
| 860 as a `CURL *`. This is usually known as an easy handle in API documentation | |
| 861 and examples. | |
| 862 | |
| 863 Information and state that is related to the actual connection is in the | |
| 864 `connectdata` struct. When a transfer is about to be made, libcurl will | |
| 865 either create a new connection or re-use an existing one. The particular | |
| 866 connectdata that is used by this handle is pointed out by | |
| 867 `Curl_easy->easy_conn`. | |
| 868 | |
| 869 Data and information regarding this particular single transfer are put in | |
| 870 the `SingleRequest` sub-struct. | |
| 871 | |
| 872 When the `Curl_easy` struct is added to a multi handle, as it must be in | |
| 873 order to do any transfer, the `->multi` member will point to the `Curl_multi` | |
| 874 struct it belongs to. The `->prev` and `->next` members will then be used by | |
| 875 the multi code to keep a linked list of `Curl_easy` structs that are added to | |
| 876 that same multi handle. libcurl always uses multi so `->multi` *will* point | |
| 877 to a `Curl_multi` when a transfer is in progress. | |
| 878 | |
| 879 `->mstate` is the multi state of this particular `Curl_easy`. When | |
| 880 `multi_runsingle()` is called, it will act on this handle according to which | |
| 881 state it is in. The mstate is also what tells which sockets to return for a | |
| 882 specific `Curl_easy` when [`curl_multi_fdset()`][12] is called etc. | |
| 883 | |
| 884 The libcurl source code generally uses the name `data` for the variable that | |
| 885 points to the `Curl_easy`. | |
| 886 | |
| 887 When doing multiplexed HTTP/2 transfers, each `Curl_easy` is associated with | |
| 888 an individual stream, sharing the same connectdata struct. Multiplexing | |
| 889 makes it even more important to keep things associated with the right thing! | |
| 890 | |
| 891 <a name="connectdata"></a> | |
| 892 ## connectdata | |
| 893 | |
| 894 A general idea in libcurl is to keep connections around in a connection | |
| 895 "cache" after they have been used in case they will be used again and then | |
| 896 re-use an existing one instead of creating a new one, as that gives a significant | |
| 897 performance boost. | |
| 898 | |
| 899 Each `connectdata` identifies a single physical connection to a server. If | |
| 900 the connection can't be kept alive, the connection will be closed after use | |
| 901 and then this struct can be removed from the cache and freed. | |
| 902 | |
| 903 Thus, the same `Curl_easy` can be used multiple times and each time select | |
| 904 another `connectdata` struct to use for the connection. Keep this in mind, | |
| 905 as it is then important to consider if options or choices are based on the | |
| 906 connection or the `Curl_easy`. | |
| 907 | |
| 908 Functions in libcurl will assume that `connectdata->data` points to the | |
| 909 `Curl_easy` that uses this connection (for the moment). | |
| 910 | |
| 911 As a special complexity, some protocols supported by libcurl require a | |
| 912 special disconnect procedure that is more than just shutting down the | |
| 913 socket. It can involve sending one or more commands to the server before | |
| 914 doing so. Since connections are kept in the connection cache after use, the | |
| 915 original `Curl_easy` may no longer be around when the time comes to shut down | |
| 916 a particular connection. For this purpose, libcurl holds a special dummy | |
| 917 `closure_handle` `Curl_easy` in the `Curl_multi` struct to use when needed. | |
| 918 | |
| 919 FTP uses two TCP connections for a typical transfer but it keeps both in | |
| 920 this single struct and thus can be considered a single connection for most | |
| 921 internal concerns. | |
| 922 | |
| 923 The libcurl source code generally uses the name `conn` for the variable that | |
| 924 points to the connectdata. | |
| 925 | |
| 926 <a name="Curl_multi"></a> | |
| 927 ## Curl_multi | |
| 928 | |
| 929 Internally, the easy interface is implemented as a wrapper around multi | |
| 930 interface functions. This makes everything multi interface. | |
| 931 | |
| 932 `Curl_multi` is the multi handle struct exposed as `CURLM *` in external | |
| 933 APIs. | |
| 934 | |
| 935 This struct holds a list of `Curl_easy` structs that have been added to this | |
| 936 handle with [`curl_multi_add_handle()`][13]. The start of the list is | |
| 937 `->easyp` and `->num_easy` is a counter of added `Curl_easy`s. | |
| 938 | |
| 939 `->msglist` is a linked list of messages to send back when | |
| 940 [`curl_multi_info_read()`][14] is called. Basically a node is added to that | |
| 941 list when an individual `Curl_easy`'s transfer has completed. | |
| 942 | |
| 943 `->hostcache` points to the name cache. It is a hash table for looking up | |
| 944 name to IP. The nodes have a limited life time in there and this cache is | |
| 945 meant to reduce the time for when the same name is wanted within a short | |
| 946 period of time. | |
| 947 | |
| 948 `->timetree` points to a tree of `Curl_easy`s, sorted by the remaining time | |
| 949 until it should be checked - normally some sort of timeout. Each `Curl_easy` | |
| 950 has one node in the tree. | |
| 951 | |
| 952 `->sockhash` is a hash table that allows fast lookups from a socket descriptor to | |
| 953 the `Curl_easy` that uses that descriptor. This is necessary for the | |
| 954 `multi_socket` API. | |
| 955 | |
| 956 `->conn_cache` points to the connection cache. It keeps track of all | |
| 957 connections that are kept after use. The cache has a maximum size. | |
| 958 | |
| 959 `->closure_handle` is described in the `connectdata` section. | |
| 960 | |
| 961 The libcurl source code generally uses the name `multi` for the variable that | |
| 962 points to the `Curl_multi` struct. | |
| 963 | |
| 964 <a name="Curl_handler"></a> | |
| 965 ## Curl_handler | |
| 966 | |
| 967 Each unique protocol that is supported by libcurl needs to provide at least | |
| 968 one `Curl_handler` struct. It defines what the protocol is called and what | |
| 969 functions the main code should call to deal with protocol specific issues. | |
| 970 In general, there's a source file named `[protocol].c` in which there's a | |
| 971 `struct Curl_handler Curl_handler_[protocol]` declared. In `url.c` there's | |
| 972 then the main array with all individual `Curl_handler` structs pointed to | |
| 973 from a single array which is scanned through when a URL is given to libcurl | |
| 974 to work with. | |
| 975 | |
| 976 `->scheme` is the URL scheme name, usually spelled out in uppercase. That's | |
| 977 "HTTP" or "FTP" etc. SSL versions of the protocol need their own | |
| 978 `Curl_handler` setup so HTTPS is separate from HTTP. | |
| 979 | |
| 980 `->setup_connection` is called to allow the protocol code to allocate | |
| 981 protocol specific data that then gets associated with that `Curl_easy` for | |
| 982 the rest of this transfer. It gets freed again at the end of the transfer. | |
| 983 It will be called before the `connectdata` for the transfer has been | |
| 984 selected/created. Most protocols will allocate their private | |
| 985 `struct [PROTOCOL]` here and assign `Curl_easy->req.protop` to point to it. | |
| 986 | |
| 987 `->connect_it` allows a protocol to do some specific actions after the TCP | |
| 988 connect is done, that can still be considered part of the connection phase. | |
| 989 | |
| 990 Some protocols will alter the `connectdata->recv[]` and | |
| 991 `connectdata->send[]` function pointers in this function. | |
| 992 | |
| 993 `->connecting` is similarly a function that keeps getting called as long as | |
| 994 the protocol considers itself still in the connecting phase. | |
| 995 | |
| 996 `->do_it` is the function called to issue the transfer request. What we call | |
| 997 the DO action internally. If the DO is not enough and things need to be kept | |
| 998 getting done for the entire DO sequence to complete, `->doing` is then | |
| 999 usually also provided. Each protocol that needs to do multiple commands or | |
| 1000 similar for do/doing needs to implement its own state machine (see SCP, | |
| 1001 SFTP, FTP). Some protocols (only FTP and only due to historical reasons) have | |
| 1002 a separate piece of the DO state called `DO_MORE`. | |
| 1003 | |
| 1004 `->doing` keeps getting called while issuing the transfer request command(s). | |
| 1005 | |
| 1006 `->done` gets called when the transfer is complete and DONE. That's after the | |
| 1007 main data has been transferred. | |
| 1008 | |
| 1009 `->do_more` gets called during the `DO_MORE` state. The FTP protocol uses | |
| 1010 this state when setting up the second connection. | |
| 1011 | |
| 1012 `->proto_getsock` | |
| 1013 `->doing_getsock` | |
| 1014 `->domore_getsock` | |
| 1015 `->perform_getsock` | |
| 1016 Functions that return socket information. Which socket(s) to wait for which | |
| 1017 action(s) during the particular multi state. | |
| 1018 | |
| 1019 `->disconnect` is called immediately before the TCP connection is shut down. | |
| 1020 | |
| 1021 `->readwrite` gets called during transfer to allow the protocol to do extra | |
| 1022 reads/writes. | |
| 1023 | |
| 1024 `->defport` is the default TCP or UDP port this protocol uses. | |
| 1025 | |
| 1026 `->protocol` is one or more bits in the `CURLPROTO_*` set. The SSL versions | |
| 1027 have their "base" protocol set and then the SSL variation. Like | |
| 1028 "HTTP|HTTPS". | |
| 1029 | |
| 1030 `->flags` is a bitmask with additional information about the protocol that will | |
| 1031 make it get treated differently by the generic engine: | |
| 1032 | |
| 1033 - `PROTOPT_SSL` - will make it connect and negotiate SSL | |
| 1034 | |
| 1035 - `PROTOPT_DUAL` - this protocol uses two connections | |
| 1036 | |
| 1037 - `PROTOPT_CLOSEACTION` - this protocol has actions to do before closing the | |
| 1038 connection. This flag is no longer used by code, yet still set for a bunch | |
| 1039 of protocol handlers. | |
| 1040 | |
| 1041 - `PROTOPT_DIRLOCK` - "direction lock". The SSH protocols set this bit to | |
| 1042 limit which "direction" of socket actions that the main engine will | |
| 1043 concern itself with. | |
| 1044 | |
| 1045 - `PROTOPT_NONETWORK` - a protocol that doesn't use network (read `file:`) | |
| 1046 | |
| 1047 - `PROTOPT_NEEDSPWD` - this protocol needs a password and will use a default | |
| 1048 one unless one is provided | |
| 1049 | |
| 1050 - `PROTOPT_NOURLQUERY` - this protocol can't handle a query part on the URL | |
| 1051 (?foo=bar) | |
| 1052 | |
| 1053 <a name="conncache"></a> | |
| 1054 ## conncache | |
| 1055 | |
| 1056 This is a hash table with connections for later re-use. Each `Curl_easy` has a | |
| 1057 pointer to its connection cache. Each multi handle sets up a connection | |
| 1058 cache that all added `Curl_easy`s share by default. | |
| 1059 | |
| 1060 <a name="Curl_share"></a> | |
| 1061 ## Curl_share | |
| 1062 | |
| 1063 The libcurl share API allocates a `Curl_share` struct, exposed to the | |
| 1064 external API as `CURLSH *`. | |
| 1065 | |
| 1066 The idea is that the struct can have a set of its own versions of caches and | |
| 1067 pools and then by providing this struct in the `CURLOPT_SHARE` option, those | |
| 1068 specific `Curl_easy`s will use the caches/pools that this share handle | |
| 1069 holds. | |
| 1070 | |
| 1071 Then individual `Curl_easy` structs can be made to share specific things | |
| 1072 that they otherwise wouldn't, such as cookies. | |
| 1073 | |
| 1074 The `Curl_share` struct can currently hold cookies, DNS cache and the SSL | |
| 1075 session cache. | |
| 1076 | |
| 1077 <a name="CookieInfo"></a> | |
| 1078 ## CookieInfo | |
| 1079 | |
| 1080 This is the main cookie struct. It holds all known cookies and related | |
| 1081 information. Each `Curl_easy` has its own private `CookieInfo` even when | |
| 1082 they are added to a multi handle. They can be made to share cookies by using | |
| 1083 the share API. | |
| 1084 | |
| 1085 | |
| 1086 [1]: https://curl.haxx.se/libcurl/c/curl_easy_setopt.html | |
| 1087 [2]: https://curl.haxx.se/libcurl/c/curl_easy_init.html | |
| 1088 [3]: https://c-ares.haxx.se/ | |
| 1089 [4]: https://tools.ietf.org/html/rfc7230 "RFC 7230" | |
| 1090 [5]: https://curl.haxx.se/libcurl/c/CURLOPT_ACCEPT_ENCODING.html | |
| 1091 [6]: https://curl.haxx.se/docs/manpage.html#--compressed | |
| 1092 [7]: https://curl.haxx.se/libcurl/c/curl_multi_socket_action.html | |
| 1093 [8]: https://curl.haxx.se/libcurl/c/curl_multi_timeout.html | |
| 1094 [9]: https://curl.haxx.se/libcurl/c/curl_multi_setopt.html | |
| 1095 [10]: https://curl.haxx.se/libcurl/c/CURLMOPT_TIMERFUNCTION.html | |
| 1096 [11]: https://curl.haxx.se/libcurl/c/curl_multi_perform.html | |
| 1097 [12]: https://curl.haxx.se/libcurl/c/curl_multi_fdset.html | |
| 1098 [13]: https://curl.haxx.se/libcurl/c/curl_multi_add_handle.html | |
| 1099 [14]: https://curl.haxx.se/libcurl/c/curl_multi_info_read.html | |
| 1100 [15]: https://tools.ietf.org/html/rfc7231#section-3.1.2.2 |
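
The `Curl_handler` section above describes a single array of per-protocol handler structs that is scanned when a URL is given to libcurl. The following is a minimal, self-contained sketch of that lookup idea only. Every name in it (`demo_handler`, `demo_protocols`, `demo_find_handler`, the `DEMO_PROTOPT_*` bits) is hypothetical and far simpler than libcurl's real definitions, and the case-insensitive scheme comparison is an assumption of this sketch, not something the text states.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical flag bits, loosely modelled on the PROTOPT_* idea. */
#define DEMO_PROTOPT_SSL       (1u << 0) /* negotiate SSL after connect */
#define DEMO_PROTOPT_DUAL      (1u << 1) /* protocol uses two connections */
#define DEMO_PROTOPT_NONETWORK (1u << 2) /* protocol does not use the network */

/* Hypothetical, reduced stand-in for a protocol handler entry.
   The real struct also carries function pointers (do_it, done, ...). */
struct demo_handler {
  const char *scheme;  /* URL scheme name, spelled in uppercase */
  int defport;         /* default port, 0 if not applicable */
  unsigned int flags;  /* bitmask of DEMO_PROTOPT_* */
};

static const struct demo_handler demo_http  = { "HTTP",  80,  0 };
static const struct demo_handler demo_https = { "HTTPS", 443, DEMO_PROTOPT_SSL };
static const struct demo_handler demo_ftp   = { "FTP",   21,  DEMO_PROTOPT_DUAL };
static const struct demo_handler demo_file  = { "FILE",  0,   DEMO_PROTOPT_NONETWORK };

/* One NULL-terminated array pointing to every handler. */
static const struct demo_handler * const demo_protocols[] = {
  &demo_http, &demo_https, &demo_ftp, &demo_file, NULL
};

/* Return 1 if both scheme names match, ignoring ASCII case. */
static int demo_scheme_equal(const char *a, const char *b)
{
  while(*a && *b) {
    if(tolower((unsigned char)*a) != tolower((unsigned char)*b))
      return 0;
    a++;
    b++;
  }
  return *a == *b; /* equal only if both strings ended together */
}

/* Scan the array for a handler whose scheme matches; NULL if none. */
static const struct demo_handler *demo_find_handler(const char *scheme)
{
  const struct demo_handler * const *p;
  for(p = demo_protocols; *p; p++) {
    if(demo_scheme_equal((*p)->scheme, scheme))
      return *p;
  }
  return NULL;
}
```

Under these assumptions, `demo_find_handler("https")` returns the HTTPS entry, which carries its own default port and the SSL flag, while a scheme with no entry in the array yields `NULL`. Keeping one entry per scheme mirrors the point made above that the SSL version of a protocol gets its own handler, separate from the plain one.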
