cookie

manpage

$ man wget

The man page documents the following cookie-related options.

--no-cookies

--no-cookies
    Disable the use of cookies.  Cookies are a mechanism for maintaining server-side state.  The server sends
    the client a cookie using the "Set-Cookie" header, and the client responds with the same cookie upon
    further requests.  Since cookies allow the server owners to keep track of visitors and for sites to
    exchange this information, some consider them a breach of privacy.  The default is to use cookies; however,
    storing cookies is not on by default.

--load-cookies file

--load-cookies file
    Load cookies from file before the first HTTP retrieval.  file is a textual file in the format originally
    used by Netscape's cookies.txt file.

    You will typically use this option when mirroring sites that require that you be logged in to access some
    or all of their content.  The login process typically works by the web server issuing an HTTP cookie upon
    receiving and verifying your credentials.  The cookie is then resent by the browser when accessing that
    part of the site, and so proves your identity.

    Mirroring such a site requires Wget to send the same cookies your browser sends when communicating with the
    site.  This is achieved by --load-cookies---simply point Wget to the location of the cookies.txt file, and
    it will send the same cookies your browser would send in the same situation.  Different browsers keep
    textual cookie files in different locations:

    "Netscape 4.x."
        The cookies are in ~/.netscape/cookies.txt.

    "Mozilla and Netscape 6.x."
        Mozilla's cookie file is also named cookies.txt, located somewhere under ~/.mozilla, in the directory
        of your profile.  The full path usually ends up looking somewhat like ~/.mozilla/default/some-weird-
        string/cookies.txt.

    "Internet Explorer."
        You can produce a cookie file Wget can use by using the File menu, Import and Export, Export Cookies.
        This has been tested with Internet Explorer 5; it is not guaranteed to work with earlier versions.

    "Other browsers."
        If you are using a different browser to create your cookies, --load-cookies will only work if you can
        locate or produce a cookie file in the Netscape format that Wget expects.

    If you cannot use --load-cookies, there might still be an alternative.  If your browser supports a "cookie
    manager", you can use it to view the cookies used when accessing the site you're mirroring.  Write down the
    name and value of the cookie, and manually instruct Wget to send those cookies, bypassing the "official"
    cookie support:

            wget --no-cookies --header "Cookie: <name>=<value>"
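
The Netscape cookies.txt format mentioned above is plain text with seven tab-separated fields per line: domain, subdomain flag, path, secure flag, expiry time, name, and value. As a rough sketch of the manual fallback, the hypothetical helper `cookie_header` below (not part of Wget) turns such a file into a value for the `Cookie:` header. The file name demo-cookies.txt and its contents are made up for illustration.

```shell
# Write a small cookie file by hand (hypothetical data, tab-separated fields):
# domain, subdomain flag, path, secure flag, expiry (Unix time), name, value
printf '# HTTP cookie file.\n' > demo-cookies.txt
printf 'example.com\tFALSE\t/\tFALSE\t2000000000\tuser\tfoo\n' >> demo-cookies.txt
printf 'example.com\tFALSE\t/\tFALSE\t2000000000\ttheme\tdark\n' >> demo-cookies.txt

# cookie_header FILE: join every name=value pair into one Cookie header value.
cookie_header() {
    awk -F'\t' '
        /^#/ || NF != 7 { next }   # skip comments and blank lines
        { printf "%s%s=%s", sep, $6, $7; sep = "; " }
        END { print "" }
    ' "$1"
}

cookie_header demo-cookies.txt   # prints: user=foo; theme=dark
```

The result could then be passed as `wget --no-cookies --header "Cookie: $(cookie_header demo-cookies.txt)"` followed by the URL. The helper ignores the domain and path fields, so it is only reasonable when the file holds cookies for a single site.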

--save-cookies file

--save-cookies file
    Save cookies to file before exiting.  This will not save cookies that have expired or that have no expiry
    time (so-called "session cookies"), but also see --keep-session-cookies.

--keep-session-cookies

--keep-session-cookies
    When specified, causes --save-cookies to also save session cookies.  Session cookies are normally not saved
    because they are meant to be kept in memory and forgotten when you exit the browser.  Saving them is useful
    on sites that require you to log in or to visit the home page before you can access some pages.  With this
    option, multiple Wget runs are considered a single browser session as far as the site is concerned.

    Since the cookie file format does not normally carry session cookies, Wget marks them with an expiry
    timestamp of 0.  Wget's --load-cookies recognizes those as session cookies, but it might confuse other
    browsers.  Also note that cookies so loaded will be treated as other session cookies, which means that if
    you want --save-cookies to preserve them again, you must use --keep-session-cookies again.
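
Because Wget writes session cookies with an expiry field of 0, they are easy to spot in a saved file. A minimal sketch, assuming a cookie file in the tab-separated Netscape layout; the helper name `session_cookies`, the file name, and the cookie values are invented for the example.

```shell
# Sample file (made-up values): one persistent cookie, one session cookie.
printf 'example.com\tFALSE\t/\tFALSE\t2000000000\tremember\tyes\n' > demo-session.txt
printf 'example.com\tFALSE\t/\tFALSE\t0\tSESSIONID\tabc123\n' >> demo-session.txt

# session_cookies FILE: print the names of cookies whose expiry (column 5) is 0.
session_cookies() {
    awk -F'\t' '!/^#/ && NF == 7 && $5 == 0 { print $6 }' "$1"
}

session_cookies demo-session.txt   # prints: SESSIONID
```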

Related example from the man page

--post-data=string
--post-file=file
    Use POST as the method for all HTTP requests and send the specified data in the request body.  --post-data
    sends string as data, whereas --post-file sends the contents of file.  Other than that, they work in
    exactly the same way. In particular, they both expect content of the form "key1=value1&key2=value2", with
    percent-encoding for special characters; the only difference is that one expects its content as a command-
    line parameter and the other accepts its content from a file. In particular, --post-file is not for
    transmitting files as form attachments: those must appear as "key=value" data (with appropriate percent-
    coding) just like everything else. Wget does not currently support "multipart/form-data" for transmitting
    POST data; only "application/x-www-form-urlencoded". Only one of --post-data and --post-file should be
    specified.

    Please note that wget does not require the content to be of the form "key1=value1&key2=value2", and neither
    does it test for it. Wget will simply transmit whatever data is provided to it. Most servers however expect
    the POST data to be in the above format when processing HTML Forms.

    When sending a POST request using the --post-file option, Wget treats the file as a binary file and will
    send every character in the POST request without stripping trailing newline or formfeed characters. Any
    other control characters in the text will also be sent as-is in the POST request.

    Please be aware that Wget needs to know the size of the POST data in advance.  Therefore the argument to
    "--post-file" must be a regular file; specifying a FIFO or something like /dev/stdin won't work.  It's not
    quite clear how to work around this limitation inherent in HTTP/1.0.  Although HTTP/1.1 introduces chunked
    transfer that doesn't require knowing the request length in advance, a client can't use chunked unless it
    knows it's talking to an HTTP/1.1 server.  And it can't know that until it receives a response, which in
    turn requires the request to have been completed -- a chicken-and-egg problem.

    Note: As of version 1.15 if Wget is redirected after the POST request is completed, its behaviour will
    depend on the response code returned by the server.  In case of a 301 Moved Permanently, 302 Moved
    Temporarily or 307 Temporary Redirect, Wget will, in accordance with RFC2616, continue to send a POST
    request.  In case a server wants the client to change the Request method upon redirection, it should send a
    303 See Other response code.

    This example shows how to log in to a server using POST and then proceed to download the desired pages,
    presumably only accessible to authorized users:

            # Log in to the server.  This can be done only once.
            wget --save-cookies cookies.txt \
                 --post-data 'user=foo&password=bar' \
                 http://server.com/auth.php

            # Now grab the page or pages we care about.
            wget --load-cookies cookies.txt \
                 -p http://server.com/interesting/article.php

    If the server is using session cookies to track user authentication, the above will not work because
    --save-cookies will not save them (and neither will browsers) and the cookies.txt file will be empty.  In
    that case use --keep-session-cookies along with --save-cookies to force saving of session cookies.
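
Since --post-file sends the file byte for byte, a trailing newline added by an editor or by `echo` becomes part of the request body. The sketch below compares two ways of writing the body from the example above; the file names are arbitrary.

```shell
# printf '%s' writes exactly the given bytes; echo appends a newline.
printf '%s' 'user=foo&password=bar' > body-exact.txt
echo 'user=foo&password=bar' > body-newline.txt

# $(( ... )) strips the padding that some wc implementations add.
echo "exact:   $(( $(wc -c < body-exact.txt) )) bytes"     # exact:   21 bytes
echo "newline: $(( $(wc -c < body-newline.txt) )) bytes"   # newline: 22 bytes
```

Either file can be given to --post-file; whether the extra newline matters depends on how the server parses the body.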

Example

Setup

Run the following command to create the file index.php.

cat > index.php <<EOL
<?php

    var_dump(\$_COOKIE);
EOL

Its contents:

<?php

    var_dump($_COOKIE);
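
The backslash in `\$_COOKIE` is needed because an unquoted heredoc delimiter lets the shell expand `$` variables. Quoting the delimiter is an alternative that leaves the body untouched. A small sketch, writing to scratch file names chosen for this example so that the real index.php is not overwritten:

```shell
# Unquoted delimiter: $ must be escaped to survive.
cat > escaped.php <<EOL
<?php

    var_dump(\$_COOKIE);
EOL

# Quoted delimiter: the body is written literally, no escaping needed.
cat > quoted.php <<'EOL'
<?php

    var_dump($_COOKIE);
EOL

# cmp -s exits 0 when the two files are byte-identical.
cmp -s escaped.php quoted.php && echo "identical"
```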

Run the following command to create the file cookie.php.

cat > cookie.php <<EOL
<?php

\$value = 'something from somewhere';

setcookie("TestCookie", \$value);
EOL

Its contents:

<?php

$value = 'something from somewhere';

setcookie("TestCookie", $value);

Run the following command to create the file session.php.

cat > session.php <<EOL
<?php

    session_start();
EOL

Its contents:

<?php

    session_start();

Start the test web server

Run this in the directory that contains the three PHP files:

$ php -S localhost:8080

Switch to the test directory

Open a second terminal.

Run:

mkdir -p test
cd test

Test

Run:

$ wget localhost:8080

This creates the file index.html.

Run:

$ cat index.html

Output:

array(0) {
}

Run:

$ wget localhost:8080/cookie.php --keep-session-cookies --save-cookies cookie.txt

This creates the files cookie.php and cookie.txt.

Run:

$ cat cookie.php

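Nothing is printed and the shell returns to the prompt, because the downloaded cookie.php is the empty response body.

Run: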

$ cat cookie.txt

Output:

# HTTP cookie file.
# Generated by Wget on 2016-05-12 21:35:21.
# Edit at your own risk.

localhost:8080  FALSE   /       FALSE   0       TestCookie      something+from+somewhere

Delete index.html so that the next download is not saved as index.html.1:

$ rm index.html

Run:

$ wget localhost:8080 --load-cookies cookie.txt

This creates the file index.html.

Run the following command to view index.html.

$ cat index.html

Output:

array(1) {
  ["TestCookie"]=>
  string(24) "something from somewhere"
}
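
The value is stored in cookie.txt as `something+from+somewhere` but arrives in `$_COOKIE` with spaces, because PHP's `setcookie()` URL-encodes the value and PHP decodes it again when it fills `$_COOKIE`. A rough shell sketch of the same decoding, handling only the `+` to space step (percent escapes such as `%2B` are not handled here):

```shell
stored='something+from+somewhere'

# Turn each "+" back into a space, as form-style URL decoding does.
decoded=$(printf '%s' "$stored" | tr '+' ' ')

# prints: something from somewhere (24 characters)
printf '%s (%d characters)\n' "$decoded" "${#decoded}"
```

The length of 24 matches the `string(24)` reported by `var_dump` above.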

Run:

$ wget localhost:8080/session.php --keep-session-cookies --save-cookies cookie.txt

This creates the files session.php and cookie.txt.

Run:

$ cat session.php

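Nothing is printed and the shell returns to the prompt, because the downloaded session.php is the empty response body.

Run: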

$ cat cookie.txt

Output:

# HTTP cookie file.
# Generated by Wget on 2016-05-12 21:51:42.
# Edit at your own risk.

localhost:8080  FALSE   /       FALSE   0       PHPSESSID       6u9k9bgolqvcttfun9fftrue85
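
To keep using this PHP session in later runs, the file has to be loaded and then saved again with --keep-session-cookies, as the man page notes. A sketch using a hypothetical wrapper function `wget_session` (the name is not part of Wget), with cookie.txt as used above:

```shell
# wget_session URL...: run wget with the saved cookies loaded, and write
# them back (session cookies included) when wget exits.
wget_session() {
    wget --load-cookies cookie.txt \
         --keep-session-cookies \
         --save-cookies cookie.txt \
         "$@"
}
```

With the test server still running, `wget_session localhost:8080/session.php` should send the stored PHPSESSID back to the server and leave cookie.txt holding the session cookie for the next run.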

Further reading