PSGI(3) User Contributed Perl Documentation PSGI(3)
NAME
PSGI - Perl Web Server Gateway Interface Specification
ABSTRACT
This document specifies a standard interface between web servers and
Perl web applications or frameworks, to promote web application
portability and reduce duplicated effort by web application
framework developers.
Keep in mind that PSGI is not Yet Another web application framework.
PSGI is a specification to decouple web server environments from web
application framework code. PSGI is also not the web application API.
Web application developers (end users) are not supposed to run their
web applications directly using the PSGI interface, but instead are
encouraged to use frameworks that support PSGI, or use the helper
implementations like Plack (more on that later).
TERMINOLOGIES
Servers
Servers are web servers that accept HTTP requests, dispatch the
requests to the web applications and return the HTTP response to
the clients. In the PSGI specification, a server is a Perl process
running inside an HTTP server (e.g. mod_perl in Apache), a daemon
process called from a web server (e.g. a FastCGI daemon) or a pure
Perl HTTP server.
Servers are also called PSGI implementations as well as Backends.
Applications
Applications are web applications that actually receive HTTP requests
and return HTTP responses. In PSGI, an application is a code
reference: see below.
Middleware
Middleware is a PSGI application, which is a code reference, but
it also acts like a server in order to run other applications. It
can be thought of as a plugin that extends a PSGI application: see
below.
Framework developers
Framework developers are authors of web application frameworks.
They need to write adapters (or engines) that read PSGI input, run
the application logic and return a PSGI response to the server.
Web application developers
Web application developers are developers who write code that uses
a web application framework built on the PSGI interface. They
usually don't need to deal with or care about the PSGI protocol at
all.
SPECIFICATION
Applications
A PSGI application is a Perl code reference. It takes exactly one
argument, the environment, and returns an array reference of exactly
three values.
sub app {
    my $env = shift;
    return [
        '200',
        [ 'Content-Type' => 'text/plain' ],
        [ "Hello World" ], # or IO::Handle-like object
    ];
}
The Environment
The environment MUST be a hash reference that includes CGI-like
headers. The application is free to modify the environment. The
environment is required to include these variables (adopted from
PEP333, Rack and JSGI) except when they'd be empty, but see below:
· "REQUEST_METHOD": The HTTP request method, such as "GET" or "POST".
This cannot ever be an empty string, and so is always required.
· "SCRIPT_NAME": The initial portion of the request URL's path that
corresponds to the application, so that the application knows its
virtual "location". This may be an empty string if the application
corresponds to the "root" of the server.
· "PATH_INFO": The remainder of the request URL's "path", designating
the virtual "location" of the request's target within the
application. This may be an empty string if the request URL targets
the application root and does not have a trailing slash. This value
should be URI decoded by servers to be compatible with RFC 3875.
· "REQUEST_URI": The undecoded, raw request URL line. It is the raw
URI path and query part that appears in the HTTP "GET /...
HTTP/1.x" line and doesn't contain URI scheme and host names.
Unlike "PATH_INFO", this value SHOULD NOT be decoded by servers and
hence it is an application's responsibility to properly decode
paths to map URL to application handlers, when using "REQUEST_URI"
over "PATH_INFO".
· "QUERY_STRING": The portion of the request URL that follows the
"?", if any. May be empty, but is always required.
· "SERVER_NAME", "SERVER_PORT": When combined with "SCRIPT_NAME" and
"PATH_INFO", these variables can be used to complete the URL. Note,
however, that "HTTP_HOST", if present, should be used in preference
to "SERVER_NAME" for reconstructing the request URL. "SERVER_NAME"
and "SERVER_PORT" can never be empty strings, and so are always
required.
· "SERVER_PROTOCOL": The version of the protocol the client used to
send the request. Typically this will be something like "HTTP/1.0"
or "HTTP/1.1" and may be used by the application to determine how
to treat any HTTP request headers.
· "HTTP_" Variables: Variables corresponding to the client-supplied
HTTP request headers (i.e., variables whose names begin with
"HTTP_"). The presence or absence of these variables should
correspond to the presence or absence of the appropriate HTTP
header in the request.
If there are multiple header lines sent with the same key, the
server should treat them as if they're sent in one line, i.e.
combine them with ", " as in RFC 2616.
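As a rough illustration of the "HTTP_" naming and the rule for repeated headers, here is a minimal sketch of how a server might fill in these variables. The helper name headers_to_env is hypothetical and not part of PSGI; it assumes the request headers have already been parsed into a flat list of name and value pairs. It also keeps "Content-Type" and "Content-Length" under their CGI names, as this specification requires.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PSGI): turn parsed request header
# pairs into environment entries. Input is a flat list of
# (name, value) pairs.
sub headers_to_env {
    my @pairs = @_;
    my %env;
    while (my ($name, $value) = splice(@pairs, 0, 2)) {
        # Upper-case the name and replace "-" with "_".
        (my $key = uc $name) =~ tr/-/_/;
        # Content-Type and Content-Length keep their CGI names.
        $key = "HTTP_$key"
            unless $key eq 'CONTENT_TYPE' || $key eq 'CONTENT_LENGTH';
        # Repeated headers are combined into one value with ", ".
        $env{$key} = exists $env{$key} ? "$env{$key}, $value" : $value;
    }
    return \%env;
}

my $env = headers_to_env(
    'Host'         => 'example.com',
    'Accept'       => 'text/html',
    'Accept'       => 'text/plain',
    'Content-Type' => 'application/json',
);
print "$_ = $env->{$_}\n" for sort keys %$env;
```

A real server would merge these entries into the full environment hash alongside the CGI-like and "psgi." variables.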
In addition to this, the PSGI environment MUST include these PSGI-
specific variables:
· "psgi.version": An array ref [1,0] representing this version of
PSGI.
· "psgi.url_scheme": A string "http" or "https", depending on the
request URL.
· "psgi.input": the input stream. See below.
· "psgi.errors": the error stream. See below.
· "psgi.multithread": true if the application may be simultaneously
invoked by another thread in the same process, false otherwise.
· "psgi.multiprocess": true if an equivalent application object may
be simultaneously invoked by another process, false otherwise.
The PSGI environment MAY include these optional PSGI variables:
· "psgi.run_once": true if the server expects (but does not
guarantee!) that the application will only be invoked this one
time during the life of its containing process. Normally, this will
only be true for a server based on CGI (or something similar).
· "psgi.nonblocking": true if the server is calling the application
in a non-blocking event loop.
· "psgi.streaming": true if the server supports callback style
delayed response and streaming writer object.
The server or the application can store its own data in the
environment, too. The keys MUST contain at least one dot, and should be
prefixed uniquely. The prefix "psgi." is reserved for use with the PSGI
core implementation and other accepted extensions and MUST NOT be used
otherwise. The environment MUST NOT contain the keys
"HTTP_CONTENT_TYPE" or "HTTP_CONTENT_LENGTH" (use the versions without
"HTTP_"). The values of the CGI keys (named without a period) MUST be
scalars containing strings. The following restrictions also apply:
· "psgi.version" MUST be an array of integers.
· "psgi.url_scheme" MUST be a scalar variable containing either the
string "http" or "https".
· There MUST be a valid input stream in "psgi.input".
· There MUST be a valid error stream in "psgi.errors".
· The "REQUEST_METHOD" MUST be a valid token.
· The "SCRIPT_NAME", if non-empty, MUST start with "/"
· The "PATH_INFO", if non-empty, MUST start with "/"
· The "CONTENT_LENGTH", if given, MUST consist of digits only.
· One of "SCRIPT_NAME" or "PATH_INFO" MUST be set. "PATH_INFO" should
be "/" if "SCRIPT_NAME" is empty. "SCRIPT_NAME" should never be
"/", but should instead be empty.
The Input Stream
The input stream in "psgi.input" is an IO::Handle-like object which
streams the raw HTTP POST or PUT data. If it is a file handle then it
MUST be opened in binary mode. The input stream MUST respond to "read"
and MAY implement "seek".
The built-in filehandle or IO::Handle based objects should work fine
everywhere. Application developers SHOULD NOT inspect the type or class
of the stream, but instead just call "read" to duck type.
Application developers SHOULD NOT use the built-in "read" function to
read from the input stream, because the built-in "read" function only
works with a real IO object (a glob ref based file handle or PerlIO)
and makes duck typing difficult. Web application framework developers,
if they know the input stream will be used with the built-in read() in
any upstream code they can't touch, SHOULD use PerlIO or a tied handle
to work around this problem.
read
$input->read($buf, $len [, $offset ]);
Returns the number of characters actually read, 0 at end of file,
or undef if there was an error.
seek
$input->seek($pos, $whence);
Returns 1 on success, 0 otherwise.
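To make the method-call style concrete, here is a minimal sketch of an application-side helper that reads the request body through "read". The helper name read_body and the 8192-byte chunk size are assumptions for illustration; an in-memory filehandle stands in for whatever stream a real server would provide.

```perl
use strict;
use warnings;
use IO::Handle;    # makes ->read available on built-in filehandles

# Hypothetical helper (not part of PSGI): read the request body from
# psgi.input using the method form of read, never the built-in function.
sub read_body {
    my $env       = shift;
    my $input     = $env->{'psgi.input'};
    my $remaining = $env->{CONTENT_LENGTH} || 0;
    my $body      = '';

    while ($remaining > 0) {
        my $want = $remaining < 8192 ? $remaining : 8192;
        my $got  = $input->read(my $chunk, $want);
        die "error reading psgi.input" unless defined $got;
        last if $got == 0;    # end of file before CONTENT_LENGTH was reached
        $body      .= $chunk;
        $remaining -= $got;
    }
    return $body;
}

# Stand-in for a server: an in-memory filehandle in binary mode.
my $payload = 'name=psgi&lang=perl';
open my $fh, '<', \$payload or die "cannot open in-memory handle: $!";
binmode $fh;
my $env = { CONTENT_LENGTH => length($payload), 'psgi.input' => $fh };
print read_body($env), "\n";
```

Because the helper only calls the "read" method, it works equally well with any mock object that implements that method.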
The Error Stream
The error stream in "psgi.errors" is an IO::Handle-like object to print
errors. The error stream must implement "print".
The built-in filehandle or IO::Handle based objects should work fine
everywhere. Application developers SHOULD NOT inspect the type or class
of the stream, but instead just call "print" to duck type.
print
$errors->print($error);
Returns true if successful.
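Since only "print" is required, an error stream does not have to be a filehandle at all. The sketch below uses a hypothetical class, My::ErrorCollector, that keeps messages in memory, which might be handy in tests. The class and its "messages" method are assumptions for illustration only.

```perl
use strict;
use warnings;

# Hypothetical error stream (not part of PSGI) that collects messages
# in memory instead of writing them to a log.
package My::ErrorCollector;
sub new      { my $class = shift; return bless { messages => [] }, $class }
sub print    { my $self = shift; push @{ $self->{messages} }, @_; return 1 }
sub messages { my $self = shift; return @{ $self->{messages} } }

package main;

# An application that reports a problem through psgi.errors.
my $app = sub {
    my $env = shift;
    $env->{'psgi.errors'}->print("something went wrong\n");
    return [ 500, [ 'Content-Type' => 'text/plain' ], [ "Internal Server Error" ] ];
};

my $errors = My::ErrorCollector->new;
my $res    = $app->({ 'psgi.errors' => $errors });
print "logged: $_" for $errors->messages;
```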
The Response
The response MUST be a three element array reference if the application
wants to directly return the HTTP response.
An application MAY choose to return another type of response, such as
a code reference, to delay the response, but only if the server
supports streaming (see below).
Status
The HTTP status code is an integer and MUST be greater than or equal
to 100.
Headers
The headers must be an array reference (and NOT a hash reference!)
containing key and value pairs. Its number of elements MUST be even.
A header key MUST NOT be "Status", MUST NOT contain ":" or newlines,
and MUST NOT end in "-" or "_". Keys MUST consist only of letters,
digits, "_" or "-" and MUST start with a letter. The value of each
header MUST be a scalar value that contains a string. The value string
MUST NOT contain characters below octal 037, i.e. chr(31).
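The rules for header keys are easier to follow when written as code. This is a minimal sketch that checks only the array shape and the key names. The function name check_headers is hypothetical, and comparing "Status" case-insensitively is an assumption of this sketch, not something the text above states.

```perl
use strict;
use warnings;

# Hypothetical checker (not part of PSGI) for the header array of a
# response. Returns a list of problems; empty means the keys look fine.
sub check_headers {
    my $headers = shift;
    return ('headers must be an array reference')
        unless ref $headers eq 'ARRAY';
    return ('headers must have an even number of elements')
        if @$headers % 2;

    my @errors;
    for (my $i = 0; $i < @$headers; $i += 2) {
        my $name = $headers->[$i];
        # Assumption: "Status" is rejected regardless of case.
        push @errors, 'Status is not allowed as a header'
            if lc($name) eq 'status';
        # Letters, digits, "_" or "-" only; starts with a letter;
        # does not end in "-" or "_".
        push @errors, "invalid header name: $name"
            unless $name =~ /\A[A-Za-z][A-Za-z0-9_-]*\z/
            && $name !~ /[-_]\z/;
    }
    return @errors;
}

my @problems = check_headers(
    [ 'Content-Type' => 'text/plain', 'X-Trailing-' => 'oops' ]
);
print "$_\n" for @problems;
```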
If the same key name appears multiple times in an array ref, those
header lines MUST be sent to the client separately (e.g. multiple
"Set-Cookie" lines).
Content-Type
There MUST be a "Content-Type" except when the "Status" is 1xx, 204 or
304, in which case there MUST be none given.
Content-Length
There MUST NOT be a "Content-Length" header when the "Status" is 1xx,
204 or 304.
If the Status is not 1xx, 204 or 304 and there is no "Content-Length"
header, servers MAY calculate the content length by looking at the
Body, if it can be calculated (i.e. if it's an array ref of body chunks
or a real file handle), and append it to the outgoing headers.
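For the array reference case, the calculation might look like the following minimal sketch. The function name add_content_length is hypothetical, the sketch ignores filehandle bodies, and it assumes the body chunks are already byte strings so that "length" counts bytes.

```perl
use strict;
use warnings;

# Hypothetical server-side helper (not part of PSGI): add Content-Length
# when it is missing and the body is an array reference.
# The response $res is modified in place.
sub add_content_length {
    my $res = shift;
    my ($status, $headers, $body) = @$res;

    # No Content-Length for 1xx, 204 or 304 responses.
    return $res if $status < 200 || $status == 204 || $status == 304;
    return $res unless ref $body eq 'ARRAY';
    for (my $i = 0; $i < @$headers; $i += 2) {
        return $res if lc($headers->[$i]) eq 'content-length';
    }

    my $length = 0;
    $length += length($_) for @$body;    # assumes chunks are byte strings
    push @$headers, 'Content-Length' => $length;
    return $res;
}

my $res = add_content_length(
    [ 200, [ 'Content-Type' => 'text/plain' ], [ "Hello ", "World" ] ]
);
print join(' ', @{ $res->[1] }), "\n";
```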
Body
The response body is returned from the application as one of the
following two types of scalar value.
· An array reference containing body as lines.
my $body = [ "Hello\n", "World\n" ];
Note that the elements in an array reference are NOT REQUIRED to
end in a newline. Servers SHOULD just write each element as-is to
the client, and SHOULD NOT care whether the line ends with a newline
or not.
So, when you have a big chunk of HTML in a single scalar $body,
[ $body ]
is a valid response body.
· An IO::Handle-like object or a built-in filehandle.
open my $body, "<", "/path/to/file";
open my $body, "<:via(SomePerlIO)", ...;
my $body = IO::File->new("/path/to/file");
my $body = SomeClass->new(); # mock class that implements getline() and close()
Servers SHOULD NOT check the type or class of the body but instead
just call "getline" (i.e. duck type) to iterate over the body and
call "close" when done.
Servers MAY check if the body is a real filehandle using "fileno"
and "Scalar::Util::reftype" and if it's a real filehandle that has
a file descriptor, it MAY optimize the file serving using
techniques like sendfile(2).
The body object MAY respond to "path" method to return the local
file system path, which MAY be used by some servers to switch to
more efficient file serving method using the file path instead of a
file descriptor.
Servers are RECOMMENDED to set the $/ special variable to the buffer
size (as a reference to an integer) when reading content from $body
using the "getline" method, in case it's a binary filehandle.
Applications that return a mock object implementing "getline" are
NOT REQUIRED to respect the $/ value.
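Putting the two body types together, a server's output loop might look like this minimal sketch. The names send_body and My::Countdown are hypothetical, the 65536-byte record size is an arbitrary choice, and the $write callback stands in for writing to the client socket.

```perl
use strict;
use warnings;
use IO::Handle;    # makes ->getline and ->close available on filehandles

# Hypothetical mock body (not part of PSGI): any object that implements
# getline() and close() can be returned as a response body.
package My::Countdown;
sub new { my ($class, $n) = @_; return bless { n => $n, closed => 0 }, $class }
sub getline {
    my $self = shift;
    return undef if $self->{n} <= 0;    # undef signals the end of the body
    return $self->{n}-- . "\n";
}
sub close { my $self = shift; $self->{closed} = 1; return 1 }

package main;

# Hypothetical server-side helper: pass every body chunk to $write.
sub send_body {
    my ($body, $write) = @_;
    if (ref $body eq 'ARRAY') {
        $write->($_) for @$body;    # write each element as-is
        return;
    }
    # Read real filehandles in fixed-size records rather than lines.
    local $/ = \65536;
    while (defined(my $chunk = $body->getline)) {
        $write->($chunk);
    }
    $body->close;
    return;
}

my $out = '';
send_body([ "Hello\n", "World\n" ], sub { $out .= $_[0] });
my $countdown = My::Countdown->new(3);
send_body($countdown, sub { $out .= $_[0] });
print $out;
```

The mock object ignores $/, which this specification permits.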
Delayed Response and Streaming Body
The PSGI interface allows applications and servers to optionally handle
a callback-style response (instead of a three-element array reference)
to delay the HTTP response and stream content (server push).
To enable delayed response, an application SHOULD check if the
"psgi.streaming" environment value is true, and in that case, MAY
return a callback that is passed another callback (the response
starter) as its first argument, and pass the three-element response to
that callback.
my $app = sub {
    my $env = shift;
    # Delays response until it fetches content from the network
    return sub {
        my $respond = shift;
        fetch_content_from_server(sub {
            my $content = shift;
            # ...
            $respond->([ 200, $headers, [ $content ] ]);
        });
    };
};
Similarly, an application MAY omit the third element (the body) in the
callback to get a response writer object that implements the "write",
"poll_cb" and "close" methods to push the response body.
my $app = sub {
    my $env = shift;
    # immediately starts the response and stream the content
    return sub {
        my $respond = shift;
        my $writer = $respond->([ 200, [ 'Content-Type', 'application/json' ]]);
        wait_for_events(sub {
            my $new_event = shift;
            if ($new_event) {
                $writer->write($new_event->as_json . "\n");
                # Or:
                # $writer->poll_cb(sub { $_[0]->write($new_event->as_json . "\n") });
            } else {
                $writer->close;
            }
        });
    };
};
Delayed response and streaming are useful if you want to implement
non-blocking I/O based server streaming or long-poll Comet push
technology. An IO::Handle-like body object is pulled by the server,
while this streaming response lets the application push content.
This interface is optional: an application SHOULD check if the server
supports streaming. Servers MAY decide not to accept a streaming
response and throw an exception instead. Servers MUST set
"psgi.streaming" to true if this interface is supported. Servers MUST
return a writer object if the third element (response body) is omitted
or not defined in the response starter callback arguments.
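Seen from the server side, supporting this interface could look roughly like the sketch below. The names run_app and My::Writer are hypothetical, the writer appends to an in-memory buffer instead of a socket, and "poll_cb" is left out to keep the sketch short, so this is not a complete writer object.

```perl
use strict;
use warnings;

# Hypothetical writer object (not part of PSGI) that a server would hand
# to the application. It appends to a buffer instead of a socket and
# omits poll_cb for brevity.
package My::Writer;
sub new {
    my ($class, $buffer) = @_;
    return bless { buffer => $buffer, closed => 0 }, $class;
}
sub write { my ($self, $chunk) = @_; ${ $self->{buffer} } .= $chunk; return }
sub close { my $self = shift; $self->{closed} = 1; return }

package main;

# Hypothetical server-side entry point: run $app and collect its output.
sub run_app {
    my ($app, $env) = @_;
    my ($status, $headers, $output) = (undef, undef, '');

    my $res = $app->($env);
    if (ref $res eq 'ARRAY') {                # direct three-element response
        ($status, $headers) = @$res;
        $output .= $_ for @{ $res->[2] };     # array-ref bodies only, for brevity
    }
    elsif (ref $res eq 'CODE') {              # delayed or streaming response
        die "streaming is not supported" unless $env->{'psgi.streaming'};
        $res->(sub {                          # this is the response starter
            my $r = shift;
            ($status, $headers) = @$r;
            if (defined $r->[2]) {            # delayed response with a body
                $output .= $_ for @{ $r->[2] };
                return;
            }
            return My::Writer->new(\$output); # body omitted: return a writer
        });
    }
    return ($status, $headers, $output);
}

# An application that streams three chunks through the writer.
my $app = sub {
    my $env = shift;
    return sub {
        my $respond = shift;
        my $writer  = $respond->([ 200, [ 'Content-Type' => 'text/plain' ] ]);
        $writer->write("chunk $_\n") for 1 .. 3;
        $writer->close;
    };
};
my ($status, $headers, $output) = run_app($app, { 'psgi.streaming' => 1 });
print "$status\n$output";
```

A real server would send the status line and headers to the client as soon as the response starter is called, and each "write" would go straight to the connection.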
Middleware
Middleware is itself a PSGI application but it takes an existing PSGI
application and runs it like a server, mostly to do pre-processing on
$env or post-processing on the response objects.
Here's a simple example that appends a special HTTP header,
X-PSGI-Used, to the response of any PSGI application.
# $app is a simple PSGI application
my $app = sub {
    my $env = shift;
    return [ '200', [ 'Content-Type' => 'text/plain' ], [ "Hello World" ] ];
};
# $xheader is a middleware to wrap $app
my $xheader = sub {
    my $env = shift;
    my $res = $app->($env);
    push @{$res->[1]}, 'X-PSGI-Used' => 1;
    return $res;
};
Middleware itself MUST behave exactly like a PSGI application: take
$env and return $res. Middleware MAY decide not to support the
streaming interface (see above) but SHOULD pass through the response
types that it doesn't understand.
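One possible way to follow this advice is to write the middleware as a function that wraps any application and handles each response type separately. The sketch below is an illustration only: the name add_header is hypothetical, and wrapping the response starter is one approach to supporting delayed responses, not the only one.

```perl
use strict;
use warnings;

# Hypothetical middleware factory (not part of PSGI): wraps any PSGI
# application and appends one header to its response.
sub add_header {
    my ($app, $name, $value) = @_;
    return sub {
        my $env = shift;
        my $res = $app->($env);

        if (ref $res eq 'ARRAY') {
            # Direct response: modify the header array.
            push @{ $res->[1] }, $name => $value;
            return $res;
        }
        if (ref $res eq 'CODE') {
            # Delayed response: wrap the response starter so the header
            # is added when the application eventually responds.
            return sub {
                my $respond = shift;
                $res->(sub {
                    my $r = shift;
                    push @{ $r->[1] }, $name => $value;
                    return $respond->($r);
                });
            };
        }
        return $res;    # unknown response type: pass it through untouched
    };
}

my $hello = sub {
    my $env = shift;
    return [ 200, [ 'Content-Type' => 'text/plain' ], [ "Hello World" ] ];
};
my $wrapped = add_header($hello, 'X-PSGI-Used' => 1);
my $res     = $wrapped->({});
print join(', ', @{ $res->[1] }), "\n";
```

Because the wrapped result is itself a code reference that takes $env, several such wrappers can be stacked around one application.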
ACKNOWLEDGEMENTS
Some parts of this specification are adopted from the following
specifications.
· PEP333 Python Web Server Gateway Interface
<http://www.python.org/dev/peps/pep-0333>
· Rack <http://rack.rubyforge.org/doc/SPEC.html>
· JSGI Specification <http://jackjs.org/jsgi-spec.html>
I'd like to thank the authors of these great documents.
AUTHOR
Tatsuhiko Miyagawa <miyagawa@bulknews.net>
CONTRIBUTORS
The following people have contributed to the PSGI specification and
Plack implementation by committing their code, sending patches,
reporting bugs, asking questions, offering useful advice,
nitpicking, chatting on IRC or commenting on my blog (in no particular
order):
Tokuhiro Matsuno
Kazuhiro Osawa
Yuval Kogman
Kazuho Oku
Alexis Sukrieh
Takatoshi Kitano
Stevan Little
Daisuke Murase
mala
Pedro Melo
Jesse Luehrs
John Beppu
Shawn M Moore
Mark Stosberg
Matt S Trout
Jesse Vincent
Chia-liang Kao
Dave Rolsky
Hans Dieter Pearcey
Randy J Ray
Benjamin Trott
Max Maischein
Slaven Rezić
Marcel Gruenauer
Masayoshi Sekimura
Brock Wilcox
Piers Cawley
Daisuke Maki
Kang-min Liu
Yasuhiro Matsumoto
Ash Berlin
Artur Bergman
Simon Cozens
Scott McWhirter
Jiro Nishiguchi
Masahiro Chiba
Patrick Donelan
Paul Driver
COPYRIGHT AND LICENSE
Copyright Tatsuhiko Miyagawa, 2009.
This document is licensed under the Creative Commons license by-sa.
perl v5.14.0 2009-10-22 PSGI(3)