MAKEPP_SIGNATURES(1) Makepp MAKEPP_SIGNATURES(1)NAMEmakepp_signatures-- How makepp knows when files have changed
DESCRIPTION
C: C,
c_compilation_md5, M: "md5", P: "plain", S: "shared_object",
X: "xml",
xml_space
Each file is associated with a signature, which is a string that
changes if the file has changed. Makepp compares signatures to see
whether it needs to rebuild anything. The default signature for files
is a concatenation of the file's modification time and its size, unless
you're executing a C/C++ compilation command, in which case the default
signature is a cryptographic checksum on the file's contents, ignoring
comments and whitespace. If you want, you can switch to a different
method, or you can define your own signature functions.
How the signature is actually used is controlled by the build check
method (see makepp_build_check). Normally, if a file's signature
changes, the file itself is considered to have changed, and makepp
forces a rebuild.
If makepp is building a file, and you don't think it should be, you
might want to check the build log (see makepplog). Makepp writes an
explanation of what it thought each file depended on, and why it chose
to rebuild.
There are several signature methods included in makepp. Makepp usually
picks the most appropriate standard one automatically. However, you
can change the signature method for an individual rule by using
":signature" modifier on the rule which depends on the files you want
to check, or for all rules in a makefile by using the "signature"
statement, or for all makefiles at once using the "-m" or
"--signature-method" command line option.
Mpp::Signature methods included in the distribution
plain (actually nameless)
The plain signature method is the file's modification time and the
file's size, concatenated. These values are quickly obtainable
from the operating system and almost always change when the file
changes.
Makepp used to look only at the file's modification time, but if
you run makepp several times within a second (e.g., in a script
that's building several small things), sometimes modification times
won't change. Then, hopefully the file's size will change.
If the case where you may run makepp several times a second is a
problem for you, you may find that using the "md5" method is
somewhat more reliable. If makepp builds a file, it flushes its
cached MD5 signatures even if the file's date hasn't changed.
For efficiency's sake, makepp won't reread the file and recompute
the complex signatures below if this plain signature hasn't changed
since the last time it computed it. This can theoretically cause a
problem, since it's possible to change the file's contents without
changing its date and size. In practice, this is quite hard to do
so it's not a serious danger. In the future, as more filesystems
switch to timestamps of under a second, hopefully Perl will give us
access to this info, making this failsafe.
C
c_compilation_md5
This is the method for input files to C like compilers, if your
Perl supports MD5 (only optionally available up to 5.6.2). It
checks if a file's name looks like C or C++ source code, including
things like Corba IDL. If it does, this method applies. If it
doesn't, it falls back to plain signatures for binary files
(determined by name or else by content) and else to "md5". Alas
you can currently tune what is to be considered C or binary only by
subclassing "Mpp::Signature::c_compilation_md5" to get a new
signature type and there overriding or extending the methods
"recognizes_file" and "excludes_file".
The idea is to be independent of formatting changes. This is done
by pulling everything up as far as possible, and by eliminating
insignificant spaces. Words are exempt from pulling up, since they
might be macros containing "__LINE__", so they remain on the line
where they were.
// ignored comment
#ifdef XYZ
#include <xyz.h>
#endif
int a = 1;
void f
(
int b
)
{
a += b + ++c;
}
/* more ignored comment */
is treated as though it were
#ifdef XYZ
#include<xyz.h>
#endif
int a=1;
void f(
int b){
a+=b+ ++c;}
That way you can reindent your code or add or change comments
without triggering a rebuild, so long as you don't change the line
numbers. (This signature method recompiles if line numbers have
changed because that causes calls to "__LINE__" and most debugging
information to change.) It also ignores whitespace and comments
after the last token. This is useful for preventing a useless
rebuild if your VC adds lines at a "$""Log$" tag when checking in.
This method is particularly useful for the following situations:
· You want to make changes to the comments in a commonly included
header file, or you want to reformat or reindent part of it.
For one project that I worked on a long time ago, we were very
unwilling to correct inaccurate comments in a common header
file, even when they were seriously misleading, because doing
so would trigger several hours of rebuilds. With this
signature method, this is no longer a problem.
· You like to save your files often, and your editor (unlike
emacs) will happily write a new copy out even if nothing has
changed.
· You have C/C++ source files which are generated automatically
by other build commands (e.g., yacc or some other
preprocessor). For one system I work with, we have a
preprocessor which (like yacc) produces two output files, a
".cxx" and a ".h" file:
%.h %.cxx: %.qtdlg $(HLIB)/Qt/qt_dialog_generator
$(HLIB)/Qt/qt_dialog_generator $(input)
Every time the input file changed, the resulting .h file also
was rewritten, and ordinarily this would trigger a rebuild of
everything that included it. However, most of the time the
contents of the .h file didn't actually change (except for a
comment about the build time written by the preprocessor), so a
recompilation was not actually necessary.
md5 This is the default method, if your Perl supports MD5 (optionally
available up to 5.6.2). Computes an MD5 checksum of the file's
contents, rather than looking at the file's date or size. This
means that if you change the date on the file but don't change its
contents, makepp won't try to rebuild anything that depends on it.
This is particularly useful if you have some file which is often
regenerated during the build process that other files depend on,
but which usually doesn't actually change. If you use the "md5"
signature checking method, makepp will realize that the file's
contents haven't changed even if the file's date has changed. (Of
course, this won't help if the files have a timestamp written
inside of them, as archive files do for example.)
shared_object
This method only works if you have the utility "nm" in your path,
and it accepts the "-P" option to output Posix format. In that
case only the names and types of symbols in dynamically loaded
libraries become part of their signature. The result is that you
can change the coding of functions without having to relink the
programs that use them.
In the following command the parser will detect an implicit
dependency on $(LIBDIR)/libmylib.so, and build it if necessary.
However the link command will only be reperformed whenever the
library exports a different set of symbols:
myprog: $(OBJECTS) :signature shared_object
$(LD) -L$(LIBDIR) -lmylib $(inputs) -o $(output)
This works as long as the functions' interfaces don't change. But
in that case you'd change the declaration, so you'd also need to
change the callers.
Note that this method only applies to files whose name looks like a
shared library. For all other files it falls back to
"c_compilation_md5", which may in turn fall back to others.
xml
xml_space
These are two similar methods which treat xml canonically and
differ only in their handling of whitespace. The first completely
ignores it around tags and considers it like a single space
elsewhere, making the signature immune to formatting changes. The
second respects any whitespace in the xml, which is necessary even
if just a small part requires that, like a "<pre>" section in an
xhtml document.
Common to both methods is that they sign the essence of each xml
document. Presence or not of a BOM or "<?xml?>" header is ignored.
Comments are ignored, as is whether text is protected as "CDATA" or
with entities. Order and quoting style of attributes doesn't
matter, nor does how you render empty tags.
For any file which is not valid xml, or if the Expat based
"XML::Parser" or the "XML::LibXML" parser is not installed, this
falls back to method md5. If you switch your Perl installation
from one of the parsers to the others, makepp will think the files
are different as soon as their timestamp changes. This is because
the result of either parser is logically equivalent, but they
produce different signatures. In the unlikely case that this is a
problem, you can force use of only "XML::LibXML" by setting in
Perl:
$Mpp::Signature::xml::libxml = 1;
Extending applicability
The "C" or "c_compilation_md5" method has a built in list of suffixes
it recognizes as being C or C-like. If it gets applied to other files
it falls back to simpler signature methods. But many file types are
syntactically close enough to C++ for this method to be useful. Close
enough means C++ comment and string syntax and whitespace is
meaningless except one space between words (and C++'s problem cases "-
-", "+ +", "/ *" and "< <").
It (and its subclasses) can now easily be extended to other suffixes.
Anyplace you can specify a signature you can now tack on one one of
these syntaxes to make the method accept additional filenames:
C.suffix1,suffix2,suffix3
One or more comma-separated suffixes can be added to the method by
a colon. For example "C.ipp,tpp" means that besides the built in
suffixes it will also apply to files ending in .ipp or .tpp, which
you might be using for the inline and template part of C++ headers.
C.(suffix-regexp)
This is like the previous, but instead of enumerating suffixes, you
give a Perl regular expression to match the ones you want. The
previous example would be "C.(ipp|tpp)" or "C.([it]pp)" in this
syntax.
C(regexp)
Without a dot the Perl regular expression can match anywhere in the
file name. If it includes a slash, it will be tried against the
fully qualified filename. So if you have C++ style suffixless
headers in a directory include, use "C(/include/)" as your
signature method. However the above suffix example would be quite
nasty this way, "C(\.(?:ipp|tpp)$$)" or "C(\.[it]pp$$)" because "$"
is the expansion character in makefiles.
Shortcomings
Signature methods apply to all files of a rule. Now if you have a
compiler that takes a C like source code and an XML configuration file
you'd either need a combined signature method that smartly handles both
file types, or you must choose an existing method which will not know
whether a change in the other file is significant.
In the future signature method configuration may be changed to
filename-pattern, optionally per command.
Custom methods
You can, if you want, define your own methods for calculating file
signatures and comparing them. You will need to write a Perl module to
do this. Have a look at the comments in "Mpp/Signature.pm" in the
distribution, and also at the existing signature algorithms in
"Mpp/Signature/*.pm" for details.
Here are some cases where you might want a custom signature method:
· When you want all changes in a file to be ignored. Say you always
want dateStamp.o to be a dependency (to force a rebuild), but you
don't want to rebuild if only dateStamp.o has changed. You could
define a signature method that inherits from "c_compilation_md5"
that recognizes the dateStamp.o file by its name, and always
returns a constant value for that file.
· When you want to ignore part of a file. Suppose that you have a
program that generates a file that has a date stamp in it, but you
don't want to recompile if only the date stamp has changed. Just
define a signature method similar to "c_compilation_md5" that
understands your file format and skips the parts you don't want to
take into account.
perl v5.20.3 2012-02-07 MAKEPP_SIGNATURES(1)