Discussion:
[cmake-developers] Experiments in CMake support for Clang (header & standard) modules
Stephen Kelly
2018-05-07 16:01:47 UTC
I think this discussion is more suited to the cmake-developers mailing
list. Moving there. Hopefully Brad or someone else can provide other
input from research already done.
  # build the binary module interface (BMI) from the module map
  clang++ -fmodules -xc++ -Xclang -emit-module -Xclang -fmodules-codegen \
      -fmodule-name=foo foo.modulemap -o foo.pcm
  # compile a consumer of the module against the BMI
  clang++ -fmodules -c -fmodule-file=foo.pcm use.cpp
  # compile the BMI itself to an object file (enabled by -fmodules-codegen)
  clang++ -c foo.pcm
  # link
  clang++ foo.o use.o -o a.out
Ok. Fundamentally, I am suspicious of having to have a
-fmodule-file=foo.pcm for every 'import foo' in each cpp file. I
shouldn't have to manually add that each time I add a new import
to my cpp file. Even if it can be automated (eg by CMake), I
shouldn't have to have my buildsystem be regenerated each time I
add an import to my cpp file either.
That's something I mentioned in the google groups post I made
which you linked to. How will that work when using Qt or any other
library?
- My understanding/feeling is that this would be similar to how a user
has to change their link command when they pick up a new dependency.
Perhaps it would be interesting to get an idea of how often users need
to change their buildsystems because of a new link dependency, and how
often users add includes to existing c++ files.

I expect you'll find the latter to be a far bigger number.

I also suspect that requiring users to edit their buildsystem, or allow
it to be regenerated, every time they add/remove includes would lead to
less adoption of modules. I can see people trying them and then giving
up in frustration.

I think I read somewhere that the buildsystem in google already requires
included '.h' files to be listed explicitly in the buildsystem, so it's
no change in workflow there. For other teams, it would be a change in
workflow, and something likely to be rebelled against.

By the way, do you have any idea how much modules adoption would be
needed to constitute "success"? Is there a goal there?
Nope, scratch that ^ I had thought that was the case, but talking more
with Richard Smith it seems there's an expectation that modules will
be somewhere between header and library granularity (obviously some
small libraries today have one or only a few headers, some (like Qt)
have many - maybe those on the Qt end might have slightly fewer
modules than they have headers - but still several modules to one
library most likely, by the sounds of it)
Why? Richard maybe you can answer that? These are the kinds of things I
was trying to get answers to in the previous post to iso sg2 in the
google group. I didn't get an answer as definitive as this, so maybe you
can share the reason behind such a definitive answer?
Now, admittedly, external dependencies are a little more complicated
than internal (within a single project consisting of multiple
libraries) - which is why I'd like to focus a bit on the simpler
internal case first.
Fair enough.
 
Today, a beginner can find a random C++ book, type in a code
example from chapter one and put `g++ -I/opt/book_examples
prog1.cpp` into a terminal and get something compiling and
running. With modules, they'll potentially have to pass a whole
list of module files too.
Yeah, there's some talk of supporting a mode that doesn't explicitly
build/use modules in the filesystem, but only in memory for the
purpose of preserving the isolation semantics of modules. This would
be used in simple direct-compilation cases like this. Such a library
might need a configuration file or similar the compiler can parse to
discover the parameters (warning flags, define flags, whatever else)
needed to build the BMI.
Perhaps. I'd be interested in how far into the book such a system would
take a beginner. Maybe that's fine, I don't know. Such a system might
not help with code in Stack Overflow questions/answers though, which
would probably be simpler to keep using includes (e.g. for Qt/boost).

Library authors will presumably have some say, or try to introduce some
'best practice' for users to follow. And such best practice will be
different for each library.
 
 
I raised some of these issues a few years ago regarding the Clang
implementation:
http://clang-developers.42468.n3.nabble.com/How-do-I-try-out-C-modules-with-clang-td4041946.html
http://clang-developers.42468.n3.nabble.com/How-do-I-try-out-C-modules-with-clang-td4041946i20.html
Interestingly, GCC is taking a directory-centric approach in the
driver (-fmodule-path=<dir>) as opposed to the 'add a file to your
command line' approach:
 http://gcc.gnu.org/wiki/cxx-modules
Why is Clang not doing a directory-centric driver-interface? It
seems to obviously solve problems. I wonder if modules can be a
success without coordination between major compiler and
buildsystem developers. That's why I made the git repo - to help
work on something more concrete to see how things scale.
'We' (myself & other Clang developers) are/will be talking to GCC
folks to try to get consistency here, in one direction or another
(maybe some third direction different from Clang's or GCC's). As you
noted in a follow-up, there is a directory-based flag in Clang now,
added by Boris as he's been working through adding modules support to
Build2.
I just looked through the commits from Boris, and it seems he made some
changes relating to -fmodule-file=. That still presupposes that all
(transitively) used module files are specified on the command line.

I was talking about the -fprebuilt-module-path option added by Manman
Ren in https://reviews.llvm.org/D23125 because that actually relieves
the user/buildsystem of maintaining a list of all used modules (I hope).
Having just read all of my old posts again, I still worry things
like this will hinder modules 'too much' to be successful. The
more (small) barriers exist, the less chance of success. If
modules aren't successful, then they'll become a poisoned chalice
and no one will be able to work on fixing them. That's actually
exactly what I expect to happen, but I also still hope I'm just
missing something :). I really want to see a committee document
from the people working on modules which actually explores the
problems and barriers to adoption and concludes with 'none of
those things matter'. I think it's fixable, but I haven't seen
anyone interested enough to fix the problems (or even to find out
what they are).
Indeed - hence my desire to talk through these things, get some
practical experience, document them to the committee in perhaps a
less-ranty, more concrete form along with pros/cons/unknowns/etc to
hopefully find some consistency, maybe write up a document of "this is
how we expect build systems to integrate with this C++ feature", etc.
Great. Nathan Sidwell already wrote a paper which is clearer than I am
on some of the problems:

 http://open-std.org/JTC1/SC22/WG21/docs/papers/2017/p0778r0.pdf

However he told me it 'wasn't popular'. I don't know if he means the
problems were dismissed, or his proposed solution was dismissed as not
popular.

Nevertheless, I recommend reading the problems stated there.
My current very simplistic prototype, to build a module file, its
respective module object file, and include those in the build, looks
like:
  add_custom_command(
          OUTPUT ${CMAKE_CURRENT_BINARY_DIR}/hello_module.pcm
          COMMAND ${CMAKE_CXX_COMPILER} ${CMAKE_CXX_FLAGS} -xc++ -c
                  -Xclang -emit-module -fmodules -fmodule-name=Hello
                  ${CMAKE_CURRENT_SOURCE_DIR}/module.modulemap
                  -o ${CMAKE_CURRENT_BINARY_DIR}/hello_module.pcm
                  -Xclang -fmodules-codegen
          DEPENDS module.modulemap hello.h
  )
Why does this command depend on hello.h?
Because it builds the binary module interface (hello_module.pcm) that
is a serialized form of the compiler's internal representation of the
contents of module.modulemap which refers to hello.h (the modulemap
lists the header files that are part of the module). This is all using
Clang's current backwards semi-compatible "header modules" stuff. In a
"real" modules system, ideally there wouldn't be any modulemap. Just a
.cppm file, and any files it depends on (discovered through the build
system scanning the module imports, or a compiler-driven .d file style
thing).
Perhaps it'd be better for me to demonstrate something closer to the
actual modules reality, rather than this retro header modules stuff
that clang supports.
That would be better for me. I'm interested in modules-ts, but I'm not
interested in clang-modules.
 
If that is changed and module.modulemap is not, what will happen?
If hello.h is changed and module.modulemap is not changed? The
hello_module.pcm does need to be rebuilt.
Hmm, this assumes that the pcm/BMI only contains declarations and not
definitions, right? I think clang outputs the definitions in a separate
object file, but GCC currently doesn't. Perhaps that's a difference that
cmake has to account for or pass on to the user.
Ideally all of this would be implicit (maybe with some
flag/configuration, or detected based on new file extensions for C++
interface definitions) in the add_library - taking, let's imagine, the
.ccm (let's say, for argument's sake*) file listed in the
add_library's inputs and using it to build a .pcm (BMI), building that
.pcm as an object file along with all the normal .cc files,
Ok, I think this is the separation I described above.
* alternatively, maybe they'll all just be .cc files & a build system
would be scanning the .cc files to figure out dependencies & could
notice that one of them is the blessed module interface definition
based on the first line in the file.
Today, users have to contend with errors resulting from their own code
being incorrect, using some 3rd party template incorrectly, linking not
working due to incorrect link dependencies, and incorrect compiles due
to missing include directories (or incorrect defines specified). I can
see incorrect inputs to module generation being a new category of errors
to confuse users.

For example, if in your list of files there are two files which look
like the blessed module interface based on the first line in the file,
there will be something to debug.
So I suppose the more advanced question: Is there a way I can extend
handling of existing CXX files (and/or define a new kind of file, say,
CXXM?) specified in a cc_library. If I want to potentially check if a
.cc file is a module, discover its module dependencies, add new rules
about how to build those, etc. Is that do-able within my cmake
project, or would that require changes to cmake itself? (I'm happy to
poke around at what those changes might look like)
One of the things users can do in order to ensure that CMake works best
is to explicitly list the cpp files they want compiled, instead of
relying on globbing as users are prone to want to do:

 https://stackoverflow.com/questions/1027247/is-it-better-to-specify-source-files-with-glob-or-each-file-individually-in-cmak

If using globbing, adding a new file does not cause the buildsystem to
be regenerated, and you won't have a working build until you explicitly
cause cmake to be run again.

I expect you could get into similar problems with modules - needing a
module to be regenerated because its dependencies change (because it
exports what it imports from a dependency for example). I'm not sure
anything can be done to cause cmake to reliably regenerate the module in
that case. It seems similar to the globbing case to me.

But aside from that you could probably experimentally come up with a way
to do the check for whether a file is a module and discover its direct
dependencies using file(READ). You might want to delegate to a script in
another language to determine transitive dependencies and what
add_custom{_command,_target} code to generate.
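Such a delegated scanning script might start as small as the sketch below. Everything here is an assumption for illustration: the `scan_imports` name is made up, it expects one import per line, and it has no preprocessor awareness (so imports inside ifdefs are mishandled, as discussed above).

```shell
# Hypothetical helper a buildsystem could shell out to: prints the
# direct module imports of a C++ source file, one name per line.
# Deliberately simplistic: one import per line, no #ifdef handling.
scan_imports() {
  grep -E '^[[:space:]]*(export[[:space:]]+)?import[[:space:]]+[A-Za-z_][A-Za-z0-9_.:]*[[:space:]]*;' "$1" \
    | sed -E 's/^[[:space:]]*(export[[:space:]]+)?import[[:space:]]+([A-Za-z_][A-Za-z0-9_.:]*)[[:space:]]*;.*$/\2/'
}
```

A real implementation would also have to chase transitive imports by scanning each discovered module's interface file in turn.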
But this isn't ideal - I don't /think/ I've got the dependencies
quite right & things might not be rebuilding at the right times.
Also it involves hardcoding a bunch of things like the pcm file
names, header files, etc.
Indeed. I think part of that comes from the way modules have been
designed. The TS has similar issues.
Sure - but I'd still be curious to understand how I might go about
modifying the build system to handle this. If there are obvious things
I have gotten wrong about the dependencies, etc, that would cause this
not to rebuild on modifications to any of the source/header files -
I'd love any tips you've got.
Sure. I didn't notice anything from reading, but I also didn't try it
out. You might need to provide a repo with the module.modulemap/c++
files etc that are part of your experiment. Or better, provide something
based on modules-ts that I can try out.
& if there are good paths forward for ways to prototype changes to the
build system to handle, say, specifying a switch/setting a
property/turning on a feature that I could implement that would
collect all the .ccm files in an add_library rule and use them to make
a .pcm file - I'd be happy to try prototyping that.
cmGeneratorTarget has a set of methods like GetResxSources which return
a subset of the files provided to add_library/target_sources by
splitting them by 'kind'. You would probably extend ComputeKindedSources
to handle the ccm extension, add a GetCCMFiles() to cmGeneratorTarget,
then use that new GetCCMFiles() in the makefiles/ninja generator to
generate rules.

When extending ComputeKindedSources you could use

 if(Target->getPropertyAsBool("MAKE_CCM_RULES"))

as a condition to populating the 'kind'. Then rules will only be created
for targets which use something like

 set_property(TARGET myTarget PROPERTY MAKE_CCM_RULES ON)

in cmake code.

I'm guessing that's enough for you to implement what you want as an
experiment?
Ideally, at least for a simplistic build, I wouldn't mind
generating a modulemap from all the .h files (& have those
headers listed in the add_library command - perhaps splitting
public and private headers in some way, only including the public
headers in the module file, likely). Eventually for the
standards-proposal version, it's expected that there won't be any
modulemap file, but maybe all headers are included in the module
compilation (just pass the headers directly to the compiler).
In a design based on passing directories instead of files, would
those directories be redundant with the include directories?
I'm not sure I understand the question, but if I do, I think the
answer would be: no, they wouldn't be redundant. The system will not
have precompiled modules available to use - because binary module
definitions are compiler (& compiler version, and to some degree,
compiler flags (eg: are you building this for x86 32 bit or 64 bit?))
dependent.
Right. I discussed modules with Nathan Sidwell meanwhile and realised
this too.
 
One of the problems modules adoption will hit is that all the
compilers are designing fundamentally different command line
interfaces for them.
*nod* We'll be working amongst GCC and Clang at least to try to
converge on something common.
Different flags would not be a problem for cmake at least, but if Clang
didn't have something like -fprebuilt-module-path and GCC did, that
would be the kind of 'fundamental' difference I mean.
This also doesn't start to approach the issue of how to build
modules for external libraries - which I'm happy to
discuss/prototype too, though interested in working to streamline
the inter-library but intra-project (not inter-project) case first.
Yes, there are many aspects to consider.
Are you interested in design of a CMake abstraction for this
stuff? I have thoughts on that, but I don't know if your level of
interest stretches that far.
Not sure how much work it'd be - at the moment my immediate interest
is to show as much real-world/can-actually-run prototype with cmake as
possible, either with or without changes to cmake itself (or a
combination of minimal cmake changes plus project-specific recipes of
how to write a user's cmake files to work with this stuff) or also
showing non-working/hypothetical prototypes of what ideal user cmake
files would look like with reasonable/viable (but not yet implemented)
cmake support.
Yes, it's specifying the ideal user cmake files that I mean. Given that
granularity of modules can be anywhere on the spectrum between
one-module-file-per-library and one-module-file-per-class, I think cmake
will need to consider one-module-file-per-library and
*not*-one-module-file-per-library separately.

In the *not*-one-module-file-per-library case, cmake might have to
delegate more to the user, so it would be more inconvenient for them.

In the one-module-file-per-library case, I think the ideal is something
like:

 add_library(foo foo.cpp)
 # assuming foo.h is a module interface file, this creates
 # a c++-module called foo and makes it an interface usage
 # requirement of the foo target defined above
 add_cxx_module(foo foo.h)

 # bar.cpp imports foo.
 add_library(bar bar.cpp)
 # bar links to foo, and a suitable compile line argument is added if
 # needed for the foo module.
 target_link_libraries(bar foo)

This would work best if foo.h did not contain

 module;
 export module foo;

(after http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0713r1.html)

but instead contained only

 module;

and the module name came from the buildsystem (or from the compiler
using the basename).

As it is, the above cmake code would have to determine the module name
from foo.h and throw an error if it was different from foo. Having the
module name inside the source just adds scope for things to be wrong. It
would be better to specify the module name on the outside.

I wonder what you think about that, and whether it can be changed in the
modules ts? My thoughts on this and what the ideal in cmake would be are
changing as the discussion continues.
Can you help? It would really help my understanding of where
things currently stand with modules.
I can certainly have a go, for sure.
Great, thanks.
 
For example, is there only one way to port the contents of the cpp
files?
Much like header grouping - how granular headers are (how many headers
you have for a given library) is up to the developer to some degree
(certain things can't be split up), similarly with modules - given a
set of C++ definitions, it's not 100% constrained how those
definitions are exposed as modules - the developer has some freedom
over how the declarations of those entities are grouped into modules.
Yes, exactly. This repo is small, but has a few libraries, so if we
start with one approach we should be easily able to also try a different
approach and examine what the difference is and what it means.
 
After that, is there one importable module per class or one per
shared library (which I think would make more sense for Qt)?
Apparently (this was a surprise to me - since I'd been thinking about
this based on the Clang header modules (backwards compatibility stuff,
not the standardized/new language feature modules)) the thinking is
probably somewhere between one-per-class and one-per-shared-library.
But for me, in terms of how a build file would interact with this,
more than one-per-shared-library is probably the critical tipping point.
Yes. I think you're talking about the one-module-file-per-library and
*not*-one-module-file-per-library distinction I mentioned above.
If it was just one per shared library, then I'd feel like the
dependency/flag management would be relatively simple. You have to add
a flag to the linker commandline to link in a library, so you have to
add a flag to the compile step to reference a module, great. But no,
it's a bit more complicated than that, given the finer granularity
that's expected here.
"finer granularity that's *allowed* here" really. If there is a simple
thing for the user to do (ie one-module-file-per-library), then cmake
can make that simple to achieve (because the dependencies between
modules are the same as dependencies between targets, which the user
already specifies with target_link_libraries).

If the user wants to do the more complicated thing
(*not*-one-module-file-per-library), then cmake can provide APIs for the
user to do that (perhaps by requiring the user to explicitly specify the
dependencies between modules).

My point is that cmake can optimize its design for the easy way, and I
think users will choose the easy way most of the time.
 
The git repo is an attempt to make the discussion concrete because
it would show how multiple classes and multiple libraries with
dependencies could interact in a modules world. I'm interested in
what it would look like ported to modules-ts, because as far as I
know, clang-modules and module maps would not need porting of the
cpp files at all.
Right, clang header-modules is a backwards compatibility feature. It
does require a constrained subset of C++ to be used to be effective
(ie: basically your headers need to be what we think of as
ideal/canonical headers - reincludable, independent, complete, etc).
So if you've got good/isolated headers, you can port them to Clang's
header modules by adding the module maps & potentially not doing
anything else - though, if you rely on not changing your build system,
then that presents some problems if you want to scale (more cores) or
distribute your build. Because the build system doesn't know about
these dependencies - so if you have, say, two .cc files that both
include foo.h then bar.h - well, the build system runs two compiles,
both compiles try to implicitly build the foo.h module - one blocks
waiting for the other to complete, then they continue and block again
waiting for bar.h module to be built. If the build system knew about
these dependencies (what Google uses - what we call "explicit
(header)modules") then it could build the foo.h module and the bar.h
module in parallel, then build the two .cc files in parallel.
I think that the 'build a module' step should be a precondition to the
compile step. I think the compiler should issue an error if it
encounters an import for a module it doesn't find a file for. No one
expects a linker to compile foo.cpp into foo.o and link it just because
it encounters a fooFunc without a definition which was declared in foo.h.

That would reduce the magic and expect something like

 add_cxx_module(somename somefile.h otherfiles.h)

to specify a module file and its constituent partitions, which I think
is fine.
Basically: What do folks think about supporting these sort of
features in CMake C++ Builds? Any pointers on how I might best
implement this with or without changes to CMake?
I think some design is needed up front. I expect CMake would want
to have a first-class (on equal footing with include directories
or compile definitions and with particular handling) concept for
modules, extending the install(TARGET) command to install module
binary files etc.
Module binary files wouldn't be installed in the sense of being part
of the shipped package of a library - because module binary files are
compiler/flag/etc specific.
Ok.

Thanks,

Stephen.
Brad King
2018-05-07 17:13:14 UTC
Hopefully Brad or someone else can provide other input from research already done.
I'm not particularly familiar with what compiler writers or the modules
standard specification expects build systems to do w.r.t modules.
However, IIUC at least at one time the expectation was that the module
files would not be installed like headers and are used only within a
local build tree. Are modules even supposed to be first-class entities
in the build system specification that users write?

In the Fortran world users just list all the sources and build systems are
expected to figure it out. CMake has very good support for Fortran modules.
Our Ninja generator has rules to preprocess the translation units first,
then parse the preprocessed output to extract module definitions and usages,
then inject the added dependencies into the build graph, and then begin
compilation of sources ordered by those dependencies (this requires a
custom fork of Ninja pending upstream acceptance).

Is that what is expected from C++ buildsystems for modules too?

-Brad
--
Powered by www.kitware.com

Please keep messages on-topic and check the CMake FAQ at: http://www.cmake.org/Wiki/CMake_FAQ

Kitware offers various services to support the CMake community. For more information on each offering, please visit:

CMake Support: http://cmake.org/cmake/help/support.html
CMake Consulting: http://cmake.org/cmake/help/consulting.html
CMake Training Courses: http://cmake.org/cmake/help/training.html

Visit other Kitware open-source projects at http://www.kitware.com/opensource/opensource.html

Follow this link to subscribe/unsubscribe:
https://cmake.org/mailman/listinfo/cmake-developers
David Blaikie
2018-05-09 22:36:31 UTC
Post by Stephen Kelly
Hopefully Brad or someone else can provide other input from research
already done.
I'm not particularly familiar with what compiler writers or the modules
standard specification expects build systems to do w.r.t modules.
However, IIUC at least at one time the expectation was that the module
files would not be installed like headers
The module interface source file is, to the best of my knowledge, intended
to be installed like headers - and I'm currently advocating/leaning/pushing
towards it being installed exactly that way (in the same directories, using
the same include search path, etc).
Post by Stephen Kelly
and are used only within a local build tree.
The /binary/ representation of the module interface is likely to be only
local to a specific build tree - since it's not portable between compilers
or different compiler flag sets, even.
Post by Stephen Kelly
Are modules even supposed to be first-class entities
in the build system specification that users write?
In the Fortran world users just list all the sources and build systems are
expected to figure it out. CMake has very good support for Fortran modules.
Our Ninja generator has rules to preprocess the translation units first,
then parse the preprocessed output to extract module definitions and usages,
then inject the added dependencies into the build graph, and then begin
compilation of sources ordered by those dependencies (this requires a
custom fork of Ninja pending upstream acceptance).
Is that what is expected from C++ buildsystems for modules too?
Yes, likely something along those lines - though I'm looking at a few
different possible support models. A couple of major different ones (that
may be both supported by GCC and Clang at least, if they work out/make
sense) are:

* the wrapper-script approach, where, once the compiler determines the set
of direct module dependencies, it would invoke a script to ask for the
location of the binary module interface files for those modules. Build
systems could use this to dynamically discover the module dependencies of a
file as it is being compiled.

* tool-based parsing (more like what you've described Fortran+Ninja+CMake
is doing). The goal is to limit the syntax of modules code enough that
discovering the set of direct module dependencies is practical for an
external (non-compiler) tool - much like just some preprocessing and
looking for relatively simple keywords, etc. - then the tool/build system
can find the dependencies ahead of time without running the compiler

(a 3rd scenario, is what I've been sort of calling the "hello world"
example - where it's probably important that it's still practical for a new
user to compile something simple like "hello world" that uses a modularized
standard library, without having to use a build system for it (ie: just run
the compiler & it goes off & builds the in-memory equivalent of BMIs
without even writing them to disk/reusing them in any way))
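The wrapper-script model in the first bullet could be as simple as a lookup honoring some agreed-upon convention. Everything below is an assumption for illustration (the `module_lookup` name, the BMI_CACHE directory, the <name>.pcm mapping, and the idea that the compiler passes module names as arguments); no compiler defines this protocol today.

```shell
# Hypothetical wrapper: the compiler would invoke it with the names of
# the modules a TU imports, and it answers with one BMI path per name.
# BMI_CACHE and the <name>.pcm naming convention are invented here.
module_lookup() {
  cache="${BMI_CACHE:-build/pcm}"
  for name in "$@"; do
    printf '%s/%s.pcm\n' "$cache" "$name"
  done
}
```

The interesting property for build systems is that the script is called during compilation, so module dependencies are discovered as they are needed rather than declared up front.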

- Dave
Post by Stephen Kelly
-Brad
Stephen Kelly
2018-05-15 07:22:52 UTC
Post by Brad King
Hopefully Brad or someone else can provide other input from research already done.
I'm not particularly familiar with what compiler writers or the modules
standard specification expects build systems to do w.r.t modules.
However, IIUC at least at one time the expectation was that the module
files would not be installed like headers and are used only within a
local build tree. Are modules even supposed to be first-class entities
in the build system specification that users write?
The answer is probably both 'hopefully not' and 'sometimes'.
Post by Brad King
In the Fortran world users just list all the sources and build systems are
expected to figure it out. CMake has very good support for Fortran
modules. Our Ninja generator has rules to preprocess the translation units
first, then parse the preprocessed output to extract module definitions
and usages, then inject the added dependencies into the build graph, and
then begin compilation of sources ordered by those dependencies (this
requires a custom fork of Ninja pending upstream acceptance).
Is that what is expected from C++ buildsystems for modules too?
Hopefully. However, in some cases, the step of 'extracting module
definitions and usages' might be very hard to do. This document is quite
concise about that:

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1052r0.html

So, the answer for cmake might be that CMake can learn to extract that
stuff, but ignore certain cases like imports within ifdefs. Maybe CMake
could then also provide API for users to specify the usages/dependencies
explicitly in those cases. I don't know how convenient that would be (or
could be made through design).

Thanks,

Stephen.
Brad King
2018-05-15 13:18:07 UTC
Post by Stephen Kelly
So, the answer for cmake might be that CMake can learn to extract that
stuff, but ignore certain cases like imports within ifdefs.
We'd need to do the extraction from already-preprocessed sources.
This is how Fortran+Ninja+CMake works. Unfortunately for C++
this will typically require preprocessing twice: once just to
extract module dependencies and again to actually compile. With
Fortran we compile using the already-preprocessed source but
doing that with C++ will break things like Clang's nice handling
of macros in diagnostic messages.

-Brad
David Blaikie
2018-05-09 22:28:49 UTC
Stephen Kelly
2018-05-15 08:34:30 UTC
Post by David Blaikie
Nope, scratch that ^ I had thought that was the case, but talking more
with Richard Smith it seems there's an expectation that modules will be
somewhere between header and library granularity (obviously some small
libraries today have one or only a few headers, some (like Qt) have many
- maybe those on the Qt end might have slightly fewer modules than the
have headers - but still several modules to one library most likely, by
the sounds of it)
Why? Richard, maybe you can answer that? These are the kinds of things I
was trying to get answers to in the previous post to ISO SG2 in the
Google group. I didn't get an answer as definitive as this, so maybe you
can share the reason behind such a definitive answer?
It's more that the functionality will allow this & just judging by how
people do things today (existing header granularity partly motivated by
the cost of headers that doesn't apply to modules), how they're likely to
do things in the future (I personally would guess people will probably try
to just port their headers to modules - and a few places where there are
circular dependencies in headers or the like they might glob them up into
one module).
It seems quite common to have one PCH file per shared library (that's what
Qt does for example). What makes you so sure that won't be the case with
modules?

I'd say that what people will do will be determined by whatever their tools
optimize for. If it is necessary to list all used modules on the compile
line, people would choose fewer modules. If 'import QtCore' is fast and
allows the use of QString and QVariant etc and there is no downside, then
that will be the granularity offered by Qt (instead of 'QtCore.QString').
That is also comparable to '#include <QtCore>' which is possible today.
Post by David Blaikie
I just looked through the commits from Boris, and it seems he made some
changes relating to -fmodule-file=. That still presupposes that all
(transitively) used module files are specified on the command line.
Actually I believe the need is only the immediate dependencies - at least
with Clang's implementation.
Ok. That's not much better though. It still means editing/generating the
buildsystem each time you add an import. I don't think a model with that
requirement will gain adoption.
Post by David Blaikie
I was talking about the -fprebuilt-module-path option added by Manman Ren
in https://reviews.llvm.org/D23125 because that actually relieves the
user/buildsystem of maintaining a list of all used modules (I hope).
*nod* & as you say, GCC has something similar. Though the build system
probably wants to know about the used modules to do dependency analysis &
rebuilding correctly.
Yes, presumably that will work with -MM.
Post by David Blaikie
Yeah, thanks for the link - useful to read.
There seems to be a slew of activity around modules at the moment. You can
read some other reactions here which might have input for your paper:

https://www.reddit.com/r/cpp/comments/8jb0nt/what_modules_can_actually_provide_and_what_not/

https://www.reddit.com/r/cpp/comments/8j1edf/really_think_that_the_macro_story_in_modules_is/

I look forward to reading your paper anyway.
Post by David Blaikie
I think clang outputs the definitions in a separate object file, but GCC
currently doesn't. Perhaps that's a difference that cmake has to account
for or pass on to the user.
Clang outputs frontend-usable (not object code, but serialized AST usable
for compiling other source code) descriptions of the entire module
(whatever it contains - declarations, definitions, etc) to the .pcm file.
It can then, in a separate step, build an object file from the pcm. I
think GCC produces both of these artifacts in one go - but not in the same
file.
Ok, I must have misremembered something.
Post by David Blaikie
Sure. I didn't notice anything from reading, but I also didn't try it
out. You might need to provide a repo with the module.modulemap/c++ files
etc that are part of your experiment. Or better, provide something based
on modules-ts that I can try out.
*nod* I'll see if I can get enough of modules-ts type things working to
provide some examples, but there's some more variance/uncertainty there in
the compiler support, etc.
Something working only with clang for example would be a good start.
Post by David Blaikie
I'm guessing that's enough for you to implement what you want as an
experiment?
OK, so in that case it requires source changes to cmake? *nod* sounds
plausible - I appreciate the pointers. I take it that implies there's not
a way I could hook into those file kinds and filters without changing
cmake? (ie: from within my project's cmake build files, without modifying
a cmake release)
There is no way to hook into the system I described without patching CMake.
Your custom command approach might be the way to do that if it is the
priority.

Thanks,

Stephen.
David Blaikie
2018-07-19 23:07:01 UTC
Permalink
(just CC'ing you Richard in case you want to read my ramblings/spot any
inaccuracies, etc)

Excuse the delay - coming back to this a bit now. Though the varying
opinions on what modules will take to integrate with build system still
weighs on me a bit - but I'm trying to find small ways/concrete steps to
make some progress on this rather than being lost in choice/opinion
paralysis.

To that end, Stephen, I've made a fork of your example repository & a very
simple/direct change to use C++ modules as currently implemented in Clang.
Some workarounds are required for a few bugs/incomplete features (oh, one I
didn't comment in source is the use of #include for the standard library -
that'd actually be "import legacy" in the current modules TS2/atom merged
proposal, if I understand correctly - and the way I've implemented it in
the example is as-if the standard library were not modularized (so the
project wraps it in a modularized header itself))

The build.sh script shows the commands required to build it (though I
haven't checked the exact fmodule-file dependencies to check that they're
all necessary, etc) - and with current Clang top-of-tree it does build and
run the example dinnerparty program.

There are a few ideas being tossed around currently for how module
dependency discovery could be done by build systems or exposed by the
compiler itself (GCC has a service protocol it can interact with when it
needs a new compiled module description, which the build system could
implement to fulfill such requests - giving the build system all
information about the inter-module dependencies without a separate ahead of
time scan and while allowing maximal parallelism (compiler could request
all needed modules then wait for them all to be ready - rather than one at
a time, so as not to stall parallelism)).

The syntax is intended to support a build system that would do the scanning
itself, though - requiring limited preprocessing (I think the preprocessing
would need to know where to stop - since it might not be able to preprocess
the whole file without error without importing modules (since those imports
could contain new macro definitions used later in the file)).
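A build system doing the scanning itself only needs to read up to the end of that preamble. Here is a deliberately naive sketch; the module names are borrowed from the example repo, and the grammar is heavily simplified (no comments, no legacy #include-to-import handling):

```shell
# Hedged sketch of a minimal preamble scanner: collect imported module names
# from the top of a translation unit and stop at the first line of ordinary
# code, so the whole file never has to be parsed (or even preprocessed fully).
cat > FruitBowl.cppm <<'EOF'
export module FruitBowl;
import AbstractFruit;
import Grape;

export void fill();
import NotSeen;
EOF

awk '
  /^(export[ \t]+)?module[ \t]/ { next }        # module declaration line
  /^(export[ \t]+)?import[ \t]/ {               # record the imported name
    sub(/^(export[ \t]+)?import[ \t]+/, ""); sub(/;.*/, ""); print; next
  }
  /^[ \t]*$/ { next }                           # skip blank lines
  { exit }                                      # first real code: stop
' FruitBowl.cppm
```

This prints AbstractFruit and Grape but not NotSeen, since scanning stops at the first non-preamble line; a real scanner would need the nuance alluded to above, but the "know where to stop" shape is the same.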

If you happen to try experimenting with any ways the commands in the
build.sh file could be run from CMake in a sensible way - even if you
hypothesize what -MM support (or other compiler hooks like the dependency
server I alluded to above, etc) for modules might look like to do so, I'd
love to chat about it/throw ideas around/try mocking up/prototyping the
sort of compiler support (I don't think there's -MM support yet, but I
could see about adding it, for example) that seems like it might be most
useful.

One thing I'm vaguely concerned about is actually the dependency of
building modules within a single library (as in this project/example - at
least the way I've built it for now - I didn't try building it as separate
.so/.a files). At least across-library we can work at the library
granularity and provide on the command line (or via a file as GCC does) the
module files for all the modules from dependent libraries. But I'm not sure
how best to determine the order in which to build files within a library -
that's where the sort of -MM-esque stuff, etc, would be necessary. (though
that sort of stuff would be useful even cross-library to speed up the build
(eg: foo_a.cppm depends on bar_a.cppm but not bar_b.cppm - don't rebuild
foo_a.cppm if bar_b.cppm changes, even though libfoo depends on libbar, etc)
Post by Stephen Kelly
Post by David Blaikie
Nope, scratch that ^ I had thought that was the case, but talking more
with Richard Smith it seems there's an expectation that modules will be
somewhere between header and library granularity (obviously some small
libraries today have one or only a few headers, some (like Qt) have many
- maybe those on the Qt end might have slightly fewer modules than they
have headers - but still several modules to one library most likely, by
the sounds of it)
Why? Richard, maybe you can answer that? These are the kinds of things I
was trying to get answers to in the previous post to ISO SG2 in the
Google group. I didn't get an answer as definitive as this, so maybe you
can share the reason behind such a definitive answer?
It's more that the functionality will allow this & just judging by how
people do things today (existing header granularity partly motivated by
the cost of headers that doesn't apply to modules), how they're likely to
do things in the future (I personally would guess people will probably try
to just port their headers to modules - and a few places where there are
circular dependencies in headers or the like they might glob them up into
one module).
It seems quite common to have one PCH file per shared library (that's what
Qt does for example). What makes you so sure that won't be the case with
modules?
Can't say I've worked with code using existing PCH - if that seems common
enough, it might be a good analogy/guidance people might follow with C++
modules.
Post by Stephen Kelly
I'd say that what people will do will be determined by whatever their tools
optimize for. If it is necessary to list all used modules on the compile
line, people would choose fewer modules. If 'import QtCore' is fast and
allows the use of QString and QVariant etc and there is no downside, then
that will be the granularity offered by Qt (instead of 'QtCore.QString').
That is also comparable to '#include <QtCore>' which is possible today.
Yep, perhaps they will.
Post by Stephen Kelly
Post by David Blaikie
I just looked through the commits from Boris, and it seems he made some
changes relating to -fmodule-file=. That still presupposes that all
(transitively) used module files are specified on the command line.
Actually I believe the need is only the immediate dependencies - at least
with Clang's implementation.
Ok. That's not much better though. It still means editing/generating the
buildsystem each time you add an import.
Isn't that true today with headers, though? But today the build system does
this under the covers with header scanning, -MM modes, etc?

What would be different about having a similar requirement for modules (but
a somewhat different discovery process)? I guess the difference is that now
that discovery changes the command used to compile a given source file,
whereas in the past (with headers) it didn't change those commands but did
end up with an implicit dependency ("when this header file changes, rerun
this compile command (even though it doesn't mention the header - trust us,
we know it depends on it)") whereas now it'd be explicit.
Post by Stephen Kelly
I don't think a model with that requirement will gain adoption.
Post by David Blaikie
I was talking about the -fprebuilt-module-path option added by Manman Ren
in https://reviews.llvm.org/D23125 because that actually relieves the
user/buildsystem of maintaining a list of all used modules (I hope).
*nod* & as you say, GCC has something similar. Though the build system
probably wants to know about the used modules to do dependency analysis &
rebuilding correctly.
Yes, presumably that will work with -MM.
Post by David Blaikie
Yeah, thanks for the link - useful to read.
There seems to be a slew of activity around modules at the moment. You can
https://www.reddit.com/r/cpp/comments/8jb0nt/what_modules_can_actually_provide_and_what_not/
https://www.reddit.com/r/cpp/comments/8j1edf/really_think_that_the_macro_story_in_modules_is/
I look forward to reading your paper anyway.
Post by David Blaikie
I think clang outputs the definitions in a separate object file, but GCC
currently doesn't. Perhaps that's a difference that cmake has to account
for or pass on to the user.
Clang outputs frontend-usable (not object code, but serialized AST usable
for compiling other source code) descriptions of the entire module
(whatever it contains - declarations, definitions, etc) to the .pcm file.
It can then, in a separate step, build an object file from the pcm. I
think GCC produces both of these artifacts in one go - but not in the same
file.
Ok, I must have misremembered something.
Post by David Blaikie
Sure. I didn't notice anything from reading, but I also didn't try it
out. You might need to provide a repo with the module.modulemap/c++ files
etc that are part of your experiment. Or better, provide something based
on modules-ts that I can try out.
*nod* I'll see if I can get enough of modules-ts type things working to
provide some examples, but there's some more variance/uncertainty there in
the compiler support, etc.
Something working only with clang for example would be a good start.
Post by David Blaikie
I'm guessing that's enough for you to implement what you want as an
experiment?
OK, so in that case it requires source changes to cmake? *nod* sounds
plausible - I appreciate the pointers. I take it that implies there's not
a way I could hook into those file kinds and filters without changing
cmake? (ie: from within my project's cmake build files, without modifying
a cmake release)
There is no way to hook into the system I described without patching CMake.
Your custom command approach might be the way to do that if it is the
priority.
Thanks,
Stephen.
Stephen Kelly
2018-07-24 22:19:53 UTC
Permalink
Post by David Blaikie
(just CC'ing you Richard in case you want to read my ramblings/spot any
inaccuracies, etc)
Excuse the delay - coming back to this a bit now. Though the varying
opinions on what modules will take to integrate with build system still
weighs on me a bit
Can you explain what you mean by 'weighs on you'? Does that mean you see it
as tricky now?

I've kind of been assuming that people generally think it is not tricky, and
I'm just wrong in thinking it is and I'll eventually see how it is all
manageable.
Post by David Blaikie
- but I'm trying to find small ways/concrete steps to
make some progress on this rather than being lost in choice/opinion
paralysis.
Cool.
Post by David Blaikie
To that end, Stephen, I've made a fork of your example repository & a very
simple/direct change to use C++ modules as currently implemented in Clang.
Interesting, thanks for doing that! Here's the link for anyone else
interested:

https://github.com/dwblaikie/ModulesExperiments
Post by David Blaikie
The build.sh script shows the commands required to build it (though I
haven't checked the exact fmodule-file dependencies to check that they're
all necessary, etc) - and with current Clang top-of-tree it does build and
run the example dinnerparty program.
Ok. I tried with my several-weeks-old checkout and it failed on the first
command with -modules-ts in it (for AbstractFruit.cppm - the simplest one).

I'll update my build and try again, but that will take some time.
Post by David Blaikie
If you happen to try experimenting with any ways the commands in the
build.sh file could be run from CMake in a sensible way - even if you
hypothesize what -MM support (or other compiler hooks like the dependency
server I alluded to above, etc) for modules might look like to do so, I'd
love to chat about it/throw ideas around/try mocking up/prototyping the
sort of compiler support (I don't think there's -MM support yet, but I
could see about adding it, for example) that seems like it might be most
useful.
Yes, that would be useful I think.
Post by David Blaikie
One thing I'm vaguely concerned about is actually the dependency of
building modules within a single library (as in this project/example - at
least the way I've built it for now - I didn't try building it as separate
.so/.a files). At least across-library we can work at the library
granularity and provide on the command line (or via a file as GCC does)
the module files for all the modules from dependent libraries. But I'm not
sure how best to determine the order in which to build files within a
library
Exactly, that's one of the reasons my repo has libraries of multiple files
with dependencies between them, but building everything into one executable
also exposes that issue.

I guess it would work in a similar way to determining the order to link
libraries though, so probably not a major problem.
Post by David Blaikie
- that's where the sort of -MM-esque stuff, etc, would be
necessary.
Would it? I thought the -MM stuff would mostly be necessary for determining
when to rebuild? Don't we need to determine the build order before the first
build of anything? The -MM stuff doesn't help that.
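On build order specifically: once per-file imports are known by whatever means, ordering the compiles within a library is an ordinary topological sort, which POSIX tsort can stand in for; the edge list below is assumed from the example repo's structure:

```shell
# Hedged sketch: each input line is "dependency dependent"; tsort emits an
# order in which every module's BMI is built before any of its importers.
tsort <<'EOF'
AbstractFruit Apple
AbstractFruit Grape
Apple FruitBowl
Grape FruitBowl
EOF
```

Producing the edge list before the first build is the hard part, as discussed above; the ordering itself is routine once the edges exist.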
Post by David Blaikie
Post by Stephen Kelly
It seems quite common to have one PCH file per shared library (that's
what Qt does for example). What makes you so sure that won't be the case
with modules?
Can't say I've worked with code using existing PCH - if that seems common
enough, it might be a good analogy/guidance people might follow with C++
modules.
Perhaps. I wonder how the C++ code/build.sh code would change in that
scenario? If there were only four modules - one for each of the libraries
(as delimited in the CMakeLists). Would the C++ code change then too
(something about building partial modules from each of the multiple cppm
files?), or how would the build.sh code change?
Post by David Blaikie
Post by Stephen Kelly
Ok. That's not much better though. It still means editing/generating the
buildsystem each time you add an import.
Isn't that true today with headers, though?
No. Imagine you implemented FruitBowl.cpp in revision 1 such that it did not
#include Grape.h and it did not add the Grape to the bowl.

Then you edit FruitBowl.cpp to #include Grape.h and add the Grape to the
bowl. Because Grape.h and Apple.h are in the same directory (which you
already have a -Ipath/to/headers for in your buildsystem), in this (today)
scenario, you don't have to edit the buildsystem.

In your port, you would have to add an import of Grape (fine, equivalent),
add the Grape to the bowl (the same as today), but additionally, you have to

* add -fmodule-file=Grape.pcm to the compile line or run your buildsystem
generator such as CMake to cause that compile line to be generated with the
argument added.
* Generate Grape.pcm (because the library has 1000 fruit classes in it and
your buildsystem is smart enough to lazily generate pcm files as needed)

Partly that is a result of the totally-granular approach you took to
creating modules (one module per class). If you used the approach of one
module per library, then you would not need to touch the buildsystem. I hope
that's clear now, as I've mentioned it multiple times, but I wonder if that
part of what I say is being understood.
Post by David Blaikie
But today the build system
does this under the covers with header scanning, -MM modes, etc?
Perhaps. I notice that running CMake on my llvm/clang/clang-tools-extra
checkout takes a non-zero amount of time, and for other buildsystems takes a
significantly non-zero amount of time.

Many buildsystem generators already avoid the time/complexity of
automatically regenerating the buildsystem when needed. Users have to leave
their IDE and run a script on the command line.

I wonder if people will use C++ modules if CMake/their generator has to be
re-run (automatically or through explicit user action) every time they add
'import foo;' to their C++ code... What do you think?
Post by David Blaikie
What would be different about having a similar requirement for modules
(but a somewhat different discovery process)?
I hope the above is clear. Please let me know if not. Maybe I'm missing
something still.

Thanks,

Stephen.
Stephen Kelly
2018-08-06 21:36:59 UTC
Permalink
Post by Stephen Kelly
Post by David Blaikie
The build.sh script shows the commands required to build it (though I
haven't checked the exact fmodule-file dependencies to check that they're
all necessary, etc) - and with current Clang top-of-tree it does build
and run the example dinnerparty program.
Ok. I tried with my several-weeks-old checkout and it failed on the first
command with -modules-ts in it (for AbstractFruit.cppm - the simplest one).
I'll update my build and try again, but that will take some time.
I have locally tried your modifications. Aside from the content of my
previous email, I updated my clone (force push) to clean up the commits, and
to modify your build.sh script to maintain the different libraries in the
repo.

https://github.com/steveire/ModulesExperiments/commits/master

I am still interested in what the C++ code (and build.sh) look like in the
case of one-module-per-library.

It is obvious from looking at build.sh as it is that the buildsystem needs
to be changed when adding a new import to a c++ file, as I have described.
See the commit adding Grape separately and the changes required to the
buildsystem which were not required in the non-modules world:

https://github.com/steveire/ModulesExperiments/commit/428bea53fc6

I will see if I can get a recent GCC build for comparison and to determine
whether the callback-to-the-buildsystem used in GCC makes a difference in
that respect.

I'm still interested in a response to my previous email in that respect.

Thanks,

Stephen.
David Blaikie
2018-08-08 00:29:03 UTC
Permalink
Post by David Blaikie
Post by Stephen Kelly
Post by David Blaikie
The build.sh script shows the commands required to build it (though I
haven't checked the exact fmodule-file dependencies to check that they're
all necessary, etc) - and with current Clang top-of-tree it does build
and run the example dinnerparty program.
Ok. I tried with my several-weeks-old checkout and it failed on the first
command with -modules-ts in it (for AbstractFruit.cppm - the simplest one).
I'll update my build and try again, but that will take some time.
I have locally tried your modifications. Aside from the content of my
previous email, I updated my clone (force push) to clean up the commits, and
to modify your build.sh script to maintain the different libraries in the
repo.
https://github.com/steveire/ModulesExperiments/commits/master
I am still interested in what the C++ code (and build.sh) look like in the
case of one-module-per-library.
Oh, sure - that's easy enough, in terms of rolling together any modules in
the same library. Mash all the code in any cppm interface files that are
part of the same library into a single cppm file & build that into a single
module.
Post by David Blaikie
It is obvious from looking at build.sh as it is that the buildsystem needs
to be changed when adding a new import to a c++ file, as I have described.
Though potentially not manually - and some of those lines would be added
when the new module was added (as would be done when a new .cpp file is
added today) - rather than when the import occurred.

The 'worst' part is for intra-library module dependencies. For
inter-library module dependencies, a coarse-grained solution would be to
only build a library once all of the BMIs (.pcm files) for dependent
libraries are built - and all of those are passed to libraries that depend
on them. But, yeah, having the build system trying to figure out the
dependency between modules within a library (which isn't explicit in the
build files, unlike the inter-module dependency) is awkward. (& honestly,
for maximum build parallelism, you wouldn't want the coarse-grained
solution I described above - because then if something in library A
depended on only module B.1 but not module B.2, you'd want to start
building that part of A when B.1's BMI was built, without waiting for B.2's
BMI to be finished)

So, yes, if you have more than one module per library, then both for the
intra-library module dependency problem and the minimal-cross-library
dependency issue, you'd need the build system to either be told by the
compiler as its going (ala the proposed build server/oracle system that's
implemented in GCC) or by pre-scanning with a semi-aware scanner (doesn't
have to know all of C++ - modules are notionally well defined to be able to
scan a short preamble with limited syntax... but I'm not fully up on all
the nuance there & retro #include-to-import legacy support might complicate
things)

(I guess, technically, even in the one-module-per-library, arguably your
library dependency could be a purely implementation dependency, in which
case you'd still want to be able to build your 'dependent' BMI without
waiting for the dependency's BMI to finish, since they don't depend on each
other - so if your build system doesn't differentiate between external and
internal dependencies, then again you'd need the kind of discovery phase if
you want maximal build parallelism... )
Post by David Blaikie
see the commit adding Grape separately and the changes required to the
https://github.com/steveire/ModulesExperiments/commit/428bea53fc6
I will see if I can get a recent GCC build for comparison and to determine
whether the callback-to-the-buildsystem used in GCC makes a difference in
that respect.
I'm still interested in a response to my previous email in that respect.
Yep, coming back to that now.
Post by David Blaikie
Thanks,
Stephen.
David Blaikie
2018-08-24 01:32:40 UTC
Permalink
Post by Stephen Kelly
Post by David Blaikie
(just CC'ing you Richard in case you want to read my ramblings/spot any
inaccuracies, etc)
Excuse the delay - coming back to this a bit now. Though the varying
opinions on what modules will take to integrate with build system still
weighs on me a bit
Can you explain what you mean by 'weighs on you'? Does that mean you see it
as tricky now?
Yes, to some extent. If the build system is going to require the
compiler-calls-back-to-buildsystem that it sounds like (from discussions
with Richard & Nathan, etc) is reasonable - yeah, I'd say that's a bigger
change to the way C++ is compiled than I was expecting/thinking of going
into this.

Some folks see this as not "tricky" but just "hey, C++ has been getting
away with a vastly simplified build model until now - now that it's getting
modular separation like other languages, it's going to have to have a build
system like other languages" (ala Java/C#, Haskell/Go maybe, etc) - but I'm
not especially familiar with any other languages enterprise-level build
systems (done a bit of C# at Microsoft but wasn't looking closely at the
build system - insulated from it by either Visual Studio or the OS release
processes, etc - and the last time I did Java I was working on small enough
things & without many cores, so throwing all the .java files in a package
at the compiler in one go was totally reasonable)
Post by Stephen Kelly
I've kind of been assuming that people generally think it is not tricky, and
I'm just wrong in thinking it is and I'll eventually see how it is all
manageable.
I think it's manageable - the thing that weighs on me, I suppose, is
whether or not the community at large will "buy" it, as such. And part of
that is on the work we're doing to figure out the integration with build
systems, etc, so that there's at least the first few pieces of support that
might help gain user adoption to justify/encourage/provide work on further
support, etc...
Post by Stephen Kelly
To that end, Stephen, I've made a fork of your example repository & a very
simple/direct change to use C++ modules as currently implemented in Clang.
Interesting, thanks for doing that! Here's the link for anyone else
https://github.com/dwblaikie/ModulesExperiments
Oh, thanks - totally didn't realize I failed to link that!
Post by Stephen Kelly
Post by David Blaikie
The build.sh script shows the commands required to build it (though I
haven't checked the exact fmodule-file dependencies to check that they're
all necessary, etc) - and with current Clang top-of-tree it does build and
run the example dinnerparty program.
Ok. I tried with my several-weeks-old checkout and it failed on the first
command with -modules-ts in it (for AbstractFruit.cppm - the simplest one).
I'll update my build and try again, but that will take some time.
Huh - I mean it's certainly a moving target - I had to file/workaround a
few bugs to get it working as much as it is, so not /too/ surprising. Did
you get it working in the end? If not, could you specify the exact revision
your compiler's at and show the complete output?
Post by Stephen Kelly
Post by David Blaikie
If you happen to try experimenting with any ways the commands in the
build.sh file could be run from CMake in a sensible way - even if you
hypothesize what -MM support (or other compiler hooks like the dependency
server I alluded to above, etc) for modules might look like to do so, I'd
love to chat about it/throw ideas around/try mocking up/prototyping the
sort of compiler support (I don't think there's -MM support yet, but I
could see about adding it, for example) that seems like it might be most
useful.
Yes, that would be useful I think.
Post by David Blaikie
One thing I'm vaguely concerned about is actually the dependency of
building modules within a single library (as in this project/example - at
least the way I've built it for now - I didn't try building it as
separate
Post by David Blaikie
.so/.a files). At least across-library we can work at the library
granularity and provide on the command line (or via a file as GCC does)
the module files for all the modules from dependent libraries. But I'm
not
Post by David Blaikie
sure how best to determine the order in which to build files within a
library
Exactly, that's one of the reasons my repo has libraries of multiple files
with dependencies between them, but building everything into one executable
also exposes that issue.
Ah, when I say "library" I mean static (.lib/.a) or dynamic (.so/.dll/etc)
- same situation in either. But yeah - any situation where there are
multiple modules within a single library, getting those dependencies will
involve some discovery (either a pre-parsing check (same way a build system
might check which headers are included in a file) or the compiler callback
system, where the compiler asks the build system for the modules it needs)
- and even in the one-module-per-library
Post by Stephen Kelly
I guess it would work in a similar way to determining the order to link
libraries though, so probably not a major problem.
The dependency between libraries is usually explicit in the build system,
right? I don't think that would be the case for modules - there'd be a
reasonable expectation that
Post by Stephen Kelly
Post by David Blaikie
- that's where the sort of -MM-esque stuff, etc, would be
necessary.
Would it? I thought the -MM stuff would mostly be necessary for determining
when to rebuild? Don't we need to determine the build order before the first
build of anything? The -MM stuff doesn't help that.
-MM produces output separate from the compilation (so far as I can tell -
clang++ -MM x.cpp doesn't produce anything other than the makefile fragment
on stdout) & finds all the headers, etc. So that's basically the same as
what we'd need here - but -MM only has to preprocess, whereas this would
have to do some C++ parsing to find imports, etc. But it would only have to
look for direct dependencies (unlike -MM which walks through includes to
find other includes) - and then the build system could re-invoke it for the
next module and so on...
Post by Stephen Kelly
Post by David Blaikie
It seems quite common to have one PCH file per shared library (that's
Post by Stephen Kelly
what Qt does for example). What makes you so sure that won't be the case
with modules?
Can't say I've worked with code using existing PCH - if that seems common
enough, it might be a good analogy/guidance people might follow with C++
modules.
Perhaps. I wonder how the C++ code/build.sh code would change in that
scenario? If there were only four modules - one for each of the libraries
(as delimited in the CMakeLists). Would the C++ code change then too
(something about building partial modules from each of the multiple cppm
files?), or how would the build.sh code change?
One module per library would be fairly clear, I think.

Looking at your example - if you have a library for all the fruits and
libabstractfruit, libfruitsalad, libnotfruitsalad, and libbowls - then
you'd have one module interface for each of those (AbstractFruit.cppm,
FruitSalad.cppm, NotFruitSalad.cppm, Bowls.cppm) that would be imported (so
replace "import Apple", "import Grape" with "import FruitSalad", etc... ) &
the implementations could be in multiple files if desired (Apple.cpp,
Grape.cpp, etc).
Post by Stephen Kelly
Post by David Blaikie
Post by Stephen Kelly
Ok. That's not much better though. It still means editing/generating the
buildsystem each time you add an import.
Isn't that true today with headers, though?
No. Imagine you implemented FruitBowl.cpp in revision 1 such that it did not
#include Grape.h and it did not add the Grape to the bowl.
Then you edit FruitBowl.cpp to #include Grape.h and add the Grape to the
bowl. Because Grape.h and Apple.h are in the same directory (which you
already have a -Ipath/to/headers for in your buildsystem), in this (today)
scenario, you don't have to edit the buildsystem.
Well, you don't have to do it manually, but your build system ideally
should reflect this new dependency so it knows to rebuild FruitBowl.cpp if
Grape.h changes.
Post by Stephen Kelly
In your port, you would have to add an import of Grape (fine, equivalent),
add the Grape to the bowl (the same as today), but additionally, you have to
* add -fmodule-file=Grape.pcm to the compile line or run your buildsystem
generator such as CMake to cause that compile line to be generated with the
argument added.
* Generate Grape.pcm (because the library has 1000 fruit classes in it and
your buildsystem is smart enough to lazily generate pcm files as needed)
Partly that is a result of the totally-granular approach you took to
creating modules (one module per class). If you used the approach of one
module per library, then you would not need to touch the buildsystem.
Right - much like if you have coarse grained headers then picking up a new
dependency doesn't involve adding a new #include nor modifying the build
system.

Though in both header and module cases, there's some room for optimization
from the coarse grained library dependencies you might have - if libA
depends on libB, not every C++ source file in libA needs to be rebuilt if
any header in libB is modified. That's why build systems use -MM-like
behavior to create more fine-grained dependencies, both to ensure things
are rebuilt when needed, but also to ensure they aren't rebuilt more often
than needed too.
Post by Stephen Kelly
I hope
that's clear now, as I've mentioned it multiple times, but I wonder if that
part of what I say is being understood.
Post by David Blaikie
But today the build system
does this under the covers with header scanning, -MM modes, etc?
Perhaps. I notice that running CMake on my llvm/clang/clang-tools-extra
checkout takes a non-zero amount of time, and for other buildsystems takes a
significantly non-zero amount of time.
Many buildsystem generators already avoid the time/complexity of
automatically regenerating the buildsystem when needed. Users have to leave
their IDE and run a script on the command line.
That surprises me a bit - if I were using an IDE, I'd expect it to update
the build system based on changes I made there - add a source file,
#include, etc, and it should be reflected in the build system files. That's
sort of meant to be one of the perks of using an IDE, I thought - that it
kept all that in sync because it knows about what I'm doing when I say
"create a new C++ source file" - rather than working on the command line
where the build system isn't notified when I open vim and save a new .cpp
file (unless the build system is using globs to consider anything with the
.cpp file extension to be part of the project).
Post by Stephen Kelly
I wonder if people will use C++ modules if CMake/their generator has to be
re-run (automatically or through explicit user action) every time they add
'import foo;' to their C++ code... What do you think?
If it's automatic & efficient (I hope it doesn't redo all the work of
discovery for all files - just the ones that have changed) it seems
plausible to me.

Sorry for the rather long delay on this - hopefully it helps us converge a
little.

I'll try to find some time to get back to my original prototype & your
replies to do with that to see if I can flesh out the simpler "one module
per library (with some of the inefficiency of just assuming strong
dependencies between libraries, rather than the fine grained stuff we could
do with -MM-esque support), no external modules" scenario (& maybe the
retro/"header modules" style, rather than/in addition to the new C++
modules TS/atom style) - would be great to have a reasonable prototype of
that as a place to work from, I think.

- Dave
Post by Stephen Kelly
What would be different about having a similar requirement for modules
Post by David Blaikie
(but a somewhat different discovery process)?
I hope the above is clear. Please let me know if not. Maybe I'm missing
something still.
Thanks,
Stephen.
--
Powered by www.kitware.com
http://www.cmake.org/Wiki/CMake_FAQ
Kitware offers various services to support the CMake community. For more
CMake Support: http://cmake.org/cmake/help/support.html
CMake Consulting: http://cmake.org/cmake/help/consulting.html
CMake Training Courses: http://cmake.org/cmake/help/training.html
Visit other Kitware open-source projects at
http://www.kitware.com/opensource/opensource.html
https://cmake.org/mailman/listinfo/cmake-developers
Stephen Kelly
2018-08-24 09:35:31 UTC
Permalink
Post by David Blaikie
Post by David Blaikie
(just CC'ing you Richard in case you want to read my
ramblings/spot any
Post by David Blaikie
inaccuracies, etc)
Excuse the delay - coming back to this a bit now. Though the varying
opinions on what modules will take to integrate with build
system still
Post by David Blaikie
weighs on me a bit
Can you explain what you mean by 'weighs on you'? Does that mean you see it
as tricky now?
Yes, to some extent. If the build system is going to require the
compiler-calls-back-to-buildsystem that it sounds like (from
discussions with Richard & Nathan, etc) is reasonable - yeah, I'd say
that's a bigger change to the way C++ is compiled than I was
expecting/thinking of going into this.
Yes.
Post by David Blaikie
I've kind of been assuming that people generally think it is not tricky, and
I'm just wrong in thinking it is and I'll eventually see how it is all
manageable.
I think it's manageable - the thing that weighs on me, I suppose, is
whether or not the community at large will "buy" it, as such.
Yes, that has been my point since I first started talking about modules.
I don't think modules will gain a critical mass of adoption as currently
designed (and as currently designed to work with buildsystems).
Post by David Blaikie
And part of that is on the work we're doing to figure out the
integration with build systems, etc, so that there's at least the
first few pieces of support that might help gain user adoption to
justify/encourage/provide work on further support, etc...
Yes, reading the document Nathan sent us on June 12th this year, it
seems that CMake would have to implement a server mode so that the
compiler will invoke it with RPC. That server will also need to consume
some data generated by CMake during buildsystem generation (eg user
specified flags) and put that together with information sent by the
compiler (eg ) in order to formulate a response. It's complex. Maybe
CMake and other buildsystem generators can do it, but there are many
bespoke systems out there which would have to have some way to justify
the cost of developing such a thing.
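[Editorial sketch: the shape of such a build-system "server" answering compiler queries. The actual protocol in Nathan's document may differ substantially; the JSON request/response format, field names, and `ModuleOracle` class here are pure invention to illustrate the coupling being described.]

```python
# The server combines data generated at configure time (where each module's
# BMI lives, user-specified flags) to answer a compiler's question:
# "where is the built module for X, and what flags apply?"
import json

class ModuleOracle:
    def __init__(self, module_map: dict[str, str], extra_flags: dict[str, list[str]]):
        self.module_map = module_map      # module name -> .pcm path (from generation step)
        self.extra_flags = extra_flags    # module name -> user-specified flags

    def handle(self, request_json: str) -> str:
        req = json.loads(request_json)
        name = req["module"]
        if name not in self.module_map:
            return json.dumps({"status": "unknown-module", "module": name})
        return json.dumps({
            "status": "ok",
            "bmi": self.module_map[name],
            "flags": self.extra_flags.get(name, []),
        })

oracle = ModuleOracle({"FruitSalad": "build/FruitSalad.pcm"},
                      {"FruitSalad": ["-O2"]})
print(oracle.handle(json.dumps({"module": "FruitSalad"})))
```

Even this toy version shows why bespoke buildsystems might balk: the responder needs configure-time data, per-module state, and a long-lived process, none of which a plain Makefile setup has today.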
Post by David Blaikie
Post by David Blaikie
The build.sh script shows the commands required to build it
(though I
Post by David Blaikie
haven't checked the exact fmodule-file dependencies to check
that they're
Post by David Blaikie
all necessary, etc) - and with current Clang top-of-tree it does
build and
Post by David Blaikie
run the example dinnerparty program.
Ok. I tried with my several-weeks-old checkout and it failed on the first
command with -modules-ts in it (for AbstractFruit.cppm - the simplest one).
I'll update my build and try again, but that will take some time.
Huh - I mean it's certainly a moving target - I had to file/workaround
a few bugs to get it working as much as it is, so not /too/
surprising. Did you get it working in the end? If not, could you
specify the exact revision your compiler's at and show the complete
output?
Yes, I got it working. See
Post by David Blaikie
Post by David Blaikie
But I'm not sure how best to determine the order in which to
build files within a library - that's where the sort of -MM-esque
stuff, etc, would be
Post by David Blaikie
necessary.
Would it? I thought the -MM stuff would mostly be necessary for determining
when to rebuild? Don't we need to determine the build order before the first
build of anything? The -MM stuff doesn't help that.
-MM produces output separate from the compilation (so far as I can
tell - clang++ -MM x.cpp doesn't produce anything other than the
makefile fragment on stdout) & finds all the headers, etc. So that's
basically the same as what we'd need here
Are you sure? I thought compiling with -MM gives us information that we
need before we compile the first time. Sorry if that was not clear from
what I wrote above. I see a chicken-egg problem. However, I assume I'm
just misunderstanding you (you said that -MM would be used to determine
build order for the initial build) so let's just drop this.
Post by David Blaikie
Looking at your example - if you have a library for all the fruits and
libabstractfruit, libfruitsalad, libnotfruitsalad, and libbowls - then
you'd have one module interface for each of those (AbstractFruit.cppm,
FruitSalad.cppm, NotFruitSalad.cppm, Bowls.cppm) that would be
imported (so replace "import Apple", "import Grape" with "import
FruitSalad", etc... ) & the implementations could be in multiple files
if desired (Apple.cpp, Grape.cpp, etc).
Could you show me what that would look like for the repo? I am
interested to know if this approach means concatenating the content of
multiple files (eg Grape.h and Apple.h) and porting that result to a
module. My instinct says that won't gain adoption.
Post by David Blaikie
Post by David Blaikie
Post by Stephen Kelly
Ok. That's not much better though. It still means
editing/generating the
Post by David Blaikie
Post by Stephen Kelly
buildsystem each time you add an import.
Isn't that true today with headers, though?
No. Imagine you implemented FruitBowl.cpp in revision 1 such that it did not
#include Grape.h and it did not add the Grape to the bowl.
Then you edit FruitBowl.cpp to #include Grape.h and add the Grape to the
bowl. Because Grape.h and Apple.h are in the same directory (which you
already have a -Ipath/to/headers for in your buildsystem), in this (today)
scenario, you don't have to edit the buildsystem.
Well, you don't have to do it manually, but your build system ideally
should reflect this new dependency so it knows to rebuild
FruitBowl.cpp if Grape.h changes.
I never said it had to be done manually in the real world. I mentioned
that in the context of your script. The point I keep making is that the
buildsystem has to be regenerated.
Post by David Blaikie
Perhaps. I notice that running CMake on my
llvm/clang/clang-tools-extra
checkout takes a non-zero amount of time, and for other
buildsystems takes a
significantly non-zero amount of time.
Many buildsystem generators already avoid the time/complexity of
automatically regenerating the buildsystem when needed. Users have to leave
their IDE and run a script on the command line.
That surprises me a bit
Yes, there is a large diversity out there in the world regarding how
things work.
Post by David Blaikie
I wonder if people will use C++ modules if CMake/their generator has to be
re-run (automatically or through explicit user action) every time they add
'import foo;' to their C++ code... What do you think?
If it's automatic & efficient (I hope it doesn't redo all the work of
discovery for all files - just the ones that have changed) it seems
plausible to me.
At least in the CMake case, the logic is currently coarse - if the
buildsystem needs to be regenerated, the entire configure and generate
steps are invoked. Maybe that can be changed, but that's just more
effort required on the part of all buildsystem generators, including
bespoke ones. I think the level of effort being pushed on buildsystems
is not well appreciated by the modules proposal.

What I see as a worst-case scenario is:

* Modules gets added to the standard to much applause
* Users realize that they have to rename all of their .h files to cppm
and carefully change those files to use imports. There are new
requirements regarding where imports can appear, and things don't work
at first because of various reasons.
* Maybe some users think that creating a module per library is a better
idea, so they concat those new cppm files, sorting all the imports to
the top.
* Porting to Modules is hard anyway, because dependencies also need to
be updated etc. Developers don't get benefits until everything is 'just
right'.
* Some popular buildsystems develop the features to satisfy the new
requirements
* Most buildsystems, which are bespoke, don't implement the GCC
oracle-type stuff and just fudge things with parsing headers using a
simple script which looks for imports. It kind of works, but is fragile.
* Lots of time is spent on buildsystems being regenerated, because the
bespoke systems don't get optimized in this new way.
* After a trial run, most organizations that try modules reverse course
and stop using them.
* Modules deemed to have failed.

Maybe I'm being too negative, but this seems to be the likely result to
me. I think there are more problems lurking that we don't know about
yet. But, I've said this before, and I still hope I'm wrong and just
missing something.
Post by David Blaikie
Sorry for the rather long delay on this - hopefully it helps us
converge a little.
I'll try to find some time to get back to my original prototype & your
replies to do with that to see if I can flesh out the simpler "one
module per library (with some of the inefficiency of just assuming
strong dependencies between libraries, rather than the fine grained
stuff we could do with -MM-esque support), no external modules"
scenario (& maybe the retro/"header modules" style, rather than/in
addition to the new C++ modules TS/atom style) - would be great to
have a reasonable prototype of that as a place to work from, I think.
Yes, sounds interesting.

There are other things we would want to explore then too. In particular,
in my repo, all of the examples are part of the same buildsystem. We
should model external dependencies too - ie, pretend each library has a
standalone/hermetic buildsystem. That would mean that AbstractFruit
would generate its own pcm files to build itself, but each dependency
would also have to generate the AbstractFruit pcm files in order to
compile against it as an external library (because pcm files will never
be part of an install step, or a linux package or anything - they are
not distribution artifacts).

Thanks,

Stephen.
David Blaikie
2018-08-30 23:28:24 UTC
Permalink
Post by David Blaikie
Post by Stephen Kelly
Post by David Blaikie
(just CC'ing you Richard in case you want to read my ramblings/spot any
inaccuracies, etc)
Excuse the delay - coming back to this a bit now. Though the varying
opinions on what modules will take to integrate with build system still
weighs on me a bit
Can you explain what you mean by 'weighs on you'? Does that mean you see it
as tricky now?
Yes, to some extent. If the build system is going to require the
compiler-callsback-to-buildsystem that it sounds like (from discussions
with Richard & Nathan, etc) is reasonable - yeah, I'd say that's a bigger
change to the way C++ is compiled than I was expecting/thinking of going
into this.
Yes.
Post by Stephen Kelly
I've kind of been assuming that people generally think it is not tricky, and
I'm just wrong in thinking it is and I'll eventually see how it is all
manageable.
I think it's manageable - the thing that weighs on me, I suppose, is
whether or not the community at large will "buy" it, as such.
Yes, that has been my point since I first started talking about modules. I
don't think modules will gain a critical mass of adoption as currently
designed (and as currently designed to work with buildsystems).
For myself, I don't think I'd go that far (I think the current design might
be feasible) - and I'm mostly trying to set aside those concerns to get to
more concrete things - to work through the build system ramifications,
prototype things, etc, to get more concrete experience/demonstration of the
possibilities and problems.
Post by David Blaikie
And part of that is on the work we're doing to figure out the integration
with build systems, etc, so that there's at least the first few pieces of
support that might help gain user adoption to justify/encourage/provide
work on further support, etc...
Yes, reading the document Nathan sent us on June 12th this year, it seems
that CMake would have to implement a server mode so that the compiler will
invoke it with RPC. That server will also need to consume some data
generated by CMake during buildsystem generation (eg user specified flags)
and put that together with information sent by the compiler (eg ) in order
to formulate a response. It's complex. Maybe CMake and other buildsystem
generators can do it, but there are many bespoke systems out there which
would have to have some way to justify the cost of developing such a thing.
Yeah, certainly a possibility - maybe it'd be enough of a wedge to cause
people to collapse the large build system space into fewer options - many
other languages require this sort of level of coupling between
compiler/build system/language.

But getting that momentum started would be getting the main build systems
supporting it & starting at the leaves (independent projects starting to
write modular code for themselves - even if all their external dependencies
aren't) - then, with enough users using it downstream, might be some
libraries providing a modular option (or maybe even only provide it as
modules & that'd be the wedge I'm talking about - then teams/projects that
don't support modules would be left out a bit & provide an incentive for
them to move over to have modules support to use these dependencies).
Post by David Blaikie
Post by Stephen Kelly
The build.sh script shows the commands required to build it (though I
Post by David Blaikie
haven't checked the exact fmodule-file dependencies to check that
they're
Post by David Blaikie
all necessary, etc) - and with current Clang top-of-tree it does build
and
Post by David Blaikie
run the example dinnerparty program.
Ok. I tried with my several-weeks-old checkout and it failed on the first
command with -modules-ts in it (for AbstractFruit.cppm - the simplest one).
I'll update my build and try again, but that will take some time.
Huh - I mean it's certainly a moving target - I had to file/workaround a
few bugs to get it working as much as it is, so not /too/ surprising. Did
you get it working in the end? If not, could you specify the exact revision
your compiler's at and show the complete output?
Yes, I got it working. See
Post by Stephen Kelly
But I'm not sure how best to determine the order in which to build files
within a library - that's where the sort of -MM-esque stuff, etc, would be
Post by David Blaikie
necessary.
Would it? I thought the -MM stuff would mostly be necessary for determining
when to rebuild? Don't we need to determine the build order before the first
build of anything? The -MM stuff doesn't help that.
-MM produces output separate from the compilation (so far as I can tell -
clang++ -MM x.cpp doesn't produce anything other than the makefile fragment
on stdout) & finds all the headers, etc. So that's basically the same as
what we'd need here
Are you sure? I thought compiling with -MM gives us information that we
need before we compile the first time. Sorry if that was not clear from
what I wrote above. I see a chicken-egg problem. However, I assume I'm just
misunderstanding you (you said that -MM would be used to determine build
order for the initial build) so let's just drop this.
Yeah, still a bit confused - not sure ignoring this tangent is useful,
maybe there's something in the misunderstanding here.

clang -MM doesn't compile the source file, and prints out something like:

f1.o: f1.cpp foo.h

So we know that f1.cpp needs foo.h - now, in this case it would error if
foo.h didn't exist, because -MM has to look through all the inclusions to
provide all the transitive inclusions. The same isn't true of modules - if,
instead of foo.h this was the foo module (foo.cppm) and that module depends
on bar.cppm - then the output for f1.cpp would just be "f1.o: f1.cpp
foo.pcm" - without needing to compile foo.cppm, then the build system could
ask "what are the dependencies for foo.cppm (knowing it can generate
foo.pcm)" and get "foo.pcm.o: foo.cppm bar.pcm", etc. Very rough/hand-wavy,
but the general idea I think is reasonable.

(there are some bonus wrinkles for legacy imports, but there are some ways
to address that too)
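[Editorial sketch of the query loop just described: the build system asks a hypothetical "-MM for modules" tool for each file's direct deps, then re-invokes it for every newly discovered module interface, without compiling anything. The `query` and `interface_of` hooks are assumptions standing in for compiler support that does not exist yet.]

```python
# Given "f1.o: f1.cpp foo.pcm", ask next about foo.cppm, get
# "foo.pcm.o: foo.cppm bar.pcm", ask about bar.cppm, and so on --
# only direct dependencies at each step, so no chicken-and-egg problem.
def discover(roots, query, interface_of):
    """query(f) -> list of .pcm files f directly needs.
    interface_of(pcm) -> the .cppm that produces that pcm."""
    deps, worklist = {}, list(roots)
    while worklist:
        f = worklist.pop()
        if f in deps:
            continue
        deps[f] = query(f)
        for pcm in deps[f]:
            worklist.append(interface_of(pcm))
    return deps

# Canned answers mirroring the example in the message above:
answers = {"f1.cpp": ["foo.pcm"], "foo.cppm": ["bar.pcm"], "bar.cppm": []}
interface_of = {"foo.pcm": "foo.cppm", "bar.pcm": "bar.cppm"}
graph = discover(["f1.cpp"], answers.get, interface_of.get)
print(graph)
```

The resulting `graph` gives the build system everything it needs to schedule `bar.cppm`, then `foo.cppm`, then `f1.cpp` on a first build, before any compilation has happened.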
Post by David Blaikie
Looking at your example - if you have a library for all the fruits and
libabstractfruit, libfruitsalad, libnotfruitsalad, and libbowls - then
you'd have one module interface for each of those (AbstractFruit.cppm,
FruitSalad.cppm, NotFruitSalad.cppm, Bowls.cppm) that would be imported (so
replace "import Apple", "import Grape" with "import FruitSalad", etc... ) &
the implementations could be in multiple files if desired (Apple.cpp,
Grape.cpp, etc).
Could you show me what that would look like for the repo? I am interested
to know if this approach means concatenating the content of multiple files
(eg Grape.h and Apple.h) and porting that result to a module. My instinct
says that won't gain adoption.
Sure, let's see...

https://github.com/dwblaikie/ModulesExperiments/commit/4438a017c422c37106741253a78e2bd7ee99c43e

I mean it could be done in other ways - you could #include Grape.h and
Apple.h into Fruit.cppm, I suppose. Could allow you to keep the old headers
for non-modular users & just wrap them up in a module (same way I did for
the "std" module in this example).
Post by David Blaikie
Post by Stephen Kelly
Post by David Blaikie
Ok. That's not much better though. It still means editing/generating the
Post by Stephen Kelly
buildsystem each time you add an import.
Isn't that true today with headers, though?
No. Imagine you implemented FruitBowl.cpp in revision 1 such that it did not
#include Grape.h and it did not add the Grape to the bowl.
Then you edit FruitBowl.cpp to #include Grape.h and add the Grape to the
bowl. Because Grape.h and Apple.h are in the same directory (which you
already have a -Ipath/to/headers for in your buildsystem), in this (today)
scenario, you don't have to edit the buildsystem.
Well, you don't have to do it manually, but your build system ideally
should reflect this new dependency so it knows to rebuild FruitBowl.cpp if
Grape.h changes.
I never said it had to be done manually in the real world. I mentioned
that in the context of your script. The point I keep making is that the
buildsystem has to be regenerated.
*nod* Though the same seems to be true today when #includes change - the
build system has to become aware of the new dependencies that have been
introduced (either discovered during compilation with -MD (generating the
.o file and the .d Makefile fragment at the same time) or separately/ahead
of time with -MM (generating the Makefile fragment on stdout and not
performing compilation)).
Post by David Blaikie
I wonder if people will use C++ modules if CMake/their generator has to be
Post by Stephen Kelly
re-run (automatically or through explicit user action) every time they add
'import foo;' to their C++ code... What do you think?
If it's automatic & efficient (I hope it doesn't redo all the work of
discovery for all files - just the ones that have changed) it seems
plausible to me.
At least in the CMake case, the logic is currently coarse - if the
buildsystem needs to be regenerated, the entire configure and generate
steps are invoked.
When I add a new include to a file currently, it doesn't look like cmake
re-runs. Any idea why that is? I guess ninja knows enough to update for the
new include dependency - perhaps it'd need to rerun cmake if my new
#include was of a generated file (even if there were already rules for
generating that file?)?

If modules worked similarly to the way things seem to work here - importing
an external module that hadn't been imported anywhere by your project
before might rerun cmake, but importing that module into a second source
file would be akin to #including a file that already had a rule for
generating it - so it wouldn't rerun cmake, I don't think.
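[Editorial sketch of the distinction drawn in the previous paragraph. This is one reading of how a generator might decide, not actual CMake behaviour; the function and rule set are invented for illustration.]

```python
# A new import forces regeneration only if no rule yet exists for
# producing that module's BMI; otherwise ninja-level dependency
# tracking (as with an already-known generated header) suffices.
def modules_needing_new_rules(new_imports, known_rules):
    return [m for m in new_imports if m not in known_rules]

known = {"FruitSalad", "Bowls"}
print(modules_needing_new_rules(["FruitSalad"], known))  # [] -> no regeneration
print(modules_needing_new_rules(["SomeNewLib"], known))  # ['SomeNewLib'] -> regenerate
```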
Post by David Blaikie
Maybe that can be changed, but that's just more effort required on the
part of all buildsystem generators, including bespoke ones. I think the
level of effort being pushed on buildsystems is not well appreciated by the
modules proposal.
Perhaps - and that's what I'm trying to work through & write up.
Post by David Blaikie
* Modules gets added to the standard to much applause
* Users realize that they have to rename all of their .h files to cppm and
carefully change those files to use imports. There are new requirements
regarding where imports can appear, and things don't work at first because
of various reasons.
* Maybe some users think that creating a module per library is a better
idea, so they concat those new cppm files, sorting all the imports to the
top.
* Porting to Modules is hard anyway, because dependencies also need to be
updated etc. Developers don't get benefits until everything is 'just right'.
* Some popular buildsystems develop the features to satisfy the new
requirements
* Most buildsystems, which are bespoke, don't implement the GCC
oracle-type stuff and just fudge things with parsing headers using a simple
script which looks for imports. It kind of works, but is fragile.
* Lots of time is spent on buildsystems being regenerated, because the
bespoke systems don't get optimized in this new way.
* After a trial run, most organizations that try modules reverse course
and stop using them.
* Modules deemed to have failed.
Maybe I'm being too negative, but this seems to be the likely result to
me. I think there are more problems lurking that we don't know about yet.
But, I've said this before, and I still hope I'm wrong and just missing
something.
I think that's certainly a possibility - but I'm approaching this
optimistically - rather than trying to prove it can't work, I'm trying to
figure out how it would work & if those solutions all turn out to be
unworkable/untenable by the community, then that's some useful data to feed
back into the committee process.
Post by David Blaikie
Sorry for the rather long delay on this - hopefully it helps us converge a
little.
I'll try to find some time to get back to my original prototype & your
replies to do with that to see if I can flesh out the simpler "one module
per library (with some of the inefficiency of just assuming strong
dependencies between libraries, rather than the fine grained stuff we could
do with -MM-esque support), no external modules" scenario (& maybe the
retro/"header modules" style, rather than/in addition to the new C++
modules TS/atom style) - would be great to have a reasonable prototype of
that as a place to work from, I think.
Yes, sounds interesting.
There are other things we would want to explore then too. In particular,
in my repo, all of the examples are part of the same buildsystem. We should
model external dependencies too - ie, pretend each library has a
standalone/hermetic buildsystem. That would mean that AbstractFruit would
generate its own pcm files to build itself, but each dependency would also
have to generate the AbstractFruit pcm files in order to compile against it
as an external library (because pcm files will never be part of an install
step, or a linux package or anything - they are not distribution artifacts).
Yep, eventually I'll get to that - trying to focus on small, incremental
steps to start with. The first adoption of modules will likely be on leaf
projects in part because of this complexity - and they can always wrap any
external dependencies in modules ala the 'std' module in my example.
Getting leaf projects adopting this functionality would be a good gateway
towards pressure on build systems to support that and eventually to support
modularized libraries.

- Dave
Post by David Blaikie
Thanks,
Stephen.