Design problem with Ptah - Journal of Omnifarious

Aug. 14th, 2005

12:38 am - Design problem with Ptah

A brief description of the design at this point:

I have nodes and builders. A node knows which other nodes it depends on, and each dependency is described by a set of abstract attributes that are set up as flags.

The dependency flags are MUST_EXIST, BUILT_FROM, and OUTDATED_BY. Here are their descriptions:

BUILT_FROM
This states that the depended-upon node is used to build this node. It additionally implies both a MUST_EXIST and an OUTDATED_BY dependency.
MUST_EXIST
This dependency type says that the depended upon node must be built successfully for this node to be built. Being built successfully may simply entail checking to see if the file exists.
OUTDATED_BY
This dependency says that if the depended-upon node has changed since the dependent node was last built, the dependent node should be built again. It does not, however, require the depended-upon node to be built successfully. The change in state from 'buildable' to 'not-buildable' is one sort of change that can outdate something. Since the set of header files needed to build a .o file can change, this is the appropriate dependency for header files. If a header file changes or is removed, the .o file needs to be rebuilt, but a .h file going away doesn't necessarily mean that the .o file is no longer buildable.
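
The three flags could be encoded roughly as follows. This is only a sketch in Python: the names `Dep` and `effective` are made up for illustration and are not taken from Ptah itself. It shows one way to make BUILT_FROM imply the other two flags.

```python
from enum import Flag, auto


class Dep(Flag):
    """Hypothetical encoding of the three dependency flags."""
    MUST_EXIST = auto()
    OUTDATED_BY = auto()
    BUILT_FROM = auto()


def effective(flags: Dep) -> Dep:
    """Expand a flag set so that BUILT_FROM also carries the flags it implies."""
    if Dep.BUILT_FROM in flags:
        flags |= Dep.MUST_EXIST | Dep.OUTDATED_BY
    return flags
```

With this, a header dependency would carry only `Dep.OUTDATED_BY`, while a .c source would carry `Dep.BUILT_FROM` and pick up the other two through `effective`.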

A file node automatically has a MUST_EXIST dependency on the containing directory. This allows you to easily arrange for directories to be created for copying header files to or building .o files into.

There is a bit of magic surrounding file nodes. If you construct a file node for a particular path, it will always be the same object.
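
One way to get that "always the same object" behaviour is to intern nodes in a table keyed by path. The sketch below is an assumption about how it might be done, not Ptah's actual code; in particular, keying on the normalized absolute path and the name `_by_path` are my own choices for the example.

```python
import os


class FileNode:
    """Hypothetical file node: one shared instance per path."""

    _by_path = {}  # normalized absolute path -> the node for that path

    def __new__(cls, path):
        key = os.path.abspath(path)  # also collapses 'a/../a' style segments
        node = cls._by_path.get(key)
        if node is None:
            node = super().__new__(cls)
            node.path = key
            cls._by_path[key] = node
        return node
```

Constructing `FileNode("obj/lexer.o")` twice then yields the same object, so dependencies recorded through one reference are visible through every other.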

A node also knows what builder object is supposed to be used to build it. The builder objects follow the flyweight pattern. All the .o nodes that build a .o file from a .c file may use the exact same builder object.
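
A minimal sketch of the flyweight arrangement, assuming a builder holds no per-node state and is handed the node only when invoked. The class names, the `command` method and the `cc` command line are all invented for the example.

```python
class CompileC:
    """Hypothetical builder with no per-node state, so one instance can be shared."""

    def command(self, node):
        # The builder only learns which node it is building when it is invoked.
        return ["cc", "-c", node.source, "-o", node.path]


class ObjectNode:
    def __init__(self, path, source, builder):
        self.path = path
        self.source = source
        self.builder = builder

    def build_command(self):
        return self.builder.command(self)


c_to_o = CompileC()  # the single shared (flyweight) instance
nodes = [ObjectNode(f"{stem}.o", f"{stem}.c", c_to_o) for stem in ("lexer", "main")]
```

Both nodes point at the same `c_to_o` object, which is exactly why the builder cannot easily know about sibling outputs ahead of time.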

Here's my problem...

How do I handle tools like flex with this structure? flex takes in a .l file describing a lexer and produces a .h file with #defines for all the lexemes and a .c file containing the lexer itself. This can be described by the dependency graph just fine, but it's hard to describe how the .c and .h files are built because a builder isn't supposed to know which nodes it's operating on until it's invoked, and then it's only told what node it's building. The .c and .h files are sibling nodes that don't really know about each other.

Current Mood: [mood icon] thoughtful

Comments:

From:rosencrantz319
Date:August 14th, 2005 01:29 pm (UTC)
From:omnifarious
Date:August 15th, 2005 05:35 pm (UTC)

Re: Who what in the where now?

Would the animals in the parable be small and made of plastic?

From:sfllaw
Date:August 14th, 2005 03:25 pm (UTC)
.h MUST_EXIST .c
.c BUILT_FROM .lex

Since lex generates both .h and .c, you can just build one and check for the existence of the other.
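
A toy sketch of that arrangement, with a set standing in for the filesystem and a fake flex step. Every name here is hypothetical, and the sketch does no up-to-date checking at all; it only shows that asking for the .h drags in the .c build first and then merely checks that the .h appeared.

```python
existing = {"lexer.l"}  # stand-in for the filesystem
runs = []               # commands that were "executed"


def run_flex(target):
    runs.append("flex lexer.l")
    existing.update({"lexer.c", "lexer.h"})  # flex writes both outputs


def check_exists(target):
    if target not in existing:
        raise RuntimeError(f"{target} is missing")


# target -> (builder, dependencies)
graph = {
    "lexer.l": (check_exists, []),
    "lexer.c": (run_flex, ["lexer.l"]),      # .c BUILT_FROM .l
    "lexer.h": (check_exists, ["lexer.c"]),  # .h MUST_EXIST .c
}


def build(target, done=None):
    done = set() if done is None else done
    if target in done:
        return
    builder, deps = graph[target]
    for dep in deps:
        build(dep, done)
    builder(target)
    done.add(target)


build("lexer.h")
```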
From:omnifarious
Date:August 14th, 2005 11:47 pm (UTC)
Yeah, that's a pretty good answer. I feel slightly embarrassed that I didn't think of doing it that way. Thanks!

I will also have 'environments' that have methods for turning one kind of file into another that will automatically put nodes together with builders in dependency structures, so requiring a specific dependency structure works out just fine.

From:hattifattener
Date:August 15th, 2005 01:19 am (UTC)
My approach would be to include the builders in the dependency graph directly, so that both builders and files are nodes. So, e.g. "lex foo.l" depends-on foo.l, because its behavior somehow differs based on foo.l; and foo.{c,h} depends-on "lex foo.l". Files wouldn't normally depend on other files directly, unless they were symlinks, or potentially the same file, etc.
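
As a rough illustration of that graph shape, here is a sketch using Python's standard `graphlib` (Python 3.9+). The node names are just strings chosen for the example; nothing here is Ptah code.

```python
from graphlib import TopologicalSorter

# node -> the set of nodes it depends on; the builder is a node like any other
deps = {
    "foo.l": set(),
    "lex foo.l": {"foo.l"},
    "foo.c": {"lex foo.l"},
    "foo.h": {"lex foo.l"},
}

# Dependencies come out before the things that depend on them.
order = list(TopologicalSorter(deps).static_order())
```

In any valid order "lex foo.l" comes after foo.l and before both foo.c and foo.h, so the two outputs are related through the builder node rather than through each other.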
From:omnifarious
Date:August 15th, 2005 04:40 am (UTC)
That was an approach I was going to use many years ago when I first envisioned Ptah. I'm not sure why I decided against it this time.

I did just realize a problem with sfllaw's approach, and I'm going to reply to him with it.

From:omnifarious
Date:August 16th, 2005 01:09 am (UTC)

hmmm some more...

I've thought about this a bunch. The approach of doing this with builders seems elegant on the face of it. Fewer major concepts in a system make for a simpler system...

But I don't think it'll fit...

In my system, every node has a builder, even nodes that aren't built out of other things. Those nodes have a builder that checks on their existence and tells other people data about the node like the signature, so other builders can tell if the node has changed since they last ran. If I made builders into nodes, I'd have to have a builder node for every non-builder node. It warps the tree too much. It doesn't feel right.

OTOH, I can borrow your idea...

Basically, what you're doing is relating the two sibling nodes to each other through a node they both depend on, and having that node be the one that does the actual building. I could do this in my system by having the .c and the .h nodes both depend on an empty 'fake' node that then depends on the .l node.
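
A small sketch of the 'fake' node idea, assuming every node has a builder and that a node is built at most once per run. The `Node` class, the `files` set and the fake `flex` builder are all stand-ins invented for the example.

```python
files = {"lexer.l"}  # stand-in for the filesystem
flex_runs = []


class Node:
    def __init__(self, name, builder, deps=()):
        self.name, self.builder, self.deps = name, builder, list(deps)
        self.built = False

    def build(self):
        if self.built:
            return
        for dep in self.deps:
            dep.build()
        self.builder(self)
        self.built = True


def exists(node):
    """Builder for nodes that only need an existence check."""
    if node.name not in files:
        raise FileNotFoundError(node.name)


def flex(node):
    """Builder attached to the fake node; it produces both real outputs."""
    flex_runs.append(node.name)
    files.update({"lexer.c", "lexer.h"})


src = Node("lexer.l", exists)
fake = Node("<flex lexer.l>", flex, [src])  # the empty 'fake' node
c_file = Node("lexer.c", exists, [fake])
h_file = Node("lexer.h", exists, [fake])

c_file.build()
h_file.build()
```

Building both siblings runs the flex step only once, because they share the fake node.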

Again, this tells me I can solve the problem without having to make any structural changes or hacks to the internals. The one tricky detail in all this is a node that has both a MUST_EXIST and an OUTDATED_BY dependency on it after I change the behavior for those two different dependency types. And that tricky detail exists in both models.

Oh, BTW, the plan is that a builder can add dependencies at build time, before the build process actually starts. This will be used by builders to add dependencies on the tools themselves, or the arguments given to the tools.

From:omnifarious
Date:August 15th, 2005 04:47 am (UTC)
There is a problem here. It isn't actually a problem with my current system, but it becomes one if I tweak the system slightly, and it's a tweak I intend to make sometime.

A MUST_EXIST dependency won't cause the .c file to be re-created from the .l file if the .l file is newer than the .c file. This is because a MUST_EXIST dependency only requires that the target exist, not that it be up-to-date.

Now, in the current system, that distinction is moot. The system tries to keep all targets up-to-date, even if there are only MUST_EXIST dependencies on them. But I hope to change that.

What you could do is have both a MUST_EXIST and an OUTDATED_BY dependency without also having a BUILT_FROM dependency. That would probably work and achieve the desired effect. The .h file would have a 'secondary output file' type builder. This means that this concept could be re-used in a variety of situations, such as a .java file producing many .class files. One .class file would be considered 'the master' and the remaining ones would be secondary, as the .h file is in your example.

Hmmm... I'm going to have to think more about this.

From:sfllaw
Date:August 15th, 2005 04:53 am (UTC)
Why would you ever possibly want something to exist but be out of date, by default? That sounds like one of those horrible things you'll have to explain in a manual.
From:omnifarious
Date:August 15th, 2005 06:39 am (UTC)
A directory. A file depends on the directory it's in existing. But if that directory is newer than the file, that doesn't mean the file has to be rebuilt. Same thing for a symlink. :-)

From:sfllaw
Date:August 15th, 2005 06:54 am (UTC)
By default, though? Not by default!

You almost certainly want the default to be to check timestamps, or people will just forget. It's far worse to have an out-of-sync tree than to waste cycles building things too often. You may then make it an explicit exception for directories, or do it implicitly. (For reference, GNU Make always rebuilds if one of the dependencies is a directory, to support recursive make. This may be construed as a misfeature.)

And for symlinks, I think you don't want that. Most of the time, you'll want symlinks to be readlink()ed by default, and it's the underlying file you care about. Your symlinks would then either be versioned by your source control, or `ln -s`ed by your build system. (Note that if you auto-readlink a symlink, your build system will only run `ln -s` once.)
From:sfllaw
Date:August 15th, 2005 06:58 am (UTC)
Sorry, my bad. GNU Make treats directories exactly the same as it does files.
From:omnifarious
Date:August 15th, 2005 03:37 pm (UTC)
It isn't a 'by default' or 'not by default' sort of thing. That's the dependency type. If that's what you want, that's the dependency type you use, and if it isn't, you don't.

From:sfllaw
Date:August 15th, 2005 04:08 pm (UTC)
It is by default, because of the semantics of your tokens.

If you had a MUST_EXIST (that implies OUTDATED_BY) and then a MERELY_EXIST; then this would make more sense to people, and they won't choose the wrong one.
From:omnifarious
Date:August 15th, 2005 05:34 pm (UTC)
Perhaps. For the most part those dependency types aren't going to be exposed to the average user of Ptah anyway. Most people will use a function that will put in a bunch of nodes and attach them to builders. Those functions will be the equivalent of saying "I want to build this C program from this set of .c files."

I'm strongly attached to the idea of the dependency types being a minimal set. But I could make the function a user would use to add a non-BUILT_FROM dependency default to adding both a MUST_EXIST and an OUTDATED_BY dependency.

From:omnifarious
Date:August 24th, 2005 05:14 am (UTC)
I've thought about this a bunch, and I think the only reasonable solution is for the action taken to build something to not depend on how it's depended upon. This means that a MUST_EXIST dependency will have the exact same effect as an OUTDATED_BY or BUILT_FROM dependency from the perspective of the thing being depended on. The only difference will be the action taken by the dependent when it finds that the node it depends on is in some particular state.
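
To make that concrete, here is a rough sketch of what the dependent's side of it might look like. The function `react`, its three return values and the `built_ok` / `changed` inputs are assumptions made for the example; the real state a node reports could be richer than two booleans.

```python
from enum import Flag, auto


class Dep(Flag):
    MUST_EXIST = auto()
    OUTDATED_BY = auto()


def react(flags, built_ok, changed):
    """Decide what the dependent does about one dependency.

    The depended-upon node has already been brought up to date in the same
    way whatever the flags are; only this reaction depends on them.
    """
    if Dep.MUST_EXIST in flags and not built_ok:
        return "fail"     # something required could not be built
    if Dep.OUTDATED_BY in flags and changed:
        return "rebuild"  # includes the buildable -> not-buildable change
    return "ok"
```

A directory that is newer than a file inside it would be `react(Dep.MUST_EXIST, True, True)`, which leaves the file alone, while a vanished header would be `react(Dep.OUTDATED_BY, False, True)`, which triggers a rebuild rather than a failure.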
