resolve.c difficulties + data structures in ion

Hello,
first: thanks for sharing all this knowledge! I definitely already learned a ton more details about c and some workflow.

The videos so far have been a very good insight although I have to admit the resolve.c file is a bit confusing to me still.
Having some more textual write down about the process/ data flow would help me I think, maybe I will do it to actually understand it cause the different struct types do not really fit in my head all at the same time like what does this Type/ Typefield/ Sym actually do at every point in time...probably obvious if one has this very strong c background in mind but kind of hard for me.


I'd like to ask another question though:
How do you expect to write the coming data structures in ion without the preprocessor and generics?
Like stretchy buff, hash maps are just the two main ones I think (and I think you think) about.

side note: my background is: I'm halfway though CS and have written a small OS (preemptive multitasking, shared memory) in C as one part of it

Edited by listabo on Reason: Initial post
Sure, I plan to cover the resolver algorithm in the write-up I'll start working on this weekend. The core idea is like the textbook DFS algorithm for topological sort. If you know how to implement a recursive DFS with 'visited' flags on nodes to detect cycles, it's a similar idea to how the resolver works via resolve_sym. There's another variant of recursive DFS which uses monotonic timestamps. We're going to add something similar soon since it lets you print the symbol path that led to a cycle: when a cycle is detected (from a non-monotonic timestamp sequence), sort all the SYM_RESOLVING symbols by timestamp and print them in order. You could also reconstruct the path by manually maintaining a path stack in parallel to the resolve_sym call stack. But all of these things are just DFS topological sorting in various guises. The thing that's different from textbook DFS is that we aren't just doing the topological sort as a separate pass but intermingled with everything else in the resolve pass, especially type checking.

For version 0 there's no plan for an easy way to support stretchy bufs. We might add special support for dynamic arrays in some later iteration, but there's no short-term plans. But you can do what you've always done in C, which is implement each type instance manually, sharing what is possible/reasonable. Then the manual per-instance code is limited, e.g.

1
2
3
4
5
func buf_push(buf: void**, elem_ptr: void*, elem_size: size_t);

func int_buf_push(buf: int**, elem: int) {
    buf_push(buf, &elem, sizeof(elem));
}


Or you could just call the generic buf_push directly at each call site.

That's even assuming you want a stretchy buf style interface. Other approaches to dynamic arrays might be equally good or better once you start doing a bit of custom code per instance.

Stretchy bufs are mainly valuable for prototyping (e.g. all the current uses in the Ion compiler will probably be replaced by something more custom). Having them available is great for productivity but it doesn't take a lot of time to write out a specific type instance by hand when you already have elem_size-parameterized generic versions based on void pointers.

Again, we might do language-level support for something like stretchy bufs later, but it's not really a show stopper, just a minor inconvenience, like the other compromises we're making.

Edited by Per Vognsen on
Thanks for the planned writeup!

I actually know the topological algorithm and how this is supposed to solve the type inference (that is kind of what you learn at university) so for me that in itself is not the issue.
As far as I see I also understand the things like string interning and hash consing and those first parts of the code mostly make much sense to me/ were not very difficult to understand/ explained well in the video.
It's more about the naming and design of the whole thing for me at least.
In the end it also comes down to just investing more time into reading/ thinking for me so I don't blame you for the lack of a writeup, it's already helpful what you're doing.

Language level support would probably go a long way like in go if that was done at some point.
I'd actually think I'd use ion to write c code for microcontrollers if the refactoring/ exporting facilities work out as you described, really liked those ideas.

Edited by listabo on
some time ago I started reading craftinginterpreters and did not really get it, but your explanations in the first videos made much sense to me and I tend to revisit topics ever so often and I now find that resource to be helpful as well.

Maybe others are interested in reading something of that as well so I thought I'd briefly mention it.
pervognsen
1
2
3
4
5
func buf_push(buf: void**, elem_ptr: void*, elem_size: size_t);

func int_buf_push(buf: int**, elem: int) {
    buf_push(buf, &elem, sizeof(elem));
}


Correct me if I'm misunderstanding something, but isn't elem passed on the stack? And then you're taking the address and storing it in your buffer.
I realise that wasn't intended as production code, but I think it demonstrates why doing something like stretchybuf in C without using macros is such a pain.

I think it could be really instructive to think through the cases where macros are useful in C, and see what the code would look like if you were forced to do without them. Some uses, for example as a replacement for reference types, aren't a big deal - you can just pass a pointer instead. As a replacement for generics, or as a way to cut down on very repetitive code, they can be really handy, but I guess the only casualty is that you end up with more verbose code. Are there any other common uses for macros that would be difficult to do without? I guess one idiom you see quite a lot (especially in single header libs and the like) is to set various compile time parameters - due to the resulting combinatorial explosion that could be a real pain to emulate, you'd probably end up having to make them runtime parameters instead a lot of the time.

yes the int is copied onto the stack (really in a register) but buf_push will copy it again into the buffer itself.

It's much more natural to be able to do int_buf_push(&buf, i); than int_buf_push(&buf, &i); and the latter doesn't allow passing int literals.

Edited by ratchetfreak on
ratchetfreak

yes the int is copied onto the stack (really in a register) but buf_push will copy it again into the buffer itself.

It's much more natural to be able to do int_buf_push(&buf, i); than int_buf_push(&buf, &i); and the latter doesn't allow passing int literals.


Everyone knows how big a fan I am of stretchy bufs, so I was just saying that while it would be nice to have stretchy bufs as is, there are palatable alternatives.

For your specific case, you can use int literals if you use a C99 (and Ion) trick: int_buf_push(&buf, &(int){42}) works because compound literals (which despite the name works with non-compound types like int) are lvalues with the same storage duration as their enclosing scope. It's not a huge labor-saving device for this example, but I like reminding people how much C99 improved simple quality-of-life issues like these. Of course in this case you want to use int_buf_push(&buf, 42) since that's just a friendlier API.


Edited by Per Vognsen on
Cause you mentioned typedefs in todays stream briefly I would like to ask another "feature" question.

What is your objection to having strong/strict typedefs in a sense like go has which do not cast to their base automatically?

I would like to think one easy option to do this would be to create a struct on the c side and automatically do the boxing/unboxing of the struct member in c code generation.
This seems like an easy win for me if you have code generation in place anyway.

One major downside I could image is that it's harder to realize this for the other backend(s?)?
There are no strong objections other than the fact that in "version 0" I'm trying to focus on parity with C features (and not even all C features). Strong typedefs can be added later. I still want to have a C weak typedef equivalent for parity.