DL1 Thread bsp Board owner 07/11/2025 (Fri) 01:08:00 Id: 97de93 No. 11
DL1 is a computing model, virtual machine, intermediate representation and programming language. DL1 most closely resembles an untyped functional programming language with effects that corresponds to stored-program computers based on a pattern-matching variant of the lambda calculus. DL1 has three primary features distinguishing it from other functional languages:
- Explicit evaluation: an eval function that is completely specified, does not perform macro-expansion, operates specifically on a small well-defined structured language and has the same semantics as a JIT compiler.
- Meta-execution: a set of meta-execution primitives that allows for safe virtual execution of DL1 code.
- Copy-on-write semantics: DL1 vectors include an update function that uses copy-on-write to guarantee constant-time mutation when possible and produces a copy of the vector otherwise.
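To illustrate the copy-on-write rule, here is a toy Python sketch (not DL1's implementation; the explicit refs counter stands in for DL1's actual reference tracking, and Vec/share/update are illustrative names):

    # Toy copy-on-write vector: update mutates in place only when this
    # is the sole reference, otherwise it returns a fresh copy.
    class Vec:
        def __init__(self, items):
            self.items = list(items)
            self.refs = 1  # stand-in for real reference counting

        def share(self):
            self.refs += 1
            return self

        def update(self, i, x):
            if self.refs == 1:      # unshared: constant-time in-place write
                self.items[i] = x
                return self
            v = Vec(self.items)     # shared: pay for a copy instead
            self.refs -= 1
            v.items[i] = x
            return v

    # v2 = v1.share(); v1.update(0, 9) now copies, leaving v2's view intact.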
One of the consequences of working on DL1 is that I've had the chance to work on various macro systems for it. Generally, my approach to macro systems for DL1 has been somewhat different from the common solutions, owing to wanting something that is both simple to implement and has a reasonable semantics. To my knowledge the particular approach that I've taken is unique to DL1, so I want to make a few notes about macro systems. Among other languages, there are a few major approaches to macro systems.

>Looping quasi-quote + gensym macros
Common Lisp and Clojure take the "most obvious" approach for Lisp-like languages: simply have the user generate code as they would have written it. Typically, an impure "gensym" primitive is provided to allow constructing unique names that do not collide with a macro user's own variable names. When this style of macro-expander runs, it has roughly this operation:
>if macro-expanding a symbol, generate a variable lookup
>if macro-expanding any other atom, generate a quote form
>if macro-expanding a list whose head is a symbol corresponding to a macro's name, run the macro to generate a new expression, then macro-expand the new expression
>if macro-expanding a list whose head is not a symbol corresponding to a macro's name, macro-expand every element of the list and generate a function call of the macro-expanded head to a set of arguments consisting of the macro-expanded tail
The third step restarts the macro-expander. I refer to such macro-expanders as looping macro-expanders, since the macro-expander repeats after running a macro. This lets disciplined users construct macros with relative ease, without requiring the user to learn any additional mechanisms beyond the well-known quasiquote form. Despite the potential for misuse, I feel that the balance between implementation difficulty and chance of misuse favors this approach over Scheme's hygienic macros.

>Looping hygienic macros
Scheme takes a different approach. While its macro-expander's operation is the same, its macro system goes out of its way to avoid variable capture. The result is that variable names used by a macro will not be visible to the macro's users. In the R5RS syntax-rules mechanism, this involves constructing macros from clauses, each pairing a pattern with a template body in which pattern-variables are replaced by their actual occurrences. Later, syntax-case provided a more explicit mechanism, having the user specify a program that processes the AST and uses a syntax-quote form to yield a new AST. Following Scheme, many languages have replicated hygienic macros, often touting them as a key feature.

>AST-based macros
Template Haskell from the GHC implementation of Haskell, as well as OCaml's PPX, use explicit AST manipulation in a monad alongside quasi-quotation. The issue with Template Haskell is that while it can generate code and this code generation is composable, the actual code generation is not practically usable at the user level. Ironically, Haskell can specify macros but unfortunately cannot call those macros. As for OCaml's PPX, I haven't used it, but a history of PPX exists on the OCaml forums: https://discuss.ocaml.org/t/the-future-of-ppx/3766

With DL1, I had a different idea. DL1's AST is fixed, so there is no clear advantage in a looping macro system.
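For concreteness, here is a minimal sketch of such a looping expander, in Python over s-expressions represented as nested lists. Sym, MACROS and the output forms are all illustrative, and special forms are elided:

    # Illustrative looping macro-expander over s-expressions as nested
    # Python lists. Sym marks symbols; MACROS maps names to macro functions.
    class Sym(str):
        pass

    MACROS = {}  # name -> function from argument forms to a new form

    def expand(form):
        while True:
            if isinstance(form, Sym):
                return ["var", form]                # variable lookup
            if not isinstance(form, list):
                return ["quote", form]              # any other atom
            if not form:
                return ["quote", form]              # bare () as a literal here
            head = form[0]
            if isinstance(head, Sym) and head in MACROS:
                form = MACROS[head](*form[1:])      # run macro, then restart
                continue                            # the expander on the result
            return ["call", expand(head),           # ordinary application
                    [expand(arg) for arg in form[1:]]]

    # Example macro: (when test body...) -> (if test (begin body...) nil)
    MACROS[Sym("when")] = lambda test, *body: \
        [Sym("if"), test, [Sym("begin"), *body], None]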
Instead, my initial macro-expander in DL1 acted as follows:
>if macro-expanding a symbol, generate a variable lookup
>if macro-expanding a vector whose first element exists and is a symbol naming a macro in the <macros> namespace, yield the result of calling the macro with the rest of the vector as the macro-argument vector, along with the environment and location
>if macro-expanding a vector whose first element exists and is not a symbol naming a macro, macro-expand every element and generate a function call where the first element is applied to a vector containing the rest of the elements
>if macro-expanding an empty vector, crash
>if macro-expanding any other value, return the quoted value

While this worked well enough, macro composition was impossible. In practice, this would most likely have resulted in a proliferation of helpers just for manipulating ASTs. When working on DL2 using an adapted macro system, it became clear that this approach wasn't practical. I was prototyping a typed DL2 macro-expander which would forego ordinary type inference in favor of using only explicit types. These macros, as I had implemented them, were inherently much more involved, since they take in type information, perform type-checking and yield terms and judgments rather than just expressions. If I could compose these macros, I could avoid manual environment management and intermediate type-checking and instead focus on just the terms and types of importance. I prototyped the second iteration of macro-expansion for DL2, which has roughly this behavior:
>if macro-expanding a vector of the form [#'<macro-apply> f argv], call f with macro-argument vector argv, environment and location; otherwise macro-expand as previously described

This "<macro-apply> macro" system immediately allowed for composition of macros, and was a major boon when continuing to implement DL2 macros. I introduced additional helpers and macros to automatically look up DL2 macros and compose them using <macro-apply>. For DL2's more complicated macro-expansion system, this is extremely useful and practically necessary. I was very pleased and promptly added the clause to DL1's macro-expander.

My third iteration of a macro-expander for DL1, which currently exists as a separate program from the DL1 micro-code, uses a modified reader that attaches lexical metadata when reading files. This allows a form of macro-expansion which uses that information to simplify error reporting. Before then, location information had to be passed down to macros by the macro-expander, forcing the user to piece together the actual location from a path through the AST. After <macro-apply> was added, maintaining location metadata became a considerable burden. The "lexical macro" modification allows macro-expanders to read and use lexical information directly from their argument vector. The implementation is slightly more complicated, but much more pleasant from a user perspective.

A side benefit of the lexical macro system is its baffling generality. Many of the helpers operate on "lexical parse trees" rather than the values represented by those parse trees. Per-macro-system boilerplate is minimal, mostly revolving around looking up that macro system's macros from its macro name-space, however it is defined in the current environment. One benefit that appears in writing macro-expanders is full flexibility regarding name-spaces.
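A corresponding non-looping sketch, again in illustrative Python with the location plumbing elided; note that, unlike the looping expander, a macro's result is yielded as-is rather than re-expanded, which is exactly what made composition impossible before <macro-apply>:

    # Illustrative non-looping expander over vectors (Python lists).
    # macros is the <macros> namespace; location plumbing is elided.
    class Sym(str):   # as in the previous sketch
        pass

    def expand(form, macros, env):
        if isinstance(form, Sym):
            return ["var", form]                        # variable lookup
        if isinstance(form, list):
            if not form:
                raise ValueError("cannot macro-expand an empty vector")
            head = form[0]
            if isinstance(head, Sym) and head == "<macro-apply>":
                f, argv = form[1], form[2]              # [#'<macro-apply> f argv]
                return f(argv, env)                     # call f directly
            if isinstance(head, Sym) and head in macros:
                return macros[head](form[1:], env)      # macro's result, as-is
            return ["call", expand(head, macros, env),  # ordinary application
                    [expand(x, macros, env) for x in form[1:]]]
        return ["quote", form]                          # any other value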
It is entirely possible, for example, to write a macro-expander that emulates Common Lisp's macro-expansion, where functions and macros share a name-space separate from all other variables. For DL1, I have instead used this to provide different name-spaces for different languages, allowing multiple languages to practically co-exist through macro systems. This has made macro development using DL1 extremely comfortable. In fact, some might even call it cozy. I doubt I'll ever find myself wanting for much when using DL1 macros.
From early on in DL1's development, I've had a "recurse" form for expressions to perform recursive iteration and execution. It has gone through a couple of iterations, none of which were completely satisfying to work with. It wasn't until after doing some work on DL2, then returning to DL1, that I figured out what I believe to be the best general solution to the problem.

While working on DL2 I needed a way to perform purely-functional iteration. In imperative languages, recursion cannot be used for general control flow, a property which I refer to in the context of DL2 as having a "stack-recursive" resource semantics. An oddity of almost all imperative languages is that, since looping operates fundamentally by mutation, the general process of computing a result from a loop tends to require either:
- declaring an uninitialized variable of the result type in the outer scope, then assigning that variable to the actual result in the final iteration of the loop
- declaring a tagged union loop state variable holding either an intermediate state of the loop or the final result of the loop, looping on intermediate states, then extracting the final result

The latter is the technically-correct solution. The post-condition of the loop is that the loop state variable holds a final result rather than an intermediate state. As a result, it is a dynamic semantic guarantee that the final result has been computed and can be dynamically extracted. This solution isn't type-safe, however, since the final extraction of the result from the loop state variable relies on the dynamic assumption of the post-condition. While working on DL2 with the intention of transpiling it to imperative languages, I ran into this problem and developed the theory of "stack-recursive" vs "state-recursive" semantics to explain how this leads to a semantic difference between imperative and functional languages. To augment stack-recursive semantics to allow for behavior equivalent to state-recursive semantics, I needed to be able to deal with this issue, so I introduced a new "run-state-machine" primitive. A side benefit of this approach that indicates its utility (and correctness) is that the explication of loop state both allows for a purely-functional semantics for iteration and reduces the algorithmic complexity of live-variables analysis to a single pass over the code.

After going back to working on DL1, I re-worked the concept as the "run-repeat" function and "repeat" looping construct. The definition of "run-repeat" and some associated helper functions in ldl1.dl1 is:

[lexec [defun run-repeat [step loop-state]
  [match loop-state
    [[#t x] [run-repeat step [apply step x]]]
    [[#f r] r]]]]
[lexec [defun next x [cons #t x]]]
[lexec [defun finish [r] [cons #f r]]]

Whenever a run-repeat loop state is constructed by next, step is applied to the intermediate loop state, and run-repeat restarts with the resulting loop state. If the loop state is constructed by finish, run-repeat returns the result directly. The definition of next is variadic so that intermediate states can carry multiple loop variables at once.
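For comparison, here is a rough Python analog of run-repeat, written as a trampoline over tagged loop states; next_/finish and the tuple tags are illustrative stand-ins for DL1's [#t x] and [#f r]:

    # Python analog of run-repeat: a trampoline over tagged loop states.
    # ("next", args) marks an intermediate state; ("finish", r) a result.
    def next_(*args):
        return ("next", args)

    def finish(r):
        return ("finish", r)

    def run_repeat(step, state):
        while True:
            tag, payload = state
            if tag == "finish":
                return payload
            state = step(*payload)  # unpack the loop variables and step

    # Usage: a loop computing a length without mutation leaking outward.
    def length(xs):
        return run_repeat(
            lambda n, rest: finish(n) if not rest else next_(n + 1, rest[1:]),
            next_(0, xs))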
At this point, I had both repeat and recurse forms with similar problems. The usages of the two forms at this stage were:

[repeat <init-state> <intermediate-state-pattern> <step-body>]
[recurse <rec-name> <init-call> <argument-pattern> <rec-body>]

Both forms were just slightly awkward. In both cases, the pattern sits awkwardly between expressions, and repeat taking three arguments while recurse takes four is unpleasant. I then had the idea that I should make them both clausal:

[repeat <init-state> [<intermediate-state-pattern> <step-body>] ...]
[recurse <rec-name> <init-call> [<argument-pattern> <rec-body>] ...]

These forms tie the arguments to the bodies that use them, and provide pattern matching for both. The current definitions of recurse and repeat are:

[lexec [define-macro recurse [<vector> argv argv-loc] env _
  [let [_ [when [< [length argv] #d2]
            [error #'recurse-invalid-argv argv-loc argv]]]
       [[[name use] clauses] [slice-at argv #d2]]
       [clx [apply lrx-cons clauses]]
    [lmacro-expand [m-call with-var name [m-call rec-match name clx] use] env]]]]
[lexec [define-macro repeat [<vector> argv argv-info] env _
  [let [_ [when [< [length argv] #d1]
            [error #'repeat-invalid-argv argv-info]]]
       [[[init] clauses] [slice-at argv #d1]]
       [mexp [mr-pcall run-repeat [m-acall lambda-match clauses] init]]
    [lmacro-expand mexp env]]]]

Some recent code using these forms is quasiquote from ldl1.dl1 and id-exec from dl2-mod.dl1:

[lexec [define-macro quasiquote [term] env _
  [recurse qq-gen [qq-gen term]
    [[qq-lrx]
      [lmatch qq-lrx
        [[#'unquote x] [lmacro-expand x env]]
        [[<vector> x]
          [let [qq-all [map qq-gen x]]
            [if [all? [lambda [y] [match y [[#'quote _] #t] [_ #f]]] term]
                [cons #'quote [lrx->data qq-lrx]]
                [cons #'cons qq-all]]]]
        [_ [cons #'quote [lrx->data qq-lrx]]]]]]]]

[defun id-exec [init-exec]
  [repeat [next #d0 init-exec]
    [[_ [#'value r]] [finish r]]
    [[id [#'effect #'gen-id k]] [next [+ id #d1] [continue k id]]]
    [[id [#'effect eff k]] [next id [continue k [effect eff]]]]
    [[_ [#'error err]] [error #'id-exec-error err]]]]

The "quasiquote" macro definition is able to define the recursive function "qq-gen" right within its (non-recursive) macro definition, and qq-gen is then called recursively within the higher-order function "map", all without defining a separate helper. The "id-exec" function definition uses the repeat form to supply an initial id of 0, and uses the clausal pattern matching to branch for each of its cases. I think both of these should be relatively representative of repeat and recurse in practice.

I'm satisfied with both of these forms. They provide simple clausal expression forms for sophisticated recursion and looping without the convoluted syntax of Scheme's "do" or named "let" macros, or of Common Lisp's mythical "LOOP" DSL and "LABELS" special form. The two forms are pleasant, and I think I'll keep using repeat and recurse as they are now, and possibly make them part of the "DL1 canon" of expected macros.
>>18
For comparison, I have also implemented an analog of repeat and recurse in Racket.

(define-syntax-rule (recurse name use clauses ...)
  (letrec ((name (match-lambda** clauses ...)))
    use))

(struct rs-next (x) #:transparent)
(struct rs-finish (x) #:transparent)

(define (run-repeat step st)
  (match st
    ((struct rs-finish (x)) x)
    ((struct rs-next (x)) (run-repeat step (apply step x)))
    (_ (error 'run-repeat "run-repeat invalid state ~a" st))))

(define finish rs-finish)
(define-syntax-rule (next x ...) (rs-next (list x ...)))
(define-syntax-rule (repeat st clauses ...)
  (run-repeat (match-lambda** clauses ...) st))

Two usage examples:

(define (alt-map f l)
  (recurse map (map l)
    (((cons x xs)) (cons (f x) (map xs)))
    (('()) '())))

(define (alt-length init-l)
  (repeat (next 0 init-l)
    ((n '()) (finish n))
    ((n (cons x xs)) (next (+ n 1) xs))))
For both DL1 and DL2 I'm unclear about what concurrency model would be appropriate. I have had a few ideas, but I'm not satisfied with any of them yet. DL1's copy-on-write semantics effectively requires eager reference counting, and DL2's runtime-agnosticism means that in general it's preferable to have solutions for everything, including whatever memory model is appropriate for concurrent DL1.

>Just copy everything
I am seriously considering this. A good reason NOT to do this is large allocations. Making specific exceptions to allow for sharing such allocations does not strike me as entirely unreasonable.

>Just use Software Transactional Memory (STM)
This is almost my favorite solution. It is not a complete solution, but it might become part of one. In particular, it doesn't provide a solution to shared references, which are the main problem; it assumes a preexisting ability to share immutable structures.

>Just use Atomic Reference Counts (ARC)
This would be my favorite solution if it weren't a performance disaster under the hood. First, there is the straw solution of using a single atomic reference count for each object. This would be extremely slow, especially on older computers, even for unshared data with uncontended reference counts. More likely, I would have to use a hybrid atomic reference count, which is more common in practice anyways when sharing is not guaranteed. There are two issues with that: one merely a matter of performance, the other semantic. The first issue is that using hybrid reference counts means that for shared objects I have to either accept the time overhead of checking whether an object's reference count needs an atomic update, or accept the memory overhead of an additional reference count when decrementing. The second issue is that it doesn't allow an appropriate performance semantics. If I don't want certain objects to become "tainted" by sharing, then I need to be able to re-claim that memory when it is used by only one thread. Single-word hybrid reference counts can only do so when there is exactly one reference remaining and one more reference is made. Dual reference counts can only do so when one thread relinquishes all references to an object, and even then only one lucky reference can be re-claimed as a special case. A realistic approach allowing reclamation would likely require each thread to keep its own reference counts for shared objects, then reclaim on writes, but it's not clear that such book-keeping would be practical.

>Approximate borrowing
One idea I've been considering is a form of "approximate borrowing" as an automatic solution. The intuition is this: parallelism is usually a form of non-determinism, so efficient solutions must yield to this non-determinism via approximation. While eager reference counting is necessary for CoW structures which aren't shared, a looser constraint may be appropriate for shared structures. Inter-thread communication could explicate reference counting events. It's not clear that an efficient general solution is possible even with this relaxation.

>Explicit memory pools with cross-pool references
I've been leaning towards a solution where cross-thread references are explicit rather than implicit. This makes the most sense in terms of explicit memory pools and resource management.
The strategy is that, rather than data being immediately visible between threads, it has to be prepared in its own memory space before it can be sent from one thread to another. Concurrent data structures can be formed from a mix of STM and cross-references. This has some similarities to an "opt-in" ARC system, and would likely be complementary to the instrumentation necessary to track memory use. The biggest issue is which operations would make the most sense. For better or worse, a cross-thread "copy" operation would be necessary to support closures and other opaque data types. Likely, this operation would do a few things:
>check if the copy is being made from the sole reference to a memory pool; if so, no copy needs to be made at all, as the whole memory pool can be subsumed
>maybe record the copy to such a reference, so that only one copy ever needs to be made for any given reference
>maybe go a bit further, and maintain a full lazy copy of the other memory pool, since such a copy has to be built anyways

It'd be nice to have pattern-matching operations to examine internal memory, so maybe the DL1 core primitives also need to have both ordinary and concurrent explicit pattern matching added in. The alternative is doing it all manually through the use of operations like a concurrent analog of "index". The "index" operation would be a necessity to work with large vectors. The "copy" operation could be appropriate for words, but even discussing it makes me think I really ought to just reach for the explicit pattern matching, since I have to type-test it anyways.

I think this approach would be complementary with other concurrency notions, or could subsume them. This approach should have immediate inter-thread communication at the cost of making the communicated data harder to use. I believe it also has the flexibility to support many underlying runtime models of concurrency. If I'm right about that, it would give several points in favor of the approach.
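To make the sole-reference subsumption concrete, here is a minimal Python sketch of the "copy" operation's bookkeeping; Pool, refs and send are hypothetical names, synchronization is ignored, and this only models the intended semantics, not DL1's implementation:

    # Sketch of the cross-pool "copy": subsume a whole pool when we hold
    # its sole reference, otherwise copy once and memoize so each
    # reference is copied at most one time.
    import copy

    class Pool:
        def __init__(self, data):
            self.data = data
            self.refs = 1          # stand-in for real reference tracking

    def send(pool, copies):
        """Prepare pool's data for another thread. copies memoizes
        previously-sent references (id -> copied data)."""
        if pool.refs == 1:
            return pool.data       # sole reference: subsume, no copy at all
        key = id(pool)
        if key not in copies:      # record the copy for this reference
            copies[key] = copy.deepcopy(pool.data)
        return copies[key]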
In the process of writing DL1, I also wrote two extension systems, which eventually converged into one extension system called Octext. The basic premise of Octext is based on some observations:
>difficulties in library use stem from various type incompatibilities between languages
>if only a few fixed types are used, there's no need to write FFI bindings for untyped languages
>statically-typed languages can use it directly, layer their own types based on specifications or expose library access to other programs
>the overhead of library calls is proportional to the number of times libraries are called divided by the work done by those libraries per call
>structured binary data is a universal data format
>adding domain elements tends to be sufficient for most practical tasks
>if the user provides a runtime themselves, their own data structures can be used directly by the extension, saving a pass of data marshaling
More obscurely:
>C libraries give the illusion of control by exposing every element of their domain as its own call, while Haskell libraries will casually include DSLs promoting correct use
>the Haskell approach puts less pressure on the API proper and promotes both internal and external correctness

Octext allows a different mode of language and program extension, providing a single agreed interface between language and extension based on structured binary data, alongside extension-provided types that remain opaque to the extension's user. Octext takes an additional step to make this practical: it allows the extension user to provide its own runtime to the extension, letting the native types of the extension user supply the interface of the extension. My intention is to allow the user to supply their runtime as one of several vtables, with method gaps filled in programmatically. This way, the extension user can specify its most appropriate set of methods and the extension can use its most appropriate set of methods, with Octext bridging the gap.

Currently, Octext is undergoing a re-design, which I hope will be its last. I have two major goals. First, I want to include a direct memory access (DMA) interface, allowing the user runtime to lend direct pointers to its data structures. Second, I want to update the system so that it can handle concurrency. The DMA interface I have in mind is simple, but I really should get comments on it. Concurrency is a point to deal with later, but there's a short-term need to establish that concurrency can be done with the current system.
The runtime's complete vtable currently looks like this:

struct octext_runSpec_proto {
    octext_dmaWrite_t (*dmaWriteAlloc)(octext_user_t, octext_reqTok_t, size_t);
    octext_obj_t (*dmaWriteCons)(octext_user_t, octext_reqTok_t, octext_dmaWrite_t);
    octext_obj_t (*vectorAlloc)(octext_user_t, octext_reqTok_t, size_t);
    octext_obj_t (*opaqueCons)(octext_user_t, octext_reqTok_t, octext_tok_t, octext_opq_t);
    bool (*isString)(octext_user_t, octext_reqTok_t, octext_obj_t);
    bool (*isVector)(octext_user_t, octext_reqTok_t, octext_obj_t);
    bool (*isOpaque)(octext_user_t, octext_reqTok_t, octext_tok_t, octext_obj_t);
    size_t (*stringLength)(octext_user_t, octext_reqTok_t, octext_obj_t);
    void (*stringRead)(octext_user_t, octext_reqTok_t, size_t, uint8_t *, octext_obj_t, size_t);
    octext_obj_t (*stringUpdate)(octext_user_t, octext_reqTok_t, octext_obj_t, size_t, size_t, const uint8_t *);
    octext_dmaRead_t (*dmaRead)(octext_user_t, octext_reqTok_t, octext_obj_t);
    void (*dmaReadFree)(octext_user_t, octext_reqTok_t, octext_dmaRead_t);
    octext_dmaWrite_t (*dmaWrite)(octext_user_t, octext_reqTok_t, octext_obj_t);
    size_t (*vectorLength)(octext_user_t, octext_reqTok_t, octext_obj_t);
    void (*vectorRead)(octext_user_t, octext_reqTok_t, size_t, octext_obj_t *, octext_obj_t, size_t);
    octext_obj_t (*vectorUpdate)(octext_user_t, octext_reqTok_t, octext_obj_t, size_t, size_t, const octext_obj_t *);
    octext_opq_t (*unwrap)(octext_user_t, octext_reqTok_t, octext_tok_t, octext_obj_t);
    octext_obj_t (*copy)(octext_user_t, octext_reqTok_t, octext_obj_t);
    void (*free)(octext_user_t, octext_reqTok_t, octext_obj_t);
};

Right now, I have the slightly ugly notion of passing "request tokens" with every request for concurrency, and I have to think about whether it will work and whether there is a better way to do it. I also have to settle exactly how DMA should work from a reference-counting perspective.
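Relatedly, a minimal sketch of the "method gaps filled in programmatically" idea mentioned above, in Python rather than the C ABI, with signatures simplified; fill_gaps and the derivation of stringRead from dmaRead/dmaReadFree are assumptions for illustration, not part of Octext as it exists:

    # Illustrative gap-filling for a partial runtime vtable: if the user
    # supplies dmaRead/dmaReadFree but not stringRead, derive the latter.
    def fill_gaps(vt):
        vt = dict(vt)
        if "stringRead" not in vt and {"dmaRead", "dmaReadFree"} <= vt.keys():
            def string_read(obj, offset, count):
                buf = vt["dmaRead"](obj)       # borrow direct access to bytes
                try:
                    return bytes(buf[offset:offset + count])
                finally:
                    vt["dmaReadFree"](buf)     # return the borrowed buffer
            vt["stringRead"] = string_read
        return vt

This way the user runtime supplies whichever methods are natural for it, and the bridging layer derives the rest.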

