April 21, 2024 — The source code for this blog post contains a ScrollSet about the planets and generates this HTML file as well as a CSV, a TSV, and a JSON file. This page demonstrates ScrollSets.
ScrollSets are useful for small single-day projects and large multi-year projects with thousands of concepts like PLDB (a Programming Language Database).
ScrollSets are normal plain text files written in Scroll that also contain measurements of concepts and output that data into formats ready for data visualization and analysis tools.
ScrollSets are line-oriented but represent one or more tables. You might call them deconstructed CSVs or deconstructed spreadsheets.
This ScrollSet has 2 measures (columns) and 2 concepts (rows).
Documentation, column definitions, rows and *any notes/markup/content* can go in the same file.
# Measures (aka Header, aka Columns, aka Schema)
idParser
 // Every concept needs an "id" (or other concept delimiter)
 extends abstractIdParser
moonsParser
 extends abstractIntegerParser
# Concepts (aka Rows)
id mars
moons 2
// I verified moon count with Google. - BY
id jupiter
moons 63
// Note: the moons of Jupiter have their own Wikipedia Page
https://en.wikipedia.org/wiki/Moons_of_Jupiter moons of Jupiter
buildConcepts demo.csv
id,moons
mars,2
jupiter,63
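The step from flat lines to a CSV like the one above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Scroll itself implements buildConcepts; the function names `parse_concepts` and `to_csv` are made up for this sketch, and it assumes every measure line has the form "measure value" and that an id line opens a new concept.

```python
import csv
import io

def parse_concepts(text, measures, id_measure="id"):
    """Group flat 'measure value' lines into one dict per concept."""
    concepts = []
    for raw in text.splitlines():
        measure, _, value = raw.strip().partition(" ")
        # Anything that is not a known measure (comments, links, prose) is skipped.
        if measure not in measures:
            continue
        if measure == id_measure:
            concepts.append({})  # an id line starts a new concept (row)
        if concepts:
            concepts[-1][measure] = value
    return concepts

def to_csv(concepts, measures):
    """Write one row per concept, one column per measure."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=measures, lineterminator="\n")
    writer.writeheader()
    writer.writerows(concepts)
    return out.getvalue()

demo = """id mars
moons 2
// I verified moon count with Google. - BY
id jupiter
moons 63
// Note: the moons of Jupiter have their own Wikipedia Page
https://en.wikipedia.org/wiki/Moons_of_Jupiter moons of Jupiter
"""

measures = ["id", "moons"]
csv_text = to_csv(parse_concepts(demo, measures), measures)
print(csv_text)
```

The comment and the link line sit right next to the data but never reach the CSV, which is the point of keeping notes and measurements in one file.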
id
moonsParser
id [conceptId], Scroll knows that is the beginning of a new concept.
measures.scroll
.appeared 2024
Almost certainly. Using ScrollSets will be much slower and worse than future spreadsheet apps with carefully crafted LLM integrations.
However, it's important to also have simple, lower-tech, timeless tools, and ScrollSets are one of those.
Yes! You can easily achieve the same thing as LLMs & ScrollSets using LLMs & YAML, or LLMs & YAML & Markdown.
For YAML, just put your documentation and schema in YAML comments up top and then have a tiny script to read that YAML and dump CSV/TSV/JSON or whatever. YAML gives you loads of data structures to use and is widely supported in many languages. But generating HTML from the same file would require more work.
If you want to intermix markup content with your data, you can use Markdown to add the marked up content and then have code sections embedding the YAML and a tiny script to parse out those YAML blocks and write your data to disk.
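A rough sketch of the Markdown variant, in Python: pull out the code sections tagged yaml, turn each into a row, and write CSV. To stay dependency-free it uses a hand-rolled `parse_flat_mapping` that only understands flat "key: value" lines; that helper is a stand-in invented for this sketch, and a real script would hand each block to a proper YAML library instead. It also assumes one concept per block.

```python
import csv
import io
import re

FENCE = "`" * 3  # built indirectly so the sample document can contain fences

def extract_yaml_blocks(markdown):
    """Return the body of every fenced code section tagged yaml."""
    pattern = re.compile(FENCE + r"yaml\n(.*?)" + FENCE, re.DOTALL)
    return pattern.findall(markdown)

def parse_flat_mapping(block):
    """Parse flat 'key: value' lines only (NOT a full YAML parser)."""
    row = {}
    for line in block.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and YAML comments
        key, _, value = line.partition(":")
        row[key.strip()] = value.strip()
    return row

markdown = "\n".join([
    "# Planets",
    "Mars has two small moons.",
    FENCE + "yaml",
    "id: mars",
    "moons: 2",
    FENCE,
    "The moons of Jupiter have their own Wikipedia page.",
    FENCE + "yaml",
    "id: jupiter",
    "moons: 63",
    FENCE,
])

rows = [parse_flat_mapping(block) for block in extract_yaml_blocks(markdown)]

out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["id", "moons"], lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
csv_text = out.getvalue()
print(csv_text)
```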
Either can do the job. I expect the Scroll design to end up being more ergonomic, but that might not be true or may be unimportant.
If you don't like Scroll's (evolving) version and want to switch, it will always be straightforward to automatically refactor to YAML.
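To give a feel for how small such a refactor could be, here is a sketch in Python that rewrites flat concept lines as a YAML list of mappings and carries the comments along as YAML comments. The function name `scrollset_to_yaml` is hypothetical, and the sketch assumes simple values that need no YAML quoting; anything that is not a known measure or a comment is dropped.

```python
def scrollset_to_yaml(text, measures, id_measure="id"):
    """Emit a YAML list with one mapping per concept."""
    lines = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("//"):
            # Keep notes next to the data as YAML comments.
            lines.append("  # " + line[2:].strip())
            continue
        measure, _, value = line.partition(" ")
        if measure not in measures:
            continue  # assumption: other content is handled elsewhere
        # An id line opens a new list item; other measures continue it.
        prefix = "- " if measure == id_measure else "  "
        lines.append(f"{prefix}{measure}: {value}")
    return "\n".join(lines) + "\n"

demo = """id mars
moons 2
// I verified moon count with Google. - BY
id jupiter
moons 63
"""

yaml_text = scrollset_to_yaml(demo, ["id", "moons"])
print(yaml_text)
```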
This is a simple pattern to implement, so it has likely been done a few times before. Please let me know so I can link to, and learn from, any prior art.
+ Planned.
LLM dataset generation is a major breakthrough in datasets. ScrollSets are, at best, a minor improvement. They are designed to work alongside LLMs to help solve the Dataset Needed problem.
ScrollSets evolved out of TrueBase. ScrollSets have eliminated the need for the TrueBase software (and existing TrueBase sites should be migrated to ScrollSets), but were informed by the TrueBase build experience.
Although ScrollSets are designed for a world with LLMs, the design is meant to be useful without them as well, and would also have been mildly useful 30 years ago.
import
parser). The normal way to implement this in Scroll would be something like:
measures
 id string
 moons int
concept
 id mars
 moons 2
concept
 id jupiter
 moons 63
The flat design was chosen for ergonomic reasons. ScrollSets seem like they might be useful enough to be worth breaking from Scroll convention a bit. Like all things in Scroll, ScrollSets are an experiment, and maybe this design will evolve.
Below is the ScrollSet embedded in this Scroll file.
idParser
 extends abstractIdParser
diameterParser
 extends abstractIntegerMeasureParser
 description What is the diameter of the planet?
surfaceGravityParser
 extends abstractIntegerMeasureParser
 description What is the surface gravity of the planet?
yearsToOrbitSunParser
 extends abstractFloatMeasureParser
 description How many Earth years does it take for the planet to orbit the Sun?
moonsParser
 extends abstractIntegerMeasureParser
 description How many moons does the planet have?
 boolean isMeasureRequired true
 float sortIndex 1.1
akaParser
 extends abstractStringMeasureParser
 description What are the alternative names for the planet?
ageParser
 extends abstractIntegerMeasureParser
 description How old is this planet?
hasLifeParser
 extends abstractBooleanMeasureParser
 description Does this planet have life?
wikipediaParser
 extends abstractUrlMeasureParser
 description URL to the Wikipedia page.
// end measures
id Mars
moons 2
// TIL Mars has 2 moons!
diameter 6794
surfaceGravity 4
yearsToOrbitSun 1.881
hasLife false
id Jupiter
moons 63
// The moons of Jupiter have their own Wikipedia Page
https://en.wikipedia.org/wiki/Moons_of_Jupiter moons of Jupiter
diameter 142984
surfaceGravity 25
yearsToOrbitSun 11.86
hasLife false
id Earth
moons 1
diameter 12756
surfaceGravity 10
yearsToOrbitSun 1
aka Pale Blue Dot
hasLife true
wikipedia https://en.wikipedia.org/wiki/Earth
age 4500000000
// Note: It was only during the 19th century that geologists realized Earth's age was at least many millions of years.
id Mercury
moons 0
diameter 4879
surfaceGravity 4
yearsToOrbitSun 0.241
hasLife false
id Saturn
moons 64
diameter 120536
surfaceGravity 9
yearsToOrbitSun 29.46
hasLife false
id Uranus
moons 27
diameter 51118
surfaceGravity 8
yearsToOrbitSun 84.01
hasLife false
id Venus
moons 0
diameter 12104
surfaceGravity 9
yearsToOrbitSun 0.615
hasLife false
id Neptune
moons 14
diameter 49572
surfaceGravity 11
yearsToOrbitSun 164.79
hasLife false
// end concepts
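For readers curious how measure definitions like the ones above could drive typed output such as JSON, here is a small Python sketch. It is not Scroll's implementation. The `CONVERTERS` table is an assumption that maps the abstract parser names used in this file to Python types, and `read_measures` and `read_concepts` are hypothetical helpers that look only at the "extends" line of each measure.

```python
import json

# Assumed mapping from abstract parser names to Python converters.
CONVERTERS = {
    "abstractIdParser": str,
    "abstractStringMeasureParser": str,
    "abstractUrlMeasureParser": str,
    "abstractIntegerMeasureParser": int,
    "abstractFloatMeasureParser": float,
    "abstractBooleanMeasureParser": lambda value: value == "true",
}

def read_measures(schema_text):
    """Map each measure name to a converter based on its 'extends' line."""
    measures = {}
    current = None
    for raw in schema_text.splitlines():
        words = raw.split()
        if len(words) == 1 and words[0].endswith("Parser"):
            current = words[0][: -len("Parser")]  # "moonsParser" -> "moons"
        elif len(words) == 2 and words[0] == "extends" and current:
            if words[1] in CONVERTERS:
                measures[current] = CONVERTERS[words[1]]
    return measures

def read_concepts(concepts_text, measures):
    """Build one typed dict per concept; unknown lines are skipped."""
    concepts = []
    for raw in concepts_text.splitlines():
        measure, _, value = raw.strip().partition(" ")
        if measure not in measures:
            continue
        if measure == "id":
            concepts.append({})
        if concepts:
            concepts[-1][measure] = measures[measure](value)
    return concepts

schema = """idParser
 extends abstractIdParser
moonsParser
 extends abstractIntegerMeasureParser
 description How many moons does the planet have?
yearsToOrbitSunParser
 extends abstractFloatMeasureParser
hasLifeParser
 extends abstractBooleanMeasureParser
"""

data = """id Mars
moons 2
// comment lines are skipped
yearsToOrbitSun 1.881
hasLife false
id Earth
moons 1
yearsToOrbitSun 1
hasLife true
"""

planets = read_concepts(data, read_measures(schema))
print(json.dumps(planets, indent=1))
```

Because the types come from the measure definitions, adding a new column only means adding a new parser definition; the reading code stays the same.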