This tutorial walks you through using Scroll for data analysis and visualization, from basic concepts to advanced techniques.
Scroll combines the simplicity of markdown-style syntax with powerful data transformation and visualization capabilities.
Let's dive in!
Scroll comes with several sample datasets. Let's start with the famous iris dataset:
iris
printTable
sepal_length | sepal_width | petal_length | petal_width | species |
---|---|---|---|---|
6.1 | 3 | 4.9 | 1.8 | virginica |
5.6 | 2.7 | 4.2 | 1.3 | versicolor |
5.6 | 2.8 | 4.9 | 2 | virginica |
6.2 | 2.8 | 4.8 | 1.8 | virginica |
7.7 | 3.8 | 6.7 | 2.2 | virginica |
5.3 | 3.7 | 1.5 | 0.2 | setosa |
6.2 | 3.4 | 5.4 | 2.3 | virginica |
4.9 | 2.5 | 4.5 | 1.7 | virginica |
5.1 | 3.5 | 1.4 | 0.2 | setosa |
5 | 3.4 | 1.5 | 0.2 | setosa |
You can also load datasets from Vega's collection:
sampleData zipcodes.csv
limit 0 5
printTable
[Figure: line chart "Maximum Temperature in Seattle" (x: date, y: temp_max)]
Bar Charts
Let's create a bar chart showing precipitation:
sampleData seattle-weather.csv
groupBy weather
reduce precipitation mean precip_avg
barchart
x weather
y precip_avg
fill teal
title Average Precipitation by Weather Type
Part 3: Advanced Data Transformations
Grouping and Aggregation
Let's look at some more complex transformations:
sampleData weather.csv
groupBy weather
reduce temp_max mean avg_max_temp
reduce temp_min mean avg_min_temp
orderBy -avg_max_temp
printTable
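For readers coming from a general-purpose language, the grouping pipeline above can be sketched in plain Python. This is only an illustration of the steps as described (group by `weather`, average two columns, sort descending); the rows are invented sample values rather than the contents of weather.csv, and `summarize` is a hypothetical helper, not part of Scroll.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows standing in for weather.csv (values are invented).
rows = [
    {"weather": "sun", "temp_max": 20.0, "temp_min": 10.0},
    {"weather": "sun", "temp_max": 24.0, "temp_min": 12.0},
    {"weather": "rain", "temp_max": 12.0, "temp_min": 6.0},
    {"weather": "rain", "temp_max": 14.0, "temp_min": 8.0},
    {"weather": "snow", "temp_max": 2.0, "temp_min": -3.0},
]

def summarize(rows):
    """Group rows by weather, average both temperatures, sort by avg_max_temp descending."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["weather"]].append(row)
    summary = [
        {
            "weather": weather,
            "avg_max_temp": mean(r["temp_max"] for r in members),
            "avg_min_temp": mean(r["temp_min"] for r in members),
        }
        for weather, members in groups.items()
    ]
    # Assumption: the leading "-" in `orderBy -avg_max_temp` means descending order.
    return sorted(summary, key=lambda r: r["avg_max_temp"], reverse=True)

result = summarize(rows)
for row in result:
    print(row)
```

Each `reduce` line in the Scroll snippet corresponds to one averaged field in the summary row, which is why adding another aggregate is a one-line change in either form.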
Creating New Columns
Let's add some computed columns:
iris
compute ratio {sepal_length}/{sepal_width}
where ratio > 2
printTable
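To make the compute-then-filter step concrete, here is a rough Python sketch applied to the ten iris rows printed earlier in this tutorial. It assumes `compute` adds a new column evaluated per row and `where` keeps only the rows that satisfy the condition; `with_ratio` is a hypothetical helper name.

```python
# The ten iris rows shown earlier: (sepal_length, sepal_width, species).
rows = [
    (6.1, 3.0, "virginica"),
    (5.6, 2.7, "versicolor"),
    (5.6, 2.8, "virginica"),
    (6.2, 2.8, "virginica"),
    (7.7, 3.8, "virginica"),
    (5.3, 3.7, "setosa"),
    (6.2, 3.4, "virginica"),
    (4.9, 2.5, "virginica"),
    (5.1, 3.5, "setosa"),
    (5.0, 3.4, "setosa"),
]

def with_ratio(rows):
    """Append ratio = sepal_length / sepal_width to every row."""
    return [(length, width, species, length / width) for length, width, species in rows]

# Keep only rows whose ratio is strictly greater than 2.
kept = [row for row in with_ratio(rows) if row[3] > 2]

for length, width, species, ratio in kept:
    print(length, width, species, round(ratio, 3))
```

Note that the comparison is strict, so a row whose ratio is exactly 2 (5.6 / 2.8) is excluded.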
Part 4: Advanced Visualizations
Heatmaps
Let's create a heatmap of annual precipitation values:
sampleData seattle-weather.csv
splitYear
groupBy year
reduce precipitation mean precipitation_mean
select year precipitation_mean
transpose
heatrix
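The data shaping before the heatmap can be pictured with a small Python sketch. It rests on two assumptions not confirmed by this tutorial: that `splitYear` derives a `year` column from each row's date, and that `transpose` swaps rows and columns. The rows are invented, and `yearly_means` and `transpose` are hypothetical helpers.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (date, precipitation) rows; values are invented for illustration.
rows = [
    ("2012-01-01", 0.0),
    ("2012-06-15", 4.0),
    ("2013-02-03", 2.0),
    ("2013-11-20", 6.0),
]

def yearly_means(rows):
    """Return (year, mean precipitation) pairs, one per year, in ascending year order."""
    by_year = defaultdict(list)
    for date, precipitation in rows:
        year = int(date[:4])  # assumed effect of splitYear on an ISO-style date
        by_year[year].append(precipitation)
    return [(year, mean(values)) for year, values in sorted(by_year.items())]

def transpose(table):
    """Swap rows and columns of a rectangular table."""
    return [list(column) for column in zip(*table)]

table = yearly_means(rows)   # one row per year
matrix = transpose(table)    # one row of years, one row of means
print(matrix)
```

Under these assumptions the transposed result is a wide, two-row layout, which is the kind of matrix a heatmap can color cell by cell.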
Multiple Views
You can create multiple visualizations from the same dataset:
iris
scatterplot
x sepal_length
y sepal_width
fill species
barchart
x species
y sepal_length
fill teal
title Sepal Length by Species
Conclusion
This tutorial covered the basics of data science with Scroll.
⁂