The write-up below isn't exactly the same as the one found in the Data Sketches book. For the book we've tightened up our wording, added more explanations, extra images, separated out lessons and more.
I was reading the quarterly magazine of the Dutch branch of the World Wildlife Fund (WWF), which I get for being a donor. Suddenly I wanted to make a visualization that related to something the WWF might do. I "pitched" the idea to Shirley and we went back and forth a bit on what general topic would work for both of us. And then Shirley found her angle, the data visualization survey that had just come out, which gave us our topic of "community".
At the start of April I asked Twitter for help on datasets that one might associate with the WWF and got a whole lot of links (thank you very much dear followers!). However, due to being in the US the entire month, doing conferences & meet-ups and creating the presentation that Shirley and I gave at OpenVisConf in Boston on April 24th (which was another amazing conference!), I didn't get to do anything with the links until I was waiting at the gate of my flight home to Amsterdam on April 26th.
I received a lot of tracking data links, either animal or buoy in the water. But I noticed that the search functionalities of these data repositories are aimed at researchers. I could search datasets based on the id of a paper or name of a scientist. But I couldn't request all tracking data of, say, whales... Another type of dataset that was very prevalent was the choropleth: filled regions on a map, representing things such as protected areas or animal habitation zones.
I started to meander through the links, going down the "Earth" related datasets, and I don't know how I got there, but at some point I found myself on the website of NOAA STAR, the Center for Satellite Applications and Research. Again, just clicking around, and then I came across an image of the Earth, colored by vegetation health. STAR calls it "No noise (smoothed) Normalized Difference Vegetation Index (SMN)" or "Greenness" for short :). This is what STAR had to say about the data: "[Greenness] can be used to estimate the start and senescence of vegetation, start of the growing season, phenological phases. For areas without vegetation (desert, high mountains, etc.), the displayed values characterize surface conditions."
And they had a map like this for every week in the year of 2016. There was also an option that showed all the maps in a very rough animation (as in, going through 52 images within a minute or two). Even though the animation was crude, and the color palette, well, not optimal, I really liked seeing the changes of vegetation health throughout the year. I wanted to visualize the same thing, but do it in my own style. To show a continuously "breathing Earth".
Furthermore, after seeing what Dominikus Baur did with the Pixi.JS library, while we were both presenting at the INCH conference last March, I knew I wanted to try it out as well. I just needed a project where I had to visualize a lot of points. And this idea seemed like the perfect opportunity to give Pixi a try, since there are often thousands if not millions of "pixels" in an image.
I was very happy to see that STAR also shared the data behind the images. However, I had never worked with these levels of sophisticated geo-data before: hdf and GeoTiff files. Thankfully, I had just seen the wonderful presentation of Rob Simmon on GDAL (the Geospatial Data Abstraction Library) at OpenVisConf. And according to Google, GDAL should be able to open these kinds of files. I followed the quick and hassle-free installation steps as outlined in Rob's blog. However, instead of trying to parse the files in the command-line, which Rob's talk was about, I took to Google again to see if there was an R package instead. And of course there was! The appropriately named rgdal.
After also getting rgdal to work (you can read my steps at the top of this R file preparing the data), my next few hours were filled with understanding how to read in a GeoTiff file, what it contained, how I could play with it and finally how I could map it (& how to switch map projections!). These blogs really helped to understand the different aspects of working with the data: 1, 2 and 3 all from the wonderful website of neon data skills. My first goal was to recreate one of the images from the STAR website, so I knew I had done and understood the steps. It took about 6-8 hours, but I have to admit, even with the sub-optimal color palette, I think the image below is just amazing in its detailed nature (just check out the bigger version by clicking on the image below).
Great, but these images were about 22 million pixels/datapoints, per week! There was no way I could load that amount of data into the browser, and do that 52 times. I therefore had to create lower resolution files. I did some tests and eventually reducing the resolution to about 50 000 (non-water representing) pixels looked like a good middle ground. Small enough for the browser to handle/read, but high enough to still see interesting details.
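For those who like to see the idea in code: below is a minimal sketch of one way to reduce the resolution, by averaging blocks of neighbouring pixels. It is written in TypeScript rather than the R used for the actual data preparation, and the function name `downsample`, the `factor` parameter and the choice to mark water pixels as `NaN` are all assumptions made for this illustration, not taken from the real script.

```typescript
// Downsample a row-major grid by averaging factor x factor blocks.
// Assumption: water / missing pixels are stored as NaN and are ignored.
function downsample(
  grid: number[],
  width: number,
  height: number,
  factor: number
): { values: number[]; width: number; height: number } {
  const outW = Math.floor(width / factor);
  const outH = Math.floor(height / factor);
  const values: number[] = [];
  for (let by = 0; by < outH; by++) {
    for (let bx = 0; bx < outW; bx++) {
      let sum = 0;
      let count = 0;
      for (let dy = 0; dy < factor; dy++) {
        for (let dx = 0; dx < factor; dx++) {
          const v = grid[(by * factor + dy) * width + (bx * factor + dx)];
          if (!isNaN(v)) {
            sum += v;
            count++;
          }
        }
      }
      // A block that is entirely water stays NaN so it can be dropped later.
      values.push(count > 0 ? sum / count : NaN);
    }
  }
  return { values: values, width: outW, height: outH };
}
```

A larger `factor` gives fewer, coarser points; the trade-off between file size and visible detail still has to be judged by eye.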
As a note to somebody who I had this particular talk with: Yes, I could've made many highly detailed images and then turned these into an mp4 movie. Maybe that would've given the smallest file size with the highest resolution. But! That wasn't my goal this time! I wanted to learn something completely new; WebGL (or libraries built on top of WebGL) and this seemed like a good, but interesting starting project. Therefore, out of a purist approach, I wanted the eventual visualization to consist of actual numeric data read in and then made visual as thousands of small circles.
My next challenge was to think of a way to save the data in the smallest file(s) possible. Even 52 weeks of 50.000 datapoints each week is going to take several megabytes. During my time trying to understand the file & data I noticed that > 90% of the weeks contained exactly the same number of datapoints. After some more investigation I saw that the x and y locations of these points were exactly the same. The first four weeks in the year had missing data, but they didn't contain any x-y locations that weren't in the other 48 weeks. I therefore made a separate file containing the x and y variable, and 52 separate files only containing 1 variable; the level of vegetation health. The row number would then connect the vegetation health value to the correct x and y location on the map (you can find all the data here). This made the file size of 1 week (i.e. 50.000 values) about 250 kB (I also checked out gzip'ing the files, but I couldn't find a way to then unzip them in the browser. Sad, because the files were reduced to only 35 kB).
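The row-number trick can be sketched in a few lines of TypeScript. This is only an illustration of the idea: the names `MapLocation`, `Dot` and `joinWeek` are made up here, and it assumes that every weekly file has exactly one value per row of the location file (how the weeks with missing data are padded is not shown).

```typescript
// One row of the shared location file.
interface MapLocation {
  x: number;
  y: number;
}

// A location combined with one week's vegetation health value.
interface Dot extends MapLocation {
  value: number;
}

// Join one week's values to the shared locations purely by row index.
function joinWeek(locations: MapLocation[], weekValues: number[]): Dot[] {
  if (locations.length !== weekValues.length) {
    throw new Error("Expected exactly one value per location");
  }
  return locations.map(function (loc, i) {
    return { x: loc.x, y: loc.y, value: weekValues[i] };
  });
}

// Hypothetical mini example with 3 locations and 1 week of values.
const exampleLocations: MapLocation[] = [
  { x: 10, y: 4 },
  { x: 11, y: 4 },
  { x: 10, y: 5 },
];
const exampleDots = joinWeek(exampleLocations, [0.2, 0.35, 0.5]);
```

Because the x and y values are stored only once, each weekly file only has to carry a single column.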
Sketching was super simple this time around, since my idea was very simple. I wanted to turn the image/pixel based data about vegetation health into thousands of circles. And these circles would animate through the 52 weeks of data, giving the idea of pulsation; getting bigger and darker, although more transparent, when more healthy and smaller, more yellow for low values. And it was just going to be these circles, no other types of mapping "markers" such as country borders, our Earth is beautiful in itself :) I really had to do the more design based aspects (colors, sizes, etc.) with all of the actual data on my screen, so I didn't even brainstorm about more details during the sketch. Instead, most of the page below is filled with ideas on how to create the final datasets. How to make the files as small as possible (in the end I created even more minimal versions that I jotted down in the bottom right section of the left page).
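As a rough sketch of how a vegetation health value could drive the size and transparency of a circle, here is a small clamped linear scale in TypeScript, similar in spirit to `d3.scaleLinear().clamp(true)`. The input domain of 0 to 1 and the output ranges are placeholders picked for this example; the real numbers were tuned by eye with the data on screen.

```typescript
// Build a clamped linear scale from an input domain to an output range.
function linearScale(
  domain: [number, number],
  range: [number, number]
): (v: number) => number {
  return function (v: number): number {
    const t = Math.min(1, Math.max(0, (v - domain[0]) / (domain[1] - domain[0])));
    return range[0] + t * (range[1] - range[0]);
  };
}

// Hypothetical settings: healthier vegetation gives a bigger,
// but more transparent, circle.
const radiusScale = linearScale([0, 1], [0.5, 3]);
const opacityScale = linearScale([0, 1], [1, 0.5]);
```

A color scale from yellow to dark green would follow the same pattern, interpolating each color channel instead of a single number.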
I started out getting the data on the screen with HTML5 canvas. I knew that d3's standard approach of SVGs was definitely going to fail here, since I also wanted to implement transitions. Therefore, I didn't even try SVGs (however, I always use d3 for things such as scales, colors, the stuff to prepare the visual). Thankfully, canvas is quite straightforward, having done a few other projects in canvas over the past year (like Marble Butterflies). And I only wanted to plot circles at certain locations, so I got that working for 1 map/week's worth of data quite easily. Below are some steps in the process; first just the circles in the right location (all having the same size and color, but differing opacity); adding colors; adding a multiply blend mode and different circle sizes.
I then made a simple interval function that would switch between the 52 maps as fast as it could. So not even animating the whole. And as expected, that took about 2-3 seconds per map. Definitely not a "frame rate" that I could use for natural looking animations.
Therefore, I dove into Pixi. I opened up a whole bunch of examples, especially those that I could find on blockbuilder.org, combining d3 with Pixi, like this one by Irene Ros. I started out using PIXI.Graphics, but let me spare you any more details of the code, in short, it was surprisingly easy to pick up (especially compared to regl and WebGL that I went into later o_O). However, there were some weird pixel rounding "things" going on...
And it was slow...
I couldn't really find a solution to making Pixi faster for my specific case through Google, so I did the next best thing; I asked on Twitter.
And damn, I was so amazed by all of the people providing ideas! Even when I explained more, or asked more, practically all would reply and even share some sandbox examples! I had several conversations going on, some of them focusing on using regl or ThreeJS, but I also got some interesting ideas on Pixi. For example, I learned that to get the fastest performance with Pixi (even when it uses WebGL) you have to use something called "Sprites". You can sort of see this as small "images". This example with bunnies shows it quite well, you can have hundreds of thousands of the same 5 "bunny" images bouncing around. Or this example with the same green square png moving around. But I didn't have images, I had thousands of slightly different circles. But then I got the following tweet from Matt DesLauriers.
Which I tried and it worked and it seemed fast enough! However, when you looked closely the circles weren't very circular. They looked rather pixelated, bummer...
And that's when I decided to give regl a try. I saw a very inspiring presentation by Mikola Lysenko at OpenVisConf featuring bouncing bunnies. And then when Peter Beshai uploaded this block that animates 100.000 points with regl (which is also mesmerizing to look at) I knew it was enough to get started with. A bit later Ricky Reusser sent me the same demo, but coded slightly differently, for those interested.
At first I hoped to get the hang of regl by going through the above (and other) examples. But after an hour or so I acknowledged the fact that I really didn't understand anything yet and that I had to read some introductions to WebGL, GLSL and shaders, hehe - It took a while, but my brain slowly started to wrap its head around the concepts of shaders, fragments, vertices, attributes, uniforms and varyings (some sites that helped me 1, 2, 3, 4 and of course the Book of Shaders (although I only skimmed through the first 2 chapters)).
I started out with this very simple example block by Adam Pearce which creates a triangle. I then slowly started adjusting it, relying heavily on Peter Beshai's block, to show circles on a map. Some things could stump me for quite some time, like opacity not acting like I expected (lower left image)...
It wasn't only Pixi that had its difficulties in placing the circles on the map without strange effects. I got the below interesting circular-ish pattern in regl at some point. I eventually fixed the whole issue by making sure the size of the map would be an exact multiple of the number of points in both the horizontal and vertical direction.
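One possible reading of "an exact multiple" is that every point gets a whole number of pixels in each direction. A minimal TypeScript sketch under that assumption follows; the function name `snapMapSize` and the example numbers (a grid 400 points wide in 1250 pixels of space) are invented for the illustration.

```typescript
// Pick the largest whole-pixel step per point that fits in the available
// space, and size the map as an exact multiple of the number of points.
function snapMapSize(
  available: number,
  pointsAcross: number
): { size: number; step: number } {
  // At least 1 pixel per point, even if that overflows the available space.
  const step = Math.max(1, Math.floor(available / pointsAcross));
  return { size: step * pointsAcross, step: step };
}

// Hypothetical example: 400 points across, 1250 pixels available.
const snappedWidth = snapMapSize(1250, 400);
```

The same function would be applied separately to the height, using the number of points in the vertical direction.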
Well, it took a lot of browsing through example code, but eventually I had a map in regl with circles and opacities (although no "multiply" blending going on, I didn't yet know how to get that working). But again, if I zoomed in, I saw the same pixelated effect, AARGH! And here I was really hoping that regl would not have the same issue as Pixi... I did notice that it was a bit faster in rendering than Pixi though.
While trying to find info on getting anti-aliased circles in regl I came across a snippet that showed that Pixi actually has an "anti alias" setting! And not long before, Alastair Dant made this animated Pixi example that I tested with 50.000 circles which still seemed to work smoothly. These two interesting avenues to explore brought me back to my Pixi based map.
Btw, at some point during the data preparation I made a big change in how I saved the final files. However, I made an error, which gave the result below where the locations are randomly shuffled, oops... Not such an interesting map anymore (*^▽^*)ゞ
Another hour or two of work adjusting the example by Alastair to my data, playing with some anti-aliasing things and I was finally looking at a smoothly changing map, YAY!
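To give an idea of what "smoothly changing" involves: instead of jumping from one week to the next, the value of each circle can be interpolated between two neighbouring weeks on every frame. Below is a minimal TypeScript sketch of that idea. It assumes plain linear interpolation and a non-negative `position`, and the name `valuesAt` is made up; the actual Pixi code may well handle this differently.

```typescript
// Interpolated value for every location at a fractional week position,
// e.g. 3.25 is a quarter of the way from week 3 to week 4.
// Assumption: position >= 0 and all weeks have the same number of values.
function valuesAt(weeks: number[][], position: number): number[] {
  const n = weeks.length;
  const from = Math.floor(position) % n;
  const to = (from + 1) % n; // wrap from the last week back to the first
  const t = position - Math.floor(position);
  const a = weeks[from];
  const b = weeks[to];
  const out: number[] = [];
  for (let i = 0; i < a.length; i++) {
    out.push(a[i] + (b[i] - a[i]) * t);
  }
  return out;
}
```

Increasing `position` a little on every animation frame then gives a continuous loop through the year, since the last week flows back into the first.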
After I got Pixi working I started another Twitter request to help me with the anti-aliasing and "multiply" blend mode in regl (because I had noticed before that regl was faster than Pixi, so I wanted to give it another try). It wasn't long before Yannick Assogba sent a block that was a remix on Peter Beshai's version, but then with circles instead of squares. And Alastair helped out again by making a block that animated a lot of circles using regl with anti-aliasing (looking amazing btw!). And a day before Robert Monfera had sent a block showing how to transition between anti-aliased shapes in regl. These examples increased my understanding of how to tackle the anti-aliasing, which eventually led me to this blog that worked perfectly in my map. Look at those nice circles (even after zooming in):
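For those curious what anti-aliasing a circle comes down to: a common approach is to let the opacity fade out over a thin band at the edge of the circle, instead of switching abruptly from 1 to 0. The TypeScript below mimics on the CPU what a fragment shader would do per pixel, using the same formula as GLSL's built-in `smoothstep`. Whether the linked blog uses exactly this formula is not something checked here, and `circleAlpha` and its `feather` parameter are names invented for this sketch.

```typescript
// Same formula as GLSL's built-in smoothstep(edge0, edge1, x).
function smoothstep(edge0: number, edge1: number, x: number): number {
  const t = Math.min(1, Math.max(0, (x - edge0) / (edge1 - edge0)));
  return t * t * (3 - 2 * t);
}

// Opacity of a pixel at distance `dist` from the centre of a circle.
// `feather` is the (hypothetical) width of the soft edge, in the same units.
function circleAlpha(dist: number, radius: number, feather: number): number {
  return 1 - smoothstep(radius - feather, radius, dist);
}
```

Pixels well inside the circle stay fully opaque, pixels outside are fully transparent, and only the band of width `feather` just inside the radius gets an in-between value.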
Check! That only left the "multiply" blending that was missing from the regl version. But that turned out to be one step too far. After a Twitter DM chat with Robert Monfera I had a collection of websites about WebGL blending functions, premultiplied alpha (don't ask...) and some Pixi source code files (Pixi was able to do "multiply" in WebGL, so maybe I could find a clue there). I was very surprised that I could not find a single example of a multiply blend in WebGL through Google (where the multiply was based on many elements overlapping, not just two predefined images). And I guess I shouldn't be surprised that I couldn't figure it out either, having only started learning about shaders 2 days before, hehe. I did get a lot of interesting other color combinations... Well, actually I did get one result where multiply was working (top right image), but I could not combine that with opacity/circle shapes (which, more than 2 days ago, I would've thought was weird, "opacity is separate from color right?"... well no, not quite in WebGL I've since learned). For those interested, these sites really helped me to understand the different blending functions: 1, 2, and a blending example by Alastair.
At some point the creator of regl, Mikola Lysenko, even started helping me, which was awesome, but we didn't manage to replicate the results that canvas & Pixi were showing. Eventually I had spent enough time experimenting with the blending. I therefore decided to leave it without any blending. The Pixi version was therefore going to be my "main" project site. However, since I had the regl version animating perfectly fine I decided to clean up the page; better color palette, adding titles and such, and link to it from the eventual main page. That way people can compare the different tool-based versions.
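To show why opacity and color get tangled up here, this is the arithmetic behind a "multiply" blend with opacity on an opaque background, written as a small TypeScript function. Per color channel (values between 0 and 1) the result is the backdrop times `(1 - alpha + alpha * source)`. For the color channels, the same result comes out of WebGL's blend equation when the source color is premultiplied by its alpha and the blend factors are `DST_COLOR` for the source and `ONE_MINUS_SRC_ALPHA` for the destination. This is a sketch of the math only; it is not claimed to be what Pixi or any of the regl demos mentioned here do internally, and the names `RGB` and `multiplyBlend` are made up.

```typescript
// Color channels as numbers between 0 and 1.
type RGB = [number, number, number];

// Multiply-blend a source color with the given opacity onto an opaque backdrop.
// alpha = 0 leaves the backdrop untouched, alpha = 1 is a pure multiply.
function multiplyBlend(dst: RGB, src: RGB, alpha: number): RGB {
  const channel = function (d: number, s: number): number {
    return d * (1 - alpha + alpha * s);
  };
  return [
    channel(dst[0], src[0]),
    channel(dst[1], src[1]),
    channel(dst[2], src[2]),
  ];
}
```

Because multiplying can only keep a channel the same or make it darker, overlapping circles get darker where they stack up, which is the effect the canvas and Pixi versions show.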
UPDATE | Apparently Ricky Reusser hadn't given up the "multiply" fight yet, and about a week after I published "A Breathing Earth" he tweeted a regl demo that had both multiply and opacity working (below is what I turned the demo into so I could compare it to, say, Illustrator). How amazing is that! So now I can say that all 3 versions are exactly the same ^_^ and regl is definitely the fastest. I even had to "slow" it down a bit otherwise it would just whoosh through a year in no time.
I had learned many things while going through all the wonderful examples people had sent me. I therefore returned to the canvas version to see if I could make it faster. After a bit of messing around, I managed to get it to switch between maps every second, but that is still too slow for a smooth animation (nonetheless, I also cleaned up the canvas version and added it to the main page for comparison).
Btw, another new thing I implemented for this project is breathe.js. I heard about it when attending the d3.bayArea meetup last April. I noticed that the final map preparation loop was freezing up the browser for ±10 seconds. And with some very minor changes using breathe.js that wasn't happening anymore, without it seeming to take longer to prepare the data.
Eventually I added a bit more text to the main final version (and a legend), keeping it very minimal so the focus is on the map. And quick links to the minimal versions that use canvas, Pixi and regl. For those interested, I've been timing myself since January, and this project took me 57 hours to complete (from ideation, data, sketching and coding (3x, once for each "tool")), of which I'm guessing the regl "multiply" stuff took at least 10 hours...
Having chosen the regl library to work with WebGL I decided not to go into ThreeJS as well. It was just too much to handle in one week, so many new programming libraries, hehe. Nonetheless, I still want to share these 3 examples that show the concept of changing circles that I received through Twitter: a block with 50.000 circles animated with WebGL custom made by Robin Houston and two codepens with 50.000 animated circles, one using circle "pentagons" in ThreeJS and another version using DataTextures in ThreeJS both custom made by Matt DesLauriers.
This was a very technical project for me. Sure, the visual in itself isn't so out of the box as the typical Data Sketches project, it's been done before in different ways, but as I often tell others, you can't expect to create wonderful, crazy new things when you're just starting out with a tool. Instead, I haven't learned so many new (coding) languages/libraries within a week, since, well, maybe ever. And I couldn't have done it without the help of a lot of people. I would specifically like to thank Robert Monfera, Alastair Dant, Ricky Reusser, Matt DesLauriers, Peter Beshai, Mikola Lysenko, Yannick Assogba, Robin Houston, Amelia Bellamy-Royds, Jan Willem Tulp, Mike Brondbjerg, Paul Murray, and Mathieu Henri who have all helped me in different areas of either Pixi, regl, ThreeJS and more. I don't think I would've been able to get my map(s) working without the ideas and examples they shared. THANK YOU!