Matthew Petroff

Preliminary Color Cycle Set Ranking Results
Wed, 15 May 2019 21:30:17 +0000

Since I launched my color cycle survey in December, it has collected ~9.7k responses across ~800 user sessions. Although the responses are not as numerous as I’d like, there’s currently enough data for preliminary analysis. The data are split between sets of six, eight, and ten colors with ratios of approximately 2:2:1; there are fewer ten-color color set responses as I disabled that portion of the survey months ago, to more quickly record six- and eight-color color set responses. So far, I’ve focused on analyzing the set ranking of the six-color color sets, for which there are ~4k responses, using artificial neural networks. The gist of the problem is to use the survey’s pair-wise responses to train a neural network such that it can rank 10k previously-generated color sets; these color sets each have a minimum perceptual distance between colors, both with and without color vision deficiency simulations applied.

➡ Click Here to Take Color Cycle Survey ⬅

As inputs with identical structure are being compared, a network architecture that is invariant to input order, i.e., one that produces identical output for inputs (A, B) and (B, A), is desirable. Conjoined neural networks[1] satisfy this property; they consist of two identical neural networks with shared weights, the outputs of which are combined to produce a single result. In this case, each network takes a single color set as input and produces a single scalar output, a “score” for the input color set. The two scores are then compared, with the better scoring color set of the input pair chosen as the preferred set; put more concretely, the difference of the two scores is computed and used to calculate binary cross-entropy during network training. The architecture of the network appears in the figure below and contains 2077 trainable parameters.

Artificial Neural Network Architecture Diagram

Each color set consists of six colors, which are each encoded in the perceptually-uniform CAM02-UCS colorspace, with J encoding the lightness and a and b encoding the chromacity. The first two layers of the network are used to fit an optimal encoding to each of the color inputs; this is achieved by using a pair of three-neuron fully-connected layers for each of the six colors, with network weights shared between each sub-layer. The outputs of these color-encoding layers are then concatenated and fed to two more fully-connected layers, consisting of thirty-six neurons each. A final fully-connected layer consisting of a single neuron is then used to produce a single scalar output. The entire network is then duplicated for the second color set being compared, and the difference between the two outputs is computed. Exponential linear unit (ELU) activation functions are used on the interior layers, and a sigmoid activation function is used on the final layer of each network.
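The layer sizes described above can be checked against the stated total of 2077 trainable parameters. This is only a back-of-the-envelope sketch: it assumes every fully-connected layer has one bias per neuron and that the shared color-encoding weights are counted once. The helper name `dense_params` is made up for illustration.

```python
def dense_params(n_in: int, n_out: int) -> int:
    """Weights plus one bias per output neuron (assumed) for a dense layer."""
    return n_in * n_out + n_out


# Two three-neuron layers encode each (J, a, b) color. The weights are shared
# across the six colors, so they are counted only once.
color_encoder = dense_params(3, 3) + dense_params(3, 3)

# The six encoded colors are concatenated into an 18-element vector and fed
# to two 36-neuron layers.
set_encoder = dense_params(6 * 3, 36) + dense_params(36, 36)

# Final single-neuron layer that produces the scalar score.
score_layer = dense_params(36, 1)

total_params = color_encoder + set_encoder + score_layer
print(total_params)  # 2077
```

Under these assumptions the count comes out to 24 + 2016 + 37 = 2077, which matches the figure quoted for the network.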

The colors in each color set are ordered by hue, then chromacity, then lightness. This is a sensible ordering, but since hue is cyclic, the starting color is fairly arbitrary. Thus, before training the network, the data are augmented by performing cyclic shifts on the ordering of the six colors in each set. As this augmentation is performed on each of the two color sets in each survey response pair, the total training and test data set sizes are augmented by a factor of thirty-six. Prior to data augmentation, the survey response data are split, with 80% used as the training set and 20% used as the test set. In order to reduce overfitting, Gaussian dropout is used on both of the 36-neuron layers, with a rate of 0.4; L2 kernel regularizers are used on all layers, with a penalty of 0.001. The network was implemented using Keras, with the TensorFlow backend, and trained using binary-crossentropy and the Nesterov Adam optimizer, using default optimizer parameters.
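The cyclic-shift augmentation can be sketched in a few lines. This is an illustrative sketch rather than the survey analysis code; the function names and the placeholder color labels are invented for the example.

```python
def cyclic_shifts(colors):
    """Return every rotation of a sequence, starting with the original order."""
    return [colors[i:] + colors[:i] for i in range(len(colors))]


def augment_pair(set_a, set_b):
    """Pair every rotation of the first set with every rotation of the second."""
    return [(a, b) for a in cyclic_shifts(set_a) for b in cyclic_shifts(set_b)]


# Hypothetical placeholder labels standing in for six (J, a, b) colors.
set_a = ["a0", "a1", "a2", "a3", "a4", "a5"]
set_b = ["b0", "b1", "b2", "b3", "b4", "b5"]

augmented = augment_pair(set_a, set_b)
print(len(augmented))  # 36
```

Six rotations of each set give 6 × 6 = 36 augmented pairs per survey response, which is where the factor of thirty-six comes from.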

Unfortunately, training this network proved to be problematic, with it often converging into a local minimum with a loss of 0.6931 ≈ -ln(0.5) = ln(2); the network was learning to ignore the inputs and always produce the same output, resulting in an output of zero from the conjoined network. Previous work with conjoined networks did not run into this problem, since either higher dimensionality output was used to compute a similarity metric[2] or non-binary training data were used.[3] To resolve this issue, the output comparison was removed as well as the last fully-connected layer of each network; this was replaced with a single-neuron fully-connected layer with sigmoid activation, joining the two existing networks into a single network with a single output. As this is no longer a conjoined architecture but instead a single network, the input order matters, so the data were additionally augmented such that both orderings of each survey response pair would be used, doubling the number of training and test pairs.
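To see where a loss of ln 2 ≈ 0.6931 comes from, here is a minimal sketch of a pairwise binary cross-entropy. It assumes the score difference is turned into a probability with a logistic function; the post does not spell out that mapping, so treat it as an assumption, and the function names are illustrative.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def pairwise_loss(score_a: float, score_b: float, a_preferred: bool) -> float:
    """Binary cross-entropy on the logistic of the score difference (assumed)."""
    p_a = sigmoid(score_a - score_b)
    return -math.log(p_a if a_preferred else 1.0 - p_a)


# A network that ignores its input gives both sets the same score, so the
# difference is zero and the predicted probability is always 0.5.
stuck = pairwise_loss(0.5, 0.5, True)
print(round(stuck, 4))  # 0.6931

# Once the scores separate in the right direction, the loss drops below ln 2.
informed = pairwise_loss(0.9, 0.1, True)
```

With identical scores the loss is ln 2 regardless of which set was actually preferred, so the network gains nothing from the labels while it sits in that minimum.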

With this change, the network could be successfully trained. However, this new network only worked with pair-wise data, which was troublesome. The 10k color sets to be ranked can be paired close to fifty million ways, which grows to more than three billion inputs to evaluate once the data augmentation is applied. The conjoined network, however, requires only 60k evaluations for the ranking, since a single instance of the network, without the output comparison, can be used to directly score a given color set. Thus, a hybrid approach was devised. The single-output non-conjoined network was first trained for fifty epochs. Its last layer was then removed, and the change to the original conjoined network was undone, but the existing training weights were kept. This partially pre-trained conjoined network was then trained for an additional fifty epochs. Due to the pre-training, the conjoined network no longer became stuck in the local minimum, allowing the advantages of the conjoined network to be reaped, while avoiding the training dilemma.
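The evaluation counts quoted above can be reproduced with a little arithmetic. The sketch assumes the pairwise total includes both the thirty-six cyclic-shift combinations and the two input orderings, which is the reading that lands above three billion.

```python
import math

n_sets = 10_000
shifts = 6  # cyclic shifts per six-color set

# Unordered pairs of color sets.
unordered_pairs = math.comb(n_sets, 2)

# Pairwise network: 6 x 6 shift combinations, times two input orderings.
pairwise_inputs = unordered_pairs * shifts * shifts * 2

# Conjoined network: each set is scored on its own, once per cyclic shift.
single_set_inputs = n_sets * shifts

print(unordered_pairs, pairwise_inputs, single_set_inputs)
# 49995000 3599640000 60000
```

The gap of almost five orders of magnitude is what makes scoring sets individually so much more attractive for the final ranking.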

Since the training data only very sparsely cover the space of possible pairing and since the network does not always train consistently well, I decided it was best to train an ensemble of model instances. To this end, I trained the model 100 times, chose the best fifty instances as determined by the metric training accuracy + test accuracy - abs(training accuracy - test accuracy), calculated scores for each of the 10k color sets using these fifty trained model instances, and averaged the resulting scores for each color set. For both the training and test sets, the average accuracy was 58%. While considerably better than guessing randomly, it does seem a bit low at first glance. However, many of the color sets are similar and aesthetic preference is subjective, so perfect accuracy isn’t possible. To approximate an upper limit on achievable accuracy, I created a modified version of the color cycle survey that always presents the same six-color color sets in the same order and then entered 100 responses on each of two consecutive days; 83 / 100 of my answers were consistent for the color set preference between the two days. Thus, I think 80% is a conservative upper limit on possible accuracy; including aesthetic preference differences between individuals, I think ~70% is a more practical upper limit for achievable accuracy.
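The selection and averaging step might look like the following sketch. The data layout (a list of `(training accuracy, test accuracy, scores)` tuples) and the toy numbers are invented for illustration; only the metric itself comes from the text.

```python
def selection_metric(train_acc: float, test_acc: float) -> float:
    """Metric from the text; algebraically equal to 2 * min(train, test)."""
    return train_acc + test_acc - abs(train_acc - test_acc)


def ensemble_scores(runs, keep):
    """Average per-set scores over the `keep` best runs.

    `runs` is a list of (train_acc, test_acc, scores) tuples, where `scores`
    holds one score per color set (hypothetical layout).
    """
    ranked = sorted(runs, key=lambda r: selection_metric(r[0], r[1]), reverse=True)
    best = ranked[:keep]
    n_sets = len(best[0][2])
    return [sum(run[2][i] for run in best) / len(best) for i in range(n_sets)]


# Toy example: three runs scoring two color sets, keeping the best two.
runs = [
    (0.60, 0.58, [0.25, 0.75]),
    (0.70, 0.50, [1.00, 0.00]),  # large train/test gap, so it ranks last
    (0.59, 0.59, [0.50, 0.50]),
]
averaged = ensemble_scores(runs, keep=2)
print(averaged)  # [0.375, 0.625]
```

Because a + b - |a - b| is just twice the smaller of the two values, the metric rewards runs whose worse accuracy is high, which penalizes instances that overfit the training set.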

A few variants of the network were evaluated, such as increasing or decreasing the number of layers or the size of the layers, as well as changing the activation functions. Adding additional layers or increasing the size of the existing layers did not appear to have an effect on the accuracy; removing one each of the color encoding and set encoding layers only led to at most a marginal decrease in accuracy. Using rectified linear unit (ReLU) activations on the interior layers led to marginally decreased accuracy. Adjusting the Gaussian dropout rate by 0.1 or 0.2 had little effect, and Gaussian dropout seems to work slightly better than standard dropout. Originally, a hue-chromacity-luminance representation was used for the color inputs, as is used to sort the input color order, but this had noticeably decreased accuracy; I suspect that the cyclic nature of hue values was the source of this reduced accuracy.

In addition to making the results more stable, this ensemble also allows for estimating the uncertainty between training runs; the plot below shows the average color set scores as a function of rank, with a 1-sigma error band.

This shows that, according to the model, while the best color sets are definitely better than the worst color sets, color sets that are close in ranking are not necessarily any better or worse than the hundreds of color sets with similar rankings. Given the sparsity of the input data, this result is not surprising. The results can also be evaluated qualitatively; the figure below shows the fifteen lowest ranked color sets on the left and the fifteen highest ranked color sets on the right.

Ranked Color Sets Visualization

To my eye, the best color sets definitely look better than the worst color sets. The worst sets appear to be darker, more saturated, and generally a bit garish; note that the lightness and color distance limits applied when the color sets were generated excluded the vast majority of truly awful color sets for this evaluation. I find the highest-ranked color set, as well as many of the other highly-ranked color sets, to be quite pleasant; some of the other highly-ranked color sets contain blueish purplish colors that I find to be a bit over-saturated, so there’s definitely still room for improvement.

I hope that this post convincingly shows the validity of the data-driven premise on which the color cycle survey is based. It was certainly a relief to me when I was first able to get test accuracy results consistently above 50%, since it meant there wasn’t an egregious mistake in the survey code; seeing consistent color set rankings between training runs gave further relief, since it showed that the concept was working as I had hoped. Moving forward, I plan to next consider color cycle ordering for the six-color color sets. The initial plan is to use the same network architecture but to train it with the color cycle ordering responses (three pairs per response); the trained network could then be used to determine an optimal ordering by ranking the 720 possible six-color cycle orderings for a given color set and choosing the highest-ranked ordering. Once I have a workable cycle ordering analysis technique, I’ll apply both the set choice and cycle ordering analyses to the eight-color color set data, which will hopefully be straightforward.
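The planned ordering search is small enough to brute-force, since six colors have only 6! = 720 orderings. The sketch below shows the idea with a placeholder scoring function; `placeholder_score` is purely hypothetical and stands in for a trained network that does not exist yet.

```python
from itertools import permutations


def best_ordering(colors, score):
    """Score every permutation and return the highest-scoring one."""
    return max(permutations(colors), key=score)


def placeholder_score(ordering):
    # Hypothetical stand-in for the trained network: it simply rewards large
    # steps between neighboring entries.
    return sum(abs(a - b) for a, b in zip(ordering, ordering[1:]))


# Integers stand in for the six colors of a set.
colors = (0, 1, 2, 3, 4, 5)
n_orderings = sum(1 for _ in permutations(colors))
best = best_ordering(colors, placeholder_score)
print(n_orderings)  # 720
```

An exhaustive search like this stops being practical quickly as sets grow (ten colors have about 3.6 million orderings), so larger cycles may need a different strategy.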

Another interesting avenue to pursue would be to try to create a single network that can handle various sized color cycles, as this would allow all of the survey results to be used at once and would allow the results to be generalized beyond the number of colors used in the survey; however, I’m not yet sure how to approach this. An additional thought is to devise a metric that combines the network-derived score with some sort of color-nameability criterion, probably derived from the xkcd color survey, and use that to rank the color sets, favoring colors that can more easily be named, instead of just using the network-derived score directly. As I mentioned at the beginning of this post, I’d really like more data with which to improve the analysis; with increased confidence from these preliminary results, I’ll try to further promote the color cycle survey.

If you haven’t yet taken the color cycle survey (or even if you have), please consider taking it:

  1. Bromley, Jane, Isabelle Guyon, Yann LeCun, Eduard Säckinger, and Roopak Shah. “Signature verification using a ‘Siamese’ time delay neural network.” In Advances in neural information processing systems, pp. 737-744. 1994. 

  2. Koch, Gregory, Richard Zemel, and Ruslan Salakhutdinov. “Siamese neural networks for one-shot image recognition.” In ICML deep learning workshop, vol. 2. 2015. 

  3. Burges, Christopher, Tal Shaked, Erin Renshaw, Ari Lazier, Matt Deeds, Nicole Hamilton, and Gregory N. Hullender. “Learning to rank using gradient descent.” In Proceedings of the 22nd International Conference on Machine learning (ICML-05), pp. 89-96. 2005. doi:10.1145/1102351.1102363 

Hilbert Curve Cake
Tue, 02 Apr 2019 03:37:36 +0000

Three years ago, I entered an Ashley Book of Knots Cake into the Johns Hopkins University Sheridan Libraries’ third annual Edible Book Festival. For this year’s contest, I figured I could apply my 3D-printed Hilbert curve microwave absorber research to craft a cake for Hans Sagan’s Space-Filling Curves book[1] on the eponymous topic. Thus began an endeavor involving thermoplastic, silicone, and sugar.

Hilbert curve cake

I saw two ways to make a cake shaped as a Hilbert curve: using an appropriately shaped baking mold or painstakingly carving the appropriate shape out of a baked cake, with the former option being the logical path to pursue. This raised the question: how does one create such a mold? Baking molds are generally either metal or silicone, with silicone having the distinct advantage of being much easier to work with for such a shape, since it can be cast at room temperature. Thus, one needs to create a mold with which to cast the silicone baking mold. Fortunately, 3D-printing is well suited for this, and I already had experience 3D-printing Hilbert curve geometries.

Starting from my existing Hilbert Curve solid models, I designed a two-part mold for a third-order geometric approximation of the Hilbert Curve. Compared to a single-part mold, a two-part mold allows for a thinner silicone wall thickness, which reduces silicone material usage and makes it easier to turn the mold inside-out, a necessary step in removing the eventual cake from the baking mold. This mold was printed from PETG (for no particular reason besides having it around) on a Lulzbot TAZ 6 printer; as the mold is rather large, a printer with a large build volume is necessary. A 1.2 mm nozzle was used to reduce the printing time, a single wall extrusion[2] and 10% infill were used to reduce material usage, and a raft was printed below the part to aid with removal from the printer’s bed. When generating the G-code, a solid layer was added just below the Hilbert curve geometry to ensure that it would print correctly with the low infill percentage.

Hilbert curve cake plastic mold top

Hilbert curve cake plastic mold bottom

Once the mold was assembled, 1 kg of food-safe silicone[3] was mixed, vacuum degassed, poured into the mold, and allowed to cure.

Assembled Hilbert curve cake plastic mold

Hilbert curve cake plastic mold filled with silicone

While I had hoped that the two-part plastic mold would allow the silicone mold to be easily removed once it had cured, this was an incredibly naive notion. After all attempts to carefully disassemble the plastic mold and remove the cured silicone failed, I ended up smashing the plastic mold to bits in order to free the silicone mold.[4]

Remnants of Hilbert curve cake plastic mold after removing cured silicone

I then thoroughly washed the silicone mold and was finally ready to begin baking. To increase my chances of success, I decided to use a recipe meant for a Bundt pan, a lemon pound cake.[5] The recipe is below, courtesy of my mother.

2.5 cups sugar
1 cup butter
4 eggs
1 teaspoon vanilla extract
0.5 teaspoon lemon extract
1 cup milk
1 tablespoon lemon juice
3 cups flour
0.5 teaspoon baking soda

In mixing bowl, cream together sugar and butter until light and fluffy. Add eggs one at a time, beating well after each. Stir in vanilla and lemon extracts. Mix lemon juice into milk. Thoroughly sift together the flour and baking soda. Add flour mixture to creamed mixture alternately with milk solution, beating well after each addition. Pour into greased mold.

The silicone baking mold was placed into an 8″ square cake pan for support[6] and thoroughly greased with shortening; while silicone baking molds don’t need to be greased, in theory, I decided it was best to do so if I were to have any chance of removing the complicated cake geometry from the mold in one piece.

Greased Hilbert curve cake mold

Filling Hilbert curve cake mold with batter

Hilbert curve cake mold filled with batter

The cake was then baked at 350°F for 135 minutes, with a pan of water also in the oven to try to prevent the top crust from hardening too much. As the mold does not have a hole in the center like a Bundt pan does, this baking time is considerably longer than what was specified by the original recipe. I learned this the hard way, since my first attempt came out under-baked, ruining the curve geometry. Once the cake was out of the oven, I allowed it to cool for half an hour before placing it in the freezer for seven hours; I reasoned that a frozen cake would be the least likely to break apart while attempting to remove it from the mold. Once I removed the frozen cake from the freezer, I was able to flip it over and slowly and carefully turn the silicone mold inside-out to remove the cake in one piece, starting from the corners where the curve does not end.[7]

Baked Hilbert curve cake in mold

Removing Hilbert curve cake from mold

I then flipped the cake back over and sliced the bottom flat.

Slicing bottom off of Hilbert curve cake

Hilbert curve cake before decoration

Finally, the cake was ready for its finishing touch, a lemon glaze, mixed from a ratio of one cup confectioner’s sugar to two tablespoons lemon juice. I carefully and methodically brushed on four coats of glaze using a silicone basting brush, completing the cake.

Hilbert curve cake

Hilbert curve cake

Hilbert curve cake at Edible Book Festival

Sadly, I didn’t win anything, just like last time, but there was stiff competition from many excellent cakes. The model files from the mold are available.

  1. H. Sagan, Space-Filling Curves (Springer-Verlag, 1994). ISBN: 9780387942650. DOI: 10.1007/978-1-4612-0871-6

  2. This led to some gaps in the part’s wall, which allowed some silicone to leak into the interior of the mold, making removal more difficult. 

  3. Smooth-On Smooth-Sil 940 

  4. The single wall extrusion helped here, as did the single bottom layer. 

  5. I also like the taste. 

  6. By happenstance, it fit almost perfectly; I had sized the mold based on silicone and available print volumes. 

  7. One of the ends of the curve did break off partially, but it was easy to reattach. 

3D-Printed Tea Bag Holder
Fri, 01 Mar 2019 00:16:29 +0000

When readily available containers do not come in the desired form factor, 3D-printing can be quite useful. In this case, I wanted a tea bag holder that fit on a small ledge, allowed the tea bag labels to be read, and allowed the tea bags to be easily removed. Although there are some similar products available commercially that would fit the space, they either seemed a bit flimsy or looked to be a tight fit around the tea bags, which would make them more difficult to remove. Thus, I designed and 3D-printed a modular holder that can be stacked. The holder was printed out of PLA, and the design files are available.

Tea Bag Holder

Color Cycle Survey
Fri, 07 Dec 2018 03:23:01 +0000

A previous post about randomly generating color sets with a minimum perceptual distance addresses the technical aspects of generating sets of colors that are visually distinct for those with normal color vision as well as for those with color vision deficiencies. However, it does not address the aesthetic aspect, which I will start to address here. To create an aesthetically pleasing color cycle, an ordered set of colors for visualizing categorical data, two aspects need to be addressed: the colors that are used and the order that they are used in. While one could take an ontological approach to this by trying to define a set of rules that make a pleasing color cycle, as is done by I Want Hue,[1] such a method is error-prone and substantially biased toward the personal preferences of the drafter of the rules. Instead of using ontologies, an alternative approach that is gaining traction in many fields is to infer a pattern from a large data set using machine learning techniques. This is the approach I wish to pursue.

To this end, I’ve created a Color Cycle Survey. After presenting the user with an introduction, colorblindness questionnaire, and directions, the primary survey starts. In it, the user is presented with two color sets and is asked to choose the one that is, in the user’s opinion, more aesthetically pleasing. Then, four orderings of the chosen set are displayed, and the user again makes a selection to taste. This basic process is then repeated again and again, with sets of either six, eight, or ten colors. For the choice of color set, each set is presented ordered by hue, since this makes the two sets easier to compare than if they were randomly ordered (or ordered by RGB values). Only two sets are presented, to make for an easier choice. Additionally, a line plot or scatter plot rendering is shown with each color set. For the choice of ordering, four orderings are presented; because the colors are all the same, multiple orderings are easier to compare than multiple sets. I considered asking the user to order the colors to taste, instead of presenting possible orderings, but I decided that while such an approach yields more information per response, it takes much longer and requires more effort, so each user will likely respond many fewer times. Thus, I went with the simpler approach.

If all goes well, I’ll amass a sizable data set from the survey, which will remain available for at least a few months. Once I have data to experiment with, I’ll work out the exact analysis method. In addition to generating an “optimal” color cycle, it would also be interesting to create a model that allows for additional constraints, such as being able to choose the first color or being able to choose the exact number of colors in the cycle. Once anonymized, I’ll release the survey data under a permissive license, probably CC BY 4.0 (I’m open to suggestions). Any generated color cycles will be released into the public domain via the CC0 public domain dedication.

  1. I Want Hue takes a fairly rudimentary approach to color vision deficiency simulation, which I find lacking; I personally have difficulty differentiating colors in many of the sets it generates, even when its colorblind mode is turned on. It also doesn’t really address the ordering of the color set into a cycle.

Randomly Generating Color Sets with a Minimum Perceptual Distance
Sat, 27 Oct 2018 16:14:11 +0000

Earlier this year, I released a color cycle picker that enforces a minimum perceptual distance between colors, including color vision deficiency simulations, with the goal of creating a better color cycle to replace the “category 10” color palette used by default in Matplotlib, along with other data visualization packages. While the picker works well for what it was designed for (allowing a user to create a color cycle), it requires user intervention to create color sets or cycles.[1] The basic technique is still valid for the random generation of color sets: color vision deficiency simulations[2] are performed for various types of deficiencies, and a minimum perceptual difference is enforced for the simulated colors using the CAM02-UCS[3] perceptually uniform color space (where each type of deficiency is treated separately), along with a minimum lightness distance (for grayscale). It just needs to be extended to randomly sample the color space.

A few different color sets

To randomly sample the available RGB color space, I started with the excellent Colorspacious Python library, which is capable of doing the requisite color vision deficiency simulations and perceptual distance calculations. However, it’s too slow for what I wanted to accomplish. Thus, I stripped the library down to the bare essentials and optimized it with the Numba JIT compiler. Since RGB to CAM02-UCS conversions are computationally expensive, but the 16.8 million possible 8-bit RGB colors easily fit in memory, the CAM02-UCS colors are precomputed for every possible color, both for normal color vision and the three types of color vision deficiency. Since very dark and very light colors are poor choices for data visualization, only colors with J ∈ [40, 90] are used, leaving 13.1 million colors to sample from.
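A rough sizing of such lookup tables shows why precomputing everything is feasible. This is a sketch under stated assumptions: one table per vision type (four in total), three 32-bit floats per color, and a simple bit-packed index. None of these details come from the actual code, and the helper names are made up.

```python
def rgb_to_index(r: int, g: int, b: int) -> int:
    """Pack an 8-bit RGB triplet into a single table index (assumed layout)."""
    return (r << 16) | (g << 8) | b


def index_to_rgb(index: int) -> tuple:
    """Invert rgb_to_index."""
    return (index >> 16) & 0xFF, (index >> 8) & 0xFF, index & 0xFF


n_colors = 256 ** 3      # every possible 8-bit RGB color
n_tables = 4             # normal vision plus three deficiency types
bytes_per_color = 3 * 4  # J, a, b stored as 32-bit floats (assumed dtype)

table_bytes = n_colors * n_tables * bytes_per_color
print(n_colors, table_bytes / 2**20)  # 16777216 768.0
```

Under these assumptions the tables take 768 MiB, which fits comfortably in memory on a typical workstation; 64-bit floats would double that.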

To generate a color set, a starting color is chosen at random. Then, each possible color is checked to see if it is far enough away in both lightness and perceptual distance, both for normal color vision and for those with color vision deficiency, at the maximum chosen color vision deficiency severity. Of these remaining colors, one is chosen at random. The process is then repeated until the color set contains the desired number of colors. This method has an advantage over rejection sampling, since it is guaranteed to return and was found to be faster. After the color set is generated, it is checked at intermediate levels of color vision deficiency severity to ensure that the minimum perceptual distance requirement is met there as well; if the distance requirement is not met, the color set is thrown out. Checking a coarse color vision deficiency interval during set generation was tried but removed, since the performance penalty outweighs the gains from having to try again fewer times. With this method in place, it is now possible to randomly generate color sets of various sizes that meet various minimum perceptual distance and minimum lightness distance requirements. However, substantial computational resources are required to generate a large number of color sets.
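The core loop of this procedure can be sketched as follows. To keep the example self-contained, it uses a toy grid of points and plain Euclidean distance as a stand-in for the CAM02-UCS perceptual distance, and it leaves out the lightness check, the color vision deficiency simulations, and the intermediate-severity check. The function names are hypothetical.

```python
import math
import random
from itertools import combinations


def generate_set(candidates, n_colors, min_dist, rng):
    """Pick colors one at a time, pruning candidates that are too close."""
    remaining = list(candidates)
    chosen = []
    while len(chosen) < n_colors:
        if not remaining:
            return None  # ran out of candidates; the caller can retry
        pick = rng.choice(remaining)
        chosen.append(pick)
        # Earlier picks were already enforced, so only the new pick needs to
        # be checked; this also removes the pick itself (distance zero).
        remaining = [c for c in remaining if math.dist(c, pick) >= min_dist]
    return chosen


def min_pairwise_distance(colors):
    return min(math.dist(a, b) for a, b in combinations(colors, 2))


# Toy candidate space: an 11 x 11 x 11 grid standing in for the color table.
grid = range(0, 101, 10)
candidates = [(x, y, z) for x in grid for y in grid for z in grid]

rng = random.Random(0)
color_set = generate_set(candidates, n_colors=6, min_dist=20, rng=rng)
```

Unlike rejection sampling, each step draws only from candidates that are already known to be valid, so the loop always ends after at most `n_colors` picks, either with a full set or with an explicit failure.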

Using this code, I’ve generated six-, eight-, and ten-color sets with what I think are reasonable minimum perceptual and lightness distances, where reasonable means that the colors are easy enough to tell apart while still allowing a reasonably large range of different colors to be used. Full deuteranopia, protanopia, and tritanopia simulations were used. For each configuration, 10 000 random sets were generated on a 28-core machine, a process that took from around nine hours for the six-color configuration to around three days for the ten-color configuration. The code and generated color sets are available in a repository on GitHub.

While the individual colors in the color sets are easy enough to tell apart, the colors and their combinations are not necessarily aesthetically pleasing. I’m currently working on something to address this shortcoming; details will follow in a subsequent blog post.

  1. A color set doesn’t have a defined order, while a color cycle does. 

  2. G. M. Machado, M. M. Oliveira, and L. A. F. Fernandes, “A Physiologically-based Model for Simulation of Color Vision Deficiency,” in IEEE Transactions on Visualization and Computer Graphics, vol. 15, no. 6, pp. 1291-1298, Nov.-Dec. 2009. doi:10.1109/TVCG.2009.113  

  3. Luo M.R., Li C. (2013) CIECAM02 and Its Recent Developments. In: Fernandez-Maloigne C. (eds) Advanced Color Image Processing and Analysis. Springer, New York, NY. doi:10.1007/978-1-4419-6190-7_2  
