Last month, I presented a preliminary analysis of ranking color sets using responses collected in the Color Cycle Survey. Now, I extend this analysis to look at color ordering within a given color set. For this analysis, I used the same artificial neural network architecture as before, except that batch normalization, with a batch size of 2048, was applied after the two Gaussian dropout layers. Determining ordering turned out to be a slightly more difficult problem, in part because the data cannot be augmented, since the ordering matters. However, because of the way the survey is structured, with the user picking the best of four potential orderings, each response yields three pairwise data points. The same set of responses was used as in the previous analysis; the additional responses collected since then were ignored (there are now ~10k total responses).
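To make the "three pairwise data points per response" concrete, here is a minimal sketch of how one best-of-four response could be unpacked into (winner, loser) pairs. The function name, the tuple representation of an ordering, and the example colors are all hypothetical; the survey's actual data format may differ.

```python
def pairwise_from_response(orderings, chosen_index):
    """Turn one best-of-four response into (winner, loser) pairs.

    `orderings` holds the four candidate orderings shown to the user and
    `chosen_index` is the position of the one the user picked. Both names
    are illustrative assumptions, not the survey's real schema.
    """
    winner = orderings[chosen_index]
    return [
        (winner, loser)
        for i, loser in enumerate(orderings)
        if i != chosen_index
    ]


# Hypothetical response: four orderings of the same three-color set.
orderings = [
    ("blue", "orange", "green"),
    ("orange", "blue", "green"),
    ("green", "blue", "orange"),
    ("blue", "green", "orange"),
]
pairs = pairwise_from_response(orderings, chosen_index=2)
for winner, loser in pairs:
    print(winner, "preferred over", loser)
```

Only the chosen ordering is ever compared against the others; the response says nothing about how the three rejected orderings rank relative to each other, which is why there are three pairs rather than six.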
To maximize the information gleaned from the survey responses, the network was trained in four steps. The process started with a single network and ended with a conjoined network, as before, except that the single network underwent three stages of training instead of one. First, the color set responses (the same responses used in the previous analysis) were used to train the network for 50 epochs, to learn color representations. Next, the network was trained for an additional 50 epochs on the ordering responses, augmented with all possible cyclic shifts, to learn internal cycle orderings. Then, the network was trained for another 100 epochs on the non-augmented ordering responses, to learn the ideal starting color. Finally, the last layer of the network was replaced, as before, to make a conjoined network, and the new network was trained for a final 100 epochs, again on the non-augmented ordering responses.
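The cyclic-shift augmentation used in the second stage can be sketched as follows. This is an illustration under assumptions rather than the code used for the analysis: the function name and the tuple representation are hypothetical, and how shifts are applied to the two orderings of a training pair is not specified above.

```python
def cyclic_shifts(ordering):
    """Return every rotation of `ordering`, starting with the original.

    A rotation keeps each color's neighbors the same and changes only
    which color comes first, so it preserves the internal cycle order.
    """
    ordering = tuple(ordering)
    return [ordering[i:] + ordering[:i] for i in range(len(ordering))]


# Hypothetical four-color ordering.
shifts = cyclic_shifts(("blue", "orange", "green", "red"))
for shifted in shifts:
    print(shifted)
```

Because every rotation is treated as equivalent in this stage, the network cannot learn anything about the starting color from the augmented data, which is why a later stage trains on the non-augmented responses.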
As with the previous analysis, an ensemble of 100 network instantiations was trained, and the average and standard deviation of the scores were computed. The accuracy for the ordering was a bit worse than for the color sets: 56% on the training data and 54% on the test data. Since the ideal ordering depends on the specific color set used, the highest-ranked color set from the previous analysis was used in this evaluation. The error band from the trained ensemble for this color set was larger than the error band from the set ranking analysis. While the model could be evaluated for any color set, it is likely more accurate for color sets that were ranked highly in the previous analysis: the Color Cycle Survey only asks the user about the preferred ordering of the user’s preferred color set, so data are not collected on poorly liked color sets.
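As a rough sketch of the ensemble step, the per-ordering average and standard deviation can be computed from a table of scores with one row per trained network. The function name and the toy scores below are made up for illustration, and the use of the sample standard deviation (`statistics.stdev`) is an assumption; the population form would work just as well.

```python
from statistics import mean, stdev


def ensemble_summary(scores_per_model):
    """Summarize ensemble scores for each candidate ordering.

    `scores_per_model` has one row per network; each row holds one score
    per candidate ordering. Returns a list of (mean, std) tuples, one per
    ordering. The sample standard deviation is an assumed choice.
    """
    per_ordering = zip(*scores_per_model)  # transpose: one tuple per ordering
    return [(mean(scores), stdev(scores)) for scores in per_ordering]


# Toy scores from three hypothetical networks for two candidate orderings.
toy_scores = [
    [0.2, 0.8],
    [0.4, 0.6],
    [0.3, 0.7],
]
summary = ensemble_summary(toy_scores)

# Rank the orderings by mean score, highest first.
ranking = sorted(range(len(summary)), key=lambda i: summary[i][0], reverse=True)
print(summary, ranking)
```

The standard deviation across networks is what gives the error band for each ordering's score.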
The trained network shows a clear preference for blue / purple as the first color instead of green / yellow; as many existing color cycles start with blue, this seems reasonable. The network also seems fairly confident in picking the third color, since it’s the same for the top fifteen orderings, but there’s more variation in the second color.