Since my last update on the Color Cycle Survey, there have been no drastic changes, but responses have continued to trickle in. There are now ~13.7k total responses, with ~6k responses each for the six-color and eight-color components. This long-delayed (and somewhat brief) post serves as an update to my previously published six-color analysis, while also extending it to eight colors.
I have only made minor changes to the previously detailed analysis procedures (see previous set ranking and order ranking posts for details), but there are now ~50% more responses, which has helped with training stability and has reduced uncertainty between different models in the network ensemble. The figure below shows the fifteen lowest ranked six-color color sets on the left and the fifteen highest ranked six-color color sets on the right.
The accuracy for both the training and test sets remained at 58%. The plot below shows the average six-color color set scores as a function of rank, with a 1-sigma error band.
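As a rough illustration of how an average score and a 1-sigma band per rank can be derived from an ensemble, here is a minimal sketch. It assumes each model in the ensemble assigns one score to every color set; the function name `rank_with_error_band` and the array layout are hypothetical and not taken from the actual analysis code.

```python
import numpy as np

def rank_with_error_band(scores):
    """Rank items by ensemble-mean score and return a 1-sigma band.

    scores: array-like of shape (n_models, n_items), one row per
    ensemble member (assumed layout, for illustration only).
    Returns (order, mean_sorted, sigma_sorted), with the highest
    mean score first.
    """
    scores = np.asarray(scores, dtype=float)
    mean = scores.mean(axis=0)
    # Sample standard deviation across ensemble members.
    sigma = scores.std(axis=0, ddof=1)
    # Stable sort on the negated mean so rank 1 is the best score.
    order = np.argsort(-mean, kind="stable")
    return order, mean[order], sigma[order]
```

Plotting `mean_sorted` against rank, with a shaded region of `mean_sorted ± sigma_sorted`, would give a curve with an error band of the kind described here.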
Using the highest-ranked six-color set, the figure below shows the fifteen lowest ranked orderings on the left and the fifteen highest ranked orderings on the right.
Accuracy was similar to before, at 55% on the training set and 54% on the test set. The plot below shows the average ordering scores as a function of rank, with a 1-sigma error band.
Next, the same technique was extended to the eight-color color sets. The figure below shows the fifteen lowest ranked eight-color color sets on the left and the fifteen highest ranked eight-color color sets on the right.
The accuracy was 57% for both the training and test sets. The plot below shows the average eight-color color set scores as a function of rank, with a 1-sigma error band.
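For readers wondering what an accuracy figure like this can mean in practice, here is a minimal sketch under one possible definition: each survey response is treated as a choice between two options, and a prediction counts as correct when the model scores the chosen option higher. That definition, the function name `pairwise_accuracy`, and its arguments are assumptions made for illustration; the earlier posts describe the actual procedure.

```python
import numpy as np

def pairwise_accuracy(score_a, score_b, chose_a):
    """Fraction of pairwise responses the model scores agree with.

    score_a, score_b: model scores for the two options shown in
    each response (hypothetical inputs).
    chose_a: True where the respondent picked option A.
    """
    score_a = np.asarray(score_a, dtype=float)
    score_b = np.asarray(score_b, dtype=float)
    chose_a = np.asarray(chose_a, dtype=bool)
    # The model "predicts" A whenever it scores A strictly higher.
    predicted_a = score_a > score_b
    return float(np.mean(predicted_a == chose_a))
```

Under this definition, 50% corresponds to chance, so values in the mid-to-high 50s indicate a modest but real signal in the responses.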
Using the highest-ranked eight-color set, the figure below shows the fifteen lowest ranked orderings on the left and the fifteen highest ranked orderings on the right.
The accuracy was 55% on the training set and 53% on the test set. The plot below shows the average ordering scores as a function of rank, with a 1-sigma error band.
This is an incremental improvement over the previous results, as it just used extra data, while keeping the analysis procedure the same. The fact that accuracy was similar when the analysis was extended to eight-color color sets and color cycles is promising. I’d like to devise a method that combines both the six-color and eight-color color sets in the training process to maximize the use of the response data; I have a few ideas on how to do this but nothing concrete yet.
I’ve also looked more into the idea of devising a color namability criterion by reanalyzing the xkcd Color Survey results. While my reanalysis has led to some interesting tidbits about color names, it didn’t really pan out as far as becoming a useful criterion for ranking the color sets at hand. I’ve been trying to clarify the licensing on the raw xkcd Color Survey responses database dump before writing up my findings, but so far, I have not received a reply from Randall Munroe (which is understandable).
As always, more responses would be helpful. I had not originally intended for the survey to go on as long as it has, but as I’ve been busy with my normal (cosmology-related) research and as I’ve not received as many responses as I had hoped for, the survey remains open to responses. I plan on leaving it open until the analysis is close to final (at least a few more months), after which I’ll close the survey to responses and execute the final analysis runs.