We study the problem of coloring a given graph using a small number of colors in several well-established models of computation for big data. These include the data streaming model, the general graph query model, the massively parallel communication (MPC) model, and the CONGESTED-CLIQUE and LOCAL models of distributed computation. On the one hand, we give algorithms with sublinear complexity, for the appropriate notion of complexity in each of these models. Our algorithms color a graph $G$ using $\kappa(G)\cdot (1+o(1))$ colors, where $\kappa(G)$ is the degeneracy of $G$, a parameter closely related to the arboricity $\alpha(G)$. As a function of $\kappa(G)$ alone, our results are close to best possible, since $\kappa(G)+1$ colors are necessary in the worst case. For several classes of graphs, including real-world "big graphs," our results improve upon the number of colors used by the various $(\Delta(G)+1)$-coloring algorithms known for these models, where $\Delta(G)$ is the maximum degree in $G$: we always have $\Delta(G) \ge \kappa(G)$, and $\Delta(G)$ can in fact be arbitrarily larger than $\kappa(G)$.
On the other hand, we establish lower bounds indicating that sublinear algorithms probably cannot go much further. In particular, we prove that any randomized coloring algorithm that uses at most $\kappa(G)+O(1)$ colors requires $\Omega(n^2)$ storage in the one-pass streaming model and $\Omega(n^2)$ queries in the general graph query model, where $n$ is the number of vertices in the graph. These lower bounds hold even when the value of $\kappa(G)$ is known in advance; at the same time, our upper bounds do not require $\kappa(G)$ to be given in advance.
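To make the role of degeneracy concrete, the following minimal sketch shows the classical offline greedy argument for why $\kappa(G)+1$ colors always suffice. It is not one of the sublinear algorithms described above: it assumes the whole graph fits in memory as an adjacency dictionary, and the names `degeneracy_ordering` and `greedy_coloring` are hypothetical, chosen only for illustration.

```python
def degeneracy_ordering(adj):
    """Repeatedly remove a minimum-degree vertex.

    adj maps each vertex to a list of its neighbors (undirected graph).
    Returns (kappa, order), where kappa is the largest degree seen at
    removal time (the degeneracy) and order is the removal order.
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    order = []
    kappa = 0
    while remaining:
        # Simple O(n) scan per step; fine for an illustration.
        v = min(remaining, key=lambda u: degree[u])
        kappa = max(kappa, degree[v])
        order.append(v)
        remaining.remove(v)
        for w in adj[v]:
            if w in remaining:
                degree[w] -= 1
    return kappa, order


def greedy_coloring(adj):
    """Color vertices in reverse removal order with the smallest free color.

    Each vertex has at most kappa already-colored neighbors when it is
    processed, so colors 0..kappa suffice.
    """
    kappa, order = degeneracy_ordering(adj)
    color = {}
    for v in reversed(order):
        used = {color[w] for w in adj[v] if w in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return kappa, color
```

For example, on a star with one center and five leaves, $\Delta(G) = 5$ while $\kappa(G) = 1$, so this greedy procedure uses two colors where a $(\Delta(G)+1)$-coloring is allowed six.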