Abstract: A common technique in supervised learning with discrete losses, such as 0-1 loss, is to optimize a convex surrogate loss over R^d, calibrated with respect to the original loss. In particular, recent work has investigated embedding the original predictions (e.g. labels) as points in R^d, showing an equivalence to using polyhedral surrogates. In this work, we study the notion of the embedding dimension of a given discrete loss: the minimum dimension d such that an embedding exists. We characterize d-embeddability for all d, with a particularly tight characterization for d=1 (embedding into the real line), and useful necessary conditions for d>1 in the form of a quadratic feasibility program. We illustrate our results with novel lower bounds for abstain loss.