Hi, Manuel.
I'll address your questions below. First, though, I notice that you're adjusting hyperparameters, but I'm not certain how you're checking the accuracy of the fitted models. Are you evaluating on a test set or on the training data?
RE "What am I missing": Suppose that the decision boundary is quadratic. Would you expect a polynomial kernel of degree 9 to do any better than a polynomial kernel of degree 2? I suspect that you're using the training data to check the models' accuracy. Try using a test set or implement cross-validation. In general, a plot of the generalization error with respect to model complexity is bowl shaped, that is, a model with a degree 9 kernel might perform just as well as a model with a degree 2 kernel on the training data, but it might not generalize as well.
RE "setting SIGMA": Use the 'KernelScale' name-value pair. I will clarify this in the documentation.
RE "setting C": Use the 'BoxConstraint' name-value pair.
RE "10 times obs. than predictors": It's very likely that someone else in the community can speak to this better than I can. However, I can say that training an algorithm takes time. It's likely that a simpler data set was used in the example so that you can run the example and get the expected results, and interact with the code, in a reasonable amount of time.