In a transmitter you usually need good control over the shape of the transmitted spectrum, including out-of-band emissions and suppression of energy in adjacent channels. Pulse shaping for matched filtering is done for some of the same reasons, and also to maximize SNR at the receiver. With only one sample per symbol, these tasks are often difficult or impossible, in particular meeting the transmit mask requirements (i.e., the out-of-band emission specs).
Presumably the performance after upsampling by two was the best trade-off for their design, which is why they chose it.
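
To make the spectral-control argument concrete, here is a minimal NumPy sketch. It is an illustration under my own assumptions, not the design being discussed: the roll-off of 0.35, the 12-symbol filter span, the BPSK symbols, and the helper names `rrc_taps` and `out_of_band_fraction` are all hypothetical choices. It compares the fraction of power beyond the nominal band edge (1 + beta)/(2T) for a rectangular pulse (each symbol simply held for two samples) against symbols that are zero-stuffed by two and then passed through a root-raised-cosine filter.

```python
import numpy as np

def rrc_taps(beta, sps, span):
    """Root-raised-cosine taps (hypothetical helper); span is in symbols."""
    t = np.arange(-span * sps // 2, span * sps // 2 + 1) / sps  # time in symbol periods
    h = np.empty_like(t)
    for i, ti in enumerate(t):
        if np.isclose(ti, 0.0):
            # Limit of the closed-form expression at t = 0
            h[i] = 1.0 - beta + 4.0 * beta / np.pi
        elif np.isclose(abs(ti), 1.0 / (4.0 * beta)):
            # Limit at the removable singularity t = +/- T / (4 * beta)
            h[i] = (beta / np.sqrt(2.0)) * (
                (1 + 2 / np.pi) * np.sin(np.pi / (4 * beta))
                + (1 - 2 / np.pi) * np.cos(np.pi / (4 * beta))
            )
        else:
            num = np.sin(np.pi * ti * (1 - beta)) \
                + 4 * beta * ti * np.cos(np.pi * ti * (1 + beta))
            den = np.pi * ti * (1 - (4 * beta * ti) ** 2)
            h[i] = num / den
    return h / np.sqrt(np.sum(h ** 2))  # normalize to unit energy

def out_of_band_fraction(x, band_edge):
    """Fraction of power at |f| > band_edge, with f in cycles/sample."""
    power = np.abs(np.fft.fft(x)) ** 2
    freqs = np.fft.fftfreq(len(x))
    return power[np.abs(freqs) > band_edge].sum() / power.sum()

rng = np.random.default_rng(0)
sps, beta = 2, 0.35                      # assumed values, for illustration only
symbols = rng.choice([-1.0, 1.0], size=4096)

# Rectangular pulse: each symbol value is held for sps samples.
rect = np.repeat(symbols, sps)

# Upsample by two (zero-stuffing), then apply the pulse-shaping filter.
upsampled = np.zeros(len(symbols) * sps)
upsampled[::sps] = symbols
taps = rrc_taps(beta, sps, span=12)
shaped = np.convolve(upsampled, taps)

# Nominal band edge (1 + beta) / (2T), expressed in cycles/sample.
band_edge = (1 + beta) / (2 * sps)
oob_rect = out_of_band_fraction(rect, band_edge)
oob_rrc = out_of_band_fraction(shaped, band_edge)
print(f"out-of-band power, rectangular pulse: {oob_rect:.4f}")
print(f"out-of-band power, RRC at 2 samples/symbol: {oob_rrc:.6f}")
```

At one sample per symbol the Nyquist band ends at 1/(2T), so there is no room for the excess bandwidth and transition band that a shaping filter needs. Two samples per symbol is the smallest integer rate that provides that room for roll-off factors up to 1.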