In general, you cannot reduce the work by parallel processors. The work is related with the information content which has lower bounds that you cannot cross, regardless whether you use one or more processors. Hence, we consider the minimal work of algorithms used to solve a given problem (note that the work is independent of using one or more processors, since you can simulate any parallel algorithm also with a single processor). Now, keeping the minimal work, we aim at the reduction of the parallel time.
On page 37, that was successful, since the work is kept as the minimal work O(n), while the parallel time became O(1). Hence, with O(n) processors, the problem can be solved in a time independent of n, and with a constant number of processors less than O(n), you can utilize all of these processors. For sequential algorithms, having more than on processor is useless, but for the algorithm on page 37, that is not the case. Here, many processors help!