True, NBU Vault was designed when tape was the primary storage medium. However, until SLP has matured, NBU Vault can be deployed instead when you need better control over when duplication runs. In tiny to small environments (< 100 clients), SLP won't have such a huge impact, but in any decent-sized environment SLP is a pain point.
The common problem you mention is that duplication is limited to a single concurrent job per vault (other dup jobs stay queued). Each defined vault holds a vault lock, so only one dup job can execute per vault at a time. The trick around this is to define several vaults within a tape library definition. This way you can spread several profiles over many vaults, and thus utilise more drives. This approach does require more design work to balance the workload...
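To give an idea of the balancing work, here is a rough planning sketch in Python. It is not a NetBackup tool and calls no NetBackup commands. It only spreads duplication workloads over N vaults so that each vault (and so each concurrent dup job) gets a similar amount of data. The workload names and nightly sizes are made up for illustration, and the greedy "largest first, into the least loaded vault" rule is just one simple heuristic, not something Vault does for you.

```python
import heapq


def balance_profiles(workloads, vault_count):
    """Spread workloads (name -> size in GB) across vault_count vaults.

    Greedy heuristic: take the largest workload first and place it in the
    vault with the smallest running total. All names are hypothetical.
    """
    if vault_count < 1:
        raise ValueError("vault_count must be at least 1")

    # Min-heap of (current total GB, vault index).
    heap = [(0, i) for i in range(vault_count)]
    heapq.heapify(heap)
    assignment = {i: [] for i in range(vault_count)}

    # Largest first; ties broken by name so the result is deterministic.
    ordered = sorted(workloads.items(), key=lambda kv: (-kv[1], kv[0]))
    for name, size in ordered:
        total, idx = heapq.heappop(heap)
        assignment[idx].append(name)
        heapq.heappush(heap, (total + size, idx))

    totals = {i: sum(workloads[n] for n in names)
              for i, names in assignment.items()}
    return assignment, totals


if __name__ == "__main__":
    # Hypothetical nightly duplication volumes in GB per policy group.
    nightly = {"exchange": 500, "oracle": 300, "fileservers": 200,
               "vmware": 200, "misc": 100}
    plan, totals = balance_profiles(nightly, vault_count=2)
    for vault, names in plan.items():
        print(f"vault {vault}: {names} ({totals[vault]} GB)")
```

Each resulting group would then map to one profile in its own vault. In practice you would also have to consider drive counts, media server paths and backup windows, which this sketch ignores.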
I'm eagerly waiting for scheduled SLP, or at least blackout windows...
/A